liuwenjing17 commented on PR #5599:
URL: https://github.com/apache/hbase/pull/5599#issuecomment-1878527381

   > > > I still do not fully understand the problem here...
   > > > If we do not set millis to zero, it will only affect the lifetime of 
a MOB file by less than 1 second. How could it make the MOB file expire 2 
hours earlier?
   > > 
   > > 
   > > Because in org.apache.hadoop.hbase.mob.MobUtils, the creation time of a 
mob file is obtained by parsing the date embedded in its name, using the 
statement (Date fileDate = parseDate(MobFileName.getDateFromName(fileName));). 
For instance, for data created on 20240105, the timestamp is parsed as 
1704384000000 (2024-01-05 00:00:00). As a result, when the master's expired mob 
thread starts, it may affect the lifetime of a MOB file by less than 1 day.
   > 
   > Then the problem is we should use a timestamp instead of '20240105' in the 
mob file name? I still do not understand why setting MILLISECOND to 0 can solve 
the problem...
   
   Here is an example:
   
   1. Assume the Time-To-Live (TTL) for the mob data is set to 1 day.
   2. We write mob data at 18:33 on 01/04/2024, and the data is flushed to a 
mob file named xxxx20240104xxxx.
   3. The mob expiration thread starts less than 1 day later, at 10:45 on 
01/05/2024.
   4. When checking timestamps, the expiration cutoff is calculated as 
(currentTS - 1 day), truncated by Calendar only down to the SECOND level: 
1704297600720 (2024-01-04 00:00:00.720, UTC+8). The last 3 digits are 
whatever milliseconds the current time happens to carry, because MILLISECOND 
is not reset.
      The mob file's timestamp, parsed from its name, is 1704297600000 
(2024-01-04 00:00:00.000, UTC+8).
   5. The check is: if (fileDate.getTime() < expireDate.getTime()) {/* expired */} 
      The condition is true, indicating that the mob file has expired, so it 
will be cleaned. **These 3 leftover millisecond digits cause the mob files to 
be cleaned earlier than expected.**
   6. But if we also set MILLISECOND to 0, expireDate.getTime() will be 
1704297600000 and the condition will be false. In this case, the mob file will 
be retained as intended.
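
The steps above can be reproduced with a small standalone sketch. This is not the actual MobUtils code: the class name MobExpiryDemo, the helpers fileTime and expireTime, and the Asia/Shanghai time zone (chosen because it matches the timestamps in the example) are all assumptions made for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.TimeZone;

// Hypothetical demo class; it only mirrors the comparison described above.
public class MobExpiryDemo {

  // Parses a yyyyMMdd date (as embedded in a mob file name) to midnight in tz.
  static long fileTime(String yyyyMMdd, TimeZone tz) {
    SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd");
    fmt.setTimeZone(tz);
    fmt.setLenient(false);
    try {
      return fmt.parse(yyyyMMdd).getTime();
    } catch (ParseException e) {
      throw new IllegalArgumentException("not a yyyyMMdd date: " + yyyyMMdd, e);
    }
  }

  // Computes (now - TTL) truncated to the start of the day.
  // With clearMillis == false the MILLISECOND field keeps its old value.
  static long expireTime(long nowMillis, long ttlSeconds, TimeZone tz, boolean clearMillis) {
    Calendar calendar = Calendar.getInstance(tz);
    calendar.setTimeInMillis(nowMillis - ttlSeconds * 1000L);
    calendar.set(Calendar.HOUR_OF_DAY, 0);
    calendar.set(Calendar.MINUTE, 0);
    calendar.set(Calendar.SECOND, 0);
    if (clearMillis) {
      calendar.set(Calendar.MILLISECOND, 0);
    }
    return calendar.getTimeInMillis();
  }

  public static void main(String[] args) {
    TimeZone tz = TimeZone.getTimeZone("Asia/Shanghai"); // assumed zone (UTC+8)
    long now = 1704422700720L;       // 2024-01-05 10:45:00.720 in UTC+8
    long ttlSeconds = 24L * 60 * 60; // TTL of 1 day

    long file = fileTime("20240104", tz);
    long withoutFix = expireTime(now, ttlSeconds, tz, false);
    long withFix = expireTime(now, ttlSeconds, tz, true);

    System.out.println("file       = " + file);       // 1704297600000
    System.out.println("withoutFix = " + withoutFix); // 1704297600720
    System.out.println("withFix    = " + withFix);    // 1704297600000
    System.out.println("expired without fix: " + (file < withoutFix)); // true
    System.out.println("expired with fix:    " + (file < withFix));    // false
  }
}
```

Because the file timestamp is always exactly midnight, any nonzero millisecond value left in the cutoff is enough to make the strict `<` comparison succeed for a file written on the cutoff day.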
   
   And here is the link to jira: 
https://issues.apache.org/jira/browse/HBASE-28287


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
