In HBase 0.98.10, DefaultCompactPolicy sorts HFiles using seq_id as the main factor. The new file created by a compaction gets its seq_id from HRegion. Suppose we have some HFiles whose seq_ids are as follows:

f1 4
f2 6
f3 8
f4 9
f5 12

If we compact files f2, f3, and f4 into f6_new, the new file gets a seq_id larger than that of f5, say 14. For example:

f1 4
f5 12
f6_new 14

When we compact, we delete HFiles whose maxTimeStamp has expired. But in the example above, HFiles with small timestamps end up being compacted together with HFiles with large timestamps just because they have similar seq_ids, which decreases the chance of deleting whole old HFiles.

So I think we can change how the new HFile created by a compaction gets its seq_id: just take the max seq_id of the files that were compacted. In the example above, the seq_id of f6_new would be max(6, 8, 9) = 9. This way, files with similar timestamps will also have similar seq_ids, which increases the chance of deleting whole HFiles and reduces the pressure of compaction.

Do you think this will work? And are there any problems if I set seq_id like this?
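
For reference, the proposed rule can be sketched in plain Java as below. This is only an illustration of the idea, not HBase code: the class SeqIdSketch and the method compactedSeqId are hypothetical names, and the seq_ids are the ones from the example above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of the proposal; none of these names exist in HBase.
class SeqIdSketch {

    // Proposed rule: the compacted file inherits the largest seq_id
    // among the files that were compacted into it.
    static long compactedSeqId(List<Long> inputSeqIds) {
        if (inputSeqIds.isEmpty()) {
            throw new IllegalArgumentException("compaction needs at least one input file");
        }
        return Collections.max(inputSeqIds);
    }

    public static void main(String[] args) {
        // seq_ids of f2, f3, f4 from the example
        long seqId = compactedSeqId(Arrays.asList(6L, 8L, 9L));
        System.out.println("f6_new seq_id = " + seqId); // prints "f6_new seq_id = 9"

        // Remaining files ordered by seq_id after the compaction
        TreeMap<Long, String> bySeqId = new TreeMap<>();
        bySeqId.put(4L, "f1");
        bySeqId.put(12L, "f5");
        bySeqId.put(seqId, "f6_new");
        System.out.println(bySeqId); // prints "{4=f1, 9=f6_new, 12=f5}"
    }
}
```

With this rule f6_new sorts between f1 and f5 instead of after f5, so it stays next to files of a similar age.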
