[ 
https://issues.apache.org/jira/browse/HBASE-26271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415789#comment-17415789
 ] 

Josh Elser commented on HBASE-26271:
------------------------------------

{quote}2. How to prevent deleting newly added files which are not in track yet. 
Use the generate time of the HFile to determine?
{quote}
This got me wondering how Accumulo does this (I used to know). They track files 
per region in the region's row in their meta table. They actually maintain a 
special row at the end of the metadata table which is just a list of files 
which need to be deleted. Their "HFileCleaner" starts by reading that part of 
the meta table, rather than trying to read from the filesystem itself.

If you'd apply this to HBase, we would record in the store tracker every hfile 
which exists in the region directory but is no longer needed. The HFileCleaner 
would read these all, aggregate them, skip any files which are still needed 
(e.g. referenced by a snapshot or by replication), delete/archive the rest, 
and then update the store tracker to record that we deleted each file.
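
To make the flow above concrete, here is a rough sketch in plain Java. Nothing 
in it is an HBase (or Accumulo) API: Tracker, markUnneeded, recordDeleted and 
the stillReferenced check are all hypothetical names, and the actual 
delete/archive call is left as a comment.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch only; none of these names are real HBase classes. */
public class TrackedCleanerSketch {

  /** Stand-in for a store tracker that also records deletion candidates. */
  public static class Tracker {
    public final Set<String> live = new LinkedHashSet<>();
    public final Set<String> pendingDelete = new LinkedHashSet<>();

    /** E.g. after a compaction: no longer live, but still on disk. */
    public void markUnneeded(String file) {
      live.remove(file);
      pendingDelete.add(file);
    }

    /** Record that the cleaner has removed the file from disk. */
    public void recordDeleted(String file) {
      pendingDelete.remove(file);
    }
  }

  /**
   * One cleaner pass: read the candidates from every tracker, skip those
   * still referenced elsewhere (snapshot, replication, ...), and record the
   * rest as deleted. Returns the files that were cleaned up.
   */
  public static List<String> runOnce(List<Tracker> trackers,
                                     Predicate<String> stillReferenced) {
    List<String> deleted = new ArrayList<>();
    for (Tracker tracker : trackers) {
      // Iterate over a copy because recordDeleted mutates the set.
      for (String file : new ArrayList<>(tracker.pendingDelete)) {
        if (stillReferenced.test(file)) {
          continue; // leave it as a candidate for a later pass
        }
        // A real cleaner would delete or archive the file here.
        tracker.recordDeleted(file);
        deleted.add(file);
      }
    }
    return deleted;
  }
}
```

The cleaner only ever looks at the recorded candidates, so a newly written 
file which is not tracked yet is never considered for deletion. Each pass 
does write back to the tracker once per deleted file, though.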

This approach could work, but I worry about the amount of churn to the 
storefile tracker file itself. Maybe it's no worse than we're already doing?

For the crashed flush/compaction problem, I'm struggling to come up with a 
cleanup mechanism. One crazy idea could be to look for WAL markers for flushes 
which started but never completed, and then clean up those files as a part of 
WAL recovery. Glancing at HRegion, we don't write a marker to the WAL at flush 
(it's configurable, but looks like it's false in the relevant codepath). I feel 
like some kind of marker to indicate that we started this operation and it 
never finished is the general idea for what we want. A Procedure is probably 
overkill and would impact write perf.
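
A rough sketch of the marker idea, again in plain Java with made-up names 
(Marker, Kind and filesToCleanUp are hypothetical, not the WAL's real marker 
types). It assumes the start marker already names the files the operation 
will write, which may not hold in practice.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; not the real WAL marker format. */
public class IncompleteOpSketch {

  public enum Kind { START, COMPLETE }

  /** One marker: an operation id plus the files that operation writes. */
  public static final class Marker {
    public final Kind kind;
    public final long opId;
    public final List<String> files;

    public Marker(Kind kind, long opId, List<String> files) {
      this.kind = kind;
      this.opId = opId;
      this.files = files;
    }
  }

  /**
   * Replay the markers in order and return the files written by operations
   * which started but never completed.
   */
  public static List<String> filesToCleanUp(List<Marker> log) {
    Map<Long, List<String>> open = new LinkedHashMap<>();
    for (Marker m : log) {
      if (m.kind == Kind.START) {
        open.put(m.opId, m.files);
      } else {
        open.remove(m.opId); // finished, so its files are good
      }
    }
    List<String> orphans = new ArrayList<>();
    for (List<String> files : open.values()) {
      orphans.addAll(files);
    }
    return orphans;
  }
}
```

Anything returned by filesToCleanUp would be a candidate for removal during 
recovery. The cost is one extra marker write at the start of each flush or 
compaction.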

> Cleanup the broken store files under data directory
> ---------------------------------------------------
>
>                 Key: HBASE-26271
>                 URL: https://issues.apache.org/jira/browse/HBASE-26271
>             Project: HBase
>          Issue Type: Sub-task
>          Components: HFile
>            Reporter: Duo Zhang
>            Priority: Major
>
> As for some new store file tracker implementation, we allow flush/compaction 
> to write directly to data directory, so if we crash in the middle, there will 
> be broken store files left in the data directory.
> We should find a proper way to delete these broken files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
