[ 
https://issues.apache.org/jira/browse/HBASE-26271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418169#comment-17418169
 ] 

Szabolcs Bukros commented on HBASE-26271:
-----------------------------------------

[~zhangduo], [~elserj] 

1. The approach I'm testing now is to add a ScheduledChore on region server 
start that periodically checks the hfiles of each store where persistent 
storage is enabled. We could also allow it to be triggered from the shell. I 
think this is the safer option: we can run it rarely enough to minimize the 
performance impact while still keeping the folder clean.
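A rough sketch of how such a periodic check could be scheduled, using plain java.util.concurrent in place of HBase's ScheduledChore (all names here are hypothetical, and the random initial delay stands in for the per-RS jitter mentioned below):

```java
import java.util.concurrent.*;

public class ChoreSketch {
    // Schedule the periodic hfile check with a random initial delay
    // (jitter), so region servers do not all scan at the same moment.
    static ScheduledFuture<?> schedule(ScheduledExecutorService pool,
                                       Runnable check,
                                       long period, TimeUnit unit) {
        long jitter = ThreadLocalRandom.current().nextLong(period);
        return pool.scheduleAtFixedRate(check, jitter, period, unit);
    }
}
```

With a long period (hours rather than minutes) the listing cost is amortized away, which is the point of running it as a chore rather than on every flush/compaction.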
2. I would check the file's modification time and require a generous waiting 
period before touching a file. The period can be large enough to be safely 
outside any realistic compaction runtime, since we are in no hurry to archive 
these files. I'm not sure a more complicated solution is warranted.
" we could still fail before inserting these files into store file tracker 
right?" My thoughts exactly. The safest solution seems to be just listing the 
file system. Also considering this is rs specific and we can add some jitter 
the impact should not be significant either.
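Putting the two ideas together, a minimal sketch of the check itself: list the store directory and keep only files that the store file tracker does not know about and that are older than the waiting period. None of these names come from the HBase codebase; this just illustrates the filtering logic:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;
import java.util.stream.*;

public class BrokenFileScanSketch {
    // List the store directory and keep only files that the store file
    // tracker does not know about AND that are older than the waiting
    // period, so an in-flight compaction is never mistaken for garbage.
    static List<Path> findBrokenCandidates(Path storeDir,
                                           Set<String> trackedFiles,
                                           long waitingPeriodMs) throws IOException {
        long cutoff = System.currentTimeMillis() - waitingPeriodMs;
        try (Stream<Path> files = Files.list(storeDir)) {
            return files.filter(Files::isRegularFile)
                        // files the tracker knows about are live
                        .filter(p -> !trackedFiles.contains(p.getFileName().toString()))
                        // leave recent files alone; a compaction may still be writing them
                        .filter(p -> {
                            try {
                                return Files.getLastModifiedTime(p).toMillis() < cutoff;
                            } catch (IOException e) {
                                return false; // cannot stat it, skip it this round
                            }
                        })
                        .collect(Collectors.toList());
        }
    }
}
```

Anything returned here would then go through the normal archiving path rather than being deleted outright.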

What do you think?

> Cleanup the broken store files under data directory
> ---------------------------------------------------
>
>                 Key: HBASE-26271
>                 URL: https://issues.apache.org/jira/browse/HBASE-26271
>             Project: HBase
>          Issue Type: Sub-task
>          Components: HFile
>            Reporter: Duo Zhang
>            Priority: Major
>
> For some of the new store file tracker implementations, we allow 
> flush/compaction to write directly to the data directory, so if we crash in 
> the middle, broken store files will be left in the data directory.
> We should find a proper way to delete these broken files.


