ctubbsii commented on issue #3694:
URL: https://github.com/apache/accumulo/issues/3694#issuecomment-1680203415

   > The AccumuloGC queried the metadata table to find file references for 
deletion candidates. The metadata table scan returned `~tab:~pr` entries but 
failed to return any `srv` or `file` entries.
   > 
   > Because of this, the GC concluded that no file references existed for a 
collection of deletion candidates and then removed files from HDFS that had 
valid references on the metadata table.
   
   This implies that one measure to try to prevent this from happening again 
could be to remove the locality groups on the metadata table, so the file 
entries are in the same locality group as the prev row (`~tab:~pr`) entries. 
However, this comes with its 
own problems: First, you'd have to force a full major compaction on the 
metadata table to force this to take effect (and that could be costly... and 
the IO could stress whatever underlying filesystem or disk errors are causing 
the problem in the first place). Second, this could cause slower metadata 
performance. Third, there may still be weird behavior where file entries and 
prev row entries are in different RFile blocks or different HDFS blocks, so 
there could still be hidden errors that occur when one block is read but not 
when the other is read. It's not clear to me how that would actually manifest 
itself, though, because it's not clear to me why HDFS errors are failing to 
propagate up into our code to begin with.
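
   As a toy illustration of the reasoning above, here is a minimal sketch of 
how a silently dropped locality group could make a deletion candidate look 
unreferenced. This is not Accumulo code: the `scan` helper, the group names, 
the row `1<`, and the file name `F0001.rf` are all hypothetical, and the 
"unreadable group" is only an assumed stand-in for whatever read error failed 
to propagate.

```python
# Toy model only (not Accumulo code). Metadata entries are modeled as
# (row, column family, column qualifier) tuples, and every column family
# is assigned to exactly one locality group.

def scan(entries, groups, unreadable=frozenset()):
    """Return only the entries whose locality group was readable.

    `groups` maps column family -> locality group name. Groups listed in
    `unreadable` are dropped silently, mimicking a read error that never
    surfaces to the caller (an assumption made for this sketch).
    """
    return [e for e in entries if groups[e[1]] not in unreadable]


def tablets_seen(scanned):
    """Rows for which a prev row (~tab:~pr) entry came back."""
    return {row for row, fam, qual in scanned if fam == "~tab" and qual == "~pr"}


def referenced_files(scanned):
    """File names referenced by `file` entries in a scan result."""
    return {qual for _, fam, qual in scanned if fam == "file"}


# Hypothetical metadata for a single tablet.
entries = [
    ("1<", "~tab", "~pr"),
    ("1<", "file", "F0001.rf"),
    ("1<", "srv", "dir"),
]

# Two hypothetical layouts: prev row and file entries in separate groups,
# versus everything in one group.
split_groups = {"~tab": "tablet", "file": "server", "srv": "server"}
single_group = {"~tab": "default", "file": "default", "srv": "default"}

# Separate groups: the tablet is visible but its file reference is not,
# so the candidate looks safe to delete.
split_scan = scan(entries, split_groups, unreadable={"server"})
candidate_looks_unreferenced = "F0001.rf" not in referenced_files(split_scan)

# One group: a group-level failure cannot return the prev row entry
# without the file entry; the scan comes back with nothing at all.
single_scan = scan(entries, single_group, unreadable={"default"})
```

   In this model, the separate-group layout yields a scan that still lists the 
tablet but has no file references for it, which matches the symptom quoted 
above. With a single group, a group-level failure drops both kinds of entries 
together. The model says nothing about failures at the RFile block or HDFS 
block level, which is the third concern above.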


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
