ivakegg opened a new issue #1963:
URL: https://github.com/apache/accumulo/issues/1963


   This is similar to #650 but different enough that I thought it warranted a 
separate ticket.  This is related to the 1.x versions.
   
   The problem is being able to verify with certainty that a bulk imported 
file was successfully loaded into the system.  This requires being able to 
determine what the file is renamed to during the bulk import process.  Given 
that information we would be able to scan the accumulo.metadata table to find 
its matching entry.  We realize that there is a race condition here: the GC 
could have removed the file before verification takes place.  That situation 
could be handled by looking in the GC logs, which is not very clean but 
doable.  We could of course monitor the master log to determine the file 
mapping as well, but I was hoping for a cleaner solution.
   
   One possibility is to include the name of the original file in the key or 
value within the file column family of the accumulo.metadata table.  Another 
possibility is to have the master pass back the list of file name mappings to 
the client.
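
   As a rough illustration of the second possibility, the sketch below shows 
how a client might check the returned mappings against entries read from the 
file column family.  It is not Accumulo code: the function name 
find_unverified, the shape of the mapping (original path to renamed file 
name), and the sample paths are all assumptions made for the example, and the 
scan of accumulo.metadata itself is left out.

```python
from typing import Dict, Iterable, Set


def find_unverified(name_mappings: Dict[str, str],
                    metadata_file_entries: Iterable[str]) -> Set[str]:
    """Return the original file names whose renamed file has no metadata entry.

    name_mappings: hypothetical original-path -> renamed-file-name mapping,
        as the master might hand back to the client.
    metadata_file_entries: file paths read from the file column family
        (how they are scanned is outside this sketch).
    """
    # Compare on the final path component only, so that relative paths and
    # full URIs in the metadata entries are treated the same way.
    loaded_names = {entry.rsplit("/", 1)[-1] for entry in metadata_file_entries}
    return {
        original
        for original, renamed in name_mappings.items()
        if renamed.rsplit("/", 1)[-1] not in loaded_names
    }


# Hypothetical sample data, not taken from a real cluster.
mappings = {
    "/bulk/in/part-00000.rf": "I0000001.rf",
    "/bulk/in/part-00001.rf": "I0000002.rf",
}
entries = [
    "hdfs://nn/accumulo/tables/1/b-0000abc/I0000001.rf",
    "/t-0000def/F0000009.rf",
]

missing = find_unverified(mappings, entries)
print(sorted(missing))
```

   A file reported as missing here is not necessarily a failed load.  Because 
of the race with the GC mentioned above, it may have been loaded and then 
removed before the check ran, so the GC logs would still need to be consulted 
for those files.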


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

