[GitHub] [accumulo] EdColeman commented on issue #2361: Utility to generate splits

GitBox Thu, 18 Nov 2021 12:54:46 -0800


EdColeman commented on issue #2361:
URL: https://github.com/apache/accumulo/issues/2361#issuecomment-973262661



   Another option may be to accept just the desired size (based on the split 
threshold?), then run through the file(s) and emit a split at the row-index 
just before the desired size is met or exceeded.
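
   A minimal sketch of that idea, not Accumulo code: it assumes the file(s) 
have already been read into sorted `(row, size_in_bytes)` pairs, and the names 
`generate_splits` and `target_bytes` are hypothetical. It emits the last row 
seen before the running total would meet or exceed the target, and never 
splits inside a row.

```python
from typing import Iterable, Iterator, Tuple


def generate_splits(entries: Iterable[Tuple[str, int]],
                    target_bytes: int) -> Iterator[str]:
    """Yield split rows so each range stays just under target_bytes.

    entries: (row, size_in_bytes) pairs in sorted row order (assumed input;
    reading them out of the actual files is not shown here).
    """
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")

    accumulated = 0   # bytes collected since the last emitted split
    prev_row = None   # last row seen, i.e. the candidate split point

    for row, size in entries:
        starts_new_row = prev_row is not None and row != prev_row
        # Split only on a row boundary, and only when adding this entry
        # would meet or exceed the desired size.
        if starts_new_row and accumulated + size >= target_bytes:
            yield prev_row
            accumulated = 0
        accumulated += size
        prev_row = row


if __name__ == "__main__":
    sample = [("a", 40), ("b", 40), ("c", 40), ("d", 40), ("e", 40)]
    print(list(generate_splits(sample, target_bytes=100)))
```

   A row whose entries alone exceed the target still ends up in a single 
range, since a split can only fall between rows.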
   
   Also, depending on compression, the file size in HDFS and the entry size may 
differ. I assume you would track the uncompressed size, because I think that 
is what is maintained in the metadata and used in split calculations.
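
   To illustrate why the two numbers can diverge, here is a small sketch. It 
uses `zlib` purely as a stand-in codec and made-up key/value data; it does not 
reflect how Accumulo actually stores or measures entries. `size_report` is a 
hypothetical helper.

```python
import zlib


def size_report(entries):
    """Return (uncompressed_bytes, compressed_bytes) for key/value pairs."""
    raw = b"".join(key + value for key, value in entries)
    return len(raw), len(zlib.compress(raw))


# Made-up, highly repetitive data: 1000 entries of 8-byte key + 100-byte value.
entries = [(b"row%05d" % i, b"value" * 20) for i in range(1000)]
uncompressed, compressed = size_report(entries)

if __name__ == "__main__":
    print("uncompressed bytes:", uncompressed)
    print("compressed bytes:  ", compressed)
```

   A splits utility driven by on-disk file size would therefore place splits 
differently than one driven by the summed entry sizes.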


