Anoop Sam John created HBASE-6040:
-------------------------------------

             Summary: Use block encoding and HBase handled checksum 
verification in bulk loading using HFileOutputFormat
                 Key: HBASE-6040
                 URL: https://issues.apache.org/jira/browse/HBASE-6040
             Project: HBase
          Issue Type: Improvement
          Components: mapreduce
            Reporter: Anoop Sam John
            Assignee: Anoop Sam John


When data is bulk loaded using HFileOutputFormat, we do not use the block 
encoding and HBase handled checksum features. When the writer is created for 
the HFile, I do not see any such information being passed to the 
WriterBuilder.
In HFileOutputFormat.getNewWriter(byte[] family, Configuration conf), we do not 
have this information, so it is not passed to the writer either. Those HFiles 
will therefore not have these optimizations.

Later, in LoadIncrementalHFiles.copyHFileHalf(), where we physically split an 
HFile (created by the MR job) only when it cannot belong to just one region, I 
can see that we pass the data block encoding and checksum details to the new 
HFile writer. But I think this step does not normally happen.



        
