Usage of block encoding in bulk loading

Anoop Sam John Fri, 11 May 2012 10:19:01 -0700

Hi Devs,

When data is bulk loaded using HFileOutputFormat, I think we are not using
block encoding or the HBase-handled checksum feature. When the writer is
created for the HFile, I do not see any such information being passed to the
WriterBuilder.
In HFileOutputFormat.getNewWriter(byte[] family, Configuration conf), we do
not have this information, so we do not pass it to the writer either. Those
HFiles will therefore not have these optimizations.


Later, in LoadIncrementalHFiles.copyHFileHalf(), where we physically split an
HFile (created by the MR job) only when it cannot belong to a single region, I
can see that we pass the data block encoding and checksum details to the new
HFile writer. But I think this step will not normally happen.

Please correct me if my understanding is wrong.

Thanks
Anoop
