ramkrish86 commented on pull request #2664:
URL: https://github.com/apache/hbase/pull/2664#issuecomment-740445961


   Apart from adding the ContiguousCellComparator in the latest commit, we 
also found that bulk load was slower, in spite of the overall comparator 
performance improving by more than 15%. 
   The reason is that PutSortReducer receives a given row with all the cells 
for that row, and those cells get written to the HFile. So effectively only one 
row's cells are added to the map at a time. Even in cases where a row has 300 
cells, the optimization that we expect from the ContiguousCellComparator 
changes does not kick in. That is due to the various branches we still have in 
the code, and because the number of cells is still too small for the 
optimization to kick in. 
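
   To make the per-row behaviour concrete, here is a minimal, self-contained 
sketch of the pattern described above. It does not use the real HBase classes: 
`SimpleCell`, `RowSortSketch`, `sortRow` and `CELL_ORDER` are hypothetical 
names made up for illustration, and the ordering (row, qualifier, newest 
timestamp first) is only an assumption loosely modelled on how cells are 
ordered. The point is that each call sorts the cells of a single row, so the 
comparator only ever sees that one row's cells.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical, simplified stand-in for a cell; NOT the real HBase Cell API.
final class SimpleCell {
    final String row;
    final String qualifier;
    final long timestamp;

    SimpleCell(String row, String qualifier, long timestamp) {
        this.row = row;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
    }
}

final class RowSortSketch {
    // Assumed ordering for illustration: row, then qualifier, then newest timestamp first.
    static final Comparator<SimpleCell> CELL_ORDER = (a, b) -> {
        int d = a.row.compareTo(b.row);
        if (d != 0) {
            return d;
        }
        d = a.qualifier.compareTo(b.qualifier);
        if (d != 0) {
            return d;
        }
        return Long.compare(b.timestamp, a.timestamp);
    };

    // Mimics the per-row step: all cells of ONE row go into a sorted set,
    // and the sorted result is what would then be written out.
    // The comparator is pluggable, so a different one can be passed for bulk load.
    // Note: a TreeSet drops cells that the comparator considers equal.
    static List<SimpleCell> sortRow(List<SimpleCell> cellsOfOneRow,
                                    Comparator<SimpleCell> comparator) {
        TreeSet<SimpleCell> sorted = new TreeSet<>(comparator);
        sorted.addAll(cellsOfOneRow);
        return new ArrayList<>(sorted);
    }
}
```

   Because the comparator is just a parameter here, swapping in a different 
one for the bulk load path does not change the rest of the flow.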
   For those cases, if we bring back the KVComparator (currently it is 
deprecated - see the PutSortReducer changes in the patch) and use it 
specifically for these bulk load cases, then we perform 15% faster than the 
1.3 branch. This is in line with what we are trying to do in 
https://issues.apache.org/jira/browse/HBASE-24754.
   I can open a discussion thread with all the details on dev@ for others to 
chime in.
   @anoopsjohn, @saintstack - FYI.



