-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22098/
-----------------------------------------------------------

(Updated May 30, 2014, 5:54 p.m.)


Review request for drill.


Changes
-------

Updated patch.


Repository: drill-git


Description
-------

The bug is in the parallelization logic of HBaseGroupScan. The current code 
favors region affinity over load distribution.

Since Drill's parallelizer code already takes care of creating endpoint slots 
based on affinity, HBaseGroupScan should only distribute the work evenly among 
the provided slots.

The modified algorithm ensures that, for 'm' regions to scan and 'n' endpoint 
slots:
1. Each slot gets at least floor(m/n) and at most ceil(m/n) regions.
2. Slots on the same host, where that host has region affinity, get an even 
share of the regions hosted on it.
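
The two guarantees above can be illustrated with a minimal sketch. This is not 
the code in the patch; the class RegionAssignmentSketch, the methods assign and 
leastLoaded, and the host-name lists regionHosts and slotHosts are hypothetical 
names chosen for illustration, and the two-pass strategy is only one way, 
assumed here, to satisfy both conditions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; names and strategy are assumptions, not the patch's code.
final class RegionAssignmentSketch {

  /**
   * @param regionHosts regionHosts.get(i) is the host serving region i
   * @param slotHosts   slotHosts.get(j) is the host of endpoint slot j
   * @return a list where element j holds the region indexes assigned to slot j
   */
  static List<List<Integer>> assign(List<String> regionHosts, List<String> slotHosts) {
    int m = regionHosts.size();
    int n = slotHosts.size();
    if (n == 0) {
      throw new IllegalArgumentException("at least one endpoint slot is required");
    }
    int min = m / n;            // floor(m/n)
    int max = (m + n - 1) / n;  // ceil(m/n)

    List<List<Integer>> result = new ArrayList<>();
    List<Integer> allSlots = new ArrayList<>();
    Map<String, List<Integer>> slotsByHost = new HashMap<>();
    for (int j = 0; j < n; j++) {
      result.add(new ArrayList<>());
      allSlots.add(j);
      slotsByHost.computeIfAbsent(slotHosts.get(j), h -> new ArrayList<>()).add(j);
    }

    // Pass 1: give each region to the least-loaded slot on its own host,
    // but never fill a slot beyond floor(m/n) in this pass.
    List<Integer> leftovers = new ArrayList<>();
    for (int i = 0; i < m; i++) {
      int slot = leastLoaded(slotsByHost.get(regionHosts.get(i)), result, min);
      if (slot >= 0) {
        result.get(slot).add(i);
      } else {
        leftovers.add(i);
      }
    }

    // Pass 2: first bring every slot up to floor(m/n); then hand out the
    // remainder, preferring a local slot, without exceeding ceil(m/n).
    for (int i : leftovers) {
      int slot = leastLoaded(allSlots, result, min);
      if (slot < 0) {
        slot = leastLoaded(slotsByHost.get(regionHosts.get(i)), result, max);
      }
      if (slot < 0) {
        slot = leastLoaded(allSlots, result, max);
      }
      result.get(slot).add(i);
    }
    return result;
  }

  /** Returns the candidate slot with the fewest regions and load < limit, or -1. */
  private static int leastLoaded(List<Integer> candidates, List<List<Integer>> result, int limit) {
    if (candidates == null) {
      return -1;
    }
    int best = -1;
    for (int j : candidates) {
      int load = result.get(j).size();
      if (load < limit && (best < 0 || load < result.get(best).size())) {
        best = j;
      }
    }
    return best;
  }
}
```

For example, with five regions on host "a", two regions on host "b", and slots 
on hosts "a", "a" and "b", the sketch yields slot sizes 3, 2 and 2, and both 
regions on "b" stay on the "b" slot.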


Diffs (updated)
-----

  contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseGroupScan.java 809aa86 
  contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseSubScan.java d9f2b7c 
  contrib/storage-hbase/src/test/java/org/apache/drill/hbase/BaseHBaseTest.java 96f0c4a 
  contrib/storage-hbase/src/test/java/org/apache/drill/hbase/HBaseTestsSuite.java e30f79e 
  contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseRegionScanAssignments.java PRE-CREATION 

Diff: https://reviews.apache.org/r/22098/diff/


Testing
-------

Added test TestHBaseRegionScanAssignments.


Thanks,

Aditya Kishore
