Hi all,

In Sqoop we can specify the --split-by parameter, which determines which
field is used to split the records among the map tasks.
But if the data in the split field is skewed, the workload will be
imbalanced across the maps. I would like to know why Sqoop does not use
select count(*) from table / num-maps to determine each map's workload.
As far as I know, one of the base classes of DataDrivenDBInputFormat
has an implementation based on select count(*) from table / num-maps.
So why does Sqoop override this?
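
To make the concern concrete, here is a minimal sketch in Python (not Sqoop code). It assumes, as I understand the documented behavior, that the split-by approach takes the min and max of the split column and cuts that range into equal-width pieces, one per map. The function names and the sample ids are made up for illustration only.

```python
def range_splits(lo, hi, num_maps):
    """Cut the value range [lo, hi] into num_maps equal-width intervals."""
    step = (hi - lo) / num_maps
    bounds = [lo + i * step for i in range(num_maps)] + [hi]
    return list(zip(bounds[:-1], bounds[1:]))


def rows_per_range_split(values, num_maps):
    """Rows each map would read when splitting on the column's value range."""
    splits = range_splits(min(values), max(values), num_maps)
    counts = []
    for i, (start, end) in enumerate(splits):
        is_last = i == len(splits) - 1
        # Intervals are [start, end), except the last one, which includes end.
        counts.append(
            sum(1 for v in values if start <= v < end or (is_last and v == end))
        )
    return counts


def rows_per_count_split(values, num_maps):
    """Rows each map would read when splitting on count(*) / num_maps."""
    base, extra = divmod(len(values), num_maps)
    return [base + (1 if i < extra else 0) for i in range(num_maps)]


# Hypothetical skewed column: 97 dense ids plus three large outliers.
skewed_ids = list(range(1, 98)) + [1000, 5000, 10000]

print(rows_per_range_split(skewed_ids, 4))  # [98, 1, 0, 1]
print(rows_per_count_split(skewed_ids, 4))  # [25, 25, 25, 25]
```

With this made-up data, the range-based split gives one map 98 of the 100 rows, while the count-based split gives every map 25 rows.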

