Hi all. In Sqoop we can specify the --split-by parameter, which determines which field is used to split the records among the map tasks. But if the data in the split field is skewed, the workload will be imbalanced across the maps. I want to know why Sqoop does not use select count(*) from table / num-maps to determine each map's workload. As far as I know, a base class of DataDrivenDBInputFormat implements splitting by select count(*) from table / num-maps. So why does Sqoop override this?
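To make the concern concrete, here is a minimal sketch of the two strategies being compared. It is not Sqoop's code: the function names and the sample ids are hypothetical, and it only assumes that splitting on a column means dividing the range between that column's minimum and maximum into num-maps equal-width intervals, while count-based splitting gives each map roughly count(*) / num-maps rows.

```python
def range_splits(lo, hi, num_maps):
    """Divide [lo, hi] into num_maps equal-width intervals."""
    width = (hi - lo) / num_maps
    bounds = [lo + i * width for i in range(num_maps)] + [hi]
    return list(zip(bounds[:-1], bounds[1:]))


def rows_per_range(values, splits):
    """Count how many values fall into each interval.

    Intervals are half-open [start, end), except the last one,
    which also includes its upper bound.
    """
    counts = []
    last = len(splits) - 1
    for i, (start, end) in enumerate(splits):
        if i == last:
            n = sum(1 for v in values if start <= v <= end)
        else:
            n = sum(1 for v in values if start <= v < end)
        counts.append(n)
    return counts


def rows_per_count_split(total_rows, num_maps):
    """Give each map an (almost) equal share of total_rows."""
    base, extra = divmod(total_rows, num_maps)
    return [base + (1 if i < extra else 0) for i in range(num_maps)]


# Hypothetical skewed split column: 97 small ids plus three large outliers.
ids = list(range(1, 98)) + [500, 900, 1000]

splits = range_splits(min(ids), max(ids), 4)
by_range = rows_per_range(ids, splits)
by_count = rows_per_count_split(len(ids), 4)

print(by_range)  # [97, 1, 0, 2]
print(by_count)  # [25, 25, 25, 25]
```

With this sample data, the range-based split sends 97 of the 100 rows to the first map, while the count-based split gives every map 25 rows. That is the imbalance the question is about.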
- the confusion of --split-by parameter [email protected]
