I think the idea is feasible.
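
As a rough illustration of the ideas quoted below, here is a minimal sketch of how a Sqoop import command line could be assembled from a required task name, Hadoop-level -D parameters, a split-by field, and task-level custom parameters. It is not the DolphinScheduler implementation: the function name `build_sqoop_import_args` and all of its parameters are hypothetical, the use of the `mapred.job.name` property for the MR task name is an assumption, and the strict "split-by is required when -m > 1" check is a design choice of this sketch rather than a Sqoop rule.

```python
from typing import Dict, List, Optional


def build_sqoop_import_args(
    connect: str,
    table: str,
    target_dir: str,
    job_name: str,
    hadoop_params: Optional[Dict[str, str]] = None,
    num_mappers: int = 1,
    split_by: Optional[str] = None,
    custom_params: Optional[List[str]] = None,
) -> List[str]:
    """Assemble a `sqoop import` argument list (hypothetical helper)."""
    if not job_name:
        raise ValueError("job_name is required")
    # Assumption of this sketch: insist on an explicit split column
    # whenever more than one mapper is requested.
    if num_mappers > 1 and not split_by:
        raise ValueError("split_by is required when num_mappers > 1")

    args = ["sqoop", "import"]

    # Generic Hadoop options (-D key=value) go right after the tool name,
    # before any Sqoop-specific arguments.
    args += ["-D", f"mapred.job.name={job_name}"]
    for key, value in (hadoop_params or {}).items():
        args += ["-D", f"{key}={value}"]

    # Sqoop-specific arguments.
    args += [
        "--connect", connect,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),
    ]
    if split_by:
        args += ["--split-by", split_by]

    # Task-level custom parameters such as --direct or --fetch-size
    # are appended as given by the user.
    args += list(custom_params or [])
    return args


# Example usage with made-up connection details.
example = build_sqoop_import_args(
    connect="jdbc:mysql://localhost:3306/demo",
    table="orders",
    target_dir="/tmp/orders",
    job_name="import_orders",
    hadoop_params={"mapreduce.map.memory.mb": "2048"},
    num_mappers=4,
    split_by="order_id",
    custom_params=["--direct"],
)
print(" ".join(example))
```

Returning a list of arguments rather than one concatenated string keeps quoting concerns out of the builder; whether the Sqoop task should generate its script this way is of course open for discussion.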

[email protected] <[email protected]> wrote on Wed, Jun 3, 2020 at 10:16 AM:

> Hi all,
>     ds-dev has added the Sqoop component, and the component needs some
> enhancement.
>     Some points to optimize:
>     • Sqoop data import and export do not support Hadoop-level custom
> parameters, that is, -D parameters, for example:
>         – MR task name
>         – MR map and reduce memory and quantity, etc.
>     • The split-by field is not supported. If -m is greater than 1 and the
> primary key of the relational database table is not auto-incrementing,
> Sqoop may import duplicate data into Hadoop. The general solution is to
> specify a split-by field, so split-by needs to be supported.
>     • Custom parameters cannot be set. For example, when importing from
> MySQL, some tables can add --direct to speed up the import.
>
>     Ideas:
>     • The Sqoop task name is currently generic; it should become a
> required parameter on the Sqoop page
>     • Add a Hadoop custom parameter input box for setting MR parameters
> such as memory
>     • Add Sqoop task-level custom parameters, such as --direct,
> --fetch-size, and other parameters used in specific situations
>     • Add an option button to choose between a custom script and the
> template script, following the design of the DataX node
>
>     If the idea is feasible, I will implement this.
>
>
> Best
>
>
> Eights-Li  黄立
> [email protected]
>


-- 

DolphinScheduler(Incubator)  PPMC
Jun Gao 高俊
[email protected]
