+1 for removing Sqoop.

Spark should be considered a replacement.

On Mon, Aug 8, 2022 at 11:01 PM Yuqi Gu <[email protected]> wrote:

> Thank you guys for the comments.
>
> Sqoop is a data migration tool for transferring data between Hadoop
> systems (HDFS, Hive, HBase...) and RDBMSs (MySQL, Oracle, PostgreSQL...).
> Sqoop depends on MapReduce. Some people developed a feature to
> support running Sqoop on Spark, but it was never merged into Sqoop
> trunk and is not officially supported by Sqoop.
>
> Spark can read data from an RDBMS directly over JDBC and convert it
> into DataFrames, instead of using Sqoop to transfer the data.
> Spark code (Scala, Java, JDBC configuration, SQL) may be hard for
> newcomers to understand and maintain.
> But anyway, Spark is a credible alternative to Sqoop and a flexible,
> simple tool that can greatly increase the efficiency of data migration,
> extraction, and processing.
> IMO, it makes sense to remove Sqoop from Bigtop.
>
> Thanks,
>
> BRs,
> Yuqi
>
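
For reference, the direct JDBC read described in the quoted message could look
roughly like the sketch below. It is only an illustration, not something taken
from Bigtop or Sqoop: the helper names (jdbc_read_options, import_table), the
host, the table, and the credentials are all made up, and it assumes a running
SparkSession with the matching JDBC driver jar on the classpath.

```python
from typing import Dict, Optional


def jdbc_read_options(url: str, table: str, user: str, password: str,
                      partition_column: Optional[str] = None,
                      lower_bound: int = 0, upper_bound: int = 0,
                      num_partitions: int = 1) -> Dict[str, str]:
    """Build the option map for Spark's built-in JDBC data source.

    Hypothetical helper; the option keys themselves are standard Spark ones.
    """
    opts = {"url": url, "dbtable": table, "user": user, "password": password}
    if partition_column is not None:
        # Spark expects these four options to be supplied together; they
        # split the read into parallel range queries, similar in spirit
        # to Sqoop's --split-by.
        opts.update({
            "partitionColumn": partition_column,
            "lowerBound": str(lower_bound),
            "upperBound": str(upper_bound),
            "numPartitions": str(num_partitions),
        })
    return opts


def import_table(spark, options: Dict[str, str], target_path: str):
    """Read an RDBMS table into a DataFrame and write it out as Parquet.

    `spark` is assumed to be an existing pyspark.sql.SparkSession.
    """
    df = spark.read.format("jdbc").options(**options).load()
    df.write.mode("overwrite").parquet(target_path)
    return df
```

With a session available, a call might be
import_table(spark, jdbc_read_options("jdbc:mysql://db.example.com:3306/shop",
"orders", "etl", "secret", partition_column="id", lower_bound=1,
upper_bound=1000000, num_partitions=8), "hdfs:///warehouse/orders"),
where every value shown is a placeholder.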
