+1 for removing Sqoop. Spark should be considered a replacement.
On Mon, Aug 8, 2022 at 11:01 PM Yuqi Gu <[email protected]> wrote:

> Thank you all for the comments.
>
> Sqoop is a data migration tool that transfers data between Hadoop
> systems (HDFS, Hive, HBase...) and RDBMSs (MySQL, Oracle, PostgreSQL...).
> Sqoop depends on MapReduce. Some developers added support for
> running Sqoop on Spark, but this feature was not merged into Sqoop
> trunk and is not officially supported by Sqoop.
>
> Spark can read data from an RDBMS directly over JDBC and
> convert it into DataFrames, instead of using Sqoop to transfer the data.
> Spark code (Scala, Java, JDBC configuration, SQL) may be hard for
> newcomers to understand and maintain.
> Even so, Spark is a credible alternative to Sqoop, and a flexible
> and simple tool that can greatly increase the efficiency of data
> migration, extraction, and processing.
> IMO, it makes sense to remove Sqoop from Bigtop.
>
> Thanks,
>
> BRs,
> Yuqi
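
For anyone weighing the migration, here is a minimal sketch of the JDBC read path Yuqi describes, assuming PySpark and a JDBC driver jar on the classpath. The helper names (jdbc_options, read_table), the URL, the table, and the credentials are hypothetical and only for illustration; the option keys (url, dbtable, user, password, partitionColumn, lowerBound, upperBound, numPartitions) are the standard Spark JDBC data source options.

```python
from typing import Dict, Optional


def jdbc_options(
    url: str,
    table: str,
    user: str,
    password: str,
    partition_column: Optional[str] = None,
    lower_bound: Optional[int] = None,
    upper_bound: Optional[int] = None,
    num_partitions: Optional[int] = None,
) -> Dict[str, str]:
    """Build the option map for Spark's JDBC data source (hypothetical helper)."""
    opts = {"url": url, "dbtable": table, "user": user, "password": password}
    if partition_column is not None:
        # Spark requires these four options to be given together
        # when a partitioned (parallel) read is requested.
        if lower_bound is None or upper_bound is None or num_partitions is None:
            raise ValueError(
                "partition_column needs lower_bound, upper_bound and num_partitions"
            )
        opts.update(
            {
                "partitionColumn": partition_column,
                "lowerBound": str(lower_bound),
                "upperBound": str(upper_bound),
                "numPartitions": str(num_partitions),
            }
        )
    return opts


def read_table(spark, opts: Dict[str, str]):
    """Load a table over JDBC into a DataFrame; `spark` is a SparkSession."""
    return spark.read.format("jdbc").options(**opts).load()


# Usage, assuming a running Spark installation (placeholder values throughout):
#
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("jdbc-import").getOrCreate()
#   opts = jdbc_options("jdbc:mysql://db.example.com:3306/shop", "orders",
#                       "etl", "secret", partition_column="id",
#                       lower_bound=1, upper_bound=1000000, num_partitions=8)
#   df = read_table(spark, opts)
#   df.write.mode("overwrite").parquet("hdfs:///warehouse/orders")
```

The partitioned read is roughly what Sqoop's parallel mappers provide: Spark splits the range of the partition column into numPartitions slices and issues one query per slice.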
