Hi Team,

Right now our existing flow is:

Oracle --> Sqoop --> Hive --> Hive queries on Spark SQL (HiveContext) --> destination Hive table --> Sqoop export to Oracle

About half of the Hive UDFs we need are developed as Java UDFs.

So now I want to know: if I run native Scala UDFs instead of the Hive Java UDFs in Spark SQL, will there be any performance difference?
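To make the question concrete, this is roughly what I have in mind. I am assuming the Spark 2.x SparkSession API here (with the HiveContext we use today the equivalent calls are hiveContext.udf.register and hiveContext.sql), and the UDF class and table names are just placeholders for ours:

import org.apache.spark.sql.SparkSession

object UdfComparison {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("udf-comparison")
      .enableHiveSupport()   // needed so the Hive UDF jars and CREATE FUNCTION work
      .getOrCreate()

    // Native Scala UDF registered directly with Spark SQL
    spark.udf.register("to_upper_scala",
      (s: String) => if (s == null) null else s.toUpperCase)

    // Existing Hive Java UDF exposed to Spark SQL
    // (the class name is a placeholder for one of our own Java UDFs)
    spark.sql(
      "CREATE TEMPORARY FUNCTION to_upper_hive AS 'com.example.udf.ToUpperUDF'")

    // Both can then be compared side by side on the same Hive table
    spark.sql(
      "SELECT to_upper_scala(name), to_upper_hive(name) FROM customers").show()

    spark.stop()
  }
}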


Also, can we skip the Sqoop import and export steps entirely, and instead load data directly from Oracle into Spark, write Scala UDFs for the transformations, and export the output data back to Oracle?
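Something like the sketch below is what I am picturing, using Spark's JDBC data source in both directions. This assumes the Oracle JDBC driver (ojdbc jar) is on the Spark classpath, and the connection URL, credentials, and table names are placeholders:

import org.apache.spark.sql.{SaveMode, SparkSession}

object OracleDirect {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("oracle-direct")
      .getOrCreate()

    // Placeholder connection details
    val url = "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB"

    // Read the source table straight from Oracle over JDBC
    val src = spark.read
      .format("jdbc")
      .option("url", url)
      .option("dbtable", "SALES.ORDERS")
      .option("user", "etl_user")
      .option("password", "etl_password")
      .option("fetchsize", "10000")   // larger fetch size helps bulk reads
      .load()

    // Scala UDF / DataFrame transformations would go here
    val result = src

    // Write the result back to an Oracle table
    result.write
      .format("jdbc")
      .option("url", url)
      .option("dbtable", "SALES.ORDERS_OUT")
      .option("user", "etl_user")
      .option("password", "etl_password")
      .mode(SaveMode.Append)
      .save()

    spark.stop()
  }
}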

To restate, the architecture we are using right now is:

Oracle --> Sqoop (import) --> Hive tables --> Hive queries --> Spark SQL --> Hive --> Oracle

What would be an optimal architecture to process data from Oracle using Spark? Is there any way I can improve this process? (One idea I had is sketched below.)
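One specific thing I was wondering about for the direct-JDBC route is whether a partitioned read would give us the same parallelism we currently get from Sqoop's --split-by / --num-mappers. This is just a sketch; the split column, bounds, and table names are made up:

import org.apache.spark.sql.SparkSession

object OraclePartitionedRead {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("oracle-partitioned-read")
      .getOrCreate()

    // Split the read across parallel JDBC connections
    val orders = spark.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB")
      .option("dbtable", "SALES.ORDERS")
      .option("user", "etl_user")
      .option("password", "etl_password")
      .option("partitionColumn", "ORDER_ID")   // numeric column to split on
      .option("lowerBound", "1")
      .option("upperBound", "10000000")
      .option("numPartitions", "16")           // 16 concurrent read tasks
      .load()

    println(s"Read ${orders.rdd.getNumPartitions} partitions")
    spark.stop()
  }
}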




Regards,
Sirisha
