Hi Team,

Right now our existing flow is:

Oracle --> Sqoop (import) --> Hive tables --> Hive queries on Spark SQL (HiveContext) --> destination Hive table --> Sqoop export --> Oracle

About half of the Hive UDFs we need are written as Java UDFs. I have a few questions:

1. If I rewrite these as native Scala UDFs, will there be any performance difference compared with calling the Hive Java UDFs from Spark SQL?

2. Can we skip the Sqoop import and export steps entirely, and instead load the data from Oracle directly into Spark, code Scala UDFs for the transformations, and write the output back to Oracle (roughly like the sketch at the end of this mail)?

3. What would be an optimal architecture for processing data from Oracle with Spark? Is there any way to improve the current process?

Regards,
Sirisha
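P.S. To make question 2 concrete, the direct path I have in mind would look roughly like the sketch below. It is only a sketch written against the Spark 2.x SparkSession API (the HiveContext equivalents we use today differ slightly); the JDBC URL, credentials, table names, column names and the clean_code UDF are all placeholders rather than our real schema, and the Oracle JDBC driver jar would need to be on the Spark classpath.

import org.apache.spark.sql.SparkSession

object OracleDirectSketch {
  def main(args: Array[String]): Unit = {
    // Sketch only: URL, credentials, table and column names are made up.
    // The Oracle JDBC driver jar (ojdbc) must be on the classpath, e.g. via --jars.
    val spark = SparkSession.builder()
      .appName("oracle-direct-sketch")
      .enableHiveSupport() // only needed while Hive tables are still in the picture
      .getOrCreate()

    val url = "jdbc:oracle:thin:@//dbhost:1521/ORCL"

    // Read straight from Oracle; partitioning on a numeric column makes the read parallel
    val src = spark.read
      .format("jdbc")
      .option("url", url)
      .option("dbtable", "SRC_TABLE")
      .option("user", "app_user")
      .option("password", "app_password")
      .option("driver", "oracle.jdbc.OracleDriver")
      .option("partitionColumn", "ID")
      .option("lowerBound", "1")
      .option("upperBound", "10000000")
      .option("numPartitions", "8")
      .load()

    // A native Scala UDF standing in for one of the existing Hive Java UDFs
    spark.udf.register("clean_code", (s: String) =>
      if (s == null) null else s.trim.toUpperCase)

    src.createOrReplaceTempView("src")
    val out = spark.sql("SELECT ID, clean_code(CODE) AS CODE, AMOUNT FROM src")

    // Write the result back to Oracle directly, replacing the Sqoop export step
    out.write
      .format("jdbc")
      .option("url", url)
      .option("dbtable", "DEST_TABLE")
      .option("user", "app_user")
      .option("password", "app_password")
      .option("driver", "oracle.jdbc.OracleDriver")
      .mode("append")
      .save()

    spark.stop()
  }
}

The partitionColumn / lowerBound / upperBound / numPartitions options are what make the JDBC read run in parallel instead of over a single connection, which is the part Sqoop handles for us today.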