Hi, I have around 2000 Hive source partitions to process, inserting the data into the same table but into different partitions. For example, I have the following query:
hiveContext.sql("insert into table myTable partition(mypartition='somepartition') bla bla")

If I call the above query in the Spark driver program, it runs fine and creates the corresponding partition in HDFS. This works, but it is very slow: it takes 4-5 hours to process all 2000 partitions.

So I thought of using an ExecutorService and calling the above query, along with a couple of similar insert into queries, in Callable threads. Using threads is definitely faster, but I don't see any partition created in HDFS. Is it a concurrency issue, since every thread is trying to insert into the same table but a different partition? I see the tasks running very fast and finishing, but I don't see any partition in HDFS.

Please guide me, I am new to Spark and Hive.
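For reference, here is a minimal sketch of the threaded approach described above. It is only an illustration under assumptions: the class ParallelInserts, the helpers insertSql and runAll, and the runSql callback are hypothetical names, and the callback stands in for the real hiveContext.sql call so that the sketch runs without Spark. It waits on every Future, because an exception thrown inside a Callable stays hidden until Future.get() is called. Whether that explains the missing partitions is not confirmed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class ParallelInserts {

    // Hypothetical helper: builds the insert statement for one target partition.
    // "bla bla" is the elided remainder of the query from the post.
    static String insertSql(String partition) {
        return "insert into table myTable partition(mypartition='" + partition + "') bla bla";
    }

    // Submits one insert per partition to a fixed-size pool, then waits for all
    // of them. Returns the partitions whose statement threw an exception.
    static List<String> runAll(List<String> partitions, Consumer<String> runSql, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String p : partitions) {
                Callable<String> task = () -> {
                    runSql.accept(insertSql(p));
                    return p;
                };
                futures.add(pool.submit(task));
            }
            List<String> failed = new ArrayList<>();
            for (int i = 0; i < futures.size(); i++) {
                try {
                    // Blocks until the task is done; a failure inside the
                    // Callable is rethrown here as an ExecutionException.
                    futures.get(i).get();
                } catch (ExecutionException e) {
                    System.err.println("insert failed for " + partitions.get(i) + ": " + e.getCause());
                    failed.add(partitions.get(i));
                }
            }
            return failed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for hiveContext.sql: it only prints the statement.
        List<String> failed = runAll(Arrays.asList("p1", "p2", "p3"),
                sql -> System.out.println("would run: " + sql), 2);
        System.out.println("failed partitions: " + failed);
    }
}
```

In the driver, runSql would be something like sql -> hiveContext.sql(sql), and the driver should not exit before runAll returns. Any partition in the returned list could then be logged or retried.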