Hi,

Following is a sample code snippet:
    // Build the DataFrame to be written and cache it.
    val userDF = userRecsDF
      .toDF("idPartitioner", "dtPartitioner", "userId", "userRecord")
      .persist()

    System.out.println("userRecsDF.partitions.size " + userRecsDF.partitions.size)

    // Expose the data to Spark SQL and set the Hive options for the write.
    userDF.registerTempTable("userRecordsTemp")
    sqlContext.sql("SET hive.default.fileformat=Orc")
    sqlContext.sql("SET hive.enforce.bucketing = true")
    sqlContext.sql("SET hive.enforce.sorting = true")

    // Target table, partitioned on (idPartitioner, dtPartitioner) and stored as ORC.
    sqlContext.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS users (userId STRING, userRecord STRING)
        |PARTITIONED BY (idPartitioner STRING, dtPartitioner STRING)
        |STORED AS ORC
        |LOCATION '/user/userId/userRecords'""".stripMargin)

    // Dynamic-partition insert from the temp table into the ORC table.
    sqlContext.sql(
      """FROM userRecordsTemp ps
        |INSERT OVERWRITE TABLE users PARTITION(idPartitioner, dtPartitioner)
        |SELECT ps.userId, ps.userRecord, ps.idPartitioner, ps.dtPartitioner
        |CLUSTER BY idPartitioner, dtPartitioner""".stripMargin)

On Fri, Jun 10, 2016 at 12:10 AM, Bijay Pathak <bijay.pat...@cloudwick.com> wrote:

> Hello,
>
> Looks like you are hitting this:
> https://issues.apache.org/jira/browse/HIVE-11940.
>
> Thanks,
> Bijay
>
> On Thu, Jun 9, 2016 at 9:25 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> Can you provide a code snippet of how you are populating the target table
>> from the temp table?
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>> LinkedIn:
>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>> http://talebzadehmich.wordpress.com
>>
>> On 9 June 2016 at 23:43, swetha kasireddy <swethakasire...@gmail.com> wrote:
>>
>>> No, I am reading the data from HDFS, transforming it, registering the
>>> data in a temp table using registerTempTable, and then doing an insert
>>> overwrite using Spark SQL's hiveContext.
>>>
>>> On Thu, Jun 9, 2016 at 3:40 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>
>>>> How are you doing the insert? From an existing table?
>>>>
>>>> Dr Mich Talebzadeh
>>>>
>>>> LinkedIn:
>>>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>
>>>> http://talebzadehmich.wordpress.com
>>>>
>>>> On 9 June 2016 at 21:16, Stephen Boesch <java...@gmail.com> wrote:
>>>>
>>>>> How many workers (/CPU cores) are assigned to this job?
>>>>>
>>>>> 2016-06-09 13:01 GMT-07:00 SRK <swethakasire...@gmail.com>:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> How do I insert data into 2000 partitions (directories) of ORC/Parquet
>>>>>> at a time using Spark SQL? It does not seem to be performant when I try
>>>>>> to insert into 2000 directories of Parquet/ORC using Spark SQL. Did
>>>>>> anyone face this issue?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-insert-data-into-2000-partitions-directories-of-ORC-parquet-at-a-time-using-Spark-SQL-tp27132.html
>>>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>>>>> For additional commands, e-mail: user-h...@spark.apache.org
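
A minimal, untested sketch of one common workaround for slow dynamic-partition inserts, assuming Spark 1.6-era APIs and the same sqlContext and userDF as in the snippet at the top of this message (the temp-table name "userRecordsClustered" is made up for illustration): repartition on the partition columns before the insert, so each task writes to only a few of the ~2000 partition directories and keeps few ORC writers open at once.

    // Sketch only: assumes a HiveContext-backed sqlContext and the userDF
    // defined above; "userRecordsClustered" is a hypothetical temp-table name.
    import org.apache.spark.sql.functions.col

    // Allow dynamic partitioning on both partition columns.
    sqlContext.sql("SET hive.exec.dynamic.partition = true")
    sqlContext.sql("SET hive.exec.dynamic.partition.mode = nonstrict")

    // Shuffle rows so that all records for the same (idPartitioner, dtPartitioner)
    // land in the same task, limiting the number of partition directories
    // (and open ORC writers) each task touches.
    val clusteredDF = userDF.repartition(col("idPartitioner"), col("dtPartitioner"))
    clusteredDF.registerTempTable("userRecordsClustered")

    sqlContext.sql(
      """INSERT OVERWRITE TABLE users PARTITION (idPartitioner, dtPartitioner)
        |SELECT userId, userRecord, idPartitioner, dtPartitioner
        |FROM userRecordsClustered""".stripMargin)

The CLUSTER BY in the original insert should shuffle the data in a similar way; doing the repartition on the DataFrame side just makes that shuffle explicit, and it can be combined with coalesce() if fewer, larger files per partition are wanted.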