Hi Bijay, if I am indeed hitting https://issues.apache.org/jira/browse/HIVE-11940, what needs to be done? Is upgrading to a newer version of Hive the only solution?
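[Editorial note] Short of upgrading Hive, the usual mitigation is to cluster or sort the rows by the partition columns before the insert, so each task writes a few partitions at a time instead of holding an ORC writer (and its buffers) open for all 2000 partitions at once. A minimal sketch of why that helps, as a plain Python simulation — no Spark or Hive involved, and `max_open_writers` is a hypothetical helper, not any real API:

```python
# Illustrative simulation: why clustering rows by their partition key
# keeps fewer partition writers open at the same time.

def max_open_writers(rows):
    """Stream rows to per-partition writers; a writer can be closed as soon
    as no later row belongs to its partition. Returns the peak number of
    writers open simultaneously."""
    last_index = {}                      # partition -> index of its last row
    for i, part in enumerate(rows):
        last_index[part] = i
    open_writers = set()
    peak = 0
    for i, part in enumerate(rows):
        open_writers.add(part)           # open (or reuse) this partition's writer
        peak = max(peak, len(open_writers))
        if last_index[part] == i:        # no more rows for this partition
            open_writers.discard(part)   # safe to flush and close it
    return peak

# 2000 partitions: rows interleaved round-robin (worst case) vs clustered.
parts = list(range(2000))
interleaved = parts * 3                  # every writer stays open at once
clustered = sorted(interleaved)          # one writer open at a time

print(max_open_writers(interleaved))     # 2000
print(max_open_writers(clustered))       # 1
```

This is what the `CLUSTER BY idPartitioner, dtPartitioner` in the snippet later in this thread aims for: with rows clustered by the partition columns, each writer is opened, flushed, and closed in turn rather than 2000 being held open concurrently.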
Thanks!

On Mon, Jun 13, 2016 at 10:47 AM, swetha kasireddy <swethakasire...@gmail.com> wrote:

> Hi,
>
> Following is a sample code snippet:
>
>     val userDF = userRecsDF.toDF("idPartitioner", "dtPartitioner",
>       "userId", "userRecord").persist()
>     println("userRecsDF partitions: " + userRecsDF.rdd.partitions.size)
>
>     userDF.registerTempTable("userRecordsTemp")
>
>     sqlContext.sql("SET hive.default.fileformat=Orc")
>     sqlContext.sql("SET hive.enforce.bucketing=true")
>     sqlContext.sql("SET hive.enforce.sorting=true")
>     sqlContext.sql(
>       """CREATE EXTERNAL TABLE IF NOT EXISTS users (userId STRING, userRecord STRING)
>         |PARTITIONED BY (idPartitioner STRING, dtPartitioner STRING)
>         |STORED AS ORC LOCATION '/user/userId/userRecords'""".stripMargin)
>     sqlContext.sql(
>       """FROM userRecordsTemp ps
>         |INSERT OVERWRITE TABLE users PARTITION(idPartitioner, dtPartitioner)
>         |SELECT ps.userId, ps.userRecord, ps.idPartitioner, ps.dtPartitioner
>         |CLUSTER BY idPartitioner, dtPartitioner""".stripMargin)

On Fri, Jun 10, 2016 at 12:10 AM, Bijay Pathak <bijay.pat...@cloudwick.com> wrote:

>> Hello,
>>
>> Looks like you are hitting this:
>> https://issues.apache.org/jira/browse/HIVE-11940.
>>
>> Thanks,
>> Bijay

On Thu, Jun 9, 2016 at 9:25 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

>>> Can you provide a code snippet of how you are populating the target
>>> table from the temp table?
>>>
>>> HTH
>>>
>>> Dr Mich Talebzadeh
>>>
>>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>
>>> http://talebzadehmich.wordpress.com

On 9 June 2016 at 23:43, swetha kasireddy <swethakasire...@gmail.com> wrote:

>>>> No, I am reading the data from HDFS, transforming it, registering the
>>>> data in a temp table using registerTempTable, and then doing an insert
>>>> overwrite using Spark SQL's hiveContext.

On Thu, Jun 9, 2016 at 3:40 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

>>>>> How are you doing the insert? From an existing table?

On 9 June 2016 at 21:16, Stephen Boesch <java...@gmail.com> wrote:

>>>>>> How many workers (/CPU cores) are assigned to this job?

2016-06-09 13:01 GMT-07:00 SRK <swethakasire...@gmail.com>:

>>>>>>> Hi,
>>>>>>>
>>>>>>> How can I insert data into 2000 partitions (directories) of ORC/Parquet
>>>>>>> at a time using Spark SQL? It does not seem to be performant when I try
>>>>>>> to insert into 2000 directories of Parquet/ORC using Spark SQL. Did
>>>>>>> anyone face this issue?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-insert-data-into-2000-partitions-directories-of-ORC-parquet-at-a-time-using-Spark-SQL-tp27132.html
>>>>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
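[Editorial note] For completeness: a dynamic-partition insert of this size typically also requires the standard Hive dynamic-partition settings to be enabled and their caps raised. A hedged sketch — these are ordinary Hive configuration properties, but the exact defaults and limits vary by Hive version, and the values shown here are illustrative only:

```sql
-- Enable dynamic partitioning and raise the per-job / per-node caps.
-- Values are illustrative; tune to the actual partition count (~2000 here).
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.exec.max.dynamic.partitions = 5000;
SET hive.exec.max.dynamic.partitions.pernode = 2000;
```

If these caps are left at their defaults, a 2000-partition insert can fail outright rather than merely run slowly, so they are worth checking before attributing everything to the JIRA above.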