Does anyone know how to save a DataFrame to a table that is partitioned on a derived column (reformatted from an existing column)?

    val partitionedDf = df.withColumn("dt",
      concat(substring($"timestamp", 1, 10), lit(" "),
             substring($"timestamp", 12, 2), lit(":00")))

    sqlContext.setConf("hive.exec.dynamic.partition", "true")
    sqlContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")
    partitionedDf.write
        .mode(SaveMode.Append)
        .partitionBy("dt")
        .saveAsTable("ds.amo_bi_events")

I am getting an ArrayIndexOutOfBoundsException. There are 83 columns in the destination 
table, but after adding the derived column the error references 84. I assumed 
that the column used for the partition would not be counted.
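In case the context helps: when appending to an existing Hive table, Spark matches columns by position, and the partition column is part of the table's schema, so the derived column does get counted. One way this mismatch is often avoided is to select exactly the destination table's columns, in its order, before writing. The following is only a sketch under that assumption, reusing the table name from the snippet above; the schema-alignment step and the switch to insertInto are my guesses, not a confirmed fix:

```scala
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.functions.{concat, lit, substring}

// Derive the partition column from the timestamp, as in the original snippet.
val partitionedDf = df.withColumn("dt",
  concat(substring($"timestamp", 1, 10), lit(" "),
         substring($"timestamp", 12, 2), lit(":00")))

// Align the DataFrame with the destination table's schema: select exactly
// the table's columns, in the table's declared order (partition columns come
// last in a Hive table's schema). Assumes the table already defines "dt".
val targetCols = sqlContext.table("ds.amo_bi_events").columns
val alignedDf = partitionedDf.select(targetCols.map(partitionedDf(_)): _*)

sqlContext.setConf("hive.exec.dynamic.partition", "true")
sqlContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")

// insertInto writes into the existing table definition by position, so the
// explicit select above keeps the column counts consistent.
alignedDf.write.mode(SaveMode.Append).insertInto("ds.amo_bi_events")
```

With insertInto, no partitionBy call is needed: dynamic partitioning picks up "dt" because it is the table's partition column and sits last in the aligned schema.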

Can someone please help?

Thanks,
Ben