[ https://issues.apache.org/jira/browse/SPARK-28050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leanken.Lin updated SPARK-28050:
--------------------------------
    Description: 
{code:java}
val ptTableName = "mc_test_pt_table"
sql(s"CREATE TABLE ${ptTableName} (name STRING, num BIGINT) PARTITIONED BY (pt1 STRING, pt2 STRING)")

import spark.implicits._
val df = spark.sparkContext.parallelize(0 to 99, 2)
  .map(f => (s"name-$f", f))
  .toDF("name", "num")

// To insert df into a specific partition, say pt1='2018', pt2='0601',
// the current API has no direct support; the only workaround is:
df.createOrReplaceTempView(s"${ptTableName}_tmp_view")
sql(s"insert into table ${ptTableName} partition (pt1='2018', pt2='0601') select * from ${ptTableName}_tmp_view")
{code}
Propose adding an API to DataFrameWriter that can do something like:
{code:java}
df.write.insertInto(ptTableName, "pt1='2018',pt2='0601'")
{code}
We hit this kind of scenario a lot in our production environment; providing an API like this would make it much less painful.
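The proposed two-argument `insertInto` overload does not exist in DataFrameWriter today. As a sketch of how the partition spec in the workaround could be assembled, here is a small hypothetical helper (the object and method names are assumptions, not Spark API) that renders a partition spec into the `PARTITION (...)` clause and the full INSERT statement used above. It is pure string logic and needs no SparkSession:

```scala
// Hypothetical helper for the temp-view workaround: builds the SQL text only.
object PartitionSpec {
  // e.g. Seq("pt1" -> "2018", "pt2" -> "0601") => "PARTITION (pt1='2018', pt2='0601')"
  def clause(spec: Seq[(String, String)]): String = {
    require(spec.nonEmpty, "partition spec must not be empty")
    spec.map { case (col, value) => s"$col='$value'" }
      .mkString("PARTITION (", ", ", ")")
  }

  // Full INSERT statement against a temp view, as in the workaround above.
  def insertSql(table: String, view: String, spec: Seq[(String, String)]): String =
    s"INSERT INTO TABLE $table ${clause(spec)} SELECT * FROM $view"
}
```

With a SparkSession in scope, the workaround then collapses to `sql(PartitionSpec.insertSql(ptTableName, s"${ptTableName}_tmp_view", Seq("pt1" -> "2018", "pt2" -> "0601")))`.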
> DataFrameWriter support insertInto a specific table partition
> -------------------------------------------------------------
>
>                 Key: SPARK-28050
>                 URL: https://issues.apache.org/jira/browse/SPARK-28050
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.3.3, 2.4.3
>            Reporter: Leanken.Lin
>            Priority: Minor
>             Fix For: 2.3.3, 2.4.3
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org