[ https://issues.apache.org/jira/browse/SPARK-40885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zzzzming95 updated SPARK-40885:
-------------------------------
    Description: 
When writing with dynamic partitions and sorting by both the partition column and a data column, Spark drops the sort on the data column.

SQL to reproduce:
{code:java}
CREATE TABLE `sort_table`(
  `id` int,
  `name` string
  )
PARTITIONED BY (
  `dt` string)
stored as textfile
LOCATION 'sort_table';

CREATE TABLE `test_table`(
  `id` int,
  `name` string)
PARTITIONED BY (
  `dt` string)
stored as textfile
LOCATION 'test_table';

-- generate test data
insert into test_table partition(dt=20221011) select 10,"15" union all select 1,"10" union all select 5,"50" union all select 20,"2" union all select 30,"14";

set spark.hadoop.hive.exec.dynamic.partition=true;
set spark.hadoop.hive.exec.dynamic.partition.mode=nonstrict;

insert overwrite table sort_table partition(dt) select id,name,dt from test_table order by name,dt;
{code}

The Sort operator in the DAG has only one sort field, although the SQL specifies two (see the attached image).

This is related to https://issues.apache.org/jira/browse/SPARK-40588

  was:
When writing with dynamic partitions and sorting by both the partition column and a data column, Spark drops the sort on the data column.

SQL to reproduce:
{code:java}
CREATE TABLE `sort_table`(
  `id` int,
  `name` string
  )
PARTITIONED BY (
  `dt` string)
stored as textfile
LOCATION 'sort_table';

CREATE TABLE `test_table`(
  `id` int,
  `name` string)
PARTITIONED BY (
  `dt` string)
stored as textfile
LOCATION 'test_table';

-- generate test data
insert into test_table partition(dt=20221011) select 10,"15" union all select 1,"10" union all select 5,"50" union all select 20,"2" union all select 30,"14";

set spark.hadoop.hive.exec.dynamic.partition=true;
set spark.hadoop.hive.exec.dynamic.partition.mode=nonstrict;

insert overwrite table sort_table partition(dt) select id,name,dt from test_table order by name,dt;
{code}

!image-2022-10-23-11-09-47-759.png!

The Sort operator in the DAG has only one sort field, although the SQL specifies two.
This is related to https://issues.apache.org/jira/browse/SPARK-40588

> Spark will filter out data field sorting when dynamic partitions and data
> fields are sorted at the same time
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-40885
>                 URL: https://issues.apache.org/jira/browse/SPARK-40885
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.2, 3.3.0, 3.2.2
>            Reporter: zzzzming95
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: 1666494504884.jpg

--
This message was sent by Atlassian Jira (v8.20.10#820010)