Anton Okolnychyi created SPARK-42779:
----------------------------------------

             Summary: Allow V2 writes to indicate advisory partition size
                 Key: SPARK-42779
                 URL: https://issues.apache.org/jira/browse/SPARK-42779
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.5.0
            Reporter: Anton Okolnychyi


Data sources may request a particular distribution and ordering of data during
V2 writes. If AQE is enabled, Spark uses the default session advisory partition
size (64MB) as guidance when sizing the resulting shuffle partitions.
Unfortunately, this default can still lead to small files, because columnar
file formats often compress the written data well, so the files end up much
smaller than the shuffle partitions they came from. Spark should allow data
sources to indicate the advisory shuffle partition size, just like it lets data
sources request a particular number of partitions.
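
To make the small-file effect concrete, here is a rough arithmetic sketch. It
is not Spark code. The function names, the 4x compression ratio, and the 128MB
target file size are all assumptions chosen for illustration; real ratios
depend on the data and the file format.

```python
MIB = 1024 * 1024


def estimated_file_size(partition_bytes: int, compression_ratio: float) -> int:
    """Approximate on-disk size of one shuffle partition after it is written.

    compression_ratio is shuffle bytes divided by file bytes (hypothetical input).
    """
    return int(partition_bytes / compression_ratio)


def advisory_partition_size(target_file_bytes: int, compression_ratio: float) -> int:
    """Shuffle partition size a source could ask for to reach a target file size."""
    return int(target_file_bytes * compression_ratio)


default_advisory = 64 * MIB  # session default mentioned in the issue
assumed_ratio = 4.0          # assumption: columnar format shrinks data 4x

# With the default, each written file is only about 16 MiB.
print(estimated_file_size(default_advisory, assumed_ratio) // MIB)

# To get roughly 128 MiB files, the source would want about 512 MiB partitions.
print(advisory_partition_size(128 * MIB, assumed_ratio) // MIB)
```

Under these assumed numbers, a per-write advisory size lets the data source
compensate for compression without changing the session-wide default.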



