[GitHub] [iceberg] cccs-jc opened a new issue, #4889: Support ORDERED BY in CTAS statement

GitBox Sat, 28 May 2022 08:47:34 -0700


cccs-jc opened a new issue, #4889:
URL: https://github.com/apache/iceberg/issues/4889


   The dbt-spark adapter uses CTAS to create tables. 
   https://iceberg.apache.org/docs/latest/spark-ddl/#create-table--as-select
   
   ```sql
   REPLACE TABLE prod.db.sample
   USING iceberg
   PARTITIONED BY (part)
   TBLPROPERTIES ('key'='value')
   AS SELECT ...
   ```
   When the table is partitioned, Iceberg implicitly performs an order by on
the partition columns. However, for some tables you want to partition by day
and also sort by user_id.
   
   You can achieve this on an existing table by setting a write order with
ALTER TABLE ... WRITE ORDERED BY:
   https://iceberg.apache.org/docs/latest/spark-ddl/#alter-table--write-ordered-by
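   
   For reference, a rough sketch of the workaround available today, assuming
the Iceberg Spark SQL extensions are enabled. The column types and the source
table `prod.db.source` are hypothetical and only there to make the example
complete. The table is created empty first, because a write order set after a
CTAS would not apply to the rows the CTAS already wrote:
   
   ```sql
   -- Create the empty table first (column types are assumed for illustration)
   CREATE TABLE prod.db.sample (
     user_id bigint,
     part string
   )
   USING iceberg
   PARTITIONED BY (part)
   TBLPROPERTIES ('key'='value');

   -- Set the table's write order (requires the Iceberg SQL extensions)
   ALTER TABLE prod.db.sample WRITE ORDERED BY part, user_id;

   -- Later writes are expected to pick up the configured order
   -- (prod.db.source is a hypothetical source table)
   INSERT INTO prod.db.sample
   SELECT user_id, part FROM prod.db.source;
   ```
   
   This takes three statements instead of one, and it is not atomic the way a
single REPLACE TABLE ... AS SELECT is, which is why it does not fit the
dbt-spark CTAS flow.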
   
   However, there does not seem to be a way to specify an ORDERED BY clause
when creating the table, i.e. in a CTAS statement. I would need this capability
to implement support for ORDERED BY in dbt-spark:
https://github.com/dbt-labs/dbt-spark/issues/343
   Something like:
   ```sql
   REPLACE TABLE prod.db.sample
   USING iceberg
   PARTITIONED BY (part)
   ORDERED BY part, user_id
   TBLPROPERTIES ('key'='value')
   AS SELECT ...
   ```
   
   
   


