huan233usc opened a new pull request, #2575:
URL: https://github.com/apache/iceberg-rust/pull/2575

   ## Which issue does this PR close?
   
   - Closes #2050
   
   ## What changes are included in this PR?
   
   `CREATE EXTERNAL TABLE ... STORED AS ICEBERG` (via 
`IcebergTableProviderFactory`) previously rejected any `PARTITIONED BY` clause 
outright.
   
   DataFusion's `PARTITIONED BY` grammar accepts only plain column names, so it cannot express Iceberg transforms such as `bucket(16, id)` or `days(ts)` (unlike Spark's native DSv2 grammar). Given that constraint, this PR:
   
   - Stops rejecting `table_partition_cols` in `check_cmd`.
   - Adds `validate_partition_columns`, run after the table is loaded:
     - If the table's default partition spec uses any **non-identity** 
transform, returns a clear `FeatureUnsupported` error naming the offending 
field/transform.
     - Otherwise validates that the declared columns exactly match the identity 
partition columns **in order** (consistent with 
`PartitionSpec::is_compatible_with` and Java's `PartitionSpec.compatibleWith`, 
where field order is significant).
   - Omitting `PARTITIONED BY` keeps the previous behavior: any table, including non-identity partitioned ones, can still be registered for read-only access.
   - A `TODO` is left to support non-identity transforms once DataFusion's 
grammar can express them.
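
The validation rules listed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `Transform`, `PartitionField`, the `String` error type, and the function signature are hypothetical stand-ins rather than the iceberg-rust types, and the sketch assumes that an empty `declared` slice means `PARTITIONED BY` was omitted.

```rust
// Hypothetical stand-ins for illustration only; these are NOT the
// iceberg-rust `Transform` / partition field types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
pub enum Transform {
    Identity,
    Bucket(u32),
    Day,
}

#[derive(Debug, Clone)]
pub struct PartitionField {
    pub source_column: String,
    pub transform: Transform,
}

/// Checks the columns declared in `PARTITIONED BY` against the fields of the
/// table's default partition spec. Errors are plain strings here; the PR
/// description says the real code returns a `FeatureUnsupported` error for
/// non-identity transforms.
pub fn validate_partition_columns(
    declared: &[&str],
    spec_fields: &[PartitionField],
) -> Result<(), String> {
    // Assumption: no declared columns means PARTITIONED BY was omitted,
    // so the table is registered without any partition validation.
    if declared.is_empty() {
        return Ok(());
    }

    // Reject any non-identity transform, naming the offending field.
    if let Some(field) = spec_fields
        .iter()
        .find(|f| f.transform != Transform::Identity)
    {
        return Err(format!(
            "unsupported: partition field '{}' uses non-identity transform {:?}",
            field.source_column, field.transform
        ));
    }

    // All fields are identity: the declared columns must match exactly,
    // and field order is significant.
    let expected: Vec<&str> = spec_fields
        .iter()
        .map(|f| f.source_column.as_str())
        .collect();
    if declared != expected.as_slice() {
        return Err(format!(
            "partition columns {:?} do not match table partition columns {:?}",
            declared, expected
        ));
    }

    Ok(())
}
```

Under these assumptions, a spec with identity fields `event_date, region` accepts `["event_date", "region"]` but rejects `["region", "event_date"]` (wrong order) and `["event_date"]` (subset), mirroring the test cases listed later in this description.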
   
   ### Example
   
   ```sql
   CREATE EXTERNAL TABLE my_iceberg_table
   STORED AS ICEBERG LOCATION '/path/to/metadata.json'
   PARTITIONED BY (event_date);
   ```
   
   ## Are these changes tested?
   
   Yes. Added unit tests in `table_provider_factory.rs` plus two metadata 
fixtures (bucket-partitioned and multi-identity-partitioned):
   
   - single identity column match / mismatch
   - multiple identity columns match / wrong order / subset (count mismatch)
   - non-identity (`bucket[4]`) transform rejected with a clear error
   - non-identity partitioned table still registers when `PARTITIONED BY` is 
omitted
   
   `cargo test -p iceberg-datafusion` and `cargo clippy -p iceberg-datafusion --all-targets` pass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

