JerAguilon opened a new issue, #14914: URL: https://github.com/apache/iceberg/issues/14914
### Query engine

Trino and ClickHouse, though this question is more about the spec itself.

### Question

I've been reading [this document](https://apache.github.io/iceberg/docs/1.4.3/partitioning/#icebergs-hidden-partitioning) and I am unsure what to conclude for this particular use case. I am also seeing divergent behavior across query engines.

Suppose I have a table with a schema of:

```
sale_id: BIGINT
price: BIGINT
date: DATE
```

And it is partitioned on `date`. If we guarantee that each Parquet data file contains data for one and only one `date`, are we allowed to store only `sale_id, price` in the physical files, while recording the `date` values in the manifest layer?

Given that there's a "separation between physical and logical," this seems analogous to a transformation like `bucket`, except that the result is a constant value per file. But the spec doesn't explicitly discuss this.

I have a table with this format. Trino's implementation seems to handle it gracefully, substituting partition values as you'd expect. However, some other engines (https://github.com/ClickHouse/ClickHouse/issues/92732) instead return `NULL` values.

It'd be very helpful to explicitly clarify whether `identity` partitions can also be hidden, in the same way that `bucket` transformations are.
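To make the two behaviors concrete, here is a rough sketch in plain Python. It is not Iceberg, Trino, or ClickHouse code: `read_file`, `fill_from_partition`, and the sample rows are hypothetical names and data made up for illustration. It assumes the data file physically holds only `sale_id` and `price`, and that the manifest entry for the file carries the `date` partition value.

```python
from datetime import date

# Logical table schema from the question above.
SCHEMA = ["sale_id", "price", "date"]


def read_file(rows, partition_values, fill_from_partition=True):
    """Project physical rows onto the logical schema.

    rows: dicts as stored in the data file (hypothetical; `date` is absent).
    partition_values: identity-partition values recorded for this file
        in the manifest layer (hypothetical representation).
    fill_from_partition: if True, a column missing from the file is taken
        from the partition values; if False, it is returned as None.
    """
    projected_rows = []
    for row in rows:
        projected = {}
        for column in SCHEMA:
            if column in row:
                # Column is physically present in the data file.
                projected[column] = row[column]
            elif fill_from_partition and column in partition_values:
                # Column is missing; substitute the per-file constant.
                projected[column] = partition_values[column]
            else:
                # Column is missing and nothing is substituted.
                projected[column] = None
        projected_rows.append(projected)
    return projected_rows


# One data file that covers exactly one date.
physical_rows = [
    {"sale_id": 1, "price": 500},
    {"sale_id": 2, "price": 750},
]
partition = {"date": date(2024, 1, 15)}

# Behavior I see on Trino: the partition value is substituted.
substituted = read_file(physical_rows, partition)

# Behavior I see on the other engine: the missing column reads as NULL.
nulled = read_file(physical_rows, partition, fill_from_partition=False)
```

With `fill_from_partition=True`, every row comes back with the file's `date`; with `False`, `date` is `None` for every row. The question is which of these two the spec requires for `identity` partitions.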
