rdblue commented on pull request #1612:
URL: https://github.com/apache/iceberg/pull/1612#issuecomment-715592301


   I'm not fully convinced that rejecting the Hive `PARTITIONED BY` clause 
is the right way to go, but I think it is a reasonable step to get this patch 
done. We don't need that clause to support the schema DDL, so it would be 
fine with me to throw an exception and reject its use for now.
   
   In the long term, I think we do want Iceberg partitioning to be exposed in 
the normal way for Hive because it would be confusing for a partitioned Iceberg 
table to show up as unpartitioned. That said, there are significant differences 
between the two partitioning approaches:
   1. Iceberg partitioning never changes the table schema, but Hive partition 
columns are always appended at the end of the schema
   2. Hive partition columns can't be changed
   3. Iceberg supports hidden partitions that can't be shown in Hive
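
   As a rough illustration of differences 1 and 3, here is a minimal Python 
sketch. It is a toy model only: `hive_visible_schema`, `iceberg_visible_schema`, 
and `PartitionField` are hypothetical names made up for this example and are 
not part of the Hive or Iceberg APIs.

```python
from dataclasses import dataclass
from datetime import datetime


def hive_visible_schema(data_columns, partition_columns):
    """Toy model of Hive: partition columns are appended after the data columns."""
    return list(data_columns) + list(partition_columns)


@dataclass(frozen=True)
class PartitionField:
    """Toy model of an Iceberg partition field: a transform over a source column."""
    source_column: str
    transform: str  # "identity" or "day" in this sketch

    def apply(self, row):
        value = row[self.source_column]
        if self.transform == "identity":
            return value
        if self.transform == "day":
            # Hidden partition: derived from the timestamp, never a table column.
            return value.date().isoformat()
        raise ValueError(f"unsupported transform: {self.transform}")


def iceberg_visible_schema(data_columns, spec):
    """Toy model of Iceberg: the partition spec does not alter the schema."""
    return list(data_columns)


# Hive: declaring "region" as a partition column forces it to the end.
hive_schema = hive_visible_schema(["id", "ts"], ["region"])

# Iceberg: column order is untouched, and the day(ts) value is not a column.
spec = [PartitionField("region", "identity"), PartitionField("ts", "day")]
iceberg_schema = iceberg_visible_schema(["region", "id", "ts"], spec)

row = {"region": "eu", "id": 1, "ts": datetime(2020, 10, 23, 12, 0)}
partition_values = [field.apply(row) for field in spec]
```

   In this model the `day` value exists only in the partition tuple, which is 
the kind of hidden partition that has no natural Hive partition column.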
   
   The differences may be significant enough that exposing even Iceberg 
identity partitions to Hive would cause problems. For example, if Hive expects 
to get a partition key and fill in data values from it, that would be a problem.
   
   What are the chances of integrating Iceberg into Hive itself and solving 
some of these limitations?

