jackye1995 opened a new issue, #6810:
URL: https://github.com/apache/iceberg/issues/6810

   ### Feature Request / Improvement
   
   cc @RussellSpitzer @pvary @amogh-jahagirdar @rajarshisarkar @singhpk234 
   
   We have discussed that `FileIO` should technically be specific to each 
table, so that all readers and writers of a table use the same implementation. 
Currently, however, it is defined as a catalog property and chosen dynamically 
by the end user.
   
   This issue is solved by the REST catalog, but technically we can achieve 
the same in Glue, Hive, or any catalog that supports table parameters (not the 
Iceberg table properties, but the parameters we use to store 
`table_type=ICEBERG` and `metadata_location`).
   
   The idea is that a user can configure overrides of catalog properties for 
specific tables through the table parameters of the Glue/Hive metastore.
   
   For example, if the Spark session's default FileIO is HadoopFileIO, but I 
want to use S3FileIO for a specific table, I can set catalog properties such as 
`io-impl` and any S3FileIO-related configuration in the table's parameters, 
and then we can update the code to respect those overrides.
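   
   A minimal sketch of how such an override could be resolved, using plain 
maps only. The class `TableOverrides`, its method names, and the rule for 
which keys may be overridden (`io-impl` plus anything starting with `s3.`) are 
assumptions for illustration, not existing Iceberg code:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Iceberg: layers per-table overrides
// stored in metastore table parameters on top of the catalog properties.
final class TableOverrides {

  // Assumption: only FileIO-related keys may be overridden per table, so
  // bookkeeping parameters such as table_type and metadata_location are
  // never copied into the effective catalog properties.
  static boolean isOverridable(String key) {
    return key.equals("io-impl") || key.startsWith("s3.");
  }

  // Returns a new map: catalog properties first, then any overridable
  // table parameters, which win when the same key appears in both.
  static Map<String, String> effectiveProperties(
      Map<String, String> catalogProperties,
      Map<String, String> tableParameters) {
    Map<String, String> merged = new HashMap<>(catalogProperties);
    for (Map.Entry<String, String> entry : tableParameters.entrySet()) {
      if (isOverridable(entry.getKey())) {
        merged.put(entry.getKey(), entry.getValue());
      }
    }
    return merged;
  }
}
```

   With a session default of `io-impl=org.apache.iceberg.hadoop.HadoopFileIO` 
and a table parameter `io-impl=org.apache.iceberg.aws.s3.S3FileIO`, the merged 
map would carry the S3FileIO value for that table only, and the catalog would 
then build the table's FileIO from the merged map instead of the raw catalog 
properties.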
   
   This is technically already happening today in the REST catalog through 
the `config` part of a `LoadTableResponse`: 
https://github.com/apache/iceberg/blob/master/open-api/rest-catalog-open-api.yaml#L1701,
 used in 
https://github.com/apache/iceberg/blob/b6b9972538ffcbae10b7e80e82cc444254d49103/core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java#L310
   
   In the Glue catalog, we are also technically already doing table-specific 
catalog property overrides today, for (1) LakeFormation-related security 
configurations and (2) table-specific S3 tags, because each table requires a 
different credential provider for the same FileIO: 
https://github.com/apache/iceberg/blob/master/aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java#L205-L230
   
   Any thoughts?
   
   
   ### Query engine
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

