kbendick commented on issue #2903:
URL: https://github.com/apache/iceberg/issues/2903#issuecomment-891459560


   @GrigorievNick You have a few options.
   
   You can set `'write.object-storage.enabled'=true` in the table properties, 
which adds a hash component near the beginning of each data file path. This 
still keeps the folder hierarchy, but spreads data files across multiple key 
prefixes in the object store.
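   
   To illustrate the idea of a hash-prefixed path, here is a minimal sketch. 
It is not Iceberg's actual hashing or path layout: the class name 
`HashedPathSketch`, the method `withHashPrefix`, and the use of CRC32 are all 
assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustration only: this is NOT Iceberg's real hash function or layout.
final class HashedPathSketch {
    private HashedPathSketch() {}

    /**
     * Returns dataRoot/<8 hex chars>/<relativePath>. The hex component is a
     * CRC32 of relativePath, so the original hierarchy is kept while files
     * land under many different leading prefixes.
     */
    static String withHashPrefix(String dataRoot, String relativePath) {
        CRC32 crc = new CRC32();
        crc.update(relativePath.getBytes(StandardCharsets.UTF_8));
        String prefix = String.format("%08x", crc.getValue());
        return dataRoot + "/" + prefix + "/" + relativePath;
    }
}
```

   Calling `withHashPrefix("s3://bucket/tbl/data", "day=2021-08-01/00000-0.parquet")` 
yields a path that still ends in `day=2021-08-01/00000-0.parquet` but starts 
under a hash-derived directory.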
   
   For full control over the data layout, you can implement your own 
`LocationProvider` and then set the `write.location-provider.impl` table 
property to its class name. Here is a test that uses a custom location 
provider: 
https://github.com/apache/iceberg/blob/90225d6c9413016d611e2ce5eff37db1bc1b4fc5/core/src/test/java/org/apache/iceberg/TestLocationProvider.java
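   
   A flat-layout provider could look roughly like the sketch below. So that 
it compiles without Iceberg on the classpath, it uses a hypothetical stand-in 
interface (`DataLocationProvider`) that mirrors only the filename-based method; 
a real implementation would implement `org.apache.iceberg.io.LocationProvider` 
instead, which also has a partition-aware overload that a flat layout would 
simply delegate to the filename-only method. The `(String, Map)` constructor 
shape follows the linked test as best I can tell, and the `flat.data-dir` 
property name is made up for this example.

```java
import java.util.Map;

/** Hypothetical stand-in for the filename-only method of Iceberg's LocationProvider. */
interface DataLocationProvider {
    String newDataLocation(String filename);
}

/** Sketch: puts every data file directly under one directory, ignoring partitions. */
final class FlatLocationProvider implements DataLocationProvider {
    private final String dataDir;

    // Assumed constructor shape: table location plus table properties.
    FlatLocationProvider(String tableLocation, Map<String, String> properties) {
        // Drop a trailing slash so paths are not joined with "//".
        String root = tableLocation.endsWith("/")
                ? tableLocation.substring(0, tableLocation.length() - 1)
                : tableLocation;
        // "flat.data-dir" is a made-up property; it defaults to "data".
        this.dataDir = root + "/" + properties.getOrDefault("flat.data-dir", "data");
    }

    @Override
    public String newDataLocation(String filename) {
        // No partition directories: every file goes into the same folder.
        return dataDir + "/" + filename;
    }
}
```

   With the real interface, the class would then be registered by setting 
`write.location-provider.impl` to its fully qualified class name.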
   
   See also the documentation on writing your own file io implementation, 
location provider, or catalog: 
https://iceberg.apache.org/custom-catalog/#custom-file-io-implementation
   
   However, altering the files outside of Iceberg is going to cause issues, as 
the data file paths are tracked in Iceberg's metadata. So I don't know how 
achievable your ultimate goal is. Perhaps others can weigh in, but a custom 
location provider (or more likely a custom FileIO and LocationProvider, which 
are relatively small interfaces) could be used to at least get all of the data 
into a flat folder structure. Coalescing the files after the fact, though, is 
something you'd need to do through some sort of Iceberg API (e.g. maybe read 
in the files to coalesce, create a temporary view of that data, and then use 
`MERGE INTO` or `INSERT` to handle the write), so that the appropriate metadata 
is written.
   
   Hope that helps.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


