rdblue commented on issue #2068: URL: https://github.com/apache/iceberg/issues/2068#issuecomment-780992492
I think that we should only infer partitioning when we know that we are converting from a Hive table with identity-partition columns. Since that is a "well-known" format, it should be somewhat reliable. Random paths in a file system are more risky to parse for table data. I could be convinced otherwise, but it seems like a stretch to match an Iceberg table's partitioning to paths.

+1 to checking file footers before importing. The files should not have IDs in the schemas and we should make sure that the schemas can be converted to something readable using the name mapping.
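To illustrate why the Hive layout is the safer case, here is a minimal Python sketch of reading identity-partition values from a Hive-style path (`column=value` directories under the table root). It is not Iceberg code: the function name `parse_hive_partition`, the example paths, and the choice to reject any directory that is not `column=value` are assumptions made for illustration. It treats paths as plain POSIX-style strings and leaves all values as strings, since typing them would require the table schema.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote

# Directory value Hive writes when the partition column is null.
HIVE_NULL = "__HIVE_DEFAULT_PARTITION__"


def parse_hive_partition(table_root: str, file_path: str) -> dict:
    """Hypothetical helper: map a data file path to its partition values.

    Raises ValueError when the file is not under table_root or when any
    directory between the root and the file is not of the form column=value.
    """
    root = PurePosixPath(table_root)
    # relative_to raises ValueError if file_path is not under table_root.
    relative = PurePosixPath(file_path).relative_to(root)

    values = {}
    # Every part except the last one (the file name) is a directory.
    for segment in relative.parts[:-1]:
        name, sep, raw = segment.partition("=")
        if not sep or not name:
            # Not a Hive-style directory: refuse to guess.
            raise ValueError(f"not a Hive partition directory: {segment!r}")
        if name in values:
            raise ValueError(f"duplicate partition column: {name!r}")
        # Hive percent-encodes special characters in partition values.
        values[name] = None if raw == HIVE_NULL else unquote(raw)
    return values


if __name__ == "__main__":
    print(parse_hive_partition(
        "/warehouse/db/events",
        "/warehouse/db/events/dt=2021-02-17/region=us/part-00000.parquet",
    ))
```

A layout such as `/warehouse/db/events/2021/02/17/part-0.parquet` is rejected by this sketch rather than guessed at, which mirrors the concern about parsing arbitrary paths.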
