guilload commented on issue #170: Add support for Iceberg MR / InputFormat and OutputFormat APIs
URL: https://github.com/apache/incubator-iceberg/issues/170#issuecomment-519238245

Hey, I started working on an MR InputFormat for Iceberg since there’s some overlap with the InputFormat I’m working on for Hive. I have a few questions:

- Thoughts on using `GenericRecord` as data container for both the Input and Output formats?
- `IcebergPigInputFormat` declares a public constructor `IcebergPigInputFormat(Table table)` but I don’t think that’s an option for MR since the `JobSubmitter` class instantiates the input format via reflection without constructor arguments. How do we solve this? Have the user pass the canonical name of the `Catalog` class they want to use along with a table identifier in the job config and then instantiate the catalog via reflection and finally load the table?
- Regarding serialization, `IcebergPigInputFormat` relies on Java serialization via `org.apache.pig.impl.util.ObjectSerializer`, shall I reuse the same approach or use a different serialization mechanism?

Thanks!
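
For illustration, the reflection-based option raised in the second question could look roughly like the sketch below. It is self-contained and does not use the real Iceberg or Hadoop classes: the nested `Table` and `Catalog` interfaces, the `InMemoryCatalog` class, and both config keys are hypothetical stand-ins, and a plain `Map<String, String>` stands in for the job configuration. It also assumes the catalog class has a public no-argument constructor.

```java
import java.util.HashMap;
import java.util.Map;

public class CatalogReflectionSketch {

    // Hypothetical stand-ins for the real Table and Catalog types,
    // reduced to the minimum needed for this sketch to compile alone.
    public interface Table {
        String name();
    }

    public interface Catalog {
        Table loadTable(String identifier);
    }

    // Hypothetical config keys (assumptions, not actual property names).
    public static final String CATALOG_CLASS_KEY = "iceberg.mr.catalog.class";
    public static final String TABLE_ID_KEY = "iceberg.mr.table.identifier";

    // Example catalog; reflection below requires the no-arg constructor.
    public static class InMemoryCatalog implements Catalog {
        public InMemoryCatalog() {}

        @Override
        public Table loadTable(String identifier) {
            // Return a trivial table that only remembers its identifier.
            return () -> identifier;
        }
    }

    // Reads the catalog class name and table identifier from the config,
    // instantiates the catalog reflectively, and loads the table.
    public static Table loadTable(Map<String, String> conf) {
        String className = conf.get(CATALOG_CLASS_KEY);
        String tableId = conf.get(TABLE_ID_KEY);
        if (className == null || tableId == null) {
            throw new IllegalArgumentException(
                "Both " + CATALOG_CLASS_KEY + " and " + TABLE_ID_KEY + " must be set");
        }
        try {
            Class<? extends Catalog> clazz =
                Class.forName(className).asSubclass(Catalog.class);
            Catalog catalog = clazz.getDeclaredConstructor().newInstance();
            return catalog.loadTable(tableId);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate catalog " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(CATALOG_CLASS_KEY, InMemoryCatalog.class.getName());
        conf.put(TABLE_ID_KEY, "db.events");
        System.out.println(loadTable(conf).name()); // prints "db.events"
    }
}
```

In an actual input format, the same lookup would presumably run wherever the job configuration is available (for example when computing splits), so the format class itself can keep a no-argument constructor.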
