guilload commented on issue #170: Add support for Iceberg MR / InputFormat and 
OutputFormat APIs
URL: 
https://github.com/apache/incubator-iceberg/issues/170#issuecomment-519238245
 
 
   Hey,
   
   I started working on an MR InputFormat for Iceberg since there’s some 
overlap with the InputFormat I’m working on for Hive. I have a few questions:
   
   – Thoughts on using `GenericRecord` as the data container for both the input 
and output formats?
   
   – `IcebergPigInputFormat` declares a public constructor 
`IcebergPigInputFormat(Table table)`, but I don’t think that’s an option for MR 
since the `JobSubmitter` class instantiates the input format via reflection 
without constructor arguments. How do we solve this? Should the user pass the 
canonical name of the `Catalog` class they want to use, along with a table 
identifier, in the job config, so that we can instantiate the catalog via 
reflection and then load the table?
   
   – Regarding serialization, `IcebergPigInputFormat` relies on Java 
serialization via `org.apache.pig.impl.util.ObjectSerializer`. Should I reuse 
that approach or use a different serialization mechanism?
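
To make the second question concrete, here is a minimal sketch of the reflection-based lookup, under stated assumptions. The `Table` and `Catalog` interfaces below are simplified stand-ins rather than the real Iceberg types, a plain `Map` stands in for the Hadoop job configuration, and the two config keys are hypothetical names made up for illustration. The one hard requirement it illustrates is that the catalog class needs a public no-argument constructor.

```java
import java.util.HashMap;
import java.util.Map;

public class ReflectiveTableLoading {

  // Simplified stand-ins for illustration only; NOT the real Iceberg interfaces.
  public interface Table {
    String name();
  }

  public interface Catalog {
    Table loadTable(String identifier);
  }

  // Example catalog. Reflection needs the public no-argument constructor.
  public static class InMemoryCatalog implements Catalog {
    public InMemoryCatalog() {}

    @Override
    public Table loadTable(String identifier) {
      // A real catalog would look up table metadata; this one echoes the identifier.
      return () -> identifier;
    }
  }

  // Hypothetical config keys, chosen for this sketch.
  public static final String CATALOG_CLASS = "iceberg.catalog.class";
  public static final String TABLE_IDENTIFIER = "iceberg.table.identifier";

  // Reads both settings from the config, instantiates the catalog
  // reflectively, and asks it for the table.
  public static Table loadTable(Map<String, String> conf) {
    String className = conf.get(CATALOG_CLASS);
    String identifier = conf.get(TABLE_IDENTIFIER);
    if (className == null || identifier == null) {
      throw new IllegalArgumentException(
          "Both " + CATALOG_CLASS + " and " + TABLE_IDENTIFIER + " must be set");
    }
    try {
      Catalog catalog = Class.forName(className)
          .asSubclass(Catalog.class)   // fails fast if the class is not a Catalog
          .getDeclaredConstructor()    // the no-argument constructor
          .newInstance();
      return catalog.loadTable(identifier);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate catalog " + className, e);
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // Class.forName expects the binary name (getName()), which differs from
    // the canonical name for nested classes such as this one.
    conf.put(CATALOG_CLASS, InMemoryCatalog.class.getName());
    conf.put(TABLE_IDENTIFIER, "db.events");
    System.out.println(loadTable(conf).name()); // prints "db.events"
  }
}
```

With this shape the input format itself can keep a no-argument constructor, and everything it needs to find the table travels as strings in the job config.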
   
   Thanks!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
