Dmitriy V. Ryaboy commented on PIG-966:

LoadFunc has a method called determineSchema, not getSchema. This implies some 
sort of introspection, so I can see interpreting this as "if you are looking at 
the data, use determineSchema, and if you have a metadata store/repo then 
implement LoadMetadata". 
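
As a rough illustration of that reading, here is a minimal sketch of the two paths side by side. The interfaces below are simplified, hypothetical stand-ins and not Pig's actual LoadFunc or LoadMetadata signatures; the "name:type" header format, the "bytearray" fallback, and the in-memory catalog are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SchemaSourceSketch {

    // Hypothetical stand-in for the introspection path: look at the data.
    interface IntrospectingLoader {
        Map<String, String> determineSchema(String headerLine);
    }

    // Hypothetical stand-in for the metadata path: ask a store or repository.
    interface MetadataBackedLoader {
        Map<String, String> getSchema(String location);
    }

    // Reads field names and types from a "name:type, name:type" header line.
    static class HeaderLoader implements IntrospectingLoader {
        @Override
        public Map<String, String> determineSchema(String headerLine) {
            Map<String, String> fields = new LinkedHashMap<>();
            for (String part : headerLine.split(",")) {
                String[] nameAndType = part.trim().split(":");
                // Untyped fields fall back to "bytearray" (an assumption of this sketch).
                fields.put(nameAndType[0], nameAndType.length > 1 ? nameAndType[1] : "bytearray");
            }
            return fields;
        }
    }

    // Looks the schema up in an in-memory map that stands in for a metadata repository.
    static class CatalogLoader implements MetadataBackedLoader {
        private final Map<String, Map<String, String>> catalog;

        CatalogLoader(Map<String, Map<String, String>> catalog) {
            this.catalog = catalog;
        }

        @Override
        public Map<String, String> getSchema(String location) {
            return catalog.get(location); // null when the store has no entry
        }
    }

    public static void main(String[] args) {
        Map<String, String> inferred =
                new HeaderLoader().determineSchema("id:int, name:chararray, raw");
        System.out.println("introspected: " + inferred);

        Map<String, Map<String, String>> catalog = new LinkedHashMap<>();
        catalog.put("/data/users", inferred);
        System.out.println("from store:   "
                + new CatalogLoader(catalog).getSchema("/data/users"));
    }
}
```

The point of the sketch is only that both paths end in a schema, but one derives it from the data and the other from an external source, so keeping them on separate interfaces is at least defensible.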

But I agree this is clunky and potentially confusing. 

I am of two minds about this. On one hand, moving the method makes sense, as it's 
metadata-related. On the other hand, it makes implementations that work with 
self-describing formats like Avro implement a heavy-looking interface, and 
requires further changes to existing LoadFunc implementations that will have to 
be ported. 

Another issue is that LoadMetadata.getSchema() returns a ResourceSchema, 
whereas LoadFunc.determineSchema() returns Pig's Schema. The two are compatible 
(I have a translation from one to the other in PIG-760), but not the same. 
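
To make "compatible, but not the same" concrete, here is a small sketch of what such a translation might look like. It is not the PIG-760 code: Field and ResourceField are hypothetical, simplified stand-ins for the two schema representations, and the extra description attribute on the resource side is an assumption chosen to show where a round trip can lose information.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SchemaTranslationSketch {

    // Hypothetical stand-in for a Schema-style field: an alias and a type name.
    static final class Field {
        final String alias;
        final String type;

        Field(String alias, String type) {
            this.alias = alias;
            this.type = type;
        }
    }

    // Hypothetical stand-in for a ResourceSchema-style field: it carries an
    // extra description that the Schema-style field has no place for.
    static final class ResourceField {
        final String name;
        final String type;
        final String description;

        ResourceField(String name, String type, String description) {
            this.name = name;
            this.type = type;
            this.description = description;
        }
    }

    // Schema-style to resource-style: nothing is lost, description stays empty.
    static List<ResourceField> toResource(List<Field> schema) {
        List<ResourceField> out = new ArrayList<>();
        for (Field f : schema) {
            out.add(new ResourceField(f.alias, f.type, null));
        }
        return out;
    }

    // Resource-style to Schema-style: the description is dropped.
    static List<Field> fromResource(List<ResourceField> resourceSchema) {
        List<Field> out = new ArrayList<>();
        for (ResourceField rf : resourceSchema) {
            out.add(new Field(rf.name, rf.type));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Field> schema = Arrays.asList(
                new Field("id", "int"), new Field("name", "chararray"));
        List<Field> roundTrip = fromResource(toResource(schema));
        for (Field f : roundTrip) {
            System.out.println(f.alias + ": " + f.type);
        }
    }
}
```

Under these assumptions, going from the Schema-style form to the resource-style form and back is lossless, while starting from the resource side is not, which is one way the two can be compatible without being interchangeable.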

> Proposed rework for LoadFunc, StoreFunc, and Slice/r interfaces
> ---------------------------------------------------------------
>                 Key: PIG-966
>                 URL: https://issues.apache.org/jira/browse/PIG-966
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Alan Gates
>            Assignee: Alan Gates
> I propose that we rework the LoadFunc, StoreFunc, and Slice/r interfaces 
> significantly.  See http://wiki.apache.org/pig/LoadStoreRedesignProposal for 
> full details
