Pradeep Kamath commented on PIG-966:

I wanted to get others' thoughts on having the getSchema() method in 
LoadMetadata vs. LoadFunc:

 * getSchema() is inherently a metadata query and hence LoadMetadata seems the 
right category for it.

 * LoadMetadata has other methods (getStatistics, getPartitionKeys, 
setPartitionFilter) which are more relevant for SQL-like data stores. Loaders 
for self-describing data, such as BinStorage or, say, a JSON loader, would want 
to implement getSchema() since they can provide a schema, but would need to 
return nulls for these other methods. I am wondering whether this would make 
users question whether they are implementing the wrong interface, since they 
return nulls for most methods, or implementing the one method incorrectly, 
i.e. whether LoadMetadata is meant to be implemented only for SQL-like data 
stores.
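
To make the second point concrete, here is a rough sketch of what a loader for self-describing data might end up looking like if getSchema() stays in LoadMetadata. The types and signatures below (Schema, Statistics, the String location parameter, JsonLoaderSketch) are simplified, hypothetical stand-ins for illustration only; they are not the actual interfaces from the proposal.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins; NOT the real Pig types or signatures.
final class Schema {
    final List<String> fieldNames;
    Schema(String... names) { this.fieldNames = Arrays.asList(names); }
}

final class Statistics { }

// Assumed shape of the metadata interface, using the method names from the comment.
interface LoadMetadata {
    Schema getSchema(String location);
    Statistics getStatistics(String location);
    String[] getPartitionKeys(String location);
    void setPartitionFilter(String filterExpression);
}

// A loader for self-describing data: only getSchema() has a meaningful answer.
class JsonLoaderSketch implements LoadMetadata {
    @Override
    public Schema getSchema(String location) {
        // Pretend these field names were discovered from the data itself.
        return new Schema("name", "age");
    }

    @Override
    public Statistics getStatistics(String location) {
        return null; // nothing to report for plain files
    }

    @Override
    public String[] getPartitionKeys(String location) {
        return null; // no notion of partitions here
    }

    @Override
    public void setPartitionFilter(String filterExpression) {
        // no-op: there are no partitions to filter
    }
}
```

Three of the four methods are placeholders, which is the situation that might lead an implementer to wonder whether LoadMetadata is the right interface for this kind of loader.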

If we feel the con outweighs the pro, should we move getSchema() into LoadFunc?

> Proposed rework for LoadFunc, StoreFunc, and Slice/r interfaces
> ---------------------------------------------------------------
>                 Key: PIG-966
>                 URL: https://issues.apache.org/jira/browse/PIG-966
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Alan Gates
>            Assignee: Alan Gates
> I propose that we rework the LoadFunc, StoreFunc, and Slice/r interfaces 
> significantly.  See http://wiki.apache.org/pig/LoadStoreRedesignProposal for 
> full details

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
