[
https://issues.apache.org/jira/browse/PIG-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780873#action_12780873
]
Pradeep Kamath commented on PIG-966:
------------------------------------
I wanted to check on others' thoughts on having the getSchema() method in
LoadMetadata vs. LoadFunc.
Pros:
* getSchema() is inherently a metadata query and hence LoadMetadata seems the
right category for it.
Cons:
* LoadMetadata has other methods (getStatistics, getPartitionKeys,
setPartitionFilter) which are more relevant for SQL-like data stores. Loaders
for self-describing data, such as BinStorage or a JSON loader, would want to
implement getSchema() since they can provide a schema, but would need to return
nulls for these other methods. I am wondering whether this would make users
think they are implementing the wrong interface (since they return nulls for
most methods) or implementing the one method incorrectly, i.e. whether
LoadMetadata is meant to be implemented only for SQL-like data stores.
If we feel the con outweighs the pro, should we move getSchema() into LoadFunc?
Thoughts?
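
To make the con concrete, here is a rough sketch of what a loader for self-describing data might look like if getSchema() stays in LoadMetadata. This is only an illustration: LoadMetadataSketch and JsonLoaderSketch are hypothetical names, and the parameter and return types are simplified assumptions (plain strings), not the signatures from the proposal. It also assumes setPartitionFilter returns nothing, so it becomes a no-op rather than a null return.

```java
import java.util.List;

// Hypothetical, simplified stand-in for the proposed LoadMetadata interface.
// Method names follow the discussion above; the parameter and return types
// are assumptions chosen to keep the sketch self-contained.
interface LoadMetadataSketch {
    String getSchema(String location);
    String getStatistics(String location);
    List<String> getPartitionKeys(String location);
    void setPartitionFilter(String filterExpression);
}

// Hypothetical loader for self-describing data (for example, JSON).
class JsonLoaderSketch implements LoadMetadataSketch {
    @Override
    public String getSchema(String location) {
        // A real loader would derive the schema from the data itself;
        // a fixed value stands in for that here.
        return "name:chararray, age:int";
    }

    @Override
    public String getStatistics(String location) {
        return null; // no statistics to offer
    }

    @Override
    public List<String> getPartitionKeys(String location) {
        return null; // the data is not partitioned
    }

    @Override
    public void setPartitionFilter(String filterExpression) {
        // nothing to do: there are no partitions to filter
    }
}
```

Only one of the four methods does real work here, which is the situation that might make an implementer doubt they picked the right interface.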
> Proposed rework for LoadFunc, StoreFunc, and Slice/r interfaces
> ---------------------------------------------------------------
>
> Key: PIG-966
> URL: https://issues.apache.org/jira/browse/PIG-966
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Reporter: Alan Gates
> Assignee: Alan Gates
>
> I propose that we rework the LoadFunc, StoreFunc, and Slice/r interfaces
> significantly. See http://wiki.apache.org/pig/LoadStoreRedesignProposal for
> full details