Pradeep Kamath updated PIG-652:
Fix Version/s: types_branch
Assignee: Pradeep Kamath (was: Alan Gates)
Affects Version/s: types_branch
Hadoop Flags: [Incompatible change]
Status: Patch Available (was: Open)
Submitting a patch with a few changes to the way this will work. Very soon we
will have the ability to store multiple outputs in the map or reduce phase of a
job (https://issues.apache.org/jira/browse/PIG-627). In that scenario the
OutputFormat will still need to be able to get a handle on the corresponding
StoreFunc, location and schema to use for the particular output that it is
trying to write. To enable this, a utility class, MapRedUtil, is being
introduced with static methods that take a JobConf and return these
pieces of information. When PIG-627 is implemented, these utility methods will
hide Pig's internal mapping of the multiple stores to the
corresponding StoreFunc, location and schema.
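As a rough illustration of the static-accessor idea described above, here is a minimal sketch. It is not the code in PIG-652.patch: java.util.Properties stands in for Hadoop's JobConf so the example is self-contained, and the class name, method names and configuration keys are all hypothetical.

```java
import java.util.Properties;

// Hypothetical sketch of a MapRedUtil-style helper. Properties is a stand-in
// for Hadoop's JobConf; the key names are invented for illustration only.
final class MapRedUtilSketch {
    static final String STORE_FUNC_KEY = "sketch.pig.store.func";
    static final String LOCATION_KEY = "sketch.pig.store.location";
    static final String SCHEMA_KEY = "sketch.pig.store.schema";

    private MapRedUtilSketch() {
        // static helpers only, no instances
    }

    // Returns the StoreFunc specification recorded for this output.
    static String getStoreFuncSpec(Properties conf) {
        return required(conf, STORE_FUNC_KEY);
    }

    // Returns the location this output should be written to.
    static String getStoreLocation(Properties conf) {
        return required(conf, LOCATION_KEY);
    }

    // Returns the schema (as a string) of the data being stored.
    static String getStoreSchema(Properties conf) {
        return required(conf, SCHEMA_KEY);
    }

    // Fails loudly if the job configuration lacks an expected entry.
    private static String required(Properties conf, String key) {
        String value = conf.getProperty(key);
        if (value == null) {
            throw new IllegalStateException("Missing job configuration entry: " + key);
        }
        return value;
    }
}
```

An OutputFormat written against helpers like these only ever sees the job configuration, so the way the values are stored behind the accessors could change later (for example, to support multiple stores) without changing the callers.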
The new method in StoreFunc proposed at the beginning of this issue will still
be used to ask the StoreFunc if it will provide an OutputFormat implementation.
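The "ask the StoreFunc" step could look roughly like the sketch below. All type and method names here are hypothetical stand-ins rather than Pig's or Hadoop's actual interfaces, and the method proposed in this issue may have a different name and signature.

```java
import java.util.Optional;

// Stand-in for Hadoop's OutputFormat; hypothetical, for illustration only.
interface OutputFormatSketch {}

// Stand-in for Pig's StoreFunc, showing only the hypothetical new method.
interface StoreFuncSketch {
    // A store function that does not care about the OutputFormat keeps the default.
    default Optional<Class<? extends OutputFormatSketch>> outputFormatClass() {
        return Optional.empty();
    }
}

// The OutputFormat used when the store function supplies none.
final class DefaultOutputFormatSketch implements OutputFormatSketch {}

// A custom OutputFormat, e.g. one that prepares objects a table store needs.
final class TableOutputFormatSketch implements OutputFormatSketch {}

// A store function that supplies its own OutputFormat.
final class TableStoreFuncSketch implements StoreFuncSketch {
    @Override
    public Optional<Class<? extends OutputFormatSketch>> outputFormatClass() {
        return Optional.of(TableOutputFormatSketch.class);
    }
}

final class OutputFormatChooserSketch {
    private OutputFormatChooserSketch() {}

    // Prefers the store function's OutputFormat and falls back to the default.
    static Class<? extends OutputFormatSketch> choose(StoreFuncSketch storeFunc) {
        return storeFunc.outputFormatClass().orElse(DefaultOutputFormatSketch.class);
    }
}
```

With this shape, existing store functions need no changes, and only those that need a custom OutputFormat override the method.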
> Need to give user control of OutputFormat
> Key: PIG-652
> URL: https://issues.apache.org/jira/browse/PIG-652
> Project: Pig
> Issue Type: New Feature
> Components: impl
> Affects Versions: types_branch
> Reporter: Alan Gates
> Assignee: Pradeep Kamath
> Fix For: types_branch
> Attachments: PIG-652.patch
> Pig currently allows users some control over InputFormat via the Slicer and
> Slice interfaces. It does not allow any control over OutputFormat and
> RecordWriter interfaces. It just allows the user to implement a storage
> function that controls how the data is serialized. For Hadoop tables, we
> will need to allow custom OutputFormats that prepare output information and
> objects needed by a Table store function.