Raghu Angadi commented on PIG-833:
I will try to get some initial docs attached to this jira asap. I think the
current plan is to have proper wiki pages (also attached here). This is part of
the reason why we would like to keep this jira open.
A bulk initial dump is certainly not desirable, but it has been fairly common
for many contrib projects in Hadoop. The rush to get this committed to contrib
is in part to avoid such large changes happening again. The longer we delay,
the larger the patch is going to get. We want to get the subsequent patches and
discussions onto the public jira asap, and we are already doing that.
I would like to clarify that this is not a PIG feature but rather a contrib
project. We would not want this commit to be treated as a precedent for PIG
commits in general. All the responsibility is with the Zebra team. This patch
is the initial version. It does include many tests.
> Storage access layer
> Key: PIG-833
> URL: https://issues.apache.org/jira/browse/PIG-833
> Project: Pig
> Issue Type: New Feature
> Reporter: Jay Tang
> Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch,
> PIG-833-zebra.patch.bz2, PIG-833-zebra.patch.bz2,
> TEST-org.apache.hadoop.zebra.pig.TestCheckin1.txt, test.out, zebra-javadoc.tgz
> A layer is needed to provide a high level data access abstraction and a
> tabular view of data in Hadoop, and could free Pig users from implementing
> their own data storage/retrieval code. This layer should also include a
> columnar storage format in order to provide fast data projection,
> CPU/space-efficient data serialization, and a schema language to manage
> physical storage metadata. Eventually it could also support predicate
> pushdown for further performance improvement. Initially, this layer could be
> a contrib project in Pig and become a Hadoop subproject later on.