[
https://issues.apache.org/jira/browse/CASSANDRA-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254712#comment-13254712
]
Brandon Williams edited comment on CASSANDRA-3909 at 4/16/12 2:17 PM:
----------------------------------------------------------------------
CASSANDRA-3264 (and subsequently CASSANDRA-3883) added wide row support to
Hadoop by returning one column of the row per call. Pig, however, is
fancy enough that it could handle a wide row in a bag, since bags spill to
disk; it only needs the pagination for transport, since Thrift doesn't stream.
Also, if we returned what CFIF (ColumnFamilyInputFormat) gave us, a user
wanting to work within the row would need another costly M/R job to join the
row back to its original state, so we essentially need to 'undo' the
pagination and rebuild the row as a bag. This patch does that, with the caveat
that you cannot access any indexes (and frankly, if you have indexes on a wide
row you're probably doing something wrong), since it's impossible for us to
order the indexes correctly ahead of time in a wide row.
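
As a rough illustration of 'undoing' the pagination, the sketch below regroups
a stream of one-column-per-call records back into one bag per row. It is not
the code from the attached patch: the record shape (row_key, column_name,
value), the function name rebuild_rows, and the sample data are all
hypothetical, and it assumes that every record for a given row key arrives
consecutively.

```python
from itertools import groupby
from operator import itemgetter


def rebuild_rows(paged_records):
    """Yield (row_key, bag) pairs from (row_key, column_name, value) records.

    Assumption: all records for one row key arrive back to back, as they
    would when a single wide row is paged through in order.
    """
    for row_key, group in groupby(paged_records, key=itemgetter(0)):
        # The "bag" is modelled here as a plain list of (column_name, value).
        yield row_key, [(name, value) for _, name, value in group]


# Hypothetical input: one column per record, the way a paged reader hands
# back a wide row.
records = [
    ("row1", "col_a", 1),
    ("row1", "col_b", 2),
    ("row2", "col_a", 3),
]
rows = list(rebuild_rows(records))
```

Grouping only adjacent records keeps memory bounded by a single row at a
time, which fits the point above that the bag itself can spill to disk.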
> Pig should handle wide rows
> ---------------------------
>
> Key: CASSANDRA-3909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3909
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Reporter: Brandon Williams
> Assignee: Brandon Williams
> Fix For: 1.1.1
>
> Attachments: 3909.txt
>
>
> Pig should be able to use the wide row support in CFIF.