[ https://issues.apache.org/jira/browse/PIG-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994320#comment-12994320 ]
Robbie Strickland commented on PIG-1849:
----------------------------------------
I am using the input format directly, with this sample data:
(6B108476-1C40-4847-A1B0-9DA4B0B0BF83,
 {(12345,{(TestColumn,This is a test),(TestColumn2,This is a test 2)}),
  (12346,{(TestColumn1,This is a test 1),(TestColumn2,This is a test 2)})})
and this load statement:
rows = LOAD 'cassandra://E3/StreamByProfile' USING CassandraStorage() AS
    (objectid,
     scolumns: bag {ST: tuple(timestamp,
                              columns: bag {T: tuple(name:chararray, value)})});
I have tried a number of different schemas, but all produce effectively the
same result. None of them produces an error, but when you reference an
individual item in a bag you still get the full bag back, even though the
syntax is accepted. Attempts to flatten show the same problem.
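As a rough sketch of the kind of access described above (not the exact script
used): the relation names supers, cols and stamps are illustrative only, and
positional references are used so that nothing depends on the field names
produced by FLATTEN. With the schema from the load statement, one would
normally expect statements like these to un-nest or project the bags:

```
-- Un-nest the outer bag: one record per (objectid, timestamp, columns)
supers = FOREACH rows GENERATE objectid, FLATTEN(scolumns);

-- Un-nest the inner bag: one record per (objectid, timestamp, name, value)
cols = FOREACH supers GENERATE $0, $1, FLATTEN($2);

-- Project a single field out of the outer bag (a bag of timestamps per row)
stamps = FOREACH rows GENERATE objectid, scolumns.timestamp;

DUMP cols;
```

With the behaviour reported here, the output would still contain the full
bags rather than the individual name and value fields.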
> Pig cannot dereference Cassandra subcolumns in a Super Column Family
> --------------------------------------------------------------------
>
> Key: PIG-1849
> URL: https://issues.apache.org/jira/browse/PIG-1849
> Project: Pig
> Issue Type: Bug
> Components: data
> Affects Versions: 0.8.0
> Environment: Ubuntu 10, Cassandra 0.7, Hadoop
> Reporter: Robbie Strickland
> Labels: cassandra
>
> When using the ColumnFamilyInputFormat to load data from a Cassandra Super
> Column Family, the subcolumns always return in a bag where individual values
> cannot be dereferenced, no matter what schema is used. Flattening does not
> solve the issue.