Re: CqlStorage creates wrong schema for Pig

Miguel Angel Martin junquera Fri, 30 Aug 2013 01:03:09 -0700

I tried this:

rows = LOAD 'cql://keyspace1/test?page_size=1&split_size=4&where_clause=age%3D30' USING CqlStorage();
dump rows;
ILLUSTRATE rows;
describe rows;

values2 = FOREACH rows GENERATE TOTUPLE(id) AS (mycolumn:tuple(name,value));
dump values2;
describe values2;

But I get these results:



-------------------------------------------------------------
| rows     | id:chararray   | age:int   | title:chararray   |
-------------------------------------------------------------
|          | (id, 6)        | (age, 30) | (title, QA)       |
-------------------------------------------------------------

rows: {id: chararray,age: int,title: chararray}
2013-08-30 09:54:37,831 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1031: Incompatable field schema: left is
"tuple_0:tuple(mycolumn:tuple(name:bytearray,value:bytearray))", right is
"org.apache.pig.builtin.totuple_id_1:tuple(id:chararray)"

Or, without the AS clause:

...
values2 = FOREACH rows GENERATE TOTUPLE(id);
dump values2;
describe values2;

and the results are:


...
(((id,6)))
(((id,5)))
values2: {org.apache.pig.builtin.totuple_id_8: (id: chararray)}



Aggg!!!!!
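
A minimal Python sketch of why the dumped output nests three levels deep. It assumes, as the ILLUSTRATE output above suggests, that each field arrives as a (name, value) pair rather than as a bare value. The helpers `totuple` and `render` are hypothetical stand-ins written for this illustration; they are not Pig or Cassandra APIs.

```python
# Row as shown by ILLUSTRATE above: every field is a (name, value) pair,
# although DESCRIBE reports plain chararray/int fields.
row = (("id", "6"), ("age", 30), ("title", "QA"))


def totuple(*fields):
    """Hypothetical stand-in for Pig's TOTUPLE: wrap the fields in one tuple."""
    return tuple(fields)


def render(value):
    """Format nested tuples in the compact style DUMP uses, e.g. (a,b)."""
    if isinstance(value, tuple):
        return "(" + ",".join(render(v) for v in value) + ")"
    return str(value)


id_field = row[0]                   # ("id", "6"), not the bare string "6"
output_row = (totuple(id_field),)   # FOREACH emits a row with one field
print(render(output_row))           # (((id,6)))
```

If the field really were a bare chararray, as the declared schema claims, the same steps would print `((6))`; the extra level comes from the (name, value) pair.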

Miguel Angel Martín Junquera
Analyst Engineer.
miguelangel.mar...@brainsins.com



2013/8/26 Miguel Angel Martin junquera <mianmarjun.mailingl...@gmail.com>

> Hi Chad,
>
> I have this issue too.
>
> I sent a mail to the pig-user list, but I still cannot resolve this and
> cannot access the column values.
> In that mail I describe some things I tried without success, along with
> more information about this issue.
>
>
>
> http://mail-archives.apache.org/mod_mbox/pig-user/201308.mbox/%3ccajeg_hq9s2po3_xytzx5xki4j1mao8q26jydg2wndy_kyiv...@mail.gmail.com%3E
>
>
>
> I hope someone replies with a comment, idea, or solution for this issue or
> bug.
>
>
> I have reviewed the CqlStorage class in the Cassandra 1.2.8 code, but I have
> not set up an environment to debug and trace this issue.
>
> I only found some comments like this one, which I do not understand:
>
>
> /**
>  * A LoadStoreFunc for retrieving data from and storing data to Cassandra
>  *
>  * A row from a standard CF will be returned as nested tuples:
>  * (((key1, value1), (key2, value2)), ((name1, val1), (name2, val2))).
>  */
>
>
> If you find an idea or a solution, please post it.
>
> Thanks
>
>
> 2013/8/23 Chad Johnston <cjohns...@megatome.com>
>
>> (I'm using Cassandra 1.2.8 and Pig 0.11.1)
>>
>> I'm loading some simple data from Cassandra into Pig using CqlStorage.
>> The CqlStorage loader defines a Pig schema based on the Cassandra schema,
>> but it seems to be wrong.
>>
>> If I do:
>>
>> data = LOAD 'cql://bookdata/books' USING CqlStorage();
>> DESCRIBE data;
>>
>> I get this:
>>
>> data: {isbn: chararray,bookauthor: chararray,booktitle:
>> chararray,publisher: chararray,yearofpublication: int}
>>
>> However, if I DUMP data, I get results like these:
>>
>> ((isbn,0425093387),(bookauthor,Georgette Heyer),(booktitle,Death in the
>> Stocks),(publisher,Berkley Pub Group),(yearofpublication,1986))
>>
>> Clearly the results from Cassandra are key/value pairs, as would be
>> expected. I don't know why the schema generated by CqlStorage() would be so
>> different.
>>
>> This is really causing me problems trying to access the column values. I
>> tried a naive approach of FLATTENing each tuple, then trying to access the
>> values that way:
>>
>> flattened = FOREACH data GENERATE
>>   FLATTEN(isbn),
>>   FLATTEN(booktitle),
>>   ...
>> values = FOREACH flattened GENERATE
>>   $1 AS ISBN,
>>   $3 AS BookTitle,
>>   ...
>>
>> As soon as I try to access field $5, Pig complains about the index being
>> out of bounds.
>>
>> Is there a way to solve the schema/reality mismatch? Am I doing something
>> wrong, or have I stumbled across a defect?
>>
>> Thanks,
>> Chad
>>
>
>
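
The mismatch Chad describes can be modeled outside Pig. The Python sketch below is illustrative only: it takes the field list printed by DESCRIBE and the row printed by DUMP in his message, and `flatten` and `values_only` are hypothetical helpers, not part of Pig or CqlStorage. One possible reading of the out-of-bounds error, not confirmed in this thread, is that positions are checked against the five declared fields while the flattened data actually holds ten items.

```python
# Field names as printed by DESCRIBE in the quoted message.
declared_schema = ["isbn", "bookauthor", "booktitle", "publisher",
                   "yearofpublication"]

# One row as printed by DUMP: each column is a (name, value) pair.
row = (
    ("isbn", "0425093387"),
    ("bookauthor", "Georgette Heyer"),
    ("booktitle", "Death in the Stocks"),
    ("publisher", "Berkley Pub Group"),
    ("yearofpublication", 1986),
)


def flatten(pairs):
    """Unnest every (name, value) pair into consecutive positions."""
    return tuple(item for pair in pairs for item in pair)


def values_only(pairs):
    """Keep only the values, keyed by column name."""
    return {name: value for name, value in pairs}


flattened = flatten(row)

# The schema promises 5 fields, but flattening the real data yields 10:
# names sit at even positions, values at odd positions.
print(len(declared_schema), len(flattened))
print(flattened[1], "|", flattened[5])
print(values_only(row)["booktitle"])
```

In this model position 5 exists in the data (it holds the book title) but lies past the end of a five-field schema, which matches the symptom reported above.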
