[ 
https://issues.apache.org/jira/browse/PIG-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-2837:
-------------------------------

    Status: Patch Available  (was: Open)

It seems that AvroStorage does not support recursive records or generic unions:

{quote}
1. Limited support for "record": we do not support recursively defined record 
because the number of fields in such records is data dependent.
2. Limited support for "union": we only accept nullable union like ["null", 
"some-type"].
{quote}
https://cwiki.apache.org/PIG/avrostorage.html

AvroStorage checks both limitations and throws an exception when either is 
violated; however, since #2 is checked before #1, we end up with a stack 
overflow if the schema is recursive, because the union check 
(containsGenericUnion) keeps recursing into the schema and never terminates. 
This can be avoided by changing the order of the checks so that AvroStorage 
fails fast if the schema is recursive.
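
To illustrate why the order matters, here is a minimal sketch. It is not the 
attached patch and does not use the real Avro or AvroStorageUtils classes: 
Node, Kind, linkedListSchema and validate are hypothetical stand-ins, and the 
rule for an acceptable nullable union is an assumption based on the wiki text 
quoted above. The recursion check tracks the records on the current path, so 
it terminates on a cyclic schema; the union check is a plain traversal and is 
only safe once recursion has been ruled out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SchemaCheckOrder {

    enum Kind { RECORD, UNION, NULL, PRIMITIVE }

    /** Hypothetical, simplified stand-in for an Avro schema node. */
    static final class Node {
        final Kind kind;
        final String name;                            // only meaningful for records
        final List<Node> children = new ArrayList<>(); // record fields or union branches
        Node(Kind kind, String name) { this.kind = kind; this.name = name; }
    }

    /** Check #1: true if some record is reachable from itself. */
    static boolean containsRecursiveRecord(Node s, Set<String> onPath) {
        boolean isRecord = s.kind == Kind.RECORD;
        if (isRecord && !onPath.add(s.name)) {
            return true; // this record is already on the current path
        }
        boolean found = false;
        for (Node child : s.children) {
            if (containsRecursiveRecord(child, onPath)) { found = true; break; }
        }
        if (isRecord) onPath.remove(s.name);
        return found;
    }

    /** Assumed rule: a union is acceptable if it has two branches, one of them null. */
    static boolean isNullableUnion(Node u) {
        return u.children.size() == 2
            && (u.children.get(0).kind == Kind.NULL || u.children.get(1).kind == Kind.NULL);
    }

    /** Check #2: plain traversal with no cycle guard; never returns on a cyclic schema. */
    static boolean containsGenericUnion(Node s) {
        if (s.kind == Kind.UNION && !isNullableUnion(s)) return true;
        for (Node child : s.children) {
            if (containsGenericUnion(child)) return true;
        }
        return false;
    }

    /** Runs check #1 first so that recursive schemas fail fast. */
    static void validate(Node root) {
        if (containsRecursiveRecord(root, new HashSet<>())) {
            throw new IllegalArgumentException("recursive records are not supported");
        }
        if (containsGenericUnion(root)) {
            throw new IllegalArgumentException("only nullable unions are supported");
        }
    }

    /** Builds record LinkedList { next: ["null", LinkedList] }, a recursive schema. */
    static Node linkedListSchema() {
        Node list = new Node(Kind.RECORD, "LinkedList");
        Node next = new Node(Kind.UNION, null);
        next.children.add(new Node(Kind.NULL, null));
        next.children.add(list); // the cycle
        list.children.add(next);
        return list;
    }

    public static void main(String[] args) {
        try {
            validate(linkedListSchema());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the two checks in validate swapped, containsGenericUnion would be called 
on the cyclic schema first and would recurse until the stack is exhausted, 
which matches the trace in the report.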

I uploaded a patch that changes the order of the checks and adds two test cases 
to TestAvroStorage to verify that the proper exception is thrown in each case. 
The tests can be run with the following commands:
{code}
tar -xf avro_test_files.tar.gz
ant clean compile-test piggybank -Dhadoopversion=20
cd contrib/piggybank/java
ant test -Dtestcase=TestAvroStorage
{code}
                
> AvroStorage throws StackOverFlowError
> -------------------------------------
>
>                 Key: PIG-2837
>                 URL: https://issues.apache.org/jira/browse/PIG-2837
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>    Affects Versions: 0.10.0
>            Reporter: Mubarak Seyed
>            Assignee: Cheolsoo Park
>         Attachments: PIG-2837.patch, avro_test_files.tar.gz
>
>
> When I try to dump Avro data using
> {code}
> records = LOAD '/logs/records/07262012/01/1/Record.1343265732700.avro' using 
> org.apache.pig.piggybank.storage.avro.AvroStorage(); 
> dump records;
> {code}
> {code}
> Pig Stack Trace 
> --------------- 
> ERROR 2998: Unhandled internal error. null
> java.lang.StackOverflowError 
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:258)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> {code}
> I verified the Avro schema using avro-tools and dumped the data as JSON; 
> the data looks good.


        
