[ 
https://issues.apache.org/jira/browse/PIG-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-816:
-------------------------------

    Attachment: PIG-816.patch

Attached a patch to address the issue. The change serializes the store func 
spec with ObjectSerializer before storing it in the jobconf. 
ObjectSerializer.serialize() uses default Java serialization and then encodes 
the output so that control characters are represented as regular characters. 
Otherwise, any control character in the store func spec would break the 
job.xml that Hadoop creates from the jobconf.
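
The idea can be illustrated with a minimal sketch. This is not the code in 
the patch and not Pig's ObjectSerializer: the class name 
FuncSpecEncodingSketch and its methods are hypothetical, and Base64 is an 
assumed stand-in for whatever encoding ObjectSerializer actually applies. It 
only shows that Java serialization followed by a text-safe encoding yields a 
string without control characters, which can be decoded back on the other side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;

// Hypothetical helper for illustration only; not part of Pig or Hadoop.
final class FuncSpecEncodingSketch {

    // Java-serialize the object, then Base64-encode the bytes so the result
    // contains only printable ASCII characters.
    static String serialize(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Reverse the two steps: Base64-decode, then Java-deserialize.
    static Object deserialize(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not deserialize value", e);
        }
    }
}
```

In this sketch, a func spec string containing the character U+0001 would be 
passed through serialize() before being stored as a configuration value, and 
through deserialize() when it is read back.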

> PigStorage() does not accept Unicode characters in its constructor
> ------------------------------------------------------------------
>
>                 Key: PIG-816
>                 URL: https://issues.apache.org/jira/browse/PIG-816
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.3.0
>            Reporter: Viraj Bhat
>            Priority: Critical
>             Fix For: 0.3.0
>
>         Attachments: PIG-816.patch, pig_1243043613713.log
>
>
> A simple Pig script that uses Unicode characters in the PigStorage() 
> constructor fails with the following error:
> {code}
> studenttab = LOAD '/user/viraj/studenttab10k' AS (name:chararray, age:int, gpa:float);
> X2 = GROUP studenttab by age;
> Y2 = FOREACH X2 GENERATE group, COUNT(studenttab);
> store Y2 into '/user/viraj/y2' using PigStorage('\u0001');
> {code}
> ========================================================================================
> ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to recreate 
> exception from backend error: org.apache.hadoop.ipc.RemoteException: 
> java.io.IOException: java.lang.RuntimeException: 
> org.xml.sax.SAXParseException: Character reference "&#1" is an invalid XML 
> character.
> ========================================================================================
> Attaching log file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
