[ 
https://issues.apache.org/jira/browse/PIG-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395091#comment-14395091
 ] 

Rohini Palaniswamy commented on PIG-4491:
-----------------------------------------

Got it. I looked more closely at the OutputHandler code. It seems to rely on 
the delimiter always ending in "\n"; a delimiter ending in anything else would 
break it. We will have to fix that at some point. 
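
To illustrate the concern, here is a minimal sketch of a line-oriented reader 
that only checks for the delimiter after each readline() call. It is not the 
actual OutputHandler code; read_record and the "|_" delimiter are made up for 
the example. With a delimiter that ends in "\n" every record is found, while a 
delimiter that does not end in "\n" lets readline() run past it and merge 
records.

```python
import io


def read_record(stream, delimiter):
    """Hypothetical line-based reader, NOT Pig's OutputHandler.

    Accumulates lines until the buffer ends with `delimiter`.
    Returns the record without the delimiter, or None at end of stream.
    """
    buf = ""
    while True:
        line = stream.readline()  # stops only at "\n" or end of stream
        if line == "":
            return None  # end of stream before a complete record
        buf += line
        if buf.endswith(delimiter):
            return buf[: -len(delimiter)]


# Delimiter ending in "\n": the check after each line sees every delimiter.
ok = io.StringIO("a\nb|_\nc|_\n")
first_ok = read_record(ok, "|_\n")   # record containing an embedded newline
second_ok = read_record(ok, "|_\n")

# Delimiter NOT ending in "\n": readline() reads past the first delimiter,
# so two records come back merged into one.
bad = io.StringIO("a\nb|_c|_")
first_bad = read_record(bad, "|_")
second_bad = read_record(bad, "|_")

print(repr(first_ok), repr(second_ok))
print(repr(first_bad), repr(second_bad))
```

A reader that scans the buffered bytes for the delimiter, instead of testing 
only at line boundaries, would not have this dependency.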

> Streaming Python Bytearray Bugs
> -------------------------------
>
>                 Key: PIG-4491
>                 URL: https://issues.apache.org/jira/browse/PIG-4491
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.12.1, 0.13.1, 0.14.1
>            Reporter: Jeremy Karn
>            Assignee: Jeremy Karn
>             Fix For: 0.15.0
>
>         Attachments: PIG-4491.patch
>
>
> While using a streaming Python UDF that returned a bytearray, we hit a 
> couple of bugs.
> The first was: 
> {panel}
> org.apache.pig.impl.streaming.StreamingUDFException: LINE : 
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 0: 
> ordinal not in range(128)
> {panel}
> and the second (after fixing the first) was a null pointer exception.
> I traced the problem to two issues:
> 1. In the Python controller, the output from the UDF was being logged as a 
> unicode string, which can fail for bytearrays.
> 2. Newlines in the data at the start of a response weren't being handled 
> properly on the Java side.
> I'm attaching a patch, with tests, that fixes these two issues.
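
As an illustration of the first issue in the description above, the sketch 
below reproduces the same error message by decoding non-ASCII bytes as ASCII. 
It assumes Python 3 and an explicit decode() call; the original failure 
happened inside the Python controller's logging, and the patch's actual fix is 
not shown here. The variable names are made up for the example.

```python
# Raw bytes whose first byte (0xe6) is outside the ASCII range.
raw = bytes([0xE6, 0x97, 0xA5])

try:
    raw.decode("ascii")  # treating arbitrary bytes as ASCII text
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# One way to log such a value without decoding it is to log its repr.
safe_for_log = repr(raw)

print(error_message)
print(safe_for_log)
```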



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
