[ 
https://issues.apache.org/jira/browse/PIG-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818623#comment-13818623
 ] 

Cheolsoo Park commented on PIG-3478:
------------------------------------

[~jeremykarn], I ran the e2e tests (StreamingPythonUDFs) on an EMR Hadoop 2.2 
cluster and hit two issues:
# NPE in StreamingUDF.java
{code}
2013-11-10 22:32:19,694 FATAL [IPC Server handler 11 on 33809] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1383086282107_1892_m_000000_3 - exited : 
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while 
executing [POUserFunc (Name: 
POUserFunc(org.apache.pig.impl.builtin.StreamingUDF)[int] - scope-3 Operator 
Key: scope-3) children: null at []]: java.lang.NullPointerException
        at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:338)
        at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
        at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.NullPointerException
        at 
org.apache.pig.impl.builtin.StreamingUDF.ensureUserFileAvailable(StreamingUDF.java:249)
        at 
org.apache.pig.impl.builtin.StreamingUDF.constructCommand(StreamingUDF.java:218)
        at 
org.apache.pig.impl.builtin.StreamingUDF.startUdfController(StreamingUDF.java:163)
        at 
org.apache.pig.impl.builtin.StreamingUDF.initialize(StreamingUDF.java:156)
        at org.apache.pig.impl.builtin.StreamingUDF.exec(StreamingUDF.java:146)
        at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:330)
        at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextInteger(POUserFunc.java:379)
        at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:321)
        ... 13 more
{code}
The NPE is thrown from {{udfFileStream.close();}} because {{udfFileStream}} is null.
# After fixing #1 by adding a null check, I ran into this error:
{code}
2013-11-10 23:00:51,402 FATAL [IPC Server handler 11 on 40139] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1383086282107_1905_m_000000_3 - exited : 
org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught error 
from UDF: StreamingUDF [Could not create directory: 
/home/hadoop/.versions/2.2.0/logs/udfOutput]
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:358)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextInteger(POUserFunc.java:379)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:321)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.io.IOException: Could not create directory: 
/home/hadoop/.versions/2.2.0/logs/udfOutput
at 
org.apache.pig.scripting.ScriptingOutputCapturer.getTaskLogDir(ScriptingOutputCapturer.java:104)
at 
org.apache.pig.scripting.ScriptingOutputCapturer.getStandardOutputRootWriteLocation(ScriptingOutputCapturer.java:86)
at 
org.apache.pig.impl.builtin.StreamingUDF.constructCommand(StreamingUDF.java:187)
at 
org.apache.pig.impl.builtin.StreamingUDF.startUdfController(StreamingUDF.java:163)
at org.apache.pig.impl.builtin.StreamingUDF.initialize(StreamingUDF.java:156)
at 
org.apache.pig.impl.builtin.StreamingUDF.exec(StreamingUDF.java:146)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:330)
... 15 more
{code}
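
For reference, the null check from issue 1 could look roughly like the sketch below. This is not the actual StreamingUDF code: the class name, method, and resource path are hypothetical, and it assumes the stream comes from a lookup such as {{Class.getResourceAsStream}}, which returns null instead of throwing when the resource is missing.

```java
import java.io.IOException;
import java.io.InputStream;

public class NullSafeClose {

    // Reads the first byte of a classpath resource. The resource name is a
    // hypothetical stand-in for the UDF file that a streaming UDF ships.
    static int firstByte(String resourceName) throws IOException {
        InputStream udfFileStream = null;
        try {
            // getResourceAsStream returns null (no exception) when the
            // resource cannot be found.
            udfFileStream = NullSafeClose.class.getResourceAsStream(resourceName);
            if (udfFileStream == null) {
                throw new IOException("Resource not found: " + resourceName);
            }
            return udfFileStream.read();
        } finally {
            // Without this guard, close() on a null stream throws an NPE
            // that replaces the more useful IOException above.
            if (udfFileStream != null) {
                udfFileStream.close();
            }
        }
    }

    public static void main(String[] args) {
        try {
            firstByte("/no/such/udf_file.py");
        } catch (IOException e) {
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```

With the guard in place, a missing file surfaces as an IOException naming the resource rather than as a bare NullPointerException from the finally block.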

Can you look into these failures? We should also enable {{StreamingPythonUDFs}} 
tests in nightly.conf once they're fixed.
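
On the second failure, I have not confirmed the root cause, but it may help to see why the directory cannot be created. The sketch below is only an illustration under assumptions: the class and method names are hypothetical, and it is not how {{ScriptingOutputCapturer}} is written today. It uses {{java.nio.file.Files.createDirectories}}, which throws an IOException carrying the underlying reason (for example, access denied or a non-directory parent).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class UdfOutputDir {

    // Creates root/child (and any missing parents) if it does not exist yet.
    // Both arguments are hypothetical stand-ins for the task log root and
    // the "udfOutput" subdirectory seen in the stack trace.
    static Path ensureDir(Path root, String child) throws IOException {
        Path dir = root.resolve(child);
        try {
            // A no-op when the directory already exists.
            return Files.createDirectories(dir);
        } catch (IOException e) {
            // Keep the familiar message but attach the underlying cause.
            throw new IOException("Could not create directory: " + dir + " (" + e + ")", e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(System.getProperty("java.io.tmpdir"));
        System.out.println("Using " + ensureDir(root, "udfOutput"));
    }
}
```

If the task user cannot write to the log root, the wrapped cause in the message should say so directly, which would tell us whether the path itself needs to change on Hadoop 2.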

> Make StreamingUDF work for Hadoop 2
> -----------------------------------
>
>                 Key: PIG-3478
>                 URL: https://issues.apache.org/jira/browse/PIG-3478
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Jeremy Karn
>             Fix For: 0.12.1
>
>         Attachments: PIG-3478.patch
>
>
> PIG-2417 introduced Streaming UDF. However, it does not work under Hadoop 2. 
> Both unit tests and e2e tests fail under Hadoop 2. We need to fix it.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
