[jira] [Commented] (PIG-5372) SAMPLE/RANDOM(udf) before skewed join failing with NPE

2019-01-24 Thread Rohini Palaniswamy (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751688#comment-16751688
 ] 

Rohini Palaniswamy commented on PIG-5372:
-

+1

> SAMPLE/RANDOM(udf) before skewed join failing with NPE
> --
>
> Key: PIG-5372
> URL: https://issues.apache.org/jira/browse/PIG-5372
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.16.0
> Reporter: Koji Noguchi
> Assignee: Koji Noguchi
> Priority: Major
> Attachments: pig-5372-v1.patch, pig-5372-v2.patch
>
>
> Sample short code like below
> {code}
> A = LOAD 'input.txt' AS (a1:int, a2:chararray, a3:int);
> B = LOAD 'input.txt' AS (b1:int, b2:chararray, b3:int);
> A2 = FOREACH A generate *, RANDOM() as randnum;
> D = join A2 by a1, B by b1 using 'skewed' parallel 2;
> store D into '$output';
> {code}
> Fails with NPE. 
> {noformat}
> 2018-12-12 16:06:04,860 [Dispatcher thread: Central] INFO  
> org.apache.tez.dag.history.HistoryEventHandler - 
> [HISTORY][DAG:dag_1544648742542_0001_1][Event:TASK_FINISHED]: 
> vertexName=scope-55, taskId=task_1544648742542_0001_1_02_00, 
> startTime=1544648745036, finishTime=1544648764857, timeTaken=19821, 
> status=KILLED, successfulAttemptID=null, diagnostics=TaskAttempt 0 failed, 
> info=[Error: Failure while running 
> task:org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception 
> while executing (Name: Local Rearrange[tuple]{int}(false) - scope-29 ->   
> scope-58 Operator Key: scope-29): 
> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception 
> while executing [POUserFunc (Name: 
> POUserFunc(org.apache.pig.builtin.RANDOM)[double] - scope-40 Operator Key: 
> scope-40) children: null at []]: java.lang.NullPointerException
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:315)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:287)
> at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POLocalRearrangeTez.getNextTuple(POLocalRearrangeTez.java:131)
> at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:420)
> at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:282)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: 
> Exception while executing [POUserFunc (Name: 
> POUserFunc(org.apache.pig.builtin.RANDOM)[double] - scope-40 Operator Key: 
> scope-40) children: null at []]: java.lang.NullPointerException
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:367)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:408)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:325)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305)
> ... 17 more
> Caused by: java.lang.NullPointerException
> at org.apache.pig.builtin.RANDOM.exec(RANDOM.java:51)
> at org.apache.pig.builtin.RANDOM.exec(RANDOM.java:37)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:332)
> at 
> 

[jira] [Commented] (PIG-5372) SAMPLE/RANDOM(udf) before skewed join failing with NPE

2019-01-02 Thread Daniel Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732373#comment-16732373
 ] 

Daniel Dai commented on PIG-5372:
-

Wow, that's back in 2010 :). I think SkewedPartitioner.setConf is passing conf 
to MapRedUtil.loadPartitionFileFromLocalCache via PigMapReduce.sJobConf. This 
is no longer necessary, as MapRedUtil.loadPartitionFileFromLocalCache takes a 
mapConf parameter (added in a later patch). We can change 
MapRedUtil.loadPartitionFileFromLocalCache to retrieve 
fs.file.impl/fs.hdfs.impl from mapConf. Then we don't need to overwrite 
PigMapReduce.sJobConf in SkewedPartitioner.setConf.
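
The suggestion above amounts to reading the needed keys from a configuration passed as an argument rather than publishing it through a static field first. Below is a minimal, self-contained sketch of that pattern only. It does not use Pig or Hadoop classes: `ExplicitConfSketch`, `sharedConf`, `loadViaStatic`, `loadViaParameter`, the `job.id` key, and the placeholder value are all hypothetical, and a plain `Map` stands in for a Hadoop configuration object.

```java
import java.util.HashMap;
import java.util.Map;

public class ExplicitConfSketch {
    /** Hypothetical stand-in for a process-wide static configuration holder. */
    static Map<String, String> sharedConf = new HashMap<>();

    /** Pattern being retired: publish conf through the static field, then read it back. */
    static String loadViaStatic(Map<String, String> conf) {
        sharedConf = conf; // side effect: every other reader of sharedConf now sees conf
        return sharedConf.get("fs.file.impl");
    }

    /** Suggested pattern: read the needed key directly from the conf passed in. */
    static String loadViaParameter(Map<String, String> mapConf) {
        return mapConf.get("fs.file.impl"); // no shared state is touched
    }

    public static void main(String[] args) {
        // Configuration the running task already relies on (hypothetical key).
        Map<String, String> taskConf = new HashMap<>();
        taskConf.put("job.id", "job_0001");
        sharedConf = taskConf;

        // Configuration handed to the partitioner (placeholder value).
        Map<String, String> partitionerConf = new HashMap<>();
        partitionerConf.put("fs.file.impl", "placeholder.FileSystemImpl");

        System.out.println(loadViaParameter(partitionerConf));
        System.out.println(sharedConf.get("job.id")); // still readable by other code
    }
}
```

With the parameter-based variant, code elsewhere that reads the shared configuration is unaffected by the partitioner being configured.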

[jira] [Commented] (PIG-5372) SAMPLE/RANDOM(udf) before skewed join failing with NPE

2018-12-24 Thread Koji Noguchi (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728458#comment-16728458
 ] 

Koji Noguchi commented on PIG-5372:
---

bq. I am not able to tell exactly why Daniel is doing that in 
https://issues.apache.org/jira/browse/PIG-1467

Me neither.  [~daijy], can you help us? 

Without fully understanding what's happening there, I would rather keep the 
current patch to avoid introducing an unexpected regression.

[jira] [Commented] (PIG-5372) SAMPLE/RANDOM(udf) before skewed join failing with NPE

2018-12-21 Thread Rohini Palaniswamy (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16727017#comment-16727017
 ] 

Rohini Palaniswamy commented on PIG-5372:
-

Does this failure occur only with Tez, or with MapReduce as well? I am not 
able to tell exactly why Daniel is doing that in 
https://issues.apache.org/jira/browse/PIG-1467, as the description of that 
JIRA does not have a proper stack trace. In Tez, we read the quantile file 
from memory (broadcast edge) instead of from a local file (distributed cache) 
as in MapReduce. So we can override setConf() in SkewedPartitionerTez if this 
issue is specific to Tez.
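
The override idea can be sketched in isolation as follows. This is only an illustration of the subclassing pattern, not the actual Pig classes: `BasePartitioner`, `EngineSpecificPartitioner`, `sharedConf`, and the `job.id` key are hypothetical names, and a plain `Map` stands in for a Hadoop configuration object.

```java
import java.util.HashMap;
import java.util.Map;

public class SetConfOverrideSketch {
    /** Hypothetical stand-in for a process-wide static configuration holder. */
    static Map<String, String> sharedConf = new HashMap<>();

    /** Hypothetical base partitioner whose setConf also publishes conf to static state. */
    static class BasePartitioner {
        protected Map<String, String> conf;

        public void setConf(Map<String, String> conf) {
            this.conf = conf;
            sharedConf = conf; // side effect visible to all other readers
        }
    }

    /** Hypothetical engine-specific subclass that keeps the conf to itself. */
    static class EngineSpecificPartitioner extends BasePartitioner {
        @Override
        public void setConf(Map<String, String> conf) {
            this.conf = conf; // deliberately no write to sharedConf
        }
    }

    public static void main(String[] args) {
        Map<String, String> original = new HashMap<>();
        original.put("job.id", "job_0001");
        sharedConf = original;

        new EngineSpecificPartitioner().setConf(new HashMap<>());
        System.out.println("shared conf untouched: " + (sharedConf == original));
    }
}
```

The trade-off of this approach is that the base class keeps its side effect, so it only helps if the problem is confined to the engine that uses the subclass.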

[jira] [Commented] (PIG-5372) SAMPLE/RANDOM(udf) before skewed join failing with NPE

2018-12-18 Thread Koji Noguchi (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724307#comment-16724307
 ] 

Koji Noguchi commented on PIG-5372:
---

bq. I think it should be fine to remove the below lines instead. They seem to 
be not used

I think the setting came from PIG-1467.



[jira] [Commented] (PIG-5372) SAMPLE/RANDOM(udf) before skewed join failing with NPE

2018-12-18 Thread Rohini Palaniswamy (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724285#comment-16724285
 ] 

Rohini Palaniswamy commented on PIG-5372:
-

I think it should be fine to remove the below lines instead. They seem to be 
not used

{code}
PigMapReduce.sJobConfInternal.set(conf);
PigMapReduce.sJobConf = conf;
{code}
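
The thread does not spell out how these two lines lead to the NPE. One plausible mechanism, assumed here purely for illustration, is that replacing a shared static configuration hides a key that other code dereferences without a null check. The sketch below is a toy model of that failure mode and does not use Pig classes: `SharedConfOverwriteSketch`, `seedFromSharedConf`, and the `job.id` key are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedConfOverwriteSketch {
    /** Hypothetical stand-in for a process-wide static configuration holder. */
    static Map<String, String> sharedConf = new HashMap<>();

    /** Hypothetical reader that derives a seed from a key it expects to be present. */
    static int seedFromSharedConf() {
        return sharedConf.get("job.id").hashCode(); // throws NPE if "job.id" is absent
    }

    /** Mirrors the shape of the quoted lines: replace the shared conf wholesale. */
    static void setConf(Map<String, String> conf) {
        sharedConf = conf;
    }

    public static void main(String[] args) {
        Map<String, String> taskConf = new HashMap<>();
        taskConf.put("job.id", "job_0001");
        sharedConf = taskConf;
        System.out.println("seed before overwrite: " + seedFromSharedConf());

        setConf(new HashMap<>()); // a conf that lacks "job.id"
        try {
            seedFromSharedConf();
        } catch (NullPointerException e) {
            System.out.println("NullPointerException after overwrite");
        }
    }
}
```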
