[jira] [Commented] (PIG-5372) SAMPLE/RANDOM(udf) before skewed join failing with NPE
[ https://issues.apache.org/jira/browse/PIG-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751688#comment-16751688 ]

Rohini Palaniswamy commented on PIG-5372:
-----------------------------------------

+1

> SAMPLE/RANDOM(udf) before skewed join failing with NPE
> ------------------------------------------------------
>
>                 Key: PIG-5372
>                 URL: https://issues.apache.org/jira/browse/PIG-5372
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.16.0
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5372-v1.patch, pig-5372-v2.patch
>
>
> Sample short code like below
> {code}
> A = LOAD 'input.txt' AS (a1:int, a2:chararray, a3:int);
> B = LOAD 'input.txt' AS (b1:int, b2:chararray, b3:int);
> A2 = FOREACH A generate *, RANDOM() as randnum;
> D = join A2 by a1, B by b1 using 'skewed' parallel 2;
> store D into '$output';
> {code}
> Fails with NPE.
> {noformat}
> 2018-12-12 16:06:04,860 [Dispatcher thread: Central] INFO org.apache.tez.dag.history.HistoryEventHandler - [HISTORY][DAG:dag_1544648742542_0001_1][Event:TASK_FINISHED]: vertexName=scope-55, taskId=task_1544648742542_0001_1_02_00, startTime=1544648745036, finishTime=1544648764857, timeTaken=19821, status=KILLED, successfulAttemptID=null, diagnostics=TaskAttempt 0 failed, info=[Error: Failure while running task:org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: Local Rearrange[tuple]{int}(false) - scope-29 -> scope-58 Operator Key: scope-29): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POUserFunc (Name: POUserFunc(org.apache.pig.builtin.RANDOM)[double] - scope-40 Operator Key: scope-40) children: null at []]: java.lang.NullPointerException
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:315)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:287)
>     at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POLocalRearrangeTez.getNextTuple(POLocalRearrangeTez.java:131)
>     at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:420)
>     at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:282)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POUserFunc (Name: POUserFunc(org.apache.pig.builtin.RANDOM)[double] - scope-40 Operator Key: scope-40) children: null at []]: java.lang.NullPointerException
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:367)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:408)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:325)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305)
>     ... 17 more
> Caused by: java.lang.NullPointerException
>     at org.apache.pig.builtin.RANDOM.exec(RANDOM.java:51)
>     at org.apache.pig.builtin.RANDOM.exec(RANDOM.java:37)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:332)
>     at
[jira] [Commented] (PIG-5372) SAMPLE/RANDOM(udf) before skewed join failing with NPE
[ https://issues.apache.org/jira/browse/PIG-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732373#comment-16732373 ]

Daniel Dai commented on PIG-5372:
---------------------------------

Wow, that's back in 2010 :). I think SkewedPartitioner.setConf is passing conf to MapRedUtil.loadPartitionFileFromLocalCache via PigMapReduce.sJobConf. This is no longer necessary, as MapRedUtil.loadPartitionFileFromLocalCache now takes a mapConf parameter (added in a later patch). We can change MapRedUtil.loadPartitionFileFromLocalCache to retrieve fs.file.impl/fs.hdfs.impl from mapConf. Then we don't need to overwrite PigMapReduce.sJobConf in SkewedPartitioner.setConf.
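
For illustration only, here is a minimal Java sketch of the refactoring idea in the comment above: a helper that reads its settings from a conf passed as a parameter, so the caller no longer has to overwrite a shared static first. The class, method, and value names are hypothetical, and a plain Map stands in for Hadoop's Configuration; this is not the attached patch and not Pig's actual code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names are Pig or Hadoop APIs.
final class ExplicitConfSketch {
    // Stand-in for a process-wide static such as PigMapReduce.sJobConf.
    static Map<String, String> sharedConf = new HashMap<>();

    // "Before" style: the helper reads from the shared static,
    // so a caller has to overwrite the static before calling it.
    static String fsImplViaStatic() {
        return sharedConf.get("fs.file.impl");
    }

    // "After" style: the helper reads from the conf it is handed.
    static String fsImplFromParam(Map<String, String> mapConf) {
        return mapConf.get("fs.file.impl");
    }

    // Shows that the explicit-parameter call leaves the shared static untouched.
    static String demo() {
        Map<String, String> taskConf = new HashMap<>();
        taskConf.put("job.id", "job_0001");   // assumed key read by other code
        sharedConf = taskConf;

        Map<String, String> mapConf = new HashMap<>();
        mapConf.put("fs.file.impl", "local-fs-class"); // placeholder value

        String impl = fsImplFromParam(mapConf);
        return impl + "," + sharedConf.get("job.id");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "local-fs-class,job_0001"
    }
}
```

The point of the sketch is only that the second helper has no hidden dependency on global state, which is what makes the overwrite in a setConf-style method unnecessary.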
[jira] [Commented] (PIG-5372) SAMPLE/RANDOM(udf) before skewed join failing with NPE
[ https://issues.apache.org/jira/browse/PIG-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728458#comment-16728458 ]

Koji Noguchi commented on PIG-5372:
-----------------------------------

bq. I am not able to tell exactly why Daniel is doing that in https://issues.apache.org/jira/browse/PIG-1467

Me neither. [~daijy], can you help us?

Without fully understanding what's happening there, I would rather keep the current patch to avoid introducing an unexpected regression.
[jira] [Commented] (PIG-5372) SAMPLE/RANDOM(udf) before skewed join failing with NPE
[ https://issues.apache.org/jira/browse/PIG-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16727017#comment-16727017 ]

Rohini Palaniswamy commented on PIG-5372:
-----------------------------------------

Is this failure only with Tez, or with mapreduce as well? I am not able to tell exactly why Daniel is doing that in https://issues.apache.org/jira/browse/PIG-1467 as the description of that jira does not have a proper stack trace. In Tez, we read the quantile file from memory (broadcast edge) instead of from a local file (distributed cache) as in mapreduce. So we can override setConf() in SkewedPartitionerTez if this issue is specific to Tez.
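
As a rough illustration of the "override setConf() in the Tez subclass" option mentioned above, the following Java sketch shows a subclass whose setConf keeps its own copy of the conf and skips the write to a shared static. All class names are hypothetical stand-ins (not SkewedPartitioner or SkewedPartitionerTez themselves), and a plain Map stands in for Hadoop's Configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these classes only mimic the shape of the discussion.
final class SetConfOverrideSketch {
    // Stand-in for a process-wide static such as PigMapReduce.sJobConf.
    static Map<String, String> sharedConf = new HashMap<>();

    static class BasePartitioner {
        protected Map<String, String> conf;

        public void setConf(Map<String, String> conf) {
            this.conf = conf;
            sharedConf = conf; // the side effect under discussion
        }
    }

    // Variant that reads its data from memory and so does not need the static.
    static class InMemoryPartitioner extends BasePartitioner {
        @Override
        public void setConf(Map<String, String> conf) {
            this.conf = conf; // keep a private reference, leave sharedConf alone
        }
    }

    // Returns true if the shared static still points at the original conf
    // after the given partitioner's setConf has been called.
    static boolean sharedConfSurvives(BasePartitioner p) {
        Map<String, String> original = new HashMap<>();
        original.put("job.id", "job_0001"); // assumed key, for illustration
        sharedConf = original;
        p.setConf(new HashMap<>());
        return sharedConf == original;
    }

    public static void main(String[] args) {
        System.out.println(sharedConfSurvives(new BasePartitioner()));     // prints "false"
        System.out.println(sharedConfSurvives(new InMemoryPartitioner())); // prints "true"
    }
}
```

Whether an override like this is sufficient depends on the open question in the comment, namely whether the failure also occurs under mapreduce.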
[jira] [Commented] (PIG-5372) SAMPLE/RANDOM(udf) before skewed join failing with NPE
[ https://issues.apache.org/jira/browse/PIG-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724307#comment-16724307 ]

Koji Noguchi commented on PIG-5372:
-----------------------------------

bq. I think it should be fine to remove the below lines instead. They seem to be not used

I think the setting came from PIG-1467.
[jira] [Commented] (PIG-5372) SAMPLE/RANDOM(udf) before skewed join failing with NPE
[ https://issues.apache.org/jira/browse/PIG-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724285#comment-16724285 ]

Rohini Palaniswamy commented on PIG-5372:
-----------------------------------------

I think it should be fine to remove the below lines instead. They seem to be not used.
{code}
PigMapReduce.sJobConfInternal.set(conf);
PigMapReduce.sJobConf = conf;
{code}
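
To make the hazard behind those two lines concrete, here is a small self-contained Java sketch. It assumes, based only on the direction of this thread and not on the Pig source, that the NPE arises when a UDF reads a key from a shared static conf after a partitioner's setConf has replaced that conf with one lacking the key. The names and the "job.id" key are hypothetical, and a plain Map stands in for Hadoop's Configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the suspected failure mode; not Pig code.
final class StaticConfOverwriteSketch {
    // Stand-in for a process-wide static such as PigMapReduce.sJobConf.
    static Map<String, String> sharedConf = new HashMap<>();

    // UDF-like reader that assumes "job.id" is always present.
    static int seedFromSharedConf() {
        return sharedConf.get("job.id").hashCode(); // NPE if the key is missing
    }

    // Partitioner-like setConf that replaces the shared static.
    static void setConfOverwritingStatic(Map<String, String> partitionerConf) {
        sharedConf = partitionerConf;
    }

    // Returns true if the reader fails with an NPE once the static is overwritten.
    static boolean failsAfterOverwrite() {
        sharedConf = new HashMap<>();
        sharedConf.put("job.id", "job_0001");
        seedFromSharedConf();                      // works: key is present
        setConfOverwritingStatic(new HashMap<>()); // conf without "job.id"
        try {
            seedFromSharedConf();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE after overwrite: " + failsAfterOverwrite()); // prints "NPE after overwrite: true"
    }
}
```

Removing the assignment inside setConfOverwritingStatic is the analogue of dropping the two lines quoted in the comment.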