[jira] [Commented] (PIG-5439) Support Spark 3 and drop SparkShim
[ https://issues.apache.org/jira/browse/PIG-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844427#comment-17844427 ]

Rohini Palaniswamy commented on PIG-5439:
-----------------------------------------

+1

> Support Spark 3 and drop SparkShim
> ----------------------------------
>
>                 Key: PIG-5439
>                 URL: https://issues.apache.org/jira/browse/PIG-5439
>             Project: Pig
>          Issue Type: Improvement
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>             Fix For: 0.19.0
>
>         Attachments: pig-5439-v01.patch, pig-5439-v02.patch
>
>
> Support Pig-on-Spark to run on spark3.
> Initial version would only run up to Spark 3.2.4 and not on 3.3 or 3.4.
> This is due to log4j mismatch.
> After moving to log4j2 (PIG-5426), we can move Spark to 3.3 or higher.
> So far, not all unit/e2e tests pass with the proposed patch but at least
> compilation goes through.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (PIG-5450) Pig-on-Spark3 E2E ORC test failing with java.lang.VerifyError: Bad return type
[ https://issues.apache.org/jira/browse/PIG-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837863#comment-17837863 ]

Rohini Palaniswamy commented on PIG-5450:
-----------------------------------------

+1

> Pig-on-Spark3 E2E ORC test failing with java.lang.VerifyError: Bad return type
> ------------------------------------------------------------------------------
>
>                 Key: PIG-5450
>                 URL: https://issues.apache.org/jira/browse/PIG-5450
>             Project: Pig
>          Issue Type: Bug
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5450-v01.patch
>
>
> {noformat}
> Caused by: java.lang.VerifyError: Bad return type
> Exception Details:
>   Location:
>     org/apache/orc/impl/TypeUtils.createColumn(Lorg/apache/orc/TypeDescription;Lorg/apache/orc/TypeDescription$RowBatchVersion;I)Lorg/apache/hadoop/hive/ql/exec/vector/ColumnVector; @117: areturn
>   Reason:
>     Type 'org/apache/hadoop/hive/ql/exec/vector/DateColumnVector' (current frame, stack[0]) is not assignable to 'org/apache/hadoop/hive/ql/exec/vector/ColumnVector' (from method signature)
> {noformat}
[jira] [Commented] (PIG-5449) TestEmptyInputDir failing on pig-on-spark3
[ https://issues.apache.org/jira/browse/PIG-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837862#comment-17837862 ]

Rohini Palaniswamy commented on PIG-5449:
-----------------------------------------

+1

> TestEmptyInputDir failing on pig-on-spark3
> ------------------------------------------
>
>                 Key: PIG-5449
>                 URL: https://issues.apache.org/jira/browse/PIG-5449
>             Project: Pig
>          Issue Type: Bug
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5449-v01.patch
>
>
> TestEmptyInputDir failing on pig-on-spark3 with
> {noformat:title=TestEmptyInputDir.testMergeJoinFailure}
> junit.framework.AssertionFailedError
>   at org.apache.pig.test.TestEmptyInputDir.testMergeJoin(TestEmptyInputDir.java:141)
> {noformat}
> {noformat:title=TestEmptyInputDir.testGroupByFailure}
> junit.framework.AssertionFailedError
>   at org.apache.pig.test.TestEmptyInputDir.testGroupBy(TestEmptyInputDir.java:80)
> {noformat}
> {noformat:title=TestEmptyInputDir.testBloomJoinOuterFailure}
> junit.framework.AssertionFailedError
>   at org.apache.pig.test.TestEmptyInputDir.testBloomJoinOuter(TestEmptyInputDir.java:297)
> {noformat}
> {noformat:title=TestEmptyInputDir.testFRJoinFailure}
> junit.framework.AssertionFailedError
>   at org.apache.pig.test.TestEmptyInputDir.testFRJoin(TestEmptyInputDir.java:171)
> {noformat}
> {noformat:title=TestEmptyInputDir.testBloomJoinFailure}
> junit.framework.AssertionFailedError
>   at org.apache.pig.test.TestEmptyInputDir.testBloomJoin(TestEmptyInputDir.java:267)
> {noformat}
[jira] [Commented] (PIG-5448) All TestHBaseStorage tests failing on pig-on-spark3
[ https://issues.apache.org/jira/browse/PIG-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837861#comment-17837861 ]

Rohini Palaniswamy commented on PIG-5448:
-----------------------------------------

+1

> All TestHBaseStorage tests failing on pig-on-spark3
> ---------------------------------------------------
>
>                 Key: PIG-5448
>                 URL: https://issues.apache.org/jira/browse/PIG-5448
>             Project: Pig
>          Issue Type: Bug
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Minor
>         Attachments: pig-5448-v01.patch
>
>
> For Pig on Spark3 (with PIG-5439), all of the TestHBaseStorage unit tests are failing with
> {noformat}
> org.apache.pig.PigException: ERROR 1002: Unable to store alias b
>   at org.apache.pig.PigServer.storeEx(PigServer.java:1127)
>   at org.apache.pig.PigServer.store(PigServer.java:1086)
>   at org.apache.pig.test.TestHBaseStorage.testStoreToHBase_1_with_delete(TestHBaseStorage.java:1251)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get the rdds of this spark operator:
>   at org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:241)
>   at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1479)
>   at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464)
>   at org.apache.pig.PigServer.storeEx(PigServer.java:1123)
> Caused by: java.lang.RuntimeException: No task metrics available for jobId 0
>   at org.apache.pig.tools.pigstats.spark.SparkJobStats.collectStats(SparkJobStats.java:109)
>   at org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:77)
>   at org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:73)
>   at org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {noformat}
[jira] [Commented] (PIG-5438) Update SparkCounter.Accumulator to AccumulatorV2
[ https://issues.apache.org/jira/browse/PIG-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837860#comment-17837860 ]

Rohini Palaniswamy commented on PIG-5438:
-----------------------------------------

+1

> Update SparkCounter.Accumulator to AccumulatorV2
> ------------------------------------------------
>
>                 Key: PIG-5438
>                 URL: https://issues.apache.org/jira/browse/PIG-5438
>             Project: Pig
>          Issue Type: Improvement
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Trivial
>             Fix For: 0.19.0
>
>         Attachments: pig-5438-v01.patch
>
>
> Original Accumulator is deprecated in Spark2 and gone in Spark3.
> AccumulatorV2 is usable on both Spark2 and Spark3.
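The migration above hinges on the shape of the V2 API: instead of a separate `AccumulatorParam`, the accumulator object itself carries `isZero`/`reset`/`add`/`merge`/`value`. A minimal standalone sketch of that contract, in plain Java with no Spark dependency (`CounterSketch` and `LongAccumulatorSketch` are hypothetical stand-ins mirroring Spark's `org.apache.spark.util.AccumulatorV2`, not Spark code):

```java
// Sketch of the AccumulatorV2-style contract that SparkCounter targets.
// Plain-Java stand-in: Spark's real AccumulatorV2 has the same
// isZero/reset/add/merge/value shape, with merge() combining per-task
// copies back on the driver.
public class CounterSketch {
    static class LongAccumulatorSketch {
        private long sum = 0L;

        boolean isZero() { return sum == 0L; }
        void reset() { sum = 0L; }
        void add(long v) { sum += v; }
        // merge() is how the driver folds in each task's local copy.
        void merge(LongAccumulatorSketch other) { sum += other.sum; }
        long value() { return sum; }
    }

    public static long demo() {
        LongAccumulatorSketch driver = new LongAccumulatorSketch();
        LongAccumulatorSketch task1 = new LongAccumulatorSketch();
        LongAccumulatorSketch task2 = new LongAccumulatorSketch();
        task1.add(3);
        task2.add(4);
        driver.merge(task1);
        driver.merge(task2);
        return driver.value();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 7
    }
}
```

Because this contract exists unchanged in Spark 2 and Spark 3, a SparkCounter built on it can drop the version-specific shim.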
[jira] [Commented] (PIG-5446) Tez TestPigProgressReporting.testProgressReportingWithStatusMessage failing
[ https://issues.apache.org/jira/browse/PIG-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17826791#comment-17826791 ]

Rohini Palaniswamy commented on PIG-5446:
-----------------------------------------

+1

> Tez TestPigProgressReporting.testProgressReportingWithStatusMessage failing
> ---------------------------------------------------------------------------
>
>                 Key: PIG-5446
>                 URL: https://issues.apache.org/jira/browse/PIG-5446
>             Project: Pig
>          Issue Type: Bug
>          Components: tez
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5446-v01.patch
>
>
> {noformat}
> Unable to open iterator for alias B. Backend error : Vertex failed, vertexName=scope-4, vertexId=vertex_1707216362777_0001_1_00, diagnostics=[Task failed, taskId=task_1707216362777_0001_1_00_00, diagnostics=[TaskAttempt 0 failed, info=[Attempt failed because it appears to make no progress for 1ms], TaskAttempt 1 failed, info=[Attempt failed because it appears to make no progress for 1ms]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1707216362777_0001_1_00 [scope-4] killed/failed due to:OWN_TASK_FAILURE]
> DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias B. Backend error : Vertex failed, vertexName=scope-4, vertexId=vertex_1707216362777_0001_1_00, diagnostics=[Task failed, taskId=task_1707216362777_0001_1_00_00, diagnostics=[TaskAttempt 0 failed, info=[Attempt failed because it appears to make no progress for 1ms], TaskAttempt 1 failed, info=[Attempt failed because it appears to make no progress for 1ms]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1707216362777_0001_1_00 [scope-4] killed/failed due to:OWN_TASK_FAILURE]
> DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
>   at org.apache.pig.PigServer.openIterator(PigServer.java:1014)
>   at org.apache.pig.test.TestPigProgressReporting.testProgressReportingWithStatusMessage(TestPigProgressReporting.java:58)
> Caused by: org.apache.tez.dag.api.TezException: Vertex failed, vertexName=scope-4, vertexId=vertex_1707216362777_0001_1_00, diagnostics=[Task failed, taskId=task_1707216362777_0001_1_00_00, diagnostics=[TaskAttempt 0 failed, info=[Attempt failed because it appears to make no progress for 1ms], TaskAttempt 1 failed, info=[Attempt failed because it appears to make no progress for 1ms]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1707216362777_0001_1_00 [scope-4] killed/failed due to:OWN_TASK_FAILURE]
> DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
>   at org.apache.pig.tools.pigstats.tez.TezPigScriptStats.accumulateStats(TezPigScriptStats.java:204)
>   at org.apache.pig.backend.hadoop.executionengine.tez.TezJob.run(TezJob.java:243)
>   at org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher$1.run(TezLauncher.java:212)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 45.647
> {noformat}
[jira] [Commented] (PIG-5416) Spark unit tests failing randomly with "java.lang.RuntimeException: Unexpected job execution status RUNNING"
[ https://issues.apache.org/jira/browse/PIG-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17826790#comment-17826790 ]

Rohini Palaniswamy commented on PIG-5416:
-----------------------------------------

+1

> Spark unit tests failing randomly with "java.lang.RuntimeException: Unexpected job execution status RUNNING"
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-5416
>                 URL: https://issues.apache.org/jira/browse/PIG-5416
>             Project: Pig
>          Issue Type: Bug
>          Components: spark
>            Reporter: Koji Noguchi
>            Priority: Minor
>         Attachments: pig-5416-v01.patch
>
>
> Spark unit tests fail randomly with the same error.
> Sample stack trace showing "Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING".
> {noformat:title=TestBuiltInBagToTupleOrString.testPigScriptForBagToTupleUDF}
> Unable to store alias B
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias B
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1783)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:708)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:721)
>   at org.apache.pig.test.TestBuiltInBagToTupleOrString.testPigScriptForBagToTupleUDF(TestBuiltInBagToTupleOrString.java:429)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get the rdds of this spark operator:
>   at org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:240)
>   at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1479)
>   at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464)
>   at org.apache.pig.PigServer.execute(PigServer.java:1453)
>   at org.apache.pig.PigServer.access$500(PigServer.java:119)
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1778)
> Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING
>   at org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138)
>   at org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75)
>   at org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59)
>   at org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {noformat}
[jira] [Commented] (PIG-5447) Pig-on-Spark TestSkewedJoin.testSkewedJoinOuter failing with NoSuchElementException
[ https://issues.apache.org/jira/browse/PIG-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17826789#comment-17826789 ]

Rohini Palaniswamy commented on PIG-5447:
-----------------------------------------

+1

> Pig-on-Spark TestSkewedJoin.testSkewedJoinOuter failing with NoSuchElementException
> -----------------------------------------------------------------------------------
>
>                 Key: PIG-5447
>                 URL: https://issues.apache.org/jira/browse/PIG-5447
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5447-v01.patch
>
>
> TestSkewedJoin.testSkewedJoinOuter is consistently failing for right-outer and full-outer joins.
> "Caused by: java.util.NoSuchElementException: next on empty iterator"
[jira] [Updated] (PIG-5437) Add lib and idea folder to .gitignore
[ https://issues.apache.org/jira/browse/PIG-5437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5437:
------------------------------------
    Fix Version/s: 0.18.0
     Hadoop Flags: Reviewed
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

+1. Committed to trunk and branch-0.18. Thanks for the contribution [~maswin]

> Add lib and idea folder to .gitignore
> -------------------------------------
>
>                 Key: PIG-5437
>                 URL: https://issues.apache.org/jira/browse/PIG-5437
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Alagappan Maruthappan
>            Assignee: Alagappan Maruthappan
>            Priority: Minor
>             Fix For: 0.18.0
>
>         Attachments: PIG-5437-0.patch
>
[jira] [Updated] (PIG-5420) Update accumulo dependency to 1.10.1
[ https://issues.apache.org/jira/browse/PIG-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5420:
------------------------------------
    Fix Version/s: 0.18.1

> Update accumulo dependency to 1.10.1
> ------------------------------------
>
>                 Key: PIG-5420
>                 URL: https://issues.apache.org/jira/browse/PIG-5420
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Trivial
>             Fix For: 0.18.1
>
>         Attachments: pig-5420-v01.patch
>
>
> Following an OWASP/CVE report.
[jira] [Updated] (PIG-5419) Upgrade Joda time version
[ https://issues.apache.org/jira/browse/PIG-5419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5419:
------------------------------------
    Fix Version/s: 0.18.1
                       (was: 0.18.0)

Can you update to 2.12.5?

> Upgrade Joda time version
> -------------------------
>
>                 Key: PIG-5419
>                 URL: https://issues.apache.org/jira/browse/PIG-5419
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Venkatasubrahmanian Narayanan
>            Assignee: Venkatasubrahmanian Narayanan
>            Priority: Minor
>             Fix For: 0.18.1
>
>         Attachments: PIG-5419.patch
>
>
> Pig depends on an older version of Joda time, which can result in conflicts with other versions in some workflows. Upgrading it to the latest version (2.10.13) will resolve Pig's side of such issues.
[jira] [Resolved] (PIG-5440) Extra jars needed for hive3
[ https://issues.apache.org/jira/browse/PIG-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy resolved PIG-5440.
-------------------------------------
    Fix Version/s: 0.18.0
     Hadoop Flags: Reviewed
       Resolution: Fixed

Committed to trunk and branch-0.18. Thanks [~knoguchi]

> Extra jars needed for hive3
> ---------------------------
>
>                 Key: PIG-5440
>                 URL: https://issues.apache.org/jira/browse/PIG-5440
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Minor
>             Fix For: 0.18.0
>
>         Attachments: pig-5440-v01.patch, pig-5440-v02.patch
>
>
> When testing Hive3, e2e tests were failing with
> {{Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/llap/security/LlapSigner$Signable}} etc.
> Updating dependent classes.
[jira] [Updated] (PIG-5438) Update SparkCounter.Accumulator to AccumulatorV2
[ https://issues.apache.org/jira/browse/PIG-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5438:
------------------------------------
    Fix Version/s: 0.19.0

> Update SparkCounter.Accumulator to AccumulatorV2
> ------------------------------------------------
>
>                 Key: PIG-5438
>                 URL: https://issues.apache.org/jira/browse/PIG-5438
>             Project: Pig
>          Issue Type: Improvement
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Trivial
>             Fix For: 0.19.0
>
>         Attachments: pig-5438-v01.patch
>
>
> Original Accumulator is deprecated in Spark2 and gone in Spark3.
> AccumulatorV2 is usable on both Spark2 and Spark3.
[jira] [Updated] (PIG-5439) Support Spark 3 and drop SparkShim
[ https://issues.apache.org/jira/browse/PIG-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5439:
------------------------------------
    Fix Version/s: 0.19.0

> Support Spark 3 and drop SparkShim
> ----------------------------------
>
>                 Key: PIG-5439
>                 URL: https://issues.apache.org/jira/browse/PIG-5439
>             Project: Pig
>          Issue Type: Improvement
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>             Fix For: 0.19.0
>
>         Attachments: pig-5439-v01.patch
>
>
> Support Pig-on-Spark to run on spark3.
> Initial version would only run up to Spark 3.2.4 and not on 3.3 or 3.4.
> This is due to log4j mismatch.
> After moving to log4j2 (PIG-5426), we can move Spark to 3.3 or higher.
> So far, not all unit/e2e tests pass with the proposed patch but at least
> compilation goes through.
[jira] [Updated] (PIG-5414) Build failure on Linux ARM64 due to old Apache Avro
[ https://issues.apache.org/jira/browse/PIG-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5414:
------------------------------------
    Fix Version/s: 0.18.1

> Build failure on Linux ARM64 due to old Apache Avro
> ---------------------------------------------------
>
>                 Key: PIG-5414
>                 URL: https://issues.apache.org/jira/browse/PIG-5414
>             Project: Pig
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 0.18.0
>            Reporter: Martin Tzvetanov Grigorov
>            Assignee: Martin Tzvetanov Grigorov
>            Priority: Major
>             Fix For: 0.18.1
>
>         Attachments: 35.patch, TEST-org.apache.pig.builtin.TestAvroStorage.txt, TEST-org.apache.pig.builtin.TestOrcStorage.txt, TEST-org.apache.pig.builtin.TestOrcStoragePushdown.txt
>
>
> Trying to build Apache Pig on Ubuntu 20.04.3 ARM64 fails because of old version of Snappy and Avro libraries:
>
> {code:java}
> Testsuite: org.apache.pig.builtin.TestAvroStorage
> Tests run: 0, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.1 sec
> ------------- Standard Output ---------------
> 2021-10-12 14:43:35,483 [main] INFO  org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 1431830528 to monitor. collectionUsageThreshold = 1064828928, usageThreshold = 1064828928
> 2021-10-12 14:43:35,489 [main] INFO  org.apache.pig.ExecTypeProvider - Trying ExecType : LOCAL
> 2021-10-12 14:43:35,489 [main] INFO  org.apache.pig.ExecTypeProvider - Picked LOCAL as the ExecType
> 2021-10-12 14:43:35,515 [main] WARN  org.apache.hadoop.conf.Configuration - DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
> 2021-10-12 14:43:35,755 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
> 2021-10-12 14:43:35,899 [main] WARN  org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2021-10-12 14:43:35,916 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
> 2021-10-12 14:43:36,116 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
> 2021-10-12 14:43:36,137 [main] INFO  org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-01426621-bc19-499f-981e-b13959fe0d84
> 2021-10-12 14:43:36,137 [main] WARN  org.apache.pig.PigServer - ATS is disabled since yarn.timeline-service.enabled set to false
> 2021-10-12 14:43:36,150 [main] INFO  org.apache.pig.builtin.TestAvroStorage - creating test/org/apache/pig/builtin/avro/data/avro/uncompressed/arraysAsOutputByPig.avro
> 2021-10-12 14:43:36,502 [main] INFO  org.apache.pig.builtin.TestAvroStorage - Could not generate avro file: test/org/apache/pig/builtin/avro/data/avro/uncompressed/arraysAsOutputByPig.avro
> java.net.ConnectException: Call From martin/127.0.0.1 to localhost:40073 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1479)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>   at com.sun.proxy.$Proxy13.getBlockLocations(Unknown Source)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:255)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> ...
> {code}
[jira] [Updated] (PIG-5418) Utils.parseSchema(String), parseConstant(String) leak memory
[ https://issues.apache.org/jira/browse/PIG-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5418:
------------------------------------
    Fix Version/s: 0.18.1

> Utils.parseSchema(String), parseConstant(String) leak memory
> ------------------------------------------------------------
>
>                 Key: PIG-5418
>                 URL: https://issues.apache.org/jira/browse/PIG-5418
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Jacob Tolar
>            Assignee: Jacob Tolar
>            Priority: Minor
>             Fix For: 0.18.1
>
>         Attachments: PIG-5418.patch
>
>
> A minor issue: I noticed that Utils.parseSchema() and parseConstant() leak memory. I noticed this while running a unit test for a UDF several thousand times and checking the heap.
> Links are to latest commit as of creating this ticket:
> https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L244-L256
> {{new PigContext()}} [creates a MapReduce ExecutionEngine|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/PigContext.java#L269].
> This creates a [MapReduceLauncher|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRExecutionEngine.java#L34].
> This registers a [Hadoop shutdown hook|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java#L104-L105] which doesn't go away until the JVM dies. See: https://hadoop.apache.org/docs/r2.8.2/hadoop-project-dist/hadoop-common/api/org/apache/hadoop/util/ShutdownHookManager.html .
> I will attach a proposed patch. From my reading of the code and running tests, the existing schema parse APIs do not actually use anything from this dummy PigContext, and with a minor tweak it can be passed in as NULL, avoiding the creation of these extra resources.
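The leak chain described above (each parse call builds a throwaway PigContext whose launcher registers a never-removed shutdown hook) can be reproduced in miniature with plain Java. In this sketch, `HookRegistry`, `Launcher`, and `parseSchemaLike` are hypothetical stand-ins for Hadoop's `ShutdownHookManager`, Pig's `MapReduceLauncher`, and `Utils.parseSchema`; the point is only the growth pattern, not the real APIs:

```java
import java.util.ArrayList;
import java.util.List;

// Miniature of the PIG-5418 leak: each Launcher adds a hook to a
// JVM-global registry (like Hadoop's singleton ShutdownHookManager),
// and the hooks are never removed, so repeated parse-style calls
// accumulate objects that survive until JVM exit.
public class LeakSketch {
    // Stand-in for the process-wide ShutdownHookManager.
    public static final List<Runnable> HOOKS = new ArrayList<>();

    static class Launcher {                  // stand-in for MapReduceLauncher
        Launcher() { HOOKS.add(() -> { }); } // hook lives until the JVM dies
    }

    public static void parseSchemaLike() {   // stand-in for Utils.parseSchema
        new Launcher();                      // fresh dummy context per call
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) parseSchemaLike();
        // The registry now holds one entry per call; in a long test run
        // this is the heap growth the reporter observed.
        System.out.println(HOOKS.size());
    }
}
```

The attached patch avoids creating the dummy PigContext at all, so nothing is registered per call and the registry stays flat.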
[jira] [Updated] (PIG-5443) Add testcase for skew join for tez grace shuffle vertex manager
[ https://issues.apache.org/jira/browse/PIG-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5443:
------------------------------------
    Description: 
Need to add test case for fix in https://issues.apache.org/jira/browse/PIG-5441. Can just modify one of the existing skewed join unit or e2e test cases by increasing mappers (split size) or adding PARALLEL 2 for right side data. Also check if one-one edges are affected by this part of the code.

  was:
Need to add test case for fix in https://issues.apache.org/jira/browse/PIG-5441. Can just modify one of the existing skewed join unit or e2e test cases by increasing mappers (split size) or adding PARALLEL 2 for right side data.

> Add testcase for skew join for tez grace shuffle vertex manager
> ---------------------------------------------------------------
>
>                 Key: PIG-5443
>                 URL: https://issues.apache.org/jira/browse/PIG-5443
>             Project: Pig
>          Issue Type: Task
>            Reporter: Rohini Palaniswamy
>            Priority: Minor
>
> Need to add test case for fix in https://issues.apache.org/jira/browse/PIG-5441. Can just modify one of the existing skewed join unit or e2e test cases by increasing mappers (split size) or adding PARALLEL 2 for right side data. Also check if one-one edges are affected by this part of the code.
[jira] [Created] (PIG-5443) Add testcase for skew join for tez grace shuffle vertex manager
Rohini Palaniswamy created PIG-5443:
---------------------------------------

             Summary: Add testcase for skew join for tez grace shuffle vertex manager
                 Key: PIG-5443
                 URL: https://issues.apache.org/jira/browse/PIG-5443
             Project: Pig
          Issue Type: Task
            Reporter: Rohini Palaniswamy


Need to add test case for fix in https://issues.apache.org/jira/browse/PIG-5441. Can just modify one of the existing skewed join unit or e2e test cases by increasing mappers (split size) or adding PARALLEL 2 for right side data.
[jira] [Resolved] (PIG-5442) Add only credentials from setStoreLocation to the Job Conf
[ https://issues.apache.org/jira/browse/PIG-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy resolved PIG-5442.
-------------------------------------
    Fix Version/s: 0.18.0
     Hadoop Flags: Reviewed
       Resolution: Fixed

+1. Committed to branch-0.18 and trunk. Thanks for the contribution [~maswin]

> Add only credentials from setStoreLocation to the Job Conf
> ----------------------------------------------------------
>
>                 Key: PIG-5442
>                 URL: https://issues.apache.org/jira/browse/PIG-5442
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Alagappan Maruthappan
>            Assignee: Alagappan Maruthappan
>            Priority: Major
>             Fix For: 0.18.0
>
>         Attachments: PIG-5442-1.patch
>
>
> While testing HCatStorer with Iceberg, I realized Pig calls setStoreLocation on all Stores with the same Job object - [https://github.com/apache/pig/blob/b050a33c66fc22d648370b5c6bda04e0e51d3aa3/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java#L1081]
> Settings populated by one store affect the other stores. In my case, "mapred.output.committer.class" is set to HiveIcebergCommitter by the PigStore used by the Iceberg table, and the other stores that insert data into non-Iceberg tables also pick up that setting and try to use HiveIcebergCommitter.
>
> On checking with [~rohini], it is called to get the credentials from all stores, since the addCredentials API was added later and not all stores have implemented it, and some still set configuration in the setLocation method (i.e., HCatStorer).
>
> Fixed it by passing a separate copy of the Job object to each store's setLocation method and adding only the credential object from the call.
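The fix described in the last paragraph can be sketched with plain maps standing in for Hadoop's `Job` configuration and `Credentials` objects (all names here are hypothetical illustrations; the real patch operates on `org.apache.hadoop.mapreduce.Job`):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the PIG-5442 approach: give each store a private copy of the
// job conf so its setStoreLocation() side effects stay local, then merge
// back only the credentials the store added. Maps model conf/credentials.
public class CredentialsOnlySketch {
    public static final Map<String, String> jobConf = new HashMap<>();   // shared job conf
    public static final Map<String, String> jobCreds = new HashMap<>();  // shared credentials

    // One store's setStoreLocation() runs against a throwaway conf copy.
    public static void runStore() {
        Map<String, String> privateConf = new HashMap<>(jobConf);
        Map<String, String> storeCreds = new HashMap<>();
        privateConf.put("mapred.output.committer.class", "HiveIcebergCommitter");
        storeCreds.put("tokenA", "secretA");
        jobCreds.putAll(storeCreds);  // only credentials flow back
        // privateConf is discarded: the committer setting never reaches jobConf
    }

    public static boolean committerLeaked()  { return jobConf.containsKey("mapred.output.committer.class"); }
    public static boolean credentialMerged() { return jobCreds.containsKey("tokenA"); }

    public static void main(String[] args) {
        runStore();
        System.out.println(committerLeaked());   // false
        System.out.println(credentialMerged());  // true
    }
}
```

Under this scheme the Iceberg store's committer class can no longer bleed into the other stores' output configuration, while its delegation tokens still reach the shared job.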
[jira] [Updated] (PIG-5442) Add only credentials from setStoreLocation to the Job Conf
[ https://issues.apache.org/jira/browse/PIG-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5442:
------------------------------------
    Attachment: PIG-5442-1.patch

> Add only credentials from setStoreLocation to the Job Conf
> ----------------------------------------------------------
>
>                 Key: PIG-5442
>                 URL: https://issues.apache.org/jira/browse/PIG-5442
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Alagappan Maruthappan
>            Assignee: Alagappan Maruthappan
>            Priority: Major
>         Attachments: PIG-5442-1.patch
>
>
> While testing HCatStorer with Iceberg, I realized Pig calls setStoreLocation on all Stores with the same Job object - [https://github.com/apache/pig/blob/b050a33c66fc22d648370b5c6bda04e0e51d3aa3/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java#L1081]
> Settings populated by one store affect the other stores. In my case, "mapred.output.committer.class" is set to HiveIcebergCommitter by the PigStore used by the Iceberg table, and the other stores that insert data into non-Iceberg tables also pick up that setting and try to use HiveIcebergCommitter.
>
> On checking with [~rohini], it is called to get the credentials from all stores, since the addCredentials API was added later and not all stores have implemented it, and some still set configuration in the setLocation method (i.e., HCatStorer).
>
> Fixed it by passing a separate copy of the Job object to each store's setLocation method and adding only the credential object from the call.
[jira] [Updated] (PIG-5441) Pig skew join tez grace reducer fails to find shuffle data
[ https://issues.apache.org/jira/browse/PIG-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5441:
------------------------------------
     Hadoop Flags: Reviewed
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

Patch committed to branch-0.18 and trunk. Thanks [~yigress] for the contribution.

> Pig skew join tez grace reducer fails to find shuffle data
> ----------------------------------------------------------
>
>                 Key: PIG-5441
>                 URL: https://issues.apache.org/jira/browse/PIG-5441
>             Project: Pig
>          Issue Type: Bug
>          Components: tez
>    Affects Versions: 0.17.0
>            Reporter: Yi Zhang
>            Assignee: Yi Zhang
>            Priority: Major
>             Fix For: 0.18.0
>
>         Attachments: PIG-5441.patch
>
>
> A user's Pig-on-Tez skew join failed to find shuffle data from the sampler aggregate vertex. The right side of the join has >1 reducers.
> As a workaround, adjust tez.runtime.transfer.data-via-events.max-size to avoid spilling to disk in the sampler aggregation vertex.
[jira] [Commented] (PIG-5441) Pig skew join tez grace reducer fails to find shuffle data
[ https://issues.apache.org/jira/browse/PIG-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17725781#comment-17725781 ] Rohini Palaniswamy commented on PIG-5441: - +1. Can you just attach the patch to the jira? > Pig skew join tez grace reducer fails to find shuffle data > -- > > Key: PIG-5441 > URL: https://issues.apache.org/jira/browse/PIG-5441 > Project: Pig > Issue Type: Bug > Components: tez >Affects Versions: 0.17.0 >Reporter: Yi Zhang >Assignee: Yi Zhang >Priority: Major > Fix For: 0.18.0 > > > User pig tez skew join encountered issue of not finding shuffle data from the > sampler aggregate vertex. The right side join has >1 reducers. > For workaround adjust tez.runtime.transfer.data-via-events.max-size to avoid > spill to disk for the sampler aggregation vertex. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5441) Pig skew join tez grace reducer fails to find shuffle data
[ https://issues.apache.org/jira/browse/PIG-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5441: Fix Version/s: 0.18.0 Assignee: Yi Zhang Status: Patch Available (was: Open) > Pig skew join tez grace reducer fails to find shuffle data > -- > > Key: PIG-5441 > URL: https://issues.apache.org/jira/browse/PIG-5441 > Project: Pig > Issue Type: Bug > Components: tez >Affects Versions: 0.17.0 >Reporter: Yi Zhang >Assignee: Yi Zhang >Priority: Major > Fix For: 0.18.0 > > > User pig tez skew join encountered issue of not finding shuffle data from the > sampler aggregate vertex. The right side join has >1 reducers. > For workaround adjust tez.runtime.transfer.data-via-events.max-size to avoid > spill to disk for the sampler aggregation vertex. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5440) Extra jars needed for hive3
[ https://issues.apache.org/jira/browse/PIG-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17725780#comment-17725780 ] Rohini Palaniswamy commented on PIG-5440: - +1 > Extra jars needed for hive3 > --- > > Key: PIG-5440 > URL: https://issues.apache.org/jira/browse/PIG-5440 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5440-v01.patch, pig-5440-v02.patch > > > When testing Hive3, e2e tests were failing with > {{Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/hive/llap/security/LlapSigner$Signable}} etc. > Updating dependent classes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5440) Extra jars needed for hive3
[ https://issues.apache.org/jira/browse/PIG-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17722276#comment-17722276 ] Rohini Palaniswamy commented on PIG-5440: - +1. Can you add a space between "orc-shims","aircompressor" before commit? > Extra jars needed for hive3 > --- > > Key: PIG-5440 > URL: https://issues.apache.org/jira/browse/PIG-5440 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5440-v01.patch > > > When testing Hive3, e2e tests were failing with > {{Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/hive/llap/security/LlapSigner$Signable}} etc. > Updating dependent classes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (PIG-5432) OrcStorage fails to detect schema in some cases
[ https://issues.apache.org/jira/browse/PIG-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17706981#comment-17706981 ] Rohini Palaniswamy edited comment on PIG-5432 at 3/30/23 5:33 PM: -- +1. Committed to branch-0.18 and trunk. Thanks for the contribution [~jtolar] was (Author: rohini): +1. Committed to branch-0.18 and trunk. Thanks for contribution [~jtolar] > OrcStorage fails to detect schema in some cases > --- > > Key: PIG-5432 > URL: https://issues.apache.org/jira/browse/PIG-5432 > Project: Pig > Issue Type: Bug >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5432.v01.patch > > > OrcStorage needs to detect the schema of input data paths. If some data paths > have no ORC files (perhaps only a _SUCCESS marker is present), this will > fail. > For example: > {code} > A = LOAD '/path/to/20230101,/path/to/20230102' USING OrcStorage(); > {code} > If {{/path/to/20230101}} contains only a _SUCCESS marker and {{20230102}} > contains data, OrcStorage fails to detect the schema and Pig exits with a > confusing/unhelpful error, something like "Cannot find any ORC files from > . Probably multiple load/store statements in script." > The code tries to use a search algorithm to recursively search through all > input paths for the data (via Utils.depthFirstSearchForFile), but it is > implemented incorrectly and returns early in this scenario. > See: > https://github.com/apache/pig/blob/c0d75ba930f9aa5c6454d0264a96f82b45279202/src/org/apache/pig/builtin/OrcStorage.java#L389-L408 > https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L629-L667 > I'll attach a patch. -- This message was sent by Atlassian Jira (v8.20.10#820010)
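The search bug described in PIG-5432 comes down to returning as soon as one input path yields nothing. A minimal sketch of the corrected behavior follows; the class and method names are illustrative, not the actual `Utils.depthFirstSearchForFile` signature:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Predicate;

public class FirstMatchingFile {

    // Search every root in turn; a directory holding only a _SUCCESS marker
    // simply yields null here, and the loop moves on to the remaining input
    // paths instead of giving up early.
    static Path depthFirstSearch(Iterable<Path> roots, Predicate<Path> matches)
            throws IOException {
        for (Path root : roots) {
            Path found = search(root, matches);
            if (found != null) {
                return found; // first real match wins
            }
        }
        return null; // only after every path has been exhausted
    }

    private static Path search(Path dir, Predicate<Path> matches) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {
                    Path found = search(entry, matches);
                    if (found != null) {
                        return found;
                    }
                } else if (matches.test(entry)) {
                    return entry;
                }
            }
        }
        return null;
    }
}
```

In the LOAD example above, the first path (`/path/to/20230101`, containing only `_SUCCESS`) returns null and the search falls through to `/path/to/20230102`, where the ORC file for schema detection is found.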
[jira] [Updated] (PIG-5432) OrcStorage fails to detect schema in some cases
[ https://issues.apache.org/jira/browse/PIG-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5432: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) +1. Committed to branch-0.18 and trunk. Thanks for the contribution [~jtolar] > OrcStorage fails to detect schema in some cases > --- > > Key: PIG-5432 > URL: https://issues.apache.org/jira/browse/PIG-5432 > Project: Pig > Issue Type: Bug >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5432.v01.patch > > > OrcStorage needs to detect the schema of input data paths. If some data paths > have no ORC files (perhaps only a _SUCCESS marker is present), this will > fail. > For example: > {code} > A = LOAD '/path/to/20230101,/path/to/20230102' USING OrcStorage(); > {code} > If {{/path/to/20230101}} contains only a _SUCCESS marker and {{20230102}} > contains data, OrcStorage fails to detect the schema and Pig exits with a > confusing/unhelpful error, something like "Cannot find any ORC files from > . Probably multiple load/store statements in script." > The code tries to use a search algorithm to recursively search through all > input paths for the data (via Utils.depthFirstSearchForFile), but it is > implemented incorrectly and returns early in this scenario. > See: > https://github.com/apache/pig/blob/c0d75ba930f9aa5c6454d0264a96f82b45279202/src/org/apache/pig/builtin/OrcStorage.java#L389-L408 > https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L629-L667 > I'll attach a patch. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5432) OrcStorage fails to detect schema in some cases
[ https://issues.apache.org/jira/browse/PIG-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5432: --- Fix Version/s: 0.18.0 Assignee: Jacob Tolar > OrcStorage fails to detect schema in some cases > --- > > Key: PIG-5432 > URL: https://issues.apache.org/jira/browse/PIG-5432 > Project: Pig > Issue Type: Bug >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5432.v01.patch > > > OrcStorage needs to detect the schema of input data paths. If some data paths > have no ORC files (perhaps only a _SUCCESS marker is present), this will > fail. > For example: > {code} > A = LOAD '/path/to/20230101,/path/to/20230102' USING OrcStorage(); > {code} > If {{/path/to/20230101}} contains only a _SUCCESS marker and {{20230102}} > contains data, OrcStorage fails to detect the schema and Pig exits with a > confusing/unhelpful error, something like "Cannot find any ORC files from > . Probably multiple load/store statements in script." > The code tries to use a search algorithm to recursively search through all > input paths for the data (via Utils.depthFirstSearchForFile), but it is > implemented incorrectly and returns early in this scenario. > See: > https://github.com/apache/pig/blob/c0d75ba930f9aa5c6454d0264a96f82b45279202/src/org/apache/pig/builtin/OrcStorage.java#L389-L408 > https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L629-L667 > I'll attach a patch. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: Branching for 0.18 release
This is done now and the release branch is at https://svn.apache.org/repos/asf/pig/branches/branch-0.18. On Sun, Jan 15, 2023 at 6:08 PM Rohini Palaniswamy wrote: > Hi all, > Will be creating a branch for the 0.18 release from trunk tomorrow > afternoon. > > Regards, > Rohini >
[jira] [Resolved] (PIG-5436) update owasp version
[ https://issues.apache.org/jira/browse/PIG-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5436. - Fix Version/s: 0.18.0 Hadoop Flags: Reviewed Resolution: Fixed Committed to trunk. Thanks Koji > update owasp version > > > Key: PIG-5436 > URL: https://issues.apache.org/jira/browse/PIG-5436 > Project: Pig > Issue Type: Test >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.0 > > Attachments: pig-5436-v01.patch > > > Owasp testing started to fail with > {quote}Caused by: org.h2.jdbc.JdbcBatchUpdateException: Value too long for > column "VERSIONENDEXCLUDING VARCHAR(50) SELECTIVITY 1" > {quote} > > Following https://github.com/jeremylong/DependencyCheck/issues/5225, updating > the owasp version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5435) pig.exec.reducers.max does not take effect for skewed join
[ https://issues.apache.org/jira/browse/PIG-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5435: Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks [~vnarayanan7] and [~knoguchi]. > pig.exec.reducers.max does not take effect for skewed join > -- > > Key: PIG-5435 > URL: https://issues.apache.org/jira/browse/PIG-5435 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5435-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5434) Migrate from log4j to reload4j
[ https://issues.apache.org/jira/browse/PIG-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5434: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Koji > Migrate from log4j to reload4j > -- > > Key: PIG-5434 > URL: https://issues.apache.org/jira/browse/PIG-5434 > Project: Pig > Issue Type: Improvement > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5434-1.patch > > > Was trying to migrate to log4j2.x (PIG-5426) but was running into issues. As > 0.18 is delayed long enough, migrating to reload4j in this release similar to > HADOOP-18088. Will migrate to log4j2.x in the next release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5417: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thank you for the contribution [~xiaoheipangzi] > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5417-1.patch > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5436) update owasp version
[ https://issues.apache.org/jira/browse/PIG-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677558#comment-17677558 ] Rohini Palaniswamy commented on PIG-5436: - +1 > update owasp version > > > Key: PIG-5436 > URL: https://issues.apache.org/jira/browse/PIG-5436 > Project: Pig > Issue Type: Test >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5436-v01.patch > > > Owasp testing started to fail with > {quote}Caused by: org.h2.jdbc.JdbcBatchUpdateException: Value too long for > column "VERSIONENDEXCLUDING VARCHAR(50) SELECTIVITY 1" > {quote} > > Following https://github.com/jeremylong/DependencyCheck/issues/5225, updating > the owasp version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5435) pig.exec.reducers.max does not take effect for skewed join
[ https://issues.apache.org/jira/browse/PIG-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5435: Status: Patch Available (was: Open) This is a patch that I had reviewed internally. [~knoguchi] can you +1 here. > pig.exec.reducers.max does not take effect for skewed join > -- > > Key: PIG-5435 > URL: https://issues.apache.org/jira/browse/PIG-5435 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5435-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5435) pig.exec.reducers.max does not take effect for skewed join
[ https://issues.apache.org/jira/browse/PIG-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5435: Attachment: PIG-5435-1.patch > pig.exec.reducers.max does not take effect for skewed join > -- > > Key: PIG-5435 > URL: https://issues.apache.org/jira/browse/PIG-5435 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5435-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5435) pig.exec.reducers.max does not take effect for skewed join
Rohini Palaniswamy created PIG-5435: --- Summary: pig.exec.reducers.max does not take effect for skewed join Key: PIG-5435 URL: https://issues.apache.org/jira/browse/PIG-5435 Project: Pig Issue Type: Bug Reporter: Rohini Palaniswamy Assignee: Venkatasubrahmanian Narayanan Fix For: 0.18.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5417: Status: Patch Available (was: Open) > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5417-1.patch > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677523#comment-17677523 ] Rohini Palaniswamy commented on PIG-5417: - Downloaded https://patch-diff.githubusercontent.com/raw/apache/pig/pull/36.patch and was going to commit it, but compilation failed as it did not catch IOException. Updated the patch with a try/catch block. [~knoguchi], can you +1 as there is a minor change from the original patch? Thought I'd get this into the release as it addresses a CVE. > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5417-1.patch > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
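The patch itself isn't shown here, but the substitution the comment describes can be sketched as follows. This is a hedged illustration, not the actual PIG-5417 change: the wrapper name `TempDirs.createTempDir` is made up, and the try/catch is needed because `java.nio.file.Files.createTempDirectory` throws a checked `IOException` where Guava's `Files.createTempDir()` did not:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempDirs {

    // Guava's createTempDir() threw an unchecked exception; the NIO
    // replacement throws a checked IOException, which is why code that
    // compiled against the original patch needed a try/catch added.
    static Path createTempDir(String prefix) {
        try {
            // On POSIX systems this creates the directory with owner-only
            // permissions, addressing the weakness CVE-2020-8908 describes
            // in the Guava version.
            return Files.createTempDirectory(prefix);
        } catch (IOException e) {
            throw new UncheckedIOException("could not create temp dir", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(createTempDir("pig-demo"));
    }
}
```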
[jira] [Updated] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5417: Attachment: PIG-5417-1.patch > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5417-1.patch > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5417: --- Assignee: lujie (was: lujie) > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5417: --- Fix Version/s: 0.18.0 Assignee: lujie > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5434) Migrate from log4j to reload4j
[ https://issues.apache.org/jira/browse/PIG-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5434: Status: Patch Available (was: Open) This patch also upgrades to the latest slf4j version > Migrate from log4j to reload4j > -- > > Key: PIG-5434 > URL: https://issues.apache.org/jira/browse/PIG-5434 > Project: Pig > Issue Type: Improvement > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5434-1.patch > > > Was trying to migrate to log4j2.x (PIG-5426) but was running into issues. As > 0.18 is delayed long enough, migrating to reload4j in this release similar to > HADOOP-18088. Will migrate to log4j2.x in the next release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5434) Migrate from log4j to reload4j
[ https://issues.apache.org/jira/browse/PIG-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5434: Attachment: PIG-5434-1.patch > Migrate from log4j to reload4j > -- > > Key: PIG-5434 > URL: https://issues.apache.org/jira/browse/PIG-5434 > Project: Pig > Issue Type: Improvement > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5434-1.patch > > > Was trying to migrate to log4j2.x (PIG-5426) but was running into issues. As > 0.18 is delayed long enough, migrating to reload4j in this release similar to > HADOOP-18088. Will migrate to log4j2.x in the next release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5426) Migrate to log4j2.x
[ https://issues.apache.org/jira/browse/PIG-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5426: Fix Version/s: 0.19.0 (was: 0.18.0) Running into issues. As 0.18 is delayed long enough, migrating to reload4j in that release as part of PIG-5434. Will migrate to log4j2.x in the next release. > Migrate to log4j2.x > --- > > Key: PIG-5426 > URL: https://issues.apache.org/jira/browse/PIG-5426 > Project: Pig > Issue Type: Improvement > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.19.0 > > > Hadoop (HADOOP-18088) decided to migrate to reload4j to address log4j1.x > vulnerabilities. I did the work of migrating Oozie server and client to > log4j2.x while launched hadoop jobs will still use 1.x till hadoop migrates. > So think it should be easy to do that for pig client as well. If it does not > work as expected, will just go with the easy switch to reload4j. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5434) Migrate from log4j to reload4j
Rohini Palaniswamy created PIG-5434: --- Summary: Migrate from log4j to reload4j Key: PIG-5434 URL: https://issues.apache.org/jira/browse/PIG-5434 Project: Pig Issue Type: Improvement Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.18.0 Was trying to migrate to log4j2.x (PIG-5426) but was running into issues. As 0.18 is delayed long enough, migrating to reload4j in this release similar to HADOOP-18088. Will migrate to log4j2.x in the next release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Branching for 0.18 release
Hi all, Will be creating a branch for the 0.18 release from trunk tomorrow afternoon. Regards, Rohini
[jira] [Updated] (PIG-5431) Date datatype is different between Hive 1.x and Hive 3.x
[ https://issues.apache.org/jira/browse/PIG-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5431: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Koji for the review. > Date datatype is different between Hive 1.x and Hive 3.x > > > Key: PIG-5431 > URL: https://issues.apache.org/jira/browse/PIG-5431 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5431-1.patch > > > java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Date cannot > be cast to java.sql.Date -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
[ https://issues.apache.org/jira/browse/PIG-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5433: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Koji. > Fix test failures with TestHBaseStorage and htrace dependency > - > > Key: PIG-5433 > URL: https://issues.apache.org/jira/browse/PIG-5433 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5433-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5431) Date datatype is different between Hive 1.x and Hive 3.x
[ https://issues.apache.org/jira/browse/PIG-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5431: Status: Patch Available (was: Open) > Date datatype is different between Hive 1.x and Hive 3.x > > > Key: PIG-5431 > URL: https://issues.apache.org/jira/browse/PIG-5431 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5431-1.patch > > > java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Date cannot > be cast to java.sql.Date -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5431) Date datatype is different between Hive 1.x and Hive 3.x
[ https://issues.apache.org/jira/browse/PIG-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5431: Attachment: PIG-5431-1.patch > Date datatype is different between Hive 1.x and Hive 3.x > > > Key: PIG-5431 > URL: https://issues.apache.org/jira/browse/PIG-5431 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5431-1.patch > > > java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Date cannot > be cast to java.sql.Date -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
[ https://issues.apache.org/jira/browse/PIG-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5433: Attachment: PIG-5433-1.patch > Fix test failures with TestHBaseStorage and htrace dependency > - > > Key: PIG-5433 > URL: https://issues.apache.org/jira/browse/PIG-5433 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5433-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
[ https://issues.apache.org/jira/browse/PIG-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5433: Status: Patch Available (was: Open) > Fix test failures with TestHBaseStorage and htrace dependency > - > > Key: PIG-5433 > URL: https://issues.apache.org/jira/browse/PIG-5433 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5433-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
[ https://issues.apache.org/jira/browse/PIG-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677083#comment-17677083 ] Rohini Palaniswamy commented on PIG-5433: - Ran into below test failure {code} org/apache/htrace/core/Tracer$Builder java.lang.NoClassDefFoundError: org/apache/htrace/core/Tracer$Builder at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3256) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3310) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3278) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:475) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:228) at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:68) at org.apache.pig.backend.hadoop.datastorage.HDataStorage.&lt;init&gt;(HDataStorage.java:58) at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:227) at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:111) at org.apache.pig.impl.PigContext.connect(PigContext.java:310) at org.apache.pig.PigServer.&lt;init&gt;(PigServer.java:232) at org.apache.pig.PigServer.&lt;init&gt;(PigServer.java:220) at org.apache.pig.PigServer.&lt;init&gt;(PigServer.java:212) at org.apache.pig.PigServer.&lt;init&gt;(PigServer.java:208) at org.apache.pig.builtin.TestOrcStorage.setup(TestOrcStorage.java:109) Caused by: java.lang.ClassNotFoundException: org.apache.htrace.core.Tracer$Builder at java.net.URLClassLoader.findClass(URLClassLoader.java:382) at java.lang.ClassLoader.loadClass(ClassLoader.java:418) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355) at java.lang.ClassLoader.loadClass(ClassLoader.java:351) {code} and java.io.IOException: Waiting for startup of standalone server in TestHBaseStorage described in 
https://stackoverflow.com/questions/67364593/java-io-ioexception-waiting-for-startup-of-standalone-server-minizookeeperclu > Fix test failures with TestHBaseStorage and htrace dependency > - > > Key: PIG-5433 > URL: https://issues.apache.org/jira/browse/PIG-5433 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
Rohini Palaniswamy created PIG-5433: --- Summary: Fix test failures with TestHBaseStorage and htrace dependency Key: PIG-5433 URL: https://issues.apache.org/jira/browse/PIG-5433 Project: Pig Issue Type: Bug Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.18.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5431) Date datatype is different between Hive 1.x and Hive 3.x
Rohini Palaniswamy created PIG-5431: --- Summary: Date datatype is different between Hive 1.x and Hive 3.x Key: PIG-5431 URL: https://issues.apache.org/jira/browse/PIG-5431 Project: Pig Issue Type: Bug Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.18.0 java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Date cannot be cast to java.sql.Date -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5321) Upgrade Spark 2 version to 2.2.0 for Pig on Spark
[ https://issues.apache.org/jira/browse/PIG-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5321. - Resolution: Duplicate This has been already fixed by [~knoguchi] as part of PIG-5397 with Spark 2 version being upgraded to 2.4.8. > Upgrade Spark 2 version to 2.2.0 for Pig on Spark > - > > Key: PIG-5321 > URL: https://issues.apache.org/jira/browse/PIG-5321 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Ádám Szita >Priority: Major > > Right now we maintain support for 2 versions of Spark for PoS jobs: > spark1.version=1.6.1 > spark2.version=2.1.1 > I believe we should move forward with the latter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5430) TestTezGraceParallelism failing due to tez log change
[ https://issues.apache.org/jira/browse/PIG-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17618002#comment-17618002 ] Rohini Palaniswamy commented on PIG-5430: - +1 > TestTezGraceParallelism failing due to tez log change > - > > Key: PIG-5430 > URL: https://issues.apache.org/jira/browse/PIG-5430 > Project: Pig > Issue Type: Test >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.0 > > Attachments: pig-5430-v01.patch > > > After PIG-5428, TestTezGraceParallelism:testIncreaseParallelism, > testDecreaseParallelism started failing due to change in log messages by > recent Tez. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5429) Update hbase version from 2.0.0 to 2.4.14
[ https://issues.apache.org/jira/browse/PIG-5429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17618001#comment-17618001 ] Rohini Palaniswamy commented on PIG-5429: - +1 > Update hbase version from 2.0.0 to 2.4.14 > - > > Key: PIG-5429 > URL: https://issues.apache.org/jira/browse/PIG-5429 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5429-v01.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5428) Update hadoop2,3 and tez to recent versions
[ https://issues.apache.org/jira/browse/PIG-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616638#comment-17616638 ] Rohini Palaniswamy commented on PIG-5428: - +1 > Update hadoop2,3 and tez to recent versions > --- > > Key: PIG-5428 > URL: https://issues.apache.org/jira/browse/PIG-5428 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Fix For: 0.18.0 > > Attachments: pig-5428-v01.patch > > > PIG-5253 hadoop3 patch is committed. > Now, updating hadoop2&3, tez and other dependent library versions. > Only testing using two different parameters. > * -Dhbaseversion=2 -Dhadoopversion=2 -Dhiveversion=1 -Dsparkversion=2 > and > * -Dhbaseversion=2 -Dhadoopversion=3 -Dhiveversion=3 -Dsparkversion=2 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5406) TestJoinLocal imports org.python.google.common.collect.Lists instead of org.google.common.collect.Lists
[ https://issues.apache.org/jira/browse/PIG-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5406: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Daniel for the review. > TestJoinLocal imports org.python.google.common.collect.Lists instead of > org.google.common.collect.Lists > --- > > Key: PIG-5406 > URL: https://issues.apache.org/jira/browse/PIG-5406 > Project: Pig > Issue Type: Bug >Affects Versions: 0.15.0, 0.16.0, 0.17.0 >Reporter: James Z.M. Gao > Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5406-v1.patch > > > [PIG-4366|https://github.com/apache/pig/commit/81abb6bd0adb6e101898d67b3c2a9e35e11ce993] > make PIG-2861 coming back. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5426) Migrate to log4j2.x
Rohini Palaniswamy created PIG-5426: --- Summary: Migrate to log4j2.x Key: PIG-5426 URL: https://issues.apache.org/jira/browse/PIG-5426 Project: Pig Issue Type: Improvement Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.18.0 Hadoop (HADOOP-18088) decided to migrate to reload4j to address log4j1.x vulnerabilities. I did the work of migrating the Oozie server and client to log4j2.x while launched Hadoop jobs will still use 1.x until Hadoop migrates. So I think it should be easy to do the same for the Pig client as well. If it does not work as expected, we will just go with the easy switch to reload4j. -- This message was sent by Atlassian Jira (v8.20.10#820010)
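(Editor's note: migrating a client from log4j 1.x to log4j 2.x typically means replacing the log4j.properties file with one in the log4j2 properties format. A minimal illustrative log4j2.properties — not Pig's actual configuration — could look like this.)

```properties
# Console-only root logger at INFO, log4j2 properties syntax
rootLogger.level = info
rootLogger.appenderRef.console.ref = Console

appender.console.type = Console
appender.console.name = Console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{ISO8601} [%t] %-5p %c{2} - %m%n
```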
[jira] [Assigned] (PIG-5388) Upgrade to Avro and Trevni 1.9.x
[ https://issues.apache.org/jira/browse/PIG-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5388: --- Fix Version/s: 0.18.0 Assignee: Rohini Palaniswamy Summary: Upgrade to Avro and Trevni 1.9.x (was: Upgrade to Avro 1.9.x) > Upgrade to Avro and Trevni 1.9.x > > > Key: PIG-5388 > URL: https://issues.apache.org/jira/browse/PIG-5388 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár > Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5419) Upgrade Joda time version
[ https://issues.apache.org/jira/browse/PIG-5419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5419: Fix Version/s: 0.18.0 Can you update to 2.11.0 (https://www.joda.org/joda-time/changes-report.html#a2.11.0)? > Upgrade Joda time version > - > > Key: PIG-5419 > URL: https://issues.apache.org/jira/browse/PIG-5419 > Project: Pig > Issue Type: Improvement >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5419.patch > > > Pig depends on an older version of Joda time, which can result in conflicts > with other versions in some workflows. Upgrading it to the latest > version(2.10.13) will resolve Pig's side of such issues. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5406) TestJoinLocal imports org.python.google.common.collect.Lists instead of org.google.common.collect.Lists
[ https://issues.apache.org/jira/browse/PIG-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5406: Attachment: PIG-5406-v1.patch > TestJoinLocal imports org.python.google.common.collect.Lists instead of > org.google.common.collect.Lists > --- > > Key: PIG-5406 > URL: https://issues.apache.org/jira/browse/PIG-5406 > Project: Pig > Issue Type: Bug >Affects Versions: 0.15.0, 0.16.0, 0.17.0 >Reporter: James Z.M. Gao > Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5406-v1.patch > > > [PIG-4366|https://github.com/apache/pig/commit/81abb6bd0adb6e101898d67b3c2a9e35e11ce993] > make PIG-2861 coming back. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5406) TestJoinLocal imports org.python.google.common.collect.Lists instead of org.google.common.collect.Lists
[ https://issues.apache.org/jira/browse/PIG-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5406: Status: Patch Available (was: Open) > TestJoinLocal imports org.python.google.common.collect.Lists instead of > org.google.common.collect.Lists > --- > > Key: PIG-5406 > URL: https://issues.apache.org/jira/browse/PIG-5406 > Project: Pig > Issue Type: Bug >Affects Versions: 0.17.0, 0.16.0, 0.15.0 >Reporter: James Z.M. Gao > Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5406-v1.patch > > > [PIG-4366|https://github.com/apache/pig/commit/81abb6bd0adb6e101898d67b3c2a9e35e11ce993] > make PIG-2861 coming back. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5406) TestJoinLocal imports org.python.google.common.collect.Lists instead of org.google.common.collect.Lists
[ https://issues.apache.org/jira/browse/PIG-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5406: --- Fix Version/s: 0.18.0 Assignee: Rohini Palaniswamy Priority: Minor (was: Major) > TestJoinLocal imports org.python.google.common.collect.Lists instead of > org.google.common.collect.Lists > --- > > Key: PIG-5406 > URL: https://issues.apache.org/jira/browse/PIG-5406 > Project: Pig > Issue Type: Bug >Affects Versions: 0.15.0, 0.16.0, 0.17.0 >Reporter: James Z.M. Gao > Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > > [PIG-4366|https://github.com/apache/pig/commit/81abb6bd0adb6e101898d67b3c2a9e35e11ce993] > make PIG-2861 coming back. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5423) Upgrade hadoop/tez dependency
[ https://issues.apache.org/jira/browse/PIG-5423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579177#comment-17579177 ] Rohini Palaniswamy commented on PIG-5423: - [~knoguchi], you mentioned having to add tez_conf.set("tez.runtime.transfer.data-via-events.enabled", "false"); to fix some test failures. Can the patch be updated with that? > Upgrade hadoop/tez dependency > -- > > Key: PIG-5423 > URL: https://issues.apache.org/jira/browse/PIG-5423 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5423-v01.patch > > > We already have PIG-5253 for supporting hadoop3. Here, upgrading hadoop2 > dependency to the most recent hadoop2 version, 2.10.1. > Also, upgrading Tez to 0.9.2. (0.10.1 showed some regressions and needs > further checking). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5422) Upgrade guava/groovy dependency
[ https://issues.apache.org/jira/browse/PIG-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5422. - Fix Version/s: 0.18.0 Hadoop Flags: Reviewed Resolution: Fixed +1. Committed to trunk. Thanks Koji. > Upgrade guava/groovy dependency > --- > > Key: PIG-5422 > URL: https://issues.apache.org/jira/browse/PIG-5422 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.0 > > Attachments: pig-5422-v01.patch, pig-5422-v02.patch > > > Following owasp/cve. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5421) Upgrade commons dependencies
[ https://issues.apache.org/jira/browse/PIG-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5421. - Fix Version/s: 0.18.0 Hadoop Flags: Reviewed Resolution: Fixed Committed to trunk. Thanks Koji > Upgrade commons dependencies > - > > Key: PIG-5421 > URL: https://issues.apache.org/jira/browse/PIG-5421 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.0 > > Attachments: pig-5421-v01.patch > > > Following owasp/cve report -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5253. - Hadoop Flags: Reviewed Resolution: Fixed > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár >Assignee: Ádám Szita >Priority: Blocker > Fix For: 0.18.0 > > Attachments: PIG-5253-v3.patch, PIG-5253.0.patch, PIG-5253.0.svn.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reopened PIG-5253: - > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár >Assignee: Ádám Szita >Priority: Blocker > Fix For: 0.18.0 > > Attachments: PIG-5253-v3.patch, PIG-5253.0.patch, PIG-5253.0.svn.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5377. - Hadoop Flags: Reviewed Resolution: Fixed > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
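(Editor's note: the refactoring described in PIG-5377 relies on JDK 8 default methods. The following is a minimal self-contained sketch of the idea — the class and interface names here are illustrative stand-ins, not Pig's actual StoreFuncInterface: the capability flag lives on the interface with a default of false, and individual storers opt in by overriding, instead of the framework keeping a static list of supported classes.)

```java
// Sketch of a capability flag as a default interface method (JDK 8+).
interface StoreFuncLike {
    // Default answer: most storers do not support parallel writes
    // to a single store location.
    default boolean supportsParallelWriteToStoreLocation() {
        return false;
    }
}

class PlainStorer implements StoreFuncLike {
    // Inherits the default "false"; no code needed here.
}

class ParallelStorer implements StoreFuncLike {
    @Override
    public boolean supportsParallelWriteToStoreLocation() {
        return true; // this storer opts in explicitly
    }
}

public class Main {
    public static void main(String[] args) {
        System.out.println(new PlainStorer().supportsParallelWriteToStoreLocation());    // false
        System.out.println(new ParallelStorer().supportsParallelWriteToStoreLocation()); // true
    }
}
```

The design benefit is that adding a new parallel-capable storer only requires overriding one method on the storer itself, rather than editing a central list.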
[jira] [Updated] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5377: Patch Info: (was: Patch Available) > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reopened PIG-5377: - > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5425. - Hadoop Flags: Reviewed Resolution: Fixed > Pig 0.15 and later don't set context signature correctly > > > Key: PIG-5425 > URL: https://issues.apache.org/jira/browse/PIG-5425 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5425.0.patch > > > As an author of Pig UDFs, my expectation in EvalFunc ( > [https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java] > ) is that {{setUDFContextSignature}} would be called before > {{setInputSchema}}. This was previously the case up through Pig 0.14. > > In Pig 0.15 and later (according to the git tags, at least; I've only checked > 0.17), this is not true. > This commit introduces the problem behavior: > [https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81] > The issue is in > src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java > line 513 ([git blame > link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]) > introduced in that commit. > > There, {{f.setInputSchema()}} is called without previously calling > {{f.setUDFContextSignature(signature)}}. > Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} > is called, but POUserFunc [re-instantiates the EvalFunc and does not actually > use the func argument passed in its > constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128] > (quite confusing, but probably attributable to changes over time). > {{f}} is discarded, so it should be safe to simply call > {{f.setUDFContextSignature(signature)}} as a simple fix. 
> The code here is arguably unnecessarily complex and could probably be cleaned > up further, but I propose the simple fix above without a larger refactoring. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
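(Editor's note: the ordering contract described above can be illustrated with a minimal self-contained sketch. These are not Pig's real EvalFunc/UDFContext classes — the names and behavior are illustrative assumptions — but they show why a UDF that keys its state by context signature breaks when setInputSchema() runs before setUDFContextSignature().)

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a UDFContext-style store keyed by signature.
class FakeUdfContext {
    static final Map<String, String> store = new HashMap<>();
}

// Stand-in for an EvalFunc that stashes its input schema under its signature.
class MyUdf {
    private String signature;

    void setUDFContextSignature(String signature) {
        this.signature = signature;
    }

    void setInputSchema(String schema) {
        // If the signature has not been set yet, the schema would be stored
        // under a null/wrong key -- the symptom reported in PIG-5425.
        if (signature == null) {
            throw new IllegalStateException("signature not set before schema");
        }
        FakeUdfContext.store.put(signature, schema);
    }
}

public class Main {
    public static void main(String[] args) {
        MyUdf f = new MyUdf();
        // The fix: set the context signature first, then the input schema.
        f.setUDFContextSignature("myudf-1");
        f.setInputSchema("a:int, b:chararray");
        System.out.println(FakeUdfContext.store.get("myudf-1"));
    }
}
```

Reversing the two calls in main would throw, which is the sketch's analogue of the silent misbehavior the original ordering caused.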
[jira] [Reopened] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reopened PIG-5425: - > Pig 0.15 and later don't set context signature correctly > > > Key: PIG-5425 > URL: https://issues.apache.org/jira/browse/PIG-5425 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5425.0.patch > > > As an author of Pig UDFs, my expectation in EvalFunc ( > [https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java] > ) is that {{setUDFContextSignature}} would be called before > {{setInputSchema}}. This was previously the case up through Pig 0.14 > > In Pig 0.15 and later (according to the git tags, at least; I've only checked > 0.17), this is not true. > This commit introduces the problem behavior: > [https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81] > The issue is in > src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java > line 513 ([git blame > link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]) > introduced in that commit. > > There, {{f.setInputSchema()}} is called without previously calling > {{f.setUDFContextSignature(signature)}}. > Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} > is called, but POUserFunc [re-instantiates the EvalFunc and does not actually > use the func argument passed in its > constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128] > (quite confusing, but probably attributable to changes over time). > {{f}} is discarded, so it should be safe to simply call > {{f.setUdfContextSignature(signature)}} as a simple fix. 
> The code here is arguably unnecessarily complex and could probably be cleaned > up further, but I propose the simple fix above without a larger refactoring. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5425: Resolution: Fixed Status: Resolved (was: Patch Available) > Pig 0.15 and later don't set context signature correctly > > > Key: PIG-5425 > URL: https://issues.apache.org/jira/browse/PIG-5425 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5425.0.patch > > > As an author of Pig UDFs, my expectation in EvalFunc ( > [https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java] > ) is that {{setUDFContextSignature}} would be called before > {{setInputSchema}}. This was previously the case up through Pig 0.14 > > In Pig 0.15 and later (according to the git tags, at least; I've only checked > 0.17), this is not true. > This commit introduces the problem behavior: > [https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81] > The issue is in > src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java > line 513 ([git blame > link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]) > introduced in that commit. > > There, {{f.setInputSchema()}} is called without previously calling > {{f.setUDFContextSignature(signature)}}. > Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} > is called, but POUserFunc [re-instantiates the EvalFunc and does not actually > use the func argument passed in its > constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128] > (quite confusing, but probably attributable to changes over time). > {{f}} is discarded, so it should be safe to simply call > {{f.setUdfContextSignature(signature)}} as a simple fix. 
> The code here is arguably unnecessarily complex and could probably be cleaned > up further, but I propose the simple fix above without a larger refactoring. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5425: --- Fix Version/s: 0.18.0 Assignee: Jacob Tolar +1. Committed to trunk. Thanks for the patch [~jtolar] > Pig 0.15 and later don't set context signature correctly > > > Key: PIG-5425 > URL: https://issues.apache.org/jira/browse/PIG-5425 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5425.0.patch > > > As an author of Pig UDFs, my expectation in EvalFunc ( > [https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java] > ) is that {{setUDFContextSignature}} would be called before > {{setInputSchema}}. This was previously the case up through Pig 0.14 > > In Pig 0.15 and later (according to the git tags, at least; I've only checked > 0.17), this is not true. > This commit introduces the problem behavior: > [https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81] > The issue is in > src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java > line 513 ([git blame > link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]) > introduced in that commit. > > There, {{f.setInputSchema()}} is called without previously calling > {{f.setUDFContextSignature(signature)}}. > Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} > is called, but POUserFunc [re-instantiates the EvalFunc and does not actually > use the func argument passed in its > constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128] > (quite confusing, but probably attributable to changes over time). 
> {{f}} is discarded, so it should be safe to simply call > {{f.setUdfContextSignature(signature)}} as a simple fix. > The code here is arguably unnecessarily complex and could probably be cleaned > up further, but I propose the simple fix above without a larger refactoring. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5420) Update accumulo dependency to 1.10.1
[ https://issues.apache.org/jira/browse/PIG-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579158#comment-17579158 ] Rohini Palaniswamy commented on PIG-5420: - This patch needs updating as accumulo.version is now moved to ivy/libraries-h2.properties and ivy/libraries-h3.properties after PIG-5253 > Update accumulo dependency to 1.10.1 > > > Key: PIG-5420 > URL: https://issues.apache.org/jira/browse/PIG-5420 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5420-v01.patch > > > Following owasp/cve report. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5253. - Resolution: Fixed Attached [^PIG-5253.0.svn.patch] from https://reviews.apache.org/r/72326/ to jira. Fixed the wrong license file and committed [^PIG-5253-v3.patch] to trunk. Thanks [~nkollar] and [~szita] for this key patch. > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár >Assignee: Ádám Szita >Priority: Blocker > Fix For: 0.18.0 > > Attachments: PIG-5253-v3.patch, PIG-5253.0.patch, PIG-5253.0.svn.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5253: Attachment: PIG-5253.0.svn.patch PIG-5253-v3.patch > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár >Assignee: Ádám Szita >Priority: Blocker > Fix For: 0.18.0 > > Attachments: PIG-5253-v3.patch, PIG-5253.0.patch, PIG-5253.0.svn.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5377: Resolution: Fixed Status: Resolved (was: Patch Available) > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5377: Fix Version/s: 0.18.0 Committed to trunk. Thank you for the contribution [~kpriceyahoo]. > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: Review Request 72326: PIG-5253: Pig Hadoop 3 support
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72326/#review224613 --- trunk/test/org/apache/pig/test/MapReduceMiniCluster.java Lines 1 (patched) <https://reviews.apache.org/r/72326/#comment313397> Koji found that this file has the Cloudera license. Can you replace with Apache license? trunk/test/org/apache/pig/test/MapReduceMiniCluster.java Lines 41 (patched) <https://reviews.apache.org/r/72326/#comment313399> m_conf.set("dfs.datanode.address", "0.0.0.0:0"); m_conf.set("dfs.datanode.http.address", "0.0.0.0:0"); m_conf.set("pig.jobcontrol.sleep", "100"); System.setProperty("cluster", m_conf.get(MRConfiguration.JOB_TRACKER)); System.setProperty("namenode", m_conf.get(FileSystem.FS_DEFAULT_NAME_KEY)); is missing compared to older MiniCluster.java. Not sure datanode address settings are needed but setting pig.jobcontrol.sleep is likely needed to have the tests run faster. trunk/test/org/apache/pig/test/TezMiniCluster.java Line 61 (original), 65 (patched) <https://reviews.apache.org/r/72326/#comment313398> Can you add tez_conf.set("tez.runtime.transfer.data-via-events.enabled", "false"); here. Koji found that some tests were failing with Hadoop 3 in local mode without that setting. - Rohini Palaniswamy On April 6, 2020, 2:57 p.m., Adam Szita wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/72326/ > ------- > > (Updated April 6, 2020, 2:57 p.m.) > > > Review request for pig, Koji Noguchi, Nandor Kollar, and Rohini Palaniswamy. > > > Bugs: PIG-5253 > https://issues.apache.org/jira/browse/PIG-5253 > > > Repository: pig > > > Description > --- > > This it continuing from https://reviews.apache.org/r/65239/ (there's issues > with reviewboard's pig-git repo) > This change is now rebased on trunk, and I fixed test issues around the MR > mode MiniGenericCluster refactoring. 
> > > Diffs > - > > trunk/bin/pig 1876187 > trunk/bin/pig.py 1876187 > trunk/build.xml 1876187 > trunk/ivy.xml 1876187 > trunk/ivy/libraries-h2.properties PRE-CREATION > trunk/ivy/libraries-h3.properties PRE-CREATION > trunk/ivy/libraries.properties 1876187 > trunk/test/org/apache/pig/parser/TestErrorHandling.java 1876187 > trunk/test/org/apache/pig/parser/TestQueryParserUtils.java 1876187 > trunk/test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION > trunk/test/org/apache/pig/test/MiniCluster.java 1876187 > trunk/test/org/apache/pig/test/MiniGenericCluster.java 1876187 > trunk/test/org/apache/pig/test/SparkMiniCluster.java 1876187 > trunk/test/org/apache/pig/test/TestGrunt.java 1876187 > trunk/test/org/apache/pig/test/TezMiniCluster.java 1876187 > trunk/test/org/apache/pig/test/Util.java 1876187 > trunk/test/org/apache/pig/test/YarnMiniCluster.java 1876187 > > > Diff: https://reviews.apache.org/r/72326/diff/1/ > > > Testing > --- > > > Thanks, > > Adam Szita > >
[jira] [Resolved] (PIG-3544) Pig fails to query Apache Cassandra 2.x
[ https://issues.apache.org/jira/browse/PIG-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-3544. - Resolution: Not A Problem Closing as it is a bug in Cassandra code base. > Pig fails to query Apache Cassandra 2.x > --- > > Key: PIG-3544 > URL: https://issues.apache.org/jira/browse/PIG-3544 > Project: Pig > Issue Type: Bug > Components: build >Affects Versions: 0.12.0 > Environment: CentOS 6.4 - 2.6.32-279.19.1.el6.centos.plus.x86_64 >Reporter: Claudio Romo Otto >Priority: Blocker > > Using Apache Pig 0.12 with Apache Cassandra 2.x (2.0.0 / 2.0.1), > with this sample request > data = LOAD 'cql://keyspace1/testcf?' USING CqlStorage(); > testcf is just any CF > I get this error: > java.lang.RuntimeException: InvalidRequestException(why:Undefined name > key_alias in selection clause) > at > org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.initSchema(AbstractCassandraStorage.java:511) > at > org.apache.cassandra.hadoop.pig.CqlStorage.setLocation(CqlStorage.java:246) > at > org.apache.cassandra.hadoop.pig.CqlStorage.getSchema(CqlStorage.java:280) > at > org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:151) > at > org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:110) > at > org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100) > at > org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:219) > at > org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75) > at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50) > at > org.apache.pig.newplan.logical.visitor.CastLineageSetter.(CastLineageSetter.java:57) > at org.apache.pig.PigServer$Graph.compile(PigServer.java:1635) > at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1566) > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1538) > at 
org.apache.pig.PigServer.registerQuery(PigServer.java:540) > at > org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:970) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165) > at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69) > at org.apache.pig.Main.run(Main.java:490) > at org.apache.pig.Main.main(Main.java:111) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.main(RunJar.java:212) > Caused by: InvalidRequestException(why:Undefined name key_alias in selection > clause) > at > org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result$execute_cql3_query_resultStandardScheme.read(Cassandra.java:48006) > at > org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result$execute_cql3_query_resultStandardScheme.read(Cassandra.java:47983) > at > org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result.read(Cassandra.java:47898) > at > org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) > at > org.apache.cassandra.thrift.Cassandra$Client.recv_execute_cql3_query(Cassandra.java:1658) > at > org.apache.cassandra.thrift.Cassandra$Client.execute_cql3_query(Cassandra.java:1643) > at > org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.getCfDef(AbstractCassandraStorage.java:573) > at > org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.initSchema(AbstractCassandraStorage.java:500) > ... 25 more -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5424) Upgrade hbase/zookeeper dependencies
[ https://issues.apache.org/jira/browse/PIG-5424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17521827#comment-17521827 ] Rohini Palaniswamy commented on PIG-5424: - +1 > Upgrade hbase/zookeeper dependencies > > > Key: PIG-5424 > URL: https://issues.apache.org/jira/browse/PIG-5424 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5424-v01.patch > > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PIG-5422) Upgrade guava/groovy dependency
[ https://issues.apache.org/jira/browse/PIG-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17521824#comment-17521824 ] Rohini Palaniswamy commented on PIG-5422: - Updating the guava version will cause issues with Hadoop unless we shade it. > Upgrade guava/groovy dependency > --- > > Key: PIG-5422 > URL: https://issues.apache.org/jira/browse/PIG-5422 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5422-v01.patch > > > Following owasp/cve. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PIG-5421) Upgrade commons dependencies
[ https://issues.apache.org/jira/browse/PIG-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17521823#comment-17521823 ] Rohini Palaniswamy commented on PIG-5421: - +1 > Upgrade commons dependencies > - > > Key: PIG-5421 > URL: https://issues.apache.org/jira/browse/PIG-5421 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5421-v01.patch > > > Following owasp/cve report -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PIG-5420) Update accumulo dependency to 1.10.1
[ https://issues.apache.org/jira/browse/PIG-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17521822#comment-17521822 ] Rohini Palaniswamy commented on PIG-5420: - Can we just make it accumulo.version instead of accumulo1.version? > Update accumulo dependency to 1.10.1 > > > Key: PIG-5420 > URL: https://issues.apache.org/jira/browse/PIG-5420 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5420-v01.patch > > > Following owasp/cve report. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PIG-5413) [spark] TestStreaming.testInputCacheSpecs failing with "File script1.pl was already registered"
[ https://issues.apache.org/jira/browse/PIG-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17473177#comment-17473177 ] Rohini Palaniswamy commented on PIG-5413: - +1 > [spark] TestStreaming.testInputCacheSpecs failing with "File script1.pl was > already registered" > --- > > Key: PIG-5413 > URL: https://issues.apache.org/jira/browse/PIG-5413 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5413-v01.patch > > > {noformat} > Caused by: java.lang.IllegalArgumentException: requirement failed: File > script1.pl was already registered with a different path (old path = > /tmp/yarn-local/usercache/knoguchi/appcache/application_1628754354801_523406/container_e07_1628754354801_523406_01_61/tmp/pig_junit_tmp1798933174/cache7028476439694979845/script1.pl, > new path = > /tmp/yarn-local/usercache/knoguchi/appcache/application_1628754354801_523406/container_e07_1628754354801_523406_01_61/tmp/pig_junit_tmp1798933174/cache4167672945345635171/script1.pl > at scala.Predef$.require(Predef.scala:224) > at > org.apache.spark.rpc.netty.NettyStreamManager.addFile(NettyStreamManager.scala:70) > at org.apache.spark.SparkContext.addFile(SparkContext.scala:1559) > ... > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
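The `requirement failed` error quoted above comes from Spark refusing to re-register a file name under a different local path. A minimal sketch of that invariant (a hypothetical `FileRegistry` class for illustration, not Spark's actual `NettyStreamManager` code) is:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the check behind the failure above: a file name may be
// registered repeatedly, but always with the same path. The Pig test tripped
// this by registering script1.pl from two different temp cache directories.
class FileRegistry {
    private final Map<String, String> files = new HashMap<>();

    void addFile(String name, String path) {
        String old = files.putIfAbsent(name, path);
        if (old != null && !old.equals(path)) {
            throw new IllegalArgumentException("requirement failed: File " + name
                    + " was already registered with a different path (old path = "
                    + old + ", new path = " + path + ")");
        }
    }
}
```

Registering the same name/path pair twice is harmless; only a conflicting path raises the exception, which matches the Scala `require` in the stack trace.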
[jira] [Commented] (PIG-5415) [spark] TestScriptLanguage conflict between multiple SparkContext (after spark2.4 upgrade)
[ https://issues.apache.org/jira/browse/PIG-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457792#comment-17457792 ] Rohini Palaniswamy commented on PIG-5415: - +1 > [spark] TestScriptLanguage conflict between multiple SparkContext (after > spark2.4 upgrade) > -- > > Key: PIG-5415 > URL: https://issues.apache.org/jira/browse/PIG-5415 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5415-v01.patch > > > {noformat} > 2021-10-12 17:54:40,073 [main] ERROR org.apache.pig.scripting.BoundScript - > Pig pipeline failed to complete > java.util.concurrent.ExecutionException: > org.apache.pig.backend.executionengine.ExecException: ERROR 0: > java.lang.IllegalStateException: Cannot call methods on a stopped > SparkContext. > This stopped SparkContext was created at: > org.apache.spark.api.java.JavaSparkContext.(JavaSparkContext.scala:58) > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.startSparkIfNeeded(SparkLauncher.java:640) > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:184) > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290) > org.apache.pig.PigServer.launchPlan(PigServer.java:1479) > org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464) > org.apache.pig.PigServer.execute(PigServer.java:1453) > org.apache.pig.PigServer.executeBatch(PigServer.java:489) > org.apache.pig.PigServer.executeBatch(PigServer.java:472) > org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:172) > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:235) > org.apache.pig.scripting.BoundScript$MyCallable.call(BoundScript.java:347) > org.apache.pig.scripting.BoundScript$MyCallable.call(BoundScript.java:323) > java.util.concurrent.FutureTask.run(FutureTask.java:266) > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > java.lang.Thread.run(Thread.java:748) > The currently active SparkContext was created at: > org.apache.spark.api.java.JavaSparkContext.(JavaSparkContext.scala:58) > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.startSparkIfNeeded(SparkLauncher.java:640) > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:184) > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290) > org.apache.pig.PigServer.launchPlan(PigServer.java:1479) > org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464) > org.apache.pig.PigServer.execute(PigServer.java:1453) > org.apache.pig.PigServer.executeBatch(PigServer.java:489) > org.apache.pig.PigServer.executeBatch(PigServer.java:472) > org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:172) > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:235) > org.apache.pig.scripting.BoundScript$MyCallable.call(BoundScript.java:347) > org.apache.pig.scripting.BoundScript$MyCallable.call(BoundScript.java:323) > java.util.concurrent.FutureTask.run(FutureTask.java:266) > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > java.lang.Thread.run(Thread.java:748) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
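The failure mode in the stack traces above — a later script run calling into a SparkContext that an earlier run already stopped — can be sketched with a toy lifecycle model (hypothetical classes; the actual fix in SparkLauncher may differ):

```java
// Toy model of the bug: a statically cached context survives across script
// runs, but any call after stop() fails. A getOrCreate that checks the
// stopped flag and recreates the context avoids the IllegalStateException.
class ContextReuseDemo {
    static class Context {
        boolean stopped = false;
        void stop() { stopped = true; }
        void runJob() {
            if (stopped) {
                throw new IllegalStateException(
                        "Cannot call methods on a stopped SparkContext.");
            }
        }
    }

    static Context cached; // mimics a static SparkContext held by the launcher

    static Context getOrCreate() {
        if (cached == null || cached.stopped) {
            cached = new Context(); // recreate rather than reuse a stopped context
        }
        return cached;
    }
}
```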
[jira] [Commented] (PIG-5398) SparkLauncher does not read SPARK_CONF_DIR/spark-defaults.conf
[ https://issues.apache.org/jira/browse/PIG-5398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457791#comment-17457791 ] Rohini Palaniswamy commented on PIG-5398: - +1 > SparkLauncher does not read SPARK_CONF_DIR/spark-defaults.conf > --- > > Key: PIG-5398 > URL: https://issues.apache.org/jira/browse/PIG-5398 > Project: Pig > Issue Type: Bug >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5398-v01.patch > > > Noticed while testing spark e2e tests. Somehow, Pig's spark launcher is not > reading SPARK_CONF_DIR at all. -- This message was sent by Atlassian Jira (v8.20.1#820001)
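For reference, `spark-defaults.conf` is a plain key/value file: blank lines and `#` comments are ignored, and each remaining line is a property name followed by whitespace and its value. A minimal reader (an illustrative sketch, not the code from the attached patch) looks like:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Parses spark-defaults.conf content: skips blanks and '#' comments, and
// splits each remaining line into "key<whitespace>value".
class SparkDefaultsReader {
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] kv = line.split("\\s+", 2);
            if (kv.length == 2) {
                props.put(kv[0], kv[1].trim());
            }
        }
        return props;
    }
}
```

A launcher honoring `SPARK_CONF_DIR` would locate the file via that environment variable before parsing it this way.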
[jira] [Commented] (PIG-5412) testSkewedJoinOuter spark unit-test failing with ClassNotFoundException
[ https://issues.apache.org/jira/browse/PIG-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457790#comment-17457790 ] Rohini Palaniswamy commented on PIG-5412: - +1 > testSkewedJoinOuter spark unit-test failing with ClassNotFoundException > --- > > Key: PIG-5412 > URL: https://issues.apache.org/jira/browse/PIG-5412 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5412-v01.patch > > > {TestSkewedJoin,TestJoinSmoke}.testSkewedJoinOuter > both with {{-Dtest.exec.type=spark -Dsparkversion=2}} > are somehow failing with > "java.lang.ClassNotFoundException: > org.apache.pig.backend.hadoop.executionengine.spark.Spark1Shims" > {noformat} > Unable to open iterator for alias C. Backend error : Job aborted. > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to > open iterator for alias C. Backend error : Job aborted. > at org.apache.pig.PigServer.openIterator(PigServer.java:1014) > at > org.apache.pig.test.TestJoinSmoke.testSkewedJoinOuter(TestJoinSmoke.java:199) > Caused by: org.apache.spark.SparkException: Job aborted. 
> at > org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:100) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1083) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) > at org.apache.spark.rdd.RDD.withScope(RDD.scala:385) > at > org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1081) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply$mcV$sp(PairRDDFunctions.scala:1000) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:991) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:991) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) > at org.apache.spark.rdd.RDD.withScope(RDD.scala:385) > at > org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:991) > at > org.apache.pig.backend.hadoop.executionengine.spark.converter.StoreConverter.convert(StoreConverter.java:99) > at > org.apache.pig.backend.hadoop.executionengine.spark.converter.StoreConverter.convert(StoreConverter.java:56) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.physicalToRDD(JobGraphBuilder.java:292) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:182) > at > 
org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37) > at > org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87) > at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46) > at > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:240) > at > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290) > at org.apache.pig.PigServer.launchPlan(PigServer.java:1479) > at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464) > at org.apache.pig.PigServer.storeEx(PigServer.java:1123) > at org.apache.pig.PigServer.store(PigServer.java:1086) > at org.apache.pig.PigServer.openIterator(PigServer.java:999) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 0 in stage 19.0 failed 4 times, most recent failure: Lost task 0.3 in > stage 19.0 (TID 26, gsrd466n11.red.ygrid.yahoo.com, executor 2): > org.apache.spark.SparkException: Task failed while writing rows > at > org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark
[jira] [Commented] (PIG-5397) Update spark2.version to 2.4.8
[ https://issues.apache.org/jira/browse/PIG-5397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457786#comment-17457786 ] Rohini Palaniswamy commented on PIG-5397: - +1 > Update spark2.version to 2.4.8 > -- > > Key: PIG-5397 > URL: https://issues.apache.org/jira/browse/PIG-5397 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5397-v01.patch, pig-5397-v02.patch > > > bq. spark2.version=2.1.1 > This is probably too old. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (PIG-5418) Utils.parseSchema(String), parseConstant(String) leak memory
[ https://issues.apache.org/jira/browse/PIG-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5418: --- Assignee: Jacob Tolar > Utils.parseSchema(String), parseConstant(String) leak memory > > > Key: PIG-5418 > URL: https://issues.apache.org/jira/browse/PIG-5418 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Minor > Attachments: PIG-5418.patch > > > A minor issue: Utils.parseSchema() and parseConstant() leak > memory. I noticed this while running a unit test for a UDF several thousand > times and checking the heap. > Links are to the latest commit as of creating this ticket: > https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L244-L256 > {{new PigContext()}} [creates a MapReduce > ExecutionEngine|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/PigContext.java#L269]. > > This creates a > [MapReduceLauncher|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRExecutionEngine.java#L34]. > > This registers a [Hadoop shutdown > hook|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java#L104-L105] > which doesn't go away until the JVM dies. See: > https://hadoop.apache.org/docs/r2.8.2/hadoop-project-dist/hadoop-common/api/org/apache/hadoop/util/ShutdownHookManager.html > . > I will attach a proposed patch. From my reading of the code and running > tests, the existing schema parse APIs do not actually use anything from this > dummy PigContext, and with a minor tweak it can be passed in as NULL, > avoiding the creation of these extra resources. -- This message was sent by Atlassian Jira (v8.20.1#820001)
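The accumulation described in the report above — one never-removed shutdown hook per `parseSchema()` call, each pinning the dummy context it references — can be reproduced with a toy registry (hypothetical names; Hadoop's real `ShutdownHookManager` has the same keep-until-JVM-exit behavior):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the leak: every "parse" builds heavyweight context state and
// registers a shutdown hook that captures it, so neither can be garbage
// collected before the JVM exits.
class ShutdownHookLeakDemo {
    // Stand-in for Hadoop's ShutdownHookManager: entries live until JVM death.
    static final List<Runnable> HOOKS = new ArrayList<>();

    static void parseSchemaLeaky(String schema) {
        final byte[] contextState = new byte[1024 * 1024]; // simulates PigContext state
        // The hook captures contextState, pinning it for the life of the JVM.
        HOOKS.add(() -> System.out.println("cleanup of " + contextState.length + " bytes"));
    }

    static void parseSchemaFixed(String schema) {
        // Per the proposed patch, a null PigContext is passed instead, so no
        // launcher is built and no hook is registered per call.
    }
}
```

Calling the leaky variant N times leaves N live hooks (and N megabyte-sized payloads) behind, which is exactly the heap growth visible when running a UDF unit test thousands of times.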
[jira] [Assigned] (PIG-5414) Build failure on Linux ARM64 due to old Apache Avro
[ https://issues.apache.org/jira/browse/PIG-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5414: --- Assignee: Martin Tzvetanov Grigorov I will review this for you by early next week [~mgrigorov]. > Build failure on Linux ARM64 due to old Apache Avro > --- > > Key: PIG-5414 > URL: https://issues.apache.org/jira/browse/PIG-5414 > Project: Pig > Issue Type: Bug > Components: build >Affects Versions: 0.18.0 >Reporter: Martin Tzvetanov Grigorov >Assignee: Martin Tzvetanov Grigorov >Priority: Major > Attachments: 35.patch, > TEST-org.apache.pig.builtin.TestAvroStorage.txt, > TEST-org.apache.pig.builtin.TestOrcStorage.txt, > TEST-org.apache.pig.builtin.TestOrcStoragePushdown.txt > > > Trying to build Apache Pig on Ubuntu 20.04.3 ARM64 fails because of old > version of Snappy and Avro libraries: > > {code:java} > Testsuite: org.apache.pig.builtin.TestAvroStorage > Tests run: 0, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.1 sec > - Standard Output --- > 2021-10-12 14:43:35,483 [main] INFO > org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) > of size 1431830528 to monitor. collectionUsageThreshold = 1064828928, > usageThreshold = 1064828928 > 2021-10-12 14:43:35,489 [main] INFO org.apache.pig.ExecTypeProvider - > Trying ExecType : LOCAL > 2021-10-12 14:43:35,489 [main] INFO org.apache.pig.ExecTypeProvider - > Picked LOCAL as the ExecType > 2021-10-12 14:43:35,515 [main] WARN org.apache.hadoop.conf.Configuration - > DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml > is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml > to override properties of core-default.xml, mapred-default.xml and > hdfs-default.xml respectively > 2021-10-12 14:43:35,755 [main] INFO > org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is > deprecated. 
Instead, use mapreduce.jobtracker.address > 2021-10-12 14:43:35,899 [main] WARN org.apache.hadoop.util.NativeCodeLoader > - Unable to load native-hadoop library for your platform... using > builtin-java classes where applicable > 2021-10-12 14:43:35,916 [main] INFO > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting > to hadoop file system at: file:/// > 2021-10-12 14:43:36,116 [main] INFO > org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is > deprecated. Instead, use dfs.bytes-per-checksum > 2021-10-12 14:43:36,137 [main] INFO org.apache.pig.PigServer - Pig Script > ID for the session: PIG-default-01426621-bc19-499f-981e-b13959fe0d84 > 2021-10-12 14:43:36,137 [main] WARN org.apache.pig.PigServer - ATS is > disabled since yarn.timeline-service.enabled set to false > 2021-10-12 14:43:36,150 [main] INFO org.apache.pig.builtin.TestAvroStorage > - creating > test/org/apache/pig/builtin/avro/data/avro/uncompressed/arraysAsOutputByPig.avro > 2021-10-12 14:43:36,502 [main] INFO org.apache.pig.builtin.TestAvroStorage > - Could not generate avro file: > test/org/apache/pig/builtin/avro/data/avro/uncompressed/arraysAsOutputByPig.avro > java.net.ConnectException: Call From martin/127.0.0.1 to localhost:40073 > failed on connection exception: java.net.ConnectException: Connection > refused; For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792) > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732) > at org.apache.hadoop.ipc.Client.call(Client.java:1479) > at 
org.apache.hadoop.ipc.Client.call(Client.java:1412) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) > at com.sun.proxy.$Proxy13.getBlockLocations(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:255) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ... > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (PIG-5410) Support Python 3 for streaming_python
Rohini Palaniswamy created PIG-5410: --- Summary: Support Python 3 for streaming_python Key: PIG-5410 URL: https://issues.apache.org/jira/browse/PIG-5410 Project: Pig Issue Type: New Feature Reporter: Rohini Palaniswamy Assignee: Venkatasubrahmanian Narayanan Fix For: 0.18.0 Python 3 is incompatible with Python 2. We need to make it work with both. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PIG-5407) Update search bar for the site
[ https://issues.apache.org/jira/browse/PIG-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17288642#comment-17288642 ] Rohini Palaniswamy commented on PIG-5407: - +1 to v03 patch > Update search bar for the site > -- > > Key: PIG-5407 > URL: https://issues.apache.org/jira/browse/PIG-5407 > Project: Pig > Issue Type: Bug > Components: site >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5407-v01.patch, pig-5407-v02.patch, > pig-5407-v03.patch > > > It was recently reported that search-hadoop is no longer valid. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PIG-5407) Update search bar for the site
[ https://issues.apache.org/jira/browse/PIG-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17288635#comment-17288635 ] Rohini Palaniswamy commented on PIG-5407: - +1 > Update search bar for the site > -- > > Key: PIG-5407 > URL: https://issues.apache.org/jira/browse/PIG-5407 > Project: Pig > Issue Type: Bug > Components: site >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5407-v01.patch, pig-5407-v02.patch > > > It was recently reported that search-hadoop is no longer valid. -- This message was sent by Atlassian Jira (v8.3.4#803005)