[jira] [Commented] (PIG-5439) Support Spark 3 and drop SparkShim
[ https://issues.apache.org/jira/browse/PIG-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844427#comment-17844427 ]

Rohini Palaniswamy commented on PIG-5439:
-----------------------------------------

+1

> Support Spark 3 and drop SparkShim
> ----------------------------------
>
>                 Key: PIG-5439
>                 URL: https://issues.apache.org/jira/browse/PIG-5439
>             Project: Pig
>          Issue Type: Improvement
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>             Fix For: 0.19.0
>
>         Attachments: pig-5439-v01.patch, pig-5439-v02.patch
>
>
> Support Pig-on-Spark to run on spark3.
> Initial version would only run up to Spark 3.2.4 and not on 3.3 or 3.4.
> This is due to log4j mismatch.
> After moving to log4j2 (PIG-5426), we can move Spark to 3.3 or higher.
> So far, not all unit/e2e tests pass with the proposed patch but at least
> compilation goes through.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (PIG-5450) Pig-on-Spark3 E2E ORC test failing with java.lang.VerifyError: Bad return type
[ https://issues.apache.org/jira/browse/PIG-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837863#comment-17837863 ]

Rohini Palaniswamy commented on PIG-5450:
-----------------------------------------

+1

> Pig-on-Spark3 E2E ORC test failing with java.lang.VerifyError: Bad return type
> ------------------------------------------------------------------------------
>
>                 Key: PIG-5450
>                 URL: https://issues.apache.org/jira/browse/PIG-5450
>             Project: Pig
>          Issue Type: Bug
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5450-v01.patch
>
>
> {noformat}
> Caused by: java.lang.VerifyError: Bad return type
> Exception Details:
>   Location:
>     org/apache/orc/impl/TypeUtils.createColumn(Lorg/apache/orc/TypeDescription;Lorg/apache/orc/TypeDescription$RowBatchVersion;I)Lorg/apache/hadoop/hive/ql/exec/vector/ColumnVector; @117: areturn
>   Reason:
>     Type 'org/apache/hadoop/hive/ql/exec/vector/DateColumnVector' (current frame, stack[0]) is not assignable to 'org/apache/hadoop/hive/ql/exec/vector/ColumnVector' (from method signature)
> {noformat}
[jira] [Commented] (PIG-5449) TestEmptyInputDir failing on pig-on-spark3
[ https://issues.apache.org/jira/browse/PIG-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837862#comment-17837862 ]

Rohini Palaniswamy commented on PIG-5449:
-----------------------------------------

+1

> TestEmptyInputDir failing on pig-on-spark3
> ------------------------------------------
>
>                 Key: PIG-5449
>                 URL: https://issues.apache.org/jira/browse/PIG-5449
>             Project: Pig
>          Issue Type: Bug
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5449-v01.patch
>
>
> TestEmptyInputDir failing on pig-on-spark3 with
> {noformat:title=TestEmptyInputDir.testMergeJoinFailure}
> junit.framework.AssertionFailedError
>   at org.apache.pig.test.TestEmptyInputDir.testMergeJoin(TestEmptyInputDir.java:141)
> {noformat}
> {noformat:title=TestEmptyInputDir.testGroupByFailure}
> junit.framework.AssertionFailedError
>   at org.apache.pig.test.TestEmptyInputDir.testGroupBy(TestEmptyInputDir.java:80)
> {noformat}
> {noformat:title=TestEmptyInputDir.testBloomJoinOuterFailure}
> junit.framework.AssertionFailedError
>   at org.apache.pig.test.TestEmptyInputDir.testBloomJoinOuter(TestEmptyInputDir.java:297)
> {noformat}
> {noformat:title=TestEmptyInputDir.testFRJoinFailure}
> junit.framework.AssertionFailedError
>   at org.apache.pig.test.TestEmptyInputDir.testFRJoin(TestEmptyInputDir.java:171)
> {noformat}
> {noformat:title=TestEmptyInputDir.testBloomJoinFailure}
> junit.framework.AssertionFailedError
>   at org.apache.pig.test.TestEmptyInputDir.testBloomJoin(TestEmptyInputDir.java:267)
> {noformat}
[jira] [Commented] (PIG-5448) All TestHBaseStorage tests failing on pig-on-spark3
[ https://issues.apache.org/jira/browse/PIG-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837861#comment-17837861 ]

Rohini Palaniswamy commented on PIG-5448:
-----------------------------------------

+1

> All TestHBaseStorage tests failing on pig-on-spark3
> ---------------------------------------------------
>
>                 Key: PIG-5448
>                 URL: https://issues.apache.org/jira/browse/PIG-5448
>             Project: Pig
>          Issue Type: Bug
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Minor
>         Attachments: pig-5448-v01.patch
>
>
> For Pig on Spark3 (with PIG-5439), all of the TestHBaseStorage unit tests are failing with
> {noformat}
> org.apache.pig.PigException: ERROR 1002: Unable to store alias b
>   at org.apache.pig.PigServer.storeEx(PigServer.java:1127)
>   at org.apache.pig.PigServer.store(PigServer.java:1086)
>   at org.apache.pig.test.TestHBaseStorage.testStoreToHBase_1_with_delete(TestHBaseStorage.java:1251)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get the rdds of this spark operator:
>   at org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:241)
>   at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1479)
>   at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464)
>   at org.apache.pig.PigServer.storeEx(PigServer.java:1123)
> Caused by: java.lang.RuntimeException: No task metrics available for jobId 0
>   at org.apache.pig.tools.pigstats.spark.SparkJobStats.collectStats(SparkJobStats.java:109)
>   at org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:77)
>   at org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:73)
>   at org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {noformat}
[jira] [Commented] (PIG-5438) Update SparkCounter.Accumulator to AccumulatorV2
[ https://issues.apache.org/jira/browse/PIG-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837860#comment-17837860 ]

Rohini Palaniswamy commented on PIG-5438:
-----------------------------------------

+1

> Update SparkCounter.Accumulator to AccumulatorV2
> ------------------------------------------------
>
>                 Key: PIG-5438
>                 URL: https://issues.apache.org/jira/browse/PIG-5438
>             Project: Pig
>          Issue Type: Improvement
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Trivial
>             Fix For: 0.19.0
>
>         Attachments: pig-5438-v01.patch
>
>
> Original Accumulator is deprecated in Spark2 and gone in Spark3.
> AccumulatorV2 is usable on both Spark2 and Spark3.
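The migration above hinges on the shape of the V2 API: instead of a separate `AccumulatorParam`, the accumulator object itself carries `isZero`/`reset`/`add`/`merge`/`value`. A minimal standalone sketch of that contract, in plain Java with no Spark dependency (`CounterSketch` and `LongAccumulatorSketch` are hypothetical stand-ins mirroring Spark's `org.apache.spark.util.AccumulatorV2`, not Spark code):

```java
// Sketch of the AccumulatorV2-style contract that SparkCounter targets.
// Plain-Java stand-in: Spark's real AccumulatorV2 has the same
// isZero/reset/add/merge/value shape, with merge() combining per-task
// copies back on the driver.
public class CounterSketch {
    static class LongAccumulatorSketch {
        private long sum = 0L;

        boolean isZero() { return sum == 0L; }
        void reset() { sum = 0L; }
        void add(long v) { sum += v; }
        // merge() is how the driver folds in each task's local copy.
        void merge(LongAccumulatorSketch other) { sum += other.sum; }
        long value() { return sum; }
    }

    public static long demo() {
        LongAccumulatorSketch driver = new LongAccumulatorSketch();
        LongAccumulatorSketch task1 = new LongAccumulatorSketch();
        LongAccumulatorSketch task2 = new LongAccumulatorSketch();
        task1.add(3);
        task2.add(4);
        driver.merge(task1);
        driver.merge(task2);
        return driver.value();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 7
    }
}
```

Because this contract exists unchanged in Spark 2 and Spark 3, a SparkCounter built on it can drop the version-specific shim.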
[jira] [Commented] (PIG-5446) Tez TestPigProgressReporting.testProgressReportingWithStatusMessage failing
[ https://issues.apache.org/jira/browse/PIG-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17826791#comment-17826791 ]

Rohini Palaniswamy commented on PIG-5446:
-----------------------------------------

+1

> Tez TestPigProgressReporting.testProgressReportingWithStatusMessage failing
> ---------------------------------------------------------------------------
>
>                 Key: PIG-5446
>                 URL: https://issues.apache.org/jira/browse/PIG-5446
>             Project: Pig
>          Issue Type: Bug
>          Components: tez
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5446-v01.patch
>
>
> {noformat}
> Unable to open iterator for alias B. Backend error : Vertex failed, vertexName=scope-4, vertexId=vertex_1707216362777_0001_1_00, diagnostics=[Task failed, taskId=task_1707216362777_0001_1_00_00, diagnostics=[TaskAttempt 0 failed, info=[Attempt failed because it appears to make no progress for 1ms], TaskAttempt 1 failed, info=[Attempt failed because it appears to make no progress for 1ms]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1707216362777_0001_1_00 [scope-4] killed/failed due to:OWN_TASK_FAILURE]
> DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias B. Backend error : Vertex failed, vertexName=scope-4, vertexId=vertex_1707216362777_0001_1_00, diagnostics=[Task failed, taskId=task_1707216362777_0001_1_00_00, diagnostics=[TaskAttempt 0 failed, info=[Attempt failed because it appears to make no progress for 1ms], TaskAttempt 1 failed, info=[Attempt failed because it appears to make no progress for 1ms]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1707216362777_0001_1_00 [scope-4] killed/failed due to:OWN_TASK_FAILURE]
> DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
>   at org.apache.pig.PigServer.openIterator(PigServer.java:1014)
>   at org.apache.pig.test.TestPigProgressReporting.testProgressReportingWithStatusMessage(TestPigProgressReporting.java:58)
> Caused by: org.apache.tez.dag.api.TezException: Vertex failed, vertexName=scope-4, vertexId=vertex_1707216362777_0001_1_00, diagnostics=[Task failed, taskId=task_1707216362777_0001_1_00_00, diagnostics=[TaskAttempt 0 failed, info=[Attempt failed because it appears to make no progress for 1ms], TaskAttempt 1 failed, info=[Attempt failed because it appears to make no progress for 1ms]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1707216362777_0001_1_00 [scope-4] killed/failed due to:OWN_TASK_FAILURE]
> DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
>   at org.apache.pig.tools.pigstats.tez.TezPigScriptStats.accumulateStats(TezPigScriptStats.java:204)
>   at org.apache.pig.backend.hadoop.executionengine.tez.TezJob.run(TezJob.java:243)
>   at org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher$1.run(TezLauncher.java:212)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 45.647
> {noformat}
[jira] [Commented] (PIG-5416) Spark unit tests failing randomly with "java.lang.RuntimeException: Unexpected job execution status RUNNING"
[ https://issues.apache.org/jira/browse/PIG-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17826790#comment-17826790 ]

Rohini Palaniswamy commented on PIG-5416:
-----------------------------------------

+1

> Spark unit tests failing randomly with "java.lang.RuntimeException: Unexpected job execution status RUNNING"
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-5416
>                 URL: https://issues.apache.org/jira/browse/PIG-5416
>             Project: Pig
>          Issue Type: Bug
>          Components: spark
>            Reporter: Koji Noguchi
>            Priority: Minor
>         Attachments: pig-5416-v01.patch
>
>
> Spark unit tests fail randomly with the same error.
> Sample stack trace showing "Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING".
> {noformat:title=TestBuiltInBagToTupleOrString.testPigScriptForBagToTupleUDF}
> Unable to store alias B
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias B
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1783)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:708)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:721)
>   at org.apache.pig.test.TestBuiltInBagToTupleOrString.testPigScriptForBagToTupleUDF(TestBuiltInBagToTupleOrString.java:429)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get the rdds of this spark operator:
>   at org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:240)
>   at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1479)
>   at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464)
>   at org.apache.pig.PigServer.execute(PigServer.java:1453)
>   at org.apache.pig.PigServer.access$500(PigServer.java:119)
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1778)
> Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING
>   at org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138)
>   at org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75)
>   at org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59)
>   at org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {noformat}
[jira] [Commented] (PIG-5447) Pig-on-Spark TestSkewedJoin.testSkewedJoinOuter failing with NoSuchElementException
[ https://issues.apache.org/jira/browse/PIG-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17826789#comment-17826789 ]

Rohini Palaniswamy commented on PIG-5447:
-----------------------------------------

+1

> Pig-on-Spark TestSkewedJoin.testSkewedJoinOuter failing with NoSuchElementException
> -----------------------------------------------------------------------------------
>
>                 Key: PIG-5447
>                 URL: https://issues.apache.org/jira/browse/PIG-5447
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5447-v01.patch
>
>
> TestSkewedJoin.testSkewedJoinOuter is consistently failing for right-outer and full-outer joins.
> "Caused by: java.util.NoSuchElementException: next on empty iterator"
[jira] [Updated] (PIG-5437) Add lib and idea folder to .gitignore
[ https://issues.apache.org/jira/browse/PIG-5437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5437:
------------------------------------
    Fix Version/s: 0.18.0
     Hadoop Flags: Reviewed
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

+1. Committed to trunk and branch-0.18. Thanks for the contribution [~maswin]

> Add lib and idea folder to .gitignore
> -------------------------------------
>
>                 Key: PIG-5437
>                 URL: https://issues.apache.org/jira/browse/PIG-5437
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Alagappan Maruthappan
>            Assignee: Alagappan Maruthappan
>            Priority: Minor
>             Fix For: 0.18.0
>
>         Attachments: PIG-5437-0.patch
>
[jira] [Updated] (PIG-5420) Update accumulo dependency to 1.10.1
[ https://issues.apache.org/jira/browse/PIG-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5420:
------------------------------------
    Fix Version/s: 0.18.1

> Update accumulo dependency to 1.10.1
> ------------------------------------
>
>                 Key: PIG-5420
>                 URL: https://issues.apache.org/jira/browse/PIG-5420
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Trivial
>             Fix For: 0.18.1
>
>         Attachments: pig-5420-v01.patch
>
>
> Following an OWASP/CVE report.
[jira] [Updated] (PIG-5419) Upgrade Joda time version
[ https://issues.apache.org/jira/browse/PIG-5419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5419:
------------------------------------
    Fix Version/s: 0.18.1
                       (was: 0.18.0)

Can you update to 2.12.5?

> Upgrade Joda time version
> -------------------------
>
>                 Key: PIG-5419
>                 URL: https://issues.apache.org/jira/browse/PIG-5419
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Venkatasubrahmanian Narayanan
>            Assignee: Venkatasubrahmanian Narayanan
>            Priority: Minor
>             Fix For: 0.18.1
>
>         Attachments: PIG-5419.patch
>
>
> Pig depends on an older version of Joda time, which can result in conflicts with other versions in some workflows. Upgrading it to the latest version (2.10.13) will resolve Pig's side of such issues.
[jira] [Resolved] (PIG-5440) Extra jars needed for hive3
[ https://issues.apache.org/jira/browse/PIG-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy resolved PIG-5440.
-------------------------------------
    Fix Version/s: 0.18.0
     Hadoop Flags: Reviewed
       Resolution: Fixed

Committed to trunk and branch-0.18. Thanks [~knoguchi]

> Extra jars needed for hive3
> ---------------------------
>
>                 Key: PIG-5440
>                 URL: https://issues.apache.org/jira/browse/PIG-5440
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Minor
>             Fix For: 0.18.0
>
>         Attachments: pig-5440-v01.patch, pig-5440-v02.patch
>
>
> When testing Hive3, e2e tests were failing with
> {{Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/llap/security/LlapSigner$Signable}} etc.
> Updating dependent classes.
[jira] [Updated] (PIG-5438) Update SparkCounter.Accumulator to AccumulatorV2
[ https://issues.apache.org/jira/browse/PIG-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5438:
------------------------------------
    Fix Version/s: 0.19.0

> Update SparkCounter.Accumulator to AccumulatorV2
> ------------------------------------------------
>
>                 Key: PIG-5438
>                 URL: https://issues.apache.org/jira/browse/PIG-5438
>             Project: Pig
>          Issue Type: Improvement
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Trivial
>             Fix For: 0.19.0
>
>         Attachments: pig-5438-v01.patch
>
>
> Original Accumulator is deprecated in Spark2 and gone in Spark3.
> AccumulatorV2 is usable on both Spark2 and Spark3.
[jira] [Updated] (PIG-5439) Support Spark 3 and drop SparkShim
[ https://issues.apache.org/jira/browse/PIG-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5439:
------------------------------------
    Fix Version/s: 0.19.0

> Support Spark 3 and drop SparkShim
> ----------------------------------
>
>                 Key: PIG-5439
>                 URL: https://issues.apache.org/jira/browse/PIG-5439
>             Project: Pig
>          Issue Type: Improvement
>          Components: spark
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>             Fix For: 0.19.0
>
>         Attachments: pig-5439-v01.patch
>
>
> Support Pig-on-Spark to run on spark3.
> Initial version would only run up to Spark 3.2.4 and not on 3.3 or 3.4.
> This is due to log4j mismatch.
> After moving to log4j2 (PIG-5426), we can move Spark to 3.3 or higher.
> So far, not all unit/e2e tests pass with the proposed patch but at least
> compilation goes through.
[jira] [Updated] (PIG-5414) Build failure on Linux ARM64 due to old Apache Avro
[ https://issues.apache.org/jira/browse/PIG-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5414:
------------------------------------
    Fix Version/s: 0.18.1

> Build failure on Linux ARM64 due to old Apache Avro
> ---------------------------------------------------
>
>                 Key: PIG-5414
>                 URL: https://issues.apache.org/jira/browse/PIG-5414
>             Project: Pig
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 0.18.0
>            Reporter: Martin Tzvetanov Grigorov
>            Assignee: Martin Tzvetanov Grigorov
>            Priority: Major
>             Fix For: 0.18.1
>
>         Attachments: 35.patch, TEST-org.apache.pig.builtin.TestAvroStorage.txt, TEST-org.apache.pig.builtin.TestOrcStorage.txt, TEST-org.apache.pig.builtin.TestOrcStoragePushdown.txt
>
>
> Trying to build Apache Pig on Ubuntu 20.04.3 ARM64 fails because of old version of Snappy and Avro libraries:
>
> {code:java}
> Testsuite: org.apache.pig.builtin.TestAvroStorage
> Tests run: 0, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.1 sec
> ------------- Standard Output ---------------
> 2021-10-12 14:43:35,483 [main] INFO  org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 1431830528 to monitor. collectionUsageThreshold = 1064828928, usageThreshold = 1064828928
> 2021-10-12 14:43:35,489 [main] INFO  org.apache.pig.ExecTypeProvider - Trying ExecType : LOCAL
> 2021-10-12 14:43:35,489 [main] INFO  org.apache.pig.ExecTypeProvider - Picked LOCAL as the ExecType
> 2021-10-12 14:43:35,515 [main] WARN  org.apache.hadoop.conf.Configuration - DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
> 2021-10-12 14:43:35,755 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
> 2021-10-12 14:43:35,899 [main] WARN  org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2021-10-12 14:43:35,916 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
> 2021-10-12 14:43:36,116 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
> 2021-10-12 14:43:36,137 [main] INFO  org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-01426621-bc19-499f-981e-b13959fe0d84
> 2021-10-12 14:43:36,137 [main] WARN  org.apache.pig.PigServer - ATS is disabled since yarn.timeline-service.enabled set to false
> 2021-10-12 14:43:36,150 [main] INFO  org.apache.pig.builtin.TestAvroStorage - creating test/org/apache/pig/builtin/avro/data/avro/uncompressed/arraysAsOutputByPig.avro
> 2021-10-12 14:43:36,502 [main] INFO  org.apache.pig.builtin.TestAvroStorage - Could not generate avro file: test/org/apache/pig/builtin/avro/data/avro/uncompressed/arraysAsOutputByPig.avro
> java.net.ConnectException: Call From martin/127.0.0.1 to localhost:40073 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1479)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>   at com.sun.proxy.$Proxy13.getBlockLocations(Unknown Source)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:255)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> ...
> {code}
[jira] [Updated] (PIG-5418) Utils.parseSchema(String), parseConstant(String) leak memory
[ https://issues.apache.org/jira/browse/PIG-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5418:
------------------------------------
    Fix Version/s: 0.18.1

> Utils.parseSchema(String), parseConstant(String) leak memory
> ------------------------------------------------------------
>
>                 Key: PIG-5418
>                 URL: https://issues.apache.org/jira/browse/PIG-5418
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Jacob Tolar
>            Assignee: Jacob Tolar
>            Priority: Minor
>             Fix For: 0.18.1
>
>         Attachments: PIG-5418.patch
>
>
> A minor issue: I noticed that Utils.parseSchema() and parseConstant() leak memory. I noticed this while running a unit test for a UDF several thousand times and checking the heap.
> Links are to latest commit as of creating this ticket:
> https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L244-L256
> {{new PigContext()}} [creates a MapReduce ExecutionEngine|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/PigContext.java#L269].
> This creates a [MapReduceLauncher|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRExecutionEngine.java#L34].
> This registers a [Hadoop shutdown hook|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java#L104-L105] which doesn't go away until the JVM dies. See: https://hadoop.apache.org/docs/r2.8.2/hadoop-project-dist/hadoop-common/api/org/apache/hadoop/util/ShutdownHookManager.html .
> I will attach a proposed patch. From my reading of the code and running tests, the existing schema parse APIs do not actually use anything from this dummy PigContext, and with a minor tweak it can be passed in as NULL, avoiding the creation of these extra resources.
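The leak chain described above (each parse call builds a throwaway PigContext whose launcher registers a never-removed shutdown hook) can be reproduced in miniature with plain Java. In this sketch, `HookRegistry`, `Launcher`, and `parseSchemaLike` are hypothetical stand-ins for Hadoop's `ShutdownHookManager`, Pig's `MapReduceLauncher`, and `Utils.parseSchema`; the point is only the growth pattern, not the real APIs:

```java
import java.util.ArrayList;
import java.util.List;

// Miniature of the PIG-5418 leak: each Launcher adds a hook to a
// JVM-global registry (like Hadoop's singleton ShutdownHookManager),
// and the hooks are never removed, so repeated parse-style calls
// accumulate objects that survive until JVM exit.
public class LeakSketch {
    // Stand-in for the process-wide ShutdownHookManager.
    public static final List<Runnable> HOOKS = new ArrayList<>();

    static class Launcher {                  // stand-in for MapReduceLauncher
        Launcher() { HOOKS.add(() -> { }); } // hook lives until the JVM dies
    }

    public static void parseSchemaLike() {   // stand-in for Utils.parseSchema
        new Launcher();                      // fresh dummy context per call
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) parseSchemaLike();
        // The registry now holds one entry per call; in a long test run
        // this is the heap growth the reporter observed.
        System.out.println(HOOKS.size());
    }
}
```

The attached patch avoids creating the dummy PigContext at all, so nothing is registered per call and the registry stays flat.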
[jira] [Updated] (PIG-5443) Add testcase for skew join for tez grace shuffle vertex manager
[ https://issues.apache.org/jira/browse/PIG-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5443:
------------------------------------
    Description: 
Need to add test case for fix in https://issues.apache.org/jira/browse/PIG-5441. Can just modify one of the existing skewed join unit or e2e test cases by increasing mappers (split size) or adding PARALLEL 2 for right side data. Also check if one-one edges are affected by this part of the code.

  was:
Need to add test case for fix in https://issues.apache.org/jira/browse/PIG-5441. Can just modify one of the existing skewed join unit or e2e test cases by increasing mappers (split size) or adding PARALLEL 2 for right side data.

> Add testcase for skew join for tez grace shuffle vertex manager
> ---------------------------------------------------------------
>
>                 Key: PIG-5443
>                 URL: https://issues.apache.org/jira/browse/PIG-5443
>             Project: Pig
>          Issue Type: Task
>            Reporter: Rohini Palaniswamy
>            Priority: Minor
>
> Need to add test case for fix in https://issues.apache.org/jira/browse/PIG-5441. Can just modify one of the existing skewed join unit or e2e test cases by increasing mappers (split size) or adding PARALLEL 2 for right side data. Also check if one-one edges are affected by this part of the code.
[jira] [Created] (PIG-5443) Add testcase for skew join for tez grace shuffle vertex manager
Rohini Palaniswamy created PIG-5443:
---------------------------------------

             Summary: Add testcase for skew join for tez grace shuffle vertex manager
                 Key: PIG-5443
                 URL: https://issues.apache.org/jira/browse/PIG-5443
             Project: Pig
          Issue Type: Task
            Reporter: Rohini Palaniswamy


Need to add test case for fix in https://issues.apache.org/jira/browse/PIG-5441. Can just modify one of the existing skewed join unit or e2e test cases by increasing mappers (split size) or adding PARALLEL 2 for right side data.
[jira] [Resolved] (PIG-5442) Add only credentials from setStoreLocation to the Job Conf
[ https://issues.apache.org/jira/browse/PIG-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy resolved PIG-5442.
-------------------------------------
    Fix Version/s: 0.18.0
     Hadoop Flags: Reviewed
       Resolution: Fixed

+1. Committed to branch-0.18 and trunk. Thanks for the contribution [~maswin]

> Add only credentials from setStoreLocation to the Job Conf
> ----------------------------------------------------------
>
>                 Key: PIG-5442
>                 URL: https://issues.apache.org/jira/browse/PIG-5442
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Alagappan Maruthappan
>            Assignee: Alagappan Maruthappan
>            Priority: Major
>             Fix For: 0.18.0
>
>         Attachments: PIG-5442-1.patch
>
>
> While testing HCatStorer with Iceberg, I realized Pig calls setStoreLocation on all Stores with the same Job object - [https://github.com/apache/pig/blob/b050a33c66fc22d648370b5c6bda04e0e51d3aa3/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java#L1081]
> Settings populated by one store affect the other stores. In my case, "mapred.output.committer.class" is set to HiveIcebergCommitter by the PigStore used by the Iceberg table, and the other stores that insert data into non-Iceberg tables also pick up that setting and try to use HiveIcebergCommitter.
>
> On checking with [~rohini], it is called to get the credentials from all stores, since the addCredentials API was added later and not all stores have implemented it, and some still set configuration in the setLocation method (i.e., HCatStorer).
>
> Fixed it by passing a separate copy of the Job object to each store's setLocation method and adding only the credential object from the call.
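The fix described in the last paragraph can be sketched with plain maps standing in for Hadoop's `Job` configuration and `Credentials` objects (all names here are hypothetical illustrations; the real patch operates on `org.apache.hadoop.mapreduce.Job`):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the PIG-5442 approach: give each store a private copy of the
// job conf so its setStoreLocation() side effects stay local, then merge
// back only the credentials the store added. Maps model conf/credentials.
public class CredentialsOnlySketch {
    public static final Map<String, String> jobConf = new HashMap<>();   // shared job conf
    public static final Map<String, String> jobCreds = new HashMap<>();  // shared credentials

    // One store's setStoreLocation() runs against a throwaway conf copy.
    public static void runStore() {
        Map<String, String> privateConf = new HashMap<>(jobConf);
        Map<String, String> storeCreds = new HashMap<>();
        privateConf.put("mapred.output.committer.class", "HiveIcebergCommitter");
        storeCreds.put("tokenA", "secretA");
        jobCreds.putAll(storeCreds);  // only credentials flow back
        // privateConf is discarded: the committer setting never reaches jobConf
    }

    public static boolean committerLeaked()  { return jobConf.containsKey("mapred.output.committer.class"); }
    public static boolean credentialMerged() { return jobCreds.containsKey("tokenA"); }

    public static void main(String[] args) {
        runStore();
        System.out.println(committerLeaked());   // false
        System.out.println(credentialMerged());  // true
    }
}
```

Under this scheme the Iceberg store's committer class can no longer bleed into the other stores' output configuration, while its delegation tokens still reach the shared job.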
[jira] [Updated] (PIG-5442) Add only credentials from setStoreLocation to the Job Conf
[ https://issues.apache.org/jira/browse/PIG-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5442:
------------------------------------
    Attachment: PIG-5442-1.patch

> Add only credentials from setStoreLocation to the Job Conf
> ----------------------------------------------------------
>
>                 Key: PIG-5442
>                 URL: https://issues.apache.org/jira/browse/PIG-5442
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Alagappan Maruthappan
>            Assignee: Alagappan Maruthappan
>            Priority: Major
>         Attachments: PIG-5442-1.patch
>
>
> While testing HCatStorer with Iceberg, I realized Pig calls setStoreLocation on all Stores with the same Job object - [https://github.com/apache/pig/blob/b050a33c66fc22d648370b5c6bda04e0e51d3aa3/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java#L1081]
> Settings populated by one store affect the other stores. In my case, "mapred.output.committer.class" is set to HiveIcebergCommitter by the PigStore used by the Iceberg table, and the other stores that insert data into non-Iceberg tables also pick up that setting and try to use HiveIcebergCommitter.
>
> On checking with [~rohini], it is called to get the credentials from all stores, since the addCredentials API was added later and not all stores have implemented it, and some still set configuration in the setLocation method (i.e., HCatStorer).
>
> Fixed it by passing a separate copy of the Job object to each store's setLocation method and adding only the credential object from the call.
[jira] [Updated] (PIG-5441) Pig skew join tez grace reducer fails to find shuffle data
[ https://issues.apache.org/jira/browse/PIG-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-5441:
------------------------------------
     Hadoop Flags: Reviewed
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

Patch committed to branch-0.18 and trunk. Thanks [~yigress] for the contribution.

> Pig skew join tez grace reducer fails to find shuffle data
> ----------------------------------------------------------
>
>                 Key: PIG-5441
>                 URL: https://issues.apache.org/jira/browse/PIG-5441
>             Project: Pig
>          Issue Type: Bug
>          Components: tez
>    Affects Versions: 0.17.0
>            Reporter: Yi Zhang
>            Assignee: Yi Zhang
>            Priority: Major
>             Fix For: 0.18.0
>
>         Attachments: PIG-5441.patch
>
>
> A user's Pig-on-Tez skew join failed to find shuffle data from the sampler aggregate vertex. The right side of the join has >1 reducers.
> As a workaround, adjust tez.runtime.transfer.data-via-events.max-size to avoid spilling to disk in the sampler aggregation vertex.
[jira] [Commented] (PIG-5441) Pig skew join tez grace reducer fails to find shuffle data
[ https://issues.apache.org/jira/browse/PIG-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17725781#comment-17725781 ] Rohini Palaniswamy commented on PIG-5441: - +1. Can you just attach the patch to the jira? > Pig skew join tez grace reducer fails to find shuffle data > -- > > Key: PIG-5441 > URL: https://issues.apache.org/jira/browse/PIG-5441 > Project: Pig > Issue Type: Bug > Components: tez >Affects Versions: 0.17.0 >Reporter: Yi Zhang >Assignee: Yi Zhang >Priority: Major > Fix For: 0.18.0 > > > User pig tez skew join encountered issue of not finding shuffle data from the > sampler aggregate vertex. The right side join has >1 reducers. > For workaround adjust tez.runtime.transfer.data-via-events.max-size to avoid > spill to disk for the sampler aggregation vertex. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5441) Pig skew join tez grace reducer fails to find shuffle data
[ https://issues.apache.org/jira/browse/PIG-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5441: Fix Version/s: 0.18.0 Assignee: Yi Zhang Status: Patch Available (was: Open) > Pig skew join tez grace reducer fails to find shuffle data > -- > > Key: PIG-5441 > URL: https://issues.apache.org/jira/browse/PIG-5441 > Project: Pig > Issue Type: Bug > Components: tez >Affects Versions: 0.17.0 >Reporter: Yi Zhang >Assignee: Yi Zhang >Priority: Major > Fix For: 0.18.0 > > > User pig tez skew join encountered issue of not finding shuffle data from the > sampler aggregate vertex. The right side join has >1 reducers. > For workaround adjust tez.runtime.transfer.data-via-events.max-size to avoid > spill to disk for the sampler aggregation vertex. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5440) Extra jars needed for hive3
[ https://issues.apache.org/jira/browse/PIG-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17725780#comment-17725780 ] Rohini Palaniswamy commented on PIG-5440: - +1 > Extra jars needed for hive3 > --- > > Key: PIG-5440 > URL: https://issues.apache.org/jira/browse/PIG-5440 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5440-v01.patch, pig-5440-v02.patch > > > When testing Hive3, e2e tests were failing with > {{Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/hive/llap/security/LlapSigner$Signable}} etc. > Updating dependent classes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5440) Extra jars needed for hive3
[ https://issues.apache.org/jira/browse/PIG-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17722276#comment-17722276 ] Rohini Palaniswamy commented on PIG-5440: - +1. Can you add a space between "orc-shims","aircompressor" before commit? > Extra jars needed for hive3 > --- > > Key: PIG-5440 > URL: https://issues.apache.org/jira/browse/PIG-5440 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5440-v01.patch > > > When testing Hive3, e2e tests were failing with > {{Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/hive/llap/security/LlapSigner$Signable}} etc. > Updating dependent classes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (PIG-5432) OrcStorage fails to detect schema in some cases
[ https://issues.apache.org/jira/browse/PIG-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17706981#comment-17706981 ] Rohini Palaniswamy edited comment on PIG-5432 at 3/30/23 5:33 PM: -- +1. Committed to branch-0.18 and trunk. Thanks for the contribution [~jtolar] was (Author: rohini): +1. Committed to branch-0.18 and trunk. Thanks for contribution [~jtolar] > OrcStorage fails to detect schema in some cases > --- > > Key: PIG-5432 > URL: https://issues.apache.org/jira/browse/PIG-5432 > Project: Pig > Issue Type: Bug >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5432.v01.patch > > > OrcStorage needs to detect the schema of input data paths. If some data paths > have no ORC files (perhaps only a _SUCCESS marker is present), this will > fail. > For example: > {code} > A = LOAD '/path/to/20230101,/path/to/20230102' USING OrcStorage(); > {code} > If {{/path/to/20230101}} contains only a _SUCCESS marker and {{20230102}} > contains data, OrcStorage fails to detect the schema and Pig exits with a > confusing/unhelpful error, something like "Cannot find any ORC files from > . Probably multiple load/store statements in script." > The code tries to use a search algorithm to recursively search through all > input paths for the data (via Utils.depthFirstSearchForFile), but it is > implemented incorrectly and returns early in this scenario. > See: > https://github.com/apache/pig/blob/c0d75ba930f9aa5c6454d0264a96f82b45279202/src/org/apache/pig/builtin/OrcStorage.java#L389-L408 > https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L629-L667 > I'll attach a patch. -- This message was sent by Atlassian Jira (v8.20.10#820010)
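The search bug described in PIG-5432 comes down to returning as soon as one input path yields nothing. A minimal sketch of the corrected behavior follows; the class and method names are illustrative, not the actual `Utils.depthFirstSearchForFile` signature:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Predicate;

public class FirstMatchingFile {

    // Search every root in turn; a directory holding only a _SUCCESS marker
    // simply yields null here, and the loop moves on to the remaining input
    // paths instead of giving up early.
    static Path depthFirstSearch(Iterable<Path> roots, Predicate<Path> matches)
            throws IOException {
        for (Path root : roots) {
            Path found = search(root, matches);
            if (found != null) {
                return found; // first real match wins
            }
        }
        return null; // only after every path has been exhausted
    }

    private static Path search(Path dir, Predicate<Path> matches) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {
                    Path found = search(entry, matches);
                    if (found != null) {
                        return found;
                    }
                } else if (matches.test(entry)) {
                    return entry;
                }
            }
        }
        return null;
    }
}
```

In the LOAD example above, the first path (`/path/to/20230101`, containing only `_SUCCESS`) returns null and the search falls through to `/path/to/20230102`, where the ORC file for schema detection is found.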
[jira] [Updated] (PIG-5432) OrcStorage fails to detect schema in some cases
[ https://issues.apache.org/jira/browse/PIG-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5432: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) +1. Committed to branch-0.18 and trunk. Thanks for the contribution [~jtolar] > OrcStorage fails to detect schema in some cases > --- > > Key: PIG-5432 > URL: https://issues.apache.org/jira/browse/PIG-5432 > Project: Pig > Issue Type: Bug >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5432.v01.patch > > > OrcStorage needs to detect the schema of input data paths. If some data paths > have no ORC files (perhaps only a _SUCCESS marker is present), this will > fail. > For example: > {code} > A = LOAD '/path/to/20230101,/path/to/20230102' USING OrcStorage(); > {code} > If {{/path/to/20230101}} contains only a _SUCCESS marker and {{20230102}} > contains data, OrcStorage fails to detect the schema and Pig exits with a > confusing/unhelpful error, something like "Cannot find any ORC files from > . Probably multiple load/store statements in script." > The code tries to use a search algorithm to recursively search through all > input paths for the data (via Utils.depthFirstSearchForFile), but it is > implemented incorrectly and returns early in this scenario. > See: > https://github.com/apache/pig/blob/c0d75ba930f9aa5c6454d0264a96f82b45279202/src/org/apache/pig/builtin/OrcStorage.java#L389-L408 > https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L629-L667 > I'll attach a patch. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5432) OrcStorage fails to detect schema in some cases
[ https://issues.apache.org/jira/browse/PIG-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5432: --- Fix Version/s: 0.18.0 Assignee: Jacob Tolar > OrcStorage fails to detect schema in some cases > --- > > Key: PIG-5432 > URL: https://issues.apache.org/jira/browse/PIG-5432 > Project: Pig > Issue Type: Bug >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5432.v01.patch > > > OrcStorage needs to detect the schema of input data paths. If some data paths > have no ORC files (perhaps only a _SUCCESS marker is present), this will > fail. > For example: > {code} > A = LOAD '/path/to/20230101,/path/to/20230102' USING OrcStorage(); > {code} > If {{/path/to/20230101}} contains only a _SUCCESS marker and {{20230102}} > contains data, OrcStorage fails to detect the schema and Pig exits with a > confusing/unhelpful error, something like "Cannot find any ORC files from > . Probably multiple load/store statements in script." > The code tries to use a search algorithm to recursively search through all > input paths for the data (via Utils.depthFirstSearchForFile), but it is > implemented incorrectly and returns early in this scenario. > See: > https://github.com/apache/pig/blob/c0d75ba930f9aa5c6454d0264a96f82b45279202/src/org/apache/pig/builtin/OrcStorage.java#L389-L408 > https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L629-L667 > I'll attach a patch. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: Branching for 0.18 release
This is done now and the release branch is at https://svn.apache.org/repos/asf/pig/branches/branch-0.18. On Sun, Jan 15, 2023 at 6:08 PM Rohini Palaniswamy wrote: > Hi all, > Will be creating a branch for the 0.18 release from trunk tomorrow > afternoon. > > Regards, > Rohini >
[jira] [Resolved] (PIG-5436) update owasp version
[ https://issues.apache.org/jira/browse/PIG-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5436. - Fix Version/s: 0.18.0 Hadoop Flags: Reviewed Resolution: Fixed Committed to trunk. Thanks Koji > update owasp version > > > Key: PIG-5436 > URL: https://issues.apache.org/jira/browse/PIG-5436 > Project: Pig > Issue Type: Test >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.0 > > Attachments: pig-5436-v01.patch > > > Owasp testing started to fail with > {quote}Caused by: org.h2.jdbc.JdbcBatchUpdateException: Value too long for > column "VERSIONENDEXCLUDING VARCHAR(50) SELECTIVITY 1" > {quote} > > Following https://github.com/jeremylong/DependencyCheck/issues/5225, updating > the owasp version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5435) pig.exec.reducers.max does not take effect for skewed join
[ https://issues.apache.org/jira/browse/PIG-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5435: Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks [~vnarayanan7] and [~knoguchi]. > pig.exec.reducers.max does not take effect for skewed join > -- > > Key: PIG-5435 > URL: https://issues.apache.org/jira/browse/PIG-5435 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5435-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5434) Migrate from log4j to reload4j
[ https://issues.apache.org/jira/browse/PIG-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5434: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Koji > Migrate from log4j to reload4j > -- > > Key: PIG-5434 > URL: https://issues.apache.org/jira/browse/PIG-5434 > Project: Pig > Issue Type: Improvement > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5434-1.patch > > > Was trying to migrate to log4j2.x (PIG-5426) but was running into issues. As > 0.18 is delayed long enough, migrating to reload4j in this release similar to > HADOOP-18088. Will migrate to log4j2.x in the next release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5417: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thank you for the contribution [~xiaoheipangzi] > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5417-1.patch > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5436) update owasp version
[ https://issues.apache.org/jira/browse/PIG-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677558#comment-17677558 ] Rohini Palaniswamy commented on PIG-5436: - +1 > update owasp version > > > Key: PIG-5436 > URL: https://issues.apache.org/jira/browse/PIG-5436 > Project: Pig > Issue Type: Test >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5436-v01.patch > > > Owasp testing started to fail with > {quote}Caused by: org.h2.jdbc.JdbcBatchUpdateException: Value too long for > column "VERSIONENDEXCLUDING VARCHAR(50) SELECTIVITY 1" > {quote} > > Following https://github.com/jeremylong/DependencyCheck/issues/5225, updating > the owasp version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5435) pig.exec.reducers.max does not take effect for skewed join
[ https://issues.apache.org/jira/browse/PIG-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5435: Status: Patch Available (was: Open) This is a patch that I had reviewed internally. [~knoguchi] can you +1 here. > pig.exec.reducers.max does not take effect for skewed join > -- > > Key: PIG-5435 > URL: https://issues.apache.org/jira/browse/PIG-5435 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5435-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5435) pig.exec.reducers.max does not take effect for skewed join
[ https://issues.apache.org/jira/browse/PIG-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5435: Attachment: PIG-5435-1.patch > pig.exec.reducers.max does not take effect for skewed join > -- > > Key: PIG-5435 > URL: https://issues.apache.org/jira/browse/PIG-5435 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5435-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5435) pig.exec.reducers.max does not take effect for skewed join
Rohini Palaniswamy created PIG-5435: --- Summary: pig.exec.reducers.max does not take effect for skewed join Key: PIG-5435 URL: https://issues.apache.org/jira/browse/PIG-5435 Project: Pig Issue Type: Bug Reporter: Rohini Palaniswamy Assignee: Venkatasubrahmanian Narayanan Fix For: 0.18.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5417: Status: Patch Available (was: Open) > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5417-1.patch > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677523#comment-17677523 ] Rohini Palaniswamy commented on PIG-5417: - Downloaded https://patch-diff.githubusercontent.com/raw/apache/pig/pull/36.patch and was going to commit it, but compilation failed as it did not catch IOException. Updated the patch with a try/catch block. [~knoguchi], can you +1 as there is a minor change from the original patch? Thought I'd get this into the release as it addresses a CVE. > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5417-1.patch > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
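The patch itself isn't shown here, but the substitution the comment describes can be sketched as follows. This is a hedged illustration, not the actual PIG-5417 change: the wrapper name `TempDirs.createTempDir` is made up, and the try/catch is needed because `java.nio.file.Files.createTempDirectory` throws a checked `IOException` where Guava's `Files.createTempDir()` did not:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempDirs {

    // Guava's createTempDir() threw an unchecked exception; the NIO
    // replacement throws a checked IOException, which is why code that
    // compiled against the original patch needed a try/catch added.
    static Path createTempDir(String prefix) {
        try {
            // On POSIX systems this creates the directory with owner-only
            // permissions, addressing the weakness CVE-2020-8908 describes
            // in the Guava version.
            return Files.createTempDirectory(prefix);
        } catch (IOException e) {
            throw new UncheckedIOException("could not create temp dir", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(createTempDir("pig-demo"));
    }
}
```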
[jira] [Updated] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5417: Attachment: PIG-5417-1.patch > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5417-1.patch > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5417: --- Assignee: lujie (was: lujie) > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5417) Replace guava's Files.createTempDir()
[ https://issues.apache.org/jira/browse/PIG-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5417: --- Fix Version/s: 0.18.0 Assignee: lujie > Replace guava's Files.createTempDir() > - > > Key: PIG-5417 > URL: https://issues.apache.org/jira/browse/PIG-5417 > Project: Pig > Issue Type: Bug >Reporter: lujie >Assignee: lujie >Priority: Major > Fix For: 0.18.0 > > > see [https://www.cvedetails.com/cve/CVE-2020-8908/] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5434) Migrate from log4j to reload4j
[ https://issues.apache.org/jira/browse/PIG-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5434: Status: Patch Available (was: Open) This patch also upgrades to the latest slf4j version > Migrate from log4j to reload4j > -- > > Key: PIG-5434 > URL: https://issues.apache.org/jira/browse/PIG-5434 > Project: Pig > Issue Type: Improvement > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5434-1.patch > > > Was trying to migrate to log4j2.x (PIG-5426) but was running into issues. As > 0.18 is delayed long enough, migrating to reload4j in this release similar to > HADOOP-18088. Will migrate to log4j2.x in the next release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5434) Migrate from log4j to reload4j
[ https://issues.apache.org/jira/browse/PIG-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5434: Attachment: PIG-5434-1.patch > Migrate from log4j to reload4j > -- > > Key: PIG-5434 > URL: https://issues.apache.org/jira/browse/PIG-5434 > Project: Pig > Issue Type: Improvement > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5434-1.patch > > > Was trying to migrate to log4j2.x (PIG-5426) but was running into issues. As > 0.18 is delayed long enough, migrating to reload4j in this release similar to > HADOOP-18088. Will migrate to log4j2.x in the next release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5426) Migrate to log4j2.x
[ https://issues.apache.org/jira/browse/PIG-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5426: Fix Version/s: 0.19.0 (was: 0.18.0) Running into issues. As 0.18 is delayed long enough, migrating to reload4j in that release as part of PIG-5434. Will migrate to log4j2.x in the next release. > Migrate to log4j2.x > --- > > Key: PIG-5426 > URL: https://issues.apache.org/jira/browse/PIG-5426 > Project: Pig > Issue Type: Improvement > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.19.0 > > > Hadoop (HADOOP-18088) decided to migrate to reload4j to address log4j1.x > vulnerabilities. I did the work of migrating Oozie server and client to > log4j2.x while launched hadoop jobs will still use 1.x till hadoop migrates. > So think it should be easy to do that for pig client as well. If it does not > work as expected, will just go with the easy switch to reload4j. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5434) Migrate from log4j to reload4j
Rohini Palaniswamy created PIG-5434: --- Summary: Migrate from log4j to reload4j Key: PIG-5434 URL: https://issues.apache.org/jira/browse/PIG-5434 Project: Pig Issue Type: Improvement Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.18.0 Was trying to migrate to log4j2.x (PIG-5426) but was running into issues. As 0.18 is delayed long enough, migrating to reload4j in this release similar to HADOOP-18088. Will migrate to log4j2.x in the next release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Branching for 0.18 release
Hi all, Will be creating a branch for the 0.18 release from trunk tomorrow afternoon. Regards, Rohini
[jira] [Updated] (PIG-5431) Date datatype is different between Hive 1.x and Hive 3.x
[ https://issues.apache.org/jira/browse/PIG-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5431: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Koji for the review. > Date datatype is different between Hive 1.x and Hive 3.x > > > Key: PIG-5431 > URL: https://issues.apache.org/jira/browse/PIG-5431 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5431-1.patch > > > java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Date cannot > be cast to java.sql.Date -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
[ https://issues.apache.org/jira/browse/PIG-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5433: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Koji. > Fix test failures with TestHBaseStorage and htrace dependency > - > > Key: PIG-5433 > URL: https://issues.apache.org/jira/browse/PIG-5433 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5433-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5431) Date datatype is different between Hive 1.x and Hive 3.x
[ https://issues.apache.org/jira/browse/PIG-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5431: Status: Patch Available (was: Open) > Date datatype is different between Hive 1.x and Hive 3.x > > > Key: PIG-5431 > URL: https://issues.apache.org/jira/browse/PIG-5431 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5431-1.patch > > > java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Date cannot > be cast to java.sql.Date -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5431) Date datatype is different between Hive 1.x and Hive 3.x
[ https://issues.apache.org/jira/browse/PIG-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5431: Attachment: PIG-5431-1.patch > Date datatype is different between Hive 1.x and Hive 3.x > > > Key: PIG-5431 > URL: https://issues.apache.org/jira/browse/PIG-5431 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5431-1.patch > > > java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Date cannot > be cast to java.sql.Date -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
[ https://issues.apache.org/jira/browse/PIG-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5433: Attachment: PIG-5433-1.patch > Fix test failures with TestHBaseStorage and htrace dependency > - > > Key: PIG-5433 > URL: https://issues.apache.org/jira/browse/PIG-5433 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5433-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
[ https://issues.apache.org/jira/browse/PIG-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5433: Status: Patch Available (was: Open) > Fix test failures with TestHBaseStorage and htrace dependency > - > > Key: PIG-5433 > URL: https://issues.apache.org/jira/browse/PIG-5433 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5433-1.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
[ https://issues.apache.org/jira/browse/PIG-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677083#comment-17677083 ] Rohini Palaniswamy commented on PIG-5433: - Ran into below test failure {code} org/apache/htrace/core/Tracer$Builder java.lang.NoClassDefFoundError: org/apache/htrace/core/Tracer$Builder at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3256) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3310) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3278) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:475) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:228) at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:68) at org.apache.pig.backend.hadoop.datastorage.HDataStorage.&lt;init&gt;(HDataStorage.java:58) at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:227) at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:111) at org.apache.pig.impl.PigContext.connect(PigContext.java:310) at org.apache.pig.PigServer.&lt;init&gt;(PigServer.java:232) at org.apache.pig.PigServer.&lt;init&gt;(PigServer.java:220) at org.apache.pig.PigServer.&lt;init&gt;(PigServer.java:212) at org.apache.pig.PigServer.&lt;init&gt;(PigServer.java:208) at org.apache.pig.builtin.TestOrcStorage.setup(TestOrcStorage.java:109) Caused by: java.lang.ClassNotFoundException: org.apache.htrace.core.Tracer$Builder at java.net.URLClassLoader.findClass(URLClassLoader.java:382) at java.lang.ClassLoader.loadClass(ClassLoader.java:418) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355) at java.lang.ClassLoader.loadClass(ClassLoader.java:351) {code} and java.io.IOException: Waiting for startup of standalone server in TestHBaseStorage described in 
https://stackoverflow.com/questions/67364593/java-io-ioexception-waiting-for-startup-of-standalone-server-minizookeeperclu > Fix test failures with TestHBaseStorage and htrace dependency > - > > Key: PIG-5433 > URL: https://issues.apache.org/jira/browse/PIG-5433 > Project: Pig > Issue Type: Bug > Reporter: Rohini Palaniswamy > Assignee: Rohini Palaniswamy >Priority: Major > Fix For: 0.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5433) Fix test failures with TestHBaseStorage and htrace dependency
Rohini Palaniswamy created PIG-5433: --- Summary: Fix test failures with TestHBaseStorage and htrace dependency Key: PIG-5433 URL: https://issues.apache.org/jira/browse/PIG-5433 Project: Pig Issue Type: Bug Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.18.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5431) Date datatype is different between Hive 1.x and Hive 3.x
Rohini Palaniswamy created PIG-5431: --- Summary: Date datatype is different between Hive 1.x and Hive 3.x Key: PIG-5431 URL: https://issues.apache.org/jira/browse/PIG-5431 Project: Pig Issue Type: Bug Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.18.0 java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Date cannot be cast to java.sql.Date -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5321) Upgrade Spark 2 version to 2.2.0 for Pig on Spark
[ https://issues.apache.org/jira/browse/PIG-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5321. - Resolution: Duplicate This has been already fixed by [~knoguchi] as part of PIG-5397 with Spark 2 version being upgraded to 2.4.8. > Upgrade Spark 2 version to 2.2.0 for Pig on Spark > - > > Key: PIG-5321 > URL: https://issues.apache.org/jira/browse/PIG-5321 > Project: Pig > Issue Type: Improvement > Components: spark >Reporter: Ádám Szita >Priority: Major > > Right now we maintain support for 2 versions of Spark for PoS jobs: > spark1.version=1.6.1 > spark2.version=2.1.1 > I believe we should move forward with the latter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5430) TestTezGraceParallelism failing due to tez log change
[ https://issues.apache.org/jira/browse/PIG-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17618002#comment-17618002 ] Rohini Palaniswamy commented on PIG-5430: - +1 > TestTezGraceParallelism failing due to tez log change > - > > Key: PIG-5430 > URL: https://issues.apache.org/jira/browse/PIG-5430 > Project: Pig > Issue Type: Test >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.0 > > Attachments: pig-5430-v01.patch > > > After PIG-5428, TestTezGraceParallelism:testIncreaseParallelism, > testDecreaseParallelism started failing due to change in log messages by > recent Tez. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5429) Update hbase version from 2.0.0 to 2.4.14
[ https://issues.apache.org/jira/browse/PIG-5429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17618001#comment-17618001 ] Rohini Palaniswamy commented on PIG-5429: - +1 > Update hbase version from 2.0.0 to 2.4.14 > - > > Key: PIG-5429 > URL: https://issues.apache.org/jira/browse/PIG-5429 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5429-v01.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5428) Update hadoop2,3 and tez to recent versions
[ https://issues.apache.org/jira/browse/PIG-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616638#comment-17616638 ] Rohini Palaniswamy commented on PIG-5428: - +1 > Update hadoop2,3 and tez to recent versions > --- > > Key: PIG-5428 > URL: https://issues.apache.org/jira/browse/PIG-5428 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Fix For: 0.18.0 > > Attachments: pig-5428-v01.patch > > > PIG-5253 hadoop3 patch is committed. > Now, updating hadoop2&3, tez and other dependent library versions. > Only testing using two different parameters. > * -Dhbaseversion=2 -Dhadoopversion=2 -Dhiveversion=1 -Dsparkversion=2 > and > * -Dhbaseversion=2 -Dhadoopversion=3 -Dhiveversion=3 -Dsparkversion=2 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5406) TestJoinLocal imports org.python.google.common.collect.Lists instead of org.google.common.collect.Lists
[ https://issues.apache.org/jira/browse/PIG-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5406: Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Daniel for the review. > TestJoinLocal imports org.python.google.common.collect.Lists instead of > org.google.common.collect.Lists > --- > > Key: PIG-5406 > URL: https://issues.apache.org/jira/browse/PIG-5406 > Project: Pig > Issue Type: Bug >Affects Versions: 0.15.0, 0.16.0, 0.17.0 >Reporter: James Z.M. Gao > Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5406-v1.patch > > > [PIG-4366|https://github.com/apache/pig/commit/81abb6bd0adb6e101898d67b3c2a9e35e11ce993] > make PIG-2861 coming back. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PIG-5426) Migrate to log4j2.x
Rohini Palaniswamy created PIG-5426: --- Summary: Migrate to log4j2.x Key: PIG-5426 URL: https://issues.apache.org/jira/browse/PIG-5426 Project: Pig Issue Type: Improvement Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.18.0 Hadoop (HADOOP-18088) decided to migrate to reload4j to address log4j1.x vulnerabilities. I did the work of migrating the Oozie server and client to log4j2.x while launched Hadoop jobs will still use 1.x until Hadoop migrates. So I think it should be easy to do the same for the Pig client as well. If it does not work as expected, we will just go with the easy switch to reload4j. -- This message was sent by Atlassian Jira (v8.20.10#820010)
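(Editor's note: migrating a client from log4j 1.x to log4j 2.x typically means replacing the log4j.properties file with one in the log4j2 properties format. A minimal illustrative log4j2.properties — not Pig's actual configuration — could look like this.)

```properties
# Console-only root logger at INFO, log4j2 properties syntax
rootLogger.level = info
rootLogger.appenderRef.console.ref = Console

appender.console.type = Console
appender.console.name = Console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{ISO8601} [%t] %-5p %c{2} - %m%n
```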
[jira] [Assigned] (PIG-5388) Upgrade to Avro and Trevni 1.9.x
[ https://issues.apache.org/jira/browse/PIG-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5388: --- Fix Version/s: 0.18.0 Assignee: Rohini Palaniswamy Summary: Upgrade to Avro and Trevni 1.9.x (was: Upgrade to Avro 1.9.x) > Upgrade to Avro and Trevni 1.9.x > > > Key: PIG-5388 > URL: https://issues.apache.org/jira/browse/PIG-5388 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár > Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5419) Upgrade Joda time version
[ https://issues.apache.org/jira/browse/PIG-5419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5419: Fix Version/s: 0.18.0 Can you update to 2.11.0 (https://www.joda.org/joda-time/changes-report.html#a2.11.0)? > Upgrade Joda time version > - > > Key: PIG-5419 > URL: https://issues.apache.org/jira/browse/PIG-5419 > Project: Pig > Issue Type: Improvement >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5419.patch > > > Pig depends on an older version of Joda time, which can result in conflicts > with other versions in some workflows. Upgrading it to the latest > version(2.10.13) will resolve Pig's side of such issues. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5406) TestJoinLocal imports org.python.google.common.collect.Lists instead of org.google.common.collect.Lists
[ https://issues.apache.org/jira/browse/PIG-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5406: Attachment: PIG-5406-v1.patch > TestJoinLocal imports org.python.google.common.collect.Lists instead of > org.google.common.collect.Lists > --- > > Key: PIG-5406 > URL: https://issues.apache.org/jira/browse/PIG-5406 > Project: Pig > Issue Type: Bug >Affects Versions: 0.15.0, 0.16.0, 0.17.0 >Reporter: James Z.M. Gao > Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5406-v1.patch > > > [PIG-4366|https://github.com/apache/pig/commit/81abb6bd0adb6e101898d67b3c2a9e35e11ce993] > make PIG-2861 coming back. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5406) TestJoinLocal imports org.python.google.common.collect.Lists instead of org.google.common.collect.Lists
[ https://issues.apache.org/jira/browse/PIG-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5406: Status: Patch Available (was: Open) > TestJoinLocal imports org.python.google.common.collect.Lists instead of > org.google.common.collect.Lists > --- > > Key: PIG-5406 > URL: https://issues.apache.org/jira/browse/PIG-5406 > Project: Pig > Issue Type: Bug >Affects Versions: 0.17.0, 0.16.0, 0.15.0 >Reporter: James Z.M. Gao > Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5406-v1.patch > > > [PIG-4366|https://github.com/apache/pig/commit/81abb6bd0adb6e101898d67b3c2a9e35e11ce993] > make PIG-2861 coming back. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5406) TestJoinLocal imports org.python.google.common.collect.Lists instead of org.google.common.collect.Lists
[ https://issues.apache.org/jira/browse/PIG-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5406: --- Fix Version/s: 0.18.0 Assignee: Rohini Palaniswamy Priority: Minor (was: Major) > TestJoinLocal imports org.python.google.common.collect.Lists instead of > org.google.common.collect.Lists > --- > > Key: PIG-5406 > URL: https://issues.apache.org/jira/browse/PIG-5406 > Project: Pig > Issue Type: Bug >Affects Versions: 0.15.0, 0.16.0, 0.17.0 >Reporter: James Z.M. Gao > Assignee: Rohini Palaniswamy >Priority: Minor > Fix For: 0.18.0 > > > [PIG-4366|https://github.com/apache/pig/commit/81abb6bd0adb6e101898d67b3c2a9e35e11ce993] > make PIG-2861 coming back. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5423) Upgrade hadoop/tez dependency
[ https://issues.apache.org/jira/browse/PIG-5423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579177#comment-17579177 ] Rohini Palaniswamy commented on PIG-5423: - [~knoguchi], you mentioned having to add tez_conf.set("tez.runtime.transfer.data-via-events.enabled", "false"); to fix some test failures. Can the patch be updated with that? > Upgrade hadoop/tez dependency > -- > > Key: PIG-5423 > URL: https://issues.apache.org/jira/browse/PIG-5423 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5423-v01.patch > > > We already have PIG-5253 for supporting hadoop3. Here, upgrading hadoop2 > dependency to the most recent hadoop2 version, 2.10.1. > Also, upgrading Tez to 0.9.2. (0.10.1 showed some regressions and needs > further checking). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5422) Upgrade guava/groovy dependency
[ https://issues.apache.org/jira/browse/PIG-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5422. - Fix Version/s: 0.18.0 Hadoop Flags: Reviewed Resolution: Fixed +1. Committed to trunk. Thanks Koji. > Upgrade guava/groovy dependency > --- > > Key: PIG-5422 > URL: https://issues.apache.org/jira/browse/PIG-5422 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.0 > > Attachments: pig-5422-v01.patch, pig-5422-v02.patch > > > Following owasp/cve. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5421) Upgrade commons dependencies
[ https://issues.apache.org/jira/browse/PIG-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5421. - Fix Version/s: 0.18.0 Hadoop Flags: Reviewed Resolution: Fixed Committed to trunk. Thanks Koji > Upgrade commons dependencies > - > > Key: PIG-5421 > URL: https://issues.apache.org/jira/browse/PIG-5421 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Fix For: 0.18.0 > > Attachments: pig-5421-v01.patch > > > Following owasp/cve report -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5253. - Hadoop Flags: Reviewed Resolution: Fixed > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár >Assignee: Ádám Szita >Priority: Blocker > Fix For: 0.18.0 > > Attachments: PIG-5253-v3.patch, PIG-5253.0.patch, PIG-5253.0.svn.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reopened PIG-5253: - > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár >Assignee: Ádám Szita >Priority: Blocker > Fix For: 0.18.0 > > Attachments: PIG-5253-v3.patch, PIG-5253.0.patch, PIG-5253.0.svn.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5377. - Hadoop Flags: Reviewed Resolution: Fixed > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
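(Editor's note: the refactoring described in PIG-5377 relies on JDK 8 default methods. The following is a minimal self-contained sketch of the idea — the class and interface names here are illustrative stand-ins, not Pig's actual StoreFuncInterface: the capability flag lives on the interface with a default of false, and individual storers opt in by overriding, instead of the framework keeping a static list of supported classes.)

```java
// Sketch of a capability flag as a default interface method (JDK 8+).
interface StoreFuncLike {
    // Default answer: most storers do not support parallel writes
    // to a single store location.
    default boolean supportsParallelWriteToStoreLocation() {
        return false;
    }
}

class PlainStorer implements StoreFuncLike {
    // Inherits the default "false"; no code needed here.
}

class ParallelStorer implements StoreFuncLike {
    @Override
    public boolean supportsParallelWriteToStoreLocation() {
        return true; // this storer opts in explicitly
    }
}

public class Main {
    public static void main(String[] args) {
        System.out.println(new PlainStorer().supportsParallelWriteToStoreLocation());    // false
        System.out.println(new ParallelStorer().supportsParallelWriteToStoreLocation()); // true
    }
}
```

The design benefit is that adding a new parallel-capable storer only requires overriding one method on the storer itself, rather than editing a central list.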
[jira] [Updated] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5377: Patch Info: (was: Patch Available) > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reopened PIG-5377: - > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5425. - Hadoop Flags: Reviewed Resolution: Fixed > Pig 0.15 and later don't set context signature correctly > > > Key: PIG-5425 > URL: https://issues.apache.org/jira/browse/PIG-5425 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5425.0.patch > > > As an author of Pig UDFs, my expectation in EvalFunc ( > [https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java] > ) is that {{setUDFContextSignature}} would be called before > {{setInputSchema}}. This was previously the case up through Pig 0.14. > > In Pig 0.15 and later (according to the git tags, at least; I've only checked > 0.17), this is not true. > This commit introduces the problem behavior: > [https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81] > The issue is in > src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java > line 513 ([git blame > link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]) > introduced in that commit. > > There, {{f.setInputSchema()}} is called without previously calling > {{f.setUDFContextSignature(signature)}}. > Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} > is called, but POUserFunc [re-instantiates the EvalFunc and does not actually > use the func argument passed in its > constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128] > (quite confusing, but probably attributable to changes over time). > {{f}} is discarded, so it should be safe to simply call > {{f.setUDFContextSignature(signature)}} as a simple fix. 
> The code here is arguably unnecessarily complex and could probably be cleaned > up further, but I propose the simple fix above without a larger refactoring. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
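(Editor's note: the ordering contract described above can be illustrated with a minimal self-contained sketch. These are not Pig's real EvalFunc/UDFContext classes — the names and behavior are illustrative assumptions — but they show why a UDF that keys its state by context signature breaks when setInputSchema() runs before setUDFContextSignature().)

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a UDFContext-style store keyed by signature.
class FakeUdfContext {
    static final Map<String, String> store = new HashMap<>();
}

// Stand-in for an EvalFunc that stashes its input schema under its signature.
class MyUdf {
    private String signature;

    void setUDFContextSignature(String signature) {
        this.signature = signature;
    }

    void setInputSchema(String schema) {
        // If the signature has not been set yet, the schema would be stored
        // under a null/wrong key -- the symptom reported in PIG-5425.
        if (signature == null) {
            throw new IllegalStateException("signature not set before schema");
        }
        FakeUdfContext.store.put(signature, schema);
    }
}

public class Main {
    public static void main(String[] args) {
        MyUdf f = new MyUdf();
        // The fix: set the context signature first, then the input schema.
        f.setUDFContextSignature("myudf-1");
        f.setInputSchema("a:int, b:chararray");
        System.out.println(FakeUdfContext.store.get("myudf-1"));
    }
}
```

Reversing the two calls in main would throw, which is the sketch's analogue of the silent misbehavior the original ordering caused.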
[jira] [Reopened] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reopened PIG-5425: - > Pig 0.15 and later don't set context signature correctly > > > Key: PIG-5425 > URL: https://issues.apache.org/jira/browse/PIG-5425 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5425.0.patch > > > As an author of Pig UDFs, my expectation in EvalFunc ( > [https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java] > ) is that {{setUDFContextSignature}} would be called before > {{setInputSchema}}. This was previously the case up through Pig 0.14 > > In Pig 0.15 and later (according to the git tags, at least; I've only checked > 0.17), this is not true. > This commit introduces the problem behavior: > [https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81] > The issue is in > src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java > line 513 ([git blame > link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]) > introduced in that commit. > > There, {{f.setInputSchema()}} is called without previously calling > {{f.setUDFContextSignature(signature)}}. > Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} > is called, but POUserFunc [re-instantiates the EvalFunc and does not actually > use the func argument passed in its > constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128] > (quite confusing, but probably attributable to changes over time). > {{f}} is discarded, so it should be safe to simply call > {{f.setUdfContextSignature(signature)}} as a simple fix. 
> The code here is arguably unnecessarily complex and could probably be cleaned > up further, but I propose the simple fix above without a larger refactoring. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5425: Resolution: Fixed Status: Resolved (was: Patch Available) > Pig 0.15 and later don't set context signature correctly > > > Key: PIG-5425 > URL: https://issues.apache.org/jira/browse/PIG-5425 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5425.0.patch > > > As an author of Pig UDFs, my expectation in EvalFunc ( > [https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java] > ) is that {{setUDFContextSignature}} would be called before > {{setInputSchema}}. This was previously the case up through Pig 0.14 > > In Pig 0.15 and later (according to the git tags, at least; I've only checked > 0.17), this is not true. > This commit introduces the problem behavior: > [https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81] > The issue is in > src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java > line 513 ([git blame > link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]) > introduced in that commit. > > There, {{f.setInputSchema()}} is called without previously calling > {{f.setUDFContextSignature(signature)}}. > Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} > is called, but POUserFunc [re-instantiates the EvalFunc and does not actually > use the func argument passed in its > constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128] > (quite confusing, but probably attributable to changes over time). > {{f}} is discarded, so it should be safe to simply call > {{f.setUdfContextSignature(signature)}} as a simple fix. 
> The code here is arguably unnecessarily complex and could probably be cleaned > up further, but I propose the simple fix above without a larger refactoring. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5425: --- Fix Version/s: 0.18.0 Assignee: Jacob Tolar +1. Committed to trunk. Thanks for the patch [~jtolar] > Pig 0.15 and later don't set context signature correctly > > > Key: PIG-5425 > URL: https://issues.apache.org/jira/browse/PIG-5425 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Major > Fix For: 0.18.0 > > Attachments: PIG-5425.0.patch > > > As an author of Pig UDFs, my expectation in EvalFunc ( > [https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java] > ) is that {{setUDFContextSignature}} would be called before > {{setInputSchema}}. This was previously the case up through Pig 0.14 > > In Pig 0.15 and later (according to the git tags, at least; I've only checked > 0.17), this is not true. > This commit introduces the problem behavior: > [https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81] > The issue is in > src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java > line 513 ([git blame > link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]) > introduced in that commit. > > There, {{f.setInputSchema()}} is called without previously calling > {{f.setUDFContextSignature(signature)}}. > Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} > is called, but POUserFunc [re-instantiates the EvalFunc and does not actually > use the func argument passed in its > constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128] > (quite confusing, but probably attributable to changes over time). 
> {{f}} is discarded, so it should be safe to simply call > {{f.setUdfContextSignature(signature)}} as a simple fix. > The code here is arguably unnecessarily complex and could probably be cleaned > up further, but I propose the simple fix above without a larger refactoring. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5420) Update accumulo dependency to 1.10.1
[ https://issues.apache.org/jira/browse/PIG-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579158#comment-17579158 ] Rohini Palaniswamy commented on PIG-5420: - This patch needs updating as accumulo.version is now moved to ivy/libraries-h2.properties and ivy/libraries-h3.properties after PIG-5253 > Update accumulo dependency to 1.10.1 > > > Key: PIG-5420 > URL: https://issues.apache.org/jira/browse/PIG-5420 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5420-v01.patch > > > Following owasp/cve report. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-5253. - Resolution: Fixed Attached [^PIG-5253.0.svn.patch] from https://reviews.apache.org/r/72326/ to jira. Fixed the wrong license file and committed [^PIG-5253-v3.patch] to trunk. Thanks [~nkollar] and [~szita] for this key patch. > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár >Assignee: Ádám Szita >Priority: Blocker > Fix For: 0.18.0 > > Attachments: PIG-5253-v3.patch, PIG-5253.0.patch, PIG-5253.0.svn.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5253) Pig Hadoop 3 support
[ https://issues.apache.org/jira/browse/PIG-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5253: Attachment: PIG-5253.0.svn.patch PIG-5253-v3.patch > Pig Hadoop 3 support > > > Key: PIG-5253 > URL: https://issues.apache.org/jira/browse/PIG-5253 > Project: Pig > Issue Type: Improvement >Reporter: Nándor Kollár >Assignee: Ádám Szita >Priority: Blocker > Fix For: 0.18.0 > > Attachments: PIG-5253-v3.patch, PIG-5253.0.patch, PIG-5253.0.svn.patch > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5377: Resolution: Fixed Status: Resolved (was: Patch Available) > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PIG-5377) Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
[ https://issues.apache.org/jira/browse/PIG-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-5377: Fix Version/s: 0.18.0 Committed to trunk. Thank you for the contribution [~kpriceyahoo]. > Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce > - > > Key: PIG-5377 > URL: https://issues.apache.org/jira/browse/PIG-5377 > Project: Pig > Issue Type: Improvement > Components: internal-udfs, piggybank >Reporter: Kevin J. Price >Assignee: Kevin J. Price >Priority: Minor > Fix For: 0.18.0 > > Attachments: PIG-5377-2.patch, PIG-5377.patch > > > Now that we're running on JDK8 and can have default implementations in > interfaces, we can move supportsParallelWriteToStoreLocation() to the > StoreFuncInterface class and properly set it on the supported built-in > functions rather than having a static list. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: Review Request 72326: PIG-5253: Pig Hadoop 3 support
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72326/#review224613 --- trunk/test/org/apache/pig/test/MapReduceMiniCluster.java Lines 1 (patched) <https://reviews.apache.org/r/72326/#comment313397> Koji found that this file has the Cloudera license. Can you replace with Apache license? trunk/test/org/apache/pig/test/MapReduceMiniCluster.java Lines 41 (patched) <https://reviews.apache.org/r/72326/#comment313399> m_conf.set("dfs.datanode.address", "0.0.0.0:0"); m_conf.set("dfs.datanode.http.address", "0.0.0.0:0"); m_conf.set("pig.jobcontrol.sleep", "100"); System.setProperty("cluster", m_conf.get(MRConfiguration.JOB_TRACKER)); System.setProperty("namenode", m_conf.get(FileSystem.FS_DEFAULT_NAME_KEY)); is missing compared to older MiniCluster.java. Not sure datanode address settings are needed but setting pig.jobcontrol.sleep is likely needed to have the tests run faster. trunk/test/org/apache/pig/test/TezMiniCluster.java Line 61 (original), 65 (patched) <https://reviews.apache.org/r/72326/#comment313398> Can you add tez_conf.set("tez.runtime.transfer.data-via-events.enabled", "false"); here. Koji found that some tests were failing with Hadoop 3 in local mode without that setting. - Rohini Palaniswamy On April 6, 2020, 2:57 p.m., Adam Szita wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/72326/ > ------- > > (Updated April 6, 2020, 2:57 p.m.) > > > Review request for pig, Koji Noguchi, Nandor Kollar, and Rohini Palaniswamy. > > > Bugs: PIG-5253 > https://issues.apache.org/jira/browse/PIG-5253 > > > Repository: pig > > > Description > --- > > This it continuing from https://reviews.apache.org/r/65239/ (there's issues > with reviewboard's pig-git repo) > This change is now rebased on trunk, and I fixed test issues around the MR > mode MiniGenericCluster refactoring. 
> > > Diffs > - > > trunk/bin/pig 1876187 > trunk/bin/pig.py 1876187 > trunk/build.xml 1876187 > trunk/ivy.xml 1876187 > trunk/ivy/libraries-h2.properties PRE-CREATION > trunk/ivy/libraries-h3.properties PRE-CREATION > trunk/ivy/libraries.properties 1876187 > trunk/test/org/apache/pig/parser/TestErrorHandling.java 1876187 > trunk/test/org/apache/pig/parser/TestQueryParserUtils.java 1876187 > trunk/test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION > trunk/test/org/apache/pig/test/MiniCluster.java 1876187 > trunk/test/org/apache/pig/test/MiniGenericCluster.java 1876187 > trunk/test/org/apache/pig/test/SparkMiniCluster.java 1876187 > trunk/test/org/apache/pig/test/TestGrunt.java 1876187 > trunk/test/org/apache/pig/test/TezMiniCluster.java 1876187 > trunk/test/org/apache/pig/test/Util.java 1876187 > trunk/test/org/apache/pig/test/YarnMiniCluster.java 1876187 > > > Diff: https://reviews.apache.org/r/72326/diff/1/ > > > Testing > --- > > > Thanks, > > Adam Szita > >
[jira] [Resolved] (PIG-3544) Pig fails to query Apache Cassandra 2.x
[ https://issues.apache.org/jira/browse/PIG-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy resolved PIG-3544. - Resolution: Not A Problem Closing as it is a bug in Cassandra code base. > Pig fails to query Apache Cassandra 2.x > --- > > Key: PIG-3544 > URL: https://issues.apache.org/jira/browse/PIG-3544 > Project: Pig > Issue Type: Bug > Components: build >Affects Versions: 0.12.0 > Environment: CentOS 6.4 - 2.6.32-279.19.1.el6.centos.plus.x86_64 >Reporter: Claudio Romo Otto >Priority: Blocker > > Using Apache Pig 0.12 with Apache Cassandra 2.x (2.0.0 / 2.0.1), > with this sample request > data = LOAD 'cql://keyspace1/testcf?' USING CqlStorage(); > testcf is just any CF > I get this error: > java.lang.RuntimeException: InvalidRequestException(why:Undefined name > key_alias in selection clause) > at > org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.initSchema(AbstractCassandraStorage.java:511) > at > org.apache.cassandra.hadoop.pig.CqlStorage.setLocation(CqlStorage.java:246) > at > org.apache.cassandra.hadoop.pig.CqlStorage.getSchema(CqlStorage.java:280) > at > org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:151) > at > org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:110) > at > org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100) > at > org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:219) > at > org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75) > at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50) > at > org.apache.pig.newplan.logical.visitor.CastLineageSetter.(CastLineageSetter.java:57) > at org.apache.pig.PigServer$Graph.compile(PigServer.java:1635) > at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1566) > at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1538) > at 
org.apache.pig.PigServer.registerQuery(PigServer.java:540) > at > org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:970) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165) > at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69) > at org.apache.pig.Main.run(Main.java:490) > at org.apache.pig.Main.main(Main.java:111) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.main(RunJar.java:212) > Caused by: InvalidRequestException(why:Undefined name key_alias in selection > clause) > at > org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result$execute_cql3_query_resultStandardScheme.read(Cassandra.java:48006) > at > org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result$execute_cql3_query_resultStandardScheme.read(Cassandra.java:47983) > at > org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result.read(Cassandra.java:47898) > at > org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) > at > org.apache.cassandra.thrift.Cassandra$Client.recv_execute_cql3_query(Cassandra.java:1658) > at > org.apache.cassandra.thrift.Cassandra$Client.execute_cql3_query(Cassandra.java:1643) > at > org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.getCfDef(AbstractCassandraStorage.java:573) > at > org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.initSchema(AbstractCassandraStorage.java:500) > ... 25 more -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (PIG-5424) Upgrade hbase/zookeeper dependencies
[ https://issues.apache.org/jira/browse/PIG-5424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17521827#comment-17521827 ] Rohini Palaniswamy commented on PIG-5424: - +1 > Upgrade hbase/zookeeper dependencies > > > Key: PIG-5424 > URL: https://issues.apache.org/jira/browse/PIG-5424 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5424-v01.patch > > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PIG-5422) Upgrade guava/groovy dependency
[ https://issues.apache.org/jira/browse/PIG-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17521824#comment-17521824 ] Rohini Palaniswamy commented on PIG-5422: - Updating the guava version will cause issues with Hadoop unless we shade it. > Upgrade guava/groovy dependency > --- > > Key: PIG-5422 > URL: https://issues.apache.org/jira/browse/PIG-5422 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5422-v01.patch > > > Following owasp/cve. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PIG-5421) Upgrade commons dependencies
[ https://issues.apache.org/jira/browse/PIG-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17521823#comment-17521823 ] Rohini Palaniswamy commented on PIG-5421: - +1 > Upgrade commons dependencies > - > > Key: PIG-5421 > URL: https://issues.apache.org/jira/browse/PIG-5421 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5421-v01.patch > > > Following owasp/cve report -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PIG-5420) Update accumulo dependency to 1.10.1
[ https://issues.apache.org/jira/browse/PIG-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17521822#comment-17521822 ] Rohini Palaniswamy commented on PIG-5420: - Can we just make it accumulo.version instead of accumulo1.version? > Update accumulo dependency to 1.10.1 > > > Key: PIG-5420 > URL: https://issues.apache.org/jira/browse/PIG-5420 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5420-v01.patch > > > Following owasp/cve report. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PIG-5413) [spark] TestStreaming.testInputCacheSpecs failing with "File script1.pl was already registered"
[ https://issues.apache.org/jira/browse/PIG-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17473177#comment-17473177 ] Rohini Palaniswamy commented on PIG-5413: - +1 > [spark] TestStreaming.testInputCacheSpecs failing with "File script1.pl was > already registered" > --- > > Key: PIG-5413 > URL: https://issues.apache.org/jira/browse/PIG-5413 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5413-v01.patch > > > {noformat} > Caused by: java.lang.IllegalArgumentException: requirement failed: File > script1.pl was already registered with a different path (old path = > /tmp/yarn-local/usercache/knoguchi/appcache/application_1628754354801_523406/container_e07_1628754354801_523406_01_61/tmp/pig_junit_tmp1798933174/cache7028476439694979845/script1.pl, > new path = > /tmp/yarn-local/usercache/knoguchi/appcache/application_1628754354801_523406/container_e07_1628754354801_523406_01_61/tmp/pig_junit_tmp1798933174/cache4167672945345635171/script1.pl > at scala.Predef$.require(Predef.scala:224) > at > org.apache.spark.rpc.netty.NettyStreamManager.addFile(NettyStreamManager.scala:70) > at org.apache.spark.SparkContext.addFile(SparkContext.scala:1559) > ... > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
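The `requirement failed` error quoted above comes from Spark refusing to re-register a file name under a different local path. A minimal sketch of that invariant (a hypothetical `FileRegistry` class for illustration, not Spark's actual `NettyStreamManager` code) is:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the check behind the failure above: a file name may be
// registered repeatedly, but always with the same path. The Pig test tripped
// this by registering script1.pl from two different temp cache directories.
class FileRegistry {
    private final Map<String, String> files = new HashMap<>();

    void addFile(String name, String path) {
        String old = files.putIfAbsent(name, path);
        if (old != null && !old.equals(path)) {
            throw new IllegalArgumentException("requirement failed: File " + name
                    + " was already registered with a different path (old path = "
                    + old + ", new path = " + path + ")");
        }
    }
}
```

Registering the same name/path pair twice is harmless; only a conflicting path raises the exception, which matches the Scala `require` in the stack trace.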
[jira] [Commented] (PIG-5415) [spark] TestScriptLanguage conflict between multiple SparkContext (after spark2.4 upgrade)
[ https://issues.apache.org/jira/browse/PIG-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457792#comment-17457792 ] Rohini Palaniswamy commented on PIG-5415: - +1 > [spark] TestScriptLanguage conflict between multiple SparkContext (after > spark2.4 upgrade) > -- > > Key: PIG-5415 > URL: https://issues.apache.org/jira/browse/PIG-5415 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5415-v01.patch > > > {noformat} > 2021-10-12 17:54:40,073 [main] ERROR org.apache.pig.scripting.BoundScript - > Pig pipeline failed to complete > java.util.concurrent.ExecutionException: > org.apache.pig.backend.executionengine.ExecException: ERROR 0: > java.lang.IllegalStateException: Cannot call methods on a stopped > SparkContext. > This stopped SparkContext was created at: > org.apache.spark.api.java.JavaSparkContext.(JavaSparkContext.scala:58) > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.startSparkIfNeeded(SparkLauncher.java:640) > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:184) > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290) > org.apache.pig.PigServer.launchPlan(PigServer.java:1479) > org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464) > org.apache.pig.PigServer.execute(PigServer.java:1453) > org.apache.pig.PigServer.executeBatch(PigServer.java:489) > org.apache.pig.PigServer.executeBatch(PigServer.java:472) > org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:172) > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:235) > org.apache.pig.scripting.BoundScript$MyCallable.call(BoundScript.java:347) > org.apache.pig.scripting.BoundScript$MyCallable.call(BoundScript.java:323) > java.util.concurrent.FutureTask.run(FutureTask.java:266) > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > java.lang.Thread.run(Thread.java:748) > The currently active SparkContext was created at: > org.apache.spark.api.java.JavaSparkContext.(JavaSparkContext.scala:58) > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.startSparkIfNeeded(SparkLauncher.java:640) > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:184) > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290) > org.apache.pig.PigServer.launchPlan(PigServer.java:1479) > org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464) > org.apache.pig.PigServer.execute(PigServer.java:1453) > org.apache.pig.PigServer.executeBatch(PigServer.java:489) > org.apache.pig.PigServer.executeBatch(PigServer.java:472) > org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:172) > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:235) > org.apache.pig.scripting.BoundScript$MyCallable.call(BoundScript.java:347) > org.apache.pig.scripting.BoundScript$MyCallable.call(BoundScript.java:323) > java.util.concurrent.FutureTask.run(FutureTask.java:266) > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > java.lang.Thread.run(Thread.java:748) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
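The failure mode in the stack traces above — a later script run calling into a SparkContext that an earlier run already stopped — can be sketched with a toy lifecycle model (hypothetical classes; the actual fix in SparkLauncher may differ):

```java
// Toy model of the bug: a statically cached context survives across script
// runs, but any call after stop() fails. A getOrCreate that checks the
// stopped flag and recreates the context avoids the IllegalStateException.
class ContextReuseDemo {
    static class Context {
        boolean stopped = false;
        void stop() { stopped = true; }
        void runJob() {
            if (stopped) {
                throw new IllegalStateException(
                        "Cannot call methods on a stopped SparkContext.");
            }
        }
    }

    static Context cached; // mimics a static SparkContext held by the launcher

    static Context getOrCreate() {
        if (cached == null || cached.stopped) {
            cached = new Context(); // recreate rather than reuse a stopped context
        }
        return cached;
    }
}
```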
[jira] [Commented] (PIG-5398) SparkLauncher does not read SPARK_CONF_DIR/spark-defaults.conf
[ https://issues.apache.org/jira/browse/PIG-5398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457791#comment-17457791 ] Rohini Palaniswamy commented on PIG-5398: - +1 > SparkLauncher does not read SPARK_CONF_DIR/spark-defaults.conf > --- > > Key: PIG-5398 > URL: https://issues.apache.org/jira/browse/PIG-5398 > Project: Pig > Issue Type: Bug >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5398-v01.patch > > > Noticed while testing spark e2e tests. Somehow, Pig's spark launcher is not > reading SPARK_CONF_DIR at all. -- This message was sent by Atlassian Jira (v8.20.1#820001)
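For reference, `spark-defaults.conf` is a plain key/value file: blank lines and `#` comments are ignored, and each remaining line is a property name followed by whitespace and its value. A minimal reader (an illustrative sketch, not the code from the attached patch) looks like:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Parses spark-defaults.conf content: skips blanks and '#' comments, and
// splits each remaining line into "key<whitespace>value".
class SparkDefaultsReader {
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] kv = line.split("\\s+", 2);
            if (kv.length == 2) {
                props.put(kv[0], kv[1].trim());
            }
        }
        return props;
    }
}
```

A launcher honoring `SPARK_CONF_DIR` would locate the file via that environment variable before parsing it this way.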
[jira] [Commented] (PIG-5412) testSkewedJoinOuter spark unit-test failing with ClassNotFoundException
[ https://issues.apache.org/jira/browse/PIG-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457790#comment-17457790 ] Rohini Palaniswamy commented on PIG-5412: - +1 > testSkewedJoinOuter spark unit-test failing with ClassNotFoundException > --- > > Key: PIG-5412 > URL: https://issues.apache.org/jira/browse/PIG-5412 > Project: Pig > Issue Type: Bug > Components: spark >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Attachments: pig-5412-v01.patch > > > {TestSkewedJoin,TestJoinSmoke}.testSkewedJoinOuter > both with {{-Dtest.exec.type=spark -Dsparkversion=2}} > are somehow failing with > "java.lang.ClassNotFoundException: > org.apache.pig.backend.hadoop.executionengine.spark.Spark1Shims" > {noformat} > Unable to open iterator for alias C. Backend error : Job aborted. > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to > open iterator for alias C. Backend error : Job aborted. > at org.apache.pig.PigServer.openIterator(PigServer.java:1014) > at > org.apache.pig.test.TestJoinSmoke.testSkewedJoinOuter(TestJoinSmoke.java:199) > Caused by: org.apache.spark.SparkException: Job aborted. 
> at > org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:100) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1083) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) > at org.apache.spark.rdd.RDD.withScope(RDD.scala:385) > at > org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1081) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply$mcV$sp(PairRDDFunctions.scala:1000) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:991) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:991) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) > at org.apache.spark.rdd.RDD.withScope(RDD.scala:385) > at > org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:991) > at > org.apache.pig.backend.hadoop.executionengine.spark.converter.StoreConverter.convert(StoreConverter.java:99) > at > org.apache.pig.backend.hadoop.executionengine.spark.converter.StoreConverter.convert(StoreConverter.java:56) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.physicalToRDD(JobGraphBuilder.java:292) > at > org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:182) > at > 
org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140) > at > org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37) > at > org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87) > at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46) > at > org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:240) > at > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290) > at org.apache.pig.PigServer.launchPlan(PigServer.java:1479) > at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1464) > at org.apache.pig.PigServer.storeEx(PigServer.java:1123) > at org.apache.pig.PigServer.store(PigServer.java:1086) > at org.apache.pig.PigServer.openIterator(PigServer.java:999) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 0 in stage 19.0 failed 4 times, most recent failure: Lost task 0.3 in > stage 19.0 (TID 26, gsrd466n11.red.ygrid.yahoo.com, executor 2): > org.apache.spark.SparkException: Task failed while writing rows > at > org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark
[jira] [Commented] (PIG-5397) Update spark2.version to 2.4.8
[ https://issues.apache.org/jira/browse/PIG-5397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457786#comment-17457786 ] Rohini Palaniswamy commented on PIG-5397: - +1 > Update spark2.version to 2.4.8 > -- > > Key: PIG-5397 > URL: https://issues.apache.org/jira/browse/PIG-5397 > Project: Pig > Issue Type: Improvement >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Trivial > Attachments: pig-5397-v01.patch, pig-5397-v02.patch > > > bq. spark2.version=2.1.1 > This is probably too old. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (PIG-5418) Utils.parseSchema(String), parseConstant(String) leak memory
[ https://issues.apache.org/jira/browse/PIG-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5418: --- Assignee: Jacob Tolar > Utils.parseSchema(String), parseConstant(String) leak memory > > > Key: PIG-5418 > URL: https://issues.apache.org/jira/browse/PIG-5418 > Project: Pig > Issue Type: Improvement >Reporter: Jacob Tolar >Assignee: Jacob Tolar >Priority: Minor > Attachments: PIG-5418.patch > > > A minor issue: Utils.parseSchema() and parseConstant() leak > memory. I noticed this while running a unit test for a UDF several thousand > times and checking the heap. > Links are to the latest commit as of creating this ticket: > https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L244-L256 > {{new PigContext()}} [creates a MapReduce > ExecutionEngine|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/PigContext.java#L269]. > > This creates a > [MapReduceLauncher|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRExecutionEngine.java#L34]. > > This registers a [Hadoop shutdown > hook|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java#L104-L105] > which doesn't go away until the JVM dies. See: > https://hadoop.apache.org/docs/r2.8.2/hadoop-project-dist/hadoop-common/api/org/apache/hadoop/util/ShutdownHookManager.html > . > I will attach a proposed patch. From my reading of the code and running > tests, the existing schema parse APIs do not actually use anything from this > dummy PigContext, and with a minor tweak it can be passed in as NULL, > avoiding the creation of these extra resources. -- This message was sent by Atlassian Jira (v8.20.1#820001)
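The accumulation described in the report above — one never-removed shutdown hook per `parseSchema()` call, each pinning the dummy context it references — can be reproduced with a toy registry (hypothetical names; Hadoop's real `ShutdownHookManager` has the same keep-until-JVM-exit behavior):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the leak: every "parse" builds heavyweight context state and
// registers a shutdown hook that captures it, so neither can be garbage
// collected before the JVM exits.
class ShutdownHookLeakDemo {
    // Stand-in for Hadoop's ShutdownHookManager: entries live until JVM death.
    static final List<Runnable> HOOKS = new ArrayList<>();

    static void parseSchemaLeaky(String schema) {
        final byte[] contextState = new byte[1024 * 1024]; // simulates PigContext state
        // The hook captures contextState, pinning it for the life of the JVM.
        HOOKS.add(() -> System.out.println("cleanup of " + contextState.length + " bytes"));
    }

    static void parseSchemaFixed(String schema) {
        // Per the proposed patch, a null PigContext is passed instead, so no
        // launcher is built and no hook is registered per call.
    }
}
```

Calling the leaky variant N times leaves N live hooks (and N megabyte-sized payloads) behind, which is exactly the heap growth visible when running a UDF unit test thousands of times.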
[jira] [Assigned] (PIG-5414) Build failure on Linux ARM64 due to old Apache Avro
[ https://issues.apache.org/jira/browse/PIG-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-5414: --- Assignee: Martin Tzvetanov Grigorov I will review this for you by early next week [~mgrigorov]. > Build failure on Linux ARM64 due to old Apache Avro > --- > > Key: PIG-5414 > URL: https://issues.apache.org/jira/browse/PIG-5414 > Project: Pig > Issue Type: Bug > Components: build >Affects Versions: 0.18.0 >Reporter: Martin Tzvetanov Grigorov >Assignee: Martin Tzvetanov Grigorov >Priority: Major > Attachments: 35.patch, > TEST-org.apache.pig.builtin.TestAvroStorage.txt, > TEST-org.apache.pig.builtin.TestOrcStorage.txt, > TEST-org.apache.pig.builtin.TestOrcStoragePushdown.txt > > > Trying to build Apache Pig on Ubuntu 20.04.3 ARM64 fails because of old > version of Snappy and Avro libraries: > > {code:java} > Testsuite: org.apache.pig.builtin.TestAvroStorage > Tests run: 0, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.1 sec > - Standard Output --- > 2021-10-12 14:43:35,483 [main] INFO > org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) > of size 1431830528 to monitor. collectionUsageThreshold = 1064828928, > usageThreshold = 1064828928 > 2021-10-12 14:43:35,489 [main] INFO org.apache.pig.ExecTypeProvider - > Trying ExecType : LOCAL > 2021-10-12 14:43:35,489 [main] INFO org.apache.pig.ExecTypeProvider - > Picked LOCAL as the ExecType > 2021-10-12 14:43:35,515 [main] WARN org.apache.hadoop.conf.Configuration - > DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml > is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml > to override properties of core-default.xml, mapred-default.xml and > hdfs-default.xml respectively > 2021-10-12 14:43:35,755 [main] INFO > org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is > deprecated. 
Instead, use mapreduce.jobtracker.address > 2021-10-12 14:43:35,899 [main] WARN org.apache.hadoop.util.NativeCodeLoader > - Unable to load native-hadoop library for your platform... using > builtin-java classes where applicable > 2021-10-12 14:43:35,916 [main] INFO > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting > to hadoop file system at: file:/// > 2021-10-12 14:43:36,116 [main] INFO > org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is > deprecated. Instead, use dfs.bytes-per-checksum > 2021-10-12 14:43:36,137 [main] INFO org.apache.pig.PigServer - Pig Script > ID for the session: PIG-default-01426621-bc19-499f-981e-b13959fe0d84 > 2021-10-12 14:43:36,137 [main] WARN org.apache.pig.PigServer - ATS is > disabled since yarn.timeline-service.enabled set to false > 2021-10-12 14:43:36,150 [main] INFO org.apache.pig.builtin.TestAvroStorage > - creating > test/org/apache/pig/builtin/avro/data/avro/uncompressed/arraysAsOutputByPig.avro > 2021-10-12 14:43:36,502 [main] INFO org.apache.pig.builtin.TestAvroStorage > - Could not generate avro file: > test/org/apache/pig/builtin/avro/data/avro/uncompressed/arraysAsOutputByPig.avro > java.net.ConnectException: Call From martin/127.0.0.1 to localhost:40073 > failed on connection exception: java.net.ConnectException: Connection > refused; For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792) > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732) > at org.apache.hadoop.ipc.Client.call(Client.java:1479) > at 
org.apache.hadoop.ipc.Client.call(Client.java:1412) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) > at com.sun.proxy.$Proxy13.getBlockLocations(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:255) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ... > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (PIG-5410) Support Python 3 for streaming_python
Rohini Palaniswamy created PIG-5410: --- Summary: Support Python 3 for streaming_python Key: PIG-5410 URL: https://issues.apache.org/jira/browse/PIG-5410 Project: Pig Issue Type: New Feature Reporter: Rohini Palaniswamy Assignee: Venkatasubrahmanian Narayanan Fix For: 0.18.0 Python 3 is incompatible with Python 2. We need to make it work with both. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PIG-5407) Update search bar for the site
[ https://issues.apache.org/jira/browse/PIG-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17288642#comment-17288642 ] Rohini Palaniswamy commented on PIG-5407: - +1 to v03 patch > Update search bar for the site > -- > > Key: PIG-5407 > URL: https://issues.apache.org/jira/browse/PIG-5407 > Project: Pig > Issue Type: Bug > Components: site >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5407-v01.patch, pig-5407-v02.patch, > pig-5407-v03.patch > > > It was recently reported that search-hadoop is no longer valid. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PIG-5407) Update search bar for the site
[ https://issues.apache.org/jira/browse/PIG-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17288635#comment-17288635 ] Rohini Palaniswamy commented on PIG-5407: - +1 > Update search bar for the site > -- > > Key: PIG-5407 > URL: https://issues.apache.org/jira/browse/PIG-5407 > Project: Pig > Issue Type: Bug > Components: site >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Major > Attachments: pig-5407-v01.patch, pig-5407-v02.patch > > > It was recently reported that search-hadoop is no longer valid. -- This message was sent by Atlassian Jira (v8.3.4#803005)