[jira] [Commented] (PIG-5451) Pig-on-Spark3 E2E Orc_Pushdown_5 failing

2024-03-29 Thread Koji Noguchi (Jira)


[ 
https://issues.apache.org/jira/browse/PIG-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832323#comment-17832323
 ] 

Koji Noguchi commented on PIG-5451:
---

This was caused by a version conflict in orc.version:

./build/ivy/lib/Pig/orc-core-1.5.6.jar
./lib/h3/orc-core-1.5.6.jar

and

spark/jars/orc-core-1.6.14.jar
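The conflict above is the classic duplicate-jar problem: the same artifact ships on the classpath in two versions. One way to spot such clashes is to group jar filenames by artifact and flag any artifact seen with more than one version. A minimal Python sketch (the `duplicate_artifacts` helper is illustrative, not part of Pig's build):

```python
import re
from collections import defaultdict

def duplicate_artifacts(jar_paths):
    """Group jar filenames by artifact name and report any artifact
    that appears in more than one version (illustrative helper)."""
    versions = defaultdict(set)
    # matches artifact-version.jar, e.g. orc-core-1.5.6.jar
    pattern = re.compile(r"([A-Za-z][\w.-]*?)-(\d[\w.]*)\.jar$")
    for path in jar_paths:
        match = pattern.search(path)
        if match:
            versions[match.group(1)].add(match.group(2))
    return {name: sorted(v) for name, v in versions.items() if len(v) > 1}

# The three jars from this comment:
jars = [
    "./build/ivy/lib/Pig/orc-core-1.5.6.jar",
    "./lib/h3/orc-core-1.5.6.jar",
    "spark/jars/orc-core-1.6.14.jar",
]
print(duplicate_artifacts(jars))  # {'orc-core': ['1.5.6', '1.6.14']}
```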

> Pig-on-Spark3 E2E Orc_Pushdown_5 failing 
> -
>
> Key: PIG-5451
> URL: https://issues.apache.org/jira/browse/PIG-5451
> Project: Pig
>  Issue Type: Bug
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Minor
>
> Test failing with
> "java.lang.IllegalAccessError: class org.threeten.extra.chrono.HybridDate 
> cannot access its superclass org.threeten.extra.chrono.AbstractDate"



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (PIG-5451) Pig-on-Spark3 E2E Orc_Pushdown_5 failing

2024-03-29 Thread Koji Noguchi (Jira)
Koji Noguchi created PIG-5451:
-

 Summary: Pig-on-Spark3 E2E Orc_Pushdown_5 failing 
 Key: PIG-5451
 URL: https://issues.apache.org/jira/browse/PIG-5451
 Project: Pig
  Issue Type: Bug
Reporter: Koji Noguchi
Assignee: Koji Noguchi


Test failing with
"java.lang.IllegalAccessError: class org.threeten.extra.chrono.HybridDate 
cannot access its superclass org.threeten.extra.chrono.AbstractDate"








[jira] [Commented] (PIG-5451) Pig-on-Spark3 E2E Orc_Pushdown_5 failing

2024-03-29 Thread Koji Noguchi (Jira)


[ 
https://issues.apache.org/jira/browse/PIG-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832320#comment-17832320
 ] 

Koji Noguchi commented on PIG-5451:
---

Full stack trace.
{noformat}
2024-03-29 10:57:31,787 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - ResultStage 3 (runJob at SparkHadoopWriter.scala:83) failed in 36.126 s due to Job aborted due to stage failure: Task 0 in stage 3.0 failed 4 times, most recent failure: Lost task 0.3 in stage 3.0 (TID 8) (gsrd479n10.red.ygrid.yahoo.com executor 4): java.lang.IllegalAccessError: class org.threeten.extra.chrono.HybridDate cannot access its superclass org.threeten.extra.chrono.AbstractDate
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at org.apache.spark.util.ChildFirstURLClassLoader.loadClass(ChildFirstURLClassLoader.java:46)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at org.threeten.extra.chrono.HybridChronology.date(HybridChronology.java:235)
at org.threeten.extra.chrono.HybridChronology.date(HybridChronology.java:88)
at java.time.chrono.AbstractChronology.resolveYMD(AbstractChronology.java:563)
at java.time.chrono.AbstractChronology.resolveDate(AbstractChronology.java:472)
at org.threeten.extra.chrono.HybridChronology.resolveDate(HybridChronology.java:452)
at org.threeten.extra.chrono.HybridChronology.resolveDate(HybridChronology.java:88)
at java.time.format.Parsed.resolveDateFields(Parsed.java:351)
at java.time.format.Parsed.resolveFields(Parsed.java:257)
at java.time.format.Parsed.resolve(Parsed.java:244)
at java.time.format.DateTimeParseContext.toResolved(DateTimeParseContext.java:331)
at java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:1955)
at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1777)
at org.apache.orc.impl.DateUtils.<clinit>(DateUtils.java:74)
at org.apache.orc.impl.ColumnStatisticsImpl$TimestampStatisticsImpl.<init>(ColumnStatisticsImpl.java:1683)
at org.apache.orc.impl.ColumnStatisticsImpl.deserialize(ColumnStatisticsImpl.java:2131)
at org.apache.orc.impl.RecordReaderImpl.evaluatePredicateProto(RecordReaderImpl.java:522)
at org.apache.orc.impl.RecordReaderImpl$SargApplier.pickRowGroups(RecordReaderImpl.java:1045)
at org.apache.orc.impl.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:1117)
at org.apache.orc.impl.RecordReaderImpl.readStripe(RecordReaderImpl.java:1137)
at org.apache.orc.impl.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:1187)
at org.apache.orc.impl.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1222)
at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:254)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:67)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:83)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:337)
at org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat$OrcRecordReader.<init>(OrcNewInputFormat.java:72)
at org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat.createRecordReader(OrcNewInputFormat.java:57)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:255)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.<init>(PigRecordReader.java:126)
at org.apache.pig.backend.hadoop.executionengine.spark.SparkPigRecordReader.<init>(SparkPigRecordReader.java:44)
at org.apache.pig.backend.hadoop.executionengine.spark.running.PigInputFormatSpark$SparkRecordReaderFactory.createRecordReader(PigInputFormatSpark.java:131)
at org.apache.pig.backend.hadoop.executionengine.spark.running.PigInputFormatSpark.createRecordReader(PigInputFormatSpark.java:71)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.liftedTree1$1(NewHadoopRDD.scala:215)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:213)
at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:168)
at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:71)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at
{noformat}

[jira] [Updated] (PIG-5450) Pig-on-Spark3 E2E ORC test failing with java.lang.VerifyError: Bad return type

2024-03-29 Thread Koji Noguchi (Jira)


 [ 
https://issues.apache.org/jira/browse/PIG-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-5450:
--
Attachment: pig-5450-v01.patch

It turns out the weird error was coming from a pair of conflicting jars:
{{./build/ivy/lib/Pig/hive-storage-api-2.7.0.jar}}
and
{{spark/jars/hive-storage-api-2.7.2.jar}}

Uploading a patch that updates the hive-storage-api version.
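For reference, a version bump like this typically lands in Pig's Ivy configuration. A hypothetical sketch of the change (the property name shown is an assumption, not taken from the attached patch):

{noformat}
# hypothetical fragment of ivy/libraries.properties
hive-storage-api.version=2.7.2
{noformat}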

> Pig-on-Spark3 E2E ORC test failing with java.lang.VerifyError: Bad return type
> --
>
> Key: PIG-5450
> URL: https://issues.apache.org/jira/browse/PIG-5450
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Major
> Attachments: pig-5450-v01.patch
>
>
> {noformat}
> Caused by: java.lang.VerifyError: Bad return type
> Exception Details:
> Location:
> org/apache/orc/impl/TypeUtils.createColumn(Lorg/apache/orc/TypeDescription;Lorg/apache/orc/TypeDescription$RowBatchVersion;I)Lorg/apache/hadoop/hive/ql/exec/vector/ColumnVector;
>  @117: areturn
> Reason:
> Type 'org/apache/hadoop/hive/ql/exec/vector/DateColumnVector' (current frame, 
> stack[0]) is not assignable to 
> 'org/apache/hadoop/hive/ql/exec/vector/ColumnVector' (from method signature)
>  {noformat}





[jira] [Commented] (PIG-5450) Pig-on-Spark3 E2E ORC test failing with java.lang.VerifyError: Bad return type

2024-03-29 Thread Koji Noguchi (Jira)


[ 
https://issues.apache.org/jira/browse/PIG-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832318#comment-17832318
 ] 

Koji Noguchi commented on PIG-5450:
---

Weird full trace.
{noformat}
2024-03-27 10:50:40,088 [task-result-getter-0] WARN org.apache.spark.scheduler.TaskSetManager - Lost task 0.0 in stage 0.0 (TID 0) (gsrd238n05.red.ygrid.yahoo.com executor 1): org.apache.spark.SparkException: Task failed while writing rows
at org.apache.spark.internal.io.SparkHadoopWriter$.executeTask(SparkHadoopWriter.scala:163)
at org.apache.spark.internal.io.SparkHadoopWriter$.$anonfun$write$1(SparkHadoopWriter.scala:88)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1491)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.VerifyError: Bad return type
Exception Details:
Location:
org/apache/orc/impl/TypeUtils.createColumn(Lorg/apache/orc/TypeDescription;Lorg/apache/orc/TypeDescription$RowBatchVersion;I)Lorg/apache/hadoop/hive/ql/exec/vector/ColumnVector;
 @117: areturn
Reason:
Type 'org/apache/hadoop/hive/ql/exec/vector/DateColumnVector' (current frame, 
stack[0]) is not assignable to 
'org/apache/hadoop/hive/ql/exec/vector/ColumnVector' (from method signature)
Current Frame:
bci: @117
flags: { }
locals: { 'org/apache/orc/TypeDescription', 
'org/apache/orc/TypeDescription$RowBatchVersion', integer }
stack: { 'org/apache/hadoop/hive/ql/exec/vector/DateColumnVector' }
Bytecode:
0x000: b200 022a b600 03b6 0004 2eaa 0000 0181
0x010: 0000 0001 0000 0013 0000 0059 0000 0059
0x020: 0000 0059 0000 0059 0000 0059 0000 0062
0x030: 0000 006b 0000 006b 0000 0074 0000 0074
0x040: 0000 007d 0000 00ad 0000 00ad 0000 00ad
0x050: 0000 00ad 0000 00b6 0000 00f7 0000 0138
0x060: 0000 0155 bb00 0559 1cb7 0006 b0bb 0007
0x070: 591c b700 08b0 bb00 0959 1cb7 000a b0bb
0x080: 000b 591c b700 0cb0 2ab6 000d 3e2a b600
0x090: 0e36 042b b200 0fa5 0009 1d10 12a4 000f
0x0a0: bb00 1159 1c1d 1504 b700 12b0 bb00 1359
0x0b0: 1c1d 1504 b700 14b0 bb00 1559 1cb7 0016
0x0c0: b02a b600 174e 2db9 0018 0100 bd00 193a
0x0d0: 0403 3605 1505 1904 bea2 001e 1904 1505
0x0e0: 2d15 05b9 001a 0200 c000 102b 1cb8 001b
0x0f0: 5384 0501 a7ff e0bb 001c 591c 1904 b700
0x100: 1db0 2ab6 0017 4e2d b900 1801 00bd 0019
0x110: 3a04 0336 0515 0519 04be a200 1e19 0415
0x120: 052d 1505 b900 1a02 00c0 0010 2b1c b800
0x130: 1b53 8405 01a7 ffe0 bb00 1e59 1c19 04b7
0x140: 001f b02a b600 174e bb00 2059 1c2d 03b9
0x150: 001a 0200 c000 102b 1cb8 001b b700 21b0
0x160: 2ab6 0017 4ebb 0022 591c 2d03 b900 1a02
0x170: 00c0 0010 2b1c b800 1b2d 04b9 001a 0200
0x180: c000 102b 1cb8 001b b700 23b0 bb00 2459
0x190: bb00 2559 b700 2612 27b6 0028 2ab6 0003
0x1a0: b600 29b6 002a b700 2bbf
Stackmap Table:
same_frame_extended(@100)
same_frame(@109)
same_frame(@118)
same_frame(@127)
same_frame(@136)
append_frame(@160,Integer,Integer)
same_frame(@172)
chop_frame(@184,2)
same_frame(@193)
append_frame(@212,Object[#75],Object[#76],Integer)
chop_frame(@247,1)
chop_frame(@258,2)
append_frame(@277,Object[#75],Object[#76],Integer)
chop_frame(@312,1)
chop_frame(@323,2)
same_frame(@352)
same_frame(@396)

at org.apache.orc.TypeDescription.createRowBatch(TypeDescription.java:483)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.<init>(WriterImpl.java:100)
at org.apache.hadoop.hive.ql.io.orc.OrcFile.createWriter(OrcFile.java:334)
at org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat$OrcRecordWriter.write(OrcNewOutputFormat.java:51)
at org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat$OrcRecordWriter.write(OrcNewOutputFormat.java:37)
at org.apache.pig.builtin.OrcStorage.putNext(OrcStorage.java:249)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:75)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:146)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:368)
at org.apache.spark.internal.io.SparkHadoopWriter$.$anonfun$executeTask$1(SparkHadoopWriter.scala:138)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1525)
at org.apache.spark.internal.io.SparkHadoopWriter$.executeTask(SparkHadoopWriter.scala:135)
{noformat}

[jira] [Created] (PIG-5450) Pig-on-Spark3 E2E ORC test failing with java.lang.VerifyError: Bad return type

2024-03-29 Thread Koji Noguchi (Jira)
Koji Noguchi created PIG-5450:
-

 Summary: Pig-on-Spark3 E2E ORC test failing with 
java.lang.VerifyError: Bad return type
 Key: PIG-5450
 URL: https://issues.apache.org/jira/browse/PIG-5450
 Project: Pig
  Issue Type: Bug
  Components: spark
Reporter: Koji Noguchi
Assignee: Koji Noguchi


{noformat}
Caused by: java.lang.VerifyError: Bad return type
Exception Details:
Location:
org/apache/orc/impl/TypeUtils.createColumn(Lorg/apache/orc/TypeDescription;Lorg/apache/orc/TypeDescription$RowBatchVersion;I)Lorg/apache/hadoop/hive/ql/exec/vector/ColumnVector;
 @117: areturn
Reason:
Type 'org/apache/hadoop/hive/ql/exec/vector/DateColumnVector' (current frame, 
stack[0]) is not assignable to 
'org/apache/hadoop/hive/ql/exec/vector/ColumnVector' (from method signature)
 {noformat}






[jira] [Updated] (PIG-5410) Support Python 3 for streaming_python

2024-03-29 Thread Koji Noguchi (Jira)


 [ 
https://issues.apache.org/jira/browse/PIG-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-5410:
--
Attachment: pig-5410-v02.patch

Testing the patch, it was failing with
{noformat}
Caused by: org.apache.pig.impl.streaming.StreamingUDFException: LINE : File "/grid/0/tmp/yarn-local/usercache/gtrain/appcache/application_1694019138198_2621253/container_e13_1694019138198_2621253_01_04/tmp/controller1951726576599472905.py", line 365
WRAPPED_MAP_END)
^
SyntaxError: invalid syntax
{noformat}
It seems the patch was missing a '+'. Uploading a new patch with the '+' added.
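The failure mode is easy to reproduce: when a code generator drops a '+' between two parts of a concatenated expression, the generated file no longer parses. A minimal Python sketch (the names echo the generated controller code but are otherwise hypothetical):

```python
# Two adjacent names inside parentheses with no '+' between them:
# this is what a dropped concatenation operator produces.
broken = "output.write(WRAPPED_MAP_START\nWRAPPED_MAP_END)"
fixed = "output.write(WRAPPED_MAP_START +\nWRAPPED_MAP_END)"

def parses(source):
    """Return True if the source text compiles (syntax check only)."""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False

print(parses(broken))  # False: "invalid syntax", as in the trace above
print(parses(fixed))   # True: the '+' makes it a valid expression
```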



> Support Python 3 for streaming_python
> -
>
> Key: PIG-5410
> URL: https://issues.apache.org/jira/browse/PIG-5410
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-5410.patch, pig-5410-v02.patch
>
>
> Python 3 is incompatible with Python 2. We need to make it work with both. 
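To illustrate what "make it work with both" entails, a small hedged example (illustrative only, not taken from the patch): `__future__` imports make Python 2 match Python 3 semantics for two of the most common incompatibilities, printing and division.

```python
# Make Python 2 behave like Python 3 for printing and division.
from __future__ import print_function, division

def ratio(a, b):
    # '/' is true division on Python 3; the __future__ import makes
    # Python 2 agree instead of truncating to 0.
    return a / b

print(ratio(1, 2))  # 0.5 on both Python 2 and Python 3
```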





[jira] [Comment Edited] (PIG-5410) Support Python 3 for streaming_python

2024-03-29 Thread Koji Noguchi (Jira)


[ 
https://issues.apache.org/jira/browse/PIG-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832317#comment-17832317
 ] 

Koji Noguchi edited comment on PIG-5410 at 3/29/24 9:10 PM:


Testing the patch, it was failing with
{noformat}
Caused by: org.apache.pig.impl.streaming.StreamingUDFException: LINE : File "/grid/0/tmp/yarn-local/usercache/gtrain/appcache/application_1694019138198_2621253/container_e13_1694019138198_2621253_01_04/tmp/controller1951726576599472905.py", line 365
WRAPPED_MAP_END)
^
SyntaxError: invalid syntax
{noformat}
It seems the patch was missing a '+'. Uploading a new patch.


was (Author: knoguchi):
Testing the patch, it was failing with
{noformat}
Caused by: org.apache.pig.impl.streaming.StreamingUDFException: LINE : File "/grid/0/tmp/yarn-local/usercache/gtrain/appcache/application_1694019138198_2621253/container_e13_1694019138198_2621253_01_04/tmp/controller1951726576599472905.py", line 365
WRAPPED_MAP_END)
^
SyntaxError: invalid syntax
{noformat}
it seems like the patch was missing a '+'.   Uploading a new patch with '+'.  



> Support Python 3 for streaming_python
> -
>
> Key: PIG-5410
> URL: https://issues.apache.org/jira/browse/PIG-5410
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-5410.patch, pig-5410-v02.patch
>
>
> Python 3 is incompatible with Python 2. We need to make it work with both. 


