[jira] [Reopened] (SPARK-26796) Testcases failing with "org.apache.hadoop.fs.ChecksumException" error

2019-07-18 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade reopened SPARK-26796:
---

> Testcases failing with "org.apache.hadoop.fs.ChecksumException" error
> -
>
> Key: SPARK-26796
> URL: https://issues.apache.org/jira/browse/SPARK-26796
> Project: Spark
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.3.2, 2.4.0
> Environment: Ubuntu 16.04 
> Java Version
> openjdk version "1.8.0_192"
>  OpenJDK Runtime Environment (build 1.8.0_192-b12_openj9)
>  Eclipse OpenJ9 VM (build openj9-0.11.0, JRE 1.8.0 Compressed References 
> 20181107_80 (JIT enabled, AOT enabled)
>  OpenJ9 - 090ff9dcd
>  OMR - ea548a66
>  JCL - b5a3affe73 based on jdk8u192-b12)
>  
> Hadoop  Version
> Hadoop 2.7.1
>  Subversion Unknown -r Unknown
>  Compiled by test on 2019-01-29T09:09Z
>  Compiled with protoc 2.5.0
>  From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
>  This command was run using 
> /home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar
>  
>  
>  
>Reporter: Anuja Jakhade
>Priority: Major
>
> Observing test case failures due to a checksum error. 
> Below is the error log:
> [ERROR] checkpointAndComputation(test.org.apache.spark.JavaAPISuite) Time 
> elapsed: 1.232 s <<< ERROR!
> org.apache.spark.SparkException: 
> Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most 
> recent failure: Lost task 0.0 in stage 2.0 (TID 2, localhost, executor 
> driver): org.apache.hadoop.fs.ChecksumException: Checksum error: 
> file:/home/test/spark/core/target/tmp/1548319689411-0/fd0ba388-539c-49aa-bf76-e7d50aa2d1fc/rdd-0/part-0
>  at 0 exp: 222499834 got: 1400184476
>  at org.apache.hadoop.fs.FSInputChecker.verifySums(FSInputChecker.java:323)
>  at 
> org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:279)
>  at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:214)
>  at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:232)
>  at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:196)
>  at java.io.DataInputStream.read(DataInputStream.java:149)
>  at 
> java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2769)
>  at 
> java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2785)
>  at 
> java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3262)
>  at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:968)
>  at java.io.ObjectInputStream.(ObjectInputStream.java:390)
>  at 
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.(JavaSerializer.scala:63)
>  at 
> org.apache.spark.serializer.JavaDeserializationStream.(JavaSerializer.scala:63)
>  at 
> org.apache.spark.serializer.JavaSerializerInstance.deserializeStream(JavaSerializer.scala:122)
>  at 
> org.apache.spark.rdd.ReliableCheckpointRDD$.readCheckpointFile(ReliableCheckpointRDD.scala:300)
>  at 
> org.apache.spark.rdd.ReliableCheckpointRDD.compute(ReliableCheckpointRDD.scala:100)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:322)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:109)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:813)
> Driver stacktrace:
>  at 
> test.org.apache.spark.JavaAPISuite.checkpointAndComputation(JavaAPISuite.java:1243)
> Caused by: org.apache.hadoop.fs.ChecksumException: Checksum error:
>  
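For context, here is a minimal, hypothetical Java sketch (not the actual JavaAPISuite code; the checkpoint directory and data are illustrative) of the checkpoint-and-recompute pattern the failing test exercises:

```
// Mark an RDD for reliable checkpointing, materialize it, then read it back.
import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class CheckpointRepro {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext("local", "checkpoint-repro");
        // Checkpoint part files (and their .crc checksum files) land under this directory.
        sc.setCheckpointDir("/tmp/checkpoint-repro");

        JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 2, 3, 4, 5));
        rdd.checkpoint();   // mark for reliable checkpointing
        rdd.count();        // first action writes the checkpoint files

        // A later action re-reads the partitions through
        // ReliableCheckpointRDD.readCheckpointFile; that read path is where
        // the ChecksumException in the log above is raised.
        System.out.println(rdd.collect());
        sc.stop();
    }
}
```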



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27428) Test "metrics StatsD sink with Timer " fails on BigEndian

2019-04-10 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-27428:
--
Description: 
Test case "metrics StatsD sink with Timer *** FAILED ***" fails with error

java.net.SocketTimeoutException: Receive timed out
 at 
java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
 at java.net.DatagramSocket.receive(DatagramSocket.java:812)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:155)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:154)
 at scala.collection.immutable.Range.foreach(Range.scala:160)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:154)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite.org$apache$spark$metrics$sink$StatsdSinkSuite$$withSocketAndSink(StatsdSinkSuite.scala:51)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply$mcV$sp(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply(StatsdSinkSuite.scala:123)

On debugging, I observed that the last packet is not received at 
"socket.receive(p)"; hence the assert fails.

Also, I want to know which feature of Apache Spark this test exercises.
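For reference, a condensed, hypothetical sketch of the receive-with-timeout pattern at StatsdSinkSuite.scala:154-155 (port, timeout, packet count, and buffer size here are illustrative, not the suite's actual values). The suite exercises Spark's StatsD metrics sink, which reports Dropwizard metrics such as Timers over UDP:

```
// UDP receive loop: throws SocketTimeoutException when an expected packet never arrives.
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.nio.charset.StandardCharsets;

public class StatsdReceiveSketch {
    public static void main(String[] args) throws Exception {
        try (DatagramSocket socket = new DatagramSocket(0)) { // bind an ephemeral local port
            socket.setSoTimeout(10_000);                      // bounded wait for each packet
            byte[] buf = new byte[1024];
            int expectedPackets = 10;                         // a Timer reports several StatsD lines
            for (int i = 0; i < expectedPackets; i++) {
                DatagramPacket p = new DatagramPacket(buf, buf.length);
                socket.receive(p);                            // times out here if the last packet is missing
                System.out.println(new String(p.getData(), 0, p.getLength(), StandardCharsets.UTF_8));
            }
        }
    }
}
```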

  was:
Test case "metrics StatsD sink with Timer *** FAILED ***" fails with error

java.net.SocketTimeoutException: Receive timed out
 at 
java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
 at java.net.DatagramSocket.receive(DatagramSocket.java:812)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:155)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:154)
 at scala.collection.immutable.Range.foreach(Range.scala:160)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:154)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite.org$apache$spark$metrics$sink$StatsdSinkSuite$$withSocketAndSink(StatsdSinkSuite.scala:51)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply$mcV$sp(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply(StatsdSinkSuite.scala:123)

On debugging, I observed that the last packet is not received at 
"socket.receive(p)"; hence the assert fails.

Also, I want to know which feature of Apache Spark this test exercises.


> Test "metrics StatsD sink with Timer " fails on BigEndian
> -
>
> Key: SPARK-27428
> URL: https://issues.apache.org/jira/browse/SPARK-27428
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.2, 2.3.3, 2.3.4
> Environment: Working on Ubuntu 16.04, Linux 
> Java versions : 
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux s390x-64-Bit 
> Compressed References 20190205_218 (JIT enabled, AOT enabled)
> and
> openjdk version "1.8.0_191"
> OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.18.04.1-b12)
> OpenJDK 64-Bit Zero VM (build 25.191-b12, interpreted mode)
>Reporter: Anuja Jakhade
>Priority: Major
>
> Test case "metrics StatsD sink with Timer *** FAILED ***" fails with the error:
> java.net.SocketTimeoutException: Receive timed out
>  at 
> java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
>  at java.net.DatagramSocket.receive(DatagramSocket.java:812)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:155)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:154)
>  at scala.collection.immutable.Range.foreach(Range.scala:160)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:154)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:123)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite.org$apache$spark$metrics$sink$StatsdSinkSuite$$withSocketAndSink(StatsdSinkSuite.scala:51)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply$mcV$sp(StatsdSinkSuite.scala:123)
>  at 
> 

[jira] [Updated] (SPARK-27428) Test "metrics StatsD sink with Timer " fails on BigEndian

2019-04-10 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-27428:
--
Description: 
Test case "metrics StatsD sink with Timer *** FAILED ***" fails with error

```java.net.SocketTimeoutException: Receive timed out
 at 
java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
 at java.net.DatagramSocket.receive(DatagramSocket.java:812)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:155)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:154)
 at scala.collection.immutable.Range.foreach(Range.scala:160)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:154)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite.org$apache$spark$metrics$sink$StatsdSinkSuite$$withSocketAndSink(StatsdSinkSuite.scala:51)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply$mcV$sp(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply(StatsdSinkSuite.scala:123)

On debugging, I observed that the last packet is not received at 
"socket.receive(p)"; hence the assert fails.```

Also, I want to know which feature of Apache Spark this test exercises.

  was:
Test case "metrics StatsD sink with Timer *** FAILED ***" fails with error

java.net.SocketTimeoutException: Receive timed out
 at 
java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
 at java.net.DatagramSocket.receive(DatagramSocket.java:812)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:155)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:154)
 at scala.collection.immutable.Range.foreach(Range.scala:160)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:154)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite.org$apache$spark$metrics$sink$StatsdSinkSuite$$withSocketAndSink(StatsdSinkSuite.scala:51)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply$mcV$sp(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply(StatsdSinkSuite.scala:123)

On debugging, I observed that the last packet is not received at 
"socket.receive(p)"; hence the assert fails.

Also, I want to know which feature of Apache Spark this test exercises.


> Test "metrics StatsD sink with Timer " fails on BigEndian
> -
>
> Key: SPARK-27428
> URL: https://issues.apache.org/jira/browse/SPARK-27428
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.2, 2.3.3, 2.3.4
> Environment: Working on Ubuntu 16.04, Linux 
> Java versions : 
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux s390x-64-Bit 
> Compressed References 20190205_218 (JIT enabled, AOT enabled)
> and
> openjdk version "1.8.0_191"
> OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.18.04.1-b12)
> OpenJDK 64-Bit Zero VM (build 25.191-b12, interpreted mode)
>Reporter: Anuja Jakhade
>Priority: Major
>
> Test case "metrics StatsD sink with Timer *** FAILED ***" fails with the error:
> ```java.net.SocketTimeoutException: Receive timed out
>  at 
> java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
>  at java.net.DatagramSocket.receive(DatagramSocket.java:812)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:155)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:154)
>  at scala.collection.immutable.Range.foreach(Range.scala:160)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:154)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:123)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite.org$apache$spark$metrics$sink$StatsdSinkSuite$$withSocketAndSink(StatsdSinkSuite.scala:51)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply$mcV$sp(StatsdSinkSuite.scala:123)
>  at 
> 

[jira] [Updated] (SPARK-27428) Test "metrics StatsD sink with Timer " fails on BigEndian

2019-04-10 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-27428:
--
Description: 
Test case "metrics StatsD sink with Timer *** FAILED ***" fails with error

java.net.SocketTimeoutException: Receive timed out
 at 
java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
 at java.net.DatagramSocket.receive(DatagramSocket.java:812)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:155)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:154)
 at scala.collection.immutable.Range.foreach(Range.scala:160)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:154)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite.org$apache$spark$metrics$sink$StatsdSinkSuite$$withSocketAndSink(StatsdSinkSuite.scala:51)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply$mcV$sp(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply(StatsdSinkSuite.scala:123)

On debugging, I observed that the last packet is not received at 
"socket.receive(p)"; hence the assert fails.

Also, I want to know which feature of Apache Spark this test exercises.

  was:
Test case "metrics StatsD sink with Timer *** FAILED ***" fails with error

```java.net.SocketTimeoutException: Receive timed out
 at 
java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
 at java.net.DatagramSocket.receive(DatagramSocket.java:812)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:155)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:154)
 at scala.collection.immutable.Range.foreach(Range.scala:160)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:154)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite.org$apache$spark$metrics$sink$StatsdSinkSuite$$withSocketAndSink(StatsdSinkSuite.scala:51)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply$mcV$sp(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply(StatsdSinkSuite.scala:123)

On debugging, I observed that the last packet is not received at 
"socket.receive(p)"; hence the assert fails.```

Also, I want to know which feature of Apache Spark this test exercises.


> Test "metrics StatsD sink with Timer " fails on BigEndian
> -
>
> Key: SPARK-27428
> URL: https://issues.apache.org/jira/browse/SPARK-27428
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.2, 2.3.3, 2.3.4
> Environment: Working on Ubuntu 16.04, Linux 
> Java versions : 
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux s390x-64-Bit 
> Compressed References 20190205_218 (JIT enabled, AOT enabled)
> and
> openjdk version "1.8.0_191"
> OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.18.04.1-b12)
> OpenJDK 64-Bit Zero VM (build 25.191-b12, interpreted mode)
>Reporter: Anuja Jakhade
>Priority: Major
>
> Test case "metrics StatsD sink with Timer *** FAILED ***" fails with the error:
> java.net.SocketTimeoutException: Receive timed out
>  at 
> java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
>  at java.net.DatagramSocket.receive(DatagramSocket.java:812)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:155)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:154)
>  at scala.collection.immutable.Range.foreach(Range.scala:160)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:154)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:123)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite.org$apache$spark$metrics$sink$StatsdSinkSuite$$withSocketAndSink(StatsdSinkSuite.scala:51)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply$mcV$sp(StatsdSinkSuite.scala:123)
>  at 
> 

[jira] [Updated] (SPARK-27428) Test "metrics StatsD sink with Timer " fails on BigEndian

2019-04-10 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-27428:
--
Description: 
Test case "metrics StatsD sink with Timer *** FAILED ***" fails with error

java.net.SocketTimeoutException: Receive timed out
 at 
java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
 at java.net.DatagramSocket.receive(DatagramSocket.java:812)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:155)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:154)
 at scala.collection.immutable.Range.foreach(Range.scala:160)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:154)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite.org$apache$spark$metrics$sink$StatsdSinkSuite$$withSocketAndSink(StatsdSinkSuite.scala:51)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply$mcV$sp(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply(StatsdSinkSuite.scala:123)

On debugging, I observed that the last packet is not received at 
"socket.receive(p)"; hence the assert fails.

Also, I want to know which feature of Apache Spark this test exercises.

  was:
Test case "metrics StatsD sink with Timer *** FAILED ***" fails with error

java.net.SocketTimeoutException: Receive timed out
 at 
java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
 at java.net.DatagramSocket.receive(DatagramSocket.java:812)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:155)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:154)
 at scala.collection.immutable.Range.foreach(Range.scala:160)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:154)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite.org$apache$spark$metrics$sink$StatsdSinkSuite$$withSocketAndSink(StatsdSinkSuite.scala:51)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply$mcV$sp(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply(StatsdSinkSuite.scala:123)

On debugging, I observed that the last packet is not received at 
"socket.receive(p)"; hence the assert fails.


> Test "metrics StatsD sink with Timer " fails on BigEndian
> -
>
> Key: SPARK-27428
> URL: https://issues.apache.org/jira/browse/SPARK-27428
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.2, 2.3.3, 2.3.4
> Environment: Working on Ubuntu 16.04, Linux 
> Java versions : 
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux s390x-64-Bit 
> Compressed References 20190205_218 (JIT enabled, AOT enabled)
> and
> openjdk version "1.8.0_191"
> OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.18.04.1-b12)
> OpenJDK 64-Bit Zero VM (build 25.191-b12, interpreted mode)
>Reporter: Anuja Jakhade
>Priority: Major
>
> Test case "metrics StatsD sink with Timer *** FAILED ***" fails with the error:
> java.net.SocketTimeoutException: Receive timed out
>  at 
> java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
>  at java.net.DatagramSocket.receive(DatagramSocket.java:812)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:155)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:154)
>  at scala.collection.immutable.Range.foreach(Range.scala:160)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:154)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:123)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite.org$apache$spark$metrics$sink$StatsdSinkSuite$$withSocketAndSink(StatsdSinkSuite.scala:51)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply$mcV$sp(StatsdSinkSuite.scala:123)
>  at 
> org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply(StatsdSinkSuite.scala:123)
> On debugging, I observed 

[jira] [Created] (SPARK-27428) Test "metrics StatsD sink with Timer " fails on BigEndian

2019-04-10 Thread Anuja Jakhade (JIRA)
Anuja Jakhade created SPARK-27428:
-

 Summary: Test "metrics StatsD sink with Timer " fails on BigEndian
 Key: SPARK-27428
 URL: https://issues.apache.org/jira/browse/SPARK-27428
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.3.3, 2.3.2, 2.3.4
 Environment: Working on Ubuntu 16.04, Linux 

Java versions : 

Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux s390x-64-Bit Compressed 
References 20190205_218 (JIT enabled, AOT enabled)

and

openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.18.04.1-b12)
OpenJDK 64-Bit Zero VM (build 25.191-b12, interpreted mode)
Reporter: Anuja Jakhade


Test case "metrics StatsD sink with Timer *** FAILED ***" fails with error

java.net.SocketTimeoutException: Receive timed out
 at 
java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
 at java.net.DatagramSocket.receive(DatagramSocket.java:812)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:155)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4$$anonfun$apply$3.apply(StatsdSinkSuite.scala:154)
 at scala.collection.immutable.Range.foreach(Range.scala:160)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:154)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4$$anonfun$apply$mcV$sp$4.apply(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite.org$apache$spark$metrics$sink$StatsdSinkSuite$$withSocketAndSink(StatsdSinkSuite.scala:51)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply$mcV$sp(StatsdSinkSuite.scala:123)
 at 
org.apache.spark.metrics.sink.StatsdSinkSuite$$anonfun$4.apply(StatsdSinkSuite.scala:123)

On debugging, I observed that the last packet is not received at 
"socket.receive(p)"; hence the assert fails.






[jira] [Comment Edited] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-19 Thread Anuja Jakhade (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795931#comment-16795931
 ] 

Anuja Jakhade edited comment on SPARK-26985 at 3/19/19 10:19 AM:
-

Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

The above fix doesn't work on all the test cases, and the behavior of 
*ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* complement 
each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN 

and 

*DataFrameTungsten/InMemoryColumnarQuerySuite* passes only when ByteOrder is 
set to ByteOrder.BIG_ENDIAN.
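To make the byte-order sensitivity concrete, here is a small, hypothetical Java sketch (plain JDK, not Spark code): the same four bytes decode to different float values depending on the ByteOrder, which is exactly the writer/reader mismatch that corrupts cached float/double columns:

```
// One byte sequence, two interpretations depending on ByteOrder.
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianSketch {
    public static void main(String[] args) {
        // Encode 1.5f using little-endian byte order.
        byte[] bytes = ByteBuffer.allocate(4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putFloat(1.5f)
                .array();

        float le = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getFloat();
        float be = ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).getFloat();

        // Prints 1.5 for the little-endian read and a denormal garbage value for
        // the big-endian read: a writer and a reader that disagree on byte order
        // (e.g. Parquet files vs. the on-heap column cache) corrupt the data.
        System.out.println("LITTLE_ENDIAN: " + le + ", BIG_ENDIAN: " + be);
    }
}
```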

 


was (Author: anuja):
Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

Because in that case the fix doesn't work on all the test cases, and the 
behavior of *ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
complement each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN 

and 

*DataFrameTungsten/InMemoryColumnarQuerySuite* passes only when ByteOrder is 
set to ByteOrder.BIG_ENDIAN.

 

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Major
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases, the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> I observed that the data obtained after df.cache() is causing the error. Please 
> find the attached log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Comment Edited] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-19 Thread Anuja Jakhade (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795931#comment-16795931
 ] 

Anuja Jakhade edited comment on SPARK-26985 at 3/19/19 10:32 AM:
-

Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

The above fix, however, doesn't work on all the test cases, and the behavior of 
*ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* complement 
each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
passes only when ByteOrder is set to ByteOrder.BIG_ENDIAN.

 


was (Author: anuja):
Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

The above fix, however, doesn't work on all the test cases, and the behavior of 
*ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* complement 
each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
passes only when ByteOrder is set to ByteOrder.BIG_ENDIAN.

 

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Major
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases, the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> I observed that the data obtained after df.cache() is causing the error. Please 
> find the attached log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Comment Edited] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-19 Thread Anuja Jakhade (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795931#comment-16795931
 ] 

Anuja Jakhade edited comment on SPARK-26985 at 3/19/19 10:24 AM:
-

Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

The above fix, however, doesn't work on all the test cases, and the behavior of 
*ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* complement 
each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
passes only when ByteOrder is set to ByteOrder.BIG_ENDIAN.

 


was (Author: anuja):
Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

The above fix, however, doesn't work on all the test cases, and the behavior of 
*ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* complement 
each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
passes only when ByteOrder is set to ByteOrder.BIG_ENDIAN.

 

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Major
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases, the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> I observed that the data obtained after df.cache() is causing the error. Please 
> find the attached log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Comment Edited] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-19 Thread Anuja Jakhade (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795931#comment-16795931
 ] 

Anuja Jakhade edited comment on SPARK-26985 at 3/19/19 10:24 AM:
-

Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

The above fix, however, doesn't work on all the test cases, and the behavior of 
*ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* complement 
each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
passes only when ByteOrder is set to ByteOrder.BIG_ENDIAN.

 


was (Author: anuja):
Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

The above fix, however, doesn't work on all the test cases, and the behavior of 
*ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* complement 
each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
passes only when ByteOrder is set to ByteOrder.BIG_ENDIAN.

 

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Major
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases, the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> I observed that the data obtained after df.cache() is causing the error. Please 
> find the attached log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Comment Edited] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-19 Thread Anuja Jakhade (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795931#comment-16795931
 ] 

Anuja Jakhade edited comment on SPARK-26985 at 3/19/19 10:17 AM:
-

Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

Because in that case the fix doesn't work on all the test cases, and the 
behavior of *ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
complement each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN 

and 

*DataFrameTungsten/InMemoryColumnarQuerySuite* passes only when ByteOrder is 
set to ByteOrder.BIG_ENDIAN.

 


was (Author: anuja):
Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
"_*OnHeapColumnVector.java*_" to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

Because in that case the fix doesn't work on all the test cases, and the 
behavior of *ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
complement each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN 

and 

*DataFrameTungsten/InMemoryColumnarQuerySuite* passes only when ByteOrder is 
set to ByteOrder.BIG_ENDIAN.

 

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Major
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases, the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> I observed that the data obtained after df.cache() is causing the error. Please 
> find the attached log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Comment Edited] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-19 Thread Anuja Jakhade (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795931#comment-16795931
 ] 

Anuja Jakhade edited comment on SPARK-26985 at 3/19/19 10:19 AM:
-

Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

The above fix, however, doesn't work on all the test cases, and the behavior of 
*ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* complement 
each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
passes only when ByteOrder is set to ByteOrder.BIG_ENDIAN.

 


was (Author: anuja):
Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

The above fix, however, doesn't work on all the test cases, and the behavior of 
*ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* complement 
each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN 

and 

*DataFrameTungsten/InMemoryColumnarQuerySuite* passes only when ByteOrder is 
set to ByteOrder.BIG_ENDIAN.

 

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Major
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases, the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> I observed that the data obtained after df.cache() is causing the error. Please 
> find the attached log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Comment Edited] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-19 Thread Anuja Jakhade (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795931#comment-16795931
 ] 

Anuja Jakhade edited comment on SPARK-26985 at 3/19/19 10:19 AM:
-

Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

The above fix, however, doesn't work on all the test cases, and the behavior of 
*ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* complement 
each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN 

and 

*DataFrameTungsten/InMemoryColumnarQuerySuite* passes only when ByteOrder is 
set to ByteOrder.BIG_ENDIAN.

 


was (Author: anuja):
Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

The above fix doesn't work on all the test cases, and the behavior of 
*ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* complement 
each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN 

and 

*DataFrameTungsten/InMemoryColumnarQuerySuite* passes only when ByteOrder is 
set to ByteOrder.BIG_ENDIAN.

 

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Major
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases, the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> I observed that the data obtained after df.cache() is causing the error. Please 
> find the attached log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Comment Edited] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-19 Thread Anuja Jakhade (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795931#comment-16795931
 ] 

Anuja Jakhade edited comment on SPARK-26985 at 3/19/19 10:18 AM:
-

Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

Because in that case the fix doesn't work on all the test cases, and the 
behavior of *ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
complement each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN 

and 

*DataFrameTungsten/InMemoryColumnarQuerySuite* passes only when ByteOrder is 
set to ByteOrder.BIG_ENDIAN.

 


was (Author: anuja):
Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
_*[OnHeapColumnVector.java|https://github.com/apache/spark/blob/v2.3.2/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java]*_
 to *ByteOrder.BIG_ENDIAN*, the tests pass, because the float and double data is then read properly. However, in that case some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why we use LITTLE_ENDIAN byte order even when bigEndianPlatform is true?

Because in that case the fix doesn't work on all the test cases, and the 
behavior of *ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
complement each other: 

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN 

and 

*DataFrameTungsten/InMemoryColumnarQuerySuite* passes only when ByteOrder is 
set to ByteOrder.BIG_ENDIAN.

 

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Major
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases, the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> I observed that the data obtained after df.cache() is causing the error. Please 
> find the attached log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Commented] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-19 Thread Anuja Jakhade (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795931#comment-16795931
 ] 

Anuja Jakhade commented on SPARK-26985:
---

Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
"_*OnHeapColumnVector.java*_" to *ByteOrder.LITTLE_ENDIAN*, the tests pass, 
because the data is then read properly. However, in that case some tests of 
the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why the ByteOrder is set to LITTLE_ENDIAN even 
when bigEndianPlatform is true?

In that case the fix doesn't work for all the test cases, because the behaviors 
of *ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
complement each other.

 

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Major
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> Observed that the data obtained after df.cache() is causing the error. Please 
> find attached the log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Comment Edited] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-19 Thread Anuja Jakhade (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795931#comment-16795931
 ] 

Anuja Jakhade edited comment on SPARK-26985 at 3/19/19 10:16 AM:
-

Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
"_*OnHeapColumnVector.java*_" to *ByteOrder.BIG_ENDIAN*, the tests pass, 
because the float and double data is then read properly. However, in that case 
some tests of the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why the ByteOrder is set to LITTLE_ENDIAN even 
when bigEndianPlatform is true?

In that case the fix doesn't work for all the test cases, because the behaviors 
of *ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
complement each other:

*ParquetIOSuite* passes only when ByteOrder is set to 
ByteOrder.LITTLE_ENDIAN, 

and 

*DataFrameTungsten/InMemoryColumnarQuerySuite* passes only when ByteOrder is 
set to ByteOrder.BIG_ENDIAN.
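
On the bigEndianPlatform point, the JVM exposes the machine's native byte 
order directly, so a platform-aware choice is possible. A hypothetical sketch 
(not the actual OnHeapColumnVector code):

{code:scala}
import java.nio.ByteOrder

// The JVM reports the byte order of the underlying hardware.
val nativeOrder = ByteOrder.nativeOrder()
val bigEndianPlatform = nativeOrder == ByteOrder.BIG_ENDIAN

// A platform-aware column vector would wrap its buffers in the native
// order instead of hard-coding LITTLE_ENDIAN on big-endian machines.
println(s"native order: $nativeOrder, big endian platform: $bigEndianPlatform")
{code}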

 


was (Author: anuja):
Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
"_*OnHeapColumnVector.java*_" to *ByteOrder.BIG_ENDIAN*, the tests pass, 
because the data is then read properly. However, in that case some tests of 
the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why the ByteOrder is set to LITTLE_ENDIAN even 
when bigEndianPlatform is true?

In that case the fix doesn't work for all the test cases, because the behaviors 
of *ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
complement each other.

 

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Major
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> Observed that the data obtained after df.cache() is causing the error. Please 
> find attached the log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Comment Edited] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-19 Thread Anuja Jakhade (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795931#comment-16795931
 ] 

Anuja Jakhade edited comment on SPARK-26985 at 3/19/19 10:12 AM:
-

Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
"_*OnHeapColumnVector.java*_" to *ByteOrder.BIG_ENDIAN*, the tests pass, 
because the data is then read properly. However, in that case some tests of 
the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why the ByteOrder is set to LITTLE_ENDIAN even 
when bigEndianPlatform is true?

In that case the fix doesn't work for all the test cases, because the behaviors 
of *ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
complement each other.

 


was (Author: anuja):
Hi [~srowen], [~hyukjin.kwon]

I have observed that after changing the ByteOrder in 
"_*OnHeapColumnVector.java*_" to *ByteOrder.LITTLE_ENDIAN*, the tests pass, 
because the data is then read properly. However, in that case some tests of 
the Parquet module fail, e.g. *ParquetIOSuite*.

Is there any specific reason why the ByteOrder is set to LITTLE_ENDIAN even 
when bigEndianPlatform is true?

In that case the fix doesn't work for all the test cases, because the behaviors 
of *ParquetIOSuite* and *DataFrameTungsten/InMemoryColumnarQuerySuite* 
complement each other.

 

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Major
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> Observed that the data obtained after df.cache() is causing the error. Please 
> find attached the log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Reopened] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-11 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade reopened SPARK-26985:
---

I did try with OpenJDK; however, the same behavior is observed. 

The test fails with the same error. 

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Major
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> Observed that the data obtained after df.cache() is causing the error. Please 
> find attached the log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Commented] (SPARK-26940) Observed greater deviation on big endian platform for SingletonReplSuite test case

2019-03-11 Thread Anuja Jakhade (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789393#comment-16789393
 ] 

Anuja Jakhade commented on SPARK-26940:
---

I did try with OpenJDK; however, the same behavior is observed. 

The test fails with the same error. 

> Observed greater deviation on big endian platform for SingletonReplSuite test 
> case
> --
>
> Key: SPARK-26940
> URL: https://issues.apache.org/jira/browse/SPARK-26940
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.3.2
> Environment: Ubuntu 16.04 LTS
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux (JIT enabled, AOT 
> enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>Reporter: Anuja Jakhade
>Priority: Minor
>  Labels: BigEndian
> Attachments: failure_log.txt
>
>
> I have built Apache Spark v2.3.2 on Big Endian platform with AdoptJDK OpenJ9 
> 1.8.0_202.
> My build is successful. However while running the scala tests of "*Spark 
> Project REPL*" module, I am facing failures at SingletonReplSuite with error 
> log as attached.
> The deviation observed on big endian is greater than the acceptable deviation 
> 0.2.
> Would it be reasonable to increase the deviation threshold defined in 
> SingletonReplSuite.scala?
> Can this be fixed? 
>  






[jira] [Reopened] (SPARK-26940) Observed greater deviation on big endian platform for SingletonReplSuite test case

2019-03-11 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade reopened SPARK-26940:
---

I did try with OpenJDK; however, the same behavior is observed. 

The test fails with the same error. 

> Observed greater deviation on big endian platform for SingletonReplSuite test 
> case
> --
>
> Key: SPARK-26940
> URL: https://issues.apache.org/jira/browse/SPARK-26940
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.3.2
> Environment: Ubuntu 16.04 LTS
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux (JIT enabled, AOT 
> enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>Reporter: Anuja Jakhade
>Priority: Minor
>  Labels: BigEndian
> Attachments: failure_log.txt
>
>
> I have built Apache Spark v2.3.2 on Big Endian platform with AdoptJDK OpenJ9 
> 1.8.0_202.
> My build is successful. However while running the scala tests of "*Spark 
> Project REPL*" module, I am facing failures at SingletonReplSuite with error 
> log as attached.
> The deviation observed on big endian is greater than the acceptable deviation 
> 0.2.
> Would it be reasonable to increase the deviation threshold defined in 
> SingletonReplSuite.scala?
> Can this be fixed? 
>  






[jira] [Commented] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-11 Thread Anuja Jakhade (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789392#comment-16789392
 ] 

Anuja Jakhade commented on SPARK-26985:
---

I did try with OpenJDK; however, the same behavior is observed. 

The test fails with the same error. 

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Major
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> Observed that the data obtained after df.cache() is causing the error. Please 
> find attached the log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Updated] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-02-26 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26985:
--
Description: 
While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
observing test failures for 2 Suites of Project SQL.
 1. InMemoryColumnarQuerySuite
 2. DataFrameTungstenSuite
 In both cases the test "access only some column of the all of columns" fails 
due to a mismatch in the final assert.

Observed that the data obtained after df.cache() is causing the error. Please 
find attached the log with the details. 

cache() works perfectly fine if double and float values are not in the picture.

Inside test !!- access only some column of the all of columns *** FAILED ***

  was:
While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
observing test failures for 2 Suites of Project SQL.
 1. InMemoryColumnarQuerySuite
 2. DataFrameTungstenSuite
 In both cases the test "access only some column of the all of columns" fails 
due to a mismatch in the final assert.

Observed that the data obtained after df.cache() is causing the error. Please 
find attached the log with the details. 

 

Inside test !!- access only some column of the all of columns *** FAILED ***


> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Critical
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> Observed that the data obtained after df.cache() is causing the error. Please 
> find attached the log with the details. 
> cache() works perfectly fine if double and float values are not in the picture.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Updated] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-02-26 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26985:
--
Labels: BigEndian  (was: )

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Critical
>  Labels: BigEndian
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> Observed that the data obtained after df.cache() is causing the error. Please 
> find attached the log with the details. 
>  
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Updated] (SPARK-26940) Observed greater deviation on big endian platform for SingletonReplSuite test case

2019-02-26 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26940:
--
Labels: BigEndian  (was: )

> Observed greater deviation on big endian platform for SingletonReplSuite test 
> case
> --
>
> Key: SPARK-26940
> URL: https://issues.apache.org/jira/browse/SPARK-26940
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.3.2
> Environment: Ubuntu 16.04 LTS
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux (JIT enabled, AOT 
> enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>Reporter: Anuja Jakhade
>Priority: Critical
>  Labels: BigEndian
> Attachments: failure_log.txt
>
>
> I have built Apache Spark v2.3.2 on Big Endian platform with AdoptJDK OpenJ9 
> 1.8.0_202.
> My build is successful. However while running the scala tests of "*Spark 
> Project REPL*" module, I am facing failures at SingletonReplSuite with error 
> log as attached.
> The deviation observed on big endian is greater than the acceptable deviation 
> 0.2.
> Would it be reasonable to increase the deviation threshold defined in 
> SingletonReplSuite.scala?
> Can this be fixed? 
>  






[jira] [Updated] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-02-26 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26985:
--
Attachment: access only some column of the all of columns.txt

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Critical
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt, access only some column of the all of 
> columns.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> Observed that the data obtained after df.cache() is causing the error. Please 
> find attached the log with the details. 
>  
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Updated] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-02-26 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26985:
--
Description: 
While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
observing test failures for 2 Suites of Project SQL.
 1. InMemoryColumnarQuerySuite
 2. DataFrameTungstenSuite
 In both cases the test "access only some column of the all of columns" fails 
due to a mismatch in the final assert.

Observed that the data obtained after df.cache() is causing the error. Please 
find attached the log with the details. 

 

Inside test !!- access only some column of the all of columns *** FAILED ***

  was:
While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
observing test failures for 2 Suites of Project SQL.
 1. InMemoryColumnarQuerySuite
 2. DataFrameTungstenSuite
 In both cases the test "access only some column of the all of columns" fails 
due to a mismatch in the final assert.
 Seems that the difference in mapping of float and decimal on big endian is 
causing the assert to fail.

Inside test !!- access only some column of the all of columns *** FAILED ***


> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Critical
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
> Observed that the data obtained after df.cache() is causing the error. Please 
> find attached the log with the details. 
>  
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Updated] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-02-25 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26985:
--
Description: 
While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
observing test failures for 2 Suites of Project SQL.
 1. InMemoryColumnarQuerySuite
 2. DataFrameTungstenSuite
 In both cases the test "access only some column of the all of columns" fails 
due to a mismatch in the final assert.
 Seems that the difference in mapping of float and decimal on big endian is 
causing the assert to fail.

Inside test !!- access only some column of the all of columns *** FAILED ***

  was:
While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
observing test failures for 2 Suites of Project SQL.
 1. InMemoryColumnarQuerySuite
 2. DataFrameTungstenSuite
 In both cases the test "access only some column of the all of columns" fails 
due to a mismatch in the final assert.
 Seems that the difference in mapping of float and decimal on big endian is 
causing the assert to fail.

Inside test !!- access only some column of the all of columns *** FAILED ***
 99 did not equal 9 (InMemoryColumnarQuerySuite.scala:153)


> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Critical
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
>  Seems that the difference in mapping of float and decimal on big endian is 
> causing the assert to fail.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***






[jira] [Updated] (SPARK-26940) Observed greater deviation on big endian platform for SingletonReplSuite test case

2019-02-25 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26940:
--
Priority: Critical  (was: Major)

> Observed greater deviation on big endian platform for SingletonReplSuite test 
> case
> --
>
> Key: SPARK-26940
> URL: https://issues.apache.org/jira/browse/SPARK-26940
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.3.2
> Environment: Ubuntu 16.04 LTS
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux (JIT enabled, AOT 
> enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>Reporter: Anuja Jakhade
>Priority: Critical
> Attachments: failure_log.txt
>
>
> I have built Apache Spark v2.3.2 on Big Endian platform with AdoptJDK OpenJ9 
> 1.8.0_202.
> My build is successful. However while running the scala tests of "*Spark 
> Project REPL*" module, I am facing failures at SingletonReplSuite with error 
> log as attached.
> The deviation observed on big endian is greater than the acceptable deviation 
> 0.2.
> Would it be reasonable to increase the deviation threshold defined in 
> SingletonReplSuite.scala?
> Can this be fixed? 
>  






[jira] [Updated] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-02-25 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26985:
--
Attachment: InMemoryColumnarQuerySuite.txt
DataFrameTungstenSuite.txt

> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Critical
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt
>
>
> I am running tests on Apache Spark v2.3.2 with AdoptJDK on big endian
>  I am observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
>  Seems that the difference in mapping of float and decimal on big endian is 
> causing the assert to fail.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***
>  99 did not equal 9 (InMemoryColumnarQuerySuite.scala:153)






[jira] [Updated] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-02-25 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26985:
--
Description: 
While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
observing test failures for 2 Suites of Project SQL.
 1. InMemoryColumnarQuerySuite
 2. DataFrameTungstenSuite
 In both cases the test "access only some column of the all of columns" fails 
due to a mismatch in the final assert.
 Seems that the difference in mapping of float and decimal on big endian is 
causing the assert to fail.

Inside test !!- access only some column of the all of columns *** FAILED ***
 99 did not equal 9 (InMemoryColumnarQuerySuite.scala:153)

  was:
I am running tests on Apache Spark v2.3.2 with AdoptJDK on big endian
 I am observing test failures for 2 Suites of Project SQL.
 1. InMemoryColumnarQuerySuite
 2. DataFrameTungstenSuite
 In both cases the test "access only some column of the all of columns" fails 
due to a mismatch in the final assert.
 Seems that the difference in mapping of float and decimal on big endian is 
causing the assert to fail.

Inside test !!- access only some column of the all of columns *** FAILED ***
 99 did not equal 9 (InMemoryColumnarQuerySuite.scala:153)


> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Critical
> Attachments: DataFrameTungstenSuite.txt, 
> InMemoryColumnarQuerySuite.txt
>
>
> While running tests on Apache Spark v2.3.2 with AdoptJDK on big endian, I am 
> observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
>  Seems that the difference in mapping of float and decimal on big endian is 
> causing the assert to fail.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***
>  99 did not equal 9 (InMemoryColumnarQuerySuite.scala:153)






[jira] [Updated] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-02-25 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26985:
--
Description: 
I am running tests on Apache Spark v2.3.2 with AdoptJDK on big endian
 I am observing test failures for 2 Suites of Project SQL.
 1. InMemoryColumnarQuerySuite
 2. DataFrameTungstenSuite
 In both cases the test "access only some column of the all of columns" fails 
due to a mismatch in the final assert.
 Seems that the difference in mapping of float and decimal on big endian is 
causing the assert to fail.

Inside test !!- access only some column of the all of columns *** FAILED ***
 99 did not equal 9 (InMemoryColumnarQuerySuite.scala:153)

  was:
I am running tests on Apache Spark v2.3.2 with AdoptJDK on big endian
I am obsorving test failures at 2 Suites of Prject SQL.
1. InMemoryColumnarQuerySuite
2. DataFrameTungstenSuite
In both the cases test "access only some column of the all of columns" fails 
due to mismatch in the final assert.
Seems that the differnce in mapping of float and decimal on big endian is 
causing the assert to fail.

Inside test !!- access only some column of the all of columns *** FAILED ***
 99 did not equal 9 (InMemoryColumnarQuerySuite.scala:153)


> Test "access only some column of the all of columns " fails on big endian
> -
>
> Key: SPARK-26985
> URL: https://issues.apache.org/jira/browse/SPARK-26985
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2
> Environment: Linux Ubuntu 16.04 
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed 
> References 20190205_218 (JIT enabled, AOT enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>  
>Reporter: Anuja Jakhade
>Priority: Critical
>
> I am running tests on Apache Spark v2.3.2 with AdoptJDK on big endian
>  I am observing test failures for 2 Suites of Project SQL.
>  1. InMemoryColumnarQuerySuite
>  2. DataFrameTungstenSuite
>  In both cases the test "access only some column of the all of columns" fails 
> due to a mismatch in the final assert.
>  Seems that the difference in mapping of float and decimal on big endian is 
> causing the assert to fail.
> Inside test !!- access only some column of the all of columns *** FAILED 
> ***
>  99 did not equal 9 (InMemoryColumnarQuerySuite.scala:153)






[jira] [Created] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-02-25 Thread Anuja Jakhade (JIRA)
Anuja Jakhade created SPARK-26985:
-

 Summary: Test "access only some column of the all of columns " 
fails on big endian
 Key: SPARK-26985
 URL: https://issues.apache.org/jira/browse/SPARK-26985
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.3.2
 Environment: Linux Ubuntu 16.04 

openjdk version "1.8.0_202"
OpenJDK Runtime Environment (build 1.8.0_202-b08)
Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 64-Bit Compressed References 
20190205_218 (JIT enabled, AOT enabled)
OpenJ9 - 90dd8cb40
OMR - d2f4534b
JCL - d002501a90 based on jdk8u202-b08)

 
Reporter: Anuja Jakhade


I am running tests on Apache Spark v2.3.2 with AdoptJDK on big endian.
I am observing test failures for 2 Suites of Project SQL:
1. InMemoryColumnarQuerySuite
2. DataFrameTungstenSuite
In both cases the test "access only some column of the all of columns" fails 
due to a mismatch in the final assert.
Seems that the difference in mapping of float and decimal on big endian is 
causing the assert to fail.

Inside test !!- access only some column of the all of columns *** FAILED ***
 99 did not equal 9 (InMemoryColumnarQuerySuite.scala:153)
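
As a rough illustration of the failing pattern (a hypothetical reduction, not 
the actual suite code), the test caches a DataFrame that contains float and 
double columns and then reads back only a subset of the columns from the 
cached columnar representation:

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[2]").appName("be-repro").getOrCreate()
import spark.implicits._

// A row mixing integral, float, and double columns, as in the failing suites.
val df = Seq((1, 9L, 9.9f, 99.0, "x")).toDF("i", "l", "f", "d", "s")
df.cache()

// On big endian the float/double columns come back corrupted from the cached
// column vectors, so the final assert on the selected columns mismatches.
df.select("f", "d").show()
{code}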






[jira] [Updated] (SPARK-26940) Observed greater deviation on big endian platform for SingletonReplSuite test case

2019-02-20 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26940:
--
Summary: Observed greater deviation on big endian platform for 
SingletonReplSuite test case  (was: Observed greater deviation on big endian 
for SingletonReplSuite test case)

> Observed greater deviation on big endian platform for SingletonReplSuite test 
> case
> --
>
> Key: SPARK-26940
> URL: https://issues.apache.org/jira/browse/SPARK-26940
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.3.2
> Environment: Ubuntu 16.04 LTS
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux (JIT enabled, AOT 
> enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>Reporter: Anuja Jakhade
>Priority: Major
> Attachments: failure_log.txt
>
>
> I have built Apache Spark v2.3.2 on Big Endian with AdoptJDK OpenJ9 1.8.0_202.
> My build is successful. However, while running the scala tests of the "*Spark 
> Project REPL*" module, I am facing failures at SingletonReplSuite, with the 
> error log as attached.
> The deviation observed on big endian is greater than the acceptable deviation 
> 0.2.
> Would it be reasonable to increase the deviation threshold defined in 
> SingletonReplSuite.scala?
> Can this be fixed? 
>  






[jira] [Updated] (SPARK-26940) Observed greater deviation on big endian platform for SingletonReplSuite test case

2019-02-20 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26940:
--
Description: 
I have built Apache Spark v2.3.2 on Big Endian with AdoptJDK OpenJ9 1.8.0_202.

My build is successful. However, while running the scala tests of the "*Spark 
Project REPL*" module, I am facing failures at SingletonReplSuite, with the 
error log as attached.

The deviation observed on big endian is greater than the acceptable deviation 
0.2.

Would it be reasonable to increase the deviation threshold defined in 
SingletonReplSuite.scala?

Can this be fixed? 

 

  was:
I have built Apache Spark v2.3.2 on Big Endian with AdoptJDK OpenJ9 1.8.0_202.

My build is successful. However while running the scala tests of "*Spark 
Project REPL*" module. I am facing failures at SingletonReplSuite with error 
log as attached below 

The deviation observed on big endian is greater than the acceptable deviation 
0.2.

Would it be reasonable to increase the deviation threshold defined in 
SingletonReplSuite.scala?

Can this be fixed? 

 


> Observed greater deviation on big endian platform for SingletonReplSuite test 
> case
> --
>
> Key: SPARK-26940
> URL: https://issues.apache.org/jira/browse/SPARK-26940
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.3.2
> Environment: Ubuntu 16.04 LTS
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux (JIT enabled, AOT 
> enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>Reporter: Anuja Jakhade
>Priority: Major
> Attachments: failure_log.txt
>
>
> I have built Apache Spark v2.3.2 on Big Endian with AdoptJDK OpenJ9 1.8.0_202.
> My build is successful. However while running the scala tests of "*Spark 
> Project REPL*" module, I am facing failures at SingletonReplSuite with error 
> log as attached.
> The deviation observed on big endian is greater than the acceptable deviation 
> 0.2.
> How efficient is it to increase the deviation defined in 
> SingletonReplSuite.scala
> Can this be fixed? 
>  






[jira] [Updated] (SPARK-26940) Observed greater deviation on big endian platform for SingletonReplSuite test case

2019-02-20 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26940:
--
Description: 
I have built Apache Spark v2.3.2 on Big Endian platform with AdoptJDK OpenJ9 
1.8.0_202.

My build is successful. However, while running the scala tests of the "*Spark 
Project REPL*" module, I am facing failures at SingletonReplSuite, with the 
error log as attached.

The deviation observed on big endian is greater than the acceptable deviation 
0.2.

Would it be reasonable to increase the deviation threshold defined in 
SingletonReplSuite.scala?

Can this be fixed? 

 

  was:
I have built Apache Spark v2.3.2 on Big Endian with AdoptJDK OpenJ9 1.8.0_202.

My build is successful. However while running the scala tests of "*Spark 
Project REPL*" module, I am facing failures at SingletonReplSuite with error 
log as attached.

The deviation observed on big endian is greater than the acceptable deviation 
0.2.

Would it be reasonable to increase the deviation threshold defined in 
SingletonReplSuite.scala?

Can this be fixed? 

 


> Observed greater deviation on big endian platform for SingletonReplSuite test 
> case
> --
>
> Key: SPARK-26940
> URL: https://issues.apache.org/jira/browse/SPARK-26940
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.3.2
> Environment: Ubuntu 16.04 LTS
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux (JIT enabled, AOT 
> enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>Reporter: Anuja Jakhade
>Priority: Major
> Attachments: failure_log.txt
>
>
> I have built Apache Spark v2.3.2 on Big Endian platform with AdoptJDK OpenJ9 
> 1.8.0_202.
> My build is successful. However while running the scala tests of "*Spark 
> Project REPL*" module, I am facing failures at SingletonReplSuite with error 
> log as attached.
> The deviation observed on big endian is greater than the acceptable deviation 
> 0.2.
> Would it be reasonable to increase the deviation threshold defined in 
> SingletonReplSuite.scala?
> Can this be fixed? 
>  






[jira] [Updated] (SPARK-26940) Observed greater deviation on big endian for SingletonReplSuite test case

2019-02-20 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26940:
--
Attachment: (was: failure_log)

> Observed greater deviation on big endian for SingletonReplSuite test case
> -
>
> Key: SPARK-26940
> URL: https://issues.apache.org/jira/browse/SPARK-26940
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.3.2
> Environment: Ubuntu 16.04 LTS
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux (JIT enabled, AOT 
> enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>Reporter: Anuja Jakhade
>Priority: Major
> Attachments: failure_log.txt
>
>
> I have built Apache Spark v2.3.2 on Big Endian with AdoptJDK OpenJ9 1.8.0_202.
> My build is successful. However, while running the scala tests of the "*Spark 
> Project REPL*" module, I am facing failures at SingletonReplSuite, with the 
> error log as attached.
> The deviation observed on big endian is greater than the acceptable deviation 
> 0.2.
> Would it be reasonable to increase the deviation threshold defined in 
> SingletonReplSuite.scala?
> Can this be fixed? 
>  






[jira] [Updated] (SPARK-26940) Observed greater deviation on big endian for SingletonReplSuite test case

2019-02-20 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26940:
--
Attachment: failure_log.txt

> Observed greater deviation on big endian for SingletonReplSuite test case
> -
>
> Key: SPARK-26940
> URL: https://issues.apache.org/jira/browse/SPARK-26940
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.3.2
> Environment: Ubuntu 16.04 LTS
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux (JIT enabled, AOT 
> enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>Reporter: Anuja Jakhade
>Priority: Major
> Attachments: failure_log.txt
>
>
> I have built Apache Spark v2.3.2 on Big Endian with AdoptJDK OpenJ9 1.8.0_202.
> My build is successful. However, while running the scala tests of the "*Spark 
> Project REPL*" module, I am facing failures at SingletonReplSuite, with the 
> error log as attached.
> The deviation observed on big endian is greater than the acceptable deviation 
> 0.2.
> Would it be reasonable to increase the deviation threshold defined in 
> SingletonReplSuite.scala?
> Can this be fixed? 
>  






[jira] [Updated] (SPARK-26940) Observed greater deviation on big endian for SingletonReplSuite test case

2019-02-20 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26940:
--
Attachment: failure_log

> Observed greater deviation on big endian for SingletonReplSuite test case
> -
>
> Key: SPARK-26940
> URL: https://issues.apache.org/jira/browse/SPARK-26940
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.3.2
> Environment: Ubuntu 16.04 LTS
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux (JIT enabled, AOT 
> enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>Reporter: Anuja Jakhade
>Priority: Major
> Attachments: failure_log
>
>
> I have built Apache Spark v2.3.2 on Big Endian with AdoptJDK OpenJ9 1.8.0_202.
> My build is successful. However, while running the scala tests of the "*Spark 
> Project REPL*" module, I am facing failures at SingletonReplSuite, with the 
> error log as attached.
> The deviation observed on big endian is greater than the acceptable deviation 
> 0.2.
> Would it be reasonable to increase the deviation threshold defined in 
> SingletonReplSuite.scala?
> Can this be fixed? 
>  






[jira] [Updated] (SPARK-26940) Observed greater deviation on big endian for SingletonReplSuite test case

2019-02-20 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26940:
--
Description: 
I have built Apache Spark v2.3.2 on Big Endian with AdoptJDK OpenJ9 1.8.0_202.

My build is successful. However, while running the scala tests of the "*Spark 
Project REPL*" module, I am facing failures at SingletonReplSuite, with the 
error log as attached.

The deviation observed on big endian is greater than the acceptable deviation 
0.2.

Would it be reasonable to increase the deviation threshold defined in 
SingletonReplSuite.scala?

Can this be fixed? 

 

  was:
I have built Apache Spark v2.3.2 on Big Endian with AdoptJDK OpenJ9 1.8.0_202.

My build is successful. However while running the scala tests of "*Spark 
Project REPL*" module. I am facing failures at SingletonReplSuite with error 
log as below 

 
 - should clone and clean line object in ClosureCleaner *** FAILED ***
 isContain was true Interpreter output contained 'AssertionError':

scala> import org.apache.spark.rdd.RDD

scala>
 scala> lines: org.apache.spark.rdd.RDD[String] = pom.xml MapPartitionsRDD[46] 
at textFile at <console>:40

scala> defined class Data

scala> dataRDD: org.apache.spark.rdd.RDD[Data] = MapPartitionsRDD[47] at map at 
<console>:43

scala> res28: Long = 180

scala> repartitioned: org.apache.spark.rdd.RDD[Data] = MapPartitionsRDD[51] at 
repartition at <console>:41

scala> res29: Long = 180

scala>
 scala> | | getCacheSize: (rdd: org.apache.spark.rdd.RDD[_])Long

scala> cacheSize1: Long = 24608

scala> cacheSize2: Long = 17768

scala>
 scala>
 scala> deviation: Double = 0.2779583875162549

scala> | java.lang.AssertionError: assertion failed: deviation too large: 
0.2779583875162549, first size: 24608, second size: 17768
 at scala.Predef$.assert(Predef.scala:170)
 ... 46 elided

scala> | _result_1550641172995: Int = 1

scala> (SingletonReplSuite.scala:121)

 

The deviation observed on big endian is greater than the acceptable deviation 
0.2.

Would it be reasonable to increase the deviation threshold defined in 
SingletonReplSuite.scala?

Can this be fixed? 

 

Summary: Observed greater deviation on big endian for 
SingletonReplSuite test case  (was: Observed greater deviation Big Endian for 
SingletonReplSuite test case)

> Observed greater deviation on big endian for SingletonReplSuite test case
> -
>
> Key: SPARK-26940
> URL: https://issues.apache.org/jira/browse/SPARK-26940
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.3.2
> Environment: Ubuntu 16.04 LTS
> openjdk version "1.8.0_202"
> OpenJDK Runtime Environment (build 1.8.0_202-b08)
> Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux (JIT enabled, AOT 
> enabled)
> OpenJ9 - 90dd8cb40
> OMR - d2f4534b
> JCL - d002501a90 based on jdk8u202-b08)
>Reporter: Anuja Jakhade
>Priority: Major
>
> I have built Apache Spark v2.3.2 on Big Endian with AdoptJDK OpenJ9 1.8.0_202.
> My build is successful. However, while running the scala tests of the "*Spark 
> Project REPL*" module, I am facing failures at SingletonReplSuite, with the 
> error log as attached.
> The deviation observed on big endian is greater than the acceptable deviation 
> 0.2.
> Would it be reasonable to increase the deviation threshold defined in 
> SingletonReplSuite.scala?
> Can this be fixed? 
>  






[jira] [Created] (SPARK-26940) Observed greater deviation Big Endian for SingletonReplSuite test case

2019-02-20 Thread Anuja Jakhade (JIRA)
Anuja Jakhade created SPARK-26940:
-

 Summary: Observed greater deviation Big Endian for 
SingletonReplSuite test case
 Key: SPARK-26940
 URL: https://issues.apache.org/jira/browse/SPARK-26940
 Project: Spark
  Issue Type: Test
  Components: Tests
Affects Versions: 2.3.2
 Environment: Ubuntu 16.04 LTS

openjdk version "1.8.0_202"
OpenJDK Runtime Environment (build 1.8.0_202-b08)
Eclipse OpenJ9 VM (build openj9-0.12.1, JRE 1.8.0 Linux (JIT enabled, AOT 
enabled)
OpenJ9 - 90dd8cb40
OMR - d2f4534b
JCL - d002501a90 based on jdk8u202-b08)
Reporter: Anuja Jakhade


I have built Apache Spark v2.3.2 on Big Endian with AdoptJDK OpenJ9 1.8.0_202.

My build is successful. However, while running the scala tests of the "*Spark 
Project REPL*" module, I am facing failures at SingletonReplSuite, with the 
error log as below: 

 
 - should clone and clean line object in ClosureCleaner *** FAILED ***
 isContain was true Interpreter output contained 'AssertionError':

scala> import org.apache.spark.rdd.RDD

scala>
 scala> lines: org.apache.spark.rdd.RDD[String] = pom.xml MapPartitionsRDD[46] 
at textFile at <console>:40

scala> defined class Data

scala> dataRDD: org.apache.spark.rdd.RDD[Data] = MapPartitionsRDD[47] at map at 
<console>:43

scala> res28: Long = 180

scala> repartitioned: org.apache.spark.rdd.RDD[Data] = MapPartitionsRDD[51] at 
repartition at :41

scala> res29: Long = 180

scala>
 scala> | | getCacheSize: (rdd: org.apache.spark.rdd.RDD[_])Long

scala> cacheSize1: Long = 24608

scala> cacheSize2: Long = 17768

scala>
 scala>
 scala> deviation: Double = 0.2779583875162549

scala> | java.lang.AssertionError: assertion failed: deviation too large: 
0.2779583875162549, first size: 24608, second size: 17768
 at scala.Predef$.assert(Predef.scala:170)
 ... 46 elided

scala> | _result_1550641172995: Int = 1

scala> (SingletonReplSuite.scala:121)

 

The deviation observed on big endian is greater than the acceptable deviation 
of 0.2.

Would it be reasonable to increase the deviation threshold defined in 
SingletonReplSuite.scala?

Can this be fixed?
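
For context, the getCacheSize helper defined in the transcript takes an RDD and 
returns a Long. A plausible reconstruction in Scala (hypothetical on my part; 
the authoritative definition is in SingletonReplSuite.scala, and sc is the 
REPL's SparkContext):

import org.apache.spark.rdd.RDD

// Hypothetical reconstruction: sum the in-memory bytes reported for the
// blocks cached under this RDD's id.
def getCacheSize(rdd: RDD[_]): Long = {
  sc.getRDDStorageInfo.filter(_.id == rdd.id).map(_.memSize).sum
}

A cache-size measurement like this is sensitive to how the JVM lays objects out 
in memory, which is why different JVMs can report different sizes for the same 
data.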

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26796) Testcases failing with "org.apache.hadoop.fs.ChecksumException" error

2019-01-31 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26796:
--
Environment: 
Ubuntu 16.04 

Java Version

openjdk version "1.8.0_192"
OpenJDK Runtime Environment (build 1.8.0_192-b12_openj9)
Eclipse OpenJ9 VM (build openj9-0.11.0, JRE 1.8.0 Linux s390x-64-Bit Compressed 
References 20181107_80 (JIT enabled, AOT enabled)
OpenJ9 - 090ff9dcd
OMR - ea548a66
JCL - b5a3affe73 based on jdk8u192-b12)

 

Hadoop  Version

Hadoop 2.7.1
 Subversion Unknown -r Unknown
 Compiled by test on 2019-01-29T09:09Z
 Compiled with protoc 2.5.0
 From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
 This command was run using 
/home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar


  was:
Ubuntu 16.04 

Java Version

openjdk version "1.8.0_191"
 OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12)
 OpenJDK 64-Bit Zero VM (build 25.191-b12, interpreted mode)

 

Hadoop  Version

Hadoop 2.7.1
 Subversion Unknown -r Unknown
 Compiled by test on 2019-01-29T09:09Z
 Compiled with protoc 2.5.0
 From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
 This command was run using 
/home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar



> Testcases failing with "org.apache.hadoop.fs.ChecksumException" error
> -
>
> Key: SPARK-26796
> URL: https://issues.apache.org/jira/browse/SPARK-26796
> Project: Spark
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.3.2, 2.4.0
> Environment: Ubuntu 16.04 
> Java Version
> openjdk version "1.8.0_192"
> OpenJDK Runtime Environment (build 1.8.0_192-b12_openj9)
> Eclipse OpenJ9 VM (build openj9-0.11.0, JRE 1.8.0 Linux s390x-64-Bit 
> Compressed References 20181107_80 (JIT enabled, AOT enabled)
> OpenJ9 - 090ff9dcd
> OMR - ea548a66
> JCL - b5a3affe73 based on jdk8u192-b12)
>  
> Hadoop  Version
> Hadoop 2.7.1
>  Subversion Unknown -r Unknown
>  Compiled by test on 2019-01-29T09:09Z
>  Compiled with protoc 2.5.0
>  From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
>  This command was run using 
> /home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar
>  
>  
>  
>Reporter: Anuja Jakhade
>Priority: Major
>
> Observing test case failures due to a checksum error.
> Below is the error log:
> [ERROR] checkpointAndComputation(test.org.apache.spark.JavaAPISuite) Time 
> elapsed: 1.232 s <<< ERROR!
> org.apache.spark.SparkException: 
> Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most 
> recent failure: Lost task 0.0 in stage 2.0 (TID 2, localhost, executor 
> driver): org.apache.hadoop.fs.ChecksumException: Checksum error: 
> file:/home/test/spark/core/target/tmp/1548319689411-0/fd0ba388-539c-49aa-bf76-e7d50aa2d1fc/rdd-0/part-0
>  at 0 exp: 222499834 got: 1400184476
>  at org.apache.hadoop.fs.FSInputChecker.verifySums(FSInputChecker.java:323)
>  at 
> org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:279)
>  at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:214)
>  at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:232)
>  at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:196)
>  at java.io.DataInputStream.read(DataInputStream.java:149)
>  at 
> java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2769)
>  at 
> java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2785)
>  at 
> java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3262)
>  at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:968)
>  at java.io.ObjectInputStream.(ObjectInputStream.java:390)
>  at 
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.(JavaSerializer.scala:63)
>  at 
> org.apache.spark.serializer.JavaDeserializationStream.(JavaSerializer.scala:63)
>  at 
> org.apache.spark.serializer.JavaSerializerInstance.deserializeStream(JavaSerializer.scala:122)
>  at 
> org.apache.spark.rdd.ReliableCheckpointRDD$.readCheckpointFile(ReliableCheckpointRDD.scala:300)
>  at 
> org.apache.spark.rdd.ReliableCheckpointRDD.compute(ReliableCheckpointRDD.scala:100)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:322)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:109)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>  at 
> 

[jira] [Created] (SPARK-26796) Testcases failing with "org.apache.hadoop.fs.ChecksumException" error

2019-01-31 Thread Anuja Jakhade (JIRA)
Anuja Jakhade created SPARK-26796:
-

 Summary: Testcases failing with 
"org.apache.hadoop.fs.ChecksumException" error
 Key: SPARK-26796
 URL: https://issues.apache.org/jira/browse/SPARK-26796
 Project: Spark
  Issue Type: Bug
  Components: Tests
Affects Versions: 2.4.0, 2.3.2
 Environment: I am working on Ubuntu 16.04 on s390x.

Java Version

openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12)
OpenJDK 64-Bit Zero VM (build 25.191-b12, interpreted mode)

 

Hadoop  Version

Hadoop 2.7.1
Subversion Unknown -r Unknown
Compiled by test on 2019-01-29T09:09Z
Compiled with protoc 2.5.0
From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
This command was run using 
/home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar

Reporter: Anuja Jakhade


Observing test case failures due to a checksum error.

Below is the error log:

[ERROR] checkpointAndComputation(test.org.apache.spark.JavaAPISuite) Time 
elapsed: 1.232 s <<< ERROR!
org.apache.spark.SparkException: 
Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most 
recent failure: Lost task 0.0 in stage 2.0 (TID 2, localhost, executor driver): 
org.apache.hadoop.fs.ChecksumException: Checksum error: 
file:/home/test/spark/core/target/tmp/1548319689411-0/fd0ba388-539c-49aa-bf76-e7d50aa2d1fc/rdd-0/part-0
 at 0 exp: 222499834 got: 1400184476
 at org.apache.hadoop.fs.FSInputChecker.verifySums(FSInputChecker.java:323)
 at 
org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:279)
 at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:214)
 at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:232)
 at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:196)
 at java.io.DataInputStream.read(DataInputStream.java:149)
 at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2769)
 at 
java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2785)
 at 
java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3262)
 at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:968)
 at java.io.ObjectInputStream.(ObjectInputStream.java:390)
 at 
org.apache.spark.serializer.JavaDeserializationStream$$anon$1.(JavaSerializer.scala:63)
 at 
org.apache.spark.serializer.JavaDeserializationStream.(JavaSerializer.scala:63)
 at 
org.apache.spark.serializer.JavaSerializerInstance.deserializeStream(JavaSerializer.scala:122)
 at 
org.apache.spark.rdd.ReliableCheckpointRDD$.readCheckpointFile(ReliableCheckpointRDD.scala:300)
 at 
org.apache.spark.rdd.ReliableCheckpointRDD.compute(ReliableCheckpointRDD.scala:100)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:322)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
 at org.apache.spark.scheduler.Task.run(Task.scala:109)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:813)

Driver stacktrace:
 at 
test.org.apache.spark.JavaAPISuite.checkpointAndComputation(JavaAPISuite.java:1243)
Caused by: org.apache.hadoop.fs.ChecksumException: Checksum error:
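
For anyone reproducing this outside JavaAPISuite: the failing frames go through 
ReliableCheckpointRDD.readCheckpointFile, i.e. the read-back of a reliably 
checkpointed RDD from the local filesystem, which is where Hadoop's 
FSInputChecker verifies checksums. A minimal standalone sketch in Scala that 
exercises the same path (the app name and checkpoint directory are hypothetical 
placeholders):

import org.apache.spark.{SparkConf, SparkContext}

object CheckpointReadBack {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local").setAppName("checkpoint-readback"))
    sc.setCheckpointDir("/tmp/checkpoint-readback")  // hypothetical path

    val rdd = sc.parallelize(1 to 100)
    rdd.checkpoint()  // mark for reliable (on-disk) checkpointing
    rdd.count()       // first action materializes the RDD and writes checkpoint files

    // Later computations read the checkpoint back through
    // ReliableCheckpointRDD.readCheckpointFile -- the frame where the
    // ChecksumException above is raised on this platform.
    println(rdd.collect().length)
    sc.stop()
  }
}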

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26796) Testcases failing with "org.apache.hadoop.fs.ChecksumException" error

2019-01-31 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26796:
--
Environment: 
Ubuntu 16.04 

Java Version

openjdk version "1.8.0_192"
 OpenJDK Runtime Environment (build 1.8.0_192-b12_openj9)
 Eclipse OpenJ9 VM (build openj9-0.11.0, JRE 1.8.0 Compressed References 
20181107_80 (JIT enabled, AOT enabled)
 OpenJ9 - 090ff9dcd
 OMR - ea548a66
 JCL - b5a3affe73 based on jdk8u192-b12)

 

Hadoop  Version

Hadoop 2.7.1
 Subversion Unknown -r Unknown
 Compiled by test on 2019-01-29T09:09Z
 Compiled with protoc 2.5.0
 From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
 This command was run using 
/home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar


  was:
Ubuntu 16.04 

Java Version

openjdk version "1.8.0_192"
OpenJDK Runtime Environment (build 1.8.0_192-b12_openj9)
Eclipse OpenJ9 VM (build openj9-0.11.0, JRE 1.8.0 Linux s390x-64-Bit Compressed 
References 20181107_80 (JIT enabled, AOT enabled)
OpenJ9 - 090ff9dcd
OMR - ea548a66
JCL - b5a3affe73 based on jdk8u192-b12)

 

Hadoop  Version

Hadoop 2.7.1
 Subversion Unknown -r Unknown
 Compiled by test on 2019-01-29T09:09Z
 Compiled with protoc 2.5.0
 From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
 This command was run using 
/home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar



> Testcases failing with "org.apache.hadoop.fs.ChecksumException" error
> -
>
> Key: SPARK-26796
> URL: https://issues.apache.org/jira/browse/SPARK-26796
> Project: Spark
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.3.2, 2.4.0
> Environment: Ubuntu 16.04 
> Java Version
> openjdk version "1.8.0_192"
>  OpenJDK Runtime Environment (build 1.8.0_192-b12_openj9)
>  Eclipse OpenJ9 VM (build openj9-0.11.0, JRE 1.8.0 Compressed References 
> 20181107_80 (JIT enabled, AOT enabled)
>  OpenJ9 - 090ff9dcd
>  OMR - ea548a66
>  JCL - b5a3affe73 based on jdk8u192-b12)
>  
> Hadoop  Version
> Hadoop 2.7.1
>  Subversion Unknown -r Unknown
>  Compiled by test on 2019-01-29T09:09Z
>  Compiled with protoc 2.5.0
>  From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
>  This command was run using 
> /home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar
>  
>  
>  
>Reporter: Anuja Jakhade
>Priority: Major
>
> Observing test case failures due to a checksum error.
> Below is the error log:
> [ERROR] checkpointAndComputation(test.org.apache.spark.JavaAPISuite) Time 
> elapsed: 1.232 s <<< ERROR!
> org.apache.spark.SparkException: 
> Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most 
> recent failure: Lost task 0.0 in stage 2.0 (TID 2, localhost, executor 
> driver): org.apache.hadoop.fs.ChecksumException: Checksum error: 
> file:/home/test/spark/core/target/tmp/1548319689411-0/fd0ba388-539c-49aa-bf76-e7d50aa2d1fc/rdd-0/part-0
>  at 0 exp: 222499834 got: 1400184476
>  at org.apache.hadoop.fs.FSInputChecker.verifySums(FSInputChecker.java:323)
>  at 
> org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:279)
>  at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:214)
>  at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:232)
>  at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:196)
>  at java.io.DataInputStream.read(DataInputStream.java:149)
>  at 
> java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2769)
>  at 
> java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2785)
>  at 
> java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3262)
>  at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:968)
>  at java.io.ObjectInputStream.(ObjectInputStream.java:390)
>  at 
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.(JavaSerializer.scala:63)
>  at 
> org.apache.spark.serializer.JavaDeserializationStream.(JavaSerializer.scala:63)
>  at 
> org.apache.spark.serializer.JavaSerializerInstance.deserializeStream(JavaSerializer.scala:122)
>  at 
> org.apache.spark.rdd.ReliableCheckpointRDD$.readCheckpointFile(ReliableCheckpointRDD.scala:300)
>  at 
> org.apache.spark.rdd.ReliableCheckpointRDD.compute(ReliableCheckpointRDD.scala:100)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:322)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:109)
>  at 

[jira] [Updated] (SPARK-26796) Testcases failing with "org.apache.hadoop.fs.ChecksumException" error

2019-01-31 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26796:
--
Environment: 
Ubuntu 16.04 

Java Version

openjdk version "1.8.0_191"
 OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12)
 OpenJDK 64-Bit Zero VM (build 25.191-b12, interpreted mode)

 

Hadoop  Version

Hadoop 2.7.1
 Subversion Unknown -r Unknown
 Compiled by test on 2019-01-29T09:09Z
 Compiled with protoc 2.5.0
 From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
 This command was run using 
/home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar


  was:
I am working on Ubuntu 16.04 

Java Version

openjdk version "1.8.0_191"
 OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12)
 OpenJDK 64-Bit Zero VM (build 25.191-b12, interpreted mode)

 

Hadoop  Version

Hadoop 2.7.1
 Subversion Unknown -r Unknown
 Compiled by test on 2019-01-29T09:09Z
 Compiled with protoc 2.5.0
 From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
 This command was run using 
/home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar



> Testcases failing with "org.apache.hadoop.fs.ChecksumException" error
> -
>
> Key: SPARK-26796
> URL: https://issues.apache.org/jira/browse/SPARK-26796
> Project: Spark
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.3.2, 2.4.0
> Environment: Ubuntu 16.04 
> Java Version
> openjdk version "1.8.0_191"
>  OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12)
>  OpenJDK 64-Bit Zero VM (build 25.191-b12, interpreted mode)
>  
> Hadoop  Version
> Hadoop 2.7.1
>  Subversion Unknown -r Unknown
>  Compiled by test on 2019-01-29T09:09Z
>  Compiled with protoc 2.5.0
>  From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
>  This command was run using 
> /home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar
>  
>  
>  
>Reporter: Anuja Jakhade
>Priority: Major
>
> Observing test case failures due to a checksum error.
> Below is the error log:
> [ERROR] checkpointAndComputation(test.org.apache.spark.JavaAPISuite) Time 
> elapsed: 1.232 s <<< ERROR!
> org.apache.spark.SparkException: 
> Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most 
> recent failure: Lost task 0.0 in stage 2.0 (TID 2, localhost, executor 
> driver): org.apache.hadoop.fs.ChecksumException: Checksum error: 
> file:/home/test/spark/core/target/tmp/1548319689411-0/fd0ba388-539c-49aa-bf76-e7d50aa2d1fc/rdd-0/part-0
>  at 0 exp: 222499834 got: 1400184476
>  at org.apache.hadoop.fs.FSInputChecker.verifySums(FSInputChecker.java:323)
>  at 
> org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:279)
>  at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:214)
>  at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:232)
>  at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:196)
>  at java.io.DataInputStream.read(DataInputStream.java:149)
>  at 
> java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2769)
>  at 
> java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2785)
>  at 
> java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3262)
>  at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:968)
>  at java.io.ObjectInputStream.(ObjectInputStream.java:390)
>  at 
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.(JavaSerializer.scala:63)
>  at 
> org.apache.spark.serializer.JavaDeserializationStream.(JavaSerializer.scala:63)
>  at 
> org.apache.spark.serializer.JavaSerializerInstance.deserializeStream(JavaSerializer.scala:122)
>  at 
> org.apache.spark.rdd.ReliableCheckpointRDD$.readCheckpointFile(ReliableCheckpointRDD.scala:300)
>  at 
> org.apache.spark.rdd.ReliableCheckpointRDD.compute(ReliableCheckpointRDD.scala:100)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:322)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:109)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:813)
> Driver stacktrace:
>  at 
> 

[jira] [Updated] (SPARK-26796) Testcases failing with "org.apache.hadoop.fs.ChecksumException" error

2019-01-31 Thread Anuja Jakhade (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuja Jakhade updated SPARK-26796:
--
Environment: 
I am working on Ubuntu 16.04 

Java Version

openjdk version "1.8.0_191"
 OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12)
 OpenJDK 64-Bit Zero VM (build 25.191-b12, interpreted mode)

 

Hadoop  Version

Hadoop 2.7.1
 Subversion Unknown -r Unknown
 Compiled by test on 2019-01-29T09:09Z
 Compiled with protoc 2.5.0
 From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
 This command was run using 
/home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar


  was:
I am working on Ubuntu 16.04 on s390x.

Java Version

openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12)
OpenJDK 64-Bit Zero VM (build 25.191-b12, interpreted mode)

 

Hadoop  Version

Hadoop 2.7.1
Subversion Unknown -r Unknown
Compiled by test on 2019-01-29T09:09Z
Compiled with protoc 2.5.0
From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
This command was run using 
/home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar



> Testcases failing with "org.apache.hadoop.fs.ChecksumException" error
> -
>
> Key: SPARK-26796
> URL: https://issues.apache.org/jira/browse/SPARK-26796
> Project: Spark
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.3.2, 2.4.0
> Environment: I am working on Ubuntu 16.04 
> Java Version
> openjdk version "1.8.0_191"
>  OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12)
>  OpenJDK 64-Bit Zero VM (build 25.191-b12, interpreted mode)
>  
> Hadoop  Version
> Hadoop 2.7.1
>  Subversion Unknown -r Unknown
>  Compiled by test on 2019-01-29T09:09Z
>  Compiled with protoc 2.5.0
>  From source with checksum 5e94a235f9a71834e2eb73fb36ee873f
>  This command was run using 
> /home/test/hadoop-release-2.7.1/hadoop-dist/target/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar
>  
>  
>  
>Reporter: Anuja Jakhade
>Priority: Major
>
> Observing test case failures due to a checksum error.
> Below is the error log:
> [ERROR] checkpointAndComputation(test.org.apache.spark.JavaAPISuite) Time 
> elapsed: 1.232 s <<< ERROR!
> org.apache.spark.SparkException: 
> Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most 
> recent failure: Lost task 0.0 in stage 2.0 (TID 2, localhost, executor 
> driver): org.apache.hadoop.fs.ChecksumException: Checksum error: 
> file:/home/test/spark/core/target/tmp/1548319689411-0/fd0ba388-539c-49aa-bf76-e7d50aa2d1fc/rdd-0/part-0
>  at 0 exp: 222499834 got: 1400184476
>  at org.apache.hadoop.fs.FSInputChecker.verifySums(FSInputChecker.java:323)
>  at 
> org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:279)
>  at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:214)
>  at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:232)
>  at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:196)
>  at java.io.DataInputStream.read(DataInputStream.java:149)
>  at 
> java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2769)
>  at 
> java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2785)
>  at 
> java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3262)
>  at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:968)
>  at java.io.ObjectInputStream.(ObjectInputStream.java:390)
>  at 
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.(JavaSerializer.scala:63)
>  at 
> org.apache.spark.serializer.JavaDeserializationStream.(JavaSerializer.scala:63)
>  at 
> org.apache.spark.serializer.JavaSerializerInstance.deserializeStream(JavaSerializer.scala:122)
>  at 
> org.apache.spark.rdd.ReliableCheckpointRDD$.readCheckpointFile(ReliableCheckpointRDD.scala:300)
>  at 
> org.apache.spark.rdd.ReliableCheckpointRDD.compute(ReliableCheckpointRDD.scala:100)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:322)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:109)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:813)
> Driver