[
https://issues.apache.org/jira/browse/SPARK-18803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon updated SPARK-18803:
---------------------------------
Description:
Several tests fail on Windows, for the reasons shown below.
*Incorrect path handling*
- {{FileSuite}}
{code}
[info] - binary file input as byte array *** FAILED *** (500 milliseconds)
[info]
"file:/C:/projects/spark/target/tmp/spark-e7c3a3b8-0a4b-4a7f-9ebe-7c4883e48624/record-bytestream-00000.bin"
did not contain
"C:\projects\spark\target\tmp\spark-e7c3a3b8-0a4b-4a7f-9ebe-7c4883e48624\record-bytestream-00000.bin"
(FileSuite.scala:258)
[info] org.scalatest.exceptions.TestFailedException:
[info] at
org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500)
[info] at
org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555)
[info] at
org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466)
[info] at
org.apache.spark.FileSuite$$anonfun$14.apply$mcV$sp(FileSuite.scala:258)
[info] at org.apache.spark.FileSuite$$anonfun$14.apply(FileSuite.scala:239)
[info] at org.apache.spark.FileSuite$$anonfun$14.apply(FileSuite.scala:239)
...
{code}
{code}
[info] - Get input files via old Hadoop API *** FAILED *** (1 second, 94
milliseconds)
[info]
Set("/C:/projects/spark/target/tmp/spark-cf5b1f8b-c5ed-43e0-8d17-546ebbfa8200/output/part-00000",
"/C:/projects/spark/target/tmp/spark-cf5b1f8b-c5ed-43e0-8d17-546ebbfa8200/output/part-00001")
did not equal
Set("C:\projects\spark\target\tmp\spark-cf5b1f8b-c5ed-43e0-8d17-546ebbfa8200\output/part-00000",
"C:\projects\spark\target\tmp\spark-cf5b1f8b-c5ed-43e0-8d17-546ebbfa8200\output/part-00001")
(FileSuite.scala:535)
[info] org.scalatest.exceptions.TestFailedException:
[info] at
org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500)
[info] at
org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555)
[info] at
org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466)
[info] at
org.apache.spark.FileSuite$$anonfun$29.apply$mcV$sp(FileSuite.scala:535)
[info] at org.apache.spark.FileSuite$$anonfun$29.apply(FileSuite.scala:524)
[info] at org.apache.spark.FileSuite$$anonfun$29.apply(FileSuite.scala:524)
...
{code}
{code}
[info] - Get input files via new Hadoop API *** FAILED *** (313 milliseconds)
[info]
Set("/C:/projects/spark/target/tmp/spark-12bc1540-1111-4df6-9c4d-79e0e614407c/output/part-00000",
"/C:/projects/spark/target/tmp/spark-12bc1540-1111-4df6-9c4d-79e0e614407c/output/part-00001")
did not equal
Set("C:\projects\spark\target\tmp\spark-12bc1540-1111-4df6-9c4d-79e0e614407c\output/part-00000",
"C:\projects\spark\target\tmp\spark-12bc1540-1111-4df6-9c4d-79e0e614407c\output/part-00001")
(FileSuite.scala:549)
[info] org.scalatest.exceptions.TestFailedException:
[info] at
org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500)
[info] at
org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555)
[info] at
org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466)
[info] at
org.apache.spark.FileSuite$$anonfun$30.apply$mcV$sp(FileSuite.scala:549)
[info] at org.apache.spark.FileSuite$$anonfun$30.apply(FileSuite.scala:538)
[info] at org.apache.spark.FileSuite$$anonfun$30.apply(FileSuite.scala:538)
...
{code}
- {{TaskResultGetterSuite}}
{code}
[info] - handling results larger than max RPC message size *** FAILED *** (1
second, 579 milliseconds)
[info] 1 did not equal 0 Expect result to be removed from the block manager.
(TaskResultGetterSuite.scala:129)
[info] org.scalatest.exceptions.TestFailedException:
[info] at
org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500)
[info] at
org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555)
[info] at
org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$4.apply$mcV$sp(TaskResultGetterSuite.scala:129)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$4.apply(TaskResultGetterSuite.scala:121)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$4.apply(TaskResultGetterSuite.scala:121)
[info] ...
[info] Cause: java.net.URISyntaxException: Illegal character in path at index
12:
string:///C:\projects\spark\target\tmp\spark-93c485af-68da-440f-a907-aac7acd5fc25\repro\MyException.java
[info] at java.net.URI$Parser.fail(URI.java:2848)
[info] at java.net.URI$Parser.checkChars(URI.java:3021)
[info] at java.net.URI$Parser.parseHierarchical(URI.java:3105)
[info] at java.net.URI$Parser.parse(URI.java:3053)
[info] at java.net.URI.<init>(URI.java:588)
[info] at java.net.URI.create(URI.java:850)
[info] at
org.apache.spark.TestUtils$.org$apache$spark$TestUtils$$createURI(TestUtils.scala:112)
[info] at
org.apache.spark.TestUtils$JavaSourceFromString.<init>(TestUtils.scala:116)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$6.apply$mcV$sp(TaskResultGetterSuite.scala:174)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$6.apply(TaskResultGetterSuite.scala:169)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$6.apply(TaskResultGetterSuite.scala:169)
[info] at
org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
[info] at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
...
{code}
{code}
[info] - failed task deserialized with the correct classloader (SPARK-11195)
*** FAILED *** (0 milliseconds)
[info] java.lang.IllegalArgumentException: Illegal character in path at index
12:
string:///C:\projects\spark\target\tmp\spark-93c485af-68da-440f-a907-aac7acd5fc25\repro\MyException.java
[info] at java.net.URI.create(URI.java:852)
[info] at
org.apache.spark.TestUtils$.org$apache$spark$TestUtils$$createURI(TestUtils.scala:112)
[info] at
org.apache.spark.TestUtils$JavaSourceFromString.<init>(TestUtils.scala:116)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$6.apply$mcV$sp(TaskResultGetterSuite.scala:174)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$6.apply(TaskResultGetterSuite.scala:169)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$6.apply(TaskResultGetterSuite.scala:169)
...
{code}
- {{SparkSubmitSuite}}
{code}
[info] java.lang.IllegalArgumentException: Illegal character in path at index
12: string:///C:\projects\spark\target\tmp\1481210831381-0\870903339\MyLib.java
[info] at java.net.URI.create(URI.java:852)
[info] at
org.apache.spark.TestUtils$.org$apache$spark$TestUtils$$createURI(TestUtils.scala:112)
[info] at
org.apache.spark.TestUtils$JavaSourceFromString.<init>(TestUtils.scala:116)
[info] at
org.apache.spark.deploy.IvyTestUtils$.createJavaClass(IvyTestUtils.scala:145)
[info] at
org.apache.spark.deploy.IvyTestUtils$.org$apache$spark$deploy$IvyTestUtils$$createLocalRepository(IvyTestUtils.scala:302)
[info] at
org.apache.spark.deploy.IvyTestUtils$.createLocalRepositoryForTests(IvyTestUtils.scala:341)
[info] at
org.apache.spark.deploy.IvyTestUtils$.withRepository(IvyTestUtils.scala:368)
[info] at
org.apache.spark.deploy.SparkSubmitSuite$$anonfun$18.apply$mcV$sp(SparkSubmitSuite.scala:412)
[info] at
org.apache.spark.deploy.SparkSubmitSuite$$anonfun$18.apply(SparkSubmitSuite.scala:408)
[info] at
org.apache.spark.deploy.SparkSubmitSuite$$anonfun$18.apply(SparkSubmitSuite.scala:408)
...
{code}
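The failures above all come from embedding a raw Windows path (with backslashes) in a URI string: {{java.net.URI}} rejects `\` in the path component, which is exactly the "Illegal character in path at index 12" seen in the traces. A minimal sketch of the failure mode and the normalization fix (the class and helper names here are hypothetical, not Spark's actual code):

```java
import java.net.URI;

// Illustrates why URI.create("string:///" + windowsPath) fails, and how
// normalizing separators to '/' avoids it. Hypothetical helper, not Spark code.
public class WindowsUriDemo {

    // Convert an OS path to a URI-safe path: forward slashes, leading '/'.
    public static String toUriPath(String osPath) {
        String normalized = osPath.replace('\\', '/');
        return normalized.startsWith("/") ? normalized : "/" + normalized;
    }

    public static void main(String[] args) {
        String winPath = "C:\\projects\\spark\\repro\\MyException.java";

        // "string:///C:\..." has a backslash at index 12, which URI rejects.
        boolean failed = false;
        try {
            URI.create("string:///" + winPath);
        } catch (IllegalArgumentException e) {
            failed = true; // "Illegal character in path at index 12: ..."
        }
        System.out.println("raw Windows path rejected: " + failed);

        // After normalization, parsing succeeds.
        URI ok = URI.create("string://" + toUriPath(winPath));
        System.out.println("normalized URI: " + ok);
    }
}
```

On Linux the raw path already uses `/`, so these code paths only break on Windows, which is why the suites pass elsewhere.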
*Correct separator for JarEntry*
After the path fix above, {{TaskResultGetterSuite}} throws another exception:
- {{TaskResultGetterSuite}}
{code}
[info] - failed task deserialized with the correct classloader (SPARK-11195)
*** FAILED *** (907 milliseconds)
[info] java.lang.ClassNotFoundException: repro.MyException
[info] at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
[info] at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
[info] at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
[info] at java.lang.Class.forName0(Native Method)
[info] at java.lang.Class.forName(Class.java:348)
[info] at org.apache.spark.util.Utils$.classForName(Utils.scala:229)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$6$$anonfun$apply$mcV$sp$1.apply$mcV$sp(TaskResultGetterSuite.scala:191)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$6$$anonfun$apply$mcV$sp$1.apply(TaskResultGetterSuite.scala:187)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$6$$anonfun$apply$mcV$sp$1.apply(TaskResultGetterSuite.scala:187)
[info] at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1303)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$6.apply$mcV$sp(TaskResultGetterSuite.scala:212)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$6.apply(TaskResultGetterSuite.scala:169)
[info] at
org.apache.spark.scheduler.TaskResultGetterSuite$$anonfun$6.apply(TaskResultGetterSuite.scala:169)
{code}
This is because {{Paths.get}} concatenates the given paths using the OS-specific
separator (\ on Windows, / on Linux). However, a {{JarEntry}} name must always
use /, per {{4.4.17 file name: (Variable)}} in the ZIP specification:
https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
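A minimal sketch of the separator issue, assuming a hypothetical helper (this is not Spark's actual fix): when a {{Path}} built via {{Paths.get}} is turned into a jar entry name, its name elements should be re-joined with `/` explicitly rather than via {{toString}}, which uses the platform separator.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarEntry;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

// Paths.get joins with the platform separator ('\' on Windows), but ZIP/JAR
// entry names must always use '/'. Hypothetical helper, not Spark code.
public class JarEntryNameDemo {

    // Join a relative Path's name elements with '/', regardless of OS.
    public static String toEntryName(Path relative) {
        return StreamSupport.stream(relative.spliterator(), false)
                .map(Path::toString)
                .collect(Collectors.joining("/"));
    }

    public static void main(String[] args) {
        Path p = Paths.get("repro", "MyException.class");
        // On Windows p.toString() is "repro\MyException.class"; the entry
        // name must still be "repro/MyException.class" or class lookup
        // inside the jar fails with ClassNotFoundException.
        JarEntry entry = new JarEntry(toEntryName(p));
        System.out.println(entry.getName());
    }
}
```

With a `\`-separated entry name, {{URLClassLoader}} cannot resolve `repro.MyException` inside the jar, matching the {{ClassNotFoundException}} above.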
*Long path problem on Windows*
Some tests in {{ShuffleSuite}} (via {{ShuffleNettySuite}}) are skipped for the
same reason as SPARK-18718.
> Fix path-related and JarEntry-related test failures and skip some tests
> failed on Windows due to path length limitation
> -----------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-18803
> URL: https://issues.apache.org/jira/browse/SPARK-18803
> Project: Spark
> Issue Type: Sub-task
> Components: Tests
> Reporter: Hyukjin Kwon
> Priority: Minor
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)