[
https://issues.apache.org/jira/browse/SPARK-18830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-18830:
------------------------------------
Assignee: Apache Spark
> Fix tests in PipedRDDSuite to pass on Windows
> ---------------------------------------------
>
> Key: SPARK-18830
> URL: https://issues.apache.org/jira/browse/SPARK-18830
> Project: Spark
> Issue Type: Sub-task
> Components: Tests
> Reporter: Hyukjin Kwon
> Assignee: Apache Spark
> Priority: Minor
>
> - {{PipedRDDSuite}}
> {code}
> [info] - pipe with empty partition *** FAILED *** (672 milliseconds)
> [info] Set(0, 4, 5) did not equal Set(0, 5, 6) (PipedRDDSuite.scala:145)
> [info] org.scalatest.exceptions.TestFailedException:
> [info] at org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500)
> [info] at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555)
> [info] at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466)
> [info] at org.apache.spark.rdd.PipedRDDSuite$$anonfun$5.apply$mcV$sp(PipedRDDSuite.scala:145)
> [info] at org.apache.spark.rdd.PipedRDDSuite$$anonfun$5.apply(PipedRDDSuite.scala:140)
> [info] at org.apache.spark.rdd.PipedRDDSuite$$anonfun$5.apply(PipedRDDSuite.scala:140)
> ...
> {code}
> In this case, {{wc -c}} counts the characters, but the newline on Windows
> is {{\r\n}}, which is two characters rather than one, so each count ends
> up one higher than expected.
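The off-by-one can be reproduced outside Spark: the same logical line is one byte longer with a Windows line ending, which is exactly the difference {{wc -c}} reports. A minimal Python illustration (not part of the test suite):

```python
# A Unix-style line ends in "\n" (1 byte); a Windows-style line ends
# in "\r\n" (2 bytes), so byte counts differ by one per line.
unix_line = "hello\n"
windows_line = "hello\r\n"

print(len(unix_line.encode("ascii")))     # 6
print(len(windows_line.encode("ascii")))  # 7
```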
> {code}
> [info] - test pipe exports map_input_file *** FAILED *** (62 milliseconds)
> [info] java.lang.IllegalStateException: Subprocess exited with status 1. Command ran: printenv map_input_file
> [info] at org.apache.spark.rdd.PipedRDD$$anon$1.hasNext(PipedRDD.scala:178)
> [info] at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> [info] at org.apache.spark.rdd.PipedRDD$$anon$1.foreach(PipedRDD.scala:163)
> [info] at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
> [info] at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
> [info] at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
> [info] at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
> [info] at org.apache.spark.rdd.PipedRDD$$anon$1.to(PipedRDD.scala:163)
> [info] at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
> [info] at org.apache.spark.rdd.PipedRDD$$anon$1.toBuffer(PipedRDD.scala:163)
> [info] at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
> [info] at org.apache.spark.rdd.PipedRDD$$anon$1.toArray(PipedRDD.scala:163)
> [info] at org.apache.spark.rdd.PipedRDDSuite.testExportInputFile(PipedRDDSuite.scala:247)
> [info] at org.apache.spark.rdd.PipedRDDSuite$$anonfun$10.apply$mcV$sp(PipedRDDSuite.scala:209)
> [info] at org.apache.spark.rdd.PipedRDDSuite$$anonfun$10.apply(PipedRDDSuite.scala:209)
> [info] at org.apache.spark.rdd.PipedRDDSuite$$anonfun$10.apply(PipedRDDSuite.scala:209)
> ...
> {code}
> {code}
> [info] - test pipe exports mapreduce_map_input_file *** FAILED *** (172 milliseconds)
> [info] java.lang.IllegalStateException: Subprocess exited with status 1. Command ran: printenv mapreduce_map_input_file
> [info] at org.apache.spark.rdd.PipedRDD$$anon$1.hasNext(PipedRDD.scala:178)
> [info] at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> [info] at org.apache.spark.rdd.PipedRDD$$anon$1.foreach(PipedRDD.scala:163)
> [info] at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
> [info] at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
> [info] at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
> [info] at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
> [info] at org.apache.spark.rdd.PipedRDD$$anon$1.to(PipedRDD.scala:163)
> [info] at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
> [info] at org.apache.spark.rdd.PipedRDD$$anon$1.toBuffer(PipedRDD.scala:163)
> [info] at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
> [info] at org.apache.spark.rdd.PipedRDD$$anon$1.toArray(PipedRDD.scala:163)
> [info] at org.apache.spark.rdd.PipedRDDSuite.testExportInputFile(PipedRDDSuite.scala:247)
> [info] at org.apache.spark.rdd.PipedRDDSuite$$anonfun$11.apply$mcV$sp(PipedRDDSuite.scala:213)
> [info] at org.apache.spark.rdd.PipedRDDSuite$$anonfun$11.apply(PipedRDDSuite.scala:213)
> [info] at org.apache.spark.rdd.PipedRDDSuite$$anonfun$11.apply(PipedRDDSuite.scala:213)
> ...
> {code}
> For both tests above, the failures are due to incorrect behaviour of
> {{printenv}} on Windows, which is (I believe) a third-party tool that
> resembles {{printenv}} on Linux (installed in the AppVeyor environment or
> Windows Server 2012 R2). The command does not exist on, at least,
> Windows 7 and 10.
> This command prints environment variables; however, when environment
> variables are set on {{ProcessBuilder}} with lower-cased keys,
> {{printenv}} on Windows ignores them even though they are actually set
> and accessible. Upper-cased keys work fine.
> On Windows, we can officially use {{cmd.exe /C set [varname]}} for this
> purpose. We could fix the tests to use this to check whether the
> environment variable is set.
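The proposed check can be sketched in Python (a hypothetical helper for illustration, not the actual Scala test code): probe the child process's view of a variable with {{cmd.exe /C set [varname]}} on Windows and {{printenv}} elsewhere, relying on the non-zero exit status when the variable is unset.

```python
import os
import subprocess
import sys

def env_var_visible(name, extra_env):
    """Return True if a child process can see environment variable `name`.

    Hypothetical helper mirroring the proposed test fix: on Windows,
    `cmd.exe /C set NAME` matches case-insensitively and exits 1 when
    nothing matches; elsewhere, `printenv NAME` exits 1 when NAME is unset.
    """
    env = dict(os.environ, **extra_env)
    if sys.platform == "win32":
        cmd = ["cmd.exe", "/C", "set", name]
    else:
        cmd = ["printenv", name]
    result = subprocess.run(cmd, env=env, stdout=subprocess.DEVNULL)
    return result.returncode == 0
```

The exit-status convention is the same one the original tests already rely on: {{PipedRDD}} raises {{IllegalStateException}} when the piped command exits non-zero.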
> For the full logs, please refer to
> https://ci.appveyor.com/project/spark-test/spark/build/156-tmp-windows-base
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]