[jira] [Commented] (SPARK-25688) Potential resource leak in ORC
[ https://issues.apache.org/jira/browse/SPARK-25688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16643926#comment-16643926 ]

Dongjoon Hyun commented on SPARK-25688:
---------------------------------------

[~smilegator]. This is a duplicate of SPARK-23390. Please see SPARK-23390. We observed `parquet` failures, too. We see the ORC error more frequently because ORC is tested first, so it ends up hiding similar Parquet failures.
{code:java}
private val allFileBasedDataSources = Seq("orc", "parquet", "csv", "json", "text"){code}

> Potential resource leak in ORC
> ------------------------------
>
>                 Key: SPARK-25688
>                 URL: https://issues.apache.org/jira/browse/SPARK-25688
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Xiao Li
>            Assignee: Dongjoon Hyun
>            Priority: Critical
>
> http://spark-tests.appspot.com/test-details?suite_name=org.apache.spark.sql.FileBasedDataSourceSuite_name=%28It+is+not+a+test+it+is+a+sbt.testing.SuiteSelector%29
>
> All the test failures are caused by ORC internals.
> {code}
> org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to eventually never returned normally. Attempted 15 times over 10.019369471 seconds. Last failure message: There are 1 possibly leaked file streams..
> sbt.ForkMain$ForkError: org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to eventually never returned normally. Attempted 15 times over 10.019369471 seconds. Last failure message: There are 1 possibly leaked file streams..
>     at org.scalatest.concurrent.Eventually$class.tryTryAgain$1(Eventually.scala:421)
>     at org.scalatest.concurrent.Eventually$class.eventually(Eventually.scala:439)
>     at org.apache.spark.sql.FileBasedDataSourceSuite.eventually(FileBasedDataSourceSuite.scala:37)
>     at org.scalatest.concurrent.Eventually$class.eventually(Eventually.scala:308)
>     at org.apache.spark.sql.FileBasedDataSourceSuite.eventually(FileBasedDataSourceSuite.scala:37)
>     at org.apache.spark.sql.test.SharedSparkSession$class.afterEach(SharedSparkSession.scala:132)
>     at org.apache.spark.sql.FileBasedDataSourceSuite.afterEach(FileBasedDataSourceSuite.scala:37)
>     at org.scalatest.BeforeAndAfterEach$$anonfun$1.apply$mcV$sp(BeforeAndAfterEach.scala:234)
>     at org.scalatest.Status$$anonfun$withAfterEffect$1.apply(Status.scala:379)
>     at org.scalatest.Status$$anonfun$withAfterEffect$1.apply(Status.scala:375)
>     at org.scalatest.SucceededStatus$.whenCompleted(Status.scala:454)
>     at org.scalatest.Status$class.withAfterEffect(Status.scala:375)
>     at org.scalatest.SucceededStatus$.withAfterEffect(Status.scala:426)
>     at org.scalatest.BeforeAndAfterEach$class.runTest(BeforeAndAfterEach.scala:232)
>     at org.apache.spark.sql.FileBasedDataSourceSuite.runTest(FileBasedDataSourceSuite.scala:37)
>     at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>     at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>     at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396)
>     at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384)
>     at scala.collection.immutable.List.foreach(List.scala:392)
>     at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
>     at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379)
>     at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
>     at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:229)
>     at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
>     at org.scalatest.Suite$class.run(Suite.scala:1147)
>     at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
>     at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>     at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>     at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
>     at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:233)
>     at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:52)
>     at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:213)
>     at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
>     at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:52)
>     at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:314)
>     at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:480)
>     at sbt.ForkMain$Run$2.call(ForkMain.java:296)
>     at
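
To picture the ordering effect described in the comment above, here is a minimal Scala sketch. It assumes, purely for illustration, that per-format checks run in list order and stop at the first failure; the thread does not confirm that this is the exact mechanism in `FileBasedDataSourceSuite`. Only the `allFileBasedDataSources` list comes from the comment; `leakyFormats`, `check`, `failFast`, and `collectAll` are hypothetical names.

```scala
object FirstFailureMasksOthers {
  // Same ordering as the list quoted in the comment above.
  val allFileBasedDataSources: Seq[String] = Seq("orc", "parquet", "csv", "json", "text")

  // Assumption for this sketch: both ORC and Parquet would leak a stream.
  val leakyFormats: Set[String] = Set("orc", "parquet")

  // Hypothetical per-format check that fails for a leaky format.
  def check(format: String): Unit =
    if (leakyFormats.contains(format)) {
      throw new IllegalStateException(s"possibly leaked file stream in $format")
    }

  // Fail-fast: stops at the first failing format, so later failures are never reported.
  def failFast(formats: Seq[String]): Option[String] =
    try {
      formats.foreach(check)
      None
    } catch {
      case e: IllegalStateException => Some(e.getMessage)
    }

  // Collect-all: runs every format and reports each failure separately.
  def collectAll(formats: Seq[String]): Seq[String] =
    formats.flatMap { f =>
      try { check(f); None } catch { case e: IllegalStateException => Some(e.getMessage) }
    }

  def main(args: Array[String]): Unit = {
    println(failFast(allFileBasedDataSources))
    println(collectAll(allFileBasedDataSources))
  }
}
```

Under these assumptions, the fail-fast variant reports only the ORC failure, while the collect-all variant also surfaces the Parquet one. Reordering the list so that `"parquet"` comes first would flip which format appears to be at fault.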
[jira] [Commented] (SPARK-25688) Potential resource leak in ORC
[ https://issues.apache.org/jira/browse/SPARK-25688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16643910#comment-16643910 ]

Xiao Li commented on SPARK-25688:
---------------------------------

https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/5015/consoleText

Is this from `Enabling/disabling ignoreMissingFiles using orc`?

> Potential resource leak in ORC
> ------------------------------
>
>                 Key: SPARK-25688
>                 URL: https://issues.apache.org/jira/browse/SPARK-25688
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Xiao Li
>            Assignee: Dongjoon Hyun
>            Priority: Critical
>
> http://spark-tests.appspot.com/test-details?suite_name=org.apache.spark.sql.FileBasedDataSourceSuite_name=%28It+is+not+a+test+it+is+a+sbt.testing.SuiteSelector%29
>
> All the test failures are caused by ORC internals.
> {code}
> org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to eventually never returned normally. Attempted 15 times over 10.019369471 seconds. Last failure message: There are 1 possibly leaked file streams..
> sbt.ForkMain$ForkError: org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to eventually never returned normally. Attempted 15 times over 10.019369471 seconds. Last failure message: There are 1 possibly leaked file streams..
>     at org.scalatest.concurrent.Eventually$class.tryTryAgain$1(Eventually.scala:421)
>     at org.scalatest.concurrent.Eventually$class.eventually(Eventually.scala:439)
>     at org.apache.spark.sql.FileBasedDataSourceSuite.eventually(FileBasedDataSourceSuite.scala:37)
>     at org.scalatest.concurrent.Eventually$class.eventually(Eventually.scala:308)
>     at org.apache.spark.sql.FileBasedDataSourceSuite.eventually(FileBasedDataSourceSuite.scala:37)
>     at org.apache.spark.sql.test.SharedSparkSession$class.afterEach(SharedSparkSession.scala:132)
>     at org.apache.spark.sql.FileBasedDataSourceSuite.afterEach(FileBasedDataSourceSuite.scala:37)
>     at org.scalatest.BeforeAndAfterEach$$anonfun$1.apply$mcV$sp(BeforeAndAfterEach.scala:234)
>     at org.scalatest.Status$$anonfun$withAfterEffect$1.apply(Status.scala:379)
>     at org.scalatest.Status$$anonfun$withAfterEffect$1.apply(Status.scala:375)
>     at org.scalatest.SucceededStatus$.whenCompleted(Status.scala:454)
>     at org.scalatest.Status$class.withAfterEffect(Status.scala:375)
>     at org.scalatest.SucceededStatus$.withAfterEffect(Status.scala:426)
>     at org.scalatest.BeforeAndAfterEach$class.runTest(BeforeAndAfterEach.scala:232)
>     at org.apache.spark.sql.FileBasedDataSourceSuite.runTest(FileBasedDataSourceSuite.scala:37)
>     at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>     at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>     at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396)
>     at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384)
>     at scala.collection.immutable.List.foreach(List.scala:392)
>     at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
>     at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379)
>     at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
>     at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:229)
>     at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
>     at org.scalatest.Suite$class.run(Suite.scala:1147)
>     at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
>     at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>     at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>     at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
>     at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:233)
>     at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:52)
>     at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:213)
>     at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
>     at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:52)
>     at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:314)
>     at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:480)
>     at sbt.ForkMain$Run$2.call(ForkMain.java:296)
>     at sbt.ForkMain$Run$2.call(ForkMain.java:286)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[jira] [Commented] (SPARK-25688) Potential resource leak in ORC
[ https://issues.apache.org/jira/browse/SPARK-25688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16643870#comment-16643870 ]

Xiao Li commented on SPARK-25688:
---------------------------------

It sounds like ORC still has a resource leak even after the latest version upgrade.

> Potential resource leak in ORC
> ------------------------------
>
>                 Key: SPARK-25688
>                 URL: https://issues.apache.org/jira/browse/SPARK-25688
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Xiao Li
>            Assignee: Dongjoon Hyun
>            Priority: Critical
>
> http://spark-tests.appspot.com/test-details?suite_name=org.apache.spark.sql.FileBasedDataSourceSuite_name=%28It+is+not+a+test+it+is+a+sbt.testing.SuiteSelector%29
>
> All the test failures are caused by ORC internals.
> {code}
> org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to eventually never returned normally. Attempted 15 times over 10.019369471 seconds. Last failure message: There are 1 possibly leaked file streams..
> sbt.ForkMain$ForkError: org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to eventually never returned normally. Attempted 15 times over 10.019369471 seconds. Last failure message: There are 1 possibly leaked file streams..
>     at org.scalatest.concurrent.Eventually$class.tryTryAgain$1(Eventually.scala:421)
>     at org.scalatest.concurrent.Eventually$class.eventually(Eventually.scala:439)
>     at org.apache.spark.sql.FileBasedDataSourceSuite.eventually(FileBasedDataSourceSuite.scala:37)
>     at org.scalatest.concurrent.Eventually$class.eventually(Eventually.scala:308)
>     at org.apache.spark.sql.FileBasedDataSourceSuite.eventually(FileBasedDataSourceSuite.scala:37)
>     at org.apache.spark.sql.test.SharedSparkSession$class.afterEach(SharedSparkSession.scala:132)
>     at org.apache.spark.sql.FileBasedDataSourceSuite.afterEach(FileBasedDataSourceSuite.scala:37)
>     at org.scalatest.BeforeAndAfterEach$$anonfun$1.apply$mcV$sp(BeforeAndAfterEach.scala:234)
>     at org.scalatest.Status$$anonfun$withAfterEffect$1.apply(Status.scala:379)
>     at org.scalatest.Status$$anonfun$withAfterEffect$1.apply(Status.scala:375)
>     at org.scalatest.SucceededStatus$.whenCompleted(Status.scala:454)
>     at org.scalatest.Status$class.withAfterEffect(Status.scala:375)
>     at org.scalatest.SucceededStatus$.withAfterEffect(Status.scala:426)
>     at org.scalatest.BeforeAndAfterEach$class.runTest(BeforeAndAfterEach.scala:232)
>     at org.apache.spark.sql.FileBasedDataSourceSuite.runTest(FileBasedDataSourceSuite.scala:37)
>     at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>     at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>     at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396)
>     at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384)
>     at scala.collection.immutable.List.foreach(List.scala:392)
>     at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
>     at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379)
>     at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
>     at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:229)
>     at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
>     at org.scalatest.Suite$class.run(Suite.scala:1147)
>     at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
>     at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>     at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>     at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
>     at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:233)
>     at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:52)
>     at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:213)
>     at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
>     at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:52)
>     at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:314)
>     at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:480)
>     at sbt.ForkMain$Run$2.call(ForkMain.java:296)
>     at sbt.ForkMain$Run$2.call(ForkMain.java:286)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at
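
Judging from the trace quoted above, the failure message comes from a post-test check (`afterEach`) that retries for a while and fails if file streams are still open. As a rough sketch of that idea only, and not Spark's actual implementation, the following Scala code tracks open streams in a hypothetical registry and re-checks a few times before failing. All names here (`LeakCheckSketch`, `open`, `assertNoOpenStreams`) are assumptions introduced for illustration.

```scala
import java.io.{ByteArrayInputStream, Closeable, InputStream}
import java.util.concurrent.ConcurrentHashMap

object LeakCheckSketch {
  // Hypothetical registry of streams that were opened and not yet closed.
  private val openStreams = ConcurrentHashMap.newKeySet[Closeable]()

  // Opens an in-memory stream that unregisters itself when closed.
  def open(bytes: Array[Byte]): InputStream = {
    val stream = new ByteArrayInputStream(bytes) {
      override def close(): Unit = {
        openStreams.remove(this)
        super.close()
      }
    }
    openStreams.add(stream)
    stream
  }

  // Re-checks the registry up to `attempts` times, sleeping between checks,
  // and throws if any stream is still registered after the last check.
  def assertNoOpenStreams(attempts: Int, intervalMillis: Long): Unit = {
    var remaining = attempts
    while (!openStreams.isEmpty && remaining > 1) {
      Thread.sleep(intervalMillis)
      remaining -= 1
    }
    val leaked = openStreams.size
    if (leaked > 0) {
      throw new IllegalStateException(s"There are $leaked possibly leaked file streams.")
    }
  }

  def main(args: Array[String]): Unit = {
    val stream = open(Array[Byte](1, 2, 3))
    stream.close() // comment this out to make the check below fail
    assertNoOpenStreams(attempts = 3, intervalMillis = 10L)
  }
}
```

The retry loop matters because a reader may close its stream slightly after the test body returns; a check that fails only after repeated attempts, as in the quoted trace, points to a stream that is never closed rather than one that is closed late.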