[jira] [Commented] (SPARK-28759) Upgrade scala-maven-plugin to 4.2.0
[ https://issues.apache.org/jira/browse/SPARK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920028#comment-16920028 ] Hyukjin Kwon commented on SPARK-28759: -- (let me move this back under SPARK-24417, since 4.2.0 now has the https://github.com/davidB/scala-maven-plugin/pull/358 fix) > Upgrade scala-maven-plugin to 4.2.0 > --- > > Key: SPARK-28759 > URL: https://issues.apache.org/jira/browse/SPARK-28759 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.0.0 >Reporter: Dongjoon Hyun >Assignee: Hyukjin Kwon >Priority: Major > Fix For: 3.0.0 > > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-28759) Upgrade scala-maven-plugin to 4.2.0
[ https://issues.apache.org/jira/browse/SPARK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28759: - Parent: SPARK-24417 Issue Type: Sub-task (was: Improvement) > Upgrade scala-maven-plugin to 4.2.0 > --- > > Key: SPARK-28759 > URL: https://issues.apache.org/jira/browse/SPARK-28759 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.0.0 >Reporter: Dongjoon Hyun >Assignee: Hyukjin Kwon >Priority: Major > Fix For: 3.0.0 > >
[jira] [Comment Edited] (SPARK-28025) HDFSBackedStateStoreProvider should not leak .crc files
[ https://issues.apache.org/jira/browse/SPARK-28025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920018#comment-16920018 ] Jungtaek Lim edited comment on SPARK-28025 at 8/31/19 4:30 AM: --- FYI, I just submitted a patch for HADOOP-16255. Hope we can get rid of workaround sooner. was (Author: kabhwan): FYI, I just submitted a patch for HADOOP-16255--. Hope we can get rid of workaround sooner. > HDFSBackedStateStoreProvider should not leak .crc files > > > Key: SPARK-28025 > URL: https://issues.apache.org/jira/browse/SPARK-28025 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.3 > Environment: Spark 2.4.3 > Kubernetes 1.11(?) (OpenShift) > StateStore storage on a mounted PVC. Viewed as a local filesystem by the > `FileContextBasedCheckpointFileManager` : > {noformat} > scala> glusterfm.isLocal > res17: Boolean = true{noformat} >Reporter: Gerard Maas >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.4, 3.0.0 > > > The HDFSBackedStateStoreProvider when using the default CheckpointFileManager > is leaving '.crc' files behind. There's a .crc file created for each > `atomicFile` operation of the CheckpointFileManager. > Over time, the number of files becomes very large. It makes the state store > file system constantly increase in size and, in our case, deteriorates the > file system performance. > Here's a sample of one of our spark storage volumes after 2 days of execution > (4 stateful streaming jobs, each on a different sub-dir): > # > {noformat} > Total files in PVC (used for checkpoints and state store) > $find . | wc -l > 431796 > # .crc files > $find . -name "*.crc" | wc -l > 418053{noformat} > With each .crc file taking one storage block, the used storage runs into the > GBs of data. > These jobs are running on Kubernetes. 
Our shared storage provider, GlusterFS, > shows serious performance deterioration with this large number of files: > {noformat} > DEBUG HDFSBackedStateStoreProvider: fetchFiles() took 29164ms{noformat} >
[jira] [Comment Edited] (SPARK-28025) HDFSBackedStateStoreProvider should not leak .crc files
[ https://issues.apache.org/jira/browse/SPARK-28025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920018#comment-16920018 ] Jungtaek Lim edited comment on SPARK-28025 at 8/31/19 4:30 AM: --- FYI, I just submitted a patch for HADOOP-16255--. Hope we can get rid of workaround sooner. was (Author: kabhwan): FYI, I just submitted a patch for HADOOP-16225. Hope we can get rid of workaround sooner. > HDFSBackedStateStoreProvider should not leak .crc files > > > Key: SPARK-28025 > URL: https://issues.apache.org/jira/browse/SPARK-28025 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.3 > Environment: Spark 2.4.3 > Kubernetes 1.11(?) (OpenShift) > StateStore storage on a mounted PVC. Viewed as a local filesystem by the > `FileContextBasedCheckpointFileManager` : > {noformat} > scala> glusterfm.isLocal > res17: Boolean = true{noformat} >Reporter: Gerard Maas >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.4, 3.0.0 > > > The HDFSBackedStateStoreProvider when using the default CheckpointFileManager > is leaving '.crc' files behind. There's a .crc file created for each > `atomicFile` operation of the CheckpointFileManager. > Over time, the number of files becomes very large. It makes the state store > file system constantly increase in size and, in our case, deteriorates the > file system performance. > Here's a sample of one of our spark storage volumes after 2 days of execution > (4 stateful streaming jobs, each on a different sub-dir): > # > {noformat} > Total files in PVC (used for checkpoints and state store) > $find . | wc -l > 431796 > # .crc files > $find . -name "*.crc" | wc -l > 418053{noformat} > With each .crc file taking one storage block, the used storage runs into the > GBs of data. > These jobs are running on Kubernetes. 
Our shared storage provider, GlusterFS, > shows serious performance deterioration with this large number of files: > {noformat} > DEBUG HDFSBackedStateStoreProvider: fetchFiles() took 29164ms{noformat} >
[jira] [Commented] (SPARK-28025) HDFSBackedStateStoreProvider should not leak .crc files
[ https://issues.apache.org/jira/browse/SPARK-28025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920018#comment-16920018 ] Jungtaek Lim commented on SPARK-28025: -- FYI, I just submitted a patch for HADOOP-16225. Hope we can get rid of workaround sooner. > HDFSBackedStateStoreProvider should not leak .crc files > > > Key: SPARK-28025 > URL: https://issues.apache.org/jira/browse/SPARK-28025 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.3 > Environment: Spark 2.4.3 > Kubernetes 1.11(?) (OpenShift) > StateStore storage on a mounted PVC. Viewed as a local filesystem by the > `FileContextBasedCheckpointFileManager` : > {noformat} > scala> glusterfm.isLocal > res17: Boolean = true{noformat} >Reporter: Gerard Maas >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.4, 3.0.0 > > > The HDFSBackedStateStoreProvider when using the default CheckpointFileManager > is leaving '.crc' files behind. There's a .crc file created for each > `atomicFile` operation of the CheckpointFileManager. > Over time, the number of files becomes very large. It makes the state store > file system constantly increase in size and, in our case, deteriorates the > file system performance. > Here's a sample of one of our spark storage volumes after 2 days of execution > (4 stateful streaming jobs, each on a different sub-dir): > # > {noformat} > Total files in PVC (used for checkpoints and state store) > $find . | wc -l > 431796 > # .crc files > $find . -name "*.crc" | wc -l > 418053{noformat} > With each .crc file taking one storage block, the used storage runs into the > GBs of data. > These jobs are running on Kubernetes. 
Our shared storage provider, GlusterFS, > shows serious performance deterioration with this large number of files: > {noformat} > DEBUG HDFSBackedStateStoreProvider: fetchFiles() took 29164ms{noformat} >
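The quoted description counts the leaked files with `find`. As a cross-check, the same tally can be sketched in plain Python; the directory layout and file names below are invented for illustration and are not the actual checkpoint structure.

```python
import os
import tempfile

def count_crc_files(root):
    """Walk a checkpoint/state-store directory and count Hadoop .crc
    sidecar files versus all files (the Python analogue of the two
    `find` commands quoted in the issue)."""
    total = crc = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += 1
            if name.endswith(".crc"):
                crc += 1
    return total, crc

# Build a tiny fake checkpoint tree (hypothetical layout) and count it.
base = tempfile.mkdtemp()
state_dir = os.path.join(base, "state", "0")
os.makedirs(state_dir)
for name in ("1.delta", ".1.delta.crc", "2.delta", ".2.delta.crc"):
    open(os.path.join(state_dir, name), "w").close()

print(count_crc_files(base))  # (4, 2): four files, two of them .crc
```

With a ratio like 418053 out of 431796 in the report, nearly every file on the volume is a checksum sidecar, which matches the leak described.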
[jira] [Commented] (SPARK-28770) Flaky Tests: Test ReplayListenerSuite.End-to-end replay with compression failed
[ https://issues.apache.org/jira/browse/SPARK-28770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920009#comment-16920009 ] zhao bo commented on SPARK-28770: - Thanks Lim. Yeah, we also found that most test jobs pass these tests. For us, it's hard to say whether this issue truly exists on x86, but on ARM it fails every time. Hope the team can look into what happened. What we did on ARM was just revert the commit [https://github.com/apache/spark/pull/23767], and then everything passed. > Flaky Tests: Test ReplayListenerSuite.End-to-end replay with compression > failed > --- > > Key: SPARK-28770 > URL: https://issues.apache.org/jira/browse/SPARK-28770 > Project: Spark > Issue Type: Test > Components: Spark Core >Affects Versions: 2.4.3 > Environment: Community jenkins and our arm testing instance. >Reporter: huangtianhua >Priority: Major > > Test > org.apache.spark.scheduler.ReplayListenerSuite.End-to-end replay with > compression fails; see > [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-3.2/267/testReport/junit/org.apache.spark.scheduler/ReplayListenerSuite/End_to_end_replay_with_compression/] > > The test also fails on our ARM instance. I sent an email to spark-dev > before; we suspect it is related to the commit > [https://github.com/apache/spark/pull/23767]. We tried reverting it and the > tests passed: > ReplayListenerSuite: > - ... > - End-to-end replay *** FAILED *** > "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622) > - End-to-end replay with compression *** FAILED *** > "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622) > > Not sure what's wrong; hope someone can help figure it out, thanks very > much.
[jira] [Commented] (SPARK-28937) Improve error reporting in Spark Secrets Test Suite
[ https://issues.apache.org/jira/browse/SPARK-28937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919991#comment-16919991 ] holdenk commented on SPARK-28937: - I'm working on this. > Improve error reporting in Spark Secrets Test Suite > --- > > Key: SPARK-28937 > URL: https://issues.apache.org/jira/browse/SPARK-28937 > Project: Spark > Issue Type: Improvement > Components: Kubernetes, Tests >Affects Versions: 3.0.0 >Reporter: holdenk >Assignee: holdenk >Priority: Trivial > > Right now most of the checks in the Secrets test suite are done inside an > eventually condition, meaning that when they fail, they fail with a last exception > saying they cannot connect to the pod; this can mask the actual failure.
[jira] [Created] (SPARK-28937) Improve error reporting in Spark Secrets Test Suite
holdenk created SPARK-28937: --- Summary: Improve error reporting in Spark Secrets Test Suite Key: SPARK-28937 URL: https://issues.apache.org/jira/browse/SPARK-28937 Project: Spark Issue Type: Improvement Components: Kubernetes, Tests Affects Versions: 3.0.0 Reporter: holdenk Assignee: holdenk Right now most of the checks in the Secrets test suite are done inside an eventually condition, meaning that when they fail, they fail with a last exception saying they cannot connect to the pod; this can mask the actual failure.
[jira] [Commented] (SPARK-28936) Simplify Spark K8s tests by replacing race condition during command execution
[ https://issues.apache.org/jira/browse/SPARK-28936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919990#comment-16919990 ] holdenk commented on SPARK-28936: - I'm working on this. > Simplify Spark K8s tests by replacing race condition during command execution > - > > Key: SPARK-28936 > URL: https://issues.apache.org/jira/browse/SPARK-28936 > Project: Spark > Issue Type: Improvement > Components: Kubernetes, Tests >Affects Versions: 3.0.0 >Reporter: holdenk >Assignee: holdenk >Priority: Major > > Currently our command execution for Spark Kubernetes integration tests > depends on a Thread.sleep that sometimes doesn't wait long enough. This > normally doesn't show up because we automatically retry the commands > inside an eventually, but on some machines it may result in flaky tests.
[jira] [Created] (SPARK-28936) Simplify Spark K8s tests by replacing race condition during command execution
holdenk created SPARK-28936: --- Summary: Simplify Spark K8s tests by replacing race condition during command execution Key: SPARK-28936 URL: https://issues.apache.org/jira/browse/SPARK-28936 Project: Spark Issue Type: Improvement Components: Kubernetes, Tests Affects Versions: 3.0.0 Reporter: holdenk Assignee: holdenk Currently our command execution for Spark Kubernetes integration tests depends on a Thread.sleep that sometimes doesn't wait long enough. This normally doesn't show up because we automatically retry the commands inside an eventually, but on some machines it may result in flaky tests.
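The change described above (waiting on a condition with a bounded retry rather than a single fixed Thread.sleep) can be sketched in plain Python; the real integration tests are in Scala, and `wait_until` is a hypothetical helper name, not Spark code.

```python
import time

def wait_until(condition, timeout=5.0, interval=0.1):
    """Poll `condition` until it returns True or `timeout` seconds pass.
    Unlike one fixed sleep, this returns as soon as the condition holds
    and only gives up after the whole deadline has elapsed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()  # one final check at the deadline

# Simulate command output that becomes ready after a short delay.
ready_at = time.monotonic() + 0.3
assert wait_until(lambda: time.monotonic() >= ready_at, timeout=2.0)
```

The key property is that slow machines get the full timeout while fast machines are not forced to wait it out, which removes the race without slowing the suite down.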
[jira] [Comment Edited] (SPARK-28770) Flaky Tests: Test ReplayListenerSuite.End-to-end replay with compression failed
[ https://issues.apache.org/jira/browse/SPARK-28770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919982#comment-16919982 ] Jungtaek Lim edited comment on SPARK-28770 at 8/31/19 12:44 AM: Just hit again. [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109965/testReport] > we suspect there is something related with the commit >[https://github.com/apache/spark/pull/23767], we tried to revert it and the >tests are passed: It's not occurred frequently, so you may need to run at least 100 times to make sure reverting would help. was (Author: kabhwan): Just hit again. [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109965/testReport] > we suspect there is something related with the commit >[https://github.com/apache/spark/pull/23767], we tried to revert it and the >tests are passed: It's not occurred frequently, so you may need to run 100 times to make sure reverting would help. > Flaky Tests: Test ReplayListenerSuite.End-to-end replay with compression > failed > --- > > Key: SPARK-28770 > URL: https://issues.apache.org/jira/browse/SPARK-28770 > Project: Spark > Issue Type: Test > Components: Spark Core >Affects Versions: 2.4.3 > Environment: Community jenkins and our arm testing instance. >Reporter: huangtianhua >Priority: Major > > Test > org.apache.spark.scheduler.ReplayListenerSuite.End-to-end replay with > compression is failed see > [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-3.2/267/testReport/junit/org.apache.spark.scheduler/ReplayListenerSuite/End_to_end_replay_with_compression/] > > And also the test is failed on arm instance, I sent email to spark-dev > before, and we suspect there is something related with the commit > [https://github.com/apache/spark/pull/23767], we tried to revert it and the > tests are passed: > ReplayListenerSuite: > - ... 
> - End-to-end replay *** FAILED *** > "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622) > - End-to-end replay with compression *** FAILED *** > "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622) > > Not sure what's wrong, hope someone can help to figure it out, thanks very > much.
[jira] [Commented] (SPARK-28770) Flaky Tests: Test ReplayListenerSuite.End-to-end replay with compression failed
[ https://issues.apache.org/jira/browse/SPARK-28770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919982#comment-16919982 ] Jungtaek Lim commented on SPARK-28770: -- Just hit again. [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109965/testReport] > we suspect there is something related with the commit >[https://github.com/apache/spark/pull/23767], we tried to revert it and the >tests are passed: It doesn't occur frequently, so you may need to run the suite 100 times to make sure reverting would help. > Flaky Tests: Test ReplayListenerSuite.End-to-end replay with compression > failed > --- > > Key: SPARK-28770 > URL: https://issues.apache.org/jira/browse/SPARK-28770 > Project: Spark > Issue Type: Test > Components: Spark Core >Affects Versions: 2.4.3 > Environment: Community jenkins and our arm testing instance. >Reporter: huangtianhua >Priority: Major > > Test > org.apache.spark.scheduler.ReplayListenerSuite.End-to-end replay with > compression fails; see > [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-3.2/267/testReport/junit/org.apache.spark.scheduler/ReplayListenerSuite/End_to_end_replay_with_compression/] > > The test also fails on our ARM instance. I sent an email to spark-dev > before; we suspect it is related to the commit > [https://github.com/apache/spark/pull/23767]. We tried reverting it and the > tests passed: > ReplayListenerSuite: > - ... > - End-to-end replay *** FAILED *** > "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622) > - End-to-end replay with compression *** FAILED *** > "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622) > > Not sure what's wrong; hope someone can help figure it out, thanks very > much.
[jira] [Commented] (SPARK-28935) Document SQL metrics for Details for Query Plan
[ https://issues.apache.org/jira/browse/SPARK-28935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919975#comment-16919975 ] Liang-Chi Hsieh commented on SPARK-28935: - Thanks for pinging me! I will look into this. > Document SQL metrics for Details for Query Plan > --- > > Key: SPARK-28935 > URL: https://issues.apache.org/jira/browse/SPARK-28935 > Project: Spark > Issue Type: Sub-task > Components: Documentation >Affects Versions: 3.0.0 >Reporter: Xiao Li >Priority: Major > > [https://github.com/apache/spark/pull/25349] shows the query plans but it > does not describe the meaning of each metric in the plan. For end users, they > might not understand the meaning of the metrics we output. > > !https://user-images.githubusercontent.com/7322292/62421634-9d9c4980-b6d7-11e9-8e31-1e6ba9b402e8.png!
[jira] [Resolved] (SPARK-28926) CLONE - ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for datasets with 12 billion instances
[ https://issues.apache.org/jira/browse/SPARK-28926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh resolved SPARK-28926. - Resolution: Duplicate I think this is a duplicate of SPARK-28927. > CLONE - ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for > datasets with 12 billion instances > > > Key: SPARK-28926 > URL: https://issues.apache.org/jira/browse/SPARK-28926 > Project: Spark > Issue Type: Bug > Components: MLlib >Affects Versions: 2.2.1 >Reporter: Qiang Wang >Assignee: Xiangrui Meng >Priority: Major > > The stack trace is below: > {quote}19/08/28 07:00:40 WARN Executor task launch worker for task 325074 > BlockManager: Block rdd_10916_493 could not be removed as it was not found on > disk or in memory 19/08/28 07:00:41 ERROR Executor task launch worker for > task 325074 Executor: Exception in task 3.0 in stage 347.1 (TID 325074) > java.lang.ArrayIndexOutOfBoundsException: 6741 at > org.apache.spark.dpshade.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1460) > at > org.apache.spark.dpshade.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1440) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$40$$anonfun$apply$41.apply(PairRDDFunctions.scala:760) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$40$$anonfun$apply$41.apply(PairRDDFunctions.scala:760) > at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at > org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:216) > at > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1041) > at > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1032) > at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:972) at >
org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1032) > at > org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:763) > at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:285) at > org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:141) > at > org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:137) > at > scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) > at scala.collection.immutable.List.foreach(List.scala:381) at > scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) > at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:137) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) at > org.apache.spark.scheduler.Task.run(Task.scala:108) at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:358) at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at 
java.lang.Thread.run(Thread.java:745) > {quote} > This exception happened sometimes. And we also found that the AUC metric was > not stable when evaluating the inner product of the user factors and the item > factors with the same dataset and configuration. AUC varied from 0.60 to 0.67 > which was not stable for production environment. > Dataset capacity: ~12 billion ratings > Here is the our code: > {code:java} > val hivedata = sc.sql(sqltext).select(id,dpid,score).coalesce(numPartitions) > val predataItem = hivedata.rdd.map(r=>(r._1._1,(r._1._2,r._2.sum))) > .groupByKey().zipWithIndex() > .persist(StorageLevel.MEMORY_AND_DISK_SER) > val predataUser = > predataItem.flatMap(r=>r._1._2.map(y=>(y._1,(r._2.toInt,y._2 >
[jira] [Commented] (SPARK-28935) Document SQL metrics for Details for Query Plan
[ https://issues.apache.org/jira/browse/SPARK-28935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919972#comment-16919972 ] Xiao Li commented on SPARK-28935: - cc [~viirya] Are you interested in this? You added a few metrics before. Maybe you are the best person to deliver this. > Document SQL metrics for Details for Query Plan > --- > > Key: SPARK-28935 > URL: https://issues.apache.org/jira/browse/SPARK-28935 > Project: Spark > Issue Type: Sub-task > Components: Documentation >Affects Versions: 3.0.0 >Reporter: Xiao Li >Priority: Major > > [https://github.com/apache/spark/pull/25349] shows the query plans but it > does not describe the meaning of each metric in the plan. For end users, they > might not understand the meaning of the metrics we output.
[jira] [Updated] (SPARK-28935) Document SQL metrics for Details for Query Plan
[ https://issues.apache.org/jira/browse/SPARK-28935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-28935: Description: [https://github.com/apache/spark/pull/25349] shows the query plans but it does not describe the meaning of each metric in the plan. For end users, they might not understand the meaning of the metrics we output. !https://user-images.githubusercontent.com/7322292/62421634-9d9c4980-b6d7-11e9-8e31-1e6ba9b402e8.png! was:[https://github.com/apache/spark/pull/25349] shows the query plans but it does not describe the meaning of each metric in the plan. For end users, they might not understand the meaning of the metrics we output. > Document SQL metrics for Details for Query Plan > --- > > Key: SPARK-28935 > URL: https://issues.apache.org/jira/browse/SPARK-28935 > Project: Spark > Issue Type: Sub-task > Components: Documentation >Affects Versions: 3.0.0 >Reporter: Xiao Li >Priority: Major > > [https://github.com/apache/spark/pull/25349] shows the query plans but it > does not describe the meaning of each metric in the plan. For end users, they > might not understand the meaning of the metrics we output. > > !https://user-images.githubusercontent.com/7322292/62421634-9d9c4980-b6d7-11e9-8e31-1e6ba9b402e8.png!
[jira] [Created] (SPARK-28935) Document SQL metrics for Details for Query Plan
Xiao Li created SPARK-28935: --- Summary: Document SQL metrics for Details for Query Plan Key: SPARK-28935 URL: https://issues.apache.org/jira/browse/SPARK-28935 Project: Spark Issue Type: Sub-task Components: Documentation Affects Versions: 3.0.0 Reporter: Xiao Li [https://github.com/apache/spark/pull/25349] shows the query plans but it does not describe the meaning of each metric in the plan. For end users, they might not understand the meaning of the metrics we output.
[jira] [Updated] (SPARK-28934) Add `spark.sql.compatiblity.mode`
[ https://issues.apache.org/jira/browse/SPARK-28934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28934: -- Reporter: Xiao Li (was: Dongjoon Hyun) > Add `spark.sql.compatiblity.mode` > - > > Key: SPARK-28934 > URL: https://issues.apache.org/jira/browse/SPARK-28934 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 3.0.0 >Reporter: Xiao Li >Priority: Major > > This issue aims to add `spark.sql.compatiblity.mode` whose values are `spark` > or `pgSQL` case-insensitively to control PostgreSQL compatibility features. > > Apache Spark 3.0.0 can start with `spark.sql.parser.ansi.enabled=false` and > `spark.sql.compatiblity.mode=spark`.
[jira] [Created] (SPARK-28934) Add `spark.sql.compatiblity.mode`
Dongjoon Hyun created SPARK-28934: - Summary: Add `spark.sql.compatiblity.mode` Key: SPARK-28934 URL: https://issues.apache.org/jira/browse/SPARK-28934 Project: Spark Issue Type: New Feature Components: SQL Affects Versions: 3.0.0 Reporter: Dongjoon Hyun This issue aims to add `spark.sql.compatiblity.mode` whose values are `spark` or `pgSQL` case-insensitively to control PostgreSQL compatibility features. Apache Spark 3.0.0 can start with `spark.sql.parser.ansi.enabled=false` and `spark.sql.compatiblity.mode=spark`.
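As a rough illustration of the case-insensitive values proposed above, here is a hedged Python sketch of how such a conf value might be normalized; `parse_compatibility_mode` is a made-up helper, not Spark code, and the ticket only specifies the two values.

```python
def parse_compatibility_mode(value):
    """Normalize a proposed `spark.sql.compatiblity.mode` value
    case-insensitively, accepting only `spark` or `pgSQL`.
    (Hypothetical helper; only the two values from the ticket exist.)"""
    canonical = {"spark": "spark", "pgsql": "pgSQL"}
    try:
        return canonical[value.strip().lower()]
    except KeyError:
        raise ValueError(f"invalid compatibility mode: {value!r}")

print(parse_compatibility_mode("PgSQL"))  # pgSQL
print(parse_compatibility_mode("SPARK"))  # spark
```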
[jira] [Assigned] (SPARK-28933) Reduce unnecessary shuffle in ALS when initializing factors
[ https://issues.apache.org/jira/browse/SPARK-28933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh reassigned SPARK-28933: --- Assignee: Liang-Chi Hsieh > Reduce unnecessary shuffle in ALS when initializing factors > --- > > Key: SPARK-28933 > URL: https://issues.apache.org/jira/browse/SPARK-28933 > Project: Spark > Issue Type: Improvement > Components: ML >Affects Versions: 3.0.0 >Reporter: Liang-Chi Hsieh >Assignee: Liang-Chi Hsieh >Priority: Major > > When initializing factors in ALS, we should use {{mapPartitions}} instead of > the current {{map}}, so we can preserve the existing partitioning of the RDD of > {{InBlock}}. The RDD of {{InBlock}} is already partitioned by src block id, > and we don't change the partitioning when initializing factors.
[jira] [Created] (SPARK-28933) Reduce unnecessary shuffle in ALS when initializing factors
Liang-Chi Hsieh created SPARK-28933: --- Summary: Reduce unnecessary shuffle in ALS when initializing factors Key: SPARK-28933 URL: https://issues.apache.org/jira/browse/SPARK-28933 Project: Spark Issue Type: Improvement Components: ML Affects Versions: 3.0.0 Reporter: Liang-Chi Hsieh When initializing factors in ALS, we should use {{mapPartitions}} instead of the current {{map}}, so we can preserve the existing partitioning of the RDD of {{InBlock}}. The RDD of {{InBlock}} is already partitioned by src block id, and we don't change the partitioning when initializing factors.
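The partitioning argument above can be modelled outside Spark. In this toy Python sketch (names are illustrative; the real change is in ALS's Scala code), an "RDD" is just a list of partitions, and a per-partition transform keeps every record in the partition it came from, which is why the existing partitioning survives and no shuffle is needed.

```python
import random

def map_partitions(partitions, f):
    """Apply f to every element of each partition, leaving each result in
    the partition it came from. No record crosses a partition boundary,
    so the existing partitioning is preserved."""
    return [[f(x) for x in part] for part in partitions]

# Three partitions of (src block id, ratings) pairs, already keyed by block.
in_blocks = [[(0, "a"), (0, "b")], [(1, "c")], [(2, "d"), (2, "e")]]

rng = random.Random(42)

def init_factor(kv):
    # Toy stand-in for ALS's random factor initialization.
    return (kv[0], [rng.random()])

factors = map_partitions(in_blocks, init_factor)

def partition_keys(rdd):
    return [[k for k, _ in part] for part in rdd]

# Every factor stays in the partition of its source block.
assert partition_keys(factors) == partition_keys(in_blocks)
```

A plain element-wise `map` would produce the same values, but Spark cannot know the keys were untouched, so a later join by block id would trigger a shuffle; `mapPartitions` with partition preservation avoids that.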
[jira] [Updated] (SPARK-28932) Maven install fails on JDK11
[ https://issues.apache.org/jira/browse/SPARK-28932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28932: -- Component/s: (was: Spark Core) Build > Maven install fails on JDK11 > > > Key: SPARK-28932 > URL: https://issues.apache.org/jira/browse/SPARK-28932 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Blocker > > {code} > mvn clean install -pl common/network-common -DskipTests > error: fatal error: object scala in compiler mirror not found. > one error found > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > {code}
[jira] [Assigned] (SPARK-28932) Maven install fails on JDK11
[ https://issues.apache.org/jira/browse/SPARK-28932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-28932: - Target Version/s: 3.0.0 Assignee: Dongjoon Hyun > Maven install fails on JDK11 > > > Key: SPARK-28932 > URL: https://issues.apache.org/jira/browse/SPARK-28932 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Blocker > > {code} > mvn clean install -pl common/network-common -DskipTests > error: fatal error: object scala in compiler mirror not found. > one error found > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-28804) Document DESCRIBE QUERY in SQL Reference.
[ https://issues.apache.org/jira/browse/SPARK-28804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-28804. - Fix Version/s: 3.0.0 Assignee: Dilip Biswal Resolution: Fixed > Document DESCRIBE QUERY in SQL Reference. > - > > Key: SPARK-28804 > URL: https://issues.apache.org/jira/browse/SPARK-28804 > Project: Spark > Issue Type: Sub-task > Components: Documentation, SQL >Affects Versions: 2.4.3 >Reporter: Dilip Biswal >Assignee: Dilip Biswal >Priority: Major > Fix For: 3.0.0 > > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-26046) Add a way for StreamingQueryManager to remove all listeners
[ https://issues.apache.org/jira/browse/SPARK-26046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Murthy updated SPARK-26046: - Description: StreamingQueryManager should have a way to clear out all listeners. There's addListener(listener) and removeListener(listener), but not removeAllListeners. We should expose a new method -removeAllListeners() that calls listenerBus.removeAllListeners (added here: [https://github.com/apache/spark/commit/9690eba16efe6d25261934d8b73a221972b684f3])- listListeners() that can be used to remove listeners. (was: StreamingQueryManager should have a way to clear out all listeners. There's addListener(listener) and removeListener(listener), but not removeAllListeners. We should expose a new method removeAllListeners() that calls listenerBus.removeAllListeners (added here: [https://github.com/apache/spark/commit/9690eba16efe6d25261934d8b73a221972b684f3]). ) > Add a way for StreamingQueryManager to remove all listeners > --- > > Key: SPARK-26046 > URL: https://issues.apache.org/jira/browse/SPARK-26046 > Project: Spark > Issue Type: Task > Components: Structured Streaming >Affects Versions: 2.4.0 >Reporter: Mukul Murthy >Priority: Major > > StreamingQueryManager should have a way to clear out all listeners. There's > addListener(listener) and removeListener(listener), but not > removeAllListeners. We should expose a new method -removeAllListeners() that > calls listenerBus.removeAllListeners (added here: > [https://github.com/apache/spark/commit/9690eba16efe6d25261934d8b73a221972b684f3])- > listListeners() that can be used to remove listeners. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-28932) Maven install fails on JDK11
Dongjoon Hyun created SPARK-28932: - Summary: Maven install fails on JDK11 Key: SPARK-28932 URL: https://issues.apache.org/jira/browse/SPARK-28932 Project: Spark Issue Type: Sub-task Components: Spark Core Affects Versions: 3.0.0 Reporter: Dongjoon Hyun {code} mvn clean install -pl common/network-common -DskipTests error: fatal error: object scala in compiler mirror not found. one error found [INFO] [INFO] BUILD FAILURE [INFO] {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Reopened] (SPARK-26046) Add a way for StreamingQueryManager to remove all listeners
[ https://issues.apache.org/jira/browse/SPARK-26046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Murthy reopened SPARK-26046: -- From some other discussions I've had, I actually think it's reasonable to have a way to remove all listeners. I don't think it should be a removeAllListeners API, as originally discussed, but StreamingQueryManager could have a listListeners API which the caller could then choose to use to remove each listener manually. > Add a way for StreamingQueryManager to remove all listeners > --- > > Key: SPARK-26046 > URL: https://issues.apache.org/jira/browse/SPARK-26046 > Project: Spark > Issue Type: Task > Components: Structured Streaming >Affects Versions: 2.4.0 >Reporter: Mukul Murthy >Priority: Major > > StreamingQueryManager should have a way to clear out all listeners. There's > addListener(listener) and removeListener(listener), but not > removeAllListeners. We should expose a new method removeAllListeners() that > calls listenerBus.removeAllListeners (added here: > [https://github.com/apache/spark/commit/9690eba16efe6d25261934d8b73a221972b684f3]). > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
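The caller-side "remove all" flow proposed in the comment above can be sketched as follows. This is a toy Python model of the proposed API, not Spark's actual StreamingQueryManager; the method names are illustrative.

```python
# Hypothetical sketch of the proposed design: instead of a dedicated
# removeAllListeners(), the manager exposes list_listeners() and the
# caller removes each listener itself.

class StreamingQueryManagerSketch:
    def __init__(self):
        self._listeners = []

    def add_listener(self, listener):
        self._listeners.append(listener)

    def remove_listener(self, listener):
        self._listeners.remove(listener)

    def list_listeners(self):
        # Return a copy so callers can iterate while removing.
        return list(self._listeners)

mgr = StreamingQueryManagerSketch()
mgr.add_listener("metrics-listener")
mgr.add_listener("alerting-listener")

# Caller-side "remove all": list, then remove one by one.
for listener in mgr.list_listeners():
    mgr.remove_listener(listener)

assert mgr.list_listeners() == []
```

Returning a copy from `list_listeners()` is the key design detail: it lets the caller mutate the manager's listener set while iterating without hitting concurrent-modification issues.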
[jira] [Resolved] (SPARK-28894) Jenkins does not report test results of SQLQueryTestSuite in Jenkins
[ https://issues.apache.org/jira/browse/SPARK-28894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-28894. --- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25630 [https://github.com/apache/spark/pull/25630] > Jenkins does not report test results of SQLQueryTestSuite in Jenkins > > > Key: SPARK-28894 > URL: https://issues.apache.org/jira/browse/SPARK-28894 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Fix For: 3.0.0 > > > See > https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109834/testReport/junit/org.apache.spark.sql/SQLQueryTestSuite/ > We don't know which file has an error before reading the logs. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-28894) Jenkins does not report test results of SQLQueryTestSuite in Jenkins
[ https://issues.apache.org/jira/browse/SPARK-28894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-28894: - Assignee: Hyukjin Kwon > Jenkins does not report test results of SQLQueryTestSuite in Jenkins > > > Key: SPARK-28894 > URL: https://issues.apache.org/jira/browse/SPARK-28894 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > > See > https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109834/testReport/junit/org.apache.spark.sql/SQLQueryTestSuite/ > We don't know which file has an error before reading the logs. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-28921: --- Affects Version/s: 2.3.3 > Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10) > - > > Key: SPARK-28921 > URL: https://issues.apache.org/jira/browse/SPARK-28921 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.3.3, 2.4.3 >Reporter: Paul Schweigert >Priority: Critical > > Spark jobs are failing on latest versions of Kubernetes when jobs attempt to > provision executor pods (jobs like Spark-Pi that do not launch executors run > without a problem): > > Here's an example error message: > > {code:java} > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes. > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes.19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: > HTTP 403, Status: 403 - > java.net.ProtocolException: Expected HTTP 101 response but was '403 > Forbidden' > at > okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) > at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748) > {code} > > Looks like the issue is caused by fixes for a recent CVE : > CVE: [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14809] > Fix: [https://github.com/fabric8io/kubernetes-client/pull/1669] > > Looks like upgrading kubernetes-client to 4.4.2 would solve this issue. 
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-28925) Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and 1.14
[ https://issues.apache.org/jira/browse/SPARK-28925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919907#comment-16919907 ] Andy Grove edited comment on SPARK-28925 at 8/30/19 9:54 PM: - This also impacts Spark 2.3.3 on EKS 1.11 due to security patches that were rolled out in the past week. {code:java} Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.10-eks-7f15cc", GitCommit:"7f15ccb4e58f112866f7ddcfebf563f199558488", GitTreeState:"clean", BuildDate:"2019-08-19T17:46:02Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"} {code} was (Author: andygrove): This also impacts Spark 2.3.3 > Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and > 1.14 > > > Key: SPARK-28925 > URL: https://issues.apache.org/jira/browse/SPARK-28925 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.3.3, 2.4.3 >Reporter: Eric >Priority: Minor > > Hello, > If you use Spark with Kubernetes 1.13 or 1.14 you will see this error: > {code:java} > {"time": "2019-08-28T09:56:11.866Z", "lvl":"INFO", "logger": > "org.apache.spark.internal.Logging", > "thread":"kubernetes-executor-snapshots-subscribers-0","msg":"Going to > request 1 executors from Kubernetes."} > {"time": "2019-08-28T09:56:12.028Z", "lvl":"WARN", "logger": > "io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2", > "thread":"OkHttp https://kubernetes.default.svc/...","msg":"Exec Failure: > HTTP 403, Status: 403 - "} > java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' > {code} > Apparently the bug is fixed here: > [https://github.com/fabric8io/kubernetes-client/pull/1669] > We have currently compiled Spark source code with Kubernetes-client 4.4.2 and > it's working great on our cluster. We are using Kubernetes 1.13.10. > > Could it be possible to update that dependency version? > > Thanks! 
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-28925) Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and 1.14
[ https://issues.apache.org/jira/browse/SPARK-28925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919907#comment-16919907 ] Andy Grove edited comment on SPARK-28925 at 8/30/19 9:55 PM: - This also impacts Spark 2.3.3 on EKS 1.11 due to security patches that were rolled out in the past week. {code:java} Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.10-eks-7f15cc", GitCommit:"7f15ccb4e58f112866f7ddcfebf563f199558488", GitTreeState:"clean", BuildDate:"2019-08-19T17:46:02Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"} {code} I experimented with replacing {{kubernetes-client.jar}} with version 4.4.2 and it did resolve this issue, but caused other issues, so isn't a real option for a workaround for my use case. was (Author: andygrove): This also impacts Spark 2.3.3 on EKS 1.11 due to security patches that were rolled out in the past week. {code:java} Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.10-eks-7f15cc", GitCommit:"7f15ccb4e58f112866f7ddcfebf563f199558488", GitTreeState:"clean", BuildDate:"2019-08-19T17:46:02Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"} {code} > Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and > 1.14 > > > Key: SPARK-28925 > URL: https://issues.apache.org/jira/browse/SPARK-28925 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.3.3, 2.4.3 >Reporter: Eric >Priority: Minor > > Hello, > If you use Spark with Kubernetes 1.13 or 1.14 you will see this error: > {code:java} > {"time": "2019-08-28T09:56:11.866Z", "lvl":"INFO", "logger": > "org.apache.spark.internal.Logging", > "thread":"kubernetes-executor-snapshots-subscribers-0","msg":"Going to > request 1 executors from Kubernetes."} > {"time": "2019-08-28T09:56:12.028Z", "lvl":"WARN", "logger": > "io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2", > "thread":"OkHttp https://kubernetes.default.svc/...","msg":"Exec 
Failure: > HTTP 403, Status: 403 - "} > java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' > {code} > Apparently the bug is fixed here: > [https://github.com/fabric8io/kubernetes-client/pull/1669] > We have currently compiled Spark source code with Kubernetes-client 4.4.2 and > it's working great on our cluster. We are using Kubernetes 1.13.10. > > Could it be possible to update that dependency version? > > Thanks! -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-28925) Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and 1.14
[ https://issues.apache.org/jira/browse/SPARK-28925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-28925: --- Affects Version/s: 2.3.3 > Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and > 1.14 > > > Key: SPARK-28925 > URL: https://issues.apache.org/jira/browse/SPARK-28925 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.3.3, 2.4.3 >Reporter: Eric >Priority: Minor > > Hello, > If you use Spark with Kubernetes 1.13 or 1.14 you will see this error: > {code:java} > {"time": "2019-08-28T09:56:11.866Z", "lvl":"INFO", "logger": > "org.apache.spark.internal.Logging", > "thread":"kubernetes-executor-snapshots-subscribers-0","msg":"Going to > request 1 executors from Kubernetes."} > {"time": "2019-08-28T09:56:12.028Z", "lvl":"WARN", "logger": > "io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2", > "thread":"OkHttp https://kubernetes.default.svc/...","msg":"Exec Failure: > HTTP 403, Status: 403 - "} > java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' > {code} > Apparently the bug is fixed here: > [https://github.com/fabric8io/kubernetes-client/pull/1669] > We have currently compiled Spark source code with Kubernetes-client 4.4.2 and > it's working great on our cluster. We are using Kubernetes 1.13.10. > > Could it be possible to update that dependency version? > > Thanks! -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28925) Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and 1.14
[ https://issues.apache.org/jira/browse/SPARK-28925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919907#comment-16919907 ] Andy Grove commented on SPARK-28925: This also impacts Spark 2.3.3 > Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and > 1.14 > > > Key: SPARK-28925 > URL: https://issues.apache.org/jira/browse/SPARK-28925 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.3 >Reporter: Eric >Priority: Minor > > Hello, > If you use Spark with Kubernetes 1.13 or 1.14 you will see this error: > {code:java} > {"time": "2019-08-28T09:56:11.866Z", "lvl":"INFO", "logger": > "org.apache.spark.internal.Logging", > "thread":"kubernetes-executor-snapshots-subscribers-0","msg":"Going to > request 1 executors from Kubernetes."} > {"time": "2019-08-28T09:56:12.028Z", "lvl":"WARN", "logger": > "io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2", > "thread":"OkHttp https://kubernetes.default.svc/...","msg":"Exec Failure: > HTTP 403, Status: 403 - "} > java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' > {code} > Apparently the bug is fixed here: > [https://github.com/fabric8io/kubernetes-client/pull/1669] > We have currently compiled Spark source code with Kubernetes-client 4.4.2 and > it's working great on our cluster. We are using Kubernetes 1.13.10. > > Could it be possible to update that dependency version? > > Thanks! -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-21181) Suppress memory leak errors reported by netty
[ https://issues.apache.org/jira/browse/SPARK-21181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919903#comment-16919903 ] Thangamani Murugasamy commented on SPARK-21181: --- I have the same problem in Spark 2.3: ERROR util.ResourceLeakDetector: LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information. Recent access records: [Stage 0:===> (25 + 5) / 30]19/08/30 16:39:07 ERROR datasources.FileFormatWriter: Aborting job null. java.util.concurrent.TimeoutException: Futures timed out after [300 seconds] at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219) at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223) at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201) at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:136) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:144) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:140) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > Suppress memory leak errors reported by netty > - > > Key: SPARK-21181 > URL: https://issues.apache.org/jira/browse/SPARK-21181 > Project: Spark > Issue Type: Bug > Components: Input/Output >Affects Versions: 2.1.0 >Reporter: Dhruve Ashar >Assignee: Dhruve Ashar >Priority: Minor > Fix For: 2.1.2, 2.2.0, 2.3.0 > > > We are seeing netty report memory leak errors like the one below after > switching to 2.1. > {code} > ERROR ResourceLeakDetector: LEAK: ByteBuf.release() was not called before > it's garbage-collected. Enable advanced leak reporting to find out where the > leak occurred. 
To enable advanced leak reporting, specify the JVM option > '-Dio.netty.leakDetection.level=advanced' or call > ResourceLeakDetector.setLevel() See > http://netty.io/wiki/reference-counted-objects.html for more information. > {code} > Looking a bit deeper, Spark is not leaking any memory here, but it is > confusing for the user to see the error message in the driver logs. > After enabling, '-Dio.netty.leakDetection.level=advanced', netty reveals the > SparkSaslServer to be the source of these leaks. > Sample trace :https://gist.github.com/dhruve/b299ebc35aa0a185c244a0468927daf1 -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
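For users who want the advanced leak reporting mentioned above, one way to pass the JVM option to a Spark application is via Spark's standard `extraJavaOptions` settings. The application class and jar below are placeholders; adjust them to your job.

```shell
# Enable netty's advanced leak reporting on both driver and executors.
# com.example.MyApp and my-app.jar are illustrative placeholders.
spark-submit \
  --conf "spark.driver.extraJavaOptions=-Dio.netty.leakDetection.level=advanced" \
  --conf "spark.executor.extraJavaOptions=-Dio.netty.leakDetection.level=advanced" \
  --class com.example.MyApp my-app.jar
```

With this level set, netty records recent access points for leaked buffers, which is what revealed SparkSaslServer as the source in the trace linked in the ticket.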
[jira] [Commented] (SPARK-28891) do-release-docker.sh in master does not work for branch-2.3
[ https://issues.apache.org/jira/browse/SPARK-28891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919896#comment-16919896 ] Dongjoon Hyun commented on SPARK-28891: --- Since 2.3.4 vote passed, this is merged to `branch-2.3` as a final commit. `branch-2.3` is now locked since it becomes EOL. > do-release-docker.sh in master does not work for branch-2.3 > --- > > Key: SPARK-28891 > URL: https://issues.apache.org/jira/browse/SPARK-28891 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 2.3.4 >Reporter: Kazuaki Ishizaki >Assignee: Kazuaki Ishizaki >Priority: Major > Fix For: 2.3.4 > > > According to [~maropu], > [do-release-docker.sh|https://github.com/apache/spark/blob/master/dev/create-release/do-release-docker.sh] > in master branch worked for 2.3.3 release for branch-2.3. > After updates in [this PR|https://github.com/apache/spark/pull/23098], > {{do-release-docker.sh}} does not work for branch-2.3 now as shown: > {code} > ... > Checked out revision 35358. > Copying release tarballs > cp: cannot stat 'pyspark-*': No such file or directory > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-28891) do-release-docker.sh in master does not work for branch-2.3
[ https://issues.apache.org/jira/browse/SPARK-28891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-28891: - Assignee: Kazuaki Ishizaki > do-release-docker.sh in master does not work for branch-2.3 > --- > > Key: SPARK-28891 > URL: https://issues.apache.org/jira/browse/SPARK-28891 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 2.3.4 >Reporter: Kazuaki Ishizaki >Assignee: Kazuaki Ishizaki >Priority: Major > > According to [~maropu], > [do-release-docker.sh|https://github.com/apache/spark/blob/master/dev/create-release/do-release-docker.sh] > in master branch worked for 2.3.3 release for branch-2.3. > After updates in [this PR|https://github.com/apache/spark/pull/23098], > {{do-release-docker.sh}} does not work for branch-2.3 now as shown: > {code} > ... > Checked out revision 35358. > Copying release tarballs > cp: cannot stat 'pyspark-*': No such file or directory > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-28891) do-release-docker.sh in master does not work for branch-2.3
[ https://issues.apache.org/jira/browse/SPARK-28891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-28891. --- Fix Version/s: 2.3.4 Resolution: Fixed Issue resolved by pull request 25607 [https://github.com/apache/spark/pull/25607] > do-release-docker.sh in master does not work for branch-2.3 > --- > > Key: SPARK-28891 > URL: https://issues.apache.org/jira/browse/SPARK-28891 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 2.3.4 >Reporter: Kazuaki Ishizaki >Assignee: Kazuaki Ishizaki >Priority: Major > Fix For: 2.3.4 > > > According to [~maropu], > [do-release-docker.sh|https://github.com/apache/spark/blob/master/dev/create-release/do-release-docker.sh] > in master branch worked for 2.3.3 release for branch-2.3. > After updates in [this PR|https://github.com/apache/spark/pull/23098], > {{do-release-docker.sh}} does not work for branch-2.3 now as shown: > {code} > ... > Checked out revision 35358. > Copying release tarballs > cp: cannot stat 'pyspark-*': No such file or directory > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-27931) Accept 'on' and 'off' as input for boolean data type
[ https://issues.apache.org/jira/browse/SPARK-27931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-27931: - Assignee: YoungGyu Chun > Accept 'on' and 'off' as input for boolean data type > > > Key: SPARK-27931 > URL: https://issues.apache.org/jira/browse/SPARK-27931 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.0.0 >Reporter: Yuming Wang >Assignee: YoungGyu Chun >Priority: Major > > This ticket contains three things: > 1. Accept 'on' and 'off' as input for boolean data type > {code:sql} > SELECT cast('no' as boolean) AS false; > SELECT cast('off' as boolean) AS false; > {code} > 2. Accept unique prefixes thereof: > {code:sql} > SELECT cast('of' as boolean) AS false; > SELECT cast('fal' as boolean) AS false; > {code} > 3. Trim the string when cast to boolean type > {code:sql} > SELECT cast('true ' as boolean) AS true; > SELECT cast(' FALSE' as boolean) AS true; > {code} > More details: > [https://www.postgresql.org/docs/devel/datatype-boolean.html] > > [https://github.com/postgres/postgres/blob/REL_12_BETA1/src/backend/utils/adt/bool.c#L25] > > [https://github.com/postgres/postgres/commit/05a7db05826c5eb68173b6d7ef1553c19322ef48] > > [https://github.com/postgres/postgres/commit/9729c9360886bee7feddc6a1124b0742de4b9f3d] > Other DBs: > [http://docs.aws.amazon.com/redshift/latest/dg/r_Boolean_type.html] > [https://my.vertica.com/docs/5.0/HTML/Master/2983.htm] > > [https://github.com/prestosql/presto/blob/b845cd66da3eb1fcece50efba83ea12bc40afbaa/presto-main/src/main/java/com/facebook/presto/type/VarcharOperators.java#L108-L138] -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-27931) Accept 'on' and 'off' as input for boolean data type
[ https://issues.apache.org/jira/browse/SPARK-27931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-27931. --- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25458 [https://github.com/apache/spark/pull/25458] > Accept 'on' and 'off' as input for boolean data type > > > Key: SPARK-27931 > URL: https://issues.apache.org/jira/browse/SPARK-27931 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.0.0 >Reporter: Yuming Wang >Assignee: YoungGyu Chun >Priority: Major > Fix For: 3.0.0 > > > This ticket contains three things: > 1. Accept 'on' and 'off' as input for boolean data type > {code:sql} > SELECT cast('no' as boolean) AS false; > SELECT cast('off' as boolean) AS false; > {code} > 2. Accept unique prefixes thereof: > {code:sql} > SELECT cast('of' as boolean) AS false; > SELECT cast('fal' as boolean) AS false; > {code} > 3. Trim the string when cast to boolean type > {code:sql} > SELECT cast('true ' as boolean) AS true; > SELECT cast(' FALSE' as boolean) AS true; > {code} > More details: > [https://www.postgresql.org/docs/devel/datatype-boolean.html] > > [https://github.com/postgres/postgres/blob/REL_12_BETA1/src/backend/utils/adt/bool.c#L25] > > [https://github.com/postgres/postgres/commit/05a7db05826c5eb68173b6d7ef1553c19322ef48] > > [https://github.com/postgres/postgres/commit/9729c9360886bee7feddc6a1124b0742de4b9f3d] > Other DBs: > [http://docs.aws.amazon.com/redshift/latest/dg/r_Boolean_type.html] > [https://my.vertica.com/docs/5.0/HTML/Master/2983.htm] > > [https://github.com/prestosql/presto/blob/b845cd66da3eb1fcece50efba83ea12bc40afbaa/presto-main/src/main/java/com/facebook/presto/type/VarcharOperators.java#L108-L138] -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
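The three rules in the ticket (accept on/off, accept unique prefixes, trim whitespace) can be sketched in Python. This is an illustrative re-implementation of the PostgreSQL-style parsing the ticket links to, not Spark's actual cast code.

```python
# Hypothetical sketch of PostgreSQL-style boolean parsing:
# trim, lowercase, then accept t/f/1/0 and unambiguous prefixes of
# the word forms (true/yes/on vs false/no/off).

TRUE_WORDS = ("true", "yes", "on")
FALSE_WORDS = ("false", "no", "off")

def parse_bool(s):
    v = s.strip().lower()
    if not v:
        return None
    if v == "1":
        return True
    if v == "0":
        return False
    true_hits = [w for w in TRUE_WORDS if w.startswith(v)]
    false_hits = [w for w in FALSE_WORDS if w.startswith(v)]
    # Accept only unambiguous prefixes: "o" matches both "on" and "off",
    # so it is rejected, while "of" uniquely matches "off".
    if true_hits and not false_hits:
        return True
    if false_hits and not true_hits:
        return False
    return None  # invalid or ambiguous input -> null, as in a lenient cast

assert parse_bool("off") is False
assert parse_bool("of") is False      # unique prefix of "off"
assert parse_bool("o") is None        # ambiguous prefix
assert parse_bool(" FALSE") is False  # trimmed and case-insensitive
```

Note how the prefix rule makes `cast('fal' as boolean)` valid (unique prefix of "false") while `cast('o' as boolean)` stays invalid, matching the PostgreSQL behavior the ticket cites.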
[jira] [Commented] (SPARK-28906) `bin/spark-submit --version` shows incorrect info
[ https://issues.apache.org/jira/browse/SPARK-28906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919893#comment-16919893 ] Kazuaki Ishizaki commented on SPARK-28906: -- For user name, we have to pass {{USER}} environment variable to the docker container at the end of {{do-release-docker.sh}}. I created a patch to fix this. For other information to be got by {{git}} command, {{spark-build-info}} script is not executed at the wrong directory (i.e. out of the cloned directory). My guess is the command is executed under the work directory. I did not creat a patch yet. > `bin/spark-submit --version` shows incorrect info > - > > Key: SPARK-28906 > URL: https://issues.apache.org/jira/browse/SPARK-28906 > Project: Spark > Issue Type: Bug > Components: Project Infra >Affects Versions: 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.4, 2.4.0, 2.4.1, 2.4.2, > 3.0.0, 2.4.3 >Reporter: Marcelo Vanzin >Priority: Minor > Attachments: image-2019-08-29-05-50-13-526.png > > > Since Spark 2.3.1, `spark-submit` shows a wrong information. > {code} > $ bin/spark-submit --version > Welcome to > __ > / __/__ ___ _/ /__ > _\ \/ _ \/ _ `/ __/ '_/ >/___/ .__/\_,_/_/ /_/\_\ version 2.3.3 > /_/ > Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_222 > Branch > Compiled by user on 2019-02-04T13:00:46Z > Revision > Url > Type --help for more information. > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919869#comment-16919869 ] Dongjoon Hyun commented on SPARK-28921: --- Could you make a PR with your test case, [~psschwei]? > Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10) > - > > Key: SPARK-28921 > URL: https://issues.apache.org/jira/browse/SPARK-28921 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.3 >Reporter: Paul Schweigert >Priority: Critical > > Spark jobs are failing on latest versions of Kubernetes when jobs attempt to > provision executor pods (jobs like Spark-Pi that do not launch executors run > without a problem): > > Here's an example error message: > > {code:java} > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes. > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes.19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: > HTTP 403, Status: 403 - > java.net.ProtocolException: Expected HTTP 101 response but was '403 > Forbidden' > at > okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) > at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748) > {code} > > Looks like the issue is caused by fixes for a recent CVE : > CVE: [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14809] > Fix: [https://github.com/fabric8io/kubernetes-client/pull/1669] > > Looks like upgrading kubernetes-client to 4.4.2 would solve this issue. 
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-28889) Allow UDTs to define custom casting behavior
[ https://issues.apache.org/jira/browse/SPARK-28889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919822#comment-16919822 ] Zachary S Ennenga edited comment on SPARK-28889 at 8/30/19 6:43 PM: While I understand if the spark team is not particularly interested in solving this problem themselves at this time, I'm more concerned with understanding if this is in line with the eventual solution to UDTs and datasets. If it is, I'm about halfway through the PR as is, and I'm happy to complete it. If it's not, I'm curious what the plan is, and if it's represented in Jira, I'd love to know what tickets so I can follow along. was (Author: zennenga): While I understand if the spark team is not particularly interested in solving this problem themselves at this time, I'm more concerned with understanding if this is in line with the eventual solution to UDTs and datasets. If it is, I'm about halfway through the PR as is, and I'm happy to complete it. > Allow UDTs to define custom casting behavior > > > Key: SPARK-28889 > URL: https://issues.apache.org/jira/browse/SPARK-28889 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.4.3 >Reporter: Zachary S Ennenga >Priority: Minor > > Looking at `org.apache.spark.sql.catalyst.expressions.Cast`, UDTs do not > support any sort of casting except for identity casts, IE: > {code:java} > case (udt1: UserDefinedType[_], udt2: UserDefinedType[_]) if udt1.userClass > == udt2.userClass => > true > {code} > I propose we add an additional piece of functionality here to allow UDTs to > define their own canCast and cast functions to allow users to define their > own cast mechanisms. 
> An example of how this might look: > {code:java} > case (fromType, toType: UserDefinedType[_]) => > toType.canCast(fromType) // Returns boolean > {code} > {code:java} > case (fromType, toType: UserDefinedType[_]) => > toType.cast(fromType) // Returns Casting function > {code} > The UDT base class would contain a default implementation that replicates > current behavior (IE no casting). -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
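To make the proposal concrete, here is a minimal sketch using stand-in types rather than Spark's actual `Cast` or `UserDefinedType` classes. The `canCast`/`cast` members are the proposed (not yet existing) additions, and `LocalDateUDT` is a purely hypothetical example type:

```scala
// Stand-in type hierarchy; NOT Spark's org.apache.spark.sql.types classes.
trait DataType
case object StringType extends DataType
case object IntegerType extends DataType

// Stand-in for UserDefinedType with the proposed overridable hooks.
abstract class UserDefinedTypeSketch extends DataType {
  // Default implementation replicates current behavior: no casting allowed.
  def canCast(from: DataType): Boolean = false
  def cast(from: DataType): Any => Any =
    throw new UnsupportedOperationException(s"no cast defined from $from")
}

// Hypothetical UDT that opts in to casting from StringType,
// e.g. to support dataframe.as[ComplexType] over a string column.
class LocalDateUDT extends UserDefinedTypeSketch {
  override def canCast(from: DataType): Boolean = from == StringType
  override def cast(from: DataType): Any => Any = {
    case s: String => java.time.LocalDate.parse(s)
  }
}
```

With this shape, `Cast` would only need the two dispatch cases shown in the proposal; everything type-specific lives on the UDT.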
[jira] [Commented] (SPARK-28889) Allow UDTs to define custom casting behavior
[ https://issues.apache.org/jira/browse/SPARK-28889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919822#comment-16919822 ] Zachary S Ennenga commented on SPARK-28889: --- While I understand if the spark team is not particularly interested in solving this problem themselves at this time, I'm more concerned with understanding if this is in line with the eventual solution to UDTs and datasets. If it is, I'm about halfway through the PR as is, and I'm happy to complete it. > Allow UDTs to define custom casting behavior > > > Key: SPARK-28889 > URL: https://issues.apache.org/jira/browse/SPARK-28889 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.4.3 >Reporter: Zachary S Ennenga >Priority: Minor > > Looking at `org.apache.spark.sql.catalyst.expressions.Cast`, UDTs do not > support any sort of casting except for identity casts, IE: > {code:java} > case (udt1: UserDefinedType[_], udt2: UserDefinedType[_]) if udt1.userClass > == udt2.userClass => > true > {code} > I propose we add an additional piece of functionality here to allow UDTs to > define their own canCast and cast functions to allow users to define their > own cast mechanisms. > An example of how this might look: > {code:java} > case (fromType, toType: UserDefinedType[_]) => > toType.canCast(fromType) // Returns boolean > {code} > {code:java} > case (fromType, toType: UserDefinedType[_]) => > toType.cast(fromType) // Returns Casting function > {code} > The UDT base class would contain a default implementation that replicates > current behavior (IE no casting). -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28889) Allow UDTs to define custom casting behavior
[ https://issues.apache.org/jira/browse/SPARK-28889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919819#comment-16919819 ] Zachary S Ennenga commented on SPARK-28889: --- Based on https://issues.apache.org/jira/browse/SPARK-7768 it seems the intent is to make it public again, though it has been pushed back a few times for reasons that aren't really discussed in the ticket. Is there another solution for defining custom encoders for types within datasets before that ticket is set to be completed? If there isn't, and the intent is to solve that problem via UDTs, this enhancement seems useful to solve a specific set of problems, specifically, for automatically transforming simple types in Hive (i.e. string) to complex types (LocalDate) in datasets by using dataframe.as[ComplexType]. > Allow UDTs to define custom casting behavior > > > Key: SPARK-28889 > URL: https://issues.apache.org/jira/browse/SPARK-28889 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.4.3 >Reporter: Zachary S Ennenga >Priority: Minor > > Looking at `org.apache.spark.sql.catalyst.expressions.Cast`, UDTs do not > support any sort of casting except for identity casts, IE: > {code:java} > case (udt1: UserDefinedType[_], udt2: UserDefinedType[_]) if udt1.userClass > == udt2.userClass => > true > {code} > I propose we add an additional piece of functionality here to allow UDTs to > define their own canCast and cast functions to allow users to define their > own cast mechanisms. > An example of how this might look: > {code:java} > case (fromType, toType: UserDefinedType[_]) => > toType.canCast(fromType) // Returns boolean > {code} > {code:java} > case (fromType, toType: UserDefinedType[_]) => > toType.cast(fromType) // Returns Casting function > {code} > The UDT base class would contain a default implementation that replicates > current behavior (IE no casting). 
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Issue Comment Deleted] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Schweigert updated SPARK-28921: Comment: was deleted (was: Possible duplicate of https://issues.apache.org/jira/browse/SPARK-28925) > Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10) > - > > Key: SPARK-28921 > URL: https://issues.apache.org/jira/browse/SPARK-28921 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.3 >Reporter: Paul Schweigert >Priority: Critical > > Spark jobs are failing on latest versions of Kubernetes when jobs attempt to > provision executor pods (jobs like Spark-Pi that do not launch executors run > without a problem): > > Here's an example error message: > > {code:java} > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes. > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes.19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: > HTTP 403, Status: 403 - > java.net.ProtocolException: Expected HTTP 101 response but was '403 > Forbidden' > at > okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) > at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748) > {code} > > Looks like the issue is caused by fixes for a recent CVE : > CVE: [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14809] > Fix: [https://github.com/fabric8io/kubernetes-client/pull/1669] > > Looks like upgrading kubernetes-client to 4.4.2 would solve this issue. 
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Issue Comment Deleted] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Schweigert updated SPARK-28921: Comment: was deleted (was: Longer-term solution will be to upgrade the version of the kubernetes-client : [https://github.com/fabric8io/kubernetes-client/pull/1669]) > Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10) > - > > Key: SPARK-28921 > URL: https://issues.apache.org/jira/browse/SPARK-28921 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.3 >Reporter: Paul Schweigert >Priority: Critical > > Spark jobs are failing on latest versions of Kubernetes when jobs attempt to > provision executor pods (jobs like Spark-Pi that do not launch executors run > without a problem): > > Here's an example error message: > > {code:java} > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes. > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes.19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: > HTTP 403, Status: 403 - > java.net.ProtocolException: Expected HTTP 101 response but was '403 > Forbidden' > at > okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) > at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748) > {code} > > Looks like the issue is caused by fixes for a recent CVE : > CVE: [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14809] > Fix: [https://github.com/fabric8io/kubernetes-client/pull/1669] > > Looks like upgrading kubernetes-client to 4.4.2 would solve this issue. 
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-28930) Spark DESC FORMATTED TABLENAME information display issues
[ https://issues.apache.org/jira/browse/SPARK-28930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28930: -- Description: Spark DESC FORMATTED TABLENAME information display issues.Showing incorrect *Last Access time and* feeling some information displays can make it better. Test steps: 1. Open spark sql 2. Create table with partition CREATE EXTERNAL TABLE IF NOT EXISTS employees_info_extended ( id INT, name STRING, usd_flag STRING, salary DOUBLE, deductions MAP, address STRING ) PARTITIONED BY (entrytime STRING) STORED AS TEXTFILE location 'hdfs://hacluster/user/sparkhive/warehouse'; 3. from spark sql check the table description desc formatted tablename; 4. From scala shell check the table description sql("desc formatted tablename").show() *Issue1:* If there is no comment for spark scala shell shows *"null" in small letters* but all other places Hive beeline/Spark beeline/Spark SQL it is showing in *CAPITAL "NULL*". Better to show same in all places. {code} *scala>* sql("desc formatted employees_info_extended").show(false); +-+---++--- |col_name|data_type|*comment*| +-+---++--- |id|int|*null*| |name|string|*null*| |usd_flag|string|*null*| |salary|double|*null*| |deductions|map|*null*| |address|string|null| |entrytime|string|null| | # Partition Information| | | | # col_name|data_type|comment| |entrytime|string|null| | | | | | # Detailed Table Information| | | |Database|sparkdb__| | |Table|employees_info_extended| | |Owner|root| | *|Created Time |Tue Aug 20 13:42:06 CST 2019| |* *|Last Access |Thu Jan 01 08:00:00 CST 1970| |* |Created By|Spark 2.4.3| | |Type|EXTERNAL| | |Provider|hive| | +-+---++--- only showing top 20 rows *scala>* {code} *Issue 2:* Spark SQL "desc formatted tablename" is not showing the header [# col_name,data_type,comment|#col_name,data_type,comment] in the top of the query result.But header is showing on top of partition description. 
For Better understanding show the header on Top of the query result. {code} *spark-sql>* desc formatted employees_info_extended1; id int *NULL* name string *NULL* usd_flag string NULL salary double NULL deductions map NULL address string NULL entrytime string NULL * ## Partition Information* ## col_name data_type comment* entrytime string *NULL* # Detailed Table Information Database sparkdb__ Table employees_info_extended1 Owner spark *Created Time Tue Aug 20 14:50:37 CST 2019* *Last Access Thu Jan 01 08:00:00 CST 1970* Created By Spark 2.3.2.0201 Type EXTERNAL Provider hive Table Properties [transient_lastDdlTime=1566286655] Location hdfs://hacluster/user/sparkhive/warehouse Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat org.apache.hadoop.mapred.TextInputFormat OutputFormat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Storage Properties [serialization.format=1] Partition Provider Catalog Time taken: 0.477 seconds, Fetched 27 row(s) *spark-sql>* {code} *Issue 3:* I created the table on Aug 20.So it is showing created time correct .*But Last access time showing 1970 Jan 01*. It is not good to show Last access time earlier time than the created time.Better to show the correct date and time else show UNKNOWN. *[Created Time,Tue Aug 20 13:42:06 CST 2019,]* *[Last Access,Thu Jan 01 08:00:00 CST 1970,]* was: Spark DESC FORMATTED TABLENAME information display issues.Showing incorrect *Last Access time and* feeling some information displays can make it better. Test steps: 1. Open spark sql 2. Create table with partition CREATE EXTERNAL TABLE IF NOT EXISTS employees_info_extended ( id INT, name STRING, usd_flag STRING, salary DOUBLE, deductions MAP, address STRING ) PARTITIONED BY (entrytime STRING) STORED AS TEXTFILE location 'hdfs://hacluster/user/sparkhive/warehouse'; 3. from spark sql check the table description desc formatted tablename; 4. 
From scala shell check the table description sql("desc formatted tablename").show() *Issue1:* If there is no comment for spark scala shell shows *"null" in small letters* but all other places Hive beeline/Spark beeline/Spark SQL it is showing in *CAPITAL "NULL*". Better to show same in all places. *scala>* sql("desc formatted employees_info_extended").show(false); +-+---++--- |col_name|data_type|*comment*| +-+---++--- |id|int|*null*| |name|string|*null*| |usd_flag|string|*null*| |salary|double|*null*| |deductions|map|*null*| |address|string|null| |entrytime|string|null| | # Partition Information| | | | # col_name|data_type|comment| |entrytime|string|null| | | | | | # Detailed Table Information| | | |Database|sparkdb__|
[jira] [Updated] (SPARK-28930) Spark DESC FORMATTED TABLENAME information display issues
[ https://issues.apache.org/jira/browse/SPARK-28930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28930: -- Component/s: (was: Spark Shell) > Spark DESC FORMATTED TABLENAME information display issues > - > > Key: SPARK-28930 > URL: https://issues.apache.org/jira/browse/SPARK-28930 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.3 >Reporter: jobit mathew >Priority: Minor > > Spark DESC FORMATTED TABLENAME information display issues.Showing incorrect > *Last Access time and* feeling some information displays can make it better. > Test steps: > 1. Open spark sql > 2. Create table with partition > CREATE EXTERNAL TABLE IF NOT EXISTS employees_info_extended ( id INT, name > STRING, usd_flag STRING, salary DOUBLE, deductions MAP, > address STRING ) PARTITIONED BY (entrytime STRING) STORED AS TEXTFILE > location 'hdfs://hacluster/user/sparkhive/warehouse'; > 3. from spark sql check the table description > desc formatted tablename; > 4. From scala shell check the table description > sql("desc formatted tablename").show() > *Issue1:* > If there is no comment for spark scala shell shows *"null" in small letters* > but all other places Hive beeline/Spark beeline/Spark SQL it is showing in > *CAPITAL "NULL*". Better to show same in all places. 
> > {code} > *scala>* sql("desc formatted employees_info_extended").show(false); > +-+---++--- > |col_name|data_type|*comment*| > +-+---++--- > |id|int|*null*| > |name|string|*null*| > |usd_flag|string|*null*| > |salary|double|*null*| > |deductions|map|*null*| > |address|string|null| > |entrytime|string|null| > | # Partition Information| | | > | # col_name|data_type|comment| > |entrytime|string|null| > | | | | > | # Detailed Table Information| | | > |Database|sparkdb__| | > |Table|employees_info_extended| | > |Owner|root| | > *|Created Time |Tue Aug 20 13:42:06 CST 2019| |* > *|Last Access |Thu Jan 01 08:00:00 CST 1970| |* > |Created By|Spark 2.4.3| | > |Type|EXTERNAL| | > |Provider|hive| | > +-+---++--- > only showing top 20 rows > *scala>* > {code} > *Issue 2:* > Spark SQL "desc formatted tablename" is not showing the header [# > col_name,data_type,comment|#col_name,data_type,comment] in the top of the > query result.But header is showing on top of partition description. For > Better understanding show the header on Top of the query result. 
> {code} > *spark-sql>* desc formatted employees_info_extended1; > id int *NULL* > name string *NULL* > usd_flag string NULL > salary double NULL > deductions map NULL > address string NULL > entrytime string NULL > * > ## Partition Information* > ## col_name data_type comment* > entrytime string *NULL* > # Detailed Table Information > Database sparkdb__ > Table employees_info_extended1 > Owner spark > *Created Time Tue Aug 20 14:50:37 CST 2019* > *Last Access Thu Jan 01 08:00:00 CST 1970* > Created By Spark 2.3.2.0201 > Type EXTERNAL > Provider hive > Table Properties [transient_lastDdlTime=1566286655] > Location hdfs://hacluster/user/sparkhive/warehouse > Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > InputFormat org.apache.hadoop.mapred.TextInputFormat > OutputFormat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Storage Properties [serialization.format=1] > Partition Provider Catalog > Time taken: 0.477 seconds, Fetched 27 row(s) > *spark-sql>* > {code} > > *Issue 3:* > I created the table on Aug 20.So it is showing created time correct .*But > Last access time showing 1970 Jan 01*. It is not good to show Last access > time earlier time than the created time.Better to show the correct date and > time else show UNKNOWN. > *[Created Time,Tue Aug 20 13:42:06 CST 2019,]* > *[Last Access,Thu Jan 01 08:00:00 CST 1970,]* -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
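A minimal sketch of how the display issues described above could be normalized. The helper names `formatComment` and `formatLastAccess` are illustrative only, not Spark's actual code:

```scala
// Issue 1: render a missing column comment as a consistent "NULL"
// everywhere (Hive beeline, Spark beeline, Spark SQL, scala shell).
def formatComment(comment: Option[String]): String =
  comment.getOrElse("NULL")

// Issue 3: a stored last-access time of 0 ms is just the epoch
// placeholder, which prints as "Thu Jan 01 ... 1970"; showing UNKNOWN
// is clearer than a time earlier than the table's creation time.
def formatLastAccess(lastAccessMs: Long): String =
  if (lastAccessMs <= 0) "UNKNOWN"
  else java.time.Instant.ofEpochMilli(lastAccessMs).toString
```

Issue 2 (the missing header row) is a pure output-layout change and has no equivalent one-liner; it would be addressed where the `DESC FORMATTED` rows are emitted.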
[jira] [Assigned] (SPARK-28571) Shuffle storage API: Use API in SortShuffleWriter
[ https://issues.apache.org/jira/browse/SPARK-28571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-28571: -- Assignee: Matt Cheah > Shuffle storage API: Use API in SortShuffleWriter > - > > Key: SPARK-28571 > URL: https://issues.apache.org/jira/browse/SPARK-28571 > Project: Spark > Issue Type: Sub-task > Components: Shuffle >Affects Versions: 3.0.0 >Reporter: Matt Cheah >Assignee: Matt Cheah >Priority: Major > > Use the APIs introduced in SPARK-28209 in the SortShuffleWriter. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-28571) Shuffle storage API: Use API in SortShuffleWriter
[ https://issues.apache.org/jira/browse/SPARK-28571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-28571. Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25342 [https://github.com/apache/spark/pull/25342] > Shuffle storage API: Use API in SortShuffleWriter > - > > Key: SPARK-28571 > URL: https://issues.apache.org/jira/browse/SPARK-28571 > Project: Spark > Issue Type: Sub-task > Components: Shuffle >Affects Versions: 3.0.0 >Reporter: Matt Cheah >Assignee: Matt Cheah >Priority: Major > Fix For: 3.0.0 > > > Use the APIs introduced in SPARK-28209 in the SortShuffleWriter. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-28759) Upgrade scala-maven-plugin to 4.2.0
[ https://issues.apache.org/jira/browse/SPARK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-28759. --- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25633 [https://github.com/apache/spark/pull/25633] > Upgrade scala-maven-plugin to 4.2.0 > --- > > Key: SPARK-28759 > URL: https://issues.apache.org/jira/browse/SPARK-28759 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.0.0 >Reporter: Dongjoon Hyun >Assignee: Hyukjin Kwon >Priority: Major > Fix For: 3.0.0 > > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-28759) Upgrade scala-maven-plugin to 4.2.0
[ https://issues.apache.org/jira/browse/SPARK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-28759: - Assignee: Hyukjin Kwon > Upgrade scala-maven-plugin to 4.2.0 > --- > > Key: SPARK-28759 > URL: https://issues.apache.org/jira/browse/SPARK-28759 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.0.0 >Reporter: Dongjoon Hyun >Assignee: Hyukjin Kwon >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-28866) Persist item factors RDD when checkpointing in ALS
[ https://issues.apache.org/jira/browse/SPARK-28866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-28866. --- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25576 [https://github.com/apache/spark/pull/25576] > Persist item factors RDD when checkpointing in ALS > -- > > Key: SPARK-28866 > URL: https://issues.apache.org/jira/browse/SPARK-28866 > Project: Spark > Issue Type: Improvement > Components: ML >Affects Versions: 3.0.0 >Reporter: Liang-Chi Hsieh >Assignee: Liang-Chi Hsieh >Priority: Minor > Fix For: 3.0.0 > > > In ALS ML implementation, if `implicitPrefs` is false, we checkpoint the RDD > of item factors, between intervals. Before checkpointing and materializing > RDD, this RDD was not persisted. It causes recomputation. In an experiment, > there is performance difference between persisting and no persisting before > checkpointing the RDD. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-28866) Persist item factors RDD when checkpointing in ALS
[ https://issues.apache.org/jira/browse/SPARK-28866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-28866: - Assignee: Liang-Chi Hsieh > Persist item factors RDD when checkpointing in ALS > -- > > Key: SPARK-28866 > URL: https://issues.apache.org/jira/browse/SPARK-28866 > Project: Spark > Issue Type: Improvement > Components: ML >Affects Versions: 3.0.0 >Reporter: Liang-Chi Hsieh >Assignee: Liang-Chi Hsieh >Priority: Minor > > In ALS ML implementation, if `implicitPrefs` is false, we checkpoint the RDD > of item factors, between intervals. Before checkpointing and materializing > RDD, this RDD was not persisted. It causes recomputation. In an experiment, > there is performance difference between persisting and no persisting before > checkpointing the RDD. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
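The persist-before-checkpoint reasoning behind SPARK-28866 can be illustrated without a cluster: checkpointing materializes the lineage in addition to the subsequent action, so an unpersisted dataset is computed more than once. This toy stand-in (not Spark's RDD API) makes the recomputation visible:

```scala
// Minimal analogy for an RDD lineage: compute() is the expensive lineage,
// persist() caches its result, materialize() plays the role of an action
// or a checkpoint write.
class LazyData[T](compute: () => T) {
  var computeCount = 0                  // how many times the lineage ran
  private var cached: Option[T] = None

  def persist(): this.type = {
    if (cached.isEmpty) { computeCount += 1; cached = Some(compute()) }
    this
  }

  // Without persist(), every materialization recomputes from scratch.
  def materialize(): T =
    cached.getOrElse { computeCount += 1; compute() }
}
```

In the ALS fix, persisting the item-factor RDD before checkpointing ensures the expensive factor computation runs once instead of once per materialization.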
[jira] [Commented] (SPARK-28900) Test Pyspark, SparkR on JDK 11 with run-tests
[ https://issues.apache.org/jira/browse/SPARK-28900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919697#comment-16919697 ] Josh Rosen commented on SPARK-28900: Some quick notes / braindump (I may write more later): * AFAIK _release_ publishing / artifact signing hasn't taken place on Jenkins for a while now (I'm not sure if we're still doing snapshot publishing there, though). Given this, we should delete unused publishing builders and their associated credentials (which I think have been rotated anyways). I'm _pretty_ sure this is technically feasible, but it's been a long time since I've last investigated. If we delete the publishing builders then it should be fairly straightforward to dump a snapshot of the JJB scripts into a public repo (sans-git-history, perhaps). * We should consider removing code / builders for old branches which will never be patched (such as {{branch-1.6}}). This may simplify the build scripts. * Strong +1 from me towards using Dockerized build container: a standard Docker environment would let us remove most of the legacy build cruft. ** IIRC Dockerization of these builds in AMPLab Jenkins was historically blocked by the old version of CentOS's Docker support: the Docker daemon would lock up / freeze if launching many PR builder jobs in parallel. This should be fixed for the newer Ubuntu hosts, though. ** Alternatively, eventually porting all of this to Bazel and sourcing all languages' dependencies and toolchains from there instead from the local environment would sidestep a lot of these problems. * I think we may already have some mechanism which builds conda environments / virtualenvs for the PySpark packaging tests? Maybe that could be used for the regular PySpark tests as well? 
> Test Pyspark, SparkR on JDK 11 with run-tests > - > > Key: SPARK-28900 > URL: https://issues.apache.org/jira/browse/SPARK-28900 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.0.0 >Reporter: Sean Owen >Priority: Major > > Right now, we are testing JDK 11 with a Maven-based build, as in > https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-3.2/ > It looks like _all_ of the Maven-based jobs 'manually' build and invoke > tests, and only run tests via Maven -- that is, they do not run Pyspark or > SparkR tests. The SBT-based builds do, because they use the {{dev/run-tests}} > script that is meant to be for this purpose. > In fact, there seem to be a couple flavors of copy-pasted build configs. SBT > builds look like: > {code} > #!/bin/bash > set -e > # Configure per-build-executor Ivy caches to avoid SBT Ivy lock contention > export HOME="/home/sparkivy/per-executor-caches/$EXECUTOR_NUMBER" > mkdir -p "$HOME" > export SBT_OPTS="-Duser.home=$HOME -Dsbt.ivy.home=$HOME/.ivy2" > export SPARK_VERSIONS_SUITE_IVY_PATH="$HOME/.ivy2" > # Add a pre-downloaded version of Maven to the path so that we avoid the > flaky download step. > export > PATH="/home/jenkins/tools/hudson.tasks.Maven_MavenInstallation/Maven_3.3.9/bin/:$PATH" > git clean -fdx > ./dev/run-tests > {code} > Maven builds looks like: > {code} > #!/bin/bash > set -x > set -e > rm -rf ./work > git clean -fdx > # Generate random point for Zinc > export ZINC_PORT > ZINC_PORT=$(python -S -c "import random; print random.randrange(3030,4030)") > # Use per-build-executor Ivy caches to avoid SBT Ivy lock contention: > export > SPARK_VERSIONS_SUITE_IVY_PATH="/home/sparkivy/per-executor-caches/$EXECUTOR_NUMBER/.ivy2" > mkdir -p "$SPARK_VERSIONS_SUITE_IVY_PATH" > # Prepend JAVA_HOME/bin to fix issue where Zinc's embedded SBT incremental > compiler seems to > # ignore our JAVA_HOME and use the system javac instead. 
> export PATH="$JAVA_HOME/bin:$PATH" > # Add a pre-downloaded version of Maven to the path so that we avoid the > flaky download step. > export > PATH="/home/jenkins/tools/hudson.tasks.Maven_MavenInstallation/Maven_3.3.9/bin/:$PATH" > MVN="build/mvn -DzincPort=$ZINC_PORT" > set +e > if [[ $HADOOP_PROFILE == hadoop-1 ]]; then > # Note that there is no -Pyarn flag here for Hadoop 1: > $MVN \ > -DskipTests \ > -P"$HADOOP_PROFILE" \ > -Dhadoop.version="$HADOOP_VERSION" \ > -Phive \ > -Phive-thriftserver \ > -Pkinesis-asl \ > -Pmesos \ > clean package > retcode1=$? > $MVN \ > -P"$HADOOP_PROFILE" \ > -Dhadoop.version="$HADOOP_VERSION" \ > -Phive \ > -Phive-thriftserver \ > -Pkinesis-asl \ > -Pmesos \ > --fail-at-end \ > test > retcode2=$? > else > $MVN \ > -DskipTests \ > -P"$HADOOP_PROFILE" \ > -Pyarn \ > -Phive \ > -Phive-thriftserver \
[jira] [Updated] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Schweigert updated SPARK-28921: Description: Spark jobs are failing on latest versions of Kubernetes when jobs attempt to provision executor pods (jobs like Spark-Pi that do not launch executors run without a problem): Here's an example error message: {code:java} 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors from Kubernetes. 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors from Kubernetes.19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: HTTP 403, Status: 403 - java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {code} Looks like the issue is caused by fixes for a recent CVE : CVE: [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14809] Fix: [https://github.com/fabric8io/kubernetes-client/pull/1669] Looks like upgrading kubernetes-client to 4.4.2 would solve this issue. was: Spark jobs are failing on latest versions of Kubernetes when jobs attempt to provision executor pods (jobs like Spark-Pi that do not launch executors run without a problem): Here's an example error message: {code:java} 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors from Kubernetes. 
19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors from Kubernetes.19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: HTTP 403, Status: 403 - java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {code} Looks like the issue is caused by the internal master Kubernetes url not having the port specified: [https://github.com/apache/spark/blob/master//resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Constants.scala#L82:7] Using the master with the port (443) seems to fix the problem. > Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10) > - > > Key: SPARK-28921 > URL: https://issues.apache.org/jira/browse/SPARK-28921 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.3 >Reporter: Paul Schweigert >Priority: Critical > > Spark jobs are failing on latest versions of Kubernetes when jobs attempt to > provision executor pods (jobs like Spark-Pi that do not launch executors run > without a problem): > > Here's an example error message: > > {code:java} > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes. 
> 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes.19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: > HTTP 403, Status: 403 - > java.net.ProtocolException: Expected HTTP 101 response but was '403 > Forbidden' > at > okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) > at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748) > {code} > > Looks like the issue is caused by fixes for a recent CVE : > CVE: [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14809] > Fix: [https://github.com/fabric8io/kubernetes-client/pull/1669] > > Looks like upgrading kubernetes-client to 4.4.2 would solve this issue. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (SPARK-28931) Fix couple of bugs in FsHistoryProviderSuite
Jungtaek Lim created SPARK-28931: Summary: Fix couple of bugs in FsHistoryProviderSuite Key: SPARK-28931 URL: https://issues.apache.org/jira/browse/SPARK-28931 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.0.0 Reporter: Jungtaek Lim There are a couple of bugs in FsHistoryProviderSuite itself. # When creating a log file via {{newLogFile}}, the codec is ignored, leading to a wrong file name. (No one tends to write tests for test code, and the bug doesn't affect existing tests, so it is not easy to catch.) # When writing events to a log file via {{writeFile}}, the metadata (in the case of the new format) gets written to the file regardless of its codec, and its content is then overwritten by another stream, so no Spark version information is available. This affects an existing test, hence the wrong expected value used to work around the bug. Note that these are bugs in test code only; non-test code works fine.
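Bug (1) above says {{newLogFile}} drops the codec when building the file name. A minimal hedged sketch of what a codec-aware name could look like — the method name and shape follow the issue text, but the exact signature and naming scheme in FsHistoryProviderSuite are assumptions here, not taken from the source:

{code:java}
// Sketch only: append the codec's short name as an extension when one is
// given, so the test helper produces the same names the provider expects.
// newLogFile / codec are the identifiers mentioned in the issue; the
// signature is hypothetical.
def newLogFile(appId: String, codec: Option[String]): String =
  codec.map(c => s"$appId.$c").getOrElse(appId)
{code}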
[jira] [Updated] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Schweigert updated SPARK-28921: Priority: Critical (was: Minor) > Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10) > - > > Key: SPARK-28921 > URL: https://issues.apache.org/jira/browse/SPARK-28921 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.3 >Reporter: Paul Schweigert >Priority: Critical > > Spark jobs are failing on latest versions of Kubernetes when jobs attempt to > provision executor pods (jobs like Spark-Pi that do not launch executors run > without a problem): > > Here's an example error message: > > {code:java} > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes. > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes.19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: > HTTP 403, Status: 403 - > java.net.ProtocolException: Expected HTTP 101 response but was '403 > Forbidden' > at > okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) > at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748) > {code} > > Looks like the issue is caused by the internal master Kubernetes url not > having the port specified: > [https://github.com/apache/spark/blob/master//resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Constants.scala#L82:7] > > Using the master with the port (443) seems to fix the problem. 
[jira] [Commented] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919608#comment-16919608 ] Paul Schweigert commented on SPARK-28921: - Possible duplicate of https://issues.apache.org/jira/browse/SPARK-28925 > Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10) > - > > Key: SPARK-28921 > URL: https://issues.apache.org/jira/browse/SPARK-28921 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.3 >Reporter: Paul Schweigert >Priority: Minor > > Spark jobs are failing on latest versions of Kubernetes when jobs attempt to > provision executor pods (jobs like Spark-Pi that do not launch executors run > without a problem): > > Here's an example error message: > > {code:java} > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes. > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes.19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: > HTTP 403, Status: 403 - > java.net.ProtocolException: Expected HTTP 101 response but was '403 > Forbidden' > at > okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) > at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748) > {code} > > Looks like the issue is caused by the internal master Kubernetes url not > having the port specified: > [https://github.com/apache/spark/blob/master//resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Constants.scala#L82:7] > > Using the master with the port (443) seems to fix the problem. 
[jira] [Commented] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919596#comment-16919596 ] Paul Schweigert commented on SPARK-28921: - Longer-term solution will be to upgrade the version of the kubernetes-client : [https://github.com/fabric8io/kubernetes-client/pull/1669] > Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10) > - > > Key: SPARK-28921 > URL: https://issues.apache.org/jira/browse/SPARK-28921 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.3 >Reporter: Paul Schweigert >Priority: Minor > > Spark jobs are failing on latest versions of Kubernetes when jobs attempt to > provision executor pods (jobs like Spark-Pi that do not launch executors run > without a problem): > > Here's an example error message: > > {code:java} > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes. > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes.19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: > HTTP 403, Status: 403 - > java.net.ProtocolException: Expected HTTP 101 response but was '403 > Forbidden' > at > okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) > at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748) > {code} > > Looks like the issue is caused by the internal master Kubernetes url not > having the port specified: > [https://github.com/apache/spark/blob/master//resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Constants.scala#L82:7] > > Using the 
master with the port (443) seems to fix the problem.
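The comments above describe two remedies: upgrade kubernetes-client, or, as a workaround, give the in-cluster master URL an explicit port. A hedged sketch of the workaround — the constant lives at the Constants.scala location linked above, and 443 is the conventional HTTPS API-server port assumed here, not a value confirmed by the thread:

{code:java}
// Sketch of the workaround only, not the upstream fix: spell out the port
// on the internal master URL so the fabric8/okhttp watch request is not
// rejected with HTTP 403 on patched Kubernetes versions.
val KUBERNETES_MASTER_INTERNAL_URL = "https://kubernetes.default.svc:443"
{code}

Equivalently, a job can be submitted with the port written into the k8s:// master URL instead of relying on the default.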
[jira] [Commented] (SPARK-28025) HDFSBackedStateStoreProvider should not leak .crc files
[ https://issues.apache.org/jira/browse/SPARK-28025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919583#comment-16919583 ] Stavros Kontopoulos commented on SPARK-28025: - Thanks I will have a look :) > HDFSBackedStateStoreProvider should not leak .crc files > > > Key: SPARK-28025 > URL: https://issues.apache.org/jira/browse/SPARK-28025 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.3 > Environment: Spark 2.4.3 > Kubernetes 1.11(?) (OpenShift) > StateStore storage on a mounted PVC. Viewed as a local filesystem by the > `FileContextBasedCheckpointFileManager` : > {noformat} > scala> glusterfm.isLocal > res17: Boolean = true{noformat} >Reporter: Gerard Maas >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.4, 3.0.0 > > > The HDFSBackedStateStoreProvider when using the default CheckpointFileManager > is leaving '.crc' files behind. There's a .crc file created for each > `atomicFile` operation of the CheckpointFileManager. > Over time, the number of files becomes very large. It makes the state store > file system constantly increase in size and, in our case, deteriorates the > file system performance. > Here's a sample of one of our spark storage volumes after 2 days of execution > (4 stateful streaming jobs, each on a different sub-dir): > # > {noformat} > Total files in PVC (used for checkpoints and state store) > $find . | wc -l > 431796 > # .crc files > $find . -name "*.crc" | wc -l > 418053{noformat} > With each .crc file taking one storage block, the used storage runs into the > GBs of data. > These jobs are running on Kubernetes. 
Our shared storage provider, GlusterFS, > shows serious performance deterioration with this large number of files: > {noformat} > DEBUG HDFSBackedStateStoreProvider: fetchFiles() took 29164ms{noformat}
[jira] [Comment Edited] (SPARK-28025) HDFSBackedStateStoreProvider should not leak .crc files
[ https://issues.apache.org/jira/browse/SPARK-28025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919577#comment-16919577 ] Stavros Kontopoulos edited comment on SPARK-28025 at 8/30/19 2:15 PM: -- [~kabhwan] cool I have a look. was (Author: skonto): [~kabhwan] which PR? > HDFSBackedStateStoreProvider should not leak .crc files > > > Key: SPARK-28025 > URL: https://issues.apache.org/jira/browse/SPARK-28025 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.3 > Environment: Spark 2.4.3 > Kubernetes 1.11(?) (OpenShift) > StateStore storage on a mounted PVC. Viewed as a local filesystem by the > `FileContextBasedCheckpointFileManager` : > {noformat} > scala> glusterfm.isLocal > res17: Boolean = true{noformat} >Reporter: Gerard Maas >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.4, 3.0.0 > > > The HDFSBackedStateStoreProvider when using the default CheckpointFileManager > is leaving '.crc' files behind. There's a .crc file created for each > `atomicFile` operation of the CheckpointFileManager. > Over time, the number of files becomes very large. It makes the state store > file system constantly increase in size and, in our case, deteriorates the > file system performance. > Here's a sample of one of our spark storage volumes after 2 days of execution > (4 stateful streaming jobs, each on a different sub-dir): > # > {noformat} > Total files in PVC (used for checkpoints and state store) > $find . | wc -l > 431796 > # .crc files > $find . -name "*.crc" | wc -l > 418053{noformat} > With each .crc file taking one storage block, the used storage runs into the > GBs of data. > These jobs are running on Kubernetes. 
Our shared storage provider, GlusterFS, > shows serious performance deterioration with this large number of files: > {noformat} > DEBUG HDFSBackedStateStoreProvider: fetchFiles() took 29164ms{noformat}
[jira] [Commented] (SPARK-28025) HDFSBackedStateStoreProvider should not leak .crc files
[ https://issues.apache.org/jira/browse/SPARK-28025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919580#comment-16919580 ] Gabor Somogyi commented on SPARK-28025: --- [~skonto], this one: [https://github.com/apache/spark/pull/25488] > HDFSBackedStateStoreProvider should not leak .crc files > > > Key: SPARK-28025 > URL: https://issues.apache.org/jira/browse/SPARK-28025 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.3 > Environment: Spark 2.4.3 > Kubernetes 1.11(?) (OpenShift) > StateStore storage on a mounted PVC. Viewed as a local filesystem by the > `FileContextBasedCheckpointFileManager` : > {noformat} > scala> glusterfm.isLocal > res17: Boolean = true{noformat} >Reporter: Gerard Maas >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.4, 3.0.0 > > > The HDFSBackedStateStoreProvider when using the default CheckpointFileManager > is leaving '.crc' files behind. There's a .crc file created for each > `atomicFile` operation of the CheckpointFileManager. > Over time, the number of files becomes very large. It makes the state store > file system constantly increase in size and, in our case, deteriorates the > file system performance. > Here's a sample of one of our spark storage volumes after 2 days of execution > (4 stateful streaming jobs, each on a different sub-dir): > # > {noformat} > Total files in PVC (used for checkpoints and state store) > $find . | wc -l > 431796 > # .crc files > $find . -name "*.crc" | wc -l > 418053{noformat} > With each .crc file taking one storage block, the used storage runs into the > GBs of data. > These jobs are running on Kubernetes. 
Our shared storage provider, GlusterFS, > shows serious performance deterioration with this large number of files: > {noformat} > DEBUG HDFSBackedStateStoreProvider: fetchFiles() took 29164ms{noformat}
[jira] [Commented] (SPARK-28025) HDFSBackedStateStoreProvider should not leak .crc files
[ https://issues.apache.org/jira/browse/SPARK-28025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919577#comment-16919577 ] Stavros Kontopoulos commented on SPARK-28025: - [~kabhwan] which PR? > HDFSBackedStateStoreProvider should not leak .crc files > > > Key: SPARK-28025 > URL: https://issues.apache.org/jira/browse/SPARK-28025 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.3 > Environment: Spark 2.4.3 > Kubernetes 1.11(?) (OpenShift) > StateStore storage on a mounted PVC. Viewed as a local filesystem by the > `FileContextBasedCheckpointFileManager` : > {noformat} > scala> glusterfm.isLocal > res17: Boolean = true{noformat} >Reporter: Gerard Maas >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.4, 3.0.0 > > > The HDFSBackedStateStoreProvider when using the default CheckpointFileManager > is leaving '.crc' files behind. There's a .crc file created for each > `atomicFile` operation of the CheckpointFileManager. > Over time, the number of files becomes very large. It makes the state store > file system constantly increase in size and, in our case, deteriorates the > file system performance. > Here's a sample of one of our spark storage volumes after 2 days of execution > (4 stateful streaming jobs, each on a different sub-dir): > # > {noformat} > Total files in PVC (used for checkpoints and state store) > $find . | wc -l > 431796 > # .crc files > $find . -name "*.crc" | wc -l > 418053{noformat} > With each .crc file taking one storage block, the used storage runs into the > GBs of data. > These jobs are running on Kubernetes. 
Our shared storage provider, GlusterFS, > shows serious performance deterioration with this large number of files: > {noformat} > DEBUG HDFSBackedStateStoreProvider: fetchFiles() took 29164ms{noformat}
[jira] [Commented] (SPARK-28930) Spark DESC FORMATTED TABLENAME information display issues
[ https://issues.apache.org/jira/browse/SPARK-28930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919563#comment-16919563 ] Sujith Chacko commented on SPARK-28930: --- @ [~jobitmathew] As i remember Issue 3 is already handled as part of SPARK-24812 some time back, need to recheck. other issues i will check and get back to you. cc [~dongjoon] > Spark DESC FORMATTED TABLENAME information display issues > - > > Key: SPARK-28930 > URL: https://issues.apache.org/jira/browse/SPARK-28930 > Project: Spark > Issue Type: Bug > Components: Spark Shell, SQL >Affects Versions: 2.4.3 >Reporter: jobit mathew >Priority: Minor > > Spark DESC FORMATTED TABLENAME information display issues.Showing incorrect > *Last Access time and* feeling some information displays can make it better. > Test steps: > 1. Open spark sql > 2. Create table with partition > CREATE EXTERNAL TABLE IF NOT EXISTS employees_info_extended ( id INT, name > STRING, usd_flag STRING, salary DOUBLE, deductions MAP, > address STRING ) PARTITIONED BY (entrytime STRING) STORED AS TEXTFILE > location 'hdfs://hacluster/user/sparkhive/warehouse'; > 3. from spark sql check the table description > desc formatted tablename; > 4. From scala shell check the table description > sql("desc formatted tablename").show() > *Issue1:* > If there is no comment for spark scala shell shows *"null" in small letters* > but all other places Hive beeline/Spark beeline/Spark SQL it is showing in > *CAPITAL "NULL*". Better to show same in all places. 
> > *scala>* sql("desc formatted employees_info_extended").show(false); > +-+---++--- > |col_name|data_type|*comment*| > +-+---++--- > |id|int|*null*| > |name|string|*null*| > |usd_flag|string|*null*| > |salary|double|*null*| > |deductions|map|*null*| > |address|string|null| > |entrytime|string|null| > | # Partition Information| | | > | # col_name|data_type|comment| > |entrytime|string|null| > | | | | > | # Detailed Table Information| | | > |Database|sparkdb__| | > |Table|employees_info_extended| | > |Owner|root| | > *|Created Time |Tue Aug 20 13:42:06 CST 2019| |* > *|Last Access |Thu Jan 01 08:00:00 CST 1970| |* > |Created By|Spark 2.4.3| | > |Type|EXTERNAL| | > |Provider|hive| | > +-+---++--- > only showing top 20 rows > *scala>* > *Issue 2:* > Spark SQL "desc formatted tablename" is not showing the header [# > col_name,data_type,comment|#col_name,data_type,comment] in the top of the > query result.But header is showing on top of partition description. For > Better understanding show the header on Top of the query result. 
> *spark-sql>* desc formatted employees_info_extended1; > id int *NULL* > name string *NULL* > usd_flag string NULL > salary double NULL > deductions map NULL > address string NULL > entrytime string NULL > * > ## Partition Information* > ## col_name data_type comment* > entrytime string *NULL* > # Detailed Table Information > Database sparkdb__ > Table employees_info_extended1 > Owner spark > *Created Time Tue Aug 20 14:50:37 CST 2019* > *Last Access Thu Jan 01 08:00:00 CST 1970* > Created By Spark 2.3.2.0201 > Type EXTERNAL > Provider hive > Table Properties [transient_lastDdlTime=1566286655] > Location hdfs://hacluster/user/sparkhive/warehouse > Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > InputFormat org.apache.hadoop.mapred.TextInputFormat > OutputFormat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Storage Properties [serialization.format=1] > Partition Provider Catalog > Time taken: 0.477 seconds, Fetched 27 row(s) > *spark-sql>* > > *Issue 3:* > I created the table on Aug 20.So it is showing created time correct .*But > Last access time showing 1970 Jan 01*. It is not good to show Last access > time earlier time than the created time.Better to show the correct date and > time else show UNKNOWN. > *[Created Time,Tue Aug 20 13:42:06 CST 2019,]* > *[Last Access,Thu Jan 01 08:00:00 CST 1970,]* -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-28930) Spark DESC FORMATTED TABLENAME information display issues
[ https://issues.apache.org/jira/browse/SPARK-28930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jobit mathew updated SPARK-28930: - Description: Spark DESC FORMATTED TABLENAME information display issues.Showing incorrect *Last Access time and* feeling some information displays can make it better. Test steps: 1. Open spark sql 2. Create table with partition CREATE EXTERNAL TABLE IF NOT EXISTS employees_info_extended ( id INT, name STRING, usd_flag STRING, salary DOUBLE, deductions MAP, address STRING ) PARTITIONED BY (entrytime STRING) STORED AS TEXTFILE location 'hdfs://hacluster/user/sparkhive/warehouse'; 3. from spark sql check the table description desc formatted tablename; 4. From scala shell check the table description sql("desc formatted tablename").show() *Issue1:* If there is no comment for spark scala shell shows *"null" in small letters* but all other places Hive beeline/Spark beeline/Spark SQL it is showing in *CAPITAL "NULL*". Better to show same in all places. *scala>* sql("desc formatted employees_info_extended").show(false); +-+---++--- |col_name|data_type|*comment*| +-+---++--- |id|int|*null*| |name|string|*null*| |usd_flag|string|*null*| |salary|double|*null*| |deductions|map|*null*| |address|string|null| |entrytime|string|null| | # Partition Information| | | | # col_name|data_type|comment| |entrytime|string|null| | | | | | # Detailed Table Information| | | |Database|sparkdb__| | |Table|employees_info_extended| | |Owner|root| | *|Created Time |Tue Aug 20 13:42:06 CST 2019| |* *|Last Access |Thu Jan 01 08:00:00 CST 1970| |* |Created By|Spark 2.4.3| | |Type|EXTERNAL| | |Provider|hive| | +-+---++--- only showing top 20 rows *scala>* *Issue 2:* Spark SQL "desc formatted tablename" is not showing the header [# col_name,data_type,comment|#col_name,data_type,comment] in the top of the query result.But header is showing on top of partition description. For Better understanding show the header on Top of the query result. 
*spark-sql>* desc formatted employees_info_extended1; id int *NULL* name string *NULL* usd_flag string NULL salary double NULL deductions map NULL address string NULL entrytime string NULL * ## Partition Information* ## col_name data_type comment* entrytime string *NULL* # Detailed Table Information Database sparkdb__ Table employees_info_extended1 Owner spark *Created Time Tue Aug 20 14:50:37 CST 2019* *Last Access Thu Jan 01 08:00:00 CST 1970* Created By Spark 2.3.2.0201 Type EXTERNAL Provider hive Table Properties [transient_lastDdlTime=1566286655] Location hdfs://hacluster/user/sparkhive/warehouse Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat org.apache.hadoop.mapred.TextInputFormat OutputFormat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Storage Properties [serialization.format=1] Partition Provider Catalog Time taken: 0.477 seconds, Fetched 27 row(s) *spark-sql>* *Issue 3:* I created the table on Aug 20.So it is showing created time correct .*But Last access time showing 1970 Jan 01*. It is not good to show Last access time earlier time than the created time.Better to show the correct date and time else show UNKNOWN. *[Created Time,Tue Aug 20 13:42:06 CST 2019,]* *[Last Access,Thu Jan 01 08:00:00 CST 1970,]* was: Spark DESC FORMATTED TABLENAME information display issues.Showing incorrect *Last Access time and* feeling some information displays can make it better. Test steps: 1. Open spark sql 2. Create table with partition CREATE EXTERNAL TABLE IF NOT EXISTS employees_info_extended ( id INT, name STRING, usd_flag STRING, salary DOUBLE, deductions MAP, address STRING ) PARTITIONED BY (entrytime STRING) STORED AS TEXTFILE location 'hdfs://hacluster/user/sparkhive/warehouse'; 3. from spark sql check the table description desc formatted tablename; 4. 
From scala shell check the table description sql("desc formatted tablename").show() Issue1: If there is no comment for spark scala shell shows *"null" in small letters* but all other places Hive beeline/Spark beeline/Spark SQL it is showing in *CAPITAL "NULL*". Better to show same in all places. *scala>* sql("desc formatted employees_info_extended").show(false); +++---+ |col_name |data_type |*comment*| +++---+ |id |int |*null* | |name |string |*null* | |usd_flag |string |*null* | |salary |double |*null* | |deductions |map |*null* | |address |string |null | |entrytime |string |null | |# Partition Information | | | |# col_name |data_type |comment| |entrytime |string |null | | | | | |# Detailed Table Information| | | |Database |sparkdb__ | |
[jira] [Commented] (SPARK-28929) Spark Logging level should be INFO instead of Debug in Executor Plugin API[SPARK-24918]
[ https://issues.apache.org/jira/browse/SPARK-28929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919524#comment-16919524 ] Rakesh Raushan commented on SPARK-28929: i am working on this. > Spark Logging level should be INFO instead of Debug in Executor Plugin > API[SPARK-24918] > --- > > Key: SPARK-28929 > URL: https://issues.apache.org/jira/browse/SPARK-28929 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.4.2, 2.4.3 >Reporter: jobit mathew >Priority: Minor > > Spark Logging level should be INFO instead of Debug in Executor Plugin > API[SPARK-24918]. > Currently logging level for Executor Plugin API[SPARK-24918] is DEBUG > logDebug(s"Initializing the following plugins: $\{pluginNames.mkString(", > ")}") > logDebug(s"Successfully loaded plugin " + > plugin.getClass().getCanonicalName()) > logDebug("Finished initializing plugins") > It is better to change to INFO instead of DEBUG. > > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Issue Comment Deleted] (SPARK-28929) Spark Logging level should be INFO instead of Debug in Executor Plugin API[SPARK-24918]
[ https://issues.apache.org/jira/browse/SPARK-28929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] pavithra ramachandran updated SPARK-28929: -- Comment: was deleted (was: I am working on this ) > Spark Logging level should be INFO instead of Debug in Executor Plugin > API[SPARK-24918] > --- > > Key: SPARK-28929 > URL: https://issues.apache.org/jira/browse/SPARK-28929 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.4.2, 2.4.3 >Reporter: jobit mathew >Priority: Minor > > Spark Logging level should be INFO instead of Debug in Executor Plugin > API[SPARK-24918]. > Currently logging level for Executor Plugin API[SPARK-24918] is DEBUG > logDebug(s"Initializing the following plugins: $\{pluginNames.mkString(", > ")}") > logDebug(s"Successfully loaded plugin " + > plugin.getClass().getCanonicalName()) > logDebug("Finished initializing plugins") > It is better to change to INFO instead of DEBUG. > > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28929) Spark Logging level should be INFO instead of Debug in Executor Plugin API[SPARK-24918]
[ https://issues.apache.org/jira/browse/SPARK-28929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919523#comment-16919523 ] pavithra ramachandran commented on SPARK-28929: --- I am working on this > Spark Logging level should be INFO instead of Debug in Executor Plugin > API[SPARK-24918] > --- > > Key: SPARK-28929 > URL: https://issues.apache.org/jira/browse/SPARK-28929 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.4.2, 2.4.3 >Reporter: jobit mathew >Priority: Minor > > Spark Logging level should be INFO instead of Debug in Executor Plugin > API[SPARK-24918]. > Currently logging level for Executor Plugin API[SPARK-24918] is DEBUG > logDebug(s"Initializing the following plugins: $\{pluginNames.mkString(", > ")}") > logDebug(s"Successfully loaded plugin " + > plugin.getClass().getCanonicalName()) > logDebug("Finished initializing plugins") > It is better to change to INFO instead of DEBUG. > > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
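The SPARK-28929 request above amounts to raising the level of the three statements quoted in the description. A minimal sketch, assuming the enclosing class mixes in Spark's {{org.apache.spark.internal.Logging}} trait (which supplies {{logInfo}}) and that {{pluginNames}} and {{plugin}} are in scope as in the quoted code:

{code:java}
// Sketch only: the same three messages from the issue, emitted at INFO
// instead of DEBUG so plugin loading is visible at the default log level.
logInfo(s"Initializing the following plugins: ${pluginNames.mkString(", ")}")
logInfo(s"Successfully loaded plugin " + plugin.getClass().getCanonicalName())
logInfo("Finished initializing plugins")
{code}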
[jira] [Updated] (SPARK-28930) Spark DESC FORMATTED TABLENAME information display issues
[ https://issues.apache.org/jira/browse/SPARK-28930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jobit mathew updated SPARK-28930: - Environment: (was: _*_emphasized text_*_) > Spark DESC FORMATTED TABLENAME information display issues > - > > Key: SPARK-28930 > URL: https://issues.apache.org/jira/browse/SPARK-28930 > Project: Spark > Issue Type: Bug > Components: Spark Shell, SQL >Affects Versions: 2.4.3 >Reporter: jobit mathew >Priority: Minor > > Spark DESC FORMATTED TABLENAME information display issues.Showing incorrect > *Last Access time and* feeling some information displays can make it better. > Test steps: > 1. Open spark sql > 2. Create table with partition > CREATE EXTERNAL TABLE IF NOT EXISTS employees_info_extended ( id INT, name > STRING, usd_flag STRING, salary DOUBLE, deductions MAP, > address STRING ) PARTITIONED BY (entrytime STRING) STORED AS TEXTFILE > location 'hdfs://hacluster/user/sparkhive/warehouse'; > 3. from spark sql check the table description > desc formatted tablename; > 4. From scala shell check the table description > sql("desc formatted tablename").show() > Issue1: > If there is no comment for spark scala shell shows *"null" in small letters* > but all other places Hive beeline/Spark beeline/Spark SQL it is showing in > *CAPITAL "NULL*". Better to show same in all places. 
> > *scala>* sql("desc formatted employees_info_extended").show(false); > +++---+ > |col_name |data_type |*comment*| > +++---+ > |id |int |*null* | > |name |string |*null* | > |usd_flag |string |*null* | > |salary |double |*null* | > |deductions |map |*null* | > |address |string |null | > |entrytime |string |null | > |# Partition Information | | | > |# col_name |data_type |comment| > |entrytime |string |null | > | | | | > |# Detailed Table Information| | | > |Database |sparkdb__ | | > |Table |employees_info_extended | | > |Owner |root | | > *|Created Time |Tue Aug 20 13:42:06 CST 2019| |* > *|Last Access |Thu Jan 01 08:00:00 CST 1970| |* > |Created By |Spark 2.4.3 | | > |Type |EXTERNAL | | > |Provider |hive | | > +++---+ > only showing top 20 rows > *scala>* > Issue 2: > Spark SQL "desc formatted tablename" is not showing the header [# > col_name,data_type,comment|#col_name,data_type,comment] in the top of the > query result.But header is showing on top of partition description. For > Better understanding show the header on Top of the query result. 
> *spark-sql>* desc formatted employees_info_extended1; > id int *NULL* > name string *NULL* > usd_flag string NULL > salary double NULL > deductions map NULL > address string NULL > entrytime string NULL > *# Partition Information* > *# col_name data_type comment* > entrytime string *NULL* > # Detailed Table Information > Database sparkdb__ > Table employees_info_extended1 > Owner spark > *Created Time Tue Aug 20 14:50:37 CST 2019* > *Last Access Thu Jan 01 08:00:00 CST 1970* > Created By Spark 2.3.2.0201 > Type EXTERNAL > Provider hive > Table Properties [transient_lastDdlTime=1566286655] > Location hdfs://hacluster/user/sparkhive/warehouse > Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > InputFormat org.apache.hadoop.mapred.TextInputFormat > OutputFormat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Storage Properties [serialization.format=1] > Partition Provider Catalog > Time taken: 0.477 seconds, Fetched 27 row(s) > *spark-sql>* > > Issue 3: > I created the table on Aug 20.So it is showing created time correct .*But > Last access time showing 1970 Jan 01*. It is not good to show Last access > time earlier time than the created time.Better to show the correct date and > time else show UNKNOWN. > *[Created Time,Tue Aug 20 13:42:06 CST 2019,]* > *[Last Access,Thu Jan 01 08:00:00 CST 1970,]* -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-28930) Spark DESC FORMATTED TABLENAME information display issues
jobit mathew created SPARK-28930: Summary: Spark DESC FORMATTED TABLENAME information display issues Key: SPARK-28930 URL: https://issues.apache.org/jira/browse/SPARK-28930 Project: Spark Issue Type: Bug Components: Spark Shell, SQL Affects Versions: 2.4.3 Environment: _*_emphasized text_*_ Reporter: jobit mathew Spark DESC FORMATTED TABLENAME has information display issues. It shows an incorrect *Last Access time*, and some of the information display could be improved. Test steps: 1. Open spark sql 2. Create table with partition CREATE EXTERNAL TABLE IF NOT EXISTS employees_info_extended ( id INT, name STRING, usd_flag STRING, salary DOUBLE, deductions MAP, address STRING ) PARTITIONED BY (entrytime STRING) STORED AS TEXTFILE location 'hdfs://hacluster/user/sparkhive/warehouse'; 3. From spark sql check the table description desc formatted tablename; 4. From scala shell check the table description sql("desc formatted tablename").show() Issue 1: If there is no column comment, the spark scala shell shows *"null" in lowercase letters*, but in all other places (Hive beeline/Spark beeline/Spark SQL) it is shown in *CAPITAL "NULL"*. Better to show the same in all places. 
*scala>* sql("desc formatted employees_info_extended").show(false); +++---+ |col_name |data_type |*comment*| +++---+ |id |int |*null* | |name |string |*null* | |usd_flag |string |*null* | |salary |double |*null* | |deductions |map |*null* | |address |string |null | |entrytime |string |null | |# Partition Information | | | |# col_name |data_type |comment| |entrytime |string |null | | | | | |# Detailed Table Information| | | |Database |sparkdb__ | | |Table |employees_info_extended | | |Owner |root | | *|Created Time |Tue Aug 20 13:42:06 CST 2019| |* *|Last Access |Thu Jan 01 08:00:00 CST 1970| |* |Created By |Spark 2.4.3 | | |Type |EXTERNAL | | |Provider |hive | | +++---+ only showing top 20 rows *scala>* Issue 2: Spark SQL "desc formatted tablename" is not showing the header [# col_name,data_type,comment|#col_name,data_type,comment] in the top of the query result.But header is showing on top of partition description. For Better understanding show the header on Top of the query result. *spark-sql>* desc formatted employees_info_extended1; id int *NULL* name string *NULL* usd_flag string NULL salary double NULL deductions map NULL address string NULL entrytime string NULL *# Partition Information* *# col_name data_type comment* entrytime string *NULL* # Detailed Table Information Database sparkdb__ Table employees_info_extended1 Owner spark *Created Time Tue Aug 20 14:50:37 CST 2019* *Last Access Thu Jan 01 08:00:00 CST 1970* Created By Spark 2.3.2.0201 Type EXTERNAL Provider hive Table Properties [transient_lastDdlTime=1566286655] Location hdfs://hacluster/user/sparkhive/warehouse Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat org.apache.hadoop.mapred.TextInputFormat OutputFormat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Storage Properties [serialization.format=1] Partition Provider Catalog Time taken: 0.477 seconds, Fetched 27 row(s) *spark-sql>* Issue 3: I created the table on Aug 20.So it is showing created time 
correctly. *But the Last Access time shows 1970 Jan 01*. It is not good to show a Last Access time earlier than the created time. Better to show the correct date and time, or else show UNKNOWN. *[Created Time,Tue Aug 20 13:42:06 CST 2019,]* *[Last Access,Thu Jan 01 08:00:00 CST 1970,]* -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-28929) Spark Logging level should be INFO instead of Debug in Executor Plugin API[SPARK-24918]
jobit mathew created SPARK-28929: Summary: Spark Logging level should be INFO instead of Debug in Executor Plugin API[SPARK-24918] Key: SPARK-28929 URL: https://issues.apache.org/jira/browse/SPARK-28929 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 2.4.3, 2.4.2 Reporter: jobit mathew Spark Logging level should be INFO instead of Debug in Executor Plugin API[SPARK-24918]. Currently logging level for Executor Plugin API[SPARK-24918] is DEBUG logDebug(s"Initializing the following plugins: $\{pluginNames.mkString(", ")}") logDebug(s"Successfully loaded plugin " + plugin.getClass().getCanonicalName()) logDebug("Finished initializing plugins") It is better to change to INFO instead of DEBUG. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
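To illustrate why the current DEBUG level hides the plugin lifecycle messages of SPARK-28929: Spark's default log4j template sets the root level to INFO, so DEBUG lines are dropped unless the operator opts in per logger. A minimal sketch (the logger name below is an assumption, based on the plugin being initialized inside the executor; it is not taken from the issue):

```
# conf/log4j.properties (sketch)
log4j.rootCategory=INFO, console
# With the plugin messages logged at DEBUG, seeing them requires opting in:
log4j.logger.org.apache.spark.executor.Executor=DEBUG
# If they were logged at INFO as proposed, the default template would show them.
```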
[jira] [Commented] (SPARK-28025) HDFSBackedStateStoreProvider should not leak .crc files
[ https://issues.apache.org/jira/browse/SPARK-28025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919514#comment-16919514 ] Steve Loughran commented on SPARK-28025: Has anyone considered enhancing org.apache.hadoop.fs.ChecksumFileSystem to say "if "file.bytes-per-checksum" == 0 then checksums are disabled? Currently it fails if bytes per CRC <= 0, but you could make the 0 value a switch to say "none". > HDFSBackedStateStoreProvider should not leak .crc files > > > Key: SPARK-28025 > URL: https://issues.apache.org/jira/browse/SPARK-28025 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.3 > Environment: Spark 2.4.3 > Kubernetes 1.11(?) (OpenShift) > StateStore storage on a mounted PVC. Viewed as a local filesystem by the > `FileContextBasedCheckpointFileManager` : > {noformat} > scala> glusterfm.isLocal > res17: Boolean = true{noformat} >Reporter: Gerard Maas >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.4, 3.0.0 > > > The HDFSBackedStateStoreProvider when using the default CheckpointFileManager > is leaving '.crc' files behind. There's a .crc file created for each > `atomicFile` operation of the CheckpointFileManager. > Over time, the number of files becomes very large. It makes the state store > file system constantly increase in size and, in our case, deteriorates the > file system performance. > Here's a sample of one of our spark storage volumes after 2 days of execution > (4 stateful streaming jobs, each on a different sub-dir): > # > {noformat} > Total files in PVC (used for checkpoints and state store) > $find . | wc -l > 431796 > # .crc files > $find . -name "*.crc" | wc -l > 418053{noformat} > With each .crc file taking one storage block, the used storage runs into the > GBs of data. > These jobs are running on Kubernetes. 
Our shared storage provider, GlusterFS, > shows serious performance deterioration with this large number of files: > {noformat} > DEBUG HDFSBackedStateStoreProvider: fetchFiles() took 29164ms{noformat} > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
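The `find` commands quoted in the report above can be wrapped into a small monitoring script for a checkpoint volume. A sketch, under the assumption of a scratch directory and sample file names that are purely illustrative (not taken from the issue):

```shell
# Count total files vs leaked .crc files under a checkpoint directory.
# CKPT_DIR and the sample files below are illustrative assumptions.
CKPT_DIR="${CKPT_DIR:-/tmp/spark-ckpt-demo}"
mkdir -p "$CKPT_DIR/state/0"
# Simulate one committed delta file plus its leaked checksum sidecar.
touch "$CKPT_DIR/state/0/1.delta" "$CKPT_DIR/state/0/.1.delta.crc"
total=$(find "$CKPT_DIR" -type f | wc -l)
crc=$(find "$CKPT_DIR" -type f -name "*.crc" | wc -l)
echo "total=$total crc=$crc"
```

Note that `find -name '*.crc'` matches the hidden `.1.delta.crc` sidecar as well, since `fnmatch` in `find` does not treat a leading dot specially.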
[jira] [Updated] (SPARK-28759) Upgrade scala-maven-plugin to 4.2.0
[ https://issues.apache.org/jira/browse/SPARK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28759: - Summary: Upgrade scala-maven-plugin to 4.2.0 (was: Upgrade scala-maven-plugin to 4.1.1) > Upgrade scala-maven-plugin to 4.2.0 > --- > > Key: SPARK-28759 > URL: https://issues.apache.org/jira/browse/SPARK-28759 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.0.0 >Reporter: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28025) HDFSBackedStateStoreProvider should not leak .crc files
[ https://issues.apache.org/jira/browse/SPARK-28025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919507#comment-16919507 ] Jungtaek Lim commented on SPARK-28025: -- [~skonto] Please take a look at my PR as my PR didn't follow your workaround. We identified which Hadoop issue we are facing, and took a workaround as deleting crc file manually. > HDFSBackedStateStoreProvider should not leak .crc files > > > Key: SPARK-28025 > URL: https://issues.apache.org/jira/browse/SPARK-28025 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.3 > Environment: Spark 2.4.3 > Kubernetes 1.11(?) (OpenShift) > StateStore storage on a mounted PVC. Viewed as a local filesystem by the > `FileContextBasedCheckpointFileManager` : > {noformat} > scala> glusterfm.isLocal > res17: Boolean = true{noformat} >Reporter: Gerard Maas >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.4, 3.0.0 > > > The HDFSBackedStateStoreProvider when using the default CheckpointFileManager > is leaving '.crc' files behind. There's a .crc file created for each > `atomicFile` operation of the CheckpointFileManager. > Over time, the number of files becomes very large. It makes the state store > file system constantly increase in size and, in our case, deteriorates the > file system performance. > Here's a sample of one of our spark storage volumes after 2 days of execution > (4 stateful streaming jobs, each on a different sub-dir): > # > {noformat} > Total files in PVC (used for checkpoints and state store) > $find . | wc -l > 431796 > # .crc files > $find . -name "*.crc" | wc -l > 418053{noformat} > With each .crc file taking one storage block, the used storage runs into the > GBs of data. > These jobs are running on Kubernetes. 
Our shared storage provider, GlusterFS, > shows serious performance deterioration with this large number of files: > {noformat} > DEBUG HDFSBackedStateStoreProvider: fetchFiles() took 29164ms{noformat} > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-28911) Unify Kafka source option pattern
[ https://issues.apache.org/jira/browse/SPARK-28911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved SPARK-28911. --- Resolution: Won't Do Based on the discussion on the PR this can be closed too. > Unify Kafka source option pattern > - > > Key: SPARK-28911 > URL: https://issues.apache.org/jira/browse/SPARK-28911 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.0.0 >Reporter: wenxuanguan >Priority: Major > > Pattern of datasource options is Camel-Case, such as CheckpointLocation, and > only some Kafka source option is separated with dot, Such as > fetchOffset.numRetries. > Also we can distinguish the Kafka original options from pattern, such as > kafka.bootstrap.servers -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Closed] (SPARK-28911) Unify Kafka source option pattern
[ https://issues.apache.org/jira/browse/SPARK-28911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed SPARK-28911. - > Unify Kafka source option pattern > - > > Key: SPARK-28911 > URL: https://issues.apache.org/jira/browse/SPARK-28911 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.0.0 >Reporter: wenxuanguan >Priority: Major > > Pattern of datasource options is Camel-Case, such as CheckpointLocation, and > only some Kafka source option is separated with dot, Such as > fetchOffset.numRetries. > Also we can distinguish the Kafka original options from pattern, such as > kafka.bootstrap.servers -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-28025) HDFSBackedStateStoreProvider should not leak .crc files
[ https://issues.apache.org/jira/browse/SPARK-28025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919471#comment-16919471 ] Stavros Kontopoulos edited comment on SPARK-28025 at 8/30/19 11:54 AM: --- @[~dongjoon] [~zsxwing] this needs to be re-opened. When using the workaround we recently hit this issue: [https://github.com/broadinstitute/gatk/issues/1389] which can be fixed easily with a derived class like in this PR: [https://github.com/broadinstitute/gatk/pull/1421/files] but this is a bit of inconvenient. However, I believe as well that this should be fixed in Spark (less surprises) otherwise we need to document it as [~kabhwan] said above. was (Author: skonto): @[~dongjoon] [~zsxwing] this needs to be re-opened. When using the workaround we recently hit this issue: [https://github.com/broadinstitute/gatk/issues/1389] which can be fixed easily with a derived class like in this PR: [https://github.com/broadinstitute/gatk/pull/1421/files] However, I believe as well that this should be fixed in Spark (less surprises) otherwise we need to document it as [~kabhwan] said above. > HDFSBackedStateStoreProvider should not leak .crc files > > > Key: SPARK-28025 > URL: https://issues.apache.org/jira/browse/SPARK-28025 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.3 > Environment: Spark 2.4.3 > Kubernetes 1.11(?) (OpenShift) > StateStore storage on a mounted PVC. Viewed as a local filesystem by the > `FileContextBasedCheckpointFileManager` : > {noformat} > scala> glusterfm.isLocal > res17: Boolean = true{noformat} >Reporter: Gerard Maas >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.4, 3.0.0 > > > The HDFSBackedStateStoreProvider when using the default CheckpointFileManager > is leaving '.crc' files behind. There's a .crc file created for each > `atomicFile` operation of the CheckpointFileManager. > Over time, the number of files becomes very large. 
It makes the state store > file system constantly increase in size and, in our case, deteriorates the > file system performance. > Here's a sample of one of our spark storage volumes after 2 days of execution > (4 stateful streaming jobs, each on a different sub-dir): > # > {noformat} > Total files in PVC (used for checkpoints and state store) > $find . | wc -l > 431796 > # .crc files > $find . -name "*.crc" | wc -l > 418053{noformat} > With each .crc file taking one storage block, the used storage runs into the > GBs of data. > These jobs are running on Kubernetes. Our shared storage provider, GlusterFS, > shows serious performance deterioration with this large number of files: > {noformat} > DEBUG HDFSBackedStateStoreProvider: fetchFiles() took 29164ms{noformat} > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28025) HDFSBackedStateStoreProvider should not leak .crc files
[ https://issues.apache.org/jira/browse/SPARK-28025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919471#comment-16919471 ] Stavros Kontopoulos commented on SPARK-28025: - @[~dongjoon] [~zsxwing] this needs to be re-opened. When using the workaround we recently hit this issue: [https://github.com/broadinstitute/gatk/issues/1389] which can be fixed easily with a derived class like in this PR: [https://github.com/broadinstitute/gatk/pull/1421/files] However, I believe as well that this should be fixed in Spark (less surprises) otherwise we need to document it as [~kabhwan] said above. > HDFSBackedStateStoreProvider should not leak .crc files > > > Key: SPARK-28025 > URL: https://issues.apache.org/jira/browse/SPARK-28025 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.3 > Environment: Spark 2.4.3 > Kubernetes 1.11(?) (OpenShift) > StateStore storage on a mounted PVC. Viewed as a local filesystem by the > `FileContextBasedCheckpointFileManager` : > {noformat} > scala> glusterfm.isLocal > res17: Boolean = true{noformat} >Reporter: Gerard Maas >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.4, 3.0.0 > > > The HDFSBackedStateStoreProvider when using the default CheckpointFileManager > is leaving '.crc' files behind. There's a .crc file created for each > `atomicFile` operation of the CheckpointFileManager. > Over time, the number of files becomes very large. It makes the state store > file system constantly increase in size and, in our case, deteriorates the > file system performance. > Here's a sample of one of our spark storage volumes after 2 days of execution > (4 stateful streaming jobs, each on a different sub-dir): > # > {noformat} > Total files in PVC (used for checkpoints and state store) > $find . | wc -l > 431796 > # .crc files > $find . -name "*.crc" | wc -l > 418053{noformat} > With each .crc file taking one storage block, the used storage runs into the > GBs of data. 
> These jobs are running on Kubernetes. Our shared storage provider, GlusterFS, > shows serious performance deterioration with this large number of files: > {noformat} > DEBUG HDFSBackedStateStoreProvider: fetchFiles() took 29164ms{noformat} > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-28928) Take over Kafka delegation token protocol on sources/sinks
[ https://issues.apache.org/jira/browse/SPARK-28928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-28928: -- Summary: Take over Kafka delegation token protocol on sources/sinks (was: Take over delegation token protocol on sources/sinks) > Take over Kafka delegation token protocol on sources/sinks > -- > > Key: SPARK-28928 > URL: https://issues.apache.org/jira/browse/SPARK-28928 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.0.0 >Reporter: Gabor Somogyi >Priority: Major > > At the moment there are 3 places where communication protocol with Kafka > cluster has to be configured: > * On delegation token > * On source > * On sink > Most of the time users are using the same protocol on all these places > (within one Kafka cluster). It would be better to declare it in one place > (delegation token side) and Kafka sources/sinks can take this config over. > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28928) Take over delegation token protocol on sources/sinks
[ https://issues.apache.org/jira/browse/SPARK-28928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919463#comment-16919463 ] Gabor Somogyi commented on SPARK-28928: --- I'm working on this. > Take over delegation token protocol on sources/sinks > > > Key: SPARK-28928 > URL: https://issues.apache.org/jira/browse/SPARK-28928 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.0.0 >Reporter: Gabor Somogyi >Priority: Major > > At the moment there are 3 places where communication protocol with Kafka > cluster has to be configured: > * On delegation token > * On source > * On sink > Most of the time users are using the same protocol on all these places > (within one Kafka cluster). It would be better to declare it in one place > (delegation token side) and Kafka sources/sinks can take this config over. > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-28928) Take over delegation token protocol on sources/sinks
Gabor Somogyi created SPARK-28928: - Summary: Take over delegation token protocol on sources/sinks Key: SPARK-28928 URL: https://issues.apache.org/jira/browse/SPARK-28928 Project: Spark Issue Type: Improvement Components: Structured Streaming Affects Versions: 3.0.0 Reporter: Gabor Somogyi At the moment there are 3 places where communication protocol with Kafka cluster has to be configured: * On delegation token * On source * On sink Most of the time users are using the same protocol on all these places (within one Kafka cluster). It would be better to declare it in one place (delegation token side) and Kafka sources/sinks can take this config over. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
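The three configuration points described in SPARK-28928 can be sketched as a config fragment. Hedged heavily: the delegation-token key name below is an illustrative assumption, not a confirmed Spark config key; the `kafka.`-prefixed options are the connector's documented pattern of passing Kafka client properties through sources/sinks.

```
# Illustrative only: today the protocol is declared in three places
# for a single Kafka cluster.
spark.kafka.security.protocol=SASL_SSL        # 1) delegation token side (key name assumed)
# 2) source: .option("kafka.security.protocol", "SASL_SSL")
# 3) sink:   .option("kafka.security.protocol", "SASL_SSL")
```

The proposal is to keep only the first declaration and let sources and sinks inherit it.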
[jira] [Comment Edited] (SPARK-28906) `bin/spark-submit --version` shows incorrect info
[ https://issues.apache.org/jira/browse/SPARK-28906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919420#comment-16919420 ] Kazuaki Ishizaki edited comment on SPARK-28906 at 8/30/19 10:51 AM: In {{jars/spark-core_2.11-2.3.*.jar}}, {{spark-version-info.properties}} exists. This file is different between 2.3.0 and 2.3.4. This file is generated by `build/spark-build-info`. {code} $ cat spark-version-info.properties.230 version=2.3.0 user=sameera revision=a0d7949896e70f427e7f3942ff340c9484ff0aab branch=master date=2018-02-22T19:24:38Z url=g...@github.com:sameeragarwal/spark.git $ cat spark-version-info.properties.234 version=2.3.4 user= revision= branch= date=2019-08-26T08:29:39Z url= {code} was (Author: kiszk): In {{jars/spark-core_2.11-2.3.*.jar}}, {{spark-version-info.properties}} exists. This file is different between 2.3.0 and 2.3.4. {code} $ cat spark-version-info.properties.230 version=2.3.0 user=sameera revision=a0d7949896e70f427e7f3942ff340c9484ff0aab branch=master date=2018-02-22T19:24:38Z url=g...@github.com:sameeragarwal/spark.git $ cat spark-version-info.properties.234 version=2.3.4 user= revision= branch= date=2019-08-26T08:29:39Z url= {code} > `bin/spark-submit --version` shows incorrect info > - > > Key: SPARK-28906 > URL: https://issues.apache.org/jira/browse/SPARK-28906 > Project: Spark > Issue Type: Bug > Components: Project Infra >Affects Versions: 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.4, 2.4.0, 2.4.1, 2.4.2, > 3.0.0, 2.4.3 >Reporter: Marcelo Vanzin >Priority: Minor > Attachments: image-2019-08-29-05-50-13-526.png > > > Since Spark 2.3.1, `spark-submit` shows a wrong information. > {code} > $ bin/spark-submit --version > Welcome to > __ > / __/__ ___ _/ /__ > _\ \/ _ \/ _ `/ __/ '_/ >/___/ .__/\_,_/_/ /_/\_\ version 2.3.3 > /_/ > Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_222 > Branch > Compiled by user on 2019-02-04T13:00:46Z > Revision > Url > Type --help for more information. 
> {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28906) `bin/spark-submit --version` shows incorrect info
[ https://issues.apache.org/jira/browse/SPARK-28906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919420#comment-16919420 ] Kazuaki Ishizaki commented on SPARK-28906: -- In {{jars/spark-core_2.11-2.3.*.jar}}, {{spark-version-info.properties}} exists. This file is different between 2.3.0 and 2.3.4. {code} $ cat spark-version-info.properties.230 version=2.3.0 user=sameera revision=a0d7949896e70f427e7f3942ff340c9484ff0aab branch=master date=2018-02-22T19:24:38Z url=g...@github.com:sameeragarwal/spark.git $ cat spark-version-info.properties.234 version=2.3.4 user= revision= branch= date=2019-08-26T08:29:39Z url= {code} > `bin/spark-submit --version` shows incorrect info > - > > Key: SPARK-28906 > URL: https://issues.apache.org/jira/browse/SPARK-28906 > Project: Spark > Issue Type: Bug > Components: Project Infra >Affects Versions: 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.4, 2.4.0, 2.4.1, 2.4.2, > 3.0.0, 2.4.3 >Reporter: Marcelo Vanzin >Priority: Minor > Attachments: image-2019-08-29-05-50-13-526.png > > > Since Spark 2.3.1, `spark-submit` shows a wrong information. > {code} > $ bin/spark-submit --version > Welcome to > __ > / __/__ ___ _/ /__ > _\ \/ _ \/ _ `/ __/ '_/ >/___/ .__/\_,_/_/ /_/\_\ version 2.3.3 > /_/ > Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_222 > Branch > Compiled by user on 2019-02-04T13:00:46Z > Revision > Url > Type --help for more information. > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
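The empty build fields in the 2.3.4 properties file shown in the comment above can be detected mechanically. A sketch, assuming a scratch path in `/tmp` (the file contents are copied verbatim from the comment):

```shell
# Recreate the 2.3.4 properties file quoted above and count empty build fields.
cat > /tmp/spark-version-info.properties <<'EOF'
version=2.3.4
user=
revision=
branch=
date=2019-08-26T08:29:39Z
url=
EOF
# Empty values mean build/spark-build-info failed to capture the git metadata.
missing=$(grep -cE '^(user|revision|branch|url)=$' /tmp/spark-version-info.properties)
echo "missing fields: $missing"
```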
[jira] [Comment Edited] (SPARK-28444) Bump Kubernetes Client Version to 4.3.0
[ https://issues.apache.org/jira/browse/SPARK-28444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919346#comment-16919346 ] Björn edited comment on SPARK-28444 at 8/30/19 9:07 AM: We're running into the same issue. As I'm developing with a local ansible/Vagrant setup (Kubernetes deployed through kubeadm, 3 nodes) I did some testings on different versions with SparkPi example (Spark 2.4.3). My results were: * spark-submit in cluster and client mode works fine in local docker-desktop running Kubernetes 1.14.3 * spark-submit in client mode works fine for Kubernetes 1.15.3 in the Vagrant multinode cluster * spark-submit in cluster mode does not work for Kubernetes 1.15.3, 1.14.3,1.13.10 in the Vagrant multinode cluster spark-submit in cluster mode starts the driver, spawns the executors but fails when trying to watch the pod with HTTP 403 Exception with an empty message (esp. not complaining about permissions). The log is more or less the same as posted above. I think neither the compatibility nor the permissions (as executor pods can be created with the service account) are the cause for this. Does anyone have ideas how to further debug this? was (Author: dolkemeier): We're running into the same issue. As I'm developing with a local ansible/Vagrant setup (Kubernetes deployed through kubeadm, 3 nodes) I did some testings on different versions with SparkPi example (Spark 2.4.3). My results were: * spark-submit in cluster and client mode works fine in local docker-desktop running Kubernetes 1.14.3 * spark-submit in client mode works fine for Kubernetes 1.15.3 in the Vagrant multinode cluster * spark-submit in cluster mode does not work for Kubernetes 1.15.3, 1.14.3,1.13.10 in the Vagrant multinode cluster spark-submit in cluster mode starts the driver, spawns the executors but fails when trying to watch the pod with HTTP 403 Exception with an empty message (esp. not complaining about permissions). 
The log is more or less the same as posted above. I think neither the compatibility nor the permissions (as executor pods can be created with the service account, spark sa has cluster role) are the cause for this. Does anyone have ideas how to further debug this? > Bump Kubernetes Client Version to 4.3.0 > --- > > Key: SPARK-28444 > URL: https://issues.apache.org/jira/browse/SPARK-28444 > Project: Spark > Issue Type: Dependency upgrade > Components: Kubernetes >Affects Versions: 3.0.0, 2.4.3 >Reporter: Patrick Winter >Priority: Major > > Spark is currently using the Kubernetes client version 4.1.2. This client > does not support the current Kubernetes version 1.14, as can be seen on the > [compatibility > matrix|https://github.com/fabric8io/kubernetes-client#compatibility-matrix]. > Therefore the Kubernetes client should be bumped up to version 4.3.0. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28444) Bump Kubernetes Client Version to 4.3.0
[ https://issues.apache.org/jira/browse/SPARK-28444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919346#comment-16919346 ] Björn commented on SPARK-28444: --- We're running into the same issue. As I'm developing with a local ansible/Vagrant setup (Kubernetes deployed through kubeadm, 3 nodes) I did some testing on different versions with the SparkPi example (Spark 2.4.3). My results were: * spark-submit in cluster and client mode works fine in local docker-desktop running Kubernetes 1.14.3 * spark-submit in client mode works fine for Kubernetes 1.15.3 in the Vagrant multinode cluster * spark-submit in cluster mode does not work for Kubernetes 1.15.3, 1.14.3, 1.13.10 in the Vagrant multinode cluster spark-submit in cluster mode starts the driver, spawns the executors but fails when trying to watch the pod with an HTTP 403 Exception with an empty message (esp. not complaining about permissions). The log is more or less the same as posted above. I think neither the compatibility nor the permissions (as executor pods can be created with the service account, spark sa has cluster role) are the cause for this. Does anyone have ideas how to further debug this? > Bump Kubernetes Client Version to 4.3.0 > --- > > Key: SPARK-28444 > URL: https://issues.apache.org/jira/browse/SPARK-28444 > Project: Spark > Issue Type: Dependency upgrade > Components: Kubernetes >Affects Versions: 3.0.0, 2.4.3 >Reporter: Patrick Winter >Priority: Major > > Spark is currently using the Kubernetes client version 4.1.2. This client > does not support the current Kubernetes version 1.14, as can be seen on the > [compatibility > matrix|https://github.com/fabric8io/kubernetes-client#compatibility-matrix]. > Therefore the Kubernetes client should be bumped up to version 4.3.0. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28906) `bin/spark-submit --version` shows incorrect info
[ https://issues.apache.org/jira/browse/SPARK-28906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919330#comment-16919330 ] Kazuaki Ishizaki commented on SPARK-28906: -- I attached output of 2.3.0 and 2.3.4 in one comment as below. Let me see the script, too. ``` $ spark-2.3.0-bin-hadoop2.6/bin/spark-submit --version Welcome to __ / __/__ ___ _/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 2.3.0 /_/ Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_212 Branch master Compiled by user sameera on 2018-02-22T19:24:38Z Revision a0d7949896e70f427e7f3942ff340c9484ff0aab Url g...@github.com:sameeragarwal/spark.git Type --help for more information. $ spark-2.3.4-bin-hadoop2.6/bin/spark-submit --version Welcome to __ / __/__ ___ _/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 2.3.4 /_/ Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_212 Branch Compiled by user on 2019-08-26T08:29:39Z Revision Url Type --help for more information. ``` > `bin/spark-submit --version` shows incorrect info > - > > Key: SPARK-28906 > URL: https://issues.apache.org/jira/browse/SPARK-28906 > Project: Spark > Issue Type: Bug > Components: Project Infra >Affects Versions: 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.4, 2.4.0, 2.4.1, 2.4.2, > 3.0.0, 2.4.3 >Reporter: Marcelo Vanzin >Priority: Minor > Attachments: image-2019-08-29-05-50-13-526.png > > > Since Spark 2.3.1, `spark-submit` shows a wrong information. > {code} > $ bin/spark-submit --version > Welcome to > __ > / __/__ ___ _/ /__ > _\ \/ _ \/ _ `/ __/ '_/ >/___/ .__/\_,_/_/ /_/\_\ version 2.3.3 > /_/ > Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_222 > Branch > Compiled by user on 2019-02-04T13:00:46Z > Revision > Url > Type --help for more information. > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28913) ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for datasets with 12 billion instances
[ https://issues.apache.org/jira/browse/SPARK-28913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919300#comment-16919300 ] Qiang Wang commented on SPARK-28913: Sorry, I can not find the way to close the issue, just ignore it which was created by mistake. > ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for datasets > with 12 billion instances > > > Key: SPARK-28913 > URL: https://issues.apache.org/jira/browse/SPARK-28913 > Project: Spark > Issue Type: Bug > Components: MLlib >Affects Versions: 2.2.1 >Reporter: Qiang Wang >Assignee: Xiangrui Meng >Priority: Major > > The stack trace is below: > {quote}19/08/28 07:00:40 WARN Executor task launch worker for task 325074 > BlockManager: Block rdd_10916_493 could not be removed as it was not found on > disk or in memory 19/08/28 07:00:41 ERROR Executor task launch worker for > task 325074 Executor: Exception in task 3.0 in stage 347.1 (TID 325074) > java.lang.ArrayIndexOutOfBoundsException: 6741 at > org.apache.spark.dpshade.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1460) > at > org.apache.spark.dpshade.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1440) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$40$$anonfun$apply$41.apply(PairRDDFunctions.scala:760) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$40$$anonfun$apply$41.apply(PairRDDFunctions.scala:760) > at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at > org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:216) > at > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1041) > at > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1032) > at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:972) at > 
org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1032) > at > org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:763) > at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:285) at > org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:141) > at > org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:137) > at > scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) > at scala.collection.immutable.List.foreach(List.scala:381) at > scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) > at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:137) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) at > org.apache.spark.scheduler.Task.run(Task.scala:108) at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:358) at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at 
java.lang.Thread.run(Thread.java:745) > {quote} > This exception happened sometimes. And we also found that the AUC metric was > not stable when evaluating the inner product of the user factors and the item > factors with the same dataset and configuration. AUC varied from 0.60 to 0.67 > which was not stable for production environment. > Dataset capacity: ~12 billion ratings > Here is the our code: > {code:java} > val hivedata = sc.sql(sqltext).select(id,dpid,score).coalesce(numPartitions) > val predataItem = hivedata.rdd.map(r=>(r._1._1,(r._1._2,r._2.sum))) > .groupByKey().zipWithIndex() > .persist(StorageLevel.MEMORY_AND_DISK_SER) > val predataUser = > predataItem.flatMap(r=>r._1._2.map(y=>(y._1,(r._2.toInt,y._2
[jira] [Commented] (SPARK-28926) CLONE - ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for datasets with 12 billion instances
[ https://issues.apache.org/jira/browse/SPARK-28926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919299#comment-16919299 ] Qiang Wang commented on SPARK-28926: Sorry, I can not find the way to close the issue, just ignore it which was created by mistake. > CLONE - ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for > datasets with 12 billion instances > > > Key: SPARK-28926 > URL: https://issues.apache.org/jira/browse/SPARK-28926 > Project: Spark > Issue Type: Bug > Components: MLlib >Affects Versions: 2.2.1 >Reporter: Qiang Wang >Assignee: Xiangrui Meng >Priority: Major > > The stack trace is below: > {quote}19/08/28 07:00:40 WARN Executor task launch worker for task 325074 > BlockManager: Block rdd_10916_493 could not be removed as it was not found on > disk or in memory 19/08/28 07:00:41 ERROR Executor task launch worker for > task 325074 Executor: Exception in task 3.0 in stage 347.1 (TID 325074) > java.lang.ArrayIndexOutOfBoundsException: 6741 at > org.apache.spark.dpshade.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1460) > at > org.apache.spark.dpshade.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1440) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$40$$anonfun$apply$41.apply(PairRDDFunctions.scala:760) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$40$$anonfun$apply$41.apply(PairRDDFunctions.scala:760) > at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at > org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:216) > at > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1041) > at > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1032) > at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:972) at > 
org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1032) > at > org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:763) > at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:285) at > org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:141) > at > org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:137) > at > scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) > at scala.collection.immutable.List.foreach(List.scala:381) at > scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) > at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:137) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) at > org.apache.spark.scheduler.Task.run(Task.scala:108) at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:358) at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at 
java.lang.Thread.run(Thread.java:745) > {quote} > This exception happened sometimes. And we also found that the AUC metric was > not stable when evaluating the inner product of the user factors and the item > factors with the same dataset and configuration. AUC varied from 0.60 to 0.67 > which was not stable for production environment. > Dataset capacity: ~12 billion ratings > Here is the our code: > {code:java} > val hivedata = sc.sql(sqltext).select(id,dpid,score).coalesce(numPartitions) > val predataItem = hivedata.rdd.map(r=>(r._1._1,(r._1._2,r._2.sum))) > .groupByKey().zipWithIndex() > .persist(StorageLevel.MEMORY_AND_DISK_SER) > val predataUser = >
[jira] [Created] (SPARK-28927) ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for datasets with 12 billion instances
Qiang Wang created SPARK-28927: -- Summary: ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for datasets with 12 billion instances Key: SPARK-28927 URL: https://issues.apache.org/jira/browse/SPARK-28927 Project: Spark Issue Type: Bug Components: ML Affects Versions: 2.2.1 Reporter: Qiang Wang The stack trace is below: {quote}19/08/28 07:00:40 WARN Executor task launch worker for task 325074 BlockManager: Block rdd_10916_493 could not be removed as it was not found on disk or in memory 19/08/28 07:00:41 ERROR Executor task launch worker for task 325074 Executor: Exception in task 3.0 in stage 347.1 (TID 325074) java.lang.ArrayIndexOutOfBoundsException: 6741 at org.apache.spark.dpshade.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1460) at org.apache.spark.dpshade.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1440) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$40$$anonfun$apply$41.apply(PairRDDFunctions.scala:760) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$40$$anonfun$apply$41.apply(PairRDDFunctions.scala:760) at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:216) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1041) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1032) at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:972) at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1032) at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:763) at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334) at org.apache.spark.rdd.RDD.iterator(RDD.scala:285) at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:141) at 
org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:137) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) at scala.collection.immutable.List.foreach(List.scala:381) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:137) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:358) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {quote} This exception happened sometimes. And we also found that the AUC metric was not stable when evaluating the inner product of the user factors and the item factors with the same dataset and configuration. AUC varied from 0.60 to 0.67 which was not stable for production environment. 
Dataset capacity: ~12 billion ratings. Here is our code: val trainData = predataUser.flatMap(x => x._1._2.map(y => (x._2.toInt, y._1, y._2.toFloat))) .setName(trainDataName).persist(StorageLevel.MEMORY_AND_DISK_SER) case class ALSData(user:Int, item:Int, rating:Float) extends Serializable val ratingData = trainData.map(x => ALSData(x._1, x._2, x._3)).toDF() val als = new ALS val paramMap = ParamMap(als.alpha -> 25000). put(als.checkpointInterval, 5). put(als.implicitPrefs, true). put(als.itemCol, "item"). put(als.maxIter, 60). put(als.nonnegative, false). put(als.numItemBlocks, 600). put(als.numUserBlocks, 600). put(als.regParam, 4.5). put(als.rank, 25). put(als.userCol, "user") als.fit(ratingData, paramMap)
[jira] [Created] (SPARK-28926) CLONE - ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for datasets with 12 billion instances
Qiang Wang created SPARK-28926: -- Summary: CLONE - ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for datasets with 12 billion instances Key: SPARK-28926 URL: https://issues.apache.org/jira/browse/SPARK-28926 Project: Spark Issue Type: Bug Components: MLlib Affects Versions: 2.2.1 Reporter: Qiang Wang Assignee: Xiangrui Meng The stack trace is below: {quote}19/08/28 07:00:40 WARN Executor task launch worker for task 325074 BlockManager: Block rdd_10916_493 could not be removed as it was not found on disk or in memory 19/08/28 07:00:41 ERROR Executor task launch worker for task 325074 Executor: Exception in task 3.0 in stage 347.1 (TID 325074) java.lang.ArrayIndexOutOfBoundsException: 6741 at org.apache.spark.dpshade.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1460) at org.apache.spark.dpshade.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1440) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$40$$anonfun$apply$41.apply(PairRDDFunctions.scala:760) at org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$40$$anonfun$apply$41.apply(PairRDDFunctions.scala:760) at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:216) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1041) at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1032) at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:972) at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1032) at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:763) at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334) at org.apache.spark.rdd.RDD.iterator(RDD.scala:285) at 
org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:141) at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:137) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) at scala.collection.immutable.List.foreach(List.scala:381) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:137) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:358) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {quote} This exception happened sometimes. And we also found that the AUC metric was not stable when evaluating the inner product of the user factors and the item factors with the same dataset and configuration. AUC varied from 0.60 to 0.67 which was not stable for production environment. 
Dataset capacity: ~12 billion ratings Here is the our code: {code:java} val hivedata = sc.sql(sqltext).select(id,dpid,score).coalesce(numPartitions) val predataItem = hivedata.rdd.map(r=>(r._1._1,(r._1._2,r._2.sum))) .groupByKey().zipWithIndex() .persist(StorageLevel.MEMORY_AND_DISK_SER) val predataUser = predataItem.flatMap(r=>r._1._2.map(y=>(y._1,(r._2.toInt,y._2 .aggregateByKey(zeroValueArr,numPartitions)((a,b)=> a += b,(a,b)=>a ++ b).map(r=>(r._1,r._2.toIterable)) .zipWithIndex().persist(StorageLevel.MEMORY_AND_DISK_SER) //x._2 is the item_id, y._1 is the user_id, y._2 is the rating val trainData = predataUser.flatMap(x => x._1._2.map(y => (x._2.toInt, y._1, y._2.toFloat))) .setName(trainDataName).persist(StorageLevel.MEMORY_AND_DISK_SER) case class ALSData(user:Int, item:Int, rating:Float) extends Serializable val ratingData = trainData.map(x =>
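The {{zipWithIndex()}} calls in the snippet above assign each distinct item and user a dense integer index that later becomes an array offset inside the ALS blocks. If the RDD feeding {{zipWithIndex}} is re-evaluated with a different element ordering (e.g. after a lost partition is recomputed), the same key can receive a different index, which is one plausible source of both the out-of-bounds access and the unstable AUC. A plain-Python sketch of that indexing step (a hypothetical helper for illustration, not the Spark code) that makes the assignment deterministic by sorting the keys first:

```python
def dense_index(keys):
    """Map each distinct key to a dense integer id in [0, n).

    Sorting the distinct keys first makes the assignment deterministic,
    unlike an order-dependent zipWithIndex over a recomputed
    distributed dataset.
    """
    return {k: i for i, k in enumerate(sorted(set(keys)))}

idx = dense_index(["user_b", "user_a", "user_b", "user_c"])
print(idx)  # {'user_a': 0, 'user_b': 1, 'user_c': 2}
```

Persisting (or checkpointing) the indexed RDD before any downstream use, as the code partially does, serves the same purpose: the index assignment must not be recomputed with a different ordering.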
[jira] [Resolved] (SPARK-28872) Will Spark SQL support auto analyze for tables or partitions like Hive by setting hive.stats.autogather=true
[ https://issues.apache.org/jira/browse/SPARK-28872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-28872. -- Resolution: Invalid > Will Spark SQL support auto analyze for tables or partitions like Hive by > setting hive.stats.autogather=true > -- > > Key: SPARK-28872 > URL: https://issues.apache.org/jira/browse/SPARK-28872 > Project: Spark > Issue Type: Question > Components: SQL >Affects Versions: 2.4.3 >Reporter: Shao >Priority: Major > > As in the summary: will Spark SQL support auto analyze for tables or > partitions in the future?
[jira] [Resolved] (SPARK-28874) Pyspark date_format adds one year in the last days of the year
[ https://issues.apache.org/jira/browse/SPARK-28874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-28874. -- Resolution: Invalid > Pyspark date_format adds one year in the last days of the year > --- > > Key: SPARK-28874 > URL: https://issues.apache.org/jira/browse/SPARK-28874 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.1.0, 2.3.0 >Reporter: Luis >Priority: Major > > Pyspark date_format adds one year in the last days of the year: > Example: > {code:python} > from pyspark.sql.functions import date_format, lit > spark.range(1).select(date_format(lit("2010-12-26"), "-MM-dd")).show() > {code} > {code} > +---+ > |date_format(2010-12-26, -MM-dd)| > +---+ > | 2011-12-26| > +---+ > {code}
[jira] [Commented] (SPARK-28874) Pyspark date_format adds one year in the last days of the year
[ https://issues.apache.org/jira/browse/SPARK-28874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919294#comment-16919294 ] Hyukjin Kwon commented on SPARK-28874: -- Use {{y}}. {{Y}} is the week-based year; the DateTimeFormatter pattern table lists it as {{Y week-based-year year 1996; 96}}. See https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html > Pyspark date_format adds one year in the last days of the year > --- > > Key: SPARK-28874 > URL: https://issues.apache.org/jira/browse/SPARK-28874 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.1.0, 2.3.0 >Reporter: Luis >Priority: Major > > Pyspark date_format adds one year in the last days of the year: > Example: > {code:python} > from pyspark.sql.functions import date_format, lit > spark.range(1).select(date_format(lit("2010-12-26"), "-MM-dd")).show() > {code} > {code} > +---+ > |date_format(2010-12-26, -MM-dd)| > +---+ > | 2011-12-26| > +---+ > {code}
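The calendar-year versus week-based-year distinction behind this bug can be reproduced outside Spark. A minimal sketch in plain Python, where strftime's %Y is the calendar year and %G is the ISO week-based year (note that Java's 'Y' pattern uses locale-dependent week rules, so the exact boundary dates can differ from ISO, but the failure mode is the same):

```python
from datetime import date

# 2008-12-29 is a Monday that falls in ISO week 1 of 2009,
# so the week-based year disagrees with the calendar year.
d = date(2008, 12, 29)
print(d.strftime("%Y-%m-%d"))  # calendar year: 2008-12-29
print(d.strftime("%G-%m-%d"))  # ISO week-based year: 2009-12-29
```

In Spark's date_format, the fix is accordingly to use a lowercase 'y' year pattern rather than uppercase 'Y'.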
[jira] [Created] (SPARK-28925) Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and 1.14
Eric created SPARK-28925: Summary: Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and 1.14 Key: SPARK-28925 URL: https://issues.apache.org/jira/browse/SPARK-28925 Project: Spark Issue Type: Bug Components: Kubernetes Affects Versions: 2.4.3 Reporter: Eric Hello, if you use Spark with Kubernetes 1.13 or 1.14, you will see this error: {code:java} {"time": "2019-08-28T09:56:11.866Z", "lvl":"INFO", "logger": "org.apache.spark.internal.Logging", "thread":"kubernetes-executor-snapshots-subscribers-0","msg":"Going to request 1 executors from Kubernetes."} {"time": "2019-08-28T09:56:12.028Z", "lvl":"WARN", "logger": "io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2", "thread":"OkHttp https://kubernetes.default.svc/...","msg":"Exec Failure: HTTP 403, Status: 403 - "} java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' {code} The bug is apparently fixed here: [https://github.com/fabric8io/kubernetes-client/pull/1669] We have compiled the Spark source code with Kubernetes-client 4.4.2 and it is working great on our cluster (Kubernetes 1.13.10). Would it be possible to update that dependency version? Thanks!
[jira] [Updated] (SPARK-28874) Pyspark date_format adds one year in the last days of the year
[ https://issues.apache.org/jira/browse/SPARK-28874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28874: - Summary: Pyspark date_format adds one year in the last days of the year (was: Pyspark bug in date_format) > Pyspark date_format adds one year in the last days of the year > --- > > Key: SPARK-28874 > URL: https://issues.apache.org/jira/browse/SPARK-28874 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.1.0, 2.3.0 >Reporter: Luis >Priority: Major > > Pyspark date_format adds one year in the last days of the year: > Example: > {code:python} > from pyspark.sql.functions import date_format, lit > spark.range(1).select(date_format(lit("2010-12-26"), "-MM-dd")).show() > {code} > {code} > +---+ > |date_format(2010-12-26, -MM-dd)| > +---+ > | 2011-12-26| > +---+ > {code}
[jira] [Updated] (SPARK-28874) Pyspark bug in date_format
[ https://issues.apache.org/jira/browse/SPARK-28874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28874: - Description: Pyspark date_format add one years in the last days off year : Example : {code:python} spark.range(1).select(date_format(lit("2010-12-26"), "-MM-dd")).show() {code} {code} +---+ |date_format(2010-12-26, -MM-dd)| +---+ | 2011-12-26| +---+ {code} was: Pyspark date_format add one years in the last days off year : Example : {code:python} from datetime import datetime from dateutil.relativedelta import relativedelta import pandas as pd from pyspark.sql.functions import date_format, col from pyspark.sql.types import * start_date = datetime(2010,1,1) end_date = datetime(2055,1,1) indx_ts = pd.date_range(start_date.strftime('%m/%d/%Y'), end_date.strftime('%m/%d/%Y'), freq='D') data_date = [ {"d":datetime.utcfromtimestamp(x.tolist()/1e9)} for x in indx_ts.values ] df_p = spark.createDataFrame(data_date,StructType([StructField('d', DateType(), True)])) df_string = df_p.withColumn("date_string" ,date_format(col("d"), "-MM-dd")) df_string.filter("d!=date_string").show(1000) {code} {code} +--+---+ | d|date_string| +--+---+ |2010-12-26| 2011-12-26| |2010-12-27| 2011-12-27| |2010-12-28| 2011-12-28| |2010-12-29| 2011-12-29| |2010-12-30| 2011-12-30| |2010-12-31| 2011-12-31| |2012-12-30| 2013-12-30| |2012-12-31| 2013-12-31| |2013-12-29| 2014-12-29| |2013-12-30| 2014-12-30| |2013-12-31| 2014-12-31| |2014-12-28| 2015-12-28| |2014-12-29| 2015-12-29| |2014-12-30| 2015-12-30| |2014-12-31| 2015-12-31| |2015-12-27| 2016-12-27| |2015-12-28| 2016-12-28| |2015-12-29| 2016-12-29| |2015-12-30| 2016-12-30| |2015-12-31| 2016-12-31| |2017-12-31| 2018-12-31| |2018-12-30| 2019-12-30| |2018-12-31| 2019-12-31| |2019-12-29| 2020-12-29| |2019-12-30| 2020-12-30| |2019-12-31| 2020-12-31| |2020-12-27| 2021-12-27| |2020-12-28| 2021-12-28| |2020-12-29| 2021-12-29| |2020-12-30| 2021-12-30| |2020-12-31| 2021-12-31| |2021-12-26| 2022-12-26| 
|2021-12-27| 2022-12-27| |2021-12-28| 2022-12-28| |2021-12-29| 2022-12-29| |2021-12-30| 2022-12-30| |2021-12-31| 2022-12-31| |2023-12-31| 2024-12-31| |2024-12-29| 2025-12-29| |2024-12-30| 2025-12-30| |2024-12-31| 2025-12-31| |2025-12-28| 2026-12-28| |2025-12-29| 2026-12-29| |2025-12-30| 2026-12-30| |2025-12-31| 2026-12-31| |2026-12-27| 2027-12-27| |2026-12-28| 2027-12-28| |2026-12-29| 2027-12-29| |2026-12-30| 2027-12-30| |2026-12-31| 2027-12-31| |2027-12-26| 2028-12-26| |2027-12-27| 2028-12-27| |2027-12-28| 2028-12-28| |2027-12-29| 2028-12-29| |2027-12-30| 2028-12-30| |2027-12-31| 2028-12-31| |2028-12-31| 2029-12-31| |2029-12-30| 2030-12-30| |2029-12-31| 2030-12-31| |2030-12-29| 2031-12-29| |2030-12-30| 2031-12-30| |2030-12-31| 2031-12-31| |2031-12-28| 2032-12-28| |2031-12-29| 2032-12-29| |2031-12-30| 2032-12-30| |2031-12-31| 2032-12-31| |2032-12-26| 2033-12-26| |2032-12-27| 2033-12-27| |2032-12-28| 2033-12-28| |2032-12-29| 2033-12-29| |2032-12-30| 2033-12-30| |2032-12-31| 2033-12-31| |2034-12-31| 2035-12-31| |2035-12-30| 2036-12-30| |2035-12-31| 2036-12-31| |2036-12-28| 2037-12-28| |2036-12-29| 2037-12-29| |2036-12-30| 2037-12-30| |2036-12-31| 2037-12-31| |2037-12-27| 2038-12-27| |2037-12-28| 2038-12-28| |2037-12-29| 2038-12-29| |2037-12-30| 2038-12-30| |2037-12-31| 2038-12-31| |2038-12-26| 2039-12-26| |2038-12-27| 2039-12-27| |2038-12-28| 2039-12-28| |2038-12-29| 2039-12-29| |2038-12-30| 2039-12-30| |2038-12-31| 2039-12-31| |2040-12-30| 2041-12-30| |2040-12-31| 2041-12-31| |2041-12-29| 2042-12-29| |2041-12-30| 2042-12-30| |2041-12-31| 2042-12-31| |2042-12-28| 2043-12-28| |2042-12-29| 2043-12-29| |2042-12-30| 2043-12-30| |2042-12-31| 2043-12-31| |2043-12-27| 2044-12-27| |2043-12-28| 2044-12-28| |2043-12-29| 2044-12-29| |2043-12-30| 2044-12-30| |2043-12-31| 2044-12-31| |2045-12-31| 2046-12-31| |2046-12-30| 2047-12-30| |2046-12-31| 2047-12-31| |2047-12-29| 2048-12-29| |2047-12-30| 2048-12-30| |2047-12-31| 2048-12-31| |2048-12-27| 2049-12-27| |2048-12-28| 2049-12-28| 
|2048-12-29| 2049-12-29| |2048-12-30| 2049-12-30| |2048-12-31| 2049-12-31| |2049-12-26| 2050-12-26| |2049-12-27| 2050-12-27| |2049-12-28| 2050-12-28| |2049-12-29| 2050-12-29| |2049-12-30| 2050-12-30| |2049-12-31| 2050-12-31| |2051-12-31| 2052-12-31| |2052-12-29| 2053-12-29| |2052-12-30| 2053-12-30| |2052-12-31| 2053-12-31| |2053-12-28| 2054-12-28| |2053-12-29| 2054-12-29| |2053-12-30| 2054-12-30| |2053-12-31| 2054-12-31| |2054-12-27| 2055-12-27| |2054-12-28| 2055-12-28| |2054-12-29| 2055-12-29| |2054-12-30| 2055-12-30| |2054-12-31| 2055-12-31| +--+---+ {code} > Pyspark bug in date_format > -- > > Key: SPARK-28874 > URL: https://issues.apache.org/jira/browse/SPARK-28874 >
[jira] [Resolved] (SPARK-28668) Support the V2SessionCatalog with AlterTable commands
[ https://issues.apache.org/jira/browse/SPARK-28668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-28668. - Fix Version/s: 3.0.0 Assignee: Burak Yavuz Resolution: Fixed > Support the V2SessionCatalog with AlterTable commands > - > > Key: SPARK-28668 > URL: https://issues.apache.org/jira/browse/SPARK-28668 > Project: Spark > Issue Type: Planned Work > Components: SQL >Affects Versions: 3.0.0 >Reporter: Burak Yavuz >Assignee: Burak Yavuz >Priority: Blocker > Fix For: 3.0.0 > > > We need to support the V2SessionCatalog with AlterTable commands so that V2 > DataSources can leverage DDL through SQL ALTER TABLE commands.
[jira] [Updated] (SPARK-28874) Pyspark bug in date_format
[ https://issues.apache.org/jira/browse/SPARK-28874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28874: - Description: Pyspark date_format adds one year in the last days of the year: Example: {code:python} from pyspark.sql.functions import date_format, lit spark.range(1).select(date_format(lit("2010-12-26"), "-MM-dd")).show() {code} {code} +---+ |date_format(2010-12-26, -MM-dd)| +---+ | 2011-12-26| +---+ {code} was: Pyspark date_format adds one year in the last days of the year: Example: {code:python} spark.range(1).select(date_format(lit("2010-12-26"), "-MM-dd")).show() {code} {code} +---+ |date_format(2010-12-26, -MM-dd)| +---+ | 2011-12-26| +---+ {code} > Pyspark bug in date_format > -- > > Key: SPARK-28874 > URL: https://issues.apache.org/jira/browse/SPARK-28874 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.1.0, 2.3.0 >Reporter: Luis >Priority: Major > > Pyspark date_format adds one year in the last days of the year: > Example: > {code:python} > from pyspark.sql.functions import date_format, lit > spark.range(1).select(date_format(lit("2010-12-26"), "-MM-dd")).show() > {code} > {code} > +---+ > |date_format(2010-12-26, -MM-dd)| > +---+ > | 2011-12-26| > +---+ > {code}