[jira] [Updated] (SPARK-45351) Change RocksDB as default shuffle service db backend
[ https://issues.apache.org/jira/browse/SPARK-45351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45351: --- Labels: pull-request-available (was: ) > Change RocksDB as default shuffle service db backend > > > Key: SPARK-45351 > URL: https://issues.apache.org/jira/browse/SPARK-45351 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Jia Fan >Priority: Major > Labels: pull-request-available > > Change RocksDB as default shuffle service db backend, because we will remove > leveldb in the future. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
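For context, the shuffle service's local DB backend is already selectable via a configuration key; the ticket only proposes flipping its default. A minimal, illustrative spark-defaults.conf fragment (the key name and the LEVELDB/ROCKSDB values reflect the existing `spark.shuffle.service.db.backend` option; confirm against the linked PR before relying on this):

```properties
# Shuffle service state-store backend used for work-preserving restarts.
# LEVELDB was the historical default; the proposal changes the default to
# ROCKSDB ahead of leveldb's removal.
spark.shuffle.service.db.backend ROCKSDB
```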
[jira] [Created] (SPARK-45351) Change RocksDB as default shuffle service db backend
Jia Fan created SPARK-45351: --- Summary: Change RocksDB as default shuffle service db backend Key: SPARK-45351 URL: https://issues.apache.org/jira/browse/SPARK-45351 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 4.0.0 Reporter: Jia Fan Change the default shuffle service db backend to RocksDB, because we will remove leveldb in the future.
[jira] [Resolved] (SPARK-45340) Remove the SQL config spark.sql.hive.verifyPartitionPath
[ https://issues.apache.org/jira/browse/SPARK-45340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-45340. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43130 [https://github.com/apache/spark/pull/43130] > Remove the SQL config spark.sql.hive.verifyPartitionPath > > > Key: SPARK-45340 > URL: https://issues.apache.org/jira/browse/SPARK-45340 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > The SQL config spark.sql.hive.verifyPartitionPath has been deprecated since version 3.0, quite a while ago. It can be removed in version 4.0.
[jira] [Created] (SPARK-45350) Rename the imported Java Boolean to JBoolean
Yang Jie created SPARK-45350: Summary: Rename the imported Java Boolean to JBoolean Key: SPARK-45350 URL: https://issues.apache.org/jira/browse/SPARK-45350 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 4.0.0 Reporter: Yang Jie Some places have used `import java.lang.Boolean` to import the Java Boolean type, which can easily cause ambiguity; it should be renamed to JBoolean.
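The ambiguity and the proposed rename can be sketched as follows (a hypothetical example object, not code from the Spark patch): in Scala, a bare `import java.lang.Boolean` shadows Scala's own `Boolean`, while a rename-on-import keeps the boxed Java type clearly distinguishable.

```scala
// Without a rename, `Boolean` would refer to java.lang.Boolean after the
// import, shadowing Scala's Boolean. The rename avoids that:
import java.lang.{Boolean => JBoolean}

object JBooleanExample {
  def main(args: Array[String]): Unit = {
    val boxed: JBoolean = JBoolean.TRUE          // boxed java.lang.Boolean
    val primitive: Boolean = boxed.booleanValue  // Scala's (primitive) Boolean
    println(primitive)                           // prints: true
  }
}
```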
[jira] [Resolved] (SPARK-44681) Solve issue referencing github.com/apache/spark-connect-go as Go library
[ https://issues.apache.org/jira/browse/SPARK-44681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BoYang resolved SPARK-44681. Fix Version/s: 3.4.0 Target Version/s: 3.5.0 Resolution: Fixed > Solve issue referencing github.com/apache/spark-connect-go as Go library > > > Key: SPARK-44681 > URL: https://issues.apache.org/jira/browse/SPARK-44681 > Project: Spark > Issue Type: Sub-task > Components: Connect Contrib >Affects Versions: 3.5.0 >Reporter: BoYang >Priority: Major > Fix For: 3.4.0
[jira] [Resolved] (SPARK-44780) Document SQL Session variables
[ https://issues.apache.org/jira/browse/SPARK-44780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-44780. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 42467 [https://github.com/apache/spark/pull/42467] > Document SQL Session variables > -- > > Key: SPARK-44780 > URL: https://issues.apache.org/jira/browse/SPARK-44780 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 3.4.2 >Reporter: Serge Rielau >Assignee: Serge Rielau >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: Screenshot 2023-08-11 at 10.22.55 PM.png, Screenshot > 2023-08-11 at 10.24.33 PM.png, Screenshot 2023-08-11 at 10.26.54 PM.png > > > SQL Session variables have been added with: SPARK-42849. > Here we add the docs for it.
[jira] [Assigned] (SPARK-44780) Document SQL Session variables
[ https://issues.apache.org/jira/browse/SPARK-44780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-44780: --- Assignee: Serge Rielau > Document SQL Session variables > -- > > Key: SPARK-44780 > URL: https://issues.apache.org/jira/browse/SPARK-44780 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 3.4.2 >Reporter: Serge Rielau >Assignee: Serge Rielau >Priority: Major > Labels: pull-request-available > Attachments: Screenshot 2023-08-11 at 10.22.55 PM.png, Screenshot > 2023-08-11 at 10.24.33 PM.png, Screenshot 2023-08-11 at 10.26.54 PM.png > > > SQL Session variables have been added with: SPARK-42849. > Here we add the docs for it.
[jira] [Updated] (SPARK-44780) Document SQL Session variables
[ https://issues.apache.org/jira/browse/SPARK-44780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-44780: --- Labels: pull-request-available (was: ) > Document SQL Session variables > -- > > Key: SPARK-44780 > URL: https://issues.apache.org/jira/browse/SPARK-44780 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 3.4.2 >Reporter: Serge Rielau >Priority: Major > Labels: pull-request-available > Attachments: Screenshot 2023-08-11 at 10.22.55 PM.png, Screenshot > 2023-08-11 at 10.24.33 PM.png, Screenshot 2023-08-11 at 10.26.54 PM.png > > > SQL Session variables have been added with: SPARK-42849. > Here we add the docs for it.
[jira] [Updated] (SPARK-45338) Remove scala.collection.JavaConverters
[ https://issues.apache.org/jira/browse/SPARK-45338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-45338: - Parent Issue: SPARK-45314 (was: SPARK-44111) > Remove scala.collection.JavaConverters > -- > > Key: SPARK-45338 > URL: https://issues.apache.org/jira/browse/SPARK-45338 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Jia Fan >Priority: Major > Labels: pull-request-available > > Remove deprecated scala.collection.JavaConverters, replaced by > scala.jdk.CollectionConverters
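The migration described above is largely mechanical: the deprecated import and its Scala 2.13 replacement expose the same `.asScala`/`.asJava` extension methods. A small self-contained sketch (a hypothetical example object, not code from the Spark PR):

```scala
// Deprecated since Scala 2.13:
//   import scala.collection.JavaConverters._
// Replacement with the same conversion methods:
import scala.jdk.CollectionConverters._

object ConvertersExample {
  def main(args: Array[String]): Unit = {
    val javaList = new java.util.ArrayList[String]()
    javaList.add("spark")

    // Java -> Scala and back, using the scala.jdk converters
    val scalaSeq: Seq[String] = javaList.asScala.toSeq
    val backToJava: java.util.List[String] = scalaSeq.asJava

    println(scalaSeq == Seq("spark"))       // prints: true
    println(backToJava.get(0))              // prints: spark
  }
}
```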
[jira] [Resolved] (SPARK-43850) Remove the import for scala.language.higherKinds and delete the corresponding suppression rule
[ https://issues.apache.org/jira/browse/SPARK-43850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie resolved SPARK-43850. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43128 [https://github.com/apache/spark/pull/43128] > Remove the import for scala.language.higherKinds and delete the corresponding > suppression rule > -- > > Key: SPARK-43850 > URL: https://issues.apache.org/jira/browse/SPARK-43850 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0
[jira] [Assigned] (SPARK-43850) Remove the import for scala.language.higherKinds and delete the corresponding suppression rule
[ https://issues.apache.org/jira/browse/SPARK-43850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie reassigned SPARK-43850: Assignee: Yang Jie > Remove the import for scala.language.higherKinds and delete the corresponding > suppression rule > -- > > Key: SPARK-43850 > URL: https://issues.apache.org/jira/browse/SPARK-43850 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available
[jira] [Updated] (SPARK-45349) Backport SPARK-44034 and SPARK-44074 to branch-3.4/branch-3.3
[ https://issues.apache.org/jira/browse/SPARK-45349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45349: --- Labels: pull-request-available (was: ) > Backport SPARK-44034 and SPARK-44074 to branch-3.4/branch-3.3 > > > Key: SPARK-45349 > URL: https://issues.apache.org/jira/browse/SPARK-45349 > Project: Spark > Issue Type: Task > Components: Tests >Affects Versions: 3.4.2, 3.3.4 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > > Improve the success rate of CI
[jira] [Created] (SPARK-45349) Backport SPARK-44034 and SPARK-44074 to branch-3.4/branch-3.3
Yang Jie created SPARK-45349: Summary: Backport SPARK-44034 and SPARK-44074 to branch-3.4/branch-3.3 Key: SPARK-45349 URL: https://issues.apache.org/jira/browse/SPARK-45349 Project: Spark Issue Type: Task Components: Tests Affects Versions: 3.4.2, 3.3.4 Reporter: Yang Jie Improve the success rate of CI
[jira] [Updated] (SPARK-45348) Make the Maven build in GitHub Action check "javadoc:javadoc".
[ https://issues.apache.org/jira/browse/SPARK-45348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45348: --- Labels: pull-request-available (was: ) > Make the Maven build in GitHub Action check "javadoc:javadoc". > -- > > Key: SPARK-45348 > URL: https://issues.apache.org/jira/browse/SPARK-45348 > Project: Spark > Issue Type: Task > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available
[jira] [Updated] (SPARK-44223) Drop leveldb support
[ https://issues.apache.org/jira/browse/SPARK-44223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-44223: --- Labels: pull-request-available (was: ) > Drop leveldb support > > > Key: SPARK-44223 > URL: https://issues.apache.org/jira/browse/SPARK-44223 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > > The leveldb project seems to be no longer maintained, and we can always > replace it with rocksdb. I think we can remove support for and dependencies on > leveldb in Spark 4.0.
[jira] [Assigned] (SPARK-45334) Remove misleading comment in parquetSchemaConverter
[ https://issues.apache.org/jira/browse/SPARK-45334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen reassigned SPARK-45334: Assignee: Mengran Lan > Remove misleading comment in parquetSchemaConverter > --- > > Key: SPARK-45334 > URL: https://issues.apache.org/jira/browse/SPARK-45334 > Project: Spark > Issue Type: Documentation > Components: SQL >Affects Versions: 3.5.0 >Reporter: Mengran Lan >Assignee: Mengran Lan >Priority: Trivial > Labels: pull-request-available > > I'm debugging a parquet issue and reading spark code as a reference. I happened > to find a misleading comment which remains in the latest version as well. > {code:java} > Types > .buildGroup(repetition).as(LogicalTypeAnnotation.listType()) > .addField(Types > .buildGroup(REPEATED) > // "array" is the name chosen by parquet-hive (1.7.0 and prior version) > .addField(convertField(StructField("array", elementType, nullable))) > .named("bag")) > .named(field.name) {code} > the comment above is misleading since Hive always uses "array_element" as the > name. > It was introduced by this PR [https://github.com/apache/spark/pull/14399] and > relates to this issue https://issues.apache.org/jira/browse/SPARK-16777 > Furthermore, the parquet-hive module has been removed from the parquet-mr > project https://issues.apache.org/jira/browse/PARQUET-1676 > I suggest removing this piece of comment and will submit a PR later.
[jira] [Resolved] (SPARK-45334) Remove misleading comment in parquetSchemaConverter
[ https://issues.apache.org/jira/browse/SPARK-45334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-45334. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43119 [https://github.com/apache/spark/pull/43119] > Remove misleading comment in parquetSchemaConverter > --- > > Key: SPARK-45334 > URL: https://issues.apache.org/jira/browse/SPARK-45334 > Project: Spark > Issue Type: Documentation > Components: SQL >Affects Versions: 3.5.0 >Reporter: Mengran Lan >Assignee: Mengran Lan >Priority: Trivial > Labels: pull-request-available > Fix For: 4.0.0 > > > I'm debugging a parquet issue and reading spark code as a reference. I happened > to find a misleading comment which remains in the latest version as well. > {code:java} > Types > .buildGroup(repetition).as(LogicalTypeAnnotation.listType()) > .addField(Types > .buildGroup(REPEATED) > // "array" is the name chosen by parquet-hive (1.7.0 and prior version) > .addField(convertField(StructField("array", elementType, nullable))) > .named("bag")) > .named(field.name) {code} > the comment above is misleading since Hive always uses "array_element" as the > name. > It was introduced by this PR [https://github.com/apache/spark/pull/14399] and > relates to this issue https://issues.apache.org/jira/browse/SPARK-16777 > Furthermore, the parquet-hive module has been removed from the parquet-mr > project https://issues.apache.org/jira/browse/PARQUET-1676 > I suggest removing this piece of comment and will submit a PR later.
[jira] [Resolved] (SPARK-45302) Remove PID communication between Python workers when no daemon is used
[ https://issues.apache.org/jira/browse/SPARK-45302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-45302. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43087 [https://github.com/apache/spark/pull/43087] > Remove PID communication between Python workers when no daemon is used > - > > Key: SPARK-45302 > URL: https://issues.apache.org/jira/browse/SPARK-45302 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > We don't need to send the PID around when JDK 9+ is used because we can get > it from the API directly.
[jira] [Assigned] (SPARK-45302) Remove PID communication between Python workers when no daemon is used
[ https://issues.apache.org/jira/browse/SPARK-45302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-45302: Assignee: Hyukjin Kwon > Remove PID communication between Python workers when no daemon is used > - > > Key: SPARK-45302 > URL: https://issues.apache.org/jira/browse/SPARK-45302 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > We don't need to send the PID around when JDK 9+ is used because we can get > it from the API directly.
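The JDK 9+ API alluded to above is `ProcessHandle`, which lets a JVM read its own process id directly instead of having the worker send its PID over a socket. A minimal sketch (a hypothetical example object, not the actual patch):

```scala
object PidExample {
  def main(args: Array[String]): Unit = {
    // Since JDK 9, ProcessHandle exposes the current process id directly,
    // so no side-channel PID handshake between processes is needed.
    val pid: Long = ProcessHandle.current().pid()
    println(pid > 0)  // prints: true
  }
}
```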
[jira] [Commented] (SPARK-45282) Join loses records for cached datasets
[ https://issues.apache.org/jira/browse/SPARK-45282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17769383#comment-17769383 ] XiDuo You commented on SPARK-45282: --- I cannot reproduce this issue in the master branch (4.0.0), [~koert] have you tried the master branch? > Join loses records for cached datasets > -- > > Key: SPARK-45282 > URL: https://issues.apache.org/jira/browse/SPARK-45282 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.1, 3.5.0 > Environment: spark 3.4.1 on apache hadoop 3.3.6 or kubernetes 1.26 or > databricks 13.3 >Reporter: koert kuipers >Priority: Major > Labels: CorrectnessBug, correctness > > we observed this issue on spark 3.4.1 but it is also present on 3.5.0. it is > not present on spark 3.3.1. > it only shows up in a distributed environment. i cannot replicate it in a unit test. > however i did get it to show up on a hadoop cluster, kubernetes, and on > databricks 13.3 > the issue is that records are dropped when two cached dataframes are joined. > it seems in spark 3.4.1 in the query plan some Exchanges are dropped as an > optimization while in spark 3.3.1 these Exchanges are still present. it seems > to be an issue with AQE with canChangeCachedPlanOutputPartitioning=true. > to reproduce on a distributed cluster these settings are needed: > {code:java} > spark.sql.adaptive.advisoryPartitionSizeInBytes 33554432 > spark.sql.adaptive.coalescePartitions.parallelismFirst false > spark.sql.adaptive.enabled true > spark.sql.optimizer.canChangeCachedPlanOutputPartitioning true {code} > code using scala to reproduce is: > {code:java} > import java.util.UUID > import org.apache.spark.sql.functions.col > import spark.implicits._ > val data = (1 to 100).toDS().map(i => UUID.randomUUID().toString).persist() > val left = data.map(k => (k, 1)) > val right = data.map(k => (k, k)) // if i change this to k => (k, 1) it works! > println("number of left " + left.count()) > println("number of right " + right.count()) > println("number of (left join right) " + > left.toDF("key", "value1").join(right.toDF("key", "value2"), "key").count() > ) > val left1 = left > .toDF("key", "value1") > .repartition(col("key")) // comment out this line to make it work > .persist() > println("number of left1 " + left1.count()) > val right1 = right > .toDF("key", "value2") > .repartition(col("key")) // comment out this line to make it work > .persist() > println("number of right1 " + right1.count()) > println("number of (left1 join right1) " + left1.join(right1, "key").count()) // this gives incorrect result{code} > this produces the following output: > {code:java} > number of left 100 > number of right 100 > number of (left join right) 100 > number of left1 100 > number of right1 100 > number of (left1 join right1) 859531 {code} > note that the last number (the incorrect one) actually varies depending on > settings and cluster size etc.
[jira] [Created] (SPARK-45348) Make the Maven build in GitHub Action check "javadoc:javadoc".
Yang Jie created SPARK-45348: Summary: Make the Maven build in GitHub Action check "javadoc:javadoc". Key: SPARK-45348 URL: https://issues.apache.org/jira/browse/SPARK-45348 Project: Spark Issue Type: Task Components: Project Infra Affects Versions: 4.0.0 Reporter: Yang Jie
[jira] [Resolved] (SPARK-45339) Pyspark should log errors it retries
[ https://issues.apache.org/jira/browse/SPARK-45339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-45339. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43127 [https://github.com/apache/spark/pull/43127] > Pyspark should log errors it retries > > > Key: SPARK-45339 > URL: https://issues.apache.org/jira/browse/SPARK-45339 > Project: Spark > Issue Type: Improvement > Components: PySpark >Affects Versions: 3.5.0 >Reporter: Alice Sayutina >Assignee: Alice Sayutina >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0
[jira] [Assigned] (SPARK-45339) Pyspark should log errors it retries
[ https://issues.apache.org/jira/browse/SPARK-45339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-45339: Assignee: Alice Sayutina > Pyspark should log errors it retries > > > Key: SPARK-45339 > URL: https://issues.apache.org/jira/browse/SPARK-45339 > Project: Spark > Issue Type: Improvement > Components: PySpark >Affects Versions: 3.5.0 >Reporter: Alice Sayutina >Assignee: Alice Sayutina >Priority: Minor > Labels: pull-request-available
[jira] [Updated] (SPARK-43662) Enable ReshapeParityTests.test_merge_asof
[ https://issues.apache.org/jira/browse/SPARK-43662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-43662: --- Labels: pull-request-available (was: ) > Enable ReshapeParityTests.test_merge_asof > - > > Key: SPARK-43662 > URL: https://issues.apache.org/jira/browse/SPARK-43662 > Project: Spark > Issue Type: Sub-task > Components: Connect, Pandas API on Spark >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > Labels: pull-request-available > > Enable ReshapeParityTests.test_merge_asof
[jira] [Assigned] (SPARK-45328) Remove Hive support prior to 2.0.0
[ https://issues.apache.org/jira/browse/SPARK-45328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-45328: Assignee: Hyukjin Kwon > Remove Hive support prior to 2.0.0 > -- > > Key: SPARK-45328 > URL: https://issues.apache.org/jira/browse/SPARK-45328 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > They don't support JDK 17, and we can't make it supported.
[jira] [Resolved] (SPARK-45328) Remove Hive support prior to 2.0.0
[ https://issues.apache.org/jira/browse/SPARK-45328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-45328. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43116 [https://github.com/apache/spark/pull/43116] > Remove Hive support prior to 2.0.0 > -- > > Key: SPARK-45328 > URL: https://issues.apache.org/jira/browse/SPARK-45328 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > They don't support JDK 17, and we can't make it supported.
[jira] [Updated] (SPARK-45347) Include SparkThrowable in FetchErrorDetailsResponse
[ https://issues.apache.org/jira/browse/SPARK-45347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45347: --- Labels: pull-request-available (was: ) > Include SparkThrowable in FetchErrorDetailsResponse > --- > > Key: SPARK-45347 > URL: https://issues.apache.org/jira/browse/SPARK-45347 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 4.0.0 >Reporter: Yihong He >Priority: Major > Labels: pull-request-available
[jira] [Created] (SPARK-45347) Include SparkThrowable in FetchErrorDetailsResponse
Yihong He created SPARK-45347: - Summary: Include SparkThrowable in FetchErrorDetailsResponse Key: SPARK-45347 URL: https://issues.apache.org/jira/browse/SPARK-45347 Project: Spark Issue Type: New Feature Components: Connect Affects Versions: 4.0.0 Reporter: Yihong He
[jira] [Updated] (SPARK-44940) Improve performance of JSON parsing when "spark.sql.json.enablePartialResults" is enabled
[ https://issues.apache.org/jira/browse/SPARK-44940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-44940: -- Fix Version/s: 3.5.0 (was: 3.5.1) > Improve performance of JSON parsing when > "spark.sql.json.enablePartialResults" is enabled > - > > Key: SPARK-44940 > URL: https://issues.apache.org/jira/browse/SPARK-44940 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0, 3.5.0, 4.0.0 >Reporter: Ivan Sadikov >Assignee: Ivan Sadikov >Priority: Major > Labels: correctness, pull-request-available > Fix For: 3.4.2, 3.5.0 > > > Follow-up on https://issues.apache.org/jira/browse/SPARK-40646. > I found that JSON parsing is significantly slower due to exception creation > in control flow. Also, some fields are not parsed correctly and the exception > is thrown in certain cases: > {code:java} > Caused by: java.lang.ClassCastException: > org.apache.spark.sql.catalyst.util.GenericArrayData cannot be cast to > org.apache.spark.sql.catalyst.InternalRow > at > org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getStruct(rows.scala:51) > at > org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getStruct$(rows.scala:51) > at > org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getStruct(rows.scala:195) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown > Source) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown > Source) > at scala.collection.Iterator$$anon$10.next(Iterator.scala:461) > at > org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1$$anon$2.getNext(FileScanRDD.scala:590) > ... 39 more {code}
[jira] [Commented] (SPARK-44940) Improve performance of JSON parsing when "spark.sql.json.enablePartialResults" is enabled
[ https://issues.apache.org/jira/browse/SPARK-44940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17769338#comment-17769338 ] Thomas Graves commented on SPARK-44940: --- I noticed this went into 3.5.0 ([https://github.com/apache/spark/commits/v3.5.0]) so updating the fixed versions. > Improve performance of JSON parsing when > "spark.sql.json.enablePartialResults" is enabled > - > > Key: SPARK-44940 > URL: https://issues.apache.org/jira/browse/SPARK-44940 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0, 3.5.0, 4.0.0 >Reporter: Ivan Sadikov >Assignee: Ivan Sadikov >Priority: Major > Labels: correctness, pull-request-available > Fix For: 3.4.2, 3.5.1 > > > Follow-up on https://issues.apache.org/jira/browse/SPARK-40646. > I found that JSON parsing is significantly slower due to exception creation > in control flow. Also, some fields are not parsed correctly and the exception > is thrown in certain cases: > {code:java} > Caused by: java.lang.ClassCastException: > org.apache.spark.sql.catalyst.util.GenericArrayData cannot be cast to > org.apache.spark.sql.catalyst.InternalRow > at > org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getStruct(rows.scala:51) > at > org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getStruct$(rows.scala:51) > at > org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getStruct(rows.scala:195) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown > Source) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown > Source) > at scala.collection.Iterator$$anon$10.next(Iterator.scala:461) > at > org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1$$anon$2.getNext(FileScanRDD.scala:590) > ... 39 more {code}
[jira] [Updated] (SPARK-44442) Drop mesos support
[ https://issues.apache.org/jira/browse/SPARK-44442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-44442: --- Labels: pull-request-available (was: ) > Drop mesos support > -- > > Key: SPARK-44442 > URL: https://issues.apache.org/jira/browse/SPARK-44442 > Project: Spark > Issue Type: Sub-task > Components: Mesos >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > > [https://spark.apache.org/docs/latest/running-on-mesos.html] > > {_}Note{_}: Apache Mesos support is deprecated as of Apache Spark 3.2.0. It > will be removed in a future version.
[jira] [Updated] (SPARK-44034) Add a new test group for sql module
[ https://issues.apache.org/jira/browse/SPARK-44034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-44034: --- Labels: pull-request-available (was: ) > Add a new test group for sql module > --- > > Key: SPARK-44034 > URL: https://issues.apache.org/jira/browse/SPARK-44034 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 3.5.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45346) Parquet schema inference should respect case sensitive flag when merging schema
[ https://issues.apache.org/jira/browse/SPARK-45346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45346: --- Labels: pull-request-available (was: ) > Parquet schema inference should respect case sensitive flag when merging > schema > --- > > Key: SPARK-45346 > URL: https://issues.apache.org/jira/browse/SPARK-45346 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0, 3.5.0 >Reporter: Wenchen Fan >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45346) Parquet schema inference should respect case sensitive flag when merging schema
Wenchen Fan created SPARK-45346: --- Summary: Parquet schema inference should respect case sensitive flag when merging schema Key: SPARK-45346 URL: https://issues.apache.org/jira/browse/SPARK-45346 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.5.0, 3.4.0 Reporter: Wenchen Fan -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-45345) Refactor release-build.sh
[ https://issues.apache.org/jira/browse/SPARK-45345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17769278#comment-17769278 ] Yang Jie commented on SPARK-45345: -- Currently, I'm not familiar enough with this, so I'm not sure if it's necessary to refactor `release-build.sh` for Spark 4.0. > Refactor release-build.sh > - > > Key: SPARK-45345 > URL: https://issues.apache.org/jira/browse/SPARK-45345 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-44366) Migrate antlr4 from 4.9 to 4.10+
[ https://issues.apache.org/jira/browse/SPARK-44366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-44366. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43075 [https://github.com/apache/spark/pull/43075] > Migrate antlr4 from 4.9 to 4.10+ > > > Key: SPARK-44366 > URL: https://issues.apache.org/jira/browse/SPARK-44366 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: Yang Jie >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-44366) Migrate antlr4 from 4.9 to 4.10+
[ https://issues.apache.org/jira/browse/SPARK-44366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen reassigned SPARK-44366: Assignee: Yang Jie > Migrate antlr4 from 4.9 to 4.10+ > > > Key: SPARK-44366 > URL: https://issues.apache.org/jira/browse/SPARK-44366 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: Yang Jie >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-44756) Executor hangs when RetryingBlockTransferor fails to initiate retry
[ https://issues.apache.org/jira/browse/SPARK-44756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan resolved SPARK-44756. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 42426 [https://github.com/apache/spark/pull/42426] > Executor hangs when RetryingBlockTransferor fails to initiate retry > --- > > Key: SPARK-44756 > URL: https://issues.apache.org/jira/browse/SPARK-44756 > Project: Spark > Issue Type: Bug > Components: Shuffle, Spark Core >Affects Versions: 3.3.1 >Reporter: Harunobu Daikoku >Assignee: Harunobu Daikoku >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > > We have been observing this issue several times in our production where some > executors are being stuck at BlockTransferService#fetchBlockSync(). > After some investigation, the issue seems to be caused by an unhandled edge > case in RetryingBlockTransferor. > 1. Shuffle transfer fails for whatever reason > {noformat} > java.io.IOException: Cannot allocate memory > at sun.nio.ch.FileDispatcherImpl.write0(Native Method) > at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60) > at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) > at sun.nio.ch.IOUtil.write(IOUtil.java:51) > at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:211) > at > org.apache.spark.network.shuffle.SimpleDownloadFile$SimpleDownloadWritableChannel.write(SimpleDownloadFile.java:78) > at > org.apache.spark.network.shuffle.OneForOneBlockFetcher$DownloadCallback.onData(OneForOneBlockFetcher.java:340) > at > org.apache.spark.network.client.StreamInterceptor.handle(StreamInterceptor.java:79) > at > org.apache.spark.network.util.TransportFrameDecoder.feedInterceptor(TransportFrameDecoder.java:263) > at > org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:87) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) > {noformat} > 2. 
The above exception caught by > [AbstractChannelHandlerContext#invokeChannelRead()|https://github.com/netty/netty/blob/netty-4.1.74.Final/transport/src/main/java/io/netty/channel/AbstractChannelHandlerContext.java#L381], > and propagated to the exception handler > 3. Exception reaches > [RetryingBlockTransferor#initiateRetry()|https://github.com/apache/spark/blob/v3.3.1/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RetryingBlockTransferor.java#L178-L180], > and it tries to initiate retry > {noformat} > 23/08/09 16:58:37 shuffle-client-4-2 INFO RetryingBlockTransferor: Retrying > fetch (1/3) for 1 outstanding blocks after 5000 ms > {noformat} > 4. Retry initiation fails (in our case, it fails to create a new thread) > 5. Exception caught by > [AbstractChannelHandlerContext#invokeExceptionCaught()|https://github.com/netty/netty/blob/netty-4.1.74.Final/transport/src/main/java/io/netty/channel/AbstractChannelHandlerContext.java#L305-L309], > and not further processed > {noformat} > 23/08/09 16:58:53 shuffle-client-4-2 DEBUG AbstractChannelHandlerContext: An > exception java.lang.OutOfMemoryError: unable to create new native thread > at java.lang.Thread.start0(Native Method) > at java.lang.Thread.start(Thread.java:719) > at > java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957) > at > java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1378) > at > java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112) > at > org.apache.spark.network.shuffle.RetryingBlockTransferor.initiateRetry(RetryingBlockTransferor.java:182) > at > org.apache.spark.network.shuffle.RetryingBlockTransferor.access$500(RetryingBlockTransferor.java:43) > at > org.apache.spark.network.shuffle.RetryingBlockTransferor$RetryingBlockTransferListener.handleBlockTransferFailure(RetryingBlockTransferor.java:230) > at > 
org.apache.spark.network.shuffle.RetryingBlockTransferor$RetryingBlockTransferListener.onBlockFetchFailure(RetryingBlockTransferor.java:260) > at > org.apache.spark.network.shuffle.OneForOneBlockFetcher.failRemainingBlocks(OneForOneBlockFetcher.java:318) > at > org.apache.spark.network.shuffle.OneForOneBlockFetcher.access$300(OneForOneBlockFetcher.java:55) > at > org.apache.spark.network.shuffle.OneForOneBlockFetcher$DownloadCallback.onFailure(OneForOneBlockFetcher.java:357) > at > org.apache.spark.network.client.StreamInterceptor.exceptionCaught(StreamInterceptor.java:56) > at >
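The failure mode above is that the exception thrown while *scheduling* the retry (here an OutOfMemoryError from thread creation) dies inside Netty's exceptionCaught and nothing ever completes the fetch. A minimal sketch of the defensive pattern — names are ours, not the actual Spark patch from PR 42426: guard the submit and run a terminal failure callback so the caller cannot hang.

```java
import java.util.concurrent.ExecutorService;

// Hypothetical sketch: if submitting the retry task itself fails, surface the
// failure to the outstanding blocks instead of letting it be swallowed.
public class RetryGuard {
    public static boolean initiateRetry(ExecutorService executor,
                                        Runnable retry,
                                        Runnable onGiveUp) {
        try {
            executor.submit(retry);
            return true;
        } catch (Throwable t) {  // RejectedExecutionException, OutOfMemoryError, ...
            onGiveUp.run();      // fail the remaining blocks; don't hang the fetch
            return false;
        }
    }
}
```

With this shape, a failed retry initiation degrades into an ordinary fetch failure that BlockTransferService#fetchBlockSync() can observe.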
[jira] [Assigned] (SPARK-44756) Executor hangs when RetryingBlockTransferor fails to initiate retry
[ https://issues.apache.org/jira/browse/SPARK-44756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan reassigned SPARK-44756: --- Assignee: Harunobu Daikoku > Executor hangs when RetryingBlockTransferor fails to initiate retry > --- > > Key: SPARK-44756 > URL: https://issues.apache.org/jira/browse/SPARK-44756 > Project: Spark > Issue Type: Bug > Components: Shuffle, Spark Core >Affects Versions: 3.3.1 >Reporter: Harunobu Daikoku >Assignee: Harunobu Daikoku >Priority: Minor > Labels: pull-request-available
[jira] [Updated] (SPARK-45344) Remove all scala version string check
[ https://issues.apache.org/jira/browse/SPARK-45344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45344: --- Labels: pull-request-available (was: ) > Remove all scala version string check > - > > Key: SPARK-45344 > URL: https://issues.apache.org/jira/browse/SPARK-45344 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
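SPARK-45344 removes checks of this shape: with Scala 2.12 support dropped, branches keyed on the Scala version string are dead code. A hypothetical sketch (the flag name and method are illustrative, not the exact code being removed):

```java
// Hypothetical example of a scala-version-string branch that becomes dead
// once only Scala 2.13 is supported.
public class ScalaVersionBranch {
    public static String extraDocFlag(String scalaVersion) {
        if (scalaVersion.startsWith("2.12")) {  // dead branch after the 2.12 drop
            return "-no-java-comments";          // assumed 2.12-era scaladoc flag
        }
        return "";
    }
}
```

After the cleanup, callers can use the 2.13 path unconditionally.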
[jira] [Updated] (SPARK-45343) CSV multiLine documentation is confusing
[ https://issues.apache.org/jira/browse/SPARK-45343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen updated SPARK-45343: - Priority: Trivial (was: Major) > CSV multiLine documentation is confusing > > > Key: SPARK-45343 > URL: https://issues.apache.org/jira/browse/SPARK-45343 > Project: Spark > Issue Type: Documentation > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: Bill Schneider >Priority: Trivial > Labels: pull-request-available > > This is confusing, maybe copy-paste from JSON: > |Parse one record, which may span multiple lines, per file. CSV built-in > functions ignore this option.| > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45343) CSV multiLine documentation is confusing
[ https://issues.apache.org/jira/browse/SPARK-45343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45343: --- Labels: pull-request-available (was: ) > CSV multiLine documentation is confusing > > > Key: SPARK-45343 > URL: https://issues.apache.org/jira/browse/SPARK-45343 > Project: Spark > Issue Type: Documentation > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: Bill Schneider >Priority: Major > Labels: pull-request-available > > This is confusing, maybe copy-paste from JSON: > |Parse one record, which may span multiple lines, per file. CSV built-in > functions ignore this option.| > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45345) Refactor release-build.sh
Yang Jie created SPARK-45345: Summary: Refactor release-build.sh Key: SPARK-45345 URL: https://issues.apache.org/jira/browse/SPARK-45345 Project: Spark Issue Type: Sub-task Components: Project Infra Affects Versions: 4.0.0 Reporter: Yang Jie -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45344) Remove all scala version string check
Yang Jie created SPARK-45344: Summary: Remove all scala version string check Key: SPARK-45344 URL: https://issues.apache.org/jira/browse/SPARK-45344 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 4.0.0 Reporter: Yang Jie -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45217) Support change log level of specific package or class
[ https://issues.apache.org/jira/browse/SPARK-45217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45217: --- Labels: pull-request-available (was: ) > Support change log level of specific package or class > - > > Key: SPARK-45217 > URL: https://issues.apache.org/jira/browse/SPARK-45217 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: Zhongwei Zhu >Priority: Minor > Labels: pull-request-available > > Add SparkContext.setLogLevel(loggerName: String, logLevel: String) to support > change log level of specific package or class -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
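The proposed SparkContext.setLogLevel(loggerName: String, logLevel: String) sets a level on one named logger (a package or class) without touching the root logger. Spark logs through log4j, but the idea maps onto any logging framework; a minimal sketch using only java.util.logging:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative only (java.util.logging, not Spark's log4j wiring): adjust the
// level of a single named logger, leaving every other logger unchanged.
public class PerLoggerLevel {
    public static void setLogLevel(String loggerName, String level) {
        Logger.getLogger(loggerName).setLevel(Level.parse(level.toUpperCase()));
    }
}
```

This is the behavior the JIRA asks to expose: turning up verbosity for, say, one scheduler package while the rest of the application stays at its default level.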
[jira] [Commented] (SPARK-45343) CSV multiLine documentation is confusing
[ https://issues.apache.org/jira/browse/SPARK-45343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769236#comment-17769236 ] Bill Schneider commented on SPARK-45343: PR: https://github.com/apache/spark/pull/43132 > CSV multiLine documentation is confusing > > > Key: SPARK-45343 > URL: https://issues.apache.org/jira/browse/SPARK-45343 > Project: Spark > Issue Type: Documentation > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: Bill Schneider >Priority: Major > > This is confusing, maybe copy-paste from JSON: > |Parse one record, which may span multiple lines, per file. CSV built-in > functions ignore this option.| > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45316) Respect `spark.sql.files.ignoreMissingFiles` in HadoopRDD
[ https://issues.apache.org/jira/browse/SPARK-45316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-45316. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43097 [https://github.com/apache/spark/pull/43097] > Respect `spark.sql.files.ignoreMissingFiles` in HadoopRDD > - > > Key: SPARK-45316 > URL: https://issues.apache.org/jira/browse/SPARK-45316 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Currently, the SQL config spark.sql.files.ignoreMissingFiles influences RDDs > created in Spark SQL, such as FileScanRDD, but doesn't affect HadoopRDD > and NewHadoopRDD. The latter RDDs have a separate core config, > spark.files.ignoreMissingFiles. That inconsistency might confuse users who > don't know the implementation details. This ticket aims to eliminate the > inconsistency. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
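Both configs gate the same behavior: when "ignore missing files" is on, a scan skips paths that disappeared between planning and execution instead of failing the whole job. A hypothetical sketch of that semantics (plain Java, not Spark's RDD code):

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Illustrative: the flag decides between skipping a vanished input file and
// failing the scan, which is what ignoreMissingFiles controls in Spark.
public class MissingFileScan {
    public static List<String> readAll(List<Path> paths, boolean ignoreMissing) {
        List<String> out = new ArrayList<>();
        for (Path p : paths) {
            if (Files.notExists(p)) {
                if (ignoreMissing) continue;  // like ignoreMissingFiles=true
                throw new UncheckedIOException(
                    new FileNotFoundException(p.toString()));
            }
            try {
                out.add(Files.readString(p));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return out;
    }
}
```

SPARK-45316's change is that HadoopRDD and NewHadoopRDD now consult the SQL config for this decision, not only the core `spark.files.ignoreMissingFiles`.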
[jira] [Created] (SPARK-45343) CSV multiLine documentation is confusing
Bill Schneider created SPARK-45343: -- Summary: CSV multiLine documentation is confusing Key: SPARK-45343 URL: https://issues.apache.org/jira/browse/SPARK-45343 Project: Spark Issue Type: Documentation Components: Spark Core Affects Versions: 3.5.0 Reporter: Bill Schneider This is confusing, maybe copy-paste from JSON: |Parse one record, which may span multiple lines, per file. CSV built-in functions ignore this option.| -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
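What the confusing sentence is trying to say: one CSV *record* can span several physical lines when a quoted field contains a newline, and Spark's `multiLine` option exists for exactly that case. A minimal parser sketch (plain Java, not Spark's CSV reader) showing a record that ends only at an unquoted newline:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative: parse the FIRST logical CSV record from `text`, honoring
// quoted fields that may themselves contain newlines.
public class MultiLineCsv {
    public static List<String> firstRecord(String text) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inQuotes) {
                if (c == '"') inQuotes = false;
                else cur.append(c);          // newline inside quotes stays in the field
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(cur.toString());
                cur.setLength(0);
            } else if (c == '\n') {
                break;                       // record ends only at an UNQUOTED newline
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString());
        return fields;
    }
}
```

A line-at-a-time reader would split such a record in two; with `multiLine` enabled, Spark parses the whole file so quoted newlines survive. The "CSV built-in functions ignore this option" clause means functions operating on a single string column have no file lines to merge in the first place.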
[jira] [Created] (SPARK-45342) Remove the scala doc compilation option specific to Scala 2.12.
Yang Jie created SPARK-45342: Summary: Remove the scala doc compilation option specific to Scala 2.12. Key: SPARK-45342 URL: https://issues.apache.org/jira/browse/SPARK-45342 Project: Spark Issue Type: Sub-task Components: Build Affects Versions: 4.0.0 Reporter: Yang Jie -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45341) Make the sbt doc command execute successfully with Java 17
[ https://issues.apache.org/jira/browse/SPARK-45341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45341: --- Labels: pull-request-available (was: ) > Make the sbt doc command execute successfully with Java 17 > -- > > Key: SPARK-45341 > URL: https://issues.apache.org/jira/browse/SPARK-45341 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > > {code:java} > [error] /Users/yangjie01/SourceCode/git/spark-mine-sbt/Picked up > JAVA_TOOL_OPTIONS:-Duser.language=en > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBTypeInfo.java... > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/ArrayWrappers.java... > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVIndex.java... > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java... > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java... > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java... > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBTypeInfo.java... > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/UnsupportedStoreVersionException.java... 
> [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java... > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreIterator.java... > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStore.java... > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreView.java... > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVTypeInfo.java... > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java... > [error] Loading source file > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreSerializer.java... > [error] Constructing Javadoc information... > [error] Building index for all the packages and classes... > [error] Standard Doclet version 17.0.8+7-LTS > [error] Building tree for all the packages and classes... > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStore.java:32:1: > error: heading used out of sequence: , compared to implicit preceding > heading: > [error] * Serialization > [error] ^Generating > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/InMemoryStore.html... > [error] Generating > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/KVIndex.html... 
> [error] Generating > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/KVStore.html... > [error] Generating > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/KVStoreIterator.html... > [error] Generating > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/KVStoreSerializer.html... > [error] Generating > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/KVStoreView.html... > [error] Generating > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/KVTypeInfo.html... > [error] Generating > /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/LevelDB.html... > [error]
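The "heading used out of sequence" error above (the heading tag names were eaten by HTML rendering) is Java 17's stricter doclet rejecting a doc-comment heading whose level skips the implicit preceding one. A hedged sketch of a class comment that passes — the interface name and heading level are illustrative assumptions, not the real KVStore.java fix:

```java
/**
 * KV-store style interface (illustrative, not the real KVStore.java).
 *
 * <p>The class description starts at an implicit heading level, so the first
 * explicit heading must not jump above it; otherwise javadoc 17 reports
 * "heading used out of sequence".</p>
 *
 * <h2>Serialization</h2>
 *
 * <p>How values are serialized to the underlying store.</p>
 */
public interface SketchStore {
    int VERSION = 1;  // trivial member so the sketch is a complete compilation unit
}
```

The fix for the build error is therefore in the Javadoc markup, not in the sbt configuration itself.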
[jira] [Updated] (SPARK-45341) Make the sbt doc command execute successfully with Java 17
[ https://issues.apache.org/jira/browse/SPARK-45341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-45341: - Description: (the same Javadoc error log quoted in the preceding message)
[jira] [Created] (SPARK-45341) Make the sbt doc command execute successfully with Java 17
Yang Jie created SPARK-45341: Summary: Make the sbt doc command execute successfully with Java 17 Key: SPARK-45341 URL: https://issues.apache.org/jira/browse/SPARK-45341 Project: Spark Issue Type: Sub-task Components: Build Affects Versions: 4.0.0 Reporter: Yang Jie {code:java} [error] /Users/yangjie01/SourceCode/git/spark-mine-sbt/Picked up JAVA_TOOL_OPTIONS:-Duser.language=en [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBTypeInfo.java... [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/ArrayWrappers.java... [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVIndex.java... [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java... [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java... [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java... [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBTypeInfo.java... [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/UnsupportedStoreVersionException.java... [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java... [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreIterator.java... 
[error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStore.java... [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreView.java... [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVTypeInfo.java... [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java... [error] Loading source file /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreSerializer.java... [error] Constructing Javadoc information... [error] Building index for all the packages and classes... [error] Standard Doclet version 17.0.8+7-LTS [error] Building tree for all the packages and classes... [error] /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStore.java:32:1: error: heading used out of sequence: , compared to implicit preceding heading: [error] * Serialization [error] ^Generating /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/InMemoryStore.html... [error] Generating /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/KVIndex.html... [error] Generating /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/KVStore.html... [error] Generating /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/KVStoreIterator.html... [error] Generating /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/KVStoreSerializer.html... 
[error] Generating /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/KVStoreView.html... [error] Generating /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/KVTypeInfo.html... [error] Generating /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/LevelDB.html... [error] Generating /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/LevelDB.TypeAliases.html... [error] Generating /Users/yangjie01/SourceCode/git/spark-mine-sbt/common/kvstore/target/scala-2.13/api/org/apache/spark/util/kvstore/RocksDB.html... [error] Generating
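The doclet failure above ("heading used out of sequence") comes from a stricter heading-order check in the JDK 17 standard doclet: explicit headings in a doc comment must not skip levels relative to the implicit heading of their context (for a class description, typically `<h2>`). A minimal sketch of a comment that passes the check — `HeadingDemo` is a hypothetical class for illustration, not Spark's actual `KVStore.java`:

```java
/**
 * A toy class used only to illustrate the JDK 17 doclet rule that tripped
 * the sbt doc build above. Headings in a doc comment must descend one
 * level at a time from the implicit context heading; jumping straight to
 * a deeper level (e.g. an h4 with no h3 before it) is reported as
 * "heading used out of sequence".
 *
 * <h2>Serialization</h2>
 * Values are stored as plain strings in this sketch.
 */
public class HeadingDemo {
    private final String value;

    public HeadingDemo(String value) {
        this.value = value;
    }

    /** Returns the stored value. */
    public String value() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(new HeadingDemo("ok").value());
    }
}
```

Running `javadoc` on JDK 17 accepts this form; demoting the heading below the implicit level of its context reproduces the error in the log.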
[jira] [Updated] (SPARK-45340) Remove the SQL config spark.sql.hive.verifyPartitionPath
[ https://issues.apache.org/jira/browse/SPARK-45340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45340: --- Labels: pull-request-available (was: ) > Remove the SQL config spark.sql.hive.verifyPartitionPath > > > Key: SPARK-45340 > URL: https://issues.apache.org/jira/browse/SPARK-45340 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > Labels: pull-request-available > > The SQL config spark.sql.hive.verifyPartitionPath has been deprecated for > quite a while, since version 3.0, and can be removed in version 4.0. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45340) Remove the SQL config spark.sql.hive.verifyPartitionPath
Max Gekk created SPARK-45340: --- Summary: Remove the SQL config spark.sql.hive.verifyPartitionPath Key: SPARK-45340 URL: https://issues.apache.org/jira/browse/SPARK-45340 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 4.0.0 Reporter: Max Gekk Assignee: Max Gekk The SQL config spark.sql.hive.verifyPartitionPath has been deprecated for quite a while, since version 3.0, and can be removed in version 4.0. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-43850) Cleanup unused imports related suppression rules for Scala 2.13
[ https://issues.apache.org/jira/browse/SPARK-43850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-43850: - Parent: SPARK-45314 Issue Type: Sub-task (was: Improvement) > Cleanup unused imports related suppression rules for Scala 2.13 > --- > > Key: SPARK-43850 > URL: https://issues.apache.org/jira/browse/SPARK-43850 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-43850) Remove the import for scala.language.higherKinds and delete the corresponding suppression rule
[ https://issues.apache.org/jira/browse/SPARK-43850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-43850: - Summary: Remove the import for scala.language.higherKinds and delete the corresponding suppression rule (was: Cleanup unused imports related suppression rules for Scala 2.13) > Remove the import for scala.language.higherKinds and delete the corresponding > suppression rule > -- > > Key: SPARK-43850 > URL: https://issues.apache.org/jira/browse/SPARK-43850 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-43850) Cleanup unused imports related suppression rules for Scala 2.13
[ https://issues.apache.org/jira/browse/SPARK-43850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-43850: --- Labels: pull-request-available (was: ) > Cleanup unused imports related suppression rules for Scala 2.13 > --- > > Key: SPARK-43850 > URL: https://issues.apache.org/jira/browse/SPARK-43850 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-43850) Cleanup unused imports related suppression rules for Scala 2.13
[ https://issues.apache.org/jira/browse/SPARK-43850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-43850: - Affects Version/s: 4.0.0 (was: 3.5.0) > Cleanup unused imports related suppression rules for Scala 2.13 > --- > > Key: SPARK-43850 > URL: https://issues.apache.org/jira/browse/SPARK-43850 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45271) Merge _LEGACY_ERROR_TEMP_1113 into TABLE_OPERATION & delete some unused method in QueryCompilationErrors
[ https://issues.apache.org/jira/browse/SPARK-45271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-45271: --- Assignee: BingKun Pan > Merge _LEGACY_ERROR_TEMP_1113 into TABLE_OPERATION & delete some unused > method in QueryCompilationErrors > > > Key: SPARK-45271 > URL: https://issues.apache.org/jira/browse/SPARK-45271 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45271) Merge _LEGACY_ERROR_TEMP_1113 into TABLE_OPERATION & delete some unused method in QueryCompilationErrors
[ https://issues.apache.org/jira/browse/SPARK-45271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-45271. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43044 [https://github.com/apache/spark/pull/43044] > Merge _LEGACY_ERROR_TEMP_1113 into TABLE_OPERATION & delete some unused > method in QueryCompilationErrors > > > Key: SPARK-45271 > URL: https://issues.apache.org/jira/browse/SPARK-45271 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45339) Pyspark should log errors it retries
[ https://issues.apache.org/jira/browse/SPARK-45339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45339: --- Labels: pull-request-available (was: ) > Pyspark should log errors it retries > > > Key: SPARK-45339 > URL: https://issues.apache.org/jira/browse/SPARK-45339 > Project: Spark > Issue Type: Improvement > Components: PySpark >Affects Versions: 3.5.0 >Reporter: Alice Sayutina >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45339) Pyspark should log errors it retries
Alice Sayutina created SPARK-45339: -- Summary: Pyspark should log errors it retries Key: SPARK-45339 URL: https://issues.apache.org/jira/browse/SPARK-45339 Project: Spark Issue Type: Improvement Components: PySpark Affects Versions: 3.5.0 Reporter: Alice Sayutina -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
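The behavior this ticket asks for — log each error before retrying instead of swallowing it — is a generic pattern. A sketch of that pattern follows; `RetryDemo` and `withRetries` are hypothetical names for illustration, not PySpark's actual retry code (which lives in Python):

```java
import java.util.function.Supplier;
import java.util.logging.Logger;

public class RetryDemo {
    private static final Logger LOG = Logger.getLogger("retry");

    // Run the operation up to maxAttempts times, logging every failure
    // before the next attempt -- the visibility the ticket requests.
    // Rethrows the last error once attempts are exhausted.
    static <T> T withRetries(Supplier<T> op, int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                last = e;
                LOG.warning("attempt " + attempt + " failed: " + e.getMessage());
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = withRetries(() -> {
            if (++calls[0] < 3) throw new RuntimeException("transient");
            return "ok";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The point of the pattern is that transient failures leave a trace in the logs even when the overall call eventually succeeds.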
[jira] [Assigned] (SPARK-45309) Remove all SystemUtils.isJavaVersionAtLeast with JDK 9
[ https://issues.apache.org/jira/browse/SPARK-45309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-45309: Assignee: Hyukjin Kwon > Remove all SystemUtils.isJavaVersionAtLeast with JDK 9 > -- > > Key: SPARK-45309 > URL: https://issues.apache.org/jira/browse/SPARK-45309 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > We use JDK 11+ so we can remove all Java 9+ conditions -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45309) Remove all SystemUtils.isJavaVersionAtLeast with JDK 9
[ https://issues.apache.org/jira/browse/SPARK-45309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-45309. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43098 [https://github.com/apache/spark/pull/43098] > Remove all SystemUtils.isJavaVersionAtLeast with JDK 9 > -- > > Key: SPARK-45309 > URL: https://issues.apache.org/jira/browse/SPARK-45309 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > We use JDK 11+ so we can remove all Java 9+ conditions -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
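The conditions being removed guard code paths behind a runtime Java-version check; with JDK 11+ as the minimum, any "at least Java 9" branch is always taken and the check is dead code. A sketch of the general shape of such a check, using only the JDK's own `Runtime.version()` (Spark's actual code used commons-lang3's `SystemUtils.isJavaVersionAtLeast`):

```java
public class VersionCheckDemo {
    // The general shape of the condition being removed: branch on the
    // running JVM's feature (major) version. On a JDK 11+ baseline this
    // always returns true for feature <= 11, so such branches can go.
    static boolean atLeast(int feature) {
        return Runtime.version().feature() >= feature;
    }

    public static void main(String[] args) {
        System.out.println("Java 9+: " + atLeast(9));
    }
}
```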
[jira] [Resolved] (SPARK-45323) Upgrade snappy to 1.1.10.4
[ https://issues.apache.org/jira/browse/SPARK-45323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-45323. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43109 [https://github.com/apache/spark/pull/43109] > Upgrade snappy to 1.1.10.4 > -- > > Key: SPARK-45323 > URL: https://issues.apache.org/jira/browse/SPARK-45323 > Project: Spark > Issue Type: Dependency upgrade > Components: Build >Affects Versions: 4.0.0, 3.5.1 >Reporter: Bjørn Jørgensen >Assignee: Bjørn Jørgensen >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Security Fix > Fixed SnappyInputStream so as not to allocate too large memory when > decompressing data with an extremely large chunk size by @tunnelshade (code > change) > This does not affect users only using Snappy.compress/uncompress methods -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
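The security fix described above is a bounds check on a length header read from untrusted input before allocating a buffer for it. The general defense can be sketched as follows; `ChunkGuard` and `MAX_CHUNK` are hypothetical names and a hypothetical limit for illustration, not snappy-java's actual code or constant:

```java
import java.io.IOException;

public class ChunkGuard {
    // Upper bound on a single decompressed chunk. A hypothetical limit
    // chosen for this sketch.
    static final int MAX_CHUNK = 64 * 1024 * 1024;

    // Validate a declared chunk size before allocating, so a corrupt or
    // malicious stream header cannot force an extremely large allocation.
    static byte[] allocateChunk(int declaredSize) throws IOException {
        if (declaredSize < 0 || declaredSize > MAX_CHUNK) {
            throw new IOException("invalid chunk size: " + declaredSize);
        }
        return new byte[declaredSize];
    }

    public static void main(String[] args) throws IOException {
        System.out.println(allocateChunk(1024).length);
        try {
            allocateChunk(Integer.MAX_VALUE);
        } catch (IOException e) {
            System.out.println("rejected oversized chunk");
        }
    }
}
```

As the release note says, only stream decompression paths need this guard; direct `compress`/`uncompress` calls on caller-provided buffers are unaffected.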
[jira] [Assigned] (SPARK-45323) Upgrade snappy to 1.1.10.4
[ https://issues.apache.org/jira/browse/SPARK-45323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-45323: Assignee: Bjørn Jørgensen > Upgrade snappy to 1.1.10.4 > -- > > Key: SPARK-45323 > URL: https://issues.apache.org/jira/browse/SPARK-45323 > Project: Spark > Issue Type: Dependency upgrade > Components: Build >Affects Versions: 4.0.0, 3.5.1 >Reporter: Bjørn Jørgensen >Assignee: Bjørn Jørgensen >Priority: Major > Labels: pull-request-available > > Security Fix > Fixed SnappyInputStream so as not to allocate too large memory when > decompressing data with an extremely large chunk size by @tunnelshade (code > change) > This does not affect users only using Snappy.compress/uncompress methods -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45338) Remove scala.collection.JavaConverters
[ https://issues.apache.org/jira/browse/SPARK-45338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45338: --- Labels: pull-request-available (was: ) > Remove scala.collection.JavaConverters > -- > > Key: SPARK-45338 > URL: https://issues.apache.org/jira/browse/SPARK-45338 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Jia Fan >Priority: Major > Labels: pull-request-available > > Remove deprecated scala.collection.JavaConverters, replaced by > scala.jdk.CollectionConverters -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45337) Refactor `AbstractCommandBuilder#getScalaVersion` to remove the check for Scala 2.12.
[ https://issues.apache.org/jira/browse/SPARK-45337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45337: --- Labels: pull-request-available (was: ) > Refactor `AbstractCommandBuilder#getScalaVersion` to remove the check for > Scala 2.12. > -- > > Key: SPARK-45337 > URL: https://issues.apache.org/jira/browse/SPARK-45337 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45338) Remove scala.collection.JavaConverters
Jia Fan created SPARK-45338: --- Summary: Remove scala.collection.JavaConverters Key: SPARK-45338 URL: https://issues.apache.org/jira/browse/SPARK-45338 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 4.0.0 Reporter: Jia Fan Remove deprecated scala.collection.JavaConverters, replaced by scala.jdk.CollectionConverters -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45313) Inline `Iterators#size` and remove `Iterators.scala`
[ https://issues.apache.org/jira/browse/SPARK-45313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-45313: - Parent: SPARK-45314 Issue Type: Sub-task (was: Improvement) > Inline `Iterators#size` and remove `Iterators.scala` > > > Key: SPARK-45313 > URL: https://issues.apache.org/jira/browse/SPARK-45313 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45337) Refactor `AbstractCommandBuilder#getScalaVersion` to remove the check for Scala 2.12.
Yang Jie created SPARK-45337: Summary: Refactor `AbstractCommandBuilder#getScalaVersion` to remove the check for Scala 2.12. Key: SPARK-45337 URL: https://issues.apache.org/jira/browse/SPARK-45337 Project: Spark Issue Type: Sub-task Components: Spark Core Affects Versions: 4.0.0 Reporter: Yang Jie -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45321) Clean up the unnecessary Scala 2.12 related binary files.
[ https://issues.apache.org/jira/browse/SPARK-45321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie resolved SPARK-45321. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43106 [https://github.com/apache/spark/pull/43106] > Clean up the unnecessary Scala 2.12 related binary files. > - > > Key: SPARK-45321 > URL: https://issues.apache.org/jira/browse/SPARK-45321 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45321) Clean up the unnecessary Scala 2.12 related binary files.
[ https://issues.apache.org/jira/browse/SPARK-45321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie reassigned SPARK-45321: Assignee: Yang Jie > Clean up the unnecessary Scala 2.12 related binary files. > - > > Key: SPARK-45321 > URL: https://issues.apache.org/jira/browse/SPARK-45321 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45336) Update the Oracle docker image version used for test and integration to use Oracle Database 23c Free
[ https://issues.apache.org/jira/browse/SPARK-45336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45336: --- Labels: pull-request-available (was: ) > Update the Oracle docker image version used for test and integration to use > Oracle Database 23c Free > > > Key: SPARK-45336 > URL: https://issues.apache.org/jira/browse/SPARK-45336 > Project: Spark > Issue Type: Improvement > Components: Tests >Affects Versions: 3.5.0 >Reporter: Luca Canali >Priority: Minor > Labels: pull-request-available > > This proposes to update the Docker image used for integration tests and > builds to Oracle Database 23c Free. > The Docker image used for integration tests and builds currently uses Oracle > XE version 21.3.0. Oracle 21 support ends in April 2024. The latest Oracle > release is 23c, it is a long-term release supported till 2032. With Oracle > 23c, Oracle has changed the name of the free version of its database, from > Oracle XE (Express Edition) to Oracle Database Free. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45310) Mapstatus location type changed from external shuffle service to executor after decommission migration
[ https://issues.apache.org/jira/browse/SPARK-45310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-45310: -- Assignee: (was: Apache Spark) > Mapstatus location type changed from external shuffle service to executor > after decommission migration > -- > > Key: SPARK-45310 > URL: https://issues.apache.org/jira/browse/SPARK-45310 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.0.3, 3.1.3, 3.2.4, 3.3.2, 3.4.1, 3.5.0 >Reporter: wuyi >Priority: Major > Labels: pull-request-available > > When migrating shuffle blocks during decommission, the updated mapstatus > location doesn't respect the external shuffle service location when external > shuffle service is enabled. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45310) Mapstatus location type changed from external shuffle service to executor after decommission migration
[ https://issues.apache.org/jira/browse/SPARK-45310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-45310: -- Assignee: Apache Spark > Mapstatus location type changed from external shuffle service to executor > after decommission migration > -- > > Key: SPARK-45310 > URL: https://issues.apache.org/jira/browse/SPARK-45310 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.0.3, 3.1.3, 3.2.4, 3.3.2, 3.4.1, 3.5.0 >Reporter: wuyi >Assignee: Apache Spark >Priority: Major > Labels: pull-request-available > > When migrating shuffle blocks during decommission, the updated mapstatus > location doesn't respect the external shuffle service location when external > shuffle service is enabled. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45336) Update the Oracle docker image version used for test and integration to use Oracle Database 23c Free
[ https://issues.apache.org/jira/browse/SPARK-45336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luca Canali updated SPARK-45336: Description: This proposes to update the Docker image used for integration tests and builds to Oracle Database 23c Free. The Docker image used for integration tests and builds currently uses Oracle XE version 21.3.0. Oracle 21 support ends in April 2024. The latest Oracle release is 23c, it is a long-term release supported till 2032. With Oracle 23c, Oracle has changed the name of the free version of its database, from Oracle XE (Express Edition) to Oracle Database Free. was: This proposes to update the Docker image used for integration tests and builds to Oracle Database 23c Free. The Docker image used for integration tests and builds currently uses Oracle XE version 21.3.0. Oracle 21 support ends in April 2024. The latest Oracle release is 23c, it is a long-term release support till 2032. With Oracle 23c, Oracle has changed the name of the free version of its database, from Oracle XE (Express Edition) to Oracle Database Free. > Update the Oracle docker image version used for test and integration to use > Oracle Database 23c Free > > > Key: SPARK-45336 > URL: https://issues.apache.org/jira/browse/SPARK-45336 > Project: Spark > Issue Type: Improvement > Components: Tests >Affects Versions: 3.5.0 >Reporter: Luca Canali >Priority: Minor > > This proposes to update the Docker image used for integration tests and > builds to Oracle Database 23c Free. > The Docker image used for integration tests and builds currently uses Oracle > XE version 21.3.0. Oracle 21 support ends in April 2024. The latest Oracle > release is 23c, it is a long-term release supported till 2032. With Oracle > 23c, Oracle has changed the name of the free version of its database, from > Oracle XE (Express Edition) to Oracle Database Free. 
> -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45336) Update the Oracle docker image version used for test and integration to use Oracle Database 23c Free
Luca Canali created SPARK-45336: --- Summary: Update the Oracle docker image version used for test and integration to use Oracle Database 23c Free Key: SPARK-45336 URL: https://issues.apache.org/jira/browse/SPARK-45336 Project: Spark Issue Type: Improvement Components: Tests Affects Versions: 3.5.0 Reporter: Luca Canali This proposes to update the Docker image used for integration tests and builds to Oracle Database 23c Free. The Docker image used for integration tests and builds currently uses Oracle XE version 21.3.0. Oracle 21 support ends in April 2024. The latest Oracle release is 23c, a long-term release supported till 2032. With Oracle 23c, Oracle has changed the name of the free version of its database, from Oracle XE (Express Edition) to Oracle Database Free. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45335) Correct the group of `ElementAt` and `TryElementAt`
[ https://issues.apache.org/jira/browse/SPARK-45335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45335: --- Labels: pull-request-available (was: ) > Correct the group of `ElementAt` and `TryElementAt` > --- > > Key: SPARK-45335 > URL: https://issues.apache.org/jira/browse/SPARK-45335 > Project: Spark > Issue Type: Improvement > Components: Documentation, SQL >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45335) Correct the group of `ElementAt` and `TryElementAt`
Ruifeng Zheng created SPARK-45335: - Summary: Correct the group of `ElementAt` and `TryElementAt` Key: SPARK-45335 URL: https://issues.apache.org/jira/browse/SPARK-45335 Project: Spark Issue Type: Improvement Components: Documentation, SQL Affects Versions: 4.0.0 Reporter: Ruifeng Zheng -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-43620) Support `Column` for SparkConnectColumn.__getitem__
[ https://issues.apache.org/jira/browse/SPARK-43620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-43620: --- Labels: pull-request-available (was: ) > Support `Column` for SparkConnectColumn.__getitem__ > --- > > Key: SPARK-43620 > URL: https://issues.apache.org/jira/browse/SPARK-43620 > Project: Spark > Issue Type: Sub-task > Components: Connect, Pandas API on Spark >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > Labels: pull-request-available > > Repro: > {code:java} > pser = pd.Series(["a", "b", "c"]) > psser = ps.from_pandas(pser) > psser.astype("category") # internally calls > `map_scol[self.spark.column]`{code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45334) Remove misleading comment in parquetSchemaConverter
[ https://issues.apache.org/jira/browse/SPARK-45334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45334: --- Labels: pull-request-available (was: ) > Remove misleading comment in parquetSchemaConverter > --- > > Key: SPARK-45334 > URL: https://issues.apache.org/jira/browse/SPARK-45334 > Project: Spark > Issue Type: Documentation > Components: SQL >Affects Versions: 3.5.0 >Reporter: Mengran Lan >Priority: Trivial > Labels: pull-request-available > > While debugging a Parquet issue and reading Spark code as a reference, I > happened to find a misleading comment that remains in the latest version as > well. > {code:java} > Types > .buildGroup(repetition).as(LogicalTypeAnnotation.listType()) > .addField(Types > .buildGroup(REPEATED) > // "array" is the name chosen by parquet-hive (1.7.0 and prior version) > .addField(convertField(StructField("array", elementType, nullable))) > .named("bag")) > .named(field.name) {code} > The comment above is misleading since Hive always uses "array_element" as the > name. > It was introduced by this PR [https://github.com/apache/spark/pull/14399] and > relates to this issue https://issues.apache.org/jira/browse/SPARK-16777 > Furthermore, the parquet-hive module has been removed from the parquet-mr > project https://issues.apache.org/jira/browse/PARQUET-1676 > I suggest removing this comment and will submit a PR later. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45334) Remove misleading comment in parquetSchemaConverter
Mengran Lan created SPARK-45334: --- Summary: Remove misleading comment in parquetSchemaConverter Key: SPARK-45334 URL: https://issues.apache.org/jira/browse/SPARK-45334 Project: Spark Issue Type: Documentation Components: SQL Affects Versions: 3.5.0 Reporter: Mengran Lan While debugging a Parquet issue and reading Spark code as a reference, I happened to find a misleading comment that remains in the latest version as well. {code:java} Types .buildGroup(repetition).as(LogicalTypeAnnotation.listType()) .addField(Types .buildGroup(REPEATED) // "array" is the name chosen by parquet-hive (1.7.0 and prior version) .addField(convertField(StructField("array", elementType, nullable))) .named("bag")) .named(field.name) {code} The comment above is misleading since Hive always uses "array_element" as the name. It was introduced by this PR [https://github.com/apache/spark/pull/14399] and relates to this issue https://issues.apache.org/jira/browse/SPARK-16777 Furthermore, the parquet-hive module has been removed from the parquet-mr project https://issues.apache.org/jira/browse/PARQUET-1676 I suggest removing this comment and will submit a PR later. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45232) Add missing function groups to SQL references
[ https://issues.apache.org/jira/browse/SPARK-45232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng reassigned SPARK-45232: - Assignee: Ruifeng Zheng > Add missing function groups to SQL references > - > > Key: SPARK-45232 > URL: https://issues.apache.org/jira/browse/SPARK-45232 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45232) Add missing function groups to SQL references
[ https://issues.apache.org/jira/browse/SPARK-45232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng resolved SPARK-45232. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43011 [https://github.com/apache/spark/pull/43011] > Add missing function groups to SQL references > - > > Key: SPARK-45232 > URL: https://issues.apache.org/jira/browse/SPARK-45232 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org