[jira] [Created] (SPARK-49128) Support custom History Server UI title
Dongjoon Hyun created SPARK-49128:

  Summary: Support custom History Server UI title
  Key: SPARK-49128
  URL: https://issues.apache.org/jira/browse/SPARK-49128
  Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun

--
This message was sent by Atlassian Jira (v8.20.10#820010)
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49126) Move `spark.history.ui.maxApplications` config definition to `History.scala`
Dongjoon Hyun reassigned SPARK-49126:

  Assignee: Dongjoon Hyun

  Key: SPARK-49126
  URL: https://issues.apache.org/jira/browse/SPARK-49126
  Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
  Assignee: Dongjoon Hyun
  Priority: Minor
  Labels: pull-request-available
[jira] [Created] (SPARK-49126) Move `spark.history.ui.maxApplications` config definition to `History.scala`
Dongjoon Hyun created SPARK-49126:

  Summary: Move `spark.history.ui.maxApplications` config definition to `History.scala`
  Key: SPARK-49126
  URL: https://issues.apache.org/jira/browse/SPARK-49126
  Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
[jira] [Resolved] (SPARK-49098) Support Table Options for Insert SQL
Dongjoon Hyun resolved SPARK-49098:

  Fix Version/s: 4.0.0
  Resolution: Fixed

Issue resolved by pull request 47591 [https://github.com/apache/spark/pull/47591]

  Key: SPARK-49098
  URL: https://issues.apache.org/jira/browse/SPARK-49098
  Project: Spark
  Issue Type: New Feature
  Components: SQL
  Affects Versions: 4.0.0
  Reporter: Szehon Ho
  Assignee: Szehon Ho
  Priority: Major
  Labels: pull-request-available
  Fix For: 4.0.0

SPARK-36680 added syntax to support table options in the SELECT clause. This is a follow-up to allow write table options in the INSERT clause.
[jira] [Assigned] (SPARK-49098) Support Table Options for Insert SQL
Dongjoon Hyun reassigned SPARK-49098:

  Assignee: Szehon Ho

  Key: SPARK-49098
  URL: https://issues.apache.org/jira/browse/SPARK-49098
  Project: Spark
  Issue Type: New Feature
  Components: SQL
  Affects Versions: 4.0.0
  Reporter: Szehon Ho
  Assignee: Szehon Ho
  Priority: Major
  Labels: pull-request-available

SPARK-36680 added syntax to support table options in the SELECT clause. This is a follow-up to allow write table options in the INSERT clause.
[jira] [Resolved] (SPARK-49116) Fix `InvalidDefaultArgInFrom` in Python/R binding Dockerfiles
Dongjoon Hyun resolved SPARK-49116:

  Fix Version/s: 4.0.0
  Resolution: Fixed

Issue resolved by pull request 47617 [https://github.com/apache/spark/pull/47617]

  Key: SPARK-49116
  URL: https://issues.apache.org/jira/browse/SPARK-49116
  Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
  Assignee: Dongjoon Hyun
  Priority: Minor
  Labels: pull-request-available
  Fix For: 4.0.0
[jira] [Assigned] (SPARK-49014) Bump Apache Avro to 1.12.0
Dongjoon Hyun reassigned SPARK-49014:

  Assignee: Fokko Driesprong

  Key: SPARK-49014
  URL: https://issues.apache.org/jira/browse/SPARK-49014
  Project: Spark
  Issue Type: Improvement
  Components: Input/Output
  Affects Versions: 3.4.3
  Reporter: Fokko Driesprong
  Assignee: Fokko Driesprong
  Priority: Major
  Labels: pull-request-available
[jira] [Updated] (SPARK-49014) Bump Apache Avro to 1.12.0
Dongjoon Hyun updated SPARK-49014:

  Target Version/s: (was: 4.0.0)

  Key: SPARK-49014
  URL: https://issues.apache.org/jira/browse/SPARK-49014
  Project: Spark
  Issue Type: Improvement
  Components: Input/Output
  Affects Versions: 3.4.3
  Reporter: Fokko Driesprong
  Assignee: Fokko Driesprong
  Priority: Major
  Labels: pull-request-available
  Fix For: 4.0.0
[jira] [Resolved] (SPARK-49014) Bump Apache Avro to 1.12.0
Dongjoon Hyun resolved SPARK-49014:

  Fix Version/s: 4.0.0
  Resolution: Fixed

Issue resolved by pull request 47498 [https://github.com/apache/spark/pull/47498]

  Key: SPARK-49014
  URL: https://issues.apache.org/jira/browse/SPARK-49014
  Project: Spark
  Issue Type: Improvement
  Components: Input/Output
  Affects Versions: 3.4.3
  Reporter: Fokko Driesprong
  Assignee: Fokko Driesprong
  Priority: Major
  Labels: pull-request-available
  Fix For: 4.0.0
[jira] [Updated] (SPARK-49014) Bump Apache Avro to 1.12.0
Dongjoon Hyun updated SPARK-49014:

  Component/s: Build (was: Input/Output)

  Key: SPARK-49014
  URL: https://issues.apache.org/jira/browse/SPARK-49014
  Project: Spark
  Issue Type: Improvement
  Components: Build
  Affects Versions: 3.4.3
  Reporter: Fokko Driesprong
  Assignee: Fokko Driesprong
  Priority: Major
  Labels: pull-request-available
  Fix For: 4.0.0
[jira] [Resolved] (SPARK-49097) Add Python3 environment detection for the `build_error_docs` method in `build_api_docs.rb`
Dongjoon Hyun resolved SPARK-49097:

  Fix Version/s: 4.0.0
  Resolution: Fixed

Issue resolved by pull request 47590 [https://github.com/apache/spark/pull/47590]

  Key: SPARK-49097
  URL: https://issues.apache.org/jira/browse/SPARK-49097
  Project: Spark
  Issue Type: Improvement
  Components: Project Infra
  Affects Versions: 4.0.0
  Reporter: Wei Guo
  Assignee: Wei Guo
  Priority: Major
  Labels: pull-request-available
  Fix For: 4.0.0
[jira] [Assigned] (SPARK-49097) Add Python3 environment detection for the `build_error_docs` method in `build_api_docs.rb`
Dongjoon Hyun reassigned SPARK-49097:

  Assignee: Wei Guo

  Key: SPARK-49097
  URL: https://issues.apache.org/jira/browse/SPARK-49097
  Project: Spark
  Issue Type: Improvement
  Components: Project Infra
  Affects Versions: 4.0.0
  Reporter: Wei Guo
  Assignee: Wei Guo
  Priority: Major
  Labels: pull-request-available
[jira] [Resolved] (ORC-1753) Use Avro 1.12.0 in bench module
Dongjoon Hyun resolved ORC-1753:

  Fix Version/s: 2.0.2
  Resolution: Fixed

Issue resolved by pull request 1996 [https://github.com/apache/orc/pull/1996]

  Key: ORC-1753
  URL: https://issues.apache.org/jira/browse/ORC-1753
  Project: ORC
  Issue Type: Test
  Components: Java
  Affects Versions: 2.0.2
  Reporter: Dongjoon Hyun
  Assignee: Dongjoon Hyun
  Priority: Minor
  Fix For: 2.0.2
[jira] [Created] (SPARK-49117) Fix `docker-image-tool.sh` up-to-date
Dongjoon Hyun created SPARK-49117:

  Summary: Fix `docker-image-tool.sh` up-to-date
  Key: SPARK-49117
  URL: https://issues.apache.org/jira/browse/SPARK-49117
  Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun

Apache Spark 4 dropped Java 11 support, so we should fix the following:

{code}
- - Build and push Java11-based image with tag "v3.4.0" to docker.io/myrepo
+ - Build and push Java17-based image with tag "v4.0.0" to docker.io/myrepo
{code}

Apache Spark 4 requires a JDK instead of a JRE, so we should fix the following:

{code}
-$0 -r docker.io/myrepo -t v3.4.0 -b java_image_tag=11-jre build
+$0 -r docker.io/myrepo -t v4.0.0 -b java_image_tag=17 build
{code}

Lastly, `3.4.0` is too old because it was released on April 13, 2023. We should use v4.0.0 instead:

{code}
-$0 -r docker.io/myrepo -t v3.4.0 -p kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
+$0 -r docker.io/myrepo -t v4.0.0 -p kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
{code}
[jira] [Updated] (SPARK-49117) Fix `docker-image-tool.sh` to be up-to-date
Dongjoon Hyun updated SPARK-49117:

  Summary: Fix `docker-image-tool.sh` to be up-to-date (was: Fix `docker-image-tool.sh` up-to-date)

  Key: SPARK-49117
  URL: https://issues.apache.org/jira/browse/SPARK-49117
  Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
  Priority: Major

Apache Spark 4 dropped Java 11 support, so we should fix the following:

{code}
- - Build and push Java11-based image with tag "v3.4.0" to docker.io/myrepo
+ - Build and push Java17-based image with tag "v4.0.0" to docker.io/myrepo
{code}

Apache Spark 4 requires a JDK instead of a JRE, so we should fix the following:

{code}
-$0 -r docker.io/myrepo -t v3.4.0 -b java_image_tag=11-jre build
+$0 -r docker.io/myrepo -t v4.0.0 -b java_image_tag=17 build
{code}

Lastly, `3.4.0` is too old because it was released on April 13, 2023. We should use v4.0.0 instead:

{code}
-$0 -r docker.io/myrepo -t v3.4.0 -p kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
+$0 -r docker.io/myrepo -t v4.0.0 -p kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
{code}
[jira] [Assigned] (SPARK-49116) Fix `InvalidDefaultArgInFrom` in Python/R binding Dockerfiles
Dongjoon Hyun reassigned SPARK-49116:

  Assignee: Dongjoon Hyun

  Key: SPARK-49116
  URL: https://issues.apache.org/jira/browse/SPARK-49116
  Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
  Assignee: Dongjoon Hyun
  Priority: Minor
  Labels: pull-request-available
[jira] [Created] (SPARK-49116) Fix `InvalidDefaultArgInFrom` in Python/R binding Dockerfiles
Dongjoon Hyun created SPARK-49116:

  Summary: Fix `InvalidDefaultArgInFrom` in Python/R binding Dockerfiles
  Key: SPARK-49116
  URL: https://issues.apache.org/jira/browse/SPARK-49116
  Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
[jira] [Assigned] (ORC-1753) Use Avro 1.12.0 in bench module
Dongjoon Hyun reassigned ORC-1753:

  Assignee: Dongjoon Hyun

  Key: ORC-1753
  URL: https://issues.apache.org/jira/browse/ORC-1753
  Project: ORC
  Issue Type: Test
  Components: Java
  Affects Versions: 2.0.2
  Reporter: Dongjoon Hyun
  Assignee: Dongjoon Hyun
  Priority: Minor
[jira] [Created] (ORC-1753) Use Avro 1.12.0 in bench module
Dongjoon Hyun created ORC-1753:

  Summary: Use Avro 1.12.0 in bench module
  Key: ORC-1753
  URL: https://issues.apache.org/jira/browse/ORC-1753
  Project: ORC
  Issue Type: Test
  Components: Java
  Affects Versions: 2.0.2
  Reporter: Dongjoon Hyun
[jira] [Updated] (ORC-1697) Fix IllegalArgumentException when reading json timestamp type in benchmark
Dongjoon Hyun updated ORC-1697:

  Issue Type: Test (was: Bug)

  Key: ORC-1697
  URL: https://issues.apache.org/jira/browse/ORC-1697
  Project: ORC
  Issue Type: Test
  Reporter: dzcxzl
  Assignee: dzcxzl
  Priority: Major
  Fix For: 2.0.2
[jira] [Updated] (ORC-1752) Fix NumberFormatException when reading json timestamp type in benchmark
Dongjoon Hyun updated ORC-1752:

  Issue Type: Test (was: Bug)

  Key: ORC-1752
  URL: https://issues.apache.org/jira/browse/ORC-1752
  Project: ORC
  Issue Type: Test
  Components: Java
  Reporter: dzcxzl
  Assignee: dzcxzl
  Priority: Minor
  Fix For: 2.0.2
[jira] [Updated] (ORC-1752) Fix NumberFormatException when reading json timestamp type in benchmark
Dongjoon Hyun updated ORC-1752:

  Fix Version/s: (was: 2.1.0)

  Key: ORC-1752
  URL: https://issues.apache.org/jira/browse/ORC-1752
  Project: ORC
  Issue Type: Bug
  Components: Java
  Reporter: dzcxzl
  Assignee: dzcxzl
  Priority: Minor
  Fix For: 2.0.2
[jira] [Assigned] (ORC-1752) Fix NumberFormatException when reading json timestamp type in benchmark
Dongjoon Hyun reassigned ORC-1752:

  Assignee: dzcxzl

  Key: ORC-1752
  URL: https://issues.apache.org/jira/browse/ORC-1752
  Project: ORC
  Issue Type: Bug
  Components: Java
  Reporter: dzcxzl
  Assignee: dzcxzl
  Priority: Minor
[jira] [Resolved] (ORC-1752) Fix NumberFormatException when reading json timestamp type in benchmark
Dongjoon Hyun resolved ORC-1752:

  Fix Version/s: 2.1.0, 2.0.2
  Resolution: Fixed

Issue resolved by pull request 1995 [https://github.com/apache/orc/pull/1995]

  Key: ORC-1752
  URL: https://issues.apache.org/jira/browse/ORC-1752
  Project: ORC
  Issue Type: Bug
  Components: Java
  Reporter: dzcxzl
  Assignee: dzcxzl
  Priority: Minor
  Fix For: 2.1.0, 2.0.2
[jira] [Updated] (SPARK-49108) Add `submit_pi.sh` REST API example
Dongjoon Hyun updated SPARK-49108:

  Summary: Add `submit_pi.sh` REST API example (was: Add `submit_pi.sh` example via REST API)

  Key: SPARK-49108
  URL: https://issues.apache.org/jira/browse/SPARK-49108
  Project: Spark
  Issue Type: Sub-task
  Components: Examples
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
  Priority: Major
  Labels: pull-request-available
[jira] [Updated] (SPARK-49108) Add `submit_pi.sh` example via REST API
Dongjoon Hyun updated SPARK-49108:

  Summary: Add `submit_pi.sh` example via REST API (was: Add `SparkPi` example to use REST API)

  Key: SPARK-49108
  URL: https://issues.apache.org/jira/browse/SPARK-49108
  Project: Spark
  Issue Type: Sub-task
  Components: Examples
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
  Priority: Major
[jira] [Updated] (ORC-1697) Fix IllegalArgumentException when reading json timestamp type in benchmark
Dongjoon Hyun updated ORC-1697:

  Issue Type: Bug (was: Improvement)

  Key: ORC-1697
  URL: https://issues.apache.org/jira/browse/ORC-1697
  Project: ORC
  Issue Type: Bug
  Reporter: dzcxzl
  Assignee: dzcxzl
  Priority: Major
  Fix For: 2.0.2
[jira] [Updated] (ORC-1697) Fix IllegalArgumentException when reading json timestamp type in benchmark
Dongjoon Hyun updated ORC-1697:

  Fix Version/s: (was: 2.1.0)

  Key: ORC-1697
  URL: https://issues.apache.org/jira/browse/ORC-1697
  Project: ORC
  Issue Type: Improvement
  Reporter: dzcxzl
  Assignee: dzcxzl
  Priority: Major
  Fix For: 2.0.2
[jira] [Assigned] (ORC-1697) Fix IllegalArgumentException when reading json timestamp type in benchmark
Dongjoon Hyun reassigned ORC-1697:

  Assignee: dzcxzl

  Key: ORC-1697
  URL: https://issues.apache.org/jira/browse/ORC-1697
  Project: ORC
  Issue Type: Improvement
  Reporter: dzcxzl
  Assignee: dzcxzl
  Priority: Major
[jira] [Resolved] (ORC-1697) Fix IllegalArgumentException when reading json timestamp type in benchmark
Dongjoon Hyun resolved ORC-1697:

  Fix Version/s: 2.1.0, 2.0.2
  Resolution: Fixed

Issue resolved by pull request 1930 [https://github.com/apache/orc/pull/1930]

  Key: ORC-1697
  URL: https://issues.apache.org/jira/browse/ORC-1697
  Project: ORC
  Issue Type: Improvement
  Reporter: dzcxzl
  Assignee: dzcxzl
  Priority: Major
  Fix For: 2.1.0, 2.0.2
[jira] [Resolved] (SPARK-49106) Documented Prometheus endpoints
Dongjoon Hyun resolved SPARK-49106:

  Fix Version/s: 4.0.0
  Resolution: Fixed

Issue resolved by pull request 47219 [https://github.com/apache/spark/pull/47219]

  Key: SPARK-49106
  URL: https://issues.apache.org/jira/browse/SPARK-49106
  Project: Spark
  Issue Type: Sub-task
  Components: Documentation
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
  Priority: Minor
  Labels: pull-request-available
  Fix For: 4.0.0
[jira] [Created] (SPARK-49106) Documented Prometheus endpoints
Dongjoon Hyun created SPARK-49106:

  Summary: Documented Prometheus endpoints
  Key: SPARK-49106
  URL: https://issues.apache.org/jira/browse/SPARK-49106
  Project: Spark
  Issue Type: Sub-task
  Components: Documentation
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
[jira] [Resolved] (SPARK-48731) Upgrade `docker/build-push-action` to v6
Dongjoon Hyun resolved SPARK-48731:

  Fix Version/s: 4.0.0
  Resolution: Fixed

Issue resolved by pull request 47112 [https://github.com/apache/spark/pull/47112]

  Key: SPARK-48731
  URL: https://issues.apache.org/jira/browse/SPARK-48731
  Project: Spark
  Issue Type: Improvement
  Components: Project Infra
  Affects Versions: 4.0.0
  Reporter: BingKun Pan
  Assignee: BingKun Pan
  Priority: Minor
  Labels: pull-request-available
  Fix For: 4.0.0
[jira] [Resolved] (SPARK-48949) SPJ: Runtime Partition Filtering
Dongjoon Hyun resolved SPARK-48949:

  Fix Version/s: 4.0.0
  Resolution: Fixed

Issue resolved by pull request 47426 [https://github.com/apache/spark/pull/47426]

  Key: SPARK-48949
  URL: https://issues.apache.org/jira/browse/SPARK-48949
  Project: Spark
  Issue Type: Sub-task
  Components: SQL
  Affects Versions: 4.0.0
  Reporter: Szehon Ho
  Assignee: Szehon Ho
  Priority: Major
  Labels: pull-request-available
  Fix For: 4.0.0
[jira] [Resolved] (SPARK-49105) Upgrade ojdbc11 to 23.5.0.24.07 and OracleDatabaseOnDocker docker image tag to oracle-free:23.5-slim
Dongjoon Hyun resolved SPARK-49105:

  Fix Version/s: 4.0.0
  Resolution: Fixed

Issue resolved by pull request 47597 [https://github.com/apache/spark/pull/47597]

  Key: SPARK-49105
  URL: https://issues.apache.org/jira/browse/SPARK-49105
  Project: Spark
  Issue Type: Improvement
  Components: Build, Tests
  Affects Versions: 4.0.0
  Reporter: Wei Guo
  Assignee: Wei Guo
  Priority: Major
  Labels: pull-request-available
  Fix For: 4.0.0
[jira] [Created] (SPARK-49104) Document `JWSFilter` usage in Spark UI and REST API and rename parameter to `secretKey`
Dongjoon Hyun created SPARK-49104:

  Summary: Document `JWSFilter` usage in Spark UI and REST API and rename parameter to `secretKey`
  Key: SPARK-49104
  URL: https://issues.apache.org/jira/browse/SPARK-49104
  Project: Spark
  Issue Type: Sub-task
  Components: Documentation, Spark Core
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
[jira] [Resolved] (SPARK-49103) Support `spark.master.rest.filters`
Dongjoon Hyun resolved SPARK-49103:

  Fix Version/s: 4.0.0
  Resolution: Fixed

Issue resolved by pull request 47595 [https://github.com/apache/spark/pull/47595]

  Key: SPARK-49103
  URL: https://issues.apache.org/jira/browse/SPARK-49103
  Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
  Assignee: Dongjoon Hyun
  Priority: Major
  Labels: pull-request-available
  Fix For: 4.0.0
[jira] [Assigned] (SPARK-49103) Support `spark.master.rest.filters`
Dongjoon Hyun reassigned SPARK-49103:

  Assignee: Dongjoon Hyun

  Key: SPARK-49103
  URL: https://issues.apache.org/jira/browse/SPARK-49103
  Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
  Assignee: Dongjoon Hyun
  Priority: Major
  Labels: pull-request-available
[jira] [Created] (SPARK-49103) Support `spark.master.rest.filters`
Dongjoon Hyun created SPARK-49103:

  Summary: Support `spark.master.rest.filters`
  Key: SPARK-49103
  URL: https://issues.apache.org/jira/browse/SPARK-49103
  Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
  Affects Versions: 4.0.0
  Reporter: Dongjoon Hyun
[jira] [Resolved] (SPARK-49080) Upgrade mssql-jdbc to 12.8.0.jre11
Dongjoon Hyun resolved SPARK-49080:

  Fix Version/s: 4.0.0
  Resolution: Fixed

Issue resolved by pull request 47569 [https://github.com/apache/spark/pull/47569]

  Key: SPARK-49080
  URL: https://issues.apache.org/jira/browse/SPARK-49080
  Project: Spark
  Issue Type: Improvement
  Components: Build, Tests
  Affects Versions: 4.0.0
  Reporter: Wei Guo
  Assignee: Wei Guo
  Priority: Major
  Labels: pull-request-available
  Fix For: 4.0.0
[jira] [Resolved] (SPARK-49094) ignoreCorruptFiles file source option is partially supported for orc format
Dongjoon Hyun resolved SPARK-49094:

  Fix Version/s: 3.4.4, 3.5.2, 4.0.0
  Resolution: Fixed

Issue resolved by pull request 47583 [https://github.com/apache/spark/pull/47583]

  Key: SPARK-49094
  URL: https://issues.apache.org/jira/browse/SPARK-49094
  Project: Spark
  Issue Type: Bug
  Components: SQL
  Affects Versions: 4.0.0, 3.5.1, 3.4.3
  Reporter: Kent Yao
  Assignee: Kent Yao
  Priority: Major
  Labels: pull-request-available
  Fix For: 3.4.4, 3.5.2, 4.0.0
[jira] [Assigned] (SPARK-49094) ignoreCorruptFiles file source option is partially supported for orc format
Dongjoon Hyun reassigned SPARK-49094:

  Assignee: Kent Yao

  Key: SPARK-49094
  URL: https://issues.apache.org/jira/browse/SPARK-49094
  Project: Spark
  Issue Type: Bug
  Components: SQL
  Affects Versions: 4.0.0, 3.5.1, 3.4.3
  Reporter: Kent Yao
  Assignee: Kent Yao
  Priority: Major
  Labels: pull-request-available
[jira] [Updated] (SPARK-49094) ignoreCorruptFiles file source option is partially supported for orc format
[ https://issues.apache.org/jira/browse/SPARK-49094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49094: -- Issue Type: Improvement (was: Bug) > ignoreCorruptFiles file source option is partially supported for orc format > --- > > Key: SPARK-49094 > URL: https://issues.apache.org/jira/browse/SPARK-49094 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0, 3.5.1, 3.4.3 >Reporter: Kent Yao >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49094) ignoreCorruptFiles file source option is partially supported for orc format
[ https://issues.apache.org/jira/browse/SPARK-49094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49094: -- Issue Type: Bug (was: Improvement) > ignoreCorruptFiles file source option is partially supported for orc format > --- > > Key: SPARK-49094 > URL: https://issues.apache.org/jira/browse/SPARK-49094 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0, 3.5.1, 3.4.3 >Reporter: Kent Yao >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49090) Support `JWSFilter`
[ https://issues.apache.org/jira/browse/SPARK-49090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49090: - Assignee: Dongjoon Hyun > Support `JWSFilter` > --- > > Key: SPARK-49090 > URL: https://issues.apache.org/jira/browse/SPARK-49090 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49090) Support `JWSFilter`
[ https://issues.apache.org/jira/browse/SPARK-49090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49090. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47575 [https://github.com/apache/spark/pull/47575] > Support `JWSFilter` > --- > > Key: SPARK-49090 > URL: https://issues.apache.org/jira/browse/SPARK-49090 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49090) Support `JWSFilter`
Dongjoon Hyun created SPARK-49090: - Summary: Support `JWSFilter` Key: SPARK-49090 URL: https://issues.apache.org/jira/browse/SPARK-49090 Project: Spark Issue Type: Sub-task Components: Spark Core Affects Versions: 4.0.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
Re: [NOTICE] Progress of 3.5.2-RC5
Thank you for summarizing them and leading the release, Kent. :) Dongjoon. On Wed, Jul 31, 2024 at 10:39 PM Kent Yao wrote: > Hi dev, > > Since version 3.5.2-RC4, we have received several reports regarding > correctness issues, some of which are still unresolved. We will need a > few days to address the unresolved issues listed below. The RC5 vote > might be delayed until late next week. > > === FIXED === > https://issues.apache.org/jira/browse/SPARK-49000 > https://issues.apache.org/jira/browse/SPARK-49054 > > === ONGOING === > https://issues.apache.org/jira/browse/SPARK-48950 > https://issues.apache.org/jira/browse/SPARK-49030 > > > > Thanks, > Kent Yao > > > [1] https://lists.apache.org/thread/9lj57fh3zbo2h4koh5hr7nhdky21p6zg > > - > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org > >
[jira] [Commented] (SPARK-48950) Corrupt data from parquet scans
[ https://issues.apache.org/jira/browse/SPARK-48950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870285#comment-17870285 ] Dongjoon Hyun commented on SPARK-48950: --- According to the Apache Spark guideline, `Correctness` issues should be considered Blockers. - https://spark.apache.org/contributing.html > Correctness and data loss issues should be considered Blockers for their > target versions. I guess you want to set the `Target Version` field to `3.5.3` instead of `3.5.2` because this is still under investigation, right? I'm +1 because there is no reproducer here. Setting a concrete `Target Version` sounds good to me in order not to forget this, [~yao]. Let me do that for now. Thanks. > Corrupt data from parquet scans > --- > > Key: SPARK-48950 > URL: https://issues.apache.org/jira/browse/SPARK-48950 > Project: Spark > Issue Type: Bug > Components: Input/Output >Affects Versions: 3.5.0, 4.0.0, 3.5.1 > Environment: Spark 3.5.0 > Running on kubernetes > Using Azure Blob storage with hierarchical namespace enabled >Reporter: Thomas Newton >Priority: Major > Labels: correctness > Attachments: example_task_errors.txt, job_dag.png, sql_query_plan.png > > > It's very rare and non-deterministic, but since Spark 3.5.0 we have started > seeing a correctness bug in parquet scans when using the vectorized reader. > We've noticed this on double type columns where occasionally small groups > (typically 10s to 100s) of rows are replaced with crazy values like > `-1.29996470e+029, 3.56717569e-184, 7.23323243e+307, -1.05929677e+045, > -7.60562076e+240, -3.1806e-064, 2.89435993e-116`. I think this is the > result of interpreting uniform random bits as a double type. Most of my > testing has been on an array of double type column but we have also seen it > on un-nested plain double type columns. > I've been testing this by adding a filter that should return zero results but > will return non-zero if the parquet scan has problems. 
I've attached > screenshots of this from the Spark UI. > I did a `git bisect` and found that the problem starts with > [https://github.com/apache/spark/pull/39950], but I haven't yet understood > why. It's possible that this change is fine but it reveals a problem > elsewhere. I did also notice [https://github.com/apache/spark/pull/44853], > which appears to be a different implementation of the same thing, so maybe > that could help. > It's not a major problem by itself, but another symptom appears to be that > Parquet scan tasks fail at a rate of approximately 0.03% with errors like > those in the attached `example_task_errors.txt`. If I revert > [https://github.com/apache/spark/pull/39950] I get exactly 0 task failures on > the same test. > > The problem seems to be a bit dependent on how the parquet files happen to be > organised on blob storage, so I don't yet have a reproducer that I can share > that doesn't depend on private data. > I tested on a pre-release 4.0.0 and the problem was still present. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
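The reporter's hypothesis that the corrupted values come from "interpreting uniform random bits as a double type" is easy to sanity-check outside Spark. The sketch below is plain Python, not Spark code, and the helper is purely illustrative: it reinterprets random 64-bit patterns as IEEE-754 doubles and shows that extreme magnitudes like those quoted in the report are exactly what uniform bit noise decodes to.

```python
import random
import struct

random.seed(0)

def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit pattern as an IEEE-754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

# Uniform random 64-bit patterns have a uniformly distributed exponent
# field, so decoded magnitudes are spread across the full double range --
# values like 7.2e+307 or 3.6e-184 are routine, matching the report.
samples = [bits_to_double(random.getrandbits(64)) for _ in range(1000)]
huge = [x for x in samples if x == x and abs(x) > 1e100]  # x == x skips NaNs
print(f"{len(huge)} of {len(samples)} samples exceed 1e100 in magnitude")
```

Roughly a third of uniform bit patterns decode to magnitudes above 1e100, so even a small run of misread bytes is very likely to surface as conspicuous "crazy values" rather than plausible data.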
[jira] [Updated] (SPARK-48950) Corrupt data from parquet scans
[ https://issues.apache.org/jira/browse/SPARK-48950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-48950: -- Target Version/s: 3.5.3 > Corrupt data from parquet scans > --- > > Key: SPARK-48950 > URL: https://issues.apache.org/jira/browse/SPARK-48950 > Project: Spark > Issue Type: Bug > Components: Input/Output >Affects Versions: 3.5.0, 4.0.0, 3.5.1 > Environment: Spark 3.5.0 > Running on kubernetes > Using Azure Blob storage with hierarchical namespace enabled >Reporter: Thomas Newton >Priority: Major > Labels: correctness > Attachments: example_task_errors.txt, job_dag.png, sql_query_plan.png > > > Its very rare and non-deterministic but since Spark 3.5.0 we have started > seeing a correctness bug in parquet scans when using the vectorized reader. > We've noticed this on double type columns where occasionally small groups > (typically 10s to 100s) of rows are replaced with crazy values like > `-1.29996470e+029, 3.56717569e-184, 7.23323243e+307, -1.05929677e+045, > -7.60562076e+240, -3.1806e-064, 2.89435993e-116`. I think this is the > result of interpreting uniform random bits as a double type. Most of my > testing has been on an array of double type column but we have also seen it > on un-nested plain double type columns. > I've been testing this by adding a filter that should return zero results but > will return non-zero if the parquet scan has problems. I've attached > screenshots of this from the Spark UI. > I did a `git bisect` and found that the problem starts with > [https://github.com/apache/spark/pull/39950], but I haven't yet understood > why. Its possible that this change is fine but it reveals a problem > elsewhere? I did also notice [https://github.com/apache/spark/pull/44853] > which appears to be a different implementation of the same thing so maybe > that could help. 
> Its not a major problem by itself but another symptom appears to be that > Parquet scan tasks fail at a rate of approximately 0.03% with errors like > those in the attached `example_task_errors.txt`. If I revert > [https://github.com/apache/spark/pull/39950] I get exactly 0 task failures on > the same test. > > The problem seems to be a bit dependant on how the parquet files happen to be > organised on blob storage so I don't yet have a reproduce that I can share > that doesn't depend on private data. > I tested on a pre-release 4.0.0 and the problem was still present. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49052) Add SparkOperator class and tests
[ https://issues.apache.org/jira/browse/SPARK-49052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49052: - Assignee: Zhou JIANG > Add SparkOperator class and tests > - > > Key: SPARK-49052 > URL: https://issues.apache.org/jira/browse/SPARK-49052 > Project: Spark > Issue Type: Sub-task > Components: k8s >Affects Versions: kubernetes-operator-0.1.0 >Reporter: Zhou JIANG >Assignee: Zhou JIANG >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49052) Add SparkOperator class and tests
[ https://issues.apache.org/jira/browse/SPARK-49052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49052. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 29 [https://github.com/apache/spark-kubernetes-operator/pull/29] > Add SparkOperator class and tests > - > > Key: SPARK-49052 > URL: https://issues.apache.org/jira/browse/SPARK-49052 > Project: Spark > Issue Type: Sub-task > Components: k8s >Affects Versions: kubernetes-operator-0.1.0 >Reporter: Zhou JIANG >Assignee: Zhou JIANG >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49052) Add SparkOperator class and tests
[ https://issues.apache.org/jira/browse/SPARK-49052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49052: -- Summary: Add SparkOperator class and tests (was: Add SparkOperator main entry point class) > Add SparkOperator class and tests > - > > Key: SPARK-49052 > URL: https://issues.apache.org/jira/browse/SPARK-49052 > Project: Spark > Issue Type: Sub-task > Components: k8s >Affects Versions: kubernetes-operator-0.1.0 >Reporter: Zhou JIANG >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-49030) Self join of a CTE seems non-deterministic
[ https://issues.apache.org/jira/browse/SPARK-49030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870269#comment-17870269 ] Dongjoon Hyun commented on SPARK-49030: --- Thank you for reporting, [~jihoonson]. For the reported case, I agree with you that a `SELECT ... LIMIT 10` query can be non-deterministic. Without a deterministic ordering specification like `ORDER BY`, the first 10 rows Spark returns can come from any files. Given that, IIUC, I believe we need to revise `InlineCTE` to handle those cases. > Self join of a CTE seems non-deterministic > -- > > Key: SPARK-49030 > URL: https://issues.apache.org/jira/browse/SPARK-49030 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 > Environment: Tested with Spark 3.4.1, 3.5.1, and 4.0.0-preview. >Reporter: Jihoon Son >Priority: Minor > > {code:java} > WITH c AS (SELECT * FROM customer LIMIT 10) > SELECT count(*) > FROM c c1, c c2 > WHERE c1.c_customer_sk > c2.c_customer_sk{code} > Suppose a self-join query on a CTE such as the one above. > Spark generates a physical plan like the one below for this query. 
> {code:java} > == Physical Plan == > AdaptiveSparkPlan isFinalPlan=false > +- HashAggregate(keys=[], functions=[count(1)], output=[count(1)#194L]) > +- HashAggregate(keys=[], functions=[partial_count(1)], > output=[count#233L]) > +- Project > +- BroadcastNestedLoopJoin BuildRight, Inner, (c_customer_sk#0 > > c_customer_sk#214) > :- Filter isnotnull(c_customer_sk#0) > : +- GlobalLimit 10, 0 > : +- Exchange SinglePartition, ENSURE_REQUIREMENTS, > [plan_id=256] > : +- LocalLimit 10 > : +- FileScan parquet [c_customer_sk#0] Batched: true, > DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(1 > paths)[file:/some/path/customer], PartitionFilters: [], PushedFilters: [], > ReadSchema: struct > +- BroadcastExchange IdentityBroadcastMode, [plan_id=263] > +- Filter isnotnull(c_customer_sk#214) > +- GlobalLimit 10, 0 > +- Exchange SinglePartition, ENSURE_REQUIREMENTS, > [plan_id=259] > +- LocalLimit 10 > +- FileScan parquet [c_customer_sk#214] Batched: > true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(1 > paths)[file:/some/path/customer], PartitionFilters: [], PushedFilters: [], > ReadSchema: struct{code} > Evaluating this plan produces non-deterministic result because the limit is > independently pushed into the two sides of the join. Each limit can produce > different data, and thus the join can produce results that vary across runs. > I understand that the query in question is not deterministic (and thus not > very practical) as, due to the nature of the limit in distributed engines, it > is not expected to produce the same result anyway across repeated runs. > However, I would still expect that the query plan evaluation remains > deterministic. > Per extended analysis as seen below, it seems that the query plan has changed > at some point during optimization. 
> {code:java} > == Analyzed Logical Plan == > count(1): bigint > WithCTE > :- CTERelationDef 2, false > : +- SubqueryAlias c > : +- GlobalLimit 10 > : +- LocalLimit 10 > : +- Project [c_customer_sk#0, c_customer_id#1, > c_current_cdemo_sk#2, c_current_hdemo_sk#3, c_current_addr_sk#4, > c_first_shipto_date_sk#5, c_first_sales_date_sk#6, c_salutation#7, > c_first_name#8, c_last_name#9, c_preferred_cust_flag#10, c_birth_day#11L, > c_birth_month#12L, c_birth_year#13L, c_birth_country#14, c_login#15, > c_email_address#16, c_last_review_date_sk#17] > : +- SubqueryAlias customer > : +- View (`customer`, [c_customer_sk#0, c_customer_id#1, > c_current_cdemo_sk#2, c_current_hdemo_sk#3, c_current_addr_sk#4, > c_first_shipto_date_sk#5, c_first_sales_date_sk#6, c_salutation#7, > c_first_name#8, c_last_name#9, c_preferred_cust_flag#10, c_birth_day#11L, > c_birth_month#12L, c_birth_year#13L, c_birth_country#14, c_login#15, > c_email_address#16, c_last_review_date_sk#17]) > : +- Relation > [c_customer_sk#0,c_customer_id#1,c_current_cdemo_sk#2,c_current_hdemo_sk#3,c_current_addr_sk#4,c_first_shipto_date_sk#5,c_first_sales_date_sk#6,c_salutation#7,c_first_name#8,c_last_name#9,c_preferred_cust_flag#10,c_birth_day#11L,c_birth_month#12L,c_birth_year#13L,c_birth_c
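The effect of inlining the CTE can be modeled in a few lines of plain Python (a toy model, not Spark code; the seeds stand in for nondeterministic file-listing order and are purely illustrative). Materializing the CTE once makes the self-join see a single fixed sample; independently re-evaluating `LIMIT 10` on each side joins two potentially different samples.

```python
import random

rows = list(range(100))  # stand-in for the customer table

def limit10(seed: int):
    # Each scan may see rows in a different order (e.g. a different file
    # listing), so "first 10 rows" is an arbitrary sample per evaluation.
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    return shuffled[:10]

def self_join_count(c1, c2):
    # count(*) of the join on a > b
    return sum(1 for a in c1 for b in c2 if a > b)

# CTE materialized once: both join inputs are the same 10 rows, so the
# count is always C(10,2) = 45 for distinct keys.
c = limit10(seed=1)
print("single evaluation:", self_join_count(c, c))

# Inlined CTE: each side evaluates LIMIT 10 independently, so the inputs
# (and hence the count) can differ from run to run.
print("independent evaluations:", self_join_count(limit10(seed=1), limit10(seed=2)))
```

This matches the suggestion in the comment above: an optimizer that inlines a CTE must avoid duplicating non-deterministic subtrees such as an unordered `LIMIT`, or the two copies can diverge.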
[jira] [Resolved] (SPARK-49066) Reduce the effective scope of the configuration `spark.hadoop.hadoop.security.key.provider.path` in test scenarios
[ https://issues.apache.org/jira/browse/SPARK-49066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49066. --- Fix Version/s: 4.0.0 3.5.2 Resolution: Fixed > Reduce the effective scope of the configuration > `spark.hadoop.hadoop.security.key.provider.path` in test scenarios > -- > > Key: SPARK-49066 > URL: https://issues.apache.org/jira/browse/SPARK-49066 > Project: Spark > Issue Type: Test > Components: SQL, Tests >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0, 3.5.2 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49066) Reduce the effective scope of the configuration `spark.hadoop.hadoop.security.key.provider.path` in test scenarios
[ https://issues.apache.org/jira/browse/SPARK-49066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49066: -- Issue Type: Test (was: Improvement) > Reduce the effective scope of the configuration > `spark.hadoop.hadoop.security.key.provider.path` in test scenarios > -- > > Key: SPARK-49066 > URL: https://issues.apache.org/jira/browse/SPARK-49066 > Project: Spark > Issue Type: Test > Components: SQL, Tests >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-44638) Unable to read from JDBC data sources when using custom schema containing varchar
[ https://issues.apache.org/jira/browse/SPARK-44638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-44638: -- Fix Version/s: 3.5.2 > Unable to read from JDBC data sources when using custom schema containing > varchar > - > > Key: SPARK-44638 > URL: https://issues.apache.org/jira/browse/SPARK-44638 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.0, 3.2.4, 3.3.2, 3.4.1 >Reporter: Michael Said >Assignee: Kent Yao >Priority: Critical > Labels: pull-request-available > Fix For: 4.0.0, 3.5.2 > > > When querying the data from JDBC databases with custom schema containing > varchar I got this error : > {code:java} > [23/07/14 06:12:19 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1) ( > executor 1): java.sql.SQLException: Unsupported type varchar(100) at > org.apache.spark.sql.errors.QueryExecutionErrors$.unsupportedJdbcTypeError(QueryExecutionErrors.scala:818) > 23/07/14 06:12:21 INFO TaskSetManager: Lost task 0.1 in stage 1.0 (TID 2) on > , executor 0: java.sql.SQLException (Unsupported type varchar(100)){code} > Code example: > {code:java} > CUSTOM_SCHEMA="ID Integer, NAME VARCHAR(100)" > df = spark.read.format("jdbc") > .option("url", "jdbc:oracle:thin:@0.0.0.0:1521:db") > .option("driver", "oracle.jdbc.OracleDriver") > .option("dbtable", "table") > .option("customSchema", CUSTOM_SCHEMA) > .option("user", "user") > .option("password", "password") > .load() > df.show(){code} > I tried to set {{spark.sql.legacy.charVarcharAsString = true}} to restore the > behavior before Spark 3.1 but it doesn't help. > The issue occurs in version 3.1.0 and above. I believe that this issue is > caused by https://issues.apache.org/jira/browse/SPARK-33480 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-44638) Unable to read from JDBC data sources when using custom schema containing varchar
[ https://issues.apache.org/jira/browse/SPARK-44638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-44638. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47551 [https://github.com/apache/spark/pull/47551] > Unable to read from JDBC data sources when using custom schema containing > varchar > - > > Key: SPARK-44638 > URL: https://issues.apache.org/jira/browse/SPARK-44638 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.0, 3.2.4, 3.3.2, 3.4.1 >Reporter: Michael Said >Assignee: Kent Yao >Priority: Critical > Labels: pull-request-available > Fix For: 4.0.0 > > > When querying the data from JDBC databases with custom schema containing > varchar I got this error : > {code:java} > [23/07/14 06:12:19 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1) ( > executor 1): java.sql.SQLException: Unsupported type varchar(100) at > org.apache.spark.sql.errors.QueryExecutionErrors$.unsupportedJdbcTypeError(QueryExecutionErrors.scala:818) > 23/07/14 06:12:21 INFO TaskSetManager: Lost task 0.1 in stage 1.0 (TID 2) on > , executor 0: java.sql.SQLException (Unsupported type varchar(100)){code} > Code example: > {code:java} > CUSTOM_SCHEMA="ID Integer, NAME VARCHAR(100)" > df = spark.read.format("jdbc") > .option("url", "jdbc:oracle:thin:@0.0.0.0:1521:db") > .option("driver", "oracle.jdbc.OracleDriver") > .option("dbtable", "table") > .option("customSchema", CUSTOM_SCHEMA) > .option("user", "user") > .option("password", "password") > .load() > df.show(){code} > I tried to set {{spark.sql.legacy.charVarcharAsString = true}} to restore the > behavior before Spark 3.1 but it doesn't help. > The issue occurs in version 3.1.0 and above. 
I believe that this issue is > caused by https://issues.apache.org/jira/browse/SPARK-33480 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-44638) Unable to read from JDBC data sources when using custom schema containing varchar
[ https://issues.apache.org/jira/browse/SPARK-44638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-44638: - Assignee: Kent Yao > Unable to read from JDBC data sources when using custom schema containing > varchar > - > > Key: SPARK-44638 > URL: https://issues.apache.org/jira/browse/SPARK-44638 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.0, 3.2.4, 3.3.2, 3.4.1 >Reporter: Michael Said >Assignee: Kent Yao >Priority: Critical > Labels: pull-request-available > > When querying the data from JDBC databases with custom schema containing > varchar I got this error : > {code:java} > [23/07/14 06:12:19 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1) ( > executor 1): java.sql.SQLException: Unsupported type varchar(100) at > org.apache.spark.sql.errors.QueryExecutionErrors$.unsupportedJdbcTypeError(QueryExecutionErrors.scala:818) > 23/07/14 06:12:21 INFO TaskSetManager: Lost task 0.1 in stage 1.0 (TID 2) on > , executor 0: java.sql.SQLException (Unsupported type varchar(100)){code} > Code example: > {code:java} > CUSTOM_SCHEMA="ID Integer, NAME VARCHAR(100)" > df = spark.read.format("jdbc") > .option("url", "jdbc:oracle:thin:@0.0.0.0:1521:db") > .option("driver", "oracle.jdbc.OracleDriver") > .option("dbtable", "table") > .option("customSchema", CUSTOM_SCHEMA) > .option("user", "user") > .option("password", "password") > .load() > df.show(){code} > I tried to set {{spark.sql.legacy.charVarcharAsString = true}} to restore the > behavior before Spark 3.1 but it doesn't help. > The issue occurs in version 3.1.0 and above. I believe that this issue is > caused by https://issues.apache.org/jira/browse/SPARK-33480 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18134) SQL: MapType in Group BY and Joins not working
[ https://issues.apache.org/jira/browse/SPARK-18134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869944#comment-17869944 ] Dongjoon Hyun commented on SPARK-18134: --- You can file a working PR to the Apache Spark repository in order to revive this, [~rkchoudhary]. You can open a brand-new PR, or you can take over the existing PR. If you keep the existing authorship as the first commit and add your commits later in a single PR, it's okay with the community. bq. Let us know if there's anything we can do to assist in resolving this issue. > SQL: MapType in Group BY and Joins not working > -- > > Key: SPARK-18134 > URL: https://issues.apache.org/jira/browse/SPARK-18134 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.0, 1.5.1, 1.5.2, 1.6.0, 1.6.1, 1.6.2, 2.0.0, 2.0.1, > 2.1.0 >Reporter: Christian Zorneck >Priority: Major > > Since version 1.5 and issue SPARK-9415, MapTypes can no longer be used in > GROUP BY and join clauses. This makes it incompatible with HiveQL. So, a Hive > feature was removed from Spark. This makes Spark incompatible with various > HiveQL statements. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
Apache ORC 2.0.2 Release?
Hi, All. Apache ORC 2.0.2 is scheduled for 2024-08-15. I'd like to volunteer as the 2.0.2 release manager. In this release, there are a few important patches requested by other ASF communities. In addition, please ping me if you have more patches to deliver as a part of Apache ORC 2.0.2. https://github.com/apache/orc/milestone/32 ORC-1749: Fix `supportVectoredIO` for hadoop version string with optional patch labels ORC-1741: Respect decimal reader isRepeating flag ORC-1747: Upgrade `zstd-jni` to 1.5.6-4 ORC-1746: Bump `netty-all` to 4.1.110.Final in `bench` module ORC-1744: Add `ubuntu-24.04` to GitHub Action ORC-1743: Upgrade Spark to 4.0.0-preview1 ORC-1742: [FOLLOWUP] Remove unused import to fix checkstyle failure ORC-1742: Support printing the id, name and type of each column in dump tool ORC-1732: [C++] Fix detecting Homebrew-installed Protobuf on MacOS ORC-1740: Avoid the dump tool repeatedly parsing ColumnStatistics ORC-1738: [C++] Fix wrong Int128 maximum value ORC-1724: JsonFileDump utility should print user metadata ORC-1700: Write parquet decimal type data in Benchmark using `FIXED_LEN_BYTE_ARRAY` type ORC-1721: Upgrade `aircompressor` to 0.27 ORC-1733: [C++][CMake] Fix CMAKE_MODULE_PATH not to use PROJECT_SOURCE_DIR Thanks, Dongjoon.
[jira] [Commented] (ORC-1751) [C++] Syntax error in ThirdpartyToolchain
[ https://issues.apache.org/jira/browse/ORC-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869935#comment-17869935 ] Dongjoon Hyun commented on ORC-1751: Thank you for reporting, [~luffyZ]. I also read the context of the Arrow community link. > [C++] Syntax error in ThirdpartyToolchain > - > > Key: ORC-1751 > URL: https://issues.apache.org/jira/browse/ORC-1751 > Project: ORC > Issue Type: Improvement > Components: C++ >Reporter: Hao Zou >Priority: Major > > This topic has been discussed > [here|https://github.com/apache/arrow/pull/43417]. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (SPARK-48950) Corrupt data from parquet scans
[ https://issues.apache.org/jira/browse/SPARK-48950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869931#comment-17869931 ] Dongjoon Hyun commented on SPARK-48950: --- Thank you, [~Tom_Newton]. According to the JIRA description, I added a link to SPARK-42388. > Corrupt data from parquet scans > --- > > Key: SPARK-48950 > URL: https://issues.apache.org/jira/browse/SPARK-48950 > Project: Spark > Issue Type: Bug > Components: Input/Output >Affects Versions: 3.5.0, 4.0.0, 3.5.1 > Environment: Spark 3.5.0 > Running on kubernetes > Using Azure Blob storage with hierarchical namespace enabled >Reporter: Thomas Newton >Priority: Major > Labels: correctness > Attachments: example_task_errors.txt, job_dag.png, sql_query_plan.png > > > Its very rare and non-deterministic but since Spark 3.5.0 we have started > seeing a correctness bug in parquet scans when using the vectorized reader. > We've noticed this on double type columns where occasionally small groups > (typically 10s to 100s) of rows are replaced with crazy values like > `-1.29996470e+029, 3.56717569e-184, 7.23323243e+307, -1.05929677e+045, > -7.60562076e+240, -3.1806e-064, 2.89435993e-116`. I think this is the > result of interpreting uniform random bits as a double type. Most of my > testing has been on an array of double type column but we have also seen it > on un-nested plain double type columns. > I've been testing this by adding a filter that should return zero results but > will return non-zero if the parquet scan has problems. I've attached > screenshots of this from the Spark UI. > I did a `git bisect` and found that the problem starts with > [https://github.com/apache/spark/pull/39950], but I haven't yet understood > why. Its possible that this change is fine but it reveals a problem > elsewhere? I did also notice [https://github.com/apache/spark/pull/44853] > which appears to be a different implementation of the same thing so maybe > that could help. 
> Its not a major problem by itself but another symptom appears to be that > Parquet scan tasks fail at a rate of approximately 0.03% with errors like > those in the attached `example_task_errors.txt`. If I revert > [https://github.com/apache/spark/pull/39950] I get exactly 0 task failures on > the same test. > > The problem seems to be a bit dependant on how the parquet files happen to be > organised on blob storage so I don't yet have a reproduce that I can share > that doesn't depend on private data. > I tested on a pre-release 4.0.0 and the problem was still present. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-49000) Aggregation with DISTINCT gives wrong results when dealing with literals
[ https://issues.apache.org/jira/browse/SPARK-49000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869930#comment-17869930 ] Dongjoon Hyun commented on SPARK-49000: --- Thank you, [~uros-db]. > Aggregation with DISTINCT gives wrong results when dealing with literals > > > Key: SPARK-49000 > URL: https://issues.apache.org/jira/browse/SPARK-49000 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0, 3.1.3, 3.2.4, 3.5.1, 3.3.4, 3.4.3 >Reporter: Uroš Bojanić >Assignee: Uroš Bojanić >Priority: Critical > Labels: correctness, pull-request-available > Fix For: 4.0.0, 3.5.2 > > > Aggregation with *DISTINCT* gives wrong results when dealing with literals. > It appears that this bug affects all (or most) released versions of Spark. > > For example: > {code:java} > select count(distinct 1) from t{code} > returns 1, while the correct result should be 0. > > For reference: > {code:java} > select count(1) from t{code} > returns 0, which is the correct and expected result. > > In these examples, suppose that *t* is a table with any columns). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-49000) Aggregation with DISTINCT gives wrong results when dealing with literals
[ https://issues.apache.org/jira/browse/SPARK-49000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869925#comment-17869925 ] Dongjoon Hyun commented on SPARK-49000: --- When I use `spark-sql`, Apache Spark 3.5.1 and 3.4.2 seem to work correctly like the following. Is there a handy way to check this PR's case? {code} spark-sql (default)> select count(distinct 1) from (select * from range(1) where 1 = 0); 0 Time taken: 0.055 seconds, Fetched 1 row(s) {code} > Aggregation with DISTINCT gives wrong results when dealing with literals > > > Key: SPARK-49000 > URL: https://issues.apache.org/jira/browse/SPARK-49000 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0, 3.1.3, 3.2.4, 3.5.1, 3.3.4, 3.4.3 >Reporter: Uroš Bojanić >Assignee: Uroš Bojanić >Priority: Critical > Labels: correctness, pull-request-available > Fix For: 4.0.0, 3.5.2 > > > Aggregation with *DISTINCT* gives wrong results when dealing with literals. > It appears that this bug affects all (or most) released versions of Spark. > > For example: > {code:java} > select count(distinct 1) from t{code} > returns 1, while the correct result should be 0. > > For reference: > {code:java} > select count(1) from t{code} > returns 0, which is the correct and expected result. > > In these examples, suppose that *t* is an empty table (with any columns). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
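The expected semantics discussed in the comments above can be cross-checked against another SQL engine. This is an illustrative sketch only (it uses SQLite, not Spark): under standard SQL aggregate semantics, both COUNT(DISTINCT 1) and COUNT(1) over an empty table must return 0, which is the behavior SPARK-49000 restores.

```python
import sqlite3

# Hypothetical cross-check: SQLite follows the standard SQL semantics that
# SPARK-49000 says Spark should match. `t` is an empty table, as in the report.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c INTEGER)")

(count_distinct,) = conn.execute("SELECT count(DISTINCT 1) FROM t").fetchone()
(count_plain,) = conn.execute("SELECT count(1) FROM t").fetchone()

# Both aggregates see zero input rows, so both must be 0.
print(count_distinct, count_plain)  # prints: 0 0
```

The buggy Spark versions returned 1 for the first query because the literal-only DISTINCT aggregate was rewritten without accounting for an empty input.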
[jira] [Updated] (SPARK-49000) Aggregation with DISTINCT gives wrong results when dealing with literals
[ https://issues.apache.org/jira/browse/SPARK-49000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49000: -- Affects Version/s: 3.4.3 3.3.4 3.5.1 3.2.4 3.1.3 > Aggregation with DISTINCT gives wrong results when dealing with literals > > > Key: SPARK-49000 > URL: https://issues.apache.org/jira/browse/SPARK-49000 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0, 3.1.3, 3.2.4, 3.5.1, 3.3.4, 3.4.3 >Reporter: Uroš Bojanić >Assignee: Uroš Bojanić >Priority: Critical > Labels: correctness, pull-request-available > Fix For: 4.0.0, 3.5.2 > > > Aggregation with *DISTINCT* gives wrong results when dealing with literals. > It appears that this bug affects all (or most) released versions of Spark. > > For example: > {code:java} > select count(distinct 1) from t{code} > returns 1, while the correct result should be 0. > > For reference: > {code:java} > select count(1) from t{code} > returns 0, which is the correct and expected result. > > In these examples, suppose that *t* is an empty table (with any columns). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49000) Aggregation with DISTINCT gives wrong results when dealing with literals
[ https://issues.apache.org/jira/browse/SPARK-49000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49000: -- Labels: correctness pull-request-available (was: pull-request-available) > Aggregation with DISTINCT gives wrong results when dealing with literals > > > Key: SPARK-49000 > URL: https://issues.apache.org/jira/browse/SPARK-49000 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Uroš Bojanić >Assignee: Uroš Bojanić >Priority: Critical > Labels: correctness, pull-request-available > Fix For: 4.0.0, 3.5.2 > > > Aggregation with *DISTINCT* gives wrong results when dealing with literals. > It appears that this bug affects all (or most) released versions of Spark. > > For example: > {code:java} > select count(distinct 1) from t{code} > returns 1, while the correct result should be 0. > > For reference: > {code:java} > select count(1) from t{code} > returns 0, which is the correct and expected result. > > In these examples, suppose that *t* is an empty table (with any columns). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49064) Upgrade Kafka to 3.8.0
[ https://issues.apache.org/jira/browse/SPARK-49064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49064. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47540 [https://github.com/apache/spark/pull/47540] > Upgrade Kafka to 3.8.0 > -- > > Key: SPARK-49064 > URL: https://issues.apache.org/jira/browse/SPARK-49064 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-49055) Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869533#comment-17869533 ] Dongjoon Hyun edited comment on SPARK-49055 at 7/30/24 5:46 AM: To other reviewers, for the record, please see the discussion and conclusion on the PR too. - https://github.com/apache/spark/pull/45583#issuecomment-2257479676 was (Author: dongjoon): Please see the discussion and conclusion on the PR too. - https://github.com/apache/spark/pull/45583#issuecomment-2257479676 > Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS > environment > - > > Key: SPARK-49055 > URL: https://issues.apache.org/jira/browse/SPARK-49055 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Attachments: image-2024-07-30-12-59-02-278.png > > > {code} > git reset --hard 49b4c3bc9c09325de941dfaf41e4fd3a4a4c345f // > [SPARK-45393][BUILD] Upgrade Hadoop to 3.4.0 > build/sbt clean "sql/testOnly > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite" > ... > [info] OrcEncryptionSuite: > 12:42:55.441 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > 12:42:57.950 WARN org.apache.hadoop.crypto.OpensslCipher: Failed to load > OpenSSL Cipher.
> java.lang.UnsatisfiedLinkError: 'boolean > org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl()' > at org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl(Native > Method) > at > org.apache.hadoop.crypto.OpensslCipher.<init>(OpensslCipher.java:86) > at > org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec.<init>(OpensslAesCtrCryptoCodec.java:36) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499) > [info] - Write and read an encrypted file (2 seconds, 486 milliseconds) > [info] - Write and read an encrypted table (402 milliseconds) > [info] - SPARK-35325: Write and read encrypted nested columns (299 > milliseconds) > [info] - SPARK-35992: Write and read fully-encrypted columns with default > masking (623 milliseconds) > 12:42:59.856 WARN > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite: > = POSSIBLE THREAD LEAK IN SUITE > o.a.s.sql.execution.datasources.orc.OrcEncryptionSuite, threads: rpc-boss-3-1 > (daemon=true), Thread-17 (daemon=true), ForkJoinPool.commonPool-worker-2 > (daemon=true), shuffle-boss-6-1 (daemon=true), > ForkJoinPool.commonPool-worker-1 (daemon=true), Thread-18 (daemon=true), > ForkJoinPool.commonPool-worker-3 (daemon=true) = > [info] Run completed in 5 seconds, 291 milliseconds. > [info] Total number of tests run: 4 > [info] Suites: completed 1, aborted 0 > [info] Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0 > [info] All tests passed. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-49055) Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869533#comment-17869533 ] Dongjoon Hyun commented on SPARK-49055: --- Please see the discussion and conclusion on the PR too. - https://github.com/apache/spark/pull/45583#issuecomment-2257479676 > Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS > environment > - > > Key: SPARK-49055 > URL: https://issues.apache.org/jira/browse/SPARK-49055 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Attachments: image-2024-07-30-12-59-02-278.png > > > {code} > git reset --hard 49b4c3bc9c09325de941dfaf41e4fd3a4a4c345f // > [SPARK-45393][BUILD] Upgrade Hadoop to 3.4.0 > build/sbt clean "sql/testOnly > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite" > ... > [info] OrcEncryptionSuite: > 12:42:55.441 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > 12:42:57.950 WARN org.apache.hadoop.crypto.OpensslCipher: Failed to load > OpenSSL Cipher.
> java.lang.UnsatisfiedLinkError: 'boolean > org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl()' > at org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl(Native > Method) > at > org.apache.hadoop.crypto.OpensslCipher.(OpensslCipher.java:86) > at > org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec.(OpensslAesCtrCryptoCodec.java:36) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499) > [info] - Write and read an encrypted file (2 seconds, 486 milliseconds) > [info] - Write and read an encrypted table (402 milliseconds) > [info] - SPARK-35325: Write and read encrypted nested columns (299 > milliseconds) > [info] - SPARK-35992: Write and read fully-encrypted columns with default > masking (623 milliseconds) > 12:42:59.856 WARN > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite: > = POSSIBLE THREAD LEAK IN SUITE > o.a.s.sql.execution.datasources.orc.OrcEncryptionSuite, threads: rpc-boss-3-1 > (daemon=true), Thread-17 (daemon=true), ForkJoinPool.commonPool-worker-2 > (daemon=true), shuffle-boss-6-1 (daemon=true), > ForkJoinPool.commonPool-worker-1 (daemon=true), Thread-18 (daemon=true), > ForkJoinPool.commonPool-worker-3 (daemon=true) = > [info] Run completed in 5 seconds, 291 milliseconds. > [info] Total number of tests run: 4 > [info] Suites: completed 1, aborted 0 > [info] Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0 > [info] All tests passed. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Closed] (SPARK-49055) Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun closed SPARK-49055. - > Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS > environment > - > > Key: SPARK-49055 > URL: https://issues.apache.org/jira/browse/SPARK-49055 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Attachments: image-2024-07-30-12-59-02-278.png > > > {code} > git reset --hard 49b4c3bc9c09325de941dfaf41e4fd3a4a4c345f // > [SPARK-45393][BUILD] Upgrade Hadoop to 3.4.0 > build/sbt clean "sql/testOnly > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite" > ... > [info] OrcEncryptionSuite: > 12:42:55.441 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > 12:42:57.950 WARN org.apache.hadoop.crypto.OpensslCipher: Failed to load > OpenSSL Cipher. 
> java.lang.UnsatisfiedLinkError: 'boolean > org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl()' > at org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl(Native > Method) > at > org.apache.hadoop.crypto.OpensslCipher.(OpensslCipher.java:86) > at > org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec.(OpensslAesCtrCryptoCodec.java:36) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499) > [info] - Write and read an encrypted file (2 seconds, 486 milliseconds) > [info] - Write and read an encrypted table (402 milliseconds) > [info] - SPARK-35325: Write and read encrypted nested columns (299 > milliseconds) > [info] - SPARK-35992: Write and read fully-encrypted columns with default > masking (623 milliseconds) > 12:42:59.856 WARN > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite: > = POSSIBLE THREAD LEAK IN SUITE > o.a.s.sql.execution.datasources.orc.OrcEncryptionSuite, threads: rpc-boss-3-1 > (daemon=true), Thread-17 (daemon=true), ForkJoinPool.commonPool-worker-2 > (daemon=true), shuffle-boss-6-1 (daemon=true), > ForkJoinPool.commonPool-worker-1 (daemon=true), Thread-18 (daemon=true), > ForkJoinPool.commonPool-worker-3 (daemon=true) = > [info] Run completed in 5 seconds, 291 milliseconds. > [info] Total number of tests run: 4 > [info] Suites: completed 1, aborted 0 > [info] Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0 > [info] All tests passed. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49055) Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49055. --- Resolution: Not A Problem After investigating with [~LuciferYang], we concluded that there is no functional change in Apache Hadoop code (for 10 years). Only the log level is changed by HADOOP-17982. So, I closed this issue as `Not A Problem`. > Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS > environment > - > > Key: SPARK-49055 > URL: https://issues.apache.org/jira/browse/SPARK-49055 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Attachments: image-2024-07-30-12-59-02-278.png > > > {code} > git reset --hard 49b4c3bc9c09325de941dfaf41e4fd3a4a4c345f // > [SPARK-45393][BUILD] Upgrade Hadoop to 3.4.0 > build/sbt clean "sql/testOnly > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite" > ... > [info] OrcEncryptionSuite: > 12:42:55.441 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > 12:42:57.950 WARN org.apache.hadoop.crypto.OpensslCipher: Failed to load > OpenSSL Cipher. 
> java.lang.UnsatisfiedLinkError: 'boolean > org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl()' > at org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl(Native > Method) > at > org.apache.hadoop.crypto.OpensslCipher.(OpensslCipher.java:86) > at > org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec.(OpensslAesCtrCryptoCodec.java:36) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499) > [info] - Write and read an encrypted file (2 seconds, 486 milliseconds) > [info] - Write and read an encrypted table (402 milliseconds) > [info] - SPARK-35325: Write and read encrypted nested columns (299 > milliseconds) > [info] - SPARK-35992: Write and read fully-encrypted columns with default > masking (623 milliseconds) > 12:42:59.856 WARN > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite: > = POSSIBLE THREAD LEAK IN SUITE > o.a.s.sql.execution.datasources.orc.OrcEncryptionSuite, threads: rpc-boss-3-1 > (daemon=true), Thread-17 (daemon=true), ForkJoinPool.commonPool-worker-2 > (daemon=true), shuffle-boss-6-1 (daemon=true), > ForkJoinPool.commonPool-worker-1 (daemon=true), Thread-18 (daemon=true), > ForkJoinPool.commonPool-worker-3 (daemon=true) = > [info] Run completed in 5 seconds, 291 milliseconds. > [info] Total number of tests run: 4 > [info] Suites: completed 1, aborted 0 > [info] Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0 > [info] All tests passed. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-49055) Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869525#comment-17869525 ] Dongjoon Hyun commented on SPARK-49055: --- There has been no functional change in the Hadoop code in the last 10 years except the above log level change. Given that, I believe we can ignore the warning message. It's the same for all Hadoop versions (of the last 10 years). > Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS > environment > - > > Key: SPARK-49055 > URL: https://issues.apache.org/jira/browse/SPARK-49055 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Attachments: image-2024-07-30-12-59-02-278.png > > > {code} > git reset --hard 49b4c3bc9c09325de941dfaf41e4fd3a4a4c345f // > [SPARK-45393][BUILD] Upgrade Hadoop to 3.4.0 > build/sbt clean "sql/testOnly > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite" > ... > [info] OrcEncryptionSuite: > 12:42:55.441 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > 12:42:57.950 WARN org.apache.hadoop.crypto.OpensslCipher: Failed to load > OpenSSL Cipher.
> java.lang.UnsatisfiedLinkError: 'boolean > org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl()' > at org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl(Native > Method) > at > org.apache.hadoop.crypto.OpensslCipher.(OpensslCipher.java:86) > at > org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec.(OpensslAesCtrCryptoCodec.java:36) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499) > [info] - Write and read an encrypted file (2 seconds, 486 milliseconds) > [info] - Write and read an encrypted table (402 milliseconds) > [info] - SPARK-35325: Write and read encrypted nested columns (299 > milliseconds) > [info] - SPARK-35992: Write and read fully-encrypted columns with default > masking (623 milliseconds) > 12:42:59.856 WARN > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite: > = POSSIBLE THREAD LEAK IN SUITE > o.a.s.sql.execution.datasources.orc.OrcEncryptionSuite, threads: rpc-boss-3-1 > (daemon=true), Thread-17 (daemon=true), ForkJoinPool.commonPool-worker-2 > (daemon=true), shuffle-boss-6-1 (daemon=true), > ForkJoinPool.commonPool-worker-1 (daemon=true), Thread-18 (daemon=true), > ForkJoinPool.commonPool-worker-3 (daemon=true) = > [info] Run completed in 5 seconds, 291 milliseconds. > [info] Total number of tests run: 4 > [info] Suites: completed 1, aborted 0 > [info] Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0 > [info] All tests passed. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-49055) Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869523#comment-17869523 ] Dongjoon Hyun commented on SPARK-49055: --- To [~LuciferYang], this is a false alarm. According to the Hadoop code, they changed the log level only in Hadoop 3.4.0. - https://github.com/apache/hadoop/pull/3599/files > Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS > environment > - > > Key: SPARK-49055 > URL: https://issues.apache.org/jira/browse/SPARK-49055 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Attachments: image-2024-07-30-12-59-02-278.png > > > {code} > git reset --hard 49b4c3bc9c09325de941dfaf41e4fd3a4a4c345f // > [SPARK-45393][BUILD] Upgrade Hadoop to 3.4.0 > build/sbt clean "sql/testOnly > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite" > ... > [info] OrcEncryptionSuite: > 12:42:55.441 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > 12:42:57.950 WARN org.apache.hadoop.crypto.OpensslCipher: Failed to load > OpenSSL Cipher.
> java.lang.UnsatisfiedLinkError: 'boolean > org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl()' > at org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl(Native > Method) > at > org.apache.hadoop.crypto.OpensslCipher.(OpensslCipher.java:86) > at > org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec.(OpensslAesCtrCryptoCodec.java:36) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499) > [info] - Write and read an encrypted file (2 seconds, 486 milliseconds) > [info] - Write and read an encrypted table (402 milliseconds) > [info] - SPARK-35325: Write and read encrypted nested columns (299 > milliseconds) > [info] - SPARK-35992: Write and read fully-encrypted columns with default > masking (623 milliseconds) > 12:42:59.856 WARN > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite: > = POSSIBLE THREAD LEAK IN SUITE > o.a.s.sql.execution.datasources.orc.OrcEncryptionSuite, threads: rpc-boss-3-1 > (daemon=true), Thread-17 (daemon=true), ForkJoinPool.commonPool-worker-2 > (daemon=true), shuffle-boss-6-1 (daemon=true), > ForkJoinPool.commonPool-worker-1 (daemon=true), Thread-18 (daemon=true), > ForkJoinPool.commonPool-worker-3 (daemon=true) = > [info] Run completed in 5 seconds, 291 milliseconds. > [info] Total number of tests run: 4 > [info] Suites: completed 1, aborted 0 > [info] Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0 > [info] All tests passed. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-49055) Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869522#comment-17869522 ] Dongjoon Hyun commented on SPARK-49055: --- This seems to be a new feature of Hadoop 3.4.0. - https://issues.apache.org/jira/browse/HADOOP-17982 > Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS > environment > - > > Key: SPARK-49055 > URL: https://issues.apache.org/jira/browse/SPARK-49055 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 > Environment: MacOS on AppleSilicon >Reporter: Yang Jie >Priority: Major > Attachments: image-2024-07-30-12-59-02-278.png > > > {code} > git reset --hard 49b4c3bc9c09325de941dfaf41e4fd3a4a4c345f // > [SPARK-45393][BUILD] Upgrade Hadoop to 3.4.0 > build/sbt clean "sql/testOnly > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite" > ... > [info] OrcEncryptionSuite: > 12:42:55.441 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > 12:42:57.950 WARN org.apache.hadoop.crypto.OpensslCipher: Failed to load > OpenSSL Cipher. 
> java.lang.UnsatisfiedLinkError: 'boolean > org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl()' > at org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl(Native > Method) > at > org.apache.hadoop.crypto.OpensslCipher.(OpensslCipher.java:86) > at > org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec.(OpensslAesCtrCryptoCodec.java:36) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499) > [info] - Write and read an encrypted file (2 seconds, 486 milliseconds) > [info] - Write and read an encrypted table (402 milliseconds) > [info] - SPARK-35325: Write and read encrypted nested columns (299 > milliseconds) > [info] - SPARK-35992: Write and read fully-encrypted columns with default > masking (623 milliseconds) > 12:42:59.856 WARN > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite: > = POSSIBLE THREAD LEAK IN SUITE > o.a.s.sql.execution.datasources.orc.OrcEncryptionSuite, threads: rpc-boss-3-1 > (daemon=true), Thread-17 (daemon=true), ForkJoinPool.commonPool-worker-2 > (daemon=true), shuffle-boss-6-1 (daemon=true), > ForkJoinPool.commonPool-worker-1 (daemon=true), Thread-18 (daemon=true), > ForkJoinPool.commonPool-worker-3 (daemon=true) = > [info] Run completed in 5 seconds, 291 milliseconds. > [info] Total number of tests run: 4 > [info] Suites: completed 1, aborted 0 > [info] Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0 > [info] All tests passed. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49055) Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49055: -- Environment: (was: MacOS on AppleSilicon) > Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS > environment > - > > Key: SPARK-49055 > URL: https://issues.apache.org/jira/browse/SPARK-49055 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Attachments: image-2024-07-30-12-59-02-278.png > > > {code} > git reset --hard 49b4c3bc9c09325de941dfaf41e4fd3a4a4c345f // > [SPARK-45393][BUILD] Upgrade Hadoop to 3.4.0 > build/sbt clean "sql/testOnly > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite" > ... > [info] OrcEncryptionSuite: > 12:42:55.441 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > 12:42:57.950 WARN org.apache.hadoop.crypto.OpensslCipher: Failed to load > OpenSSL Cipher. 
> java.lang.UnsatisfiedLinkError: 'boolean > org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl()' > at org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl(Native > Method) > at > org.apache.hadoop.crypto.OpensslCipher.(OpensslCipher.java:86) > at > org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec.(OpensslAesCtrCryptoCodec.java:36) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499) > [info] - Write and read an encrypted file (2 seconds, 486 milliseconds) > [info] - Write and read an encrypted table (402 milliseconds) > [info] - SPARK-35325: Write and read encrypted nested columns (299 > milliseconds) > [info] - SPARK-35992: Write and read fully-encrypted columns with default > masking (623 milliseconds) > 12:42:59.856 WARN > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite: > = POSSIBLE THREAD LEAK IN SUITE > o.a.s.sql.execution.datasources.orc.OrcEncryptionSuite, threads: rpc-boss-3-1 > (daemon=true), Thread-17 (daemon=true), ForkJoinPool.commonPool-worker-2 > (daemon=true), shuffle-boss-6-1 (daemon=true), > ForkJoinPool.commonPool-worker-1 (daemon=true), Thread-18 (daemon=true), > ForkJoinPool.commonPool-worker-3 (daemon=true) = > [info] Run completed in 5 seconds, 291 milliseconds. > [info] Total number of tests run: 4 > [info] Suites: completed 1, aborted 0 > [info] Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0 > [info] All tests passed. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-49055) Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869521#comment-17869521 ] Dongjoon Hyun commented on SPARK-49055: --- Here is the Hadoop code. - https://github.com/apache/hadoop/blob/059e996c02d64716707d8dfb905dc84bab317aef/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/OpensslCipher.java#L83-L98 {code} static { String loadingFailure = null; try { if (!NativeCodeLoader.buildSupportsOpenssl()) { PerformanceAdvisory.LOG.warn("Build does not support openssl"); loadingFailure = "build does not support openssl."; } else { initIDs(); } } catch (Throwable t) { loadingFailure = t.getMessage(); LOG.warn("Failed to load OpenSSL Cipher.", t); } finally { loadingFailureReason = loadingFailure; } } {code} > Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS > environment > - > > Key: SPARK-49055 > URL: https://issues.apache.org/jira/browse/SPARK-49055 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 > Environment: MacOS on AppleSilicon >Reporter: Yang Jie >Priority: Major > Attachments: image-2024-07-30-12-59-02-278.png > > > {code} > git reset --hard 49b4c3bc9c09325de941dfaf41e4fd3a4a4c345f // > [SPARK-45393][BUILD] Upgrade Hadoop to 3.4.0 > build/sbt clean "sql/testOnly > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite" > ... > [info] OrcEncryptionSuite: > 12:42:55.441 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > 12:42:57.950 WARN org.apache.hadoop.crypto.OpensslCipher: Failed to load > OpenSSL Cipher. 
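The Hadoop static initializer quoted in the comment above probes a native capability exactly once, never lets the probe throw, and caches the failure reason for later callers. A minimal sketch of that load-once pattern follows; the names are hypothetical and this is not Hadoop's actual API, only an illustration of the technique:

```python
# Sketch of the OpensslCipher-style probe: run a capability check once at
# import time, convert any error into a recorded reason instead of raising,
# and let callers consult the cached result cheaply afterwards.

def _probe_openssl():
    """Return None on success, or a human-readable failure reason."""
    try:
        # Stand-in for NativeCodeLoader.buildSupportsOpenssl(); on a platform
        # without the native library this raises (UnsatisfiedLinkError in the
        # JVM case). Here we simulate that failure unconditionally.
        raise OSError("build does not support openssl")
    except Exception as err:  # Hadoop catches Throwable and logs a warning
        return str(err)

# Mirrors Hadoop's `loadingFailureReason`, assigned once in the static block.
loading_failure_reason = _probe_openssl()

def openssl_available():
    return loading_failure_reason is None

print(openssl_available(), loading_failure_reason)
```

The point of the pattern, and why the warning in this ticket is harmless, is that a failed probe only records a reason and downgrades the feature; it does not propagate the UnsatisfiedLinkError to callers.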
> java.lang.UnsatisfiedLinkError: 'boolean > org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl()' > at org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl(Native > Method) > at > org.apache.hadoop.crypto.OpensslCipher.(OpensslCipher.java:86) > at > org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec.(OpensslAesCtrCryptoCodec.java:36) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499) > [info] - Write and read an encrypted file (2 seconds, 486 milliseconds) > [info] - Write and read an encrypted table (402 milliseconds) > [info] - SPARK-35325: Write and read encrypted nested columns (299 > milliseconds) > [info] - SPARK-35992: Write and read fully-encrypted columns with default > masking (623 milliseconds) > 12:42:59.856 WARN > org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite: > = POSSIBLE THREAD LEAK IN SUITE > o.a.s.sql.execution.datasources.orc.OrcEncryptionSuite, threads: rpc-boss-3-1 > (daemon=true), Thread-17 (daemon=true), ForkJoinPool.commonPool-worker-2 > (daemon=true), shuffle-boss-6-1 (daemon=true), > ForkJoinPool.commonPool-worker-1 (daemon=true), Thread-18 (daemon=true), > ForkJoinPool.commonPool-worker-3 (daemon=true) = > [info] Run completed in 5 seconds, 291 milliseconds. > [info] Total number of tests run: 4 > [info] Suites: completed 1, aborted 0 > [info] Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0 > [info] All tests passed. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
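The guarded pattern in the static initializer quoted above can be reproduced in a minimal standalone sketch (a hypothetical demo class, not Hadoop code): invoking a native method whose library was never loaded throws `UnsatisfiedLinkError` at the call site, and because that is an `Error`, the initializer catches `Throwable` so class initialization still succeeds and only records a failure reason.

```java
// Hypothetical demo class (not Hadoop code): mirrors the guarded native
// capability check in OpensslCipher's static initializer.
public class NativeCheckDemo {
    // No System.loadLibrary() is ever called, so invoking this method
    // throws UnsatisfiedLinkError at the call site.
    private static native boolean buildSupportsOpenssl();

    static final String loadingFailureReason;

    static {
        String failure = null;
        try {
            if (!buildSupportsOpenssl()) {
                failure = "build does not support openssl.";
            }
        } catch (Throwable t) { // UnsatisfiedLinkError extends Error, hence catch Throwable
            failure = t.getClass().getSimpleName();
        }
        loadingFailureReason = failure;
    }

    public static void main(String[] args) {
        // Class initialization succeeded; the failure was merely recorded.
        System.out.println(loadingFailureReason);
    }
}
```

Running it prints `UnsatisfiedLinkError` instead of aborting, which matches the WARN-and-continue behavior seen in the test log: the suite still passes because the codec falls back rather than failing class initialization.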
[jira] [Commented] (SPARK-49055) Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869519#comment-17869519 ]

Dongjoon Hyun commented on SPARK-49055:
---------------------------------------

Ack. BTW, this is not a failure, is it, [~LuciferYang]? No test failed here. It looks like a `WARN` message whose content is an `UnsatisfiedLinkError`.

> [quoted issue description and test log elided; identical to the log quoted above]
[jira] [Commented] (SPARK-49055) Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869520#comment-17869520 ]

Dongjoon Hyun commented on SPARK-49055:
---------------------------------------

Does it mean there is any functional failure in the Hadoop layer?
{code}
12:42:57.950 WARN org.apache.hadoop.crypto.OpensslCipher: Failed to load OpenSSL Cipher.
java.lang.UnsatisfiedLinkError: 'boolean org.apache.hadoop.util.NativeCodeLoader.buildSupportsOpenssl()'
...
{code}

> [quoted issue description and test log elided; identical to the log quoted above]
[jira] [Updated] (SPARK-49055) Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-49055:
----------------------------------
    Summary: Investigate OrcEncryptionSuite UnsatisfiedLinkError on AppleSilicon MacOS environment  (was: Fix OrcEncryptionSuite failure on AppleSilicon MacOS environment)

> [quoted issue description and test log elided; identical to the log quoted above]
[jira] [Updated] (SPARK-49055) Fix OrcEncryptionSuite failure on Mac environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-49055:
----------------------------------
    Environment: MacOS on AppleSilicon  (was: MacOS)

> [quoted issue description and test log elided; identical to the log quoted above]
[jira] [Updated] (SPARK-49055) Fix OrcEncryptionSuite failure on AppleSilicon MacOS environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-49055:
----------------------------------
    Summary: Fix OrcEncryptionSuite failure on AppleSilicon MacOS environment  (was: Fix OrcEncryptionSuite failure on Mac environment)

> [quoted issue description and test log elided; identical to the log quoted above]
[jira] [Updated] (SPARK-49055) Fix OrcEncryptionSuite failure on Mac environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-49055:
----------------------------------
    Environment: MacOS

> [quoted issue description and test log elided; identical to the log quoted above]
[jira] [Updated] (SPARK-49055) Fix OrcEncryptionSuite failure on Mac environment
[ https://issues.apache.org/jira/browse/SPARK-49055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-49055:
----------------------------------
    Reporter: Yang Jie  (was: Dongjoon Hyun)

> [quoted issue description and test log elided; identical to the log quoted above]
[jira] [Created] (SPARK-49055) Fix OrcEncryptionSuite failure on Mac environment
Dongjoon Hyun created SPARK-49055:
-------------------------------------

             Summary: Fix OrcEncryptionSuite failure on Mac environment
                 Key: SPARK-49055
                 URL: https://issues.apache.org/jira/browse/SPARK-49055
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Dongjoon Hyun

{code}
git reset --hard 49b4c3bc9c09325de941dfaf41e4fd3a4a4c345f // [SPARK-45393][BUILD] Upgrade Hadoop to 3.4.0
build/sbt clean "sql/testOnly org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite"
...
[test log elided; identical to the log quoted above]
{code}
[jira] [Updated] (SPARK-46058) Add separate flag for privateKeyPassword
[ https://issues.apache.org/jira/browse/SPARK-46058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-46058:
----------------------------------
    Summary: Add separate flag for privateKeyPassword  (was: [CORE] Add separate flag for privateKeyPassword)

> Add separate flag for privateKeyPassword
> ----------------------------------------
>
>                 Key: SPARK-46058
>                 URL: https://issues.apache.org/jira/browse/SPARK-46058
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 4.0.0
>            Reporter: Hasnain Lakhani
>            Assignee: Hasnain Lakhani
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
> Right now with config inheritance we support:
> * JKS with password A, PEM with password B
> * JKS with no password, PEM with password A
> * JKS and PEM with no password
>
> But we do not support the case where the JKS has a password and the PEM does not. If we set keyPassword, we will attempt to use it, and we cannot set `spark.ssl.rpc.keyPassword` to null. So let's make it a separate flag as the easiest workaround.
>
> This was noticed while migrating some existing deployments to the RPC SSL support, where we use openssl support for RPC with a key that has no password.
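The unsupported combination described above (a password-protected JKS key inherited into an RPC namespace that uses a passwordless PEM key) can be sketched as a config fragment. This is illustrative only: beyond the `privateKeyPassword` name in the summary, the exact `spark.ssl.rpc.*` key names here are assumptions inferred from Spark's `spark.ssl.*` namespace inheritance, not confirmed settings.

```
# Shared SSL namespace: a JKS keystore whose key is password-protected.
spark.ssl.keyStore=/path/to/keystore.jks
spark.ssl.keyPassword=jks-secret

# The RPC namespace inherits spark.ssl.*, but uses an openssl-style PEM
# key with no password. There is no way to reset the inherited
# keyPassword back to null, so a separate privateKeyPassword flag
# (intentionally left unset here) keeps the PEM key's password
# independent of the JKS one.
spark.ssl.rpc.privateKey=/path/to/key.pem
```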
[jira] [Updated] (SPARK-45730) Make ReloadingX509TrustManagerSuite less flaky
[ https://issues.apache.org/jira/browse/SPARK-45730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-45730:
----------------------------------
    Summary: Make ReloadingX509TrustManagerSuite less flaky  (was: [CORE] Make ReloadingX509TrustManagerSuite less flaky)

> Make ReloadingX509TrustManagerSuite less flaky
> ----------------------------------------------
>
>                 Key: SPARK-45730
>                 URL: https://issues.apache.org/jira/browse/SPARK-45730
>             Project: Spark
>          Issue Type: Test
>          Components: Spark Core
>    Affects Versions: 4.0.0
>            Reporter: Hasnain Lakhani
>            Assignee: Hasnain Lakhani
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
[jira] [Updated] (SPARK-45544) Integrate SSL support into TransportContext
[ https://issues.apache.org/jira/browse/SPARK-45544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-45544:
----------------------------------
    Summary: Integrate SSL support into TransportContext  (was: [CORE] Integrate SSL support into TransportContext)

> Integrate SSL support into TransportContext
> -------------------------------------------
>
>                 Key: SPARK-45544
>                 URL: https://issues.apache.org/jira/browse/SPARK-45544
>             Project: Spark
>          Issue Type: Task
>          Components: Spark Core
>    Affects Versions: 4.0.0
>            Reporter: Hasnain Lakhani
>            Assignee: Hasnain Lakhani
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
> Integrate the SSL support into TransportContext so that Spark can use the RPC SSL support.
[jira] [Updated] (SPARK-46132) Support key password for JKS keys for RPC SSL
[ https://issues.apache.org/jira/browse/SPARK-46132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-46132:
----------------------------------
    Summary: Support key password for JKS keys for RPC SSL  (was: [CORE] Support key password for JKS keys for RPC SSL)

> Support key password for JKS keys for RPC SSL
> ---------------------------------------------
>
>                 Key: SPARK-46132
>                 URL: https://issues.apache.org/jira/browse/SPARK-46132
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 4.0.0
>            Reporter: Hasnain Lakhani
>            Assignee: Hasnain Lakhani
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
> See the thread at https://github.com/apache/spark/pull/43998#discussion_r1406993411
[jira] [Updated] (SPARK-45541) Add SSLFactory
[ https://issues.apache.org/jira/browse/SPARK-45541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-45541:
----------------------------------
    Summary: Add SSLFactory  (was: [CORE] Add SSLFactory)

> Add SSLFactory
> --------------
>
>                 Key: SPARK-45541
>                 URL: https://issues.apache.org/jira/browse/SPARK-45541
>             Project: Spark
>          Issue Type: Task
>          Components: Spark Core
>    Affects Versions: 4.0.0
>            Reporter: Hasnain Lakhani
>            Assignee: Hasnain Lakhani
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
> We need to add a factory to support creating SSL engines, which will be used to create client/server connections.
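As an illustration of the idea (this is not Spark's actual `SSLFactory`, just a minimal sketch on top of the JDK's JSSE API), such a factory typically wraps one configured `SSLContext` and hands out `SSLEngine` instances for the client and server sides of a connection:

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Minimal sketch of an SSL engine factory (not Spark's SSLFactory):
// a single configured SSLContext produces engines for both connection sides.
public class SimpleSslEngineFactory {
    private final SSLContext context;

    public SimpleSslEngineFactory() throws Exception {
        // Default context for the sketch; a real factory would build the
        // context from configured keystores and truststores instead.
        context = SSLContext.getDefault();
    }

    public SSLEngine createClientEngine(String peerHost, int peerPort) {
        SSLEngine engine = context.createSSLEngine(peerHost, peerPort);
        engine.setUseClientMode(true);   // the client side initiates the handshake
        return engine;
    }

    public SSLEngine createServerEngine() {
        SSLEngine engine = context.createSSLEngine();
        engine.setUseClientMode(false);  // the server side responds
        return engine;
    }

    public static void main(String[] args) throws Exception {
        SimpleSslEngineFactory factory = new SimpleSslEngineFactory();
        // prints true, then false
        System.out.println(factory.createClientEngine("localhost", 7077).getUseClientMode());
        System.out.println(factory.createServerEngine().getUseClientMode());
    }
}
```

The client/server split matters because `SSLEngine` is direction-aware: exactly one side of a connection must be in client mode for the TLS handshake to complete.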
[jira] [Updated] (SPARK-45431) Document SSL RPC feature
[ https://issues.apache.org/jira/browse/SPARK-45431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-45431: -- Summary: Document SSL RPC feature (was: [DOCS] Document SSL RPC feature) > Document SSL RPC feature > > > Key: SPARK-45431 > URL: https://issues.apache.org/jira/browse/SPARK-45431 > Project: Spark > Issue Type: Task > Components: Documentation >Affects Versions: 4.0.0 >Reporter: Hasnain Lakhani >Assignee: Hasnain Lakhani >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Add documentation for users
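For context, the documentation added here covers SSL settings for RPC connections. A minimal configuration sketch, assuming the standard `spark.ssl.${ns}` option names (keyStore, keyStorePassword, keyPassword, trustStore) apply to the `rpc` namespace; all paths and passwords below are placeholders:

```
# Enable SSL for RPC connections
spark.ssl.rpc.enabled              true
# JKS key store holding this node's private key and certificate
spark.ssl.rpc.keyStore             /path/to/keystore.jks
spark.ssl.rpc.keyStorePassword     store-password
# Per-key password (see SPARK-46132); may differ from the store password
spark.ssl.rpc.keyPassword          key-password
# Trust store used to validate peer certificates
spark.ssl.rpc.trustStore           /path/to/truststore.jks
spark.ssl.rpc.trustStorePassword   trust-password
```

As with the other `spark.ssl.${ns}` namespaces, options left unset are expected to fall back to the corresponding `spark.ssl.*` defaults.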
[jira] [Updated] (SPARK-45429) Add helper classes for RPC SSL communication
[ https://issues.apache.org/jira/browse/SPARK-45429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-45429: -- Summary: Add helper classes for RPC SSL communication (was: [CORE] Add helper classes for RPC SSL communication) > Add helper classes for RPC SSL communication > > > Key: SPARK-45429 > URL: https://issues.apache.org/jira/browse/SPARK-45429 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Hasnain Lakhani >Assignee: Hasnain Lakhani >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Add helper classes to handle the fact that encryption cannot work with > zero-copy transfers.
[jira] [Updated] (SPARK-45464) Fix yarn distribution build
[ https://issues.apache.org/jira/browse/SPARK-45464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-45464: -- Summary: Fix yarn distribution build (was: [CORE] Fix yarn distribution build) > Fix yarn distribution build > --- > > Key: SPARK-45464 > URL: https://issues.apache.org/jira/browse/SPARK-45464 > Project: Spark > Issue Type: Bug > Components: Spark Core, YARN >Affects Versions: 4.0.0 >Reporter: Hasnain Lakhani >Assignee: Hasnain Lakhani >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > [https://github.com/apache/spark/pull/43164] introduced a regression in: > > ``` > ./dev/make-distribution.sh --tgz -Phive -Phive-thriftserver -Pyarn > ``` > > This needs to be fixed.
[jira] [Updated] (SPARK-45377) Handle InputStream in NettyLogger
[ https://issues.apache.org/jira/browse/SPARK-45377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-45377: -- Summary: Handle InputStream in NettyLogger (was: [CORE] Handle InputStream in NettyLogger) > Handle InputStream in NettyLogger > - > > Key: SPARK-45377 > URL: https://issues.apache.org/jira/browse/SPARK-45377 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Hasnain Lakhani >Assignee: Hasnain Lakhani >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > > Allow NettyLogger to also print the size of InputStreams, which aids debugging > of SSL functionality.
[jira] [Updated] (SPARK-45408) Add RPC SSL settings to TransportConf
[ https://issues.apache.org/jira/browse/SPARK-45408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-45408: -- Summary: Add RPC SSL settings to TransportConf (was: [CORE] Add RPC SSL settings to TransportConf) > Add RPC SSL settings to TransportConf > - > > Key: SPARK-45408 > URL: https://issues.apache.org/jira/browse/SPARK-45408 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Hasnain Lakhani >Assignee: Hasnain Lakhani >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Add the SSL RPC settings to TransportConf, along with some > associated tests and sample configs used by other tests.
[jira] [Updated] (SPARK-45378) Add convertToNettyForSsl to ManagedBuffer
[ https://issues.apache.org/jira/browse/SPARK-45378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-45378: -- Summary: Add convertToNettyForSsl to ManagedBuffer (was: [CORE] Add convertToNettyForSsl to ManagedBuffer) > Add convertToNettyForSsl to ManagedBuffer > - > > Key: SPARK-45378 > URL: https://issues.apache.org/jira/browse/SPARK-45378 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Hasnain Lakhani >Assignee: Hasnain Lakhani >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Since Netty's SSL support does not allow zero-copy transfers, add another > API to ManagedBuffer so we can get buffers in a format that works with SSL.
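The zero-copy limitation can be illustrated outside of Spark: a TLS layer has to read and encrypt the plaintext bytes, so a kernel-space transfer (e.g. `FileChannel.transferTo`, which never surfaces the bytes to user code) cannot be combined with SSL, and the data must instead be materialized into a buffer first. A generic java.nio sketch of the two paths, not Spark's or Netty's actual code:

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyVsBuffered {
    public static void main(String[] args) throws Exception {
        Path src = Files.createTempFile("src", ".bin");
        Files.write(src, new byte[8192]); // 8 KiB sample payload

        // Path 1: zero-copy. transferTo can hand the file to the destination
        // in kernel space; user code never sees the bytes, so nothing can
        // encrypt them along the way.
        Path dst1 = Files.createTempFile("dst1", ".bin");
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst1, StandardOpenOption.WRITE)) {
            in.transferTo(0, in.size(), out);
        }

        // Path 2: buffered copy. The bytes pass through a user-space buffer,
        // where an SSL engine could wrap (encrypt) them before writing.
        Path dst2 = Files.createTempFile("dst2", ".bin");
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst2, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.allocate(4096);
            while (in.read(buf) != -1) {
                buf.flip();     // switch the buffer from writing to reading
                out.write(buf); // an SSL handler would encrypt here
                buf.clear();
            }
        }

        System.out.println("zero-copy bytes: " + Files.size(dst1));
        System.out.println("buffered bytes: " + Files.size(dst2));
    }
}
```

Both paths move the same number of bytes; the difference is only whether the bytes are ever visible to user code, which is what an SSL-aware `convertToNettyForSsl`-style API has to guarantee.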
[jira] [Updated] (SPARK-45426) Add support for a ReloadingTrustManager
[ https://issues.apache.org/jira/browse/SPARK-45426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-45426: -- Summary: Add support for a ReloadingTrustManager (was: [CORE] Add support for a ReloadingTrustManager) > Add support for a ReloadingTrustManager > --- > > Key: SPARK-45426 > URL: https://issues.apache.org/jira/browse/SPARK-45426 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Hasnain Lakhani >Assignee: Hasnain Lakhani >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > For the RPC SSL feature, this allows us to properly reload the trust store if > needed at runtime.
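A minimal, self-contained sketch of the reloading idea using only the JDK's `javax.net.ssl` API. The class and method names here are illustrative, not Spark's actual implementation, and a real trust manager would also delegate `checkClientTrusted`/`checkServerTrusted` (omitted for brevity):

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

public class ReloadingTrustManagerDemo {
    // Wraps an X509TrustManager and re-reads the trust store file whenever
    // its modification time changes, so new CAs can be picked up at runtime.
    static final class ReloadingTrustManager {
        private final Path trustStore;
        private final char[] password;
        private volatile long lastModified = -1L;
        private volatile X509TrustManager delegate;

        ReloadingTrustManager(Path trustStore, char[] password) throws Exception {
            this.trustStore = trustStore;
            this.password = password;
            reloadIfNeeded();
        }

        void reloadIfNeeded() throws Exception {
            long mtime = Files.getLastModifiedTime(trustStore).toMillis();
            if (mtime == lastModified) return; // unchanged: keep current delegate
            KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
            try (InputStream in = Files.newInputStream(trustStore)) {
                ks.load(in, password);
            }
            TrustManagerFactory tmf = TrustManagerFactory.getInstance(
                TrustManagerFactory.getDefaultAlgorithm());
            tmf.init(ks);
            for (TrustManager tm : tmf.getTrustManagers()) {
                if (tm instanceof X509TrustManager) {
                    delegate = (X509TrustManager) tm;
                    break;
                }
            }
            lastModified = mtime;
        }

        X509Certificate[] acceptedIssuers() {
            return delegate.getAcceptedIssuers();
        }
    }

    public static void main(String[] args) throws Exception {
        // Demo: write an empty trust store to disk, then load it through
        // the reloading wrapper.
        Path store = Files.createTempFile("trust", ".jks");
        char[] pw = "changeit".toCharArray();
        KeyStore empty = KeyStore.getInstance(KeyStore.getDefaultType());
        empty.load(null, null);
        try (OutputStream out = Files.newOutputStream(store)) {
            empty.store(out, pw);
        }
        ReloadingTrustManager rtm = new ReloadingTrustManager(store, pw);
        System.out.println("trusted issuers: " + rtm.acceptedIssuers().length);
    }
}
```

Checking the modification time on each use keeps the common path cheap; only when the file actually changes is the (relatively expensive) key store parse repeated.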
[jira] [Updated] (SPARK-45427) Add RPC SSL settings to SSLOptions and SparkTransportConf
[ https://issues.apache.org/jira/browse/SPARK-45427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-45427: -- Summary: Add RPC SSL settings to SSLOptions and SparkTransportConf (was: [CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf) > Add RPC SSL settings to SSLOptions and SparkTransportConf > - > > Key: SPARK-45427 > URL: https://issues.apache.org/jira/browse/SPARK-45427 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Hasnain Lakhani >Assignee: Hasnain Lakhani >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Add the options here for RPC SSL so we can handle setting inheritance and > propagate the options to the callsites that need it.
[jira] [Updated] (SPARK-45375) Mark connection as timedOut in TransportClient.close
[ https://issues.apache.org/jira/browse/SPARK-45375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-45375: -- Summary: Mark connection as timedOut in TransportClient.close (was: [CORE] Mark connection as timedOut in TransportClient.close) > Mark connection as timedOut in TransportClient.close > > > Key: SPARK-45375 > URL: https://issues.apache.org/jira/browse/SPARK-45375 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 3.4.2, 4.0.0, 3.5.1, 3.3.4 >Reporter: Hasnain Lakhani >Assignee: Hasnain Lakhani >Priority: Major > > Avoid a race condition where a connection that is in the process of being > closed could be returned by the TransportClientFactory, only to be immediately > closed and cause errors upon use. > > This doesn't happen much in practice, but is observed more frequently as part > of efforts to add SSL support.
[jira] [Updated] (SPARK-45374) Add test keys for SSL functionality
[ https://issues.apache.org/jira/browse/SPARK-45374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-45374: -- Summary: Add test keys for SSL functionality (was: [CORE] Add test keys for SSL functionality) > Add test keys for SSL functionality > --- > > Key: SPARK-45374 > URL: https://issues.apache.org/jira/browse/SPARK-45374 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Hasnain Lakhani >Assignee: Hasnain Lakhani >Priority: Major > > Add test SSL keys which will be used for unit and integration tests of the > new SSL RPC functionality