[jira] [Resolved] (SPARK-48195) Keep and RDD created by SparkPlan doExecute
[ https://issues.apache.org/jira/browse/SPARK-48195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-48195. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48037 [https://github.com/apache/spark/pull/48037] > Keep and RDD created by SparkPlan doExecute > --- > > Key: SPARK-48195 > URL: https://issues.apache.org/jira/browse/SPARK-48195 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Juliusz Sompolski >Assignee: Juliusz Sompolski >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > For more consistency, don't make SparkPlan execute generate a new RDD each > time, but reuse the one created by doExecute once. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
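As a rough illustration of the change described in SPARK-48195 above (create the RDD once, then reuse it on later calls), the sketch below memoizes the result of a do_execute call. It is written in Python for brevity and is not Spark's code: PlanNode, do_execute and execute are hypothetical names that only mirror the roles of SparkPlan.doExecute and SparkPlan.execute.

```python
from functools import cached_property


class PlanNode:
    """Hypothetical stand-in for a physical plan node (not a Spark class)."""

    def __init__(self):
        self.do_execute_calls = 0

    def do_execute(self):
        # Stand-in for the work of building an RDD; returns a fresh object each call.
        self.do_execute_calls += 1
        return object()

    @cached_property
    def _executed(self):
        # Evaluated on first access only; the result is stored on the instance.
        return self.do_execute()

    def execute(self):
        # Every call returns the same object instead of rebuilding it.
        return self._executed
```

With this shape, two calls to execute() on the same node return the identical object and do_execute runs once.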
[jira] [Assigned] (SPARK-48195) Keep and RDD created by SparkPlan doExecute
[ https://issues.apache.org/jira/browse/SPARK-48195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-48195: Assignee: Juliusz Sompolski > Keep and RDD created by SparkPlan doExecute > --- > > Key: SPARK-48195 > URL: https://issues.apache.org/jira/browse/SPARK-48195 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Juliusz Sompolski >Assignee: Juliusz Sompolski >Priority: Major > Labels: pull-request-available > > For more consistency, don't make SparkPlan execute generate a new RDD each > time, but reuse the one created by doExecute once. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49741) Add spark.shuffle.accurateBlockSkewedFactor to OSS Apache Spark configuration docs page
[ https://issues.apache.org/jira/browse/SPARK-49741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49741. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48189 [https://github.com/apache/spark/pull/48189] > Add spark.shuffle.accurateBlockSkewedFactor to OSS Apache Spark configuration > docs page > --- > > Key: SPARK-49741 > URL: https://issues.apache.org/jira/browse/SPARK-49741 > Project: Spark > Issue Type: Task > Components: Documentation >Affects Versions: 3.3.0, 3.4.0, 3.5.0 >Reporter: Tim Lee >Assignee: Tim Lee >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > spark.shuffle.accurateBlockSkewedFactor was added in Spark 3.3.0 in > https://issues.apache.org/jira/browse/SPARK-36967 and is a useful shuffle > configuration to prevent issues where HighlyCompressedMapStatus wrongly > estimates the shuffle block sizes when the size distribution is skewed. > Currently this config is not discoverable because it's not on the Spark > config docs page and we should add it there. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
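To make the skew problem mentioned in SPARK-49741 concrete, here is a small numeric sketch. It assumes that a compressed map status reports one average size for the blocks it does not track individually, and that a skew factor marks blocks larger than factor * median for exact tracking. The function name and the exact rule are assumptions for illustration; the rule Spark applies may differ.

```python
from statistics import median


def estimate_block_sizes(sizes, skewed_factor=None):
    """Return per-block size estimates (hypothetical helper, not a Spark API).

    Without skewed_factor every non-empty block is reported as the average.
    With it, blocks larger than skewed_factor * median keep their exact size
    and the average is taken over the remaining non-empty blocks.
    """
    non_empty = [s for s in sizes if s > 0]
    if not non_empty:
        return [0] * len(sizes)
    if skewed_factor is None:
        exact = set()
    else:
        cutoff = skewed_factor * median(non_empty)
        exact = {i for i, s in enumerate(sizes) if s > cutoff}
    rest = [s for i, s in enumerate(sizes) if s > 0 and i not in exact]
    avg = sum(rest) / len(rest) if rest else 0.0
    return [s if i in exact else (avg if s > 0 else 0) for i, s in enumerate(sizes)]
```

For block sizes [10, 10, 10, 10, 1000], the plain average reports 208.0 for every block, which understates the large block by almost 5x. With a factor of 5 the large block keeps its exact size and the others are reported as 10.0.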
[jira] [Assigned] (SPARK-49741) Add spark.shuffle.accurateBlockSkewedFactor to OSS Apache Spark configuration docs page
[ https://issues.apache.org/jira/browse/SPARK-49741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49741: Assignee: Tim Lee > Add spark.shuffle.accurateBlockSkewedFactor to OSS Apache Spark configuration > docs page > --- > > Key: SPARK-49741 > URL: https://issues.apache.org/jira/browse/SPARK-49741 > Project: Spark > Issue Type: Task > Components: Documentation >Affects Versions: 3.3.0, 3.4.0, 3.5.0 >Reporter: Tim Lee >Assignee: Tim Lee >Priority: Major > Labels: pull-request-available > > spark.shuffle.accurateBlockSkewedFactor was added in Spark 3.3.0 in > https://issues.apache.org/jira/browse/SPARK-36967 and is a useful shuffle > configuration to prevent issues where HighlyCompressedMapStatus wrongly > estimates the shuffle block sizes when the size distribution is skewed. > Currently this config is not discoverable because it's not on the Spark > config docs page and we should add it there. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49641) Include `table_funcs` and `variant_funcs` in the built-in function list doc
[ https://issues.apache.org/jira/browse/SPARK-49641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49641: Assignee: BingKun Pan > Include `table_funcs` and `variant_funcs` in the built-in function list doc > --- > > Key: SPARK-49641 > URL: https://issues.apache.org/jira/browse/SPARK-49641 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49641) Include `table_funcs` and `variant_funcs` in the built-in function list doc
[ https://issues.apache.org/jira/browse/SPARK-49641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49641. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48106 [https://github.com/apache/spark/pull/48106] > Include `table_funcs` and `variant_funcs` in the built-in function list doc > --- > > Key: SPARK-49641 > URL: https://issues.apache.org/jira/browse/SPARK-49641 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49422) Create a shared KeyValueGroupedDataset interface
[ https://issues.apache.org/jira/browse/SPARK-49422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-49422: - Fix Version/s: (was: 4.0.0) > Create a shared KeyValueGroupedDataset interface > > > Key: SPARK-49422 > URL: https://issues.apache.org/jira/browse/SPARK-49422 > Project: Spark > Issue Type: New Feature > Components: Connect, SQL >Affects Versions: 4.0.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > Labels: pull-request-available > > This should also implement RelationalGroupedDataset.as[K: Encoder, T: > Encoder]: KeyValueGroupedDataset[K, T]. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Reopened] (SPARK-49422) Create a shared KeyValueGroupedDataset interface
[ https://issues.apache.org/jira/browse/SPARK-49422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-49422: -- reverted at https://github.com/apache/spark/commit/f3c8d26eb0c3fd7f77950eb08c70bb2a9ab6493c > Create a shared KeyValueGroupedDataset interface > > > Key: SPARK-49422 > URL: https://issues.apache.org/jira/browse/SPARK-49422 > Project: Spark > Issue Type: New Feature > Components: Connect, SQL >Affects Versions: 4.0.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > This should also implement RelationalGroupedDataset.as[K: Encoder, T: > Encoder]: KeyValueGroupedDataset[K, T]. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49495) Document and Feature Preview on master branch via Live GitHub Pages Updates
[ https://issues.apache.org/jira/browse/SPARK-49495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-49495: - Fix Version/s: (was: 4.0.0) > Document and Feature Preview on master branch via Live GitHub Pages Updates > --- > > Key: SPARK-49495 > URL: https://issues.apache.org/jira/browse/SPARK-49495 > Project: Spark > Issue Type: Documentation > Components: Documentation, Project Infra >Affects Versions: 4.0.0 >Reporter: Kent Yao >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Reopened] (SPARK-49495) Document and Feature Preview on master branch via Live GitHub Pages Updates
[ https://issues.apache.org/jira/browse/SPARK-49495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-49495: -- Assignee: (was: Kent Yao) Reverted in https://github.com/apache/spark/commit/7de71a2ec78d985c2a045f13c1275101b126cec4 and https://github.com/apache/spark/commit/b1807095bef9c6d98e60bdc2669c8af93bc68ad4 > Document and Feature Preview on master branch via Live GitHub Pages Updates > --- > > Key: SPARK-49495 > URL: https://issues.apache.org/jira/browse/SPARK-49495 > Project: Spark > Issue Type: Documentation > Components: Documentation, Project Infra >Affects Versions: 4.0.0 >Reporter: Kent Yao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0
[jira] [Resolved] (SPARK-49684) Get rid of unnecessary global locks from Spark Connect service
[ https://issues.apache.org/jira/browse/SPARK-49684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49684. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48131 [https://github.com/apache/spark/pull/48131] > Get rid of unnecessary global locks from Spark Connect service > -- > > Key: SPARK-49684 > URL: https://issues.apache.org/jira/browse/SPARK-49684 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Changgyoo Park >Assignee: Changgyoo Park >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > > Candidates: sessionsLock and executionsLock. > + StreamingQueryCache. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49684) Get rid of unnecessary global locks from Spark Connect service
[ https://issues.apache.org/jira/browse/SPARK-49684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49684: Assignee: Changgyoo Park > Get rid of unnecessary global locks from Spark Connect service > -- > > Key: SPARK-49684 > URL: https://issues.apache.org/jira/browse/SPARK-49684 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Changgyoo Park >Assignee: Changgyoo Park >Priority: Minor > Labels: pull-request-available > > Candidates: sessionsLock and executionsLock. > + StreamingQueryCache. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49688) Data race between SparkListener and SparkConnectServiceSuite over ExecuteHolder
[ https://issues.apache.org/jira/browse/SPARK-49688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49688. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48142 [https://github.com/apache/spark/pull/48142] > Data race between SparkListener and SparkConnectServiceSuite over > ExecuteHolder > --- > > Key: SPARK-49688 > URL: https://issues.apache.org/jira/browse/SPARK-49688 > Project: Spark > Issue Type: Bug > Components: Connect >Affects Versions: 4.0.0 >Reporter: Changgyoo Park >Assignee: Changgyoo Park >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > https://github.com/longvu-db/spark/actions/runs/10903815379/job/30258793280 > => This is most likely a test issue (no synchronisation between two threads), > but we need to look into it closely to confirm that.
[jira] [Assigned] (SPARK-49688) Data race between SparkListener and SparkConnectServiceSuite over ExecuteHolder
[ https://issues.apache.org/jira/browse/SPARK-49688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49688: Assignee: Changgyoo Park > Data race between SparkListener and SparkConnectServiceSuite over > ExecuteHolder > --- > > Key: SPARK-49688 > URL: https://issues.apache.org/jira/browse/SPARK-49688 > Project: Spark > Issue Type: Bug > Components: Connect >Affects Versions: 4.0.0 >Reporter: Changgyoo Park >Assignee: Changgyoo Park >Priority: Major > Labels: pull-request-available > > https://github.com/longvu-db/spark/actions/runs/10903815379/job/30258793280 > => This is most likely a test issue (no synchronisation between two threads), > but we need to look into it closely to confirm that.
[jira] [Assigned] (SPARK-49673) Increase maxBatchSize for Connect's sqlCommandResult
[ https://issues.apache.org/jira/browse/SPARK-49673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49673: Assignee: Robert Dillitz > Increase maxBatchSize for Connect's sqlCommandResult > > > Key: SPARK-49673 > URL: https://issues.apache.org/jira/browse/SPARK-49673 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Robert Dillitz >Assignee: Robert Dillitz >Priority: Major > Labels: pull-request-available > Original Estimate: 1h > Remaining Estimate: 1h > > Increase the default maxBatchSize from 4MiB * 0.7 to 128MiB (= > CONNECT_GRPC_MAX_MESSAGE_SIZE) * 0.7 when creating the single Arrow batch for > the SqlCommandResult in the SparkConnectPlanner. This lets us return much > larger LocalRelations in the SqlCommandResult (for example for the SHOW > PARTITIONS command) while still staying within the GRPC message size limit. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49673) Increase maxBatchSize for Connect's sqlCommandResult
[ https://issues.apache.org/jira/browse/SPARK-49673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49673. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48122 [https://github.com/apache/spark/pull/48122] > Increase maxBatchSize for Connect's sqlCommandResult > > > Key: SPARK-49673 > URL: https://issues.apache.org/jira/browse/SPARK-49673 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Robert Dillitz >Assignee: Robert Dillitz >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Original Estimate: 1h > Remaining Estimate: 1h > > Increase the default maxBatchSize from 4MiB * 0.7 to 128MiB (= > CONNECT_GRPC_MAX_MESSAGE_SIZE) * 0.7 when creating the single Arrow batch for > the SqlCommandResult in the SparkConnectPlanner. This lets us return much > larger LocalRelations in the SqlCommandResult (for example for the SHOW > PARTITIONS command) while still staying within the GRPC message size limit. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
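The arithmetic behind SPARK-49673 can be checked with a few lines. This is only a worked calculation of the two limits quoted in the issue (4 MiB * 0.7 and 128 MiB * 0.7); max_batch_size is a hypothetical helper, not a Spark function, and truncating to whole bytes is an assumption.

```python
MIB = 1024 * 1024


def max_batch_size(limit_bytes, fraction=0.7):
    """Largest batch, in whole bytes, that stays within `fraction` of the limit."""
    return int(limit_bytes * fraction)


old_limit = max_batch_size(4 * MIB)    # about 2.8 MiB
new_limit = max_batch_size(128 * MIB)  # about 89.6 MiB
```

The new limit is roughly 32 times the old one, which is what allows a much larger single Arrow batch while still leaving 30% headroom under the gRPC message size limit.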
[jira] [Assigned] (SPARK-49548) Get rid of coarse-locking in SparkConnectSessionManager
[ https://issues.apache.org/jira/browse/SPARK-49548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49548: Assignee: Changgyoo Park > Get rid of coarse-locking in SparkConnectSessionManager > --- > > Key: SPARK-49548 > URL: https://issues.apache.org/jira/browse/SPARK-49548 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Changgyoo Park >Assignee: Changgyoo Park >Priority: Minor > Labels: pull-request-available > > Related to https://issues.apache.org/jira/browse/SPARK-49544. > -> This has never caused a real world problem, but we had better fix it in > tandem with https://issues.apache.org/jira/browse/SPARK-49544. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49544) Severe lock contention in SparkConnectExecutionManager
[ https://issues.apache.org/jira/browse/SPARK-49544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49544. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48034 [https://github.com/apache/spark/pull/48034] > Severe lock contention in SparkConnectExecutionManager > -- > > Key: SPARK-49544 > URL: https://issues.apache.org/jira/browse/SPARK-49544 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Changgyoo Park >Assignee: Changgyoo Park >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Critical sections protected by executionsLock can become too broad when there > are too many ExecuteHolders, e.g., >= 10^4. The problem is aggravated when > there are too many threads in the system: priority inversion. In order to > minimise the chance of a thread getting pre-empted holding executionsLock, > replace it with a concurrent hash map that internally partitions the data: > coarse-grained locking -> fine-grained locking. > -> https://issues.apache.org/jira/browse/SPARK-49580 will be eventually > needed. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
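The move from coarse-grained to fine-grained locking described in SPARK-49544 can be sketched as lock striping: instead of one lock guarding the whole map, each key hashes to one of several independently locked partitions. The Python class below is an illustration of that idea only. StripedMap and its methods are hypothetical names; the issue itself refers to replacing executionsLock with a concurrent hash map, which this sketch does not reproduce exactly.

```python
import threading


class StripedMap:
    """Dictionary split into independently locked partitions (illustrative only)."""

    def __init__(self, stripes=16):
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._buckets = [{} for _ in range(stripes)]

    def _stripe(self, key):
        # Keys with different stripe indexes never contend for the same lock.
        return hash(key) % len(self._locks)

    def put_if_absent(self, key, value):
        i = self._stripe(key)
        with self._locks[i]:
            # Returns the existing value if present, otherwise stores and returns `value`.
            return self._buckets[i].setdefault(key, value)

    def get(self, key, default=None):
        i = self._stripe(key)
        with self._locks[i]:
            return self._buckets[i].get(key, default)

    def remove(self, key):
        i = self._stripe(key)
        with self._locks[i]:
            return self._buckets[i].pop(key, None)

    def __len__(self):
        # Not an atomic snapshot: partitions are summed without holding all locks.
        return sum(len(bucket) for bucket in self._buckets)
```

Each critical section now covers a single partition and a single key operation, so a thread that is pre-empted while holding a lock blocks only the callers that hash to the same partition.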
[jira] [Resolved] (SPARK-49548) Get rid of coarse-locking in SparkConnectSessionManager
[ https://issues.apache.org/jira/browse/SPARK-49548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49548. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48036 [https://github.com/apache/spark/pull/48036] > Get rid of coarse-locking in SparkConnectSessionManager > --- > > Key: SPARK-49548 > URL: https://issues.apache.org/jira/browse/SPARK-49548 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Changgyoo Park >Assignee: Changgyoo Park >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > > Related to https://issues.apache.org/jira/browse/SPARK-49544. > -> This has never caused a real world problem, but we had better fix it in > tandem with https://issues.apache.org/jira/browse/SPARK-49544. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49544) Severe lock contention in SparkConnectExecutionManager
[ https://issues.apache.org/jira/browse/SPARK-49544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49544: Assignee: Changgyoo Park > Severe lock contention in SparkConnectExecutionManager > -- > > Key: SPARK-49544 > URL: https://issues.apache.org/jira/browse/SPARK-49544 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Changgyoo Park >Assignee: Changgyoo Park >Priority: Major > Labels: pull-request-available > > Critical sections protected by executionsLock can become too broad when there > are too many ExecuteHolders, e.g., >= 10^4. The problem is aggravated when > there are too many threads in the system: priority inversion. In order to > minimise the chance of a thread getting pre-empted holding executionsLock, > replace it with a concurrent hash map that internally partitions the data: > coarse-grained locking -> fine-grained locking. > -> https://issues.apache.org/jira/browse/SPARK-49580 will be eventually > needed. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49553) Remove experimental API notes for pandas related functions
[ https://issues.apache.org/jira/browse/SPARK-49553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49553: Assignee: Allison Wang > Remove experimental API notes for pandas related functions > -- > > Key: SPARK-49553 > URL: https://issues.apache.org/jira/browse/SPARK-49553 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49553) Remove experimental API notes for pandas related functions
[ https://issues.apache.org/jira/browse/SPARK-49553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49553. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48042 [https://github.com/apache/spark/pull/48042] > Remove experimental API notes for pandas related functions > -- > > Key: SPARK-49553 > URL: https://issues.apache.org/jira/browse/SPARK-49553 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49582) Improve "dispatch_window_method" utility and docstring
[ https://issues.apache.org/jira/browse/SPARK-49582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49582. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48056 [https://github.com/apache/spark/pull/48056] > Improve "dispatch_window_method" utility and docstring > -- > > Key: SPARK-49582 > URL: https://issues.apache.org/jira/browse/SPARK-49582 > Project: Spark > Issue Type: Improvement > Components: Connect, PySpark >Affects Versions: 4.0.0 >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > - Remove the unreachable exception from "dispatch_window_method". > - Improve docstrings. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49582) Improve "dispatch_window_method" utility and docstring
[ https://issues.apache.org/jira/browse/SPARK-49582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49582: Assignee: Xinrong Meng > Improve "dispatch_window_method" utility and docstring > -- > > Key: SPARK-49582 > URL: https://issues.apache.org/jira/browse/SPARK-49582 > Project: Spark > Issue Type: Improvement > Components: Connect, PySpark >Affects Versions: 4.0.0 >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major > Labels: pull-request-available > > - Remove the unreachable exception from "dispatch_window_method". > - Improve docstrings. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49576) Upload Python logs in CI
[ https://issues.apache.org/jira/browse/SPARK-49576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49576. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48048 [https://github.com/apache/spark/pull/48048] > Upload Python logs in CI > > > Key: SPARK-49576 > URL: https://issues.apache.org/jira/browse/SPARK-49576 > Project: Spark > Issue Type: Improvement > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > e.g., > /__w/spark/spark/python/target/28a23950-46c7-45c5-a9b7-42e7d9b21518/python3.12__pyspark.sql.tests.connect.test_connect_session__ah_ug0xu.log) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49536) Add error handling for python streaming data source record prefetching
[ https://issues.apache.org/jira/browse/SPARK-49536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49536. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48023 [https://github.com/apache/spark/pull/48023] > Add error handling for python streaming data source record prefetching > -- > > Key: SPARK-49536 > URL: https://issues.apache.org/jira/browse/SPARK-49536 > Project: Spark > Issue Type: Task > Components: PySpark, SS >Affects Versions: 4.0.0 >Reporter: Chaoqin Li >Assignee: Chaoqin Li >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Currently there is an assert that the status code returned from the Python worker > is SpecialLengths.START_ARROW_STREAM when the Python source runner is prefetching > records. To improve debuggability, check the status code and rethrow a > runtime error with a detailed error message.
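The pattern requested in SPARK-49536 (replace a bare assert on the status code with an explicit check that raises a descriptive error) might look like the sketch below. It is a Python illustration, not the code from the pull request: the two constants are illustrative stand-ins for SpecialLengths values, and check_prefetch_status is a hypothetical helper.

```python
# Illustrative stand-ins for status codes sent by the Python worker (assumed values).
START_ARROW_STREAM = -6
PYTHON_EXCEPTION_THROWN = -2


def check_prefetch_status(status, error_message=""):
    """Raise a descriptive RuntimeError unless the worker signalled an Arrow stream."""
    if status == START_ARROW_STREAM:
        return
    if status == PYTHON_EXCEPTION_THROWN:
        # Surface the worker's own error text instead of a bare assertion failure.
        raise RuntimeError(
            "Python streaming source failed while prefetching records: "
            + error_message
        )
    raise RuntimeError(
        f"Unexpected status code {status} while prefetching records; "
        f"expected {START_ARROW_STREAM}"
    )
```

Compared with an assert, the caller gets the failing status code and, when available, the worker's error message in the exception text.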
[jira] [Assigned] (SPARK-49536) Add error handling for python streaming data source record prefetching
[ https://issues.apache.org/jira/browse/SPARK-49536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49536: Assignee: Chaoqin Li > Add error handling for python streaming data source record prefetching > -- > > Key: SPARK-49536 > URL: https://issues.apache.org/jira/browse/SPARK-49536 > Project: Spark > Issue Type: Task > Components: PySpark, SS >Affects Versions: 4.0.0 >Reporter: Chaoqin Li >Assignee: Chaoqin Li >Priority: Major > Labels: pull-request-available > > Currently there is an assert that the status code returned from the Python worker > is SpecialLengths.START_ARROW_STREAM when the Python source runner is prefetching > records. To improve debuggability, check the status code and rethrow a > runtime error with a detailed error message.
[jira] [Created] (SPARK-49545) Increase timeout for build from 3 to 4 hours
Hyukjin Kwon created SPARK-49545: Summary: Increase timeout for build from 3 to 4 hours Key: SPARK-49545 URL: https://issues.apache.org/jira/browse/SPARK-49545 Project: Spark Issue Type: Improvement Components: Project Infra Affects Versions: 4.0.0 Reporter: Hyukjin Kwon https://github.com/apache/spark/actions/workflows/build_python_3.12.yml fails after hitting the 3-hour timeout. We should increase it to 4 hours.
[jira] [Resolved] (SPARK-49532) Improve documentation of "plotting.sample_ratio" option
[ https://issues.apache.org/jira/browse/SPARK-49532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49532. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48013 [https://github.com/apache/spark/pull/48013] > Improve documentation of "plotting.sample_ratio" option > --- > > Key: SPARK-49532 > URL: https://issues.apache.org/jira/browse/SPARK-49532 > Project: Spark > Issue Type: Sub-task > Components: Documentation >Affects Versions: 4.0.0 >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > The current documentation incorrectly suggests that "plotting.sample_ratio" > defaults to "plotting.max_rows". In reality, if "plotting.sample_ratio" is > not explicitly set, it is *derived* based on the ratio of "plotting.max_rows" > to the dataset size. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
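The behaviour described in SPARK-49532 (the sample ratio is derived from "plotting.max_rows" and the dataset size when it is not set explicitly) can be written as a small function. This is a sketch under assumptions: effective_sample_ratio is a hypothetical helper, the default of 1000 rows and the cap at 1.0 are assumed, and the real option handling may differ in detail.

```python
def effective_sample_ratio(total_rows, max_rows=1000, sample_ratio=None):
    """Return the fraction of rows to sample for plotting (illustrative only)."""
    if sample_ratio is not None:
        # An explicitly configured ratio wins.
        return sample_ratio
    if total_rows <= 0:
        return 1.0
    # Derived ratio: aim for about max_rows rows, never more than the full dataset.
    return min(1.0, max_rows / total_rows)
```

For example, a 10,000-row dataset with max_rows of 1000 yields a derived ratio of 0.1, while a 500-row dataset is plotted in full.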
[jira] [Assigned] (SPARK-49532) Improve documentation of "plotting.sample_ratio" option
[ https://issues.apache.org/jira/browse/SPARK-49532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49532: Assignee: Xinrong Meng > Improve documentation of "plotting.sample_ratio" option > --- > > Key: SPARK-49532 > URL: https://issues.apache.org/jira/browse/SPARK-49532 > Project: Spark > Issue Type: Sub-task > Components: Documentation >Affects Versions: 4.0.0 >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major > Labels: pull-request-available > > The current documentation incorrectly suggests that "plotting.sample_ratio" > defaults to "plotting.max_rows". In reality, if "plotting.sample_ratio" is > not explicitly set, it is *derived* based on the ratio of "plotting.max_rows" > to the dataset size. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49284) Create a shared Catalog interface
[ https://issues.apache.org/jira/browse/SPARK-49284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49284: Assignee: Herman van Hövell > Create a shared Catalog interface > - > > Key: SPARK-49284 > URL: https://issues.apache.org/jira/browse/SPARK-49284 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 4.0.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > Labels: pull-request-available > > Create a shared Catalog interface for both classic and connect: > org.apache.spark.sql.api.Catalog in sql/api, and use these interfaces in the > classic and connect implementations. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49284) Create a shared Catalog interface
[ https://issues.apache.org/jira/browse/SPARK-49284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49284. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47991 [https://github.com/apache/spark/pull/47991] > Create a shared Catalog interface > - > > Key: SPARK-49284 > URL: https://issues.apache.org/jira/browse/SPARK-49284 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 4.0.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Create a shared Catalog interface for both classic and connect: > org.apache.spark.sql.api.Catalog in sql/api, and use these interfaces in the > classic and connect implementations. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49511) Attempt to apply formatting rules to sql/api
[ https://issues.apache.org/jira/browse/SPARK-49511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49511. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47989 [https://github.com/apache/spark/pull/47989] > Attempt to apply formatting rules to sql/api > > > Key: SPARK-49511 > URL: https://issues.apache.org/jira/browse/SPARK-49511 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 4.0.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49525) Log improvement for server side streaming query listener bus listener
[ https://issues.apache.org/jira/browse/SPARK-49525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49525: Assignee: Wei Liu > Log improvement for server side streaming query listener bus listener > - > > Key: SPARK-49525 > URL: https://issues.apache.org/jira/browse/SPARK-49525 > Project: Spark > Issue Type: Task > Components: Connect, SS >Affects Versions: 4.0.0 >Reporter: Wei Liu >Assignee: Wei Liu >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49526) Windows-style paths are unsupported in ArtifactManager
[ https://issues.apache.org/jira/browse/SPARK-49526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49526: Assignee: Venkata Sai Akhil Gudesa > Windows-style paths are unsupported in ArtifactManager > -- > > Key: SPARK-49526 > URL: https://issues.apache.org/jira/browse/SPARK-49526 > Project: Spark > Issue Type: Bug > Components: Connect >Affects Versions: 4.0.0 >Reporter: Venkata Sai Akhil Gudesa >Assignee: Venkata Sai Akhil Gudesa >Priority: Major > Labels: pull-request-available > > Currently, Windows-based clients run into an issue when using the > `addArtifact` API because the path passed to the server contains backslashes, > which the server interprets as part of the file name rather than as a > separator. > E.g., if the client sends the name `pyfiles\abc.txt` to the server, then the > artifact would be written out as `/pyfiles\abc.txt` > instead of the correct `\pyfiles\abc.txt` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49526) Windows-style paths are unsupported in ArtifactManager
[ https://issues.apache.org/jira/browse/SPARK-49526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49526. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 48003 [https://github.com/apache/spark/pull/48003] > Windows-style paths are unsupported in ArtifactManager > -- > > Key: SPARK-49526 > URL: https://issues.apache.org/jira/browse/SPARK-49526 > Project: Spark > Issue Type: Bug > Components: Connect >Affects Versions: 4.0.0 >Reporter: Venkata Sai Akhil Gudesa >Assignee: Venkata Sai Akhil Gudesa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Currently, Windows-based clients run into an issue when using the > `addArtifact` API because the path passed to the server contains backslashes, > which the server interprets as part of the file name rather than as a > separator. > E.g., if the client sends the name `pyfiles\abc.txt` to the server, then the > artifact would be written out as `/pyfiles\abc.txt` > instead of the correct `\pyfiles\abc.txt` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
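The separator problem in SPARK-49526 can be reproduced with Python's standard `pathlib`. This is only a sketch of the general technique (parse the client-supplied name with Windows semantics, then emit forward slashes). It is not the change made in pull request 48003, and the helper name `to_server_artifact_name` is hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_server_artifact_name(client_name: str) -> str:
    """Normalize a client-supplied relative artifact name to forward slashes.

    Hypothetical helper: PureWindowsPath treats both '\\' and '/' as
    separators, so names from Windows and POSIX clients end up the same.
    """
    return PureWindowsPath(client_name).as_posix()


# A POSIX-style parser sees the backslash as part of a single file name,
# which is the behaviour described in the report.
posix_view = PurePosixPath("pyfiles\\abc.txt").parts

# After normalization the directory and the file name are separate parts.
normalized = to_server_artifact_name("pyfiles\\abc.txt")
normalized_parts = PurePosixPath(normalized).parts
```

Using the pure path classes keeps the sketch free of filesystem access, so it behaves the same regardless of the operating system it runs on.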
[jira] [Assigned] (SPARK-49458) ReattachExecute does not supply server-side session id
[ https://issues.apache.org/jira/browse/SPARK-49458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49458: Assignee: Changgyoo Park > ReattachExecute does not supply server-side session id > -- > > Key: SPARK-49458 > URL: https://issues.apache.org/jira/browse/SPARK-49458 > Project: Spark > Issue Type: Bug > Components: Connect >Affects Versions: 4.0.0 >Reporter: Changgyoo Park >Assignee: Changgyoo Park >Priority: Major > Labels: pull-request-available > > Related to SPARK-47380. > > One example: > Driver restart -> session re-created -> OPERATION_NOT_FOUND instead of > SESSION_NOT_FOUND, because the first attempt to reattach a session does not > contain the last observed server-side session id. > E.g., > Start execution (client_observed_server_side_session_id is set) -> driver > restart -> reattach (client_observed_server_side_session_id is not set) -> > OPERATION_NOT_FOUND instead of SESSION_NOT_FOUND. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
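The sequence described in SPARK-49458 can be mimicked with a toy model. Everything here is hypothetical (the `ToyServer` class, its methods, and the order of the checks). It only illustrates why a reattach request that omits the last observed server-side session id cannot be answered with SESSION_NOT_FOUND; it is not Spark Connect code.

```python
import uuid
from typing import Optional


class ToyServer:
    """Toy stand-in for a server that gets a new session id on every restart."""

    def __init__(self) -> None:
        self.restart()

    def restart(self) -> None:
        # A driver restart re-creates the session and forgets all operations.
        self.session_id = uuid.uuid4().hex
        self.operations = set()

    def execute(self, op_id: str) -> str:
        # The response carries the server-side session id for the client to keep.
        self.operations.add(op_id)
        return self.session_id

    def reattach(self, op_id: str, observed_session_id: Optional[str] = None) -> str:
        # The session check is only possible when the client sends what it last saw.
        if observed_session_id is not None and observed_session_id != self.session_id:
            return "SESSION_NOT_FOUND"
        if op_id not in self.operations:
            return "OPERATION_NOT_FOUND"
        return "OK"


server = ToyServer()
observed = server.execute("op-1")
before_restart = server.reattach("op-1", observed)
server.restart()
without_id = server.reattach("op-1")          # id omitted, as in the report
with_id = server.reattach("op-1", observed)   # id supplied
```

In this model the request without the observed id falls through to the operation lookup, while the request that carries it is rejected at the session check.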
[jira] [Assigned] (SPARK-48960) Makes spark-submit works with Spark connect
[ https://issues.apache.org/jira/browse/SPARK-48960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-48960: Assignee: Hyukjin Kwon > Makes spark-submit works with Spark connect > --- > > Key: SPARK-48960 > URL: https://issues.apache.org/jira/browse/SPARK-48960 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > Similar to SPARK-48936. We should make Spark Submit work with Spark Connect. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49477) Improve pandas udf return type error message
[ https://issues.apache.org/jira/browse/SPARK-49477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49477. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47942 [https://github.com/apache/spark/pull/47942] > Improve pandas udf return type error message > > > Key: SPARK-49477 > URL: https://issues.apache.org/jira/browse/SPARK-49477 > Project: Spark > Issue Type: Improvement > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49456) Spark website doesn't properly scroll to hash links
[ https://issues.apache.org/jira/browse/SPARK-49456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49456: Assignee: Neil Ramaswamy > Spark website doesn't properly scroll to hash links > > > Key: SPARK-49456 > URL: https://issues.apache.org/jira/browse/SPARK-49456 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 4.0.0 >Reporter: Neil Ramaswamy >Assignee: Neil Ramaswamy >Priority: Major > Labels: pull-request-available > > On the version-specific Spark documentation, if you click a header, the page > will scroll past the actual content, hiding it. For example, if you go to > [this link|https://spark.apache.org/docs/latest/#downloading], you'll > probably notice the page scroll past "Downloads". > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49456) Spark website doesn't properly scroll to hash links
[ https://issues.apache.org/jira/browse/SPARK-49456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49456. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47925 [https://github.com/apache/spark/pull/47925] > Spark website doesn't properly scroll to hash links > > > Key: SPARK-49456 > URL: https://issues.apache.org/jira/browse/SPARK-49456 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 4.0.0 >Reporter: Neil Ramaswamy >Assignee: Neil Ramaswamy >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > On the version-specific Spark documentation, if you click a header, the page > will scroll past the actual content, hiding it. For example, if you go to > [this link|https://spark.apache.org/docs/latest/#downloading], you'll > probably notice the page scroll past "Downloads". > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49449) Remove string and binary from metadata in spec
[ https://issues.apache.org/jira/browse/SPARK-49449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49449. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47917 [https://github.com/apache/spark/pull/47917] > Remove string and binary from metadata in spec > -- > > Key: SPARK-49449 > URL: https://issues.apache.org/jira/browse/SPARK-49449 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: David Cashman >Assignee: David Cashman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > We never supported the string-from-metadata or binary-from-metadata. Remove > them for now. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49402) Fix Binder integration in PySpark documentation
[ https://issues.apache.org/jira/browse/SPARK-49402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49402: Assignee: Hyukjin Kwon > Fix Binder integration in PySpark documentation > --- > > Key: SPARK-49402 > URL: https://issues.apache.org/jira/browse/SPARK-49402 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 4.0.0, 3.5.2, 3.4.3 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > https://mybinder.org/v2/gh/apache/spark/bb7846dd487?filepath=python%2Fdocs%2Fsource%2Fgetting_started%2Fquickstart_df.ipynb > is broken. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49402) Fix Binder integration in PySpark documentation
[ https://issues.apache.org/jira/browse/SPARK-49402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49402. -- Fix Version/s: 3.4.4 4.0.0 3.5.3 Resolution: Fixed Issue resolved by pull request 47883 [https://github.com/apache/spark/pull/47883] > Fix Binder integration in PySpark documentation > --- > > Key: SPARK-49402 > URL: https://issues.apache.org/jira/browse/SPARK-49402 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 4.0.0, 3.5.2, 3.4.3 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 3.4.4, 4.0.0, 3.5.3 > > > https://mybinder.org/v2/gh/apache/spark/bb7846dd487?filepath=python%2Fdocs%2Fsource%2Fgetting_started%2Fquickstart_df.ipynb > is broken. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49402) Fix Binder integration in PySpark documentation
Hyukjin Kwon created SPARK-49402: Summary: Fix Binder integration in PySpark documentation Key: SPARK-49402 URL: https://issues.apache.org/jira/browse/SPARK-49402 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 3.4.3, 3.5.2, 4.0.0 Reporter: Hyukjin Kwon https://mybinder.org/v2/gh/apache/spark/bb7846dd487?filepath=python%2Fdocs%2Fsource%2Fgetting_started%2Fquickstart_df.ipynb is broken. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49387) Fix type hint for `accuracy` in `percentile_approx` and `approx_percentile`
[ https://issues.apache.org/jira/browse/SPARK-49387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49387. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47869 [https://github.com/apache/spark/pull/47869] > Fix type hint for `accuracy` in `percentile_approx` and `approx_percentile` > --- > > Key: SPARK-49387 > URL: https://issues.apache.org/jira/browse/SPARK-49387 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49387) Fix type hint for `accuracy` in `percentile_approx` and `approx_percentile`
[ https://issues.apache.org/jira/browse/SPARK-49387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49387: Assignee: Ruifeng Zheng > Fix type hint for `accuracy` in `percentile_approx` and `approx_percentile` > --- > > Key: SPARK-49387 > URL: https://issues.apache.org/jira/browse/SPARK-49387 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49370) maven.scaladoc.skip should not affect test code compilation
[ https://issues.apache.org/jira/browse/SPARK-49370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49370. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47858 [https://github.com/apache/spark/pull/47858] > maven.scaladoc.skip should not affect test code compilation > --- > > Key: SPARK-49370 > URL: https://issues.apache.org/jira/browse/SPARK-49370 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 3.5.0, 4.0.0 >Reporter: Cheng Pan >Assignee: Cheng Pan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49370) maven.scaladoc.skip should not affect test code compilation
[ https://issues.apache.org/jira/browse/SPARK-49370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49370: Assignee: Cheng Pan > maven.scaladoc.skip should not affect test code compilation > --- > > Key: SPARK-49370 > URL: https://issues.apache.org/jira/browse/SPARK-49370 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 3.5.0, 4.0.0 >Reporter: Cheng Pan >Assignee: Cheng Pan >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49347) Deprecate SparkR
[ https://issues.apache.org/jira/browse/SPARK-49347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49347. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47842 [https://github.com/apache/spark/pull/47842] > Deprecate SparkR > > > Key: SPARK-49347 > URL: https://issues.apache.org/jira/browse/SPARK-49347 > Project: Spark > Issue Type: Task > Components: SparkR >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Critical > Labels: pull-request-available > Fix For: 4.0.0 > > > Deprecate SparkR > - Discussion: https://lists.apache.org/thread/qjgsgxklvpvyvbzsx1qr8o533j4zjlm5 > - Vote: https://lists.apache.org/thread/3c8qxks26kqflsjh0gtjo3nldk686vtq -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49347) Deprecate SparkR
[ https://issues.apache.org/jira/browse/SPARK-49347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49347: Assignee: Hyukjin Kwon > Deprecate SparkR > > > Key: SPARK-49347 > URL: https://issues.apache.org/jira/browse/SPARK-49347 > Project: Spark > Issue Type: Task > Components: SparkR >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Critical > Labels: pull-request-available > > Deprecate SparkR > - Discussion: https://lists.apache.org/thread/qjgsgxklvpvyvbzsx1qr8o533j4zjlm5 > - Vote: https://lists.apache.org/thread/3c8qxks26kqflsjh0gtjo3nldk686vtq -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49368) Avoid accessing protobuf lite classes directly
[ https://issues.apache.org/jira/browse/SPARK-49368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49368: Assignee: Cheng Pan > Avoid accessing protobuf lite classes directly > -- > > Key: SPARK-49368 > URL: https://issues.apache.org/jira/browse/SPARK-49368 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Cheng Pan >Assignee: Cheng Pan >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49368) Avoid accessing protobuf lite classes directly
[ https://issues.apache.org/jira/browse/SPARK-49368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49368. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47857 [https://github.com/apache/spark/pull/47857] > Avoid accessing protobuf lite classes directly > -- > > Key: SPARK-49368 > URL: https://issues.apache.org/jira/browse/SPARK-49368 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Cheng Pan >Assignee: Cheng Pan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49359) allow StagedTableCatalog implementations to fall back to non-atomic write
[ https://issues.apache.org/jira/browse/SPARK-49359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49359. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47848 [https://github.com/apache/spark/pull/47848] > allow StagedTableCatalog implementations to fall back to non-atomic write > - > > Key: SPARK-49359 > URL: https://issues.apache.org/jira/browse/SPARK-49359 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0, 4.0.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49359) allow StagedTableCatalog implementations to fall back to non-atomic write
[ https://issues.apache.org/jira/browse/SPARK-49359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49359: Assignee: Wenchen Fan > allow StagedTableCatalog implementations to fall back to non-atomic write > - > > Key: SPARK-49359 > URL: https://issues.apache.org/jira/browse/SPARK-49359 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0, 4.0.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49344) Support `json_normalize` for Pandas API on Spark
[ https://issues.apache.org/jira/browse/SPARK-49344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49344: Assignee: Haejoon Lee > Support `json_normalize` for Pandas API on Spark > > > Key: SPARK-49344 > URL: https://issues.apache.org/jira/browse/SPARK-49344 > Project: Spark > Issue Type: Bug > Components: Pandas API on Spark >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > > For Pandas feature parity: > https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49344) Support `json_normalize` for Pandas API on Spark
[ https://issues.apache.org/jira/browse/SPARK-49344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49344. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47840 [https://github.com/apache/spark/pull/47840] > Support `json_normalize` for Pandas API on Spark > > > Key: SPARK-49344 > URL: https://issues.apache.org/jira/browse/SPARK-49344 > Project: Spark > Issue Type: Bug > Components: Pandas API on Spark >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > For Pandas feature parity: > https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49301) [PySpark] Chunk arrow data passed to Python worker
[ https://issues.apache.org/jira/browse/SPARK-49301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49301. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47804 [https://github.com/apache/spark/pull/47804] > [PySpark] Chunk arrow data passed to Python worker > -- > > Key: SPARK-49301 > URL: https://issues.apache.org/jira/browse/SPARK-49301 > Project: Spark > Issue Type: Task > Components: Structured Streaming >Affects Versions: 4.0.0 >Reporter: Bo Gao >Assignee: Bo Gao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49347) Deprecate SparkR
[ https://issues.apache.org/jira/browse/SPARK-49347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-49347: - Issue Type: Task (was: Improvement) > Deprecate SparkR > > > Key: SPARK-49347 > URL: https://issues.apache.org/jira/browse/SPARK-49347 > Project: Spark > Issue Type: Task > Components: SparkR >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > > Deprecate SparkR > - Discussion: https://lists.apache.org/thread/qjgsgxklvpvyvbzsx1qr8o533j4zjlm5 > - Vote: https://lists.apache.org/thread/3c8qxks26kqflsjh0gtjo3nldk686vtq -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49347) Deprecate SparkR
[ https://issues.apache.org/jira/browse/SPARK-49347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-49347: - Priority: Critical (was: Major) > Deprecate SparkR > > > Key: SPARK-49347 > URL: https://issues.apache.org/jira/browse/SPARK-49347 > Project: Spark > Issue Type: Task > Components: SparkR >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Critical > > Deprecate SparkR > - Discussion: https://lists.apache.org/thread/qjgsgxklvpvyvbzsx1qr8o533j4zjlm5 > - Vote: https://lists.apache.org/thread/3c8qxks26kqflsjh0gtjo3nldk686vtq -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49347) Deprecate SparkR
Hyukjin Kwon created SPARK-49347: Summary: Deprecate SparkR Key: SPARK-49347 URL: https://issues.apache.org/jira/browse/SPARK-49347 Project: Spark Issue Type: Improvement Components: SparkR Affects Versions: 4.0.0 Reporter: Hyukjin Kwon Deprecate SparkR - Discussion: https://lists.apache.org/thread/qjgsgxklvpvyvbzsx1qr8o533j4zjlm5 - Vote: https://lists.apache.org/thread/3c8qxks26kqflsjh0gtjo3nldk686vtq -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49121) Support from_protobuf and to_protobuf for SQL functions
[ https://issues.apache.org/jira/browse/SPARK-49121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49121. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47716 [https://github.com/apache/spark/pull/47716] > Support from_protobuf and to_protobuf for SQL functions > --- > > Key: SPARK-49121 > URL: https://issues.apache.org/jira/browse/SPARK-49121 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Similar to SPARK-48545, we also want to support from_protobuf and to_protobuf > for SQL functions -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49121) Support from_protobuf and to_protobuf for SQL functions
[ https://issues.apache.org/jira/browse/SPARK-49121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49121: Assignee: Haejoon Lee > Support from_protobuf and to_protobuf for SQL functions > --- > > Key: SPARK-49121 > URL: https://issues.apache.org/jira/browse/SPARK-49121 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > > Similar to SPARK-48545, we also want to support from_protobuf and to_protobuf > for SQL functions -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49345) Make sure using the current running Spark Session
[ https://issues.apache.org/jira/browse/SPARK-49345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49345. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47838 [https://github.com/apache/spark/pull/47838] > Make sure using the current running Spark Session > - > > Key: SPARK-49345 > URL: https://issues.apache.org/jira/browse/SPARK-49345 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > If users use `SparkSession.active()` or `SQLConf.get`, then it might pick up > wrong configurations from a different Spark session. We should make sure the > configuration handling is done with the current running session. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49323) Move MockObserver from Spark Connect Server's test folder to the Server's main folder
[ https://issues.apache.org/jira/browse/SPARK-49323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49323. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47816 [https://github.com/apache/spark/pull/47816] > Move MockObserver from Spark Connect Server's test folder to the Server's > main folder > -- > > Key: SPARK-49323 > URL: https://issues.apache.org/jira/browse/SPARK-49323 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 4.0.0 >Reporter: Thang Long Vu >Assignee: Thang Long Vu >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49323) Move MockObserver from Spark Connect Server's test folder to the Server's main folder
[ https://issues.apache.org/jira/browse/SPARK-49323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49323: Assignee: Thang Long Vu > Move MockObserver from Spark Connect Server's test folder to the Server's > main folder > -- > > Key: SPARK-49323 > URL: https://issues.apache.org/jira/browse/SPARK-49323 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 4.0.0 >Reporter: Thang Long Vu >Assignee: Thang Long Vu >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49026) Create conversions Column API to Connect protos
[ https://issues.apache.org/jira/browse/SPARK-49026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49026. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47812 [https://github.com/apache/spark/pull/47812] > Create conversions Column API to Connect protos > --- > > Key: SPARK-49026 > URL: https://issues.apache.org/jira/browse/SPARK-49026 > Project: Spark > Issue Type: New Feature > Components: Connect, SQL >Affects Versions: 4.0.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Make sure we can translate the Column API to Connect Protos. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49274) Support java serialization in AgnosticEncoders
[ https://issues.apache.org/jira/browse/SPARK-49274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49274. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47806 [https://github.com/apache/spark/pull/47806] > Support java serialization in AgnosticEncoders > -- > > Key: SPARK-49274 > URL: https://issues.apache.org/jira/browse/SPARK-49274 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 4.0.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49269) Improve performance and memory footprint of all VALUES clauses
[ https://issues.apache.org/jira/browse/SPARK-49269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49269. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47791 [https://github.com/apache/spark/pull/47791] > Improve performance and memory footprint of all VALUES clauses > -- > > Key: SPARK-49269 > URL: https://issues.apache.org/jira/browse/SPARK-49269 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.4.4 >Reporter: Costas Zarifis >Assignee: Costas Zarifis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > This is a continuation of the following performance improvement: > https://issues.apache.org/jira/browse/SPARK-48967 > > By pushing the early evaluation of `UnresolvedInlineTable` into > `LocalRelation` (whenever possible) in `visitInlineTable` of the AstBuilder, we > can get the benefits of the aforementioned ticket not only in `INSERT INTO > ... VALUES` statements but in every statement that can contain the `VALUES()` > clause. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49269) Improve performance and memory footprint of all VALUES clauses
[ https://issues.apache.org/jira/browse/SPARK-49269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49269: Assignee: Costas Zarifis > Improve performance and memory footprint of all VALUES clauses > -- > > Key: SPARK-49269 > URL: https://issues.apache.org/jira/browse/SPARK-49269 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.4.4 >Reporter: Costas Zarifis >Assignee: Costas Zarifis >Priority: Major > Labels: pull-request-available > > This is a continuation of the following performance improvement: > https://issues.apache.org/jira/browse/SPARK-48967 > > By pushing the early evaluation of `UnresolvedInlineTable` into > `LocalRelation` (whenever possible) in `visitInlineTable` of the AstBuilder, we > can get the benefits of the aforementioned ticket not only in `INSERT INTO > ... VALUES` statements but in every statement that can contain the `VALUES()` > clause. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49263) Spark Connect python client: Consistently handle boolean Dataframe reader options
[ https://issues.apache.org/jira/browse/SPARK-49263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49263. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47790 [https://github.com/apache/spark/pull/47790] > Spark Connect python client: Consistently handle boolean Dataframe reader > options > - > > Key: SPARK-49263 > URL: https://issues.apache.org/jira/browse/SPARK-49263 > Project: Spark > Issue Type: Bug > Components: Connect >Affects Versions: 3.5.0 >Reporter: Juliusz Sompolski >Assignee: Juliusz Sompolski >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > The Python Connect client's `spark.read.option` should use `to_str`. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
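The ticket above names `to_str` but does not show it. The following is a minimal sketch of the kind of normalization such a helper could perform, assuming it lower-cases booleans and passes `None` through. It is not the code from the pull request, and `normalize_options` is a hypothetical helper added only for illustration.

```python
from typing import Any, Dict, Optional


def to_str(value: Any) -> Optional[str]:
    """Convert a reader option value to a string (assumed behavior, see lead-in)."""
    if isinstance(value, bool):
        # str(True) is "True" in Python; lower-casing yields "true"/"false".
        return str(value).lower()
    if value is None:
        # Leave unset options as None rather than the string "None".
        return None
    return str(value)


def normalize_options(options: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Hypothetical helper: apply to_str to every option value."""
    return {key: to_str(val) for key, val in options.items()}
```

Under this assumption, `option("header", True)` and `option("header", "true")` would both reach the server as the same string, which is the consistency the ticket title asks for.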
[jira] [Assigned] (SPARK-49263) Spark Connect python client: Consistently handle boolean Dataframe reader options
[ https://issues.apache.org/jira/browse/SPARK-49263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49263: Assignee: Juliusz Sompolski > Spark Connect python client: Consistently handle boolean Dataframe reader > options > - > > Key: SPARK-49263 > URL: https://issues.apache.org/jira/browse/SPARK-49263 > Project: Spark > Issue Type: Bug > Components: Connect >Affects Versions: 3.5.0 >Reporter: Juliusz Sompolski >Assignee: Juliusz Sompolski >Priority: Major > Labels: pull-request-available > > The Python Connect client's `spark.read.option` should use `to_str`. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49257) Make `bundler install` can retry
[ https://issues.apache.org/jira/browse/SPARK-49257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49257: Assignee: BingKun Pan > Make `bundler install` can retry > > > Key: SPARK-49257 > URL: https://issues.apache.org/jira/browse/SPARK-49257 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49257) Make `bundler install` can retry
[ https://issues.apache.org/jira/browse/SPARK-49257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49257. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47784 [https://github.com/apache/spark/pull/47784] > Make `bundler install` can retry > > > Key: SPARK-49257 > URL: https://issues.apache.org/jira/browse/SPARK-49257 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49260) Should not prepend the classes path of sql/core module in Spark Connect Shell
[ https://issues.apache.org/jira/browse/SPARK-49260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49260: Assignee: Yang Jie > Should not prepend the classes path of sql/core module in Spark Connect Shell > -- > > Key: SPARK-49260 > URL: https://issues.apache.org/jira/browse/SPARK-49260 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49260) Should not prepend the classes path of sql/core module in Spark Connect Shell
[ https://issues.apache.org/jira/browse/SPARK-49260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49260. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47786 [https://github.com/apache/spark/pull/47786] > Should not prepend the classes path of sql/core module in Spark Connect Shell > -- > > Key: SPARK-49260 > URL: https://issues.apache.org/jira/browse/SPARK-49260 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49022) Integrate Basic ColumnNode API in Column
[ https://issues.apache.org/jira/browse/SPARK-49022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49022. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47688 [https://github.com/apache/spark/pull/47688] > Integrate Basic ColumnNode API in Column > > > Key: SPARK-49022 > URL: https://issues.apache.org/jira/browse/SPARK-49022 > Project: Spark > Issue Type: New Feature > Components: Connect, SQL >Affects Versions: 4.0.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Everything except UDFs :) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-48966) Improve error message with invalid unresolved column reference in UDTF call
[ https://issues.apache.org/jira/browse/SPARK-48966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-48966. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47447 [https://github.com/apache/spark/pull/47447] > Improve error message with invalid unresolved column reference in UDTF call > --- > > Key: SPARK-48966 > URL: https://issues.apache.org/jira/browse/SPARK-48966 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Daniel >Assignee: Daniel >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > This bug covers improving an error message in the event of invalid UDTF > calls. For example: > {{select * from udtf(}} > {{ observed => TABLE(select column from t),}} > {{ value_col => classic_dollars}} > {{)}} > Currently we get: > {{Unsupported subquery expression: Table arguments are used in a function > where they are not supported:}} > {{'UnresolvedTableValuedFunction [udtf], observed => table-argument#68918 [], > value_col => 'classic_dollars, false}} > {{ +- Project ...}} > {{ +- SubqueryAlias ...}} > {{ +- Relation ...}} > But the real error is that the user passed column identifier classic_dollars > rather than string "classic_dollars" into the string argument. > The core reason is that {{CheckAnalysis}} checks the analyzer output tree for > unresolved nodes at the end rather than reporting the error at the time of > attempted resolving of the corresponding expression or operator (in this > case, the unresolved attribute reference resulting from the unquoted > {{{}classic_dollars{}}}). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-48966) Improve error message with invalid unresolved column reference in UDTF call
[ https://issues.apache.org/jira/browse/SPARK-48966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-48966: Assignee: Daniel > Improve error message with invalid unresolved column reference in UDTF call > --- > > Key: SPARK-48966 > URL: https://issues.apache.org/jira/browse/SPARK-48966 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Daniel >Assignee: Daniel >Priority: Major > Labels: pull-request-available > > This bug covers improving an error message in the event of invalid UDTF > calls. For example: > {{select * from udtf(}} > {{ observed => TABLE(select column from t),}} > {{ value_col => classic_dollars}} > {{)}} > Currently we get: > {{Unsupported subquery expression: Table arguments are used in a function > where they are not supported:}} > {{'UnresolvedTableValuedFunction [udtf], observed => table-argument#68918 [], > value_col => 'classic_dollars, false}} > {{ +- Project ...}} > {{ +- SubqueryAlias ...}} > {{ +- Relation ...}} > But the real error is that the user passed column identifier classic_dollars > rather than string "classic_dollars" into the string argument. > The core reason is that {{CheckAnalysis}} checks the analyzer output tree for > unresolved nodes at the end rather than reporting the error at the time of > attempted resolving of the corresponding expression or operator (in this > case, the unresolved attribute reference resulting from the unquoted > {{{}classic_dollars{}}}). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49221) sbt compilation warning: `Regular tasks always evaluate task dependencies (.value) regardless of if expressions`
[ https://issues.apache.org/jira/browse/SPARK-49221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49221. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47739 [https://github.com/apache/spark/pull/47739] > sbt compilation warning: `Regular tasks always evaluate task dependencies > (.value) regardless of if expressions` > > > Key: SPARK-49221 > URL: https://issues.apache.org/jira/browse/SPARK-49221 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > {code:java} > [warn] > /Users/yangjie01/SourceCode/git/spark-sbt/project/SparkBuild.scala:1554:77: > value lookup of `/` inside an `if` expression > [warn] > [warn] problem: `/.value` is inside an `if` expression of a regular task. > [warn] Regular tasks always evaluate task dependencies (`.value`) > regardless of `if` expressions. > [warn] solution: > [warn] 1. Use a conditional task `Def.taskIf(...)` to evaluate it when the > `if` predicate is true or false. > [warn] 2. Or turn the task body into a single `if` expression; the task is > then auto-converted to a conditional task. > [warn] 3. Or make the static evaluation explicit by declaring `/.value` > outside the `if` expression. > [warn] 4. If you still want to force the static lookup, you may annotate > the task lookup with `@sbtUnchecked`, e.g. `(/.value: @sbtUnchecked)`. > [warn] 5. Add `import sbt.dsl.LinterLevel.Ignore` to your build file to > disable all task linting. > [warn] > [warn] val replClasspathes = (LocalProject("connect-client-jvm") / > Compile / dependencyClasspath) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49221) sbt compilation warning: `Regular tasks always evaluate task dependencies (.value) regardless of if expressions`
[ https://issues.apache.org/jira/browse/SPARK-49221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49221: Assignee: Yang Jie > sbt compilation warning: `Regular tasks always evaluate task dependencies > (.value) regardless of if expressions` > > > Key: SPARK-49221 > URL: https://issues.apache.org/jira/browse/SPARK-49221 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > > {code:java} > [warn] > /Users/yangjie01/SourceCode/git/spark-sbt/project/SparkBuild.scala:1554:77: > value lookup of `/` inside an `if` expression > [warn] > [warn] problem: `/.value` is inside an `if` expression of a regular task. > [warn] Regular tasks always evaluate task dependencies (`.value`) > regardless of `if` expressions. > [warn] solution: > [warn] 1. Use a conditional task `Def.taskIf(...)` to evaluate it when the > `if` predicate is true or false. > [warn] 2. Or turn the task body into a single `if` expression; the task is > then auto-converted to a conditional task. > [warn] 3. Or make the static evaluation explicit by declaring `/.value` > outside the `if` expression. > [warn] 4. If you still want to force the static lookup, you may annotate > the task lookup with `@sbtUnchecked`, e.g. `(/.value: @sbtUnchecked)`. > [warn] 5. Add `import sbt.dsl.LinterLevel.Ignore` to your build file to > disable all task linting. > [warn] > [warn] val replClasspathes = (LocalProject("connect-client-jvm") / > Compile / dependencyClasspath) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49215) Document the NaN handling in df.na.drop
[ https://issues.apache.org/jira/browse/SPARK-49215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49215: Assignee: Ruifeng Zheng > Document the NaN handling in df.na.drop > --- > > Key: SPARK-49215 > URL: https://issues.apache.org/jira/browse/SPARK-49215 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49215) Document the NaN handling in df.na.drop
[ https://issues.apache.org/jira/browse/SPARK-49215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49215. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47731 [https://github.com/apache/spark/pull/47731] > Document the NaN handling in df.na.drop > --- > > Key: SPARK-49215 > URL: https://issues.apache.org/jira/browse/SPARK-49215 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49201) Reimplement hist plot with Spark SQL
[ https://issues.apache.org/jira/browse/SPARK-49201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49201. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47708 [https://github.com/apache/spark/pull/47708] > Reimplement hist plot with Spark SQL > > > Key: SPARK-49201 > URL: https://issues.apache.org/jira/browse/SPARK-49201 > Project: Spark > Issue Type: Sub-task > Components: Connect, PS >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49198) Prune more jars required for Spark Connect shell
[ https://issues.apache.org/jira/browse/SPARK-49198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49198. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47705 [https://github.com/apache/spark/pull/47705] > Prune more jars required for Spark Connect shell > > > Key: SPARK-49198 > URL: https://issues.apache.org/jira/browse/SPARK-49198 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49173) Change Spark Connect shell prompt from `@` to `scala>`
[ https://issues.apache.org/jira/browse/SPARK-49173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49173. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47679 [https://github.com/apache/spark/pull/47679] > Change Spark Connect shell prompt from `@` to `scala>` > -- > > Key: SPARK-49173 > URL: https://issues.apache.org/jira/browse/SPARK-49173 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Should match the prompt with Spark Classic for user experience -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49173) Change Spark Connect shell prompt from `@` to `scala>`
[ https://issues.apache.org/jira/browse/SPARK-49173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49173: Assignee: Hyukjin Kwon > Change Spark Connect shell prompt from `@` to `scala>` > -- > > Key: SPARK-49173 > URL: https://issues.apache.org/jira/browse/SPARK-49173 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > Should match the prompt with Spark Classic for user experience -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49194) Makes Spark scripts work with Spark connect
[ https://issues.apache.org/jira/browse/SPARK-49194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-49194: - Description: This is an umbrella ticket that includes tasks making Spark scripts such as bin/pyspark, bin/spark-shell and bin/spark-submit work properly with Spark Connect. > Makes Spark scripts work with Spark connect > --- > > Key: SPARK-49194 > URL: https://issues.apache.org/jira/browse/SPARK-49194 > Project: Spark > Issue Type: Umbrella > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > > This is an umbrella ticket that includes tasks making Spark scripts such as > bin/pyspark, bin/spark-shell and bin/spark-submit work properly with Spark > Connect. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49198) Prune more jars required for Spark Connect shell
Hyukjin Kwon created SPARK-49198: Summary: Prune more jars required for Spark Connect shell Key: SPARK-49198 URL: https://issues.apache.org/jira/browse/SPARK-49198 Project: Spark Issue Type: Sub-task Components: Connect Affects Versions: 4.0.0 Reporter: Hyukjin Kwon -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49195) Embed script level parsing logic into SparkSubmitCommandBuilder
Hyukjin Kwon created SPARK-49195: Summary: Embed script level parsing logic into SparkSubmitCommandBuilder Key: SPARK-49195 URL: https://issues.apache.org/jira/browse/SPARK-49195 Project: Spark Issue Type: Sub-task Components: Connect Affects Versions: 4.0.0 Reporter: Hyukjin Kwon Move the logic currently in the scripts into the JVM; see https://github.com/apache/spark/pull/47402 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49173) Change Spark Connect shell prompt from `@` to `scala>`
[ https://issues.apache.org/jira/browse/SPARK-49173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-49173: - Parent: SPARK-49194 Issue Type: Sub-task (was: Improvement) > Change Spark Connect shell prompt from `@` to `scala>` > -- > > Key: SPARK-49173 > URL: https://issues.apache.org/jira/browse/SPARK-49173 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > > Should match the prompt with Spark Classic for user experience -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-48960) Makes spark-submit works with Spark connect
[ https://issues.apache.org/jira/browse/SPARK-48960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-48960: - Parent: SPARK-49194 Issue Type: Sub-task (was: Improvement) > Makes spark-submit works with Spark connect > --- > > Key: SPARK-48960 > URL: https://issues.apache.org/jira/browse/SPARK-48960 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > > Similar to SPARK-48936. We should make Spark Submit work with Spark Connect. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-48936) Makes spark-shell works with Spark connect
[ https://issues.apache.org/jira/browse/SPARK-48936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-48936: - Parent: SPARK-49194 Issue Type: Sub-task (was: Improvement) > Makes spark-shell works with Spark connect > -- > > Key: SPARK-48936 > URL: https://issues.apache.org/jira/browse/SPARK-48936 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > `bin/pyspark --remote` works but `bin/spark-shell --remote` does not. We > should make it work. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49171) Update Spark Shell documentation with Spark Connect
[ https://issues.apache.org/jira/browse/SPARK-49171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-49171: - Summary: Update Spark Shell documentation with Spark Connect (was: Update Spark Shell with Spark Connect) > Update Spark Shell documentation with Spark Connect > --- > > Key: SPARK-49171 > URL: https://issues.apache.org/jira/browse/SPARK-49171 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Documentation update by SPARK-48936 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49171) Update Spark Shell documentation with Spark Connect
[ https://issues.apache.org/jira/browse/SPARK-49171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-49171: - Parent: SPARK-49194 Issue Type: Sub-task (was: Improvement) > Update Spark Shell documentation with Spark Connect > --- > > Key: SPARK-49171 > URL: https://issues.apache.org/jira/browse/SPARK-49171 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Documentation update by SPARK-48936 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49193) Improve the performance of RowSetUtils.toColumnBasedSet
[ https://issues.apache.org/jira/browse/SPARK-49193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49193. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47699 [https://github.com/apache/spark/pull/47699] > Improve the performance of RowSetUtils.toColumnBasedSet > > > Key: SPARK-49193 > URL: https://issues.apache.org/jira/browse/SPARK-49193 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yuming Wang >Assignee: Yuming Wang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-48936) Makes spark-shell works with Spark connect
[ https://issues.apache.org/jira/browse/SPARK-48936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-48936. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47402 [https://github.com/apache/spark/pull/47402] > Makes spark-shell works with Spark connect > -- > > Key: SPARK-48936 > URL: https://issues.apache.org/jira/browse/SPARK-48936 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > `bin/pyspark --remote` works but `bin/spark-shell --remote` does not. We > should make it work. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-48936) Makes spark-shell works with Spark connect
[ https://issues.apache.org/jira/browse/SPARK-48936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-48936: Assignee: Hyukjin Kwon > Makes spark-shell works with Spark connect > -- > > Key: SPARK-48936 > URL: https://issues.apache.org/jira/browse/SPARK-48936 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > `bin/pyspark --remote` works but `bin/spark-shell --remote` does not work. We > should make it working.
[jira] [Assigned] (SPARK-49177) Make GA's `build_error_docs` run only once
[ https://issues.apache.org/jira/browse/SPARK-49177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-49177: Assignee: BingKun Pan > Make GA's `build_error_docs` run only once > -- > > Key: SPARK-49177 > URL: https://issues.apache.org/jira/browse/SPARK-49177 > Project: Spark > Issue Type: Improvement > Components: Documentation, Project Infra >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Critical > Labels: pull-request-available
[jira] [Resolved] (SPARK-49177) Make GA's `build_error_docs` run only once
[ https://issues.apache.org/jira/browse/SPARK-49177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-49177. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47682 [https://github.com/apache/spark/pull/47682] > Make GA's `build_error_docs` run only once > -- > > Key: SPARK-49177 > URL: https://issues.apache.org/jira/browse/SPARK-49177 > Project: Spark > Issue Type: Improvement > Components: Documentation, Project Infra >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Critical > Labels: pull-request-available > Fix For: 4.0.0