[jira] [Updated] (SPARK-47896) Upgrade netty to `4.1.109.Final`
[ https://issues.apache.org/jira/browse/SPARK-47896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-47896: -- Parent: SPARK-47046 Issue Type: Sub-task (was: Improvement) > Upgrade netty to `4.1.109.Final` > > > Key: SPARK-47896 > URL: https://issues.apache.org/jira/browse/SPARK-47896 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47896) Upgrade netty to `4.1.109.Final`
[ https://issues.apache.org/jira/browse/SPARK-47896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-47896: - Assignee: BingKun Pan > Upgrade netty to `4.1.109.Final` > > > Key: SPARK-47896 > URL: https://issues.apache.org/jira/browse/SPARK-47896 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47896) Upgrade netty to `4.1.109.Final`
[ https://issues.apache.org/jira/browse/SPARK-47896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47896. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46112 [https://github.com/apache/spark/pull/46112] > Upgrade netty to `4.1.109.Final` > > > Key: SPARK-47896 > URL: https://issues.apache.org/jira/browse/SPARK-47896 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47896) Upgrade netty to `4.1.109.Final`
[ https://issues.apache.org/jira/browse/SPARK-47896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47896: --- Labels: pull-request-available (was: ) > Upgrade netty to `4.1.109.Final` > > > Key: SPARK-47896 > URL: https://issues.apache.org/jira/browse/SPARK-47896 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-47172) Upgrade Transport block cipher mode to GCM
[ https://issues.apache.org/jira/browse/SPARK-47172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17838474#comment-17838474 ]

Mridul Muralidharan edited comment on SPARK-47172 at 4/18/24 5:47 AM:
--
We do not backport features to released versions, so TLS will be in 4.x, not 3.x.
Given the security implications of SPARK-47318, it was backported to 3.4 and 3.5, as it was fixing a security issue in existing functionality.
This proposal reads like new feature development, which would typically be out of scope for 3.x.
Given TLS, it would not be very useful for 4.x either?

was (Author: mridulm80):
We do not backport features to released versions - so TLS will be in 4.x, not 3.x
Given the security implications for SPARK-47318, it was backported to 3.4 and 3.5 - as it was fixing a security issue in existing functionality.
This proposal reads like a new feature development, while would be out of scope for 3.x
Given TLS, not very useful for 4.x either ?

> Upgrade Transport block cipher mode to GCM
> --
>
> Key: SPARK-47172
> URL: https://issues.apache.org/jira/browse/SPARK-47172
> Project: Spark
> Issue Type: Improvement
> Components: Security
> Affects Versions: 3.4.2, 3.5.0
> Reporter: Steve Weis
> Priority: Minor
>
> The cipher transformation currently used for encrypting RPC calls is an
> unauthenticated mode (AES/CTR/NoPadding). This needs to be upgraded to an
> authenticated mode (AES/GCM/NoPadding) to prevent ciphertext from being
> modified in transit.
> The relevant line is here:
> [https://github.com/apache/spark/blob/a939a7d0fd9c6b23c879cbee05275c6fbc939e38/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java#L220]
> GCM is relatively more computationally expensive than CTR and adds a 16-byte
> block of authentication tag data to each payload.
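To make the trade-off in the SPARK-47172 description concrete, here is a minimal Java sketch of authenticated encryption with the standard JCA transformation `AES/GCM/NoPadding`. It is not Spark's transport code: the class name `GcmSketch`, the `IV || ciphertext || tag` framing, and the random 12-byte IV are assumptions made for illustration only. It shows the 16-byte tag overhead per payload and that a modified ciphertext is rejected on decryption.

```java
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

// Hypothetical illustration; not taken from Spark's TransportConf or cipher classes.
final class GcmSketch {
    static final int IV_BYTES = 12;   // 96-bit IV, the size commonly used with GCM
    static final int TAG_BITS = 128;  // 16-byte authentication tag

    // Returns IV || ciphertext || tag (framing chosen for this sketch).
    static byte[] encrypt(byte[] key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv); // a fresh IV per message is required for GCM
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(TAG_BITS, iv));
        byte[] body = cipher.doFinal(plaintext); // ciphertext with the tag appended
        byte[] out = new byte[IV_BYTES + body.length];
        System.arraycopy(iv, 0, out, 0, IV_BYTES);
        System.arraycopy(body, 0, out, IV_BYTES, body.length);
        return out;
    }

    // Throws AEADBadTagException if the message was modified in transit.
    static byte[] decrypt(byte[] key, byte[] message) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(TAG_BITS, message, 0, IV_BYTES));
        return cipher.doFinal(message, IV_BYTES, message.length - IV_BYTES);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16];
        new SecureRandom().nextBytes(key);
        byte[] msg = "rpc payload".getBytes(StandardCharsets.UTF_8);

        byte[] sealed = encrypt(key, msg);
        // Overhead beyond the IV and plaintext is the authentication tag.
        System.out.println(sealed.length - IV_BYTES - msg.length); // prints 16

        sealed[IV_BYTES] ^= 1; // flip one ciphertext bit, as an attacker could with CTR
        try {
            decrypt(key, sealed);
            System.out.println("accepted");
        } catch (AEADBadTagException e) {
            System.out.println("rejected"); // GCM detects the modification
        }
    }
}
```

With an unauthenticated mode such as CTR, the same bit flip would decrypt without error to a silently altered plaintext, which is the weakness the issue describes.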
[jira] [Resolved] (SPARK-47591) Hive-thriftserver: Migrate logInfo with variables to structured logging framework
[ https://issues.apache.org/jira/browse/SPARK-47591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-47591. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45926 [https://github.com/apache/spark/pull/45926] > Hive-thriftserver: Migrate logInfo with variables to structured logging > framework > - > > Key: SPARK-47591 > URL: https://issues.apache.org/jira/browse/SPARK-47591 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Gengliang Wang >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47896) Upgrade netty to `4.1.109.Final`
BingKun Pan created SPARK-47896: --- Summary: Upgrade netty to `4.1.109.Final` Key: SPARK-47896 URL: https://issues.apache.org/jira/browse/SPARK-47896 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 4.0.0 Reporter: BingKun Pan -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-47172) Upgrade Transport block cipher mode to GCM
[ https://issues.apache.org/jira/browse/SPARK-47172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17838474#comment-17838474 ]

Mridul Muralidharan commented on SPARK-47172:
--
We do not backport features to released versions, so TLS will be in 4.x, not 3.x.
Given the security implications of SPARK-47318, it was backported to 3.4 and 3.5, as it was fixing a security issue in existing functionality.
This proposal reads like new feature development, which would be out of scope for 3.x.
Given TLS, not very useful for 4.x either?

> Upgrade Transport block cipher mode to GCM
> --
>
> Key: SPARK-47172
> URL: https://issues.apache.org/jira/browse/SPARK-47172
> Project: Spark
> Issue Type: Improvement
> Components: Security
> Affects Versions: 3.4.2, 3.5.0
> Reporter: Steve Weis
> Priority: Minor
>
> The cipher transformation currently used for encrypting RPC calls is an
> unauthenticated mode (AES/CTR/NoPadding). This needs to be upgraded to an
> authenticated mode (AES/GCM/NoPadding) to prevent ciphertext from being
> modified in transit.
> The relevant line is here:
> [https://github.com/apache/spark/blob/a939a7d0fd9c6b23c879cbee05275c6fbc939e38/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java#L220]
> GCM is relatively more computationally expensive than CTR and adds a 16-byte
> block of authentication tag data to each payload.
[jira] [Updated] (SPARK-47895) group by all should be idempotent
[ https://issues.apache.org/jira/browse/SPARK-47895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47895: --- Labels: pull-request-available (was: ) > group by all should be idempotent > - > > Key: SPARK-47895 > URL: https://issues.apache.org/jira/browse/SPARK-47895 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0 >Reporter: Wenchen Fan >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47895) group by all should be idempotent
[ https://issues.apache.org/jira/browse/SPARK-47895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-47895: Summary: group by all should be idempotent (was: group by ordinal should be idempotent) > group by all should be idempotent > - > > Key: SPARK-47895 > URL: https://issues.apache.org/jira/browse/SPARK-47895 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0 >Reporter: Wenchen Fan >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47895) group by ordinal should be idempotent
Wenchen Fan created SPARK-47895: --- Summary: group by ordinal should be idempotent Key: SPARK-47895 URL: https://issues.apache.org/jira/browse/SPARK-47895 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.4.0 Reporter: Wenchen Fan -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47882) createTableColumnTypes need to be mapped to database types instead of using directly
[ https://issues.apache.org/jira/browse/SPARK-47882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-47882: - Assignee: Kent Yao > createTableColumnTypes need to be mapped to database types instead of using > directly > > > Key: SPARK-47882 > URL: https://issues.apache.org/jira/browse/SPARK-47882 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.2, 4.0.0, 3.5.1 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47882) createTableColumnTypes need to be mapped to database types instead of using directly
[ https://issues.apache.org/jira/browse/SPARK-47882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47882. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46093 [https://github.com/apache/spark/pull/46093] > createTableColumnTypes need to be mapped to database types instead of using > directly > > > Key: SPARK-47882 > URL: https://issues.apache.org/jira/browse/SPARK-47882 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.2, 4.0.0, 3.5.1 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47894) Add `Environment` page to Master UI
[ https://issues.apache.org/jira/browse/SPARK-47894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47894. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46111 [https://github.com/apache/spark/pull/46111] > Add `Environment` page to Master UI > --- > > Key: SPARK-47894 > URL: https://issues.apache.org/jira/browse/SPARK-47894 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, Web UI >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47894) Add `Environment` page to Master UI
[ https://issues.apache.org/jira/browse/SPARK-47894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-47894: - Assignee: Dongjoon Hyun > Add `Environment` page to Master UI > --- > > Key: SPARK-47894 > URL: https://issues.apache.org/jira/browse/SPARK-47894 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, Web UI >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47839) Fix Aggregate bug in RewriteWithExpression
[ https://issues.apache.org/jira/browse/SPARK-47839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan reassigned SPARK-47839:
--
Assignee: Kelvin Jiang

> Fix Aggregate bug in RewriteWithExpression
> --
>
> Key: SPARK-47839
> URL: https://issues.apache.org/jira/browse/SPARK-47839
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Kelvin Jiang
> Assignee: Kelvin Jiang
> Priority: Major
> Labels: pull-request-available
>
> The following query will fail:
> {code:SQL}
> SELECT NULLIF(id + 1, 1)
> from range(10)
> group by id
> {code}
> This is because {{NullIf}} gets rewritten to {{With}}, then
> {{RewriteWithExpression}} tries to pull common expression {{id + 1}} out of
> the aggregate, resulting in an invalid plan.
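For readers unfamiliar with the query in SPARK-47839, the following Java sketch models what `SELECT NULLIF(id + 1, 1) FROM range(10) GROUP BY id` is expected to return once the plan is valid. It is only a model of the semantics, not Spark's optimizer: the class `NullIfSketch` and its methods are hypothetical, and the local variable `common` merely stands in for the shared expression `id + 1` that, according to the description, gets pulled out as a common expression.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the query's semantics; not Spark code.
final class NullIfSketch {
    // NULLIF(a, b): null when a equals b, otherwise a.
    static Integer nullIf(Integer a, Integer b) {
        return (a != null && a.equals(b)) ? null : a;
    }

    // Models SELECT NULLIF(id + 1, 1) FROM range(n) GROUP BY id.
    // Every id in range(n) is distinct, so grouping by id keeps one row per id.
    static List<Integer> run(int n) {
        List<Integer> out = new ArrayList<>();
        for (int id = 0; id < n; id++) {
            Integer common = id + 1; // evaluated once and reused, like a common expression
            out.add(nullIf(common, 1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Row order is fixed here for readability; SQL does not guarantee it.
        System.out.println(run(10)); // prints [null, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    }
}
```

The value `id + 1` is needed both for the comparison and as the result, which is presumably why it qualifies as a common expression in the first place.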
[jira] [Resolved] (SPARK-47839) Fix Aggregate bug in RewriteWithExpression
[ https://issues.apache.org/jira/browse/SPARK-47839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-47839.
--
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 46034
[https://github.com/apache/spark/pull/46034]

> Fix Aggregate bug in RewriteWithExpression
> --
>
> Key: SPARK-47839
> URL: https://issues.apache.org/jira/browse/SPARK-47839
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Kelvin Jiang
> Assignee: Kelvin Jiang
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> The following query will fail:
> {code:SQL}
> SELECT NULLIF(id + 1, 1)
> from range(10)
> group by id
> {code}
> This is because {{NullIf}} gets rewritten to {{With}}, then
> {{RewriteWithExpression}} tries to pull common expression {{id + 1}} out of
> the aggregate, resulting in an invalid plan.
[jira] [Resolved] (SPARK-47846) Add support for Variant schema in from_json
[ https://issues.apache.org/jira/browse/SPARK-47846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-47846. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46046 [https://github.com/apache/spark/pull/46046] > Add support for Variant schema in from_json > --- > > Key: SPARK-47846 > URL: https://issues.apache.org/jira/browse/SPARK-47846 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Harsh Motwani >Assignee: Harsh Motwani >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Adding support for the variant type in the from_json expression. > "select from_json('', 'variant')" should interpret json_string > as a variant type. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47846) Add support for Variant schema in from_json
[ https://issues.apache.org/jira/browse/SPARK-47846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-47846: --- Assignee: Harsh Motwani > Add support for Variant schema in from_json > --- > > Key: SPARK-47846 > URL: https://issues.apache.org/jira/browse/SPARK-47846 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Harsh Motwani >Assignee: Harsh Motwani >Priority: Major > Labels: pull-request-available > > Adding support for the variant type in the from_json expression. > "select from_json('', 'variant')" should interpret json_string > as a variant type. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-47429) Rename errorClass to errorCondition
[ https://issues.apache.org/jira/browse/SPARK-47429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17838412#comment-17838412 ]

BingKun Pan edited comment on SPARK-47429 at 4/18/24 1:26 AM:
--
This will be a huge task; I roughly counted 4k+ places where the variable `errorClass` is used.
!image-2024-04-18-09-26-04-493.png|width=543,height=32!
But making the terminology consistent with the SQL standard is really great!

was (Author: panbingkun):
This will be a huge task; I roughly counted 4k+ places where the variable `errorClass` is used.
!image-2024-04-18-09-26-04-493.png|width=543,height=32!
But making the terminology consistent with the SQL standard is really great!

> Rename errorClass to errorCondition
> --
>
> Key: SPARK-47429
> URL: https://issues.apache.org/jira/browse/SPARK-47429
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Nicholas Chammas
> Priority: Minor
> Attachments: image-2024-04-18-09-26-04-493.png
>
> We've agreed on the parent task to rename {{errorClass}} to align it more
> closely with the SQL standard, and take advantage of the opportunity to break
> backwards compatibility offered by the Spark version change from 3.5 to 4.0.
> This ticket also covers renaming {{subClass}}.
> This is a subtask so the changes are in their own PR and easier to review
> apart from other things.
[jira] [Comment Edited] (SPARK-47429) Rename errorClass to errorCondition
[ https://issues.apache.org/jira/browse/SPARK-47429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17838412#comment-17838412 ]

BingKun Pan edited comment on SPARK-47429 at 4/18/24 1:26 AM:
--
This will be a huge task; I roughly counted 4k+ places where the variable `errorClass` is used.
!image-2024-04-18-09-26-04-493.png|width=543,height=32!
But making the terminology consistent with the SQL standard is really great!

was (Author: panbingkun):
This will be a huge task; I roughly counted 4k+ places where the variable `errorClass` is used.
!image-2024-04-18-09-22-20-736.png|width=680,height=42!
But making the terminology consistent with the SQL standard is really great!

> Rename errorClass to errorCondition
> --
>
> Key: SPARK-47429
> URL: https://issues.apache.org/jira/browse/SPARK-47429
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Nicholas Chammas
> Priority: Minor
> Attachments: image-2024-04-18-09-26-04-493.png
>
> We've agreed on the parent task to rename {{errorClass}} to align it more
> closely with the SQL standard, and take advantage of the opportunity to break
> backwards compatibility offered by the Spark version change from 3.5 to 4.0.
> This ticket also covers renaming {{subClass}}.
> This is a subtask so the changes are in their own PR and easier to review
> apart from other things.
[jira] [Commented] (SPARK-47429) Rename errorClass to errorCondition
[ https://issues.apache.org/jira/browse/SPARK-47429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17838412#comment-17838412 ]

BingKun Pan commented on SPARK-47429:
--
This will be a huge task; I roughly counted 4k+ places where the variable `errorClass` is used.
!image-2024-04-18-09-22-20-736.png|width=680,height=42!
But making the terminology consistent with the SQL standard is really great!

> Rename errorClass to errorCondition
> --
>
> Key: SPARK-47429
> URL: https://issues.apache.org/jira/browse/SPARK-47429
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Nicholas Chammas
> Priority: Minor
>
> We've agreed on the parent task to rename {{errorClass}} to align it more
> closely with the SQL standard, and take advantage of the opportunity to break
> backwards compatibility offered by the Spark version change from 3.5 to 4.0.
> This ticket also covers renaming {{subClass}}.
> This is a subtask so the changes are in their own PR and easier to review
> apart from other things.
[jira] [Resolved] (SPARK-47891) Improve docstring of mapInPandas
[ https://issues.apache.org/jira/browse/SPARK-47891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-47891.
--
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 46108
[https://github.com/apache/spark/pull/46108]

> Improve docstring of mapInPandas
> --
>
> Key: SPARK-47891
> URL: https://issues.apache.org/jira/browse/SPARK-47891
> Project: Spark
> Issue Type: Sub-task
> Components: Documentation, PySpark
> Affects Versions: 4.0.0
> Reporter: Xinrong Meng
> Assignee: Xinrong Meng
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Improve docstring of mapInPandas
> * "using a Python native function that takes and outputs a pandas DataFrame"
> is confusing because the function takes and outputs an "ITERATOR of pandas
> DataFrames" instead.
> * "All columns are passed together as an iterator of pandas DataFrames"
> can easily mislead users into thinking the entire DataFrame will be passed
> together; "a batch of rows" is used instead.
[jira] [Updated] (SPARK-47894) Add `Environment` page to Master UI
[ https://issues.apache.org/jira/browse/SPARK-47894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47894: --- Labels: pull-request-available (was: ) > Add `Environment` page to Master UI > --- > > Key: SPARK-47894 > URL: https://issues.apache.org/jira/browse/SPARK-47894 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, Web UI >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47894) Add `Environment` page to Master UI
Dongjoon Hyun created SPARK-47894: - Summary: Add `Environment` page to Master UI Key: SPARK-47894 URL: https://issues.apache.org/jira/browse/SPARK-47894 Project: Spark Issue Type: Sub-task Components: Spark Core, Web UI Affects Versions: 4.0.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47892) XML: Stop ignoring CDATA within rows.
Yousof Hosny created SPARK-47892:
--
Summary: XML: Stop ignoring CDATA within rows.
Key: SPARK-47892
URL: https://issues.apache.org/jira/browse/SPARK-47892
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.0.0
Reporter: Yousof Hosny
Fix For: 4.0.0

This change ignores CDATA within row tags as well as outside of them. We should only ignore CDATA found outside of row tags, since CDATA within a row is considered data within the row.
[https://github.com/apache/spark/pull/45487]

NOTE: With the current parser implementation, once CDATA elements within row tags are no longer ignored, there remains the edge case of a matching closing row tag within CDATA, which will be parsed as a valid end tag. Example:
{code:java} {code}
Once CDATA within rows is no longer ignored, the closing tag in the example above will be matched by the parser, which is incorrect.
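The edge case in the NOTE above can be illustrated with a small, self-contained Java sketch. Everything in it is hypothetical: the row tag name `ROW`, the sample document, and both scanning functions are assumptions for illustration and do not reproduce Spark's XML parser or the example that was elided from the ticket. The sketch only contrasts a naive search for the closing row tag with a search that skips over CDATA sections.

```java
// Hypothetical illustration of the edge case; not Spark's XML parser.
final class CdataEndTagSketch {
    // Naive search: returns the first occurrence of the end tag, even inside CDATA.
    static int naiveEnd(String xml, String endTag) {
        return xml.indexOf(endTag);
    }

    // CDATA-aware search: skips <![CDATA[ ... ]]> sections while scanning.
    static int cdataAwareEnd(String xml, String endTag) {
        int pos = 0;
        while (pos < xml.length()) {
            if (xml.startsWith("<![CDATA[", pos)) {
                int close = xml.indexOf("]]>", pos);
                if (close < 0) {
                    return -1; // unterminated CDATA section
                }
                pos = close + 3; // continue right after "]]>"
            } else if (xml.startsWith(endTag, pos)) {
                return pos;
            } else {
                pos++;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumed sample row: the CDATA text itself contains "</ROW>".
        String xml = "<ROW><a><![CDATA[x</ROW>y]]></a></ROW>";
        System.out.println(naiveEnd(xml, "</ROW>"));      // prints 18 (inside the CDATA section)
        System.out.println(cdataAwareEnd(xml, "</ROW>")); // prints 32 (the real end of the row)
    }
}
```

A scanner that behaves like `naiveEnd` would cut the row short at the tag embedded in the CDATA text, which is the incorrect match the ticket warns about.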
[jira] [Updated] (SPARK-47891) Improve docstring of mapInPandas
[ https://issues.apache.org/jira/browse/SPARK-47891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xinrong Meng updated SPARK-47891:
--
Description:
Improve docstring of mapInPandas
* "using a Python native function that takes and outputs a pandas DataFrame" is confusing because the function takes and outputs an "ITERATOR of pandas DataFrames" instead.
* "All columns are passed together as an iterator of pandas DataFrames" can easily mislead users into thinking the entire DataFrame will be passed together; "a batch of rows" is used instead.

was: Improve docstring of mapInPandas

> Improve docstring of mapInPandas
> --
>
> Key: SPARK-47891
> URL: https://issues.apache.org/jira/browse/SPARK-47891
> Project: Spark
> Issue Type: Sub-task
> Components: Documentation, PySpark
> Affects Versions: 4.0.0
> Reporter: Xinrong Meng
> Priority: Major
> Labels: pull-request-available
>
> Improve docstring of mapInPandas
> * "using a Python native function that takes and outputs a pandas DataFrame"
> is confusing because the function takes and outputs an "ITERATOR of pandas
> DataFrames" instead.
> * "All columns are passed together as an iterator of pandas DataFrames"
> can easily mislead users into thinking the entire DataFrame will be passed
> together; "a batch of rows" is used instead.
[jira] [Updated] (SPARK-47891) Improve docstring of mapInPandas
[ https://issues.apache.org/jira/browse/SPARK-47891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47891: --- Labels: pull-request-available (was: ) > Improve docstring of mapInPandas > > > Key: SPARK-47891 > URL: https://issues.apache.org/jira/browse/SPARK-47891 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Xinrong Meng >Priority: Major > Labels: pull-request-available > > Improve docstring of mapInPandas -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-47172) Upgrade Transport block cipher mode to GCM
[ https://issues.apache.org/jira/browse/SPARK-47172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17838380#comment-17838380 ] Steve Weis commented on SPARK-47172: [~mridulm80] What about 3.x? If we backported TLS support, that would be a better option. I mentioned this before and it sounded like there was not support for backporting TLS at this time. > Upgrade Transport block cipher mode to GCM > -- > > Key: SPARK-47172 > URL: https://issues.apache.org/jira/browse/SPARK-47172 > Project: Spark > Issue Type: Improvement > Components: Security >Affects Versions: 3.4.2, 3.5.0 >Reporter: Steve Weis >Priority: Minor > > The cipher transformation currently used for encrypting RPC calls is an > unauthenticated mode (AES/CTR/NoPadding). This needs to be upgraded to an > authenticated mode (AES/GCM/NoPadding) to prevent ciphertext from being > modified in transit. > The relevant line is here: > [https://github.com/apache/spark/blob/a939a7d0fd9c6b23c879cbee05275c6fbc939e38/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java#L220] > GCM is relatively more computationally expensive than CTR and adds a 16-byte > block of authentication tag data to each payload. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47889) Setup gradle as build tool for operator repository
[ https://issues.apache.org/jira/browse/SPARK-47889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47889: --- Labels: pull-request-available (was: ) > Setup gradle as build tool for operator repository > -- > > Key: SPARK-47889 > URL: https://issues.apache.org/jira/browse/SPARK-47889 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 >Reporter: Zhou JIANG >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47889) Setup gradle as build tool for operator repository
Zhou JIANG created SPARK-47889: -- Summary: Setup gradle as build tool for operator repository Key: SPARK-47889 URL: https://issues.apache.org/jira/browse/SPARK-47889 Project: Spark Issue Type: Sub-task Components: Kubernetes Affects Versions: kubernetes-operator-0.1.0 Reporter: Zhou JIANG -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47584) SQL core: Migrate logWarn with variables to structured logging framework
[ https://issues.apache.org/jira/browse/SPARK-47584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-47584. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46057 [https://github.com/apache/spark/pull/46057] > SQL core: Migrate logWarn with variables to structured logging framework > > > Key: SPARK-47584 > URL: https://issues.apache.org/jira/browse/SPARK-47584 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Gengliang Wang >Assignee: BingKun Pan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47627) MERGE with WITH SCHEMA EVOLUTION keywords
[ https://issues.apache.org/jira/browse/SPARK-47627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-47627. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45748 [https://github.com/apache/spark/pull/45748] > MERGE with WITH SCHEMA EVOLUTION keywords > - > > Key: SPARK-47627 > URL: https://issues.apache.org/jira/browse/SPARK-47627 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 4.0.0 >Reporter: Pengfei Xu >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47360) Overlay, FormatString, Length, BitLength, OctetLength, SoundEx, Luhncheck (all collations)
[ https://issues.apache.org/jira/browse/SPARK-47360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-47360. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46003 [https://github.com/apache/spark/pull/46003] > Overlay, FormatString, Length, BitLength, OctetLength, SoundEx, Luhncheck > (all collations) > -- > > Key: SPARK-47360 > URL: https://issues.apache.org/jira/browse/SPARK-47360 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Assignee: Nikola Mandic >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47726) Document push-based shuffle metrics
[ https://issues.apache.org/jira/browse/SPARK-47726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47726. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45872 [https://github.com/apache/spark/pull/45872] > Document push-based shuffle metrics > --- > > Key: SPARK-47726 > URL: https://issues.apache.org/jira/browse/SPARK-47726 > Project: Spark > Issue Type: Documentation > Components: Documentation >Affects Versions: 3.4.2, 4.0.0, 3.5.1 >Reporter: Luca Canali >Assignee: Luca Canali >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > > This is to add documentation for the metrics related to push-based shuffle. > It's a follow up documentation ticket from: > https://issues.apache.org/jira/browse/SPARK-36620 > Related to this, note also: https://issues.apache.org/jira/browse/SPARK-42203 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47726) Document push-based shuffle metrics
[ https://issues.apache.org/jira/browse/SPARK-47726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-47726: - Assignee: Luca Canali > Document push-based shuffle metrics > --- > > Key: SPARK-47726 > URL: https://issues.apache.org/jira/browse/SPARK-47726 > Project: Spark > Issue Type: Documentation > Components: Documentation >Affects Versions: 3.4.2, 4.0.0, 3.5.1 >Reporter: Luca Canali >Assignee: Luca Canali >Priority: Minor > Labels: pull-request-available > > This is to add documentation for the metrics related to push-based shuffle. > It's a follow up documentation ticket from: > https://issues.apache.org/jira/browse/SPARK-36620 > Related to this, note also: https://issues.apache.org/jira/browse/SPARK-42203 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47416) Add benchmark for stringpredicate expressions
[ https://issues.apache.org/jira/browse/SPARK-47416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-47416. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46078 [https://github.com/apache/spark/pull/46078] > Add benchmark for stringpredicate expressions > - > > Key: SPARK-47416 > URL: https://issues.apache.org/jira/browse/SPARK-47416 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Assignee: Uroš Bojanić >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47886) Postgres: Add test and doc for Postgres special numeric values
[ https://issues.apache.org/jira/browse/SPARK-47886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47886. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46102 [https://github.com/apache/spark/pull/46102] > Postgres: Add test and doc for Postgres special numeric values > -- > > Key: SPARK-47886 > URL: https://issues.apache.org/jira/browse/SPARK-47886 > Project: Spark > Issue Type: Sub-task > Components: Documentation, SQL, Tests >Affects Versions: 4.0.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47887) Remove unused import `spark/connect/common.proto` from `spark/connect/relations.proto`
[ https://issues.apache.org/jira/browse/SPARK-47887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47887: --- Labels: pull-request-available (was: ) > Remove unused import `spark/connect/common.proto` from > `spark/connect/relations.proto` > -- > > Key: SPARK-47887 > URL: https://issues.apache.org/jira/browse/SPARK-47887 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > > Fix compile warning: > > {code:java} > spark/connect/relations.proto:26:1: warning: Import > spark/connect/common.proto is unused. {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47887) Remove unused import `spark/connect/common.proto` from `spark/connect/relations.proto`
Yang Jie created SPARK-47887: Summary: Remove unused import `spark/connect/common.proto` from `spark/connect/relations.proto` Key: SPARK-47887 URL: https://issues.apache.org/jira/browse/SPARK-47887 Project: Spark Issue Type: Improvement Components: Connect Affects Versions: 4.0.0 Reporter: Yang Jie Fix compile warning: {code:java} spark/connect/relations.proto:26:1: warning: Import spark/connect/common.proto is unused. {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47830) Reeanble ResourceProfileTests for pyspark-connect
[ https://issues.apache.org/jira/browse/SPARK-47830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-47830. -- Fix Version/s: 4.0.0 Assignee: Hyukjin Kwon Resolution: Fixed fixed in https://github.com/apache/spark/pull/46090 > Reeanble ResourceProfileTests for pyspark-connect > - > > Key: SPARK-47830 > URL: https://issues.apache.org/jira/browse/SPARK-47830 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark, Tests >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47885) Make pyspark.resource compatible with pyspark-connect
[ https://issues.apache.org/jira/browse/SPARK-47885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-47885. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46100 [https://github.com/apache/spark/pull/46100] > Make pyspark.resource compatible with pyspark-connect > - > > Key: SPARK-47885 > URL: https://issues.apache.org/jira/browse/SPARK-47885 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47540) SPIP: Pure Python Package (Spark Connect)
[ https://issues.apache.org/jira/browse/SPARK-47540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-47540. -- Fix Version/s: 4.0.0 Assignee: Hyukjin Kwon Resolution: Done > SPIP: Pure Python Package (Spark Connect) > - > > Key: SPARK-47540 > URL: https://issues.apache.org/jira/browse/SPARK-47540 > Project: Spark > Issue Type: Umbrella > Components: Connect, PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Critical > Fix For: 4.0.0
>
> *Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.*
> As part of the [Spark Connect|https://spark.apache.org/docs/latest/spark-connect-overview.html] development, we have introduced Scala and Python clients. While the Scala client is already provided as a separate library and is available in Maven, the Python client is not. This proposal aims for end users to install the pure Python package for Spark Connect by using pip install pyspark-connect.
> The pure Python package contains only Python source code without jars, which reduces the size of the package significantly and widens the use cases of PySpark. See also [Introducing Spark Connect - The Power of Apache Spark, Everywhere|https://www.databricks.com/blog/2022/07/07/introducing-spark-connect-the-power-of-apache-spark-everywhere.html].
>
> *Q2. What problem is this proposal NOT designed to solve?*
> This proposal does not aim to:
> - Change the existing PySpark package, e.g., pip install pyspark is not affected.
> - Implement full compatibility with classic PySpark, e.g., implementing the RDD API.
> - Address how to launch the Spark Connect server. The Spark Connect server is launched by users themselves.
> - Local mode. Without launching a Spark Connect server, users cannot use this package.
> - The [official release channel|https://spark.apache.org/downloads.html] is not affected, only PyPI.
>
> *Q3. How is it done today, and what are the limits of current practice?*
> Currently, we run pip install pyspark, and it is over 300MB because of dependent jars. In addition, PySpark requires you to set up other environments such as a JDK installation.
> This is not suitable when the running environment and resources are limited, for example on edge devices such as smart home devices.
> Requiring a non-Python environment is not Python friendly.
>
> *Q4. What is new in your approach and why do you think it will be successful?*
> It provides a pure Python library, which eliminates other environment requirements such as the JDK, reduces resource usage by decoupling the Spark Driver, and reduces the package size.
>
> *Q5. Who cares? If you are successful, what difference will it make?*
> Users who want to use Spark in a limited environment, and who want to decouple the running JVM from the Spark Driver to run Spark as a service. They can simply pip install pyspark-connect, which does not require other dependencies (except Python dependencies, just like other Python libraries).
>
> *Q6. What are the risks?*
> Because we do not change the existing PySpark package, I do not see any major risk in classic PySpark itself. We will reuse the same Python source, and therefore we should make sure no Py4J is used, and no JVM access is made. This requirement might confuse the developers. At the very least, we should add a dedicated CI job to make sure the pure Python package works.
>
> *Q7. How long will it take?*
> I expect around one month including CI set up. In fact, the prototype is ready so I expect this to be done sooner.
>
> *Q8. What are the mid-term and final “exams” to check for success?*
> The mid-term goal is to set up a scheduled CI job that builds the pure Python library, and runs all the tests against it.
> The final goal would be to properly test the end-to-end use case from pip installation.
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47351) Between
[ https://issues.apache.org/jira/browse/SPARK-47351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47351: - Summary: Between (was: TBD) > Between > --- > > Key: SPARK-47351 > URL: https://issues.apache.org/jira/browse/SPARK-47351 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47408) Distinct
[ https://issues.apache.org/jira/browse/SPARK-47408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47408: - Summary: Distinct (was: TBD) > Distinct > > > Key: SPARK-47408 > URL: https://issues.apache.org/jira/browse/SPARK-47408 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47352) Fix Upper, Lower, InitCap collation awareness
[ https://issues.apache.org/jira/browse/SPARK-47352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47352: --- Labels: pull-request-available (was: ) > Fix Upper, Lower, InitCap collation awareness > - > > Key: SPARK-47352 > URL: https://issues.apache.org/jira/browse/SPARK-47352 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47863) endsWith and startsWith don't work correctly for some collations
[ https://issues.apache.org/jira/browse/SPARK-47863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-47863. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46097 [https://github.com/apache/spark/pull/46097] > endsWith and startsWith don't work correctly for some collations > > > Key: SPARK-47863 > URL: https://issues.apache.org/jira/browse/SPARK-47863 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Vladimir Golubev >Assignee: Vladimir Golubev >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > *CollationSupport.EndsWith* and *CollationSupport.StartsWith* use > *CollationAwareUTF8String.matchAt*, which operates on byte offsets to > compare prefixes/suffixes. This is not correct, since string parts > (suffix/prefix) of different lengths can be equal under > case-insensitive and lower-case collations. > Example test cases that highlight the problem: > - *assertContains("The İo", "i̇o", "UNICODE_CI", true);* for > *CollationSupportSuite.testContains*. > - *assertEndsWith("The İo", "i̇o", "UNICODE_CI", true);* for > *CollationSupportSuite.testEndsWith*. > The first passes, since it uses *StringSearch* directly; the second one > does not. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
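The length mismatch described in SPARK-47863 can be reproduced outside Spark. The following is only a minimal Python sketch, not Spark's implementation: `naive_ends_with` and `folded_ends_with` are hypothetical helpers, and Python's `str.lower()` stands in for the real ICU-based `UNICODE_CI` collation. It shows that "İ" (U+0130) and its lowercase form "i̇" (U+0069 U+0307) have different UTF-8 byte lengths, so a suffix check that slices by the byte length of the pattern compares the wrong region.

```python
def naive_ends_with(text: str, suffix: str) -> bool:
    """Hypothetical byte-offset check: take as many trailing bytes of `text`
    as the suffix has, then compare the two pieces case-insensitively."""
    text_bytes = text.encode("utf-8")
    suffix_bytes = suffix.encode("utf-8")
    if len(suffix_bytes) > len(text_bytes):
        return False
    tail = text_bytes[len(text_bytes) - len(suffix_bytes):]
    # errors="replace" guards against a slice that cuts a character in half.
    return tail.decode("utf-8", errors="replace").lower() == suffix.lower()


def folded_ends_with(text: str, suffix: str) -> bool:
    """Hypothetical fix for this sketch: lowercase both sides first, then
    compare, so the lengths are measured after case mapping."""
    return text.lower().endswith(suffix.lower())


target = "The \u0130o"   # "The İo": İ is one code point, two UTF-8 bytes
pattern = "i\u0307o"     # "i̇o": i + combining dot above, three UTF-8 bytes + o

print(len("\u0130".encode("utf-8")), len("i\u0307".encode("utf-8")))
print(naive_ends_with(target, pattern))
print(folded_ends_with(target, pattern))
```

In this sketch the byte-offset variant slices four bytes from the end of the target and ends up comparing " İo" (with the leading space) against "i̇o", which is why it reports no match even though the two suffixes are equal once case is ignored.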
[jira] [Assigned] (SPARK-47863) endsWith and startsWith don't work correctly for some collations
[ https://issues.apache.org/jira/browse/SPARK-47863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-47863: --- Assignee: Vladimir Golubev > endsWith and startsWith don't work correctly for some collations > > > Key: SPARK-47863 > URL: https://issues.apache.org/jira/browse/SPARK-47863 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Vladimir Golubev >Assignee: Vladimir Golubev >Priority: Major > Labels: pull-request-available > > *CollationSupport.EndsWith* and *CollationSupport.StartsWith* use > *CollationAwareUTF8String.matchAt*, which operates on byte offsets to > compare prefixes/suffixes. This is not correct, since string parts > (suffix/prefix) of different lengths can be equal under > case-insensitive and lower-case collations. > Example test cases that highlight the problem: > - *assertContains("The İo", "i̇o", "UNICODE_CI", true);* for > *CollationSupportSuite.testContains*. > - *assertEndsWith("The İo", "i̇o", "UNICODE_CI", true);* for > *CollationSupportSuite.testEndsWith*. > The first passes, since it uses *StringSearch* directly; the second one > does not. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47421) TBD
[ https://issues.apache.org/jira/browse/SPARK-47421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47421: - Summary: TBD (was: Split, SplitPart (binary & lowercase collation only)) > TBD > --- > > Key: SPARK-47421 > URL: https://issues.apache.org/jira/browse/SPARK-47421 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47353) NullIf
[ https://issues.apache.org/jira/browse/SPARK-47353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47353: - Summary: NullIf (was: TBD) > NullIf > -- > > Key: SPARK-47353 > URL: https://issues.apache.org/jira/browse/SPARK-47353 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47355) Min & Max
[ https://issues.apache.org/jira/browse/SPARK-47355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47355: - Summary: Min & Max (was: TBD) > Min & Max > - > > Key: SPARK-47355 > URL: https://issues.apache.org/jira/browse/SPARK-47355 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47354) Case
[ https://issues.apache.org/jira/browse/SPARK-47354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47354: - Summary: Case (was: TBD) > Case > > > Key: SPARK-47354 > URL: https://issues.apache.org/jira/browse/SPARK-47354 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47421) Coalesce
[ https://issues.apache.org/jira/browse/SPARK-47421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47421: - Summary: Coalesce (was: TBD) > Coalesce > > > Key: SPARK-47421 > URL: https://issues.apache.org/jira/browse/SPARK-47421 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47350) SplitPart (binary & lowercase collation only)
[ https://issues.apache.org/jira/browse/SPARK-47350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47350: - Summary: SplitPart (binary & lowercase collation only) (was: SplitPart (binary & lowercase collation)) > SplitPart (binary & lowercase collation only) > - > > Key: SPARK-47350 > URL: https://issues.apache.org/jira/browse/SPARK-47350 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47350) SplitPart (binary & lowercase collation)
[ https://issues.apache.org/jira/browse/SPARK-47350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47350: - Summary: SplitPart (binary & lowercase collation) (was: TBD) > SplitPart (binary & lowercase collation) > > > Key: SPARK-47350 > URL: https://issues.apache.org/jira/browse/SPARK-47350 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47884) Switch ANSI SQL CI job to NON-ANSI SQL CI job
[ https://issues.apache.org/jira/browse/SPARK-47884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-47884. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46099 [https://github.com/apache/spark/pull/46099] > Switch ANSI SQL CI job to NON-ANSI SQL CI job > - > > Key: SPARK-47884 > URL: https://issues.apache.org/jira/browse/SPARK-47884 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47864) Enhance "Installation" page to cover all installable options
[ https://issues.apache.org/jira/browse/SPARK-47864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-47864: -- Assignee: Apache Spark > Enhance "Installation" page to cover all installable options > > > Key: SPARK-47864 > URL: https://issues.apache.org/jira/browse/SPARK-47864 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Assignee: Apache Spark >Priority: Major > Labels: pull-request-available > > Like Installation page from Pandas, we might need to cover all installable > options with related dependencies from our Installation documentation. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47864) Enhance "Installation" page to cover all installable options
[ https://issues.apache.org/jira/browse/SPARK-47864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-47864: -- Assignee: (was: Apache Spark) > Enhance "Installation" page to cover all installable options > > > Key: SPARK-47864 > URL: https://issues.apache.org/jira/browse/SPARK-47864 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Priority: Major > Labels: pull-request-available > > Like Installation page from Pandas, we might need to cover all installable > options with related dependencies from our Installation documentation. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47885) Make pyspark.resource compatible with pyspark-connect
[ https://issues.apache.org/jira/browse/SPARK-47885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47885: --- Labels: pull-request-available (was: ) > Make pyspark.resource compatible with pyspark-connect > - > > Key: SPARK-47885 > URL: https://issues.apache.org/jira/browse/SPARK-47885 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47807) Make pyspark.ml compatible with pyspark-connect
[ https://issues.apache.org/jira/browse/SPARK-47807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-47807: - Summary: Make pyspark.ml compatible with pyspark-connect (was: Make pyspark.ml compatible witbh pyspark-connect) > Make pyspark.ml compatible with pyspark-connect > --- > > Key: SPARK-47807 > URL: https://issues.apache.org/jira/browse/SPARK-47807 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47885) Make pyspark.resource compatible with pyspark-connect
Hyukjin Kwon created SPARK-47885: Summary: Make pyspark.resource compatible with pyspark-connect Key: SPARK-47885 URL: https://issues.apache.org/jira/browse/SPARK-47885 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 4.0.0 Reporter: Hyukjin Kwon -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47884) Switch ANSI SQL CI job to NON-ANSI SQL CI job
[ https://issues.apache.org/jira/browse/SPARK-47884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-47884: -- Summary: Switch ANSI SQL CI job to NON-ANSI SQL CI job (was: Switch ANSI SQL CI to NON-ANSI SQL CI) > Switch ANSI SQL CI job to NON-ANSI SQL CI job > - > > Key: SPARK-47884 > URL: https://issues.apache.org/jira/browse/SPARK-47884 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47884) Switch ANSI SQL CI job to NON-ANSI SQL CI job
[ https://issues.apache.org/jira/browse/SPARK-47884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-47884: - Assignee: Dongjoon Hyun > Switch ANSI SQL CI job to NON-ANSI SQL CI job > - > > Key: SPARK-47884 > URL: https://issues.apache.org/jira/browse/SPARK-47884 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47884) Switch ANSI SQL CI to NON-ANSI SQL CI
[ https://issues.apache.org/jira/browse/SPARK-47884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47884: --- Labels: pull-request-available (was: ) > Switch ANSI SQL CI to NON-ANSI SQL CI > - > > Key: SPARK-47884 > URL: https://issues.apache.org/jira/browse/SPARK-47884 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47884) Switch ANSI SQL CI to NON-ANSI SQL CI
Dongjoon Hyun created SPARK-47884: - Summary: Switch ANSI SQL CI to NON-ANSI SQL CI Key: SPARK-47884 URL: https://issues.apache.org/jira/browse/SPARK-47884 Project: Spark Issue Type: Sub-task Components: Project Infra Affects Versions: 4.0.0 Reporter: Dongjoon Hyun
[jira] [Resolved] (SPARK-44444) Use ANSI mode by default
[ https://issues.apache.org/jira/browse/SPARK-44444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-44444. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46013 [https://github.com/apache/spark/pull/46013] > Use ANSI mode by default > > > Key: SPARK-44444 > URL: https://issues.apache.org/jira/browse/SPARK-44444 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yuming Wang >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > To avoid data issues.
[jira] [Updated] (SPARK-44444) Use ANSI SQL mode by default
[ https://issues.apache.org/jira/browse/SPARK-44444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-44444: -- Summary: Use ANSI SQL mode by default (was: Use ANSI mode by default) > Use ANSI SQL mode by default > > > Key: SPARK-44444 > URL: https://issues.apache.org/jira/browse/SPARK-44444 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yuming Wang >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > To avoid data issues.
[jira] [Assigned] (SPARK-47822) Prohibit Hash expressions from hashing Variant type
[ https://issues.apache.org/jira/browse/SPARK-47822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-47822: --- Assignee: Harsh Motwani > Prohibit Hash expressions from hashing Variant type > --- > > Key: SPARK-47822 > URL: https://issues.apache.org/jira/browse/SPARK-47822 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Harsh Motwani >Assignee: Harsh Motwani >Priority: Major > Labels: pull-request-available > > Prohibit hash functions from being applied to the Variant type, because they have not been implemented for the Variant type and crash during execution.
[jira] [Resolved] (SPARK-47822) Prohibit Hash expressions from hashing Variant type
[ https://issues.apache.org/jira/browse/SPARK-47822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-47822. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46017 [https://github.com/apache/spark/pull/46017] > Prohibit Hash expressions from hashing Variant type > --- > > Key: SPARK-47822 > URL: https://issues.apache.org/jira/browse/SPARK-47822 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Harsh Motwani >Assignee: Harsh Motwani >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Prohibit hash functions from being applied to the Variant type, because they have not been implemented for the Variant type and crash during execution.
[jira] [Resolved] (SPARK-47821) Add is_variant_null expression
[ https://issues.apache.org/jira/browse/SPARK-47821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-47821. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46011 [https://github.com/apache/spark/pull/46011] > Add is_variant_null expression > -- > > Key: SPARK-47821 > URL: https://issues.apache.org/jira/browse/SPARK-47821 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Richard Chen >Assignee: Richard Chen >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Adds an `is_variant_null` expression, which returns whether a given variant value represents a variant null (note the difference between a variant null and an engine null).
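The difference between a variant null and an engine null noted in SPARK-47821 can be sketched with a toy Python model. This is not Spark code: it assumes an engine null is an absent value (Python `None`) and a variant null is the JSON literal `null` stored inside a variant. The names `VARIANT_NULL`, `parse_variant`, and this Python `is_variant_null` are hypothetical, and returning false for an engine null is an assumption about the expression's behavior.

```python
import json


class _VariantNull:
    """Hypothetical marker for a JSON null stored inside a variant value."""

    def __repr__(self):
        return "VARIANT_NULL"


VARIANT_NULL = _VariantNull()


def parse_variant(text):
    """Toy stand-in for parsing JSON text into a variant.

    An engine null (None) input stays None: there is no value at all.
    The JSON literal null becomes VARIANT_NULL: a value that is a null.
    """
    if text is None:
        return None
    value = json.loads(text)
    return VARIANT_NULL if value is None else value


def is_variant_null(value):
    """True only for a variant null; assumed False for an engine null."""
    return value is VARIANT_NULL
```

In this model, `parse_variant('null')` is a variant null, while `parse_variant(None)` and `parse_variant('"null"')` (the string "null") are not.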
[jira] [Resolved] (SPARK-47867) Support Variant in JSON scan.
[ https://issues.apache.org/jira/browse/SPARK-47867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-47867. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 46071 [https://github.com/apache/spark/pull/46071] > Support Variant in JSON scan. > - > > Key: SPARK-47867 > URL: https://issues.apache.org/jira/browse/SPARK-47867 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Chenhao Li >Assignee: Chenhao Li >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0
[jira] [Updated] (SPARK-47863) endsWith and startsWith don't work correctly for some collations
[ https://issues.apache.org/jira/browse/SPARK-47863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47863: --- Labels: pull-request-available (was: ) > endsWith and startsWith don't work correctly for some collations > > > Key: SPARK-47863 > URL: https://issues.apache.org/jira/browse/SPARK-47863 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Vladimir Golubev >Priority: Major > Labels: pull-request-available > > CollationSupport.EndsWith and CollationSupport.StartsWith use CollationAwareUTF8String.matchAt, which operates on byte offsets to compare prefixes and suffixes. This is not correct, because string parts (prefix or suffix) of different lengths can be equal under case-insensitive and lower-case collations. Example test cases that highlight the problem: - assertContains("The İo", "i̇o", "UNICODE_CI", true); for CollationSupportSuite.testContains. - assertEndsWith("The İo", "i̇o", "UNICODE_CI", true); for CollationSupportSuite.testEndsWith. The first passes, since it uses StringSearch directly; the second does not.
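The length mismatch described in SPARK-47863 can be reproduced outside Spark with a minimal Python sketch. It is not Spark's implementation: Python's `str.lower()` stands in for a case-insensitive collation (an assumption; Spark uses ICU collators), and `naive_ends_with` and `lowered_ends_with` are hypothetical helpers. It relies on the fact that "İ" (U+0130) is 2 bytes in UTF-8, while its lowercase form "i" plus U+0307 is 3 bytes.

```python
def naive_ends_with(target: str, suffix: str) -> bool:
    """Suffix check by byte offset: take as many trailing bytes of the
    target as the suffix has, then compare the lower-cased texts."""
    t = target.encode("utf-8")
    s = suffix.encode("utf-8")
    if len(s) > len(t):
        return False
    tail = t[len(t) - len(s):]
    try:
        tail_text = tail.decode("utf-8")
    except UnicodeDecodeError:
        # The byte offset landed in the middle of a multi-byte character.
        return False
    return tail_text.lower() == suffix.lower()


def lowered_ends_with(target: str, suffix: str) -> bool:
    """Suffix check after lower-casing both sides, so the lengths are
    measured on the normalized text rather than on the original bytes."""
    return target.lower().endswith(suffix.lower())


TARGET = "The \u0130o"   # "The İo"
SUFFIX = "i\u0307o"      # "i̇o": i + combining dot above + o
```

With these inputs the byte-offset version takes the last 4 bytes of the target, which include the preceding space, so the comparison fails even though the lower-cased target does end with the suffix.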
[jira] [Updated] (SPARK-47864) Enhance "Installation" page to cover all installable options
[ https://issues.apache.org/jira/browse/SPARK-47864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47864: --- Labels: pull-request-available (was: ) > Enhance "Installation" page to cover all installable options > > > Key: SPARK-47864 > URL: https://issues.apache.org/jira/browse/SPARK-47864 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Priority: Major > Labels: pull-request-available > > Like the Installation page from pandas, we might need to cover all installable options, with their related dependencies, in our Installation documentation.