[jira] [Updated] (SPARK-46985) Move _NoValue from pyspark.* to pyspark.sql.*

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46985: --- Labels: pull-request-available (was: ) > Move _NoValue from pyspark.* to pyspark.sql.* >

[jira] [Updated] (SPARK-46679) Encoders with multiple inheritance - Key not found: T

2024-02-05 Thread Andoni Teso (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andoni Teso updated SPARK-46679: Affects Version/s: 4.0.0 > Encoders with multiple inheritance - Key not found: T >

[jira] [Updated] (SPARK-46679) Encoders with multiple inheritance - Key not found: T

2024-02-05 Thread Andoni Teso (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andoni Teso updated SPARK-46679: Priority: Critical (was: Blocker) > Encoders with multiple inheritance - Key not found: T >

[jira] [Updated] (SPARK-46984) Remove pyspark.copy_func

2024-02-05 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-46984: - Priority: Minor (was: Major) > Remove pyspark.copy_func > > >

[jira] [Created] (SPARK-46985) Move _NoValue from pyspark.* to pyspark.sql.*

2024-02-05 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-46985: Summary: Move _NoValue from pyspark.* to pyspark.sql.* Key: SPARK-46985 URL: https://issues.apache.org/jira/browse/SPARK-46985 Project: Spark Issue Type:

[jira] [Updated] (SPARK-46984) Remove pyspark.copy_func

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46984: --- Labels: pull-request-available (was: ) > Remove pyspark.copy_func >

[jira] [Created] (SPARK-46984) Remove pyspark.copy_func

2024-02-05 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-46984: Summary: Remove pyspark.copy_func Key: SPARK-46984 URL: https://issues.apache.org/jira/browse/SPARK-46984 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-46983) Decouple module dependencies between PySpark modules

2024-02-05 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-46983: Summary: Decouple module dependencies between PySpark modules Key: SPARK-46983 URL: https://issues.apache.org/jira/browse/SPARK-46983 Project: Spark Issue

[jira] [Updated] (SPARK-46170) Support inject adaptive query post planner strategy rules in SparkSessionExtensions

2024-02-05 Thread Kent Yao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao updated SPARK-46170: - Fix Version/s: 3.5.1 > Support inject adaptive query post planner strategy rules in >

[jira] [Updated] (SPARK-46982) Remove _LEGACY_ERROR_TEMP_2187 in favor of CANNOT_RECOGNIZE_HIVE_TYPE

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46982: --- Labels: pull-request-available (was: ) > Remove _LEGACY_ERROR_TEMP_2187 in favor of

[jira] [Updated] (SPARK-46981) Driver OOM happens in query planning phase with empty tables

2024-02-05 Thread Noritaka Sekiyama (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noritaka Sekiyama updated SPARK-46981: -- Description: We have observed that Driver OOM happens in query planning phase with

[jira] [Created] (SPARK-46982) Remove _LEGACY_ERROR_TEMP_2187 in favor of CANNOT_RECOGNIZE_HIVE_TYPE

2024-02-05 Thread Kent Yao (Jira)
Kent Yao created SPARK-46982: Summary: Remove _LEGACY_ERROR_TEMP_2187 in favor of CANNOT_RECOGNIZE_HIVE_TYPE Key: SPARK-46982 URL: https://issues.apache.org/jira/browse/SPARK-46982 Project: Spark

[jira] [Updated] (SPARK-46981) Driver OOM happens in query planning phase with empty tables

2024-02-05 Thread Noritaka Sekiyama (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noritaka Sekiyama updated SPARK-46981: -- Description: We have observed that Driver OOM happens in query planning phase with

[jira] [Updated] (SPARK-46981) Driver OOM happens in query planning phase with empty tables

2024-02-05 Thread Noritaka Sekiyama (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noritaka Sekiyama updated SPARK-46981: -- Attachment: test_and_twodays_simplified.sql > Driver OOM happens in query planning

[jira] [Created] (SPARK-46981) Driver OOM happens in query planning phase with empty tables

2024-02-05 Thread Noritaka Sekiyama (Jira)
Noritaka Sekiyama created SPARK-46981: - Summary: Driver OOM happens in query planning phase with empty tables Key: SPARK-46981 URL: https://issues.apache.org/jira/browse/SPARK-46981 Project:

[jira] [Updated] (SPARK-46981) Driver OOM happens in query planning phase with empty tables

2024-02-05 Thread Noritaka Sekiyama (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noritaka Sekiyama updated SPARK-46981: -- Attachment: create_sanitized_tables.py > Driver OOM happens in query planning phase

[jira] [Resolved] (SPARK-46958) missing timezone to coerce default values

2024-02-05 Thread Kent Yao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao resolved SPARK-46958. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45000

[jira] [Updated] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46934: --- Labels: pull-request-available (was: ) > Unable to create Hive View from certain Spark

[jira] [Updated] (SPARK-46979) Add support for defining state encoder for key/value and col family independently

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46979: --- Labels: pull-request-available (was: ) > Add support for defining state encoder for

[jira] [Commented] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType

2024-02-05 Thread Kent Yao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814588#comment-17814588 ] Kent Yao commented on SPARK-46934: -- Hi [~yutinglin], How can I create an element named

[jira] [Assigned] (SPARK-46960) Testing Multiple Input Streams for TransformWithState operator

2024-02-05 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-46960: Assignee: Eric Marnadi > Testing Multiple Input Streams for TransformWithState operator

[jira] [Resolved] (SPARK-46960) Testing Multiple Input Streams for TransformWithState operator

2024-02-05 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-46960. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45004

[jira] [Updated] (SPARK-46960) Testing Multiple Input Streams for TransformWithState operator

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46960: --- Labels: pull-request-available (was: ) > Testing Multiple Input Streams for

[jira] [Updated] (SPARK-45599) Percentile can produce a wrong answer if -0.0 and 0.0 are mixed in the dataset

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45599: --- Labels: correctness pull-request-available (was: correctness) > Percentile can produce a

[jira] [Updated] (SPARK-46980) Avoid using internal APIs in dataframe end-to-end tests

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46980: --- Labels: pull-request-available (was: ) > Avoid using internal APIs in dataframe end-to-end

[jira] [Resolved] (SPARK-46980) Avoid using internal APIs in dataframe end-to-end tests

2024-02-05 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-46980. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45034

[jira] [Created] (SPARK-46980) Avoid using internal APIs in tests

2024-02-05 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-46980: --- Summary: Avoid using internal APIs in tests Key: SPARK-46980 URL: https://issues.apache.org/jira/browse/SPARK-46980 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-46980) Avoid using internal APIs in dataframe end-to-end tests

2024-02-05 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-46980: Summary: Avoid using internal APIs in dataframe end-to-end tests (was: Avoid using internal APIs

[jira] [Comment Edited] (SPARK-39441) Speed up DeduplicateRelations

2024-02-05 Thread Mitesh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17749321#comment-17749321 ] Mitesh edited comment on SPARK-39441 at 2/5/24 11:28 PM: - After applying this

[jira] [Created] (SPARK-46979) Add support for defining state encoder for key/value and col family independently

2024-02-05 Thread Anish Shrigondekar (Jira)
Anish Shrigondekar created SPARK-46979: -- Summary: Add support for defining state encoder for key/value and col family independently Key: SPARK-46979 URL: https://issues.apache.org/jira/browse/SPARK-46979

[jira] [Resolved] (SPARK-46977) A failed request to obtain a token from one NameNode should not block subsequent token requests

2024-02-05 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-46977. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45030

[jira] [Assigned] (SPARK-46977) A failed request to obtain a token from one NameNode should not block subsequent token requests

2024-02-05 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-46977: - Assignee: Cheng Pan > A failed request to obtain a token from one NameNode should not

[jira] [Resolved] (SPARK-46972) Asymmetrical replacement for char/varchar in V2SessionCatalog.createTable

2024-02-05 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-46972. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45019

[jira] [Resolved] (SPARK-46978) Refine docstring of `sum_distinct/array_agg/count_if`

2024-02-05 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-46978. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45031

[jira] [Assigned] (SPARK-46978) Refine docstring of `sum_distinct/array_agg/count_if`

2024-02-05 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-46978: - Assignee: Yang Jie > Refine docstring of `sum_distinct/array_agg/count_if` >

[jira] [Comment Edited] (SPARK-39441) Speed up DeduplicateRelations

2024-02-05 Thread Mitesh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17749321#comment-17749321 ] Mitesh edited comment on SPARK-39441 at 2/5/24 7:01 PM: After applying this fix

[jira] [Comment Edited] (SPARK-39441) Speed up DeduplicateRelations

2024-02-05 Thread Mitesh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17749321#comment-17749321 ] Mitesh edited comment on SPARK-39441 at 2/5/24 7:00 PM: After applying this fix

[jira] [Comment Edited] (SPARK-46032) connect: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.rdd.MapPartitionsRDD.f

2024-02-05 Thread Gaétan CACACE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814432#comment-17814432 ] Gaétan CACACE edited comment on SPARK-46032 at 2/5/24 4:37 PM: --- Hello

[jira] [Commented] (SPARK-46032) connect: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.rdd.MapPartitionsRDD.f

2024-02-05 Thread Gaétan CACACE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814432#comment-17814432 ] Gaétan CACACE commented on SPARK-46032: --- Hello there, Just coming to give some more

[jira] [Assigned] (SPARK-46833) Using ICU library for collation tracking

2024-02-05 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-46833: --- Assignee: Aleksandar Tomic > Using ICU library for collation tracking >

[jira] [Resolved] (SPARK-46833) Using ICU library for collation tracking

2024-02-05 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-46833. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44968

[jira] [Commented] (SPARK-46810) Clarify error class terminology

2024-02-05 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814406#comment-17814406 ] Nicholas Chammas commented on SPARK-46810: -- [~cloud_fan], [~LuciferYang], [~beliefer], and

[jira] [Comment Edited] (SPARK-24815) Structured Streaming should support dynamic allocation

2024-02-05 Thread Krystal Mitchell (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763940#comment-17763940 ] Krystal Mitchell edited comment on SPARK-24815 at 2/5/24 3:32 PM: --

[jira] [Updated] (SPARK-46978) Refine docstring of `sum_distinct/array_agg/count_if`

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46978: --- Labels: pull-request-available (was: ) > Refine docstring of

[jira] [Updated] (SPARK-46977) A failed request to obtain a token from one NameNode should not block subsequent token requests

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46977: --- Labels: pull-request-available (was: ) > A failed request to obtain a token from one

[jira] [Created] (SPARK-46978) Refine docstring of `sum_distinct/array_agg/count_if`

2024-02-05 Thread Yang Jie (Jira)
Yang Jie created SPARK-46978: Summary: Refine docstring of `sum_distinct/array_agg/count_if` Key: SPARK-46978 URL: https://issues.apache.org/jira/browse/SPARK-46978 Project: Spark Issue Type:

[jira] [Created] (SPARK-46977) A failed request to obtain a token from one NameNode should not block subsequent token requests

2024-02-05 Thread Cheng Pan (Jira)
Cheng Pan created SPARK-46977: - Summary: A failed request to obtain a token from one NameNode should not block subsequent token requests Key: SPARK-46977 URL: https://issues.apache.org/jira/browse/SPARK-46977

[jira] [Assigned] (SPARK-46975) Move to_{hdf, feather, stata} to the fallback list

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-46975: -- Assignee: (was: Apache Spark) > Move to_{hdf, feather, stata} to the fallback

[jira] [Assigned] (SPARK-46975) Move to_{hdf, feather, stata} to the fallback list

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-46975: -- Assignee: Apache Spark > Move to_{hdf, feather, stata} to the fallback list >

[jira] [Assigned] (SPARK-46976) Implement `DataFrameGroupBy.corr`

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-46976: -- Assignee: (was: Apache Spark) > Implement `DataFrameGroupBy.corr` >

[jira] [Assigned] (SPARK-46976) Implement `DataFrameGroupBy.corr`

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-46976: -- Assignee: Apache Spark > Implement `DataFrameGroupBy.corr` >

[jira] [Assigned] (SPARK-46976) Implement `DataFrameGroupBy.corr`

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-46976: -- Assignee: Apache Spark > Implement `DataFrameGroupBy.corr` >

[jira] [Assigned] (SPARK-46976) Implement `DataFrameGroupBy.corr`

2024-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-46976: -- Assignee: (was: Apache Spark) > Implement `DataFrameGroupBy.corr` >