[jira] [Resolved] (SPARK-46392) In Function DataSourceStrategy.translateFilterWithMapping, we need transfer cast expression to data source for filtering
[ https://issues.apache.org/jira/browse/SPARK-46392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiahong.li resolved SPARK-46392. Resolution: Abandoned > In Function DataSourceStrategy.translateFilterWithMapping, we need transfer > cast expression to data source for filtering > - > > Key: SPARK-46392 > URL: https://issues.apache.org/jira/browse/SPARK-46392 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0 >Reporter: jiahong.li >Priority: Minor > Labels: pull-request-available > > Consider this situation: > We create a partitioned table through a data source that extends > TableProvider. If we select data from specific partitions and the data type > used for the partition in the query differs from the table's partition type, > the partition filter cannot be pushed down. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-46473) Reuse `getPartitionedFile` method
[ https://issues.apache.org/jira/browse/SPARK-46473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46473: --- Labels: pull-request-available (was: ) > Reuse `getPartitionedFile` method > - > > Key: SPARK-46473 > URL: https://issues.apache.org/jira/browse/SPARK-46473 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0 >Reporter: xiaoping.huang >Priority: Minor > Labels: pull-request-available
[jira] [Created] (SPARK-46473) Reuse `getPartitionedFile` method
xiaoping.huang created SPARK-46473: -- Summary: Reuse `getPartitionedFile` method Key: SPARK-46473 URL: https://issues.apache.org/jira/browse/SPARK-46473 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.5.0 Reporter: xiaoping.huang
[jira] [Updated] (SPARK-46472) Refine docstring of `array_prepend/array_append/array_insert`
[ https://issues.apache.org/jira/browse/SPARK-46472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46472: --- Labels: pull-request-available (was: ) > Refine docstring of `array_prepend/array_append/array_insert` > - > > Key: SPARK-46472 > URL: https://issues.apache.org/jira/browse/SPARK-46472 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available
[jira] [Created] (SPARK-46472) Refine docstring of `array_prepend/array_append/array_insert`
Yang Jie created SPARK-46472: Summary: Refine docstring of `array_prepend/array_append/array_insert` Key: SPARK-46472 URL: https://issues.apache.org/jira/browse/SPARK-46472 Project: Spark Issue Type: Sub-task Components: Documentation, PySpark Affects Versions: 4.0.0 Reporter: Yang Jie
[jira] [Assigned] (SPARK-46469) Clean up useless local variables in `InsertIntoHiveTable`
[ https://issues.apache.org/jira/browse/SPARK-46469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao reassigned SPARK-46469: Assignee: Yang Jie > Clean up useless local variables in `InsertIntoHiveTable` > - > > Key: SPARK-46469 > URL: https://issues.apache.org/jira/browse/SPARK-46469 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available
[jira] [Resolved] (SPARK-46469) Clean up useless local variables in `InsertIntoHiveTable`
[ https://issues.apache.org/jira/browse/SPARK-46469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao resolved SPARK-46469. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44433 [https://github.com/apache/spark/pull/44433] > Clean up useless local variables in `InsertIntoHiveTable` > - > > Key: SPARK-46469 > URL: https://issues.apache.org/jira/browse/SPARK-46469 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0
[jira] [Updated] (SPARK-46471) Reorganize `OpsOnDiffFramesEnabledTests`
[ https://issues.apache.org/jira/browse/SPARK-46471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46471: --- Labels: pull-request-available (was: ) > Reorganize `OpsOnDiffFramesEnabledTests` > > > Key: SPARK-46471 > URL: https://issues.apache.org/jira/browse/SPARK-46471 > Project: Spark > Issue Type: Sub-task > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available
[jira] [Created] (SPARK-46471) Reorganize `OpsOnDiffFramesEnabledTests`
Ruifeng Zheng created SPARK-46471: - Summary: Reorganize `OpsOnDiffFramesEnabledTests` Key: SPARK-46471 URL: https://issues.apache.org/jira/browse/SPARK-46471 Project: Spark Issue Type: Sub-task Components: PS, Tests Affects Versions: 4.0.0 Reporter: Ruifeng Zheng
[jira] [Updated] (SPARK-46470) Move `test_series_datetime` to `pyspark.pandas.tests.connect.series.*`
[ https://issues.apache.org/jira/browse/SPARK-46470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46470: --- Labels: pull-request-available (was: ) > Move `test_series_datetime` to `pyspark.pandas.tests.connect.series.*` > -- > > Key: SPARK-46470 > URL: https://issues.apache.org/jira/browse/SPARK-46470 > Project: Spark > Issue Type: Sub-task > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Priority: Major > Labels: pull-request-available
[jira] [Updated] (SPARK-46469) Clean up useless local variables in `InsertIntoHiveTable`
[ https://issues.apache.org/jira/browse/SPARK-46469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46469: --- Labels: pull-request-available (was: ) > Clean up useless local variables in `InsertIntoHiveTable` > - > > Key: SPARK-46469 > URL: https://issues.apache.org/jira/browse/SPARK-46469 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available
[jira] [Created] (SPARK-46469) Clean up useless local variables in `InsertIntoHiveTable`
Yang Jie created SPARK-46469: Summary: Clean up useless local variables in `InsertIntoHiveTable` Key: SPARK-46469 URL: https://issues.apache.org/jira/browse/SPARK-46469 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 4.0.0 Reporter: Yang Jie
[jira] [Assigned] (SPARK-46207) Support MergeInto in DataFrameWriterV2
[ https://issues.apache.org/jira/browse/SPARK-46207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiaan Geng reassigned SPARK-46207: -- Assignee: Huaxin Gao > Support MergeInto in DataFrameWriterV2 > -- > > Key: SPARK-46207 > URL: https://issues.apache.org/jira/browse/SPARK-46207 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 4.0.0 >Reporter: Huaxin Gao >Assignee: Huaxin Gao >Priority: Major > Labels: pull-request-available
[jira] [Resolved] (SPARK-46207) Support MergeInto in DataFrameWriterV2
[ https://issues.apache.org/jira/browse/SPARK-46207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiaan Geng resolved SPARK-46207. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44119 [https://github.com/apache/spark/pull/44119] > Support MergeInto in DataFrameWriterV2 > -- > > Key: SPARK-46207 > URL: https://issues.apache.org/jira/browse/SPARK-46207 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 4.0.0 >Reporter: Huaxin Gao >Assignee: Huaxin Gao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0
[jira] [Commented] (SPARK-46392) In Function DataSourceStrategy.translateFilterWithMapping, we need transfer cast expression to data source for filtering
[ https://issues.apache.org/jira/browse/SPARK-46392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799227#comment-17799227 ] jiahong.li commented on SPARK-46392: PR: https://github.com/apache/spark/pull/44431 > In Function DataSourceStrategy.translateFilterWithMapping, we need transfer > cast expression to data source for filtering > - > > Key: SPARK-46392 > URL: https://issues.apache.org/jira/browse/SPARK-46392 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0 >Reporter: jiahong.li >Priority: Minor > Labels: pull-request-available > > Consider this situation: > We create a partitioned table through a data source that extends > TableProvider. If we select data from specific partitions and the data type > used for the partition in the query differs from the table's partition type, > the partition filter cannot be pushed down.
[jira] [Updated] (SPARK-46468) COUNT bug in lateral/exists subqueries
[ https://issues.apache.org/jira/browse/SPARK-46468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Gubichev updated SPARK-46468:
Description:
Some further instances of a COUNT bug.

One example is this test from join-lateral.sql
[https://github.com/apache/spark/blame/master/sql/core/src/test/resources/sql-tests/results/join-lateral.sql.out#L757]

According to PostgreSQL, the query should return 2 rows:

 c1 | c2 | sum
----+----+------
  0 |  1 |    2
  1 |  2 | NULL

whereas Spark SQL only returns the first one.

A similar instance is the following query, which should return 1 row from t1 but currently returns an empty result:

{{create temporary view t1(c1, c2) as values (0, 1), (1, 2);}}
{{create temporary view t2(c1, c2) as values (0, 2), (0, 3);}}
{{SELECT tt1.c2}}
{{FROM t1 as tt1}}
{{WHERE tt1.c1 in (}}
select max(tt2.c1)
from t2 as tt2
where tt1.c2 is null);

was:
Some further instances of a COUNT bug.

One example is this test from join-lateral.sql
[https://github.com/apache/spark/blame/master/sql/core/src/test/resources/sql-tests/results/join-lateral.sql.out#L757]

According to PostgreSQL, the query should return 2 rows:

 c1 | c2 | sum
----+----+------
  0 |  1 |    2
  1 |  2 | NULL

whereas Spark SQL only returns the first one.

Similar instance is the following query, which should return 1 row from t1 but has an empty result now:

{{create temporary view t1(c1, c2) as values (0, 1), (1, 2);}}
{{create temporary view t2(c1, c2) as values (0, 2), (0, 3);}}
{{SELECT tt1.c2}}
{{FROM t1 as tt1}}
{{WHERE tt1.c1 in (}}
{{ select max(tt2.c1)}}
{{ from t2 as tt2}}
{{ where tt1.c2 is null);}}
{{}}

> COUNT bug in lateral/exists subqueries
> --
>
> Key: SPARK-46468
> URL: https://issues.apache.org/jira/browse/SPARK-46468
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Andrey Gubichev
> Priority: Major
>
> Some further instances of a COUNT bug.
>
> One example is this test from join-lateral.sql
> [https://github.com/apache/spark/blame/master/sql/core/src/test/resources/sql-tests/results/join-lateral.sql.out#L757]
>
> According to PostgreSQL, the query should return 2 rows:
>
>  c1 | c2 | sum
> ----+----+------
>   0 |  1 |    2
>   1 |  2 | NULL
>
> whereas Spark SQL only returns the first one.
>
> A similar instance is the following query, which should return 1 row from t1
> but currently returns an empty result:
> {{create temporary view t1(c1, c2) as values (0, 1), (1, 2);}}
> {{create temporary view t2(c1, c2) as values (0, 2), (0, 3);}}
> {{SELECT tt1.c2}}
> {{FROM t1 as tt1}}
> {{WHERE tt1.c1 in (}}
> select max(tt2.c1)
> from t2 as tt2
> where tt1.c2 is null);
[jira] [Created] (SPARK-46468) COUNT bug in lateral/exists subqueries
Andrey Gubichev created SPARK-46468:
---
Summary: COUNT bug in lateral/exists subqueries
Key: SPARK-46468
URL: https://issues.apache.org/jira/browse/SPARK-46468
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.0.0
Reporter: Andrey Gubichev

Some further instances of a COUNT bug.

One example is this test from join-lateral.sql
[https://github.com/apache/spark/blame/master/sql/core/src/test/resources/sql-tests/results/join-lateral.sql.out#L757]

According to PostgreSQL, the query should return 2 rows:

 c1 | c2 | sum
----+----+------
  0 |  1 |    2
  1 |  2 | NULL

whereas Spark SQL only returns the first one.

A similar instance is the following query, which should return 1 row from t1 but currently returns an empty result:

{{create temporary view t1(c1, c2) as values (0, 1), (1, 2);}}
{{create temporary view t2(c1, c2) as values (0, 2), (0, 3);}}
{{SELECT tt1.c2}}
{{FROM t1 as tt1}}
{{WHERE tt1.c1 in (}}
{{ select max(tt2.c1)}}
{{ from t2 as tt2}}
{{ where tt1.c2 is null);}}
{{}}
[jira] [Assigned] (SPARK-46465) Implement Column.isNaN
[ https://issues.apache.org/jira/browse/SPARK-46465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46465: Assignee: Ruifeng Zheng > Implement Column.isNaN > -- > > Key: SPARK-46465 > URL: https://issues.apache.org/jira/browse/SPARK-46465 > Project: Spark > Issue Type: New Feature > Components: Connect, PySpark >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available
[jira] [Resolved] (SPARK-46465) Implement Column.isNaN
[ https://issues.apache.org/jira/browse/SPARK-46465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46465. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44422 [https://github.com/apache/spark/pull/44422] > Implement Column.isNaN > -- > > Key: SPARK-46465 > URL: https://issues.apache.org/jira/browse/SPARK-46465 > Project: Spark > Issue Type: New Feature > Components: Connect, PySpark >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0
[jira] [Assigned] (SPARK-46462) Reorganize `OpsOnDiffFramesGroupByRollingTests`
[ https://issues.apache.org/jira/browse/SPARK-46462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46462: Assignee: Ruifeng Zheng > Reorganize `OpsOnDiffFramesGroupByRollingTests` > --- > > Key: SPARK-46462 > URL: https://issues.apache.org/jira/browse/SPARK-46462 > Project: Spark > Issue Type: Sub-task > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available
[jira] [Resolved] (SPARK-46462) Reorganize `OpsOnDiffFramesGroupByRollingTests`
[ https://issues.apache.org/jira/browse/SPARK-46462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46462. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44420 [https://github.com/apache/spark/pull/44420] > Reorganize `OpsOnDiffFramesGroupByRollingTests` > --- > > Key: SPARK-46462 > URL: https://issues.apache.org/jira/browse/SPARK-46462 > Project: Spark > Issue Type: Sub-task > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0
[jira] [Resolved] (SPARK-46463) Reorganize `OpsOnDiffFramesGroupByExpandingTests`
[ https://issues.apache.org/jira/browse/SPARK-46463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46463. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44421 [https://github.com/apache/spark/pull/44421] > Reorganize `OpsOnDiffFramesGroupByExpandingTests` > - > > Key: SPARK-46463 > URL: https://issues.apache.org/jira/browse/SPARK-46463 > Project: Spark > Issue Type: Sub-task > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0
[jira] [Assigned] (SPARK-46463) Reorganize `OpsOnDiffFramesGroupByExpandingTests`
[ https://issues.apache.org/jira/browse/SPARK-46463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46463: Assignee: Ruifeng Zheng > Reorganize `OpsOnDiffFramesGroupByExpandingTests` > - > > Key: SPARK-46463 > URL: https://issues.apache.org/jira/browse/SPARK-46463 > Project: Spark > Issue Type: Sub-task > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available
[jira] [Commented] (SPARK-46466) vectorized parquet reader should never do rebase for timestamp ntz
[ https://issues.apache.org/jira/browse/SPARK-46466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799207#comment-17799207 ] Dongjoon Hyun commented on SPARK-46466: --- Based on the PR content, I added the `correctness` label and marked this as a blocker for Apache Spark 3.5.1 and 3.4.3. > vectorized parquet reader should never do rebase for timestamp ntz > -- > > Key: SPARK-46466 > URL: https://issues.apache.org/jira/browse/SPARK-46466 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0 >Reporter: Wenchen Fan >Priority: Blocker > Labels: correctness, pull-request-available
[jira] [Updated] (SPARK-46466) vectorized parquet reader should never do rebase for timestamp ntz
[ https://issues.apache.org/jira/browse/SPARK-46466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-46466: -- Target Version/s: 3.5.1, 3.4.3 > vectorized parquet reader should never do rebase for timestamp ntz > -- > > Key: SPARK-46466 > URL: https://issues.apache.org/jira/browse/SPARK-46466 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0 >Reporter: Wenchen Fan >Priority: Blocker > Labels: correctness, pull-request-available
[jira] [Updated] (SPARK-46466) vectorized parquet reader should never do rebase for timestamp ntz
[ https://issues.apache.org/jira/browse/SPARK-46466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-46466: -- Labels: correctness pull-request-available (was: pull-request-available) > vectorized parquet reader should never do rebase for timestamp ntz > -- > > Key: SPARK-46466 > URL: https://issues.apache.org/jira/browse/SPARK-46466 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0 >Reporter: Wenchen Fan >Priority: Major > Labels: correctness, pull-request-available
[jira] [Updated] (SPARK-46466) vectorized parquet reader should never do rebase for timestamp ntz
[ https://issues.apache.org/jira/browse/SPARK-46466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-46466: -- Priority: Blocker (was: Major) > vectorized parquet reader should never do rebase for timestamp ntz > -- > > Key: SPARK-46466 > URL: https://issues.apache.org/jira/browse/SPARK-46466 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0 >Reporter: Wenchen Fan >Priority: Blocker > Labels: correctness, pull-request-available
[jira] [Resolved] (SPARK-46413) Validate returnType of Arrow Python UDF
[ https://issues.apache.org/jira/browse/SPARK-46413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46413. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44362 [https://github.com/apache/spark/pull/44362] > Validate returnType of Arrow Python UDF > --- > > Key: SPARK-46413 > URL: https://issues.apache.org/jira/browse/SPARK-46413 > Project: Spark > Issue Type: Improvement > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Xinrong Meng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Validate returnType of Arrow Python UDF
[jira] [Resolved] (SPARK-46398) Test rangeBetween window function (pyspark.sql.window)
[ https://issues.apache.org/jira/browse/SPARK-46398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46398. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44339 [https://github.com/apache/spark/pull/44339] > Test rangeBetween window function (pyspark.sql.window) > -- > > Key: SPARK-46398 > URL: https://issues.apache.org/jira/browse/SPARK-46398 > Project: Spark > Issue Type: Sub-task > Components: PySpark, Tests >Affects Versions: 4.0.0 >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0
[jira] [Assigned] (SPARK-46398) Test rangeBetween window function (pyspark.sql.window)
[ https://issues.apache.org/jira/browse/SPARK-46398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46398: Assignee: Xinrong Meng > Test rangeBetween window function (pyspark.sql.window) > -- > > Key: SPARK-46398 > URL: https://issues.apache.org/jira/browse/SPARK-46398 > Project: Spark > Issue Type: Sub-task > Components: PySpark, Tests >Affects Versions: 4.0.0 >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major > Labels: pull-request-available
[jira] [Resolved] (SPARK-46456) Shutdown hook timeouts during ui stop
[ https://issues.apache.org/jira/browse/SPARK-46456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-46456. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44413 [https://github.com/apache/spark/pull/44413] > Shutdown hook timeouts during ui stop > - > > Key: SPARK-46456 > URL: https://issues.apache.org/jira/browse/SPARK-46456 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.4.1, 3.5.0, 4.0.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0
[jira] [Commented] (SPARK-46450) session_window doesn't identify sessions with provided gap when used as a window function
[ https://issues.apache.org/jira/browse/SPARK-46450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799140#comment-17799140 ]

Juan Pumarino commented on SPARK-46450:
---
[~kabhwan] thanks for the explanation; I learned a bit more about how Spark internals work.

> session_window doesn't identify sessions with provided gap when used as a
> window function
> -
>
> Key: SPARK-46450
> URL: https://issues.apache.org/jira/browse/SPARK-46450
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.4.1, 3.5.0
> Reporter: Juan Pumarino
> Priority: Minor
>
> {{PARTITION BY session_window}} doesn't produce the expected results. Here's
> an example:
> {code:sql}
> SELECT
>   id,
>   ts,
>   collect_list(id) OVER (PARTITION BY session_window(ts, '1 hour')) as window_ids
> FROM VALUES
>   (1, "2023-12-11 01:10"),
>   (2, "2023-12-11 01:15"),
>   (3, "2023-12-11 01:40"),
>   (4, "2023-12-11 02:05"),
>   (5, "2023-12-11 03:15"),
>   (6, "2023-12-11 03:20"),
>   (7, "2023-12-11 04:10"),
>   (8, "2023-12-11 05:05")
>   AS tab(id, ts)
> {code}
> Actual result:
> {code:java}
> +---+----------------+----------+
> |id |ts              |window_ids|
> +---+----------------+----------+
> |1  |2023-12-11 01:10|[1]       |
> |2  |2023-12-11 01:15|[2]       |
> |3  |2023-12-11 01:40|[3]       |
> |4  |2023-12-11 02:05|[4]       |
> |5  |2023-12-11 03:15|[5]       |
> |6  |2023-12-11 03:20|[6]       |
> |7  |2023-12-11 04:10|[7]       |
> |8  |2023-12-11 05:05|[8]       |
> +---+----------------+----------+
> {code}
> Expected result, assigning rows to two sessions with 1-hour gap:
> {code:java}
> +---+----------------+------------+
> |id |ts              |window_ids  |
> +---+----------------+------------+
> |1  |2023-12-11 01:10|[1, 2, 3, 4]|
> |2  |2023-12-11 01:15|[1, 2, 3, 4]|
> |3  |2023-12-11 01:40|[1, 2, 3, 4]|
> |4  |2023-12-11 02:05|[1, 2, 3, 4]|
> |5  |2023-12-11 03:15|[5, 6, 7, 8]|
> |6  |2023-12-11 03:20|[5, 6, 7, 8]|
> |7  |2023-12-11 04:10|[5, 6, 7, 8]|
> |8  |2023-12-11 05:05|[5, 6, 7, 8]|
> +---+----------------+------------+
> {code}
> I compared its behavior with the results as a grouping function and with how
> {{window()}} behaves in both cases, which seems to confirm that the result is
> inconsistent. Here are the other examples:
>
> *{{group by window()}}*
> {code:sql}
> SELECT
>   collect_list(id) AS ids,
>   collect_list(ts) AS tss,
>   window
> FROM VALUES
>   (1, "2023-12-11 01:10"),
>   (2, "2023-12-11 01:15"),
>   (3, "2023-12-11 01:40"),
>   (4, "2023-12-11 02:05"),
>   (5, "2023-12-11 03:15"),
>   (6, "2023-12-11 03:20"),
>   (7, "2023-12-11 04:10"),
>   (8, "2023-12-11 05:05")
>   AS tab(id, ts)
> GROUP by window(ts, '1 hour')
> {code}
> Correctly assigns rows to 1-hour windows:
> {code:java}
> +---------+------------------------------------------------------+------------------------------------------+
> |ids      |tss                                                   |window                                    |
> +---------+------------------------------------------------------+------------------------------------------+
> |[1, 2, 3]|[2023-12-11 01:10, 2023-12-11 01:15, 2023-12-11 01:40]|{2023-12-11 01:00:00, 2023-12-11 02:00:00}|
> |[4]      |[2023-12-11 02:05]                                    |{2023-12-11 02:00:00, 2023-12-11 03:00:00}|
> |[5, 6]   |[2023-12-11 03:15, 2023-12-11 03:20]                  |{2023-12-11 03:00:00, 2023-12-11 04:00:00}|
> |[7]      |[2023-12-11 04:10]                                    |{2023-12-11 04:00:00, 2023-12-11 05:00:00}|
> |[8]      |[2023-12-11 05:05]                                    |{2023-12-11 05:00:00, 2023-12-11 06:00:00}|
> +---------+------------------------------------------------------+------------------------------------------+
> {code}
>
> *{{group by session_window()}}*
> {code:sql}
> SELECT
>   collect_list(id) AS ids,
>   collect_list(ts) AS tss,
>   session_window
> FROM VALUES
>   (1, "2023-12-11 01:10"),
>   (2, "2023-12-11 01:15"),
>   (3, "2023-12-11 01:40"),
>   (4, "2023-12-11 02:05"),
>   (5, "2023-12-11 03:15"),
>   (6, "2023-12-11 03:20"),
>   (7, "2023-12-11 04:10"),
>   (8, "2023-12-11 05:05")
>   AS tab(id, ts)
> GROUP by session_window(ts, '1 hour')
> {code}
> Correctly assigns rows to two sessions with 1-hour gap:
> {code:java}
> +++--+
> |ids |tss |session_window|
> +++--+
> |[1, 2, 3,
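For readers following SPARK-46450, the two groupings the report expects can be reproduced outside Spark. The snippet below is a minimal pure-Python sketch, not Spark's implementation: the helper names `assign_sessions` and `hourly_buckets` are hypothetical, and it assumes a session is extended whenever the next event arrives less than the gap after the previous one, while `window(ts, '1 hour')` is assumed to bucket rows by the hour they fall in.

```python
from datetime import datetime, timedelta

# The eight (id, ts) rows used in the report.
RAW = [
    (1, "2023-12-11 01:10"), (2, "2023-12-11 01:15"),
    (3, "2023-12-11 01:40"), (4, "2023-12-11 02:05"),
    (5, "2023-12-11 03:15"), (6, "2023-12-11 03:20"),
    (7, "2023-12-11 04:10"), (8, "2023-12-11 05:05"),
]
EVENTS = [(i, datetime.strptime(ts, "%Y-%m-%d %H:%M")) for i, ts in RAW]


def assign_sessions(events, gap):
    """Group event ids into sessions (hypothetical helper).

    A new session starts when the time since the previous event
    is at least `gap`; otherwise the current session is extended.
    """
    sessions = []
    last_ts = None
    for event_id, ts in sorted(events, key=lambda e: e[1]):
        if last_ts is None or ts - last_ts >= gap:
            sessions.append([])
        sessions[-1].append(event_id)
        last_ts = ts
    return sessions


def hourly_buckets(events):
    """Group event ids into fixed 1-hour windows (hypothetical helper)."""
    buckets = {}
    for event_id, ts in events:
        start = ts.replace(minute=0, second=0, microsecond=0)
        buckets.setdefault(start, []).append(event_id)
    return [ids for _, ids in sorted(buckets.items())]


print(assign_sessions(EVENTS, timedelta(hours=1)))
print(hourly_buckets(EVENTS))
```

The only gap of an hour or more is between 02:05 and 03:15, which is why the report expects two sessions, while fixed 1-hour windows split the same rows into five groups.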
[jira] [Created] (SPARK-46467) Improve and test exceptions of TimedeltaIndex
Xinrong Meng created SPARK-46467: Summary: Improve and test exceptions of TimedeltaIndex Key: SPARK-46467 URL: https://issues.apache.org/jira/browse/SPARK-46467 Project: Spark Issue Type: Sub-task Components: PySpark, Tests Affects Versions: 4.0.0 Reporter: Xinrong Meng
[jira] [Updated] (SPARK-46467) Improve and test exceptions of TimedeltaIndex
[ https://issues.apache.org/jira/browse/SPARK-46467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46467: --- Labels: pull-request-available (was: ) > Improve and test exceptions of TimedeltaIndex > - > > Key: SPARK-46467 > URL: https://issues.apache.org/jira/browse/SPARK-46467 > Project: Spark > Issue Type: Sub-task > Components: PySpark, Tests >Affects Versions: 4.0.0 >Reporter: Xinrong Meng >Priority: Major > Labels: pull-request-available
[jira] [Resolved] (SPARK-46447) Remove the legacy datetime rebasing SQL configs
[ https://issues.apache.org/jira/browse/SPARK-46447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-46447. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44402 [https://github.com/apache/spark/pull/44402] > Remove the legacy datetime rebasing SQL configs > --- > > Key: SPARK-46447 > URL: https://issues.apache.org/jira/browse/SPARK-46447 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > > Remove already deprecated SQL configs (alternatives to other configs): > - spark.sql.legacy.parquet.int96RebaseModeInWrite > - spark.sql.legacy.parquet.datetimeRebaseModeInWrite > - spark.sql.legacy.parquet.int96RebaseModeInRead > - spark.sql.legacy.parquet.datetimeRebaseModeInRead > - spark.sql.legacy.avro.datetimeRebaseModeInWrite > - spark.sql.legacy.avro.datetimeRebaseModeInRead
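For anyone still setting the removed keys, a migration helper might look like the sketch below. It is plain Python and rests on one assumption that the message does not spell out: that the surviving alternative of each removed config is the same key without the `legacy` segment (for example `spark.sql.parquet.int96RebaseModeInWrite`). The names `modern_key` and `migrate` are hypothetical.

```python
# The six keys listed in the issue description.
LEGACY_REBASE_CONFIGS = [
    "spark.sql.legacy.parquet.int96RebaseModeInWrite",
    "spark.sql.legacy.parquet.datetimeRebaseModeInWrite",
    "spark.sql.legacy.parquet.int96RebaseModeInRead",
    "spark.sql.legacy.parquet.datetimeRebaseModeInRead",
    "spark.sql.legacy.avro.datetimeRebaseModeInWrite",
    "spark.sql.legacy.avro.datetimeRebaseModeInRead",
]


def modern_key(legacy_key):
    """Drop the 'legacy' segment (assumed naming of the replacement key)."""
    prefix = "spark.sql.legacy."
    if not legacy_key.startswith(prefix):
        raise ValueError(f"not a legacy key: {legacy_key}")
    return "spark.sql." + legacy_key[len(prefix):]


def migrate(conf):
    """Return a copy of `conf` with removed legacy keys renamed.

    If both the legacy key and its replacement are present,
    the replacement's value is kept.
    """
    migrated = {}
    for key, value in conf.items():
        if key in LEGACY_REBASE_CONFIGS:
            migrated.setdefault(modern_key(key), value)
        else:
            migrated[key] = value
    return migrated
```

Keeping the replacement's value when both keys are set is a design choice of this sketch, not something the issue states.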
[jira] [Updated] (SPARK-46466) vectorized parquet reader should never do rebase for timestamp ntz
[ https://issues.apache.org/jira/browse/SPARK-46466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46466: --- Labels: pull-request-available (was: ) > vectorized parquet reader should never do rebase for timestamp ntz > -- > > Key: SPARK-46466 > URL: https://issues.apache.org/jira/browse/SPARK-46466 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0 >Reporter: Wenchen Fan >Priority: Major > Labels: pull-request-available
[jira] [Created] (SPARK-46466) vectorized parquet reader should never do rebase for timestamp ntz
Wenchen Fan created SPARK-46466: --- Summary: vectorized parquet reader should never do rebase for timestamp ntz Key: SPARK-46466 URL: https://issues.apache.org/jira/browse/SPARK-46466 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.4.0 Reporter: Wenchen Fan
[jira] [Commented] (SPARK-46461) The `sbt console` command is not available
[ https://issues.apache.org/jira/browse/SPARK-46461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17798975#comment-17798975 ] Yang Jie commented on SPARK-46461: -- cc [~srowen] [~dongjoon] [~gurwls223] I'm not certain if `sbt console` command needs to be fixed, but I've found configurations related to the `console` command in `SparkBuild.scala` that should have been unusable for quite some time. [https://github.com/apache/spark/blob/c9cfaac90fd423c3a38e295234e24744b946cb02/project/SparkBuild.scala#L1126-L1148] {code:java} object SQL { lazy val settings = Seq( (console / initialCommands) := """ |import org.apache.spark.SparkContext |import org.apache.spark.sql.SQLContext |import org.apache.spark.sql.catalyst.analysis._ |import org.apache.spark.sql.catalyst.dsl._ |import org.apache.spark.sql.catalyst.errors._ |import org.apache.spark.sql.catalyst.expressions._ |import org.apache.spark.sql.catalyst.plans.logical._ |import org.apache.spark.sql.catalyst.rules._ |import org.apache.spark.sql.catalyst.util._ |import org.apache.spark.sql.execution |import org.apache.spark.sql.functions._ |import org.apache.spark.sql.types._ | |val sc = new SparkContext("local[*]", "dev-shell") |val sqlContext = new SQLContext(sc) |import sqlContext.implicits._ |import sqlContext._ """.stripMargin, (console / cleanupCommands) := "sc.stop()" ) } {code} [https://github.com/apache/spark/blob/c9cfaac90fd423c3a38e295234e24744b946cb02/project/SparkBuild.scala#L1164-L1180] {code:java} (console / initialCommands) := """ |import org.apache.spark.SparkContext |import org.apache.spark.sql.catalyst.analysis._ |import org.apache.spark.sql.catalyst.dsl._ |import org.apache.spark.sql.catalyst.errors._ |import org.apache.spark.sql.catalyst.expressions._ |import org.apache.spark.sql.catalyst.plans.logical._ |import org.apache.spark.sql.catalyst.rules._ |import org.apache.spark.sql.catalyst.util._ |import org.apache.spark.sql.execution |import org.apache.spark.sql.functions._ |import 
org.apache.spark.sql.hive._ |import org.apache.spark.sql.hive.test.TestHive._ |import org.apache.spark.sql.hive.test.TestHive.implicits._ |import org.apache.spark.sql.types._""".stripMargin, (console / cleanupCommands) := "sparkContext.stop()", {code} > The `sbt console` command is not available > -- > > Key: SPARK-46461 > URL: https://issues.apache.org/jira/browse/SPARK-46461 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Minor > > # Unable to define expressions after executing the `build/sbt console` command > {code:java} > scala> val i = 1 // show > package $line3 { > sealed class $read extends _root_.scala.Serializable { > def () = { > super.; > () > }; > sealed class $iw extends _root_.java.io.Serializable { > def () = { > super.; > () > }; > val i = 1 > }; > val $iw = new $iw. > }; > object $read extends scala.AnyRef { > def () = { > super.; > () > }; > val INSTANCE = new $read. > } > } > warning: -target is deprecated: Use -release instead to compile against the > correct platform API. > Applicable -Wconf / @nowarn filters for this warning: msg= message>, cat=deprecation > ^ > error: expected class or object definition {code} > 2. Due to the default unused imports check, the error "unused imports" will > be reported after executing the `build/sbt sql/console` command > {code:java} > Welcome to Scala 2.13.12 (OpenJDK 64-Bit Server VM, Java 17.0.9). > Type in expressions for evaluation. Or try :help. > warning: -target is deprecated: Use -release instead to compile against the > correct platform API. 
> Applicable -Wconf / @nowarn filters for this warning: msg= message>, cat=deprecation > import org.apache.spark.sql.catalyst.errors._ > ^ > On line 6: error: object errors is not a member of package > org.apache.spark.sql.catalyst > import org.apache.spark.sql.catalyst.analysis._ > ^ > On line 4: error: Unused import > Applicable -Wconf / @nowarn filters for this fatal warning: msg= of the message>, cat=unused-imports, site= > import org.apache.spark.sql.catalyst.dsl._ > ^ > On line 5: error: Unused import > Applicable -Wconf / @nowarn filters for this fatal warning: msg= of the message>, cat=unused-imports, site= >
[jira] [Resolved] (SPARK-46452) Add a new API in DSv2 DataWriter to write an iterator of records
[ https://issues.apache.org/jira/browse/SPARK-46452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-46452. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44410 [https://github.com/apache/spark/pull/44410] > Add a new API in DSv2 DataWriter to write an iterator of records > > > Key: SPARK-46452 > URL: https://issues.apache.org/jira/browse/SPARK-46452 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Add a new API that takes an iterator of records.
[jira] [Assigned] (SPARK-46452) Add a new API in DSv2 DataWriter to write an iterator of records
[ https://issues.apache.org/jira/browse/SPARK-46452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-46452: --- Assignee: Allison Wang > Add a new API in DSv2 DataWriter to write an iterator of records > > > Key: SPARK-46452 > URL: https://issues.apache.org/jira/browse/SPARK-46452 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > > Add a new API that takes an iterator of records.
[jira] [Updated] (SPARK-46464) Fix the scroll issue of tables when overflow
[ https://issues.apache.org/jira/browse/SPARK-46464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46464: --- Labels: pull-request-available (was: ) > Fix the scroll issue of tables when overflow > > > Key: SPARK-46464 > URL: https://issues.apache.org/jira/browse/SPARK-46464 > Project: Spark > Issue Type: Bug > Components: Documentation >Affects Versions: 3.5.0 >Reporter: Kent Yao >Priority: Major > Labels: pull-request-available
[jira] [Updated] (SPARK-46465) Implement Column.isNaN
[ https://issues.apache.org/jira/browse/SPARK-46465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46465: --- Labels: pull-request-available (was: ) > Implement Column.isNaN > -- > > Key: SPARK-46465 > URL: https://issues.apache.org/jira/browse/SPARK-46465 > Project: Spark > Issue Type: New Feature > Components: Connect, PySpark >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Priority: Major > Labels: pull-request-available
[jira] [Created] (SPARK-46464) Fix the scroll issue of tables when overflow
Kent Yao created SPARK-46464: Summary: Fix the scroll issue of tables when overflow Key: SPARK-46464 URL: https://issues.apache.org/jira/browse/SPARK-46464 Project: Spark Issue Type: Bug Components: Documentation Affects Versions: 3.5.0 Reporter: Kent Yao
[jira] [Updated] (SPARK-46463) Reorganize `OpsOnDiffFramesGroupByExpandingTests`
[ https://issues.apache.org/jira/browse/SPARK-46463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46463: --- Labels: pull-request-available (was: ) > Reorganize `OpsOnDiffFramesGroupByExpandingTests` > - > > Key: SPARK-46463 > URL: https://issues.apache.org/jira/browse/SPARK-46463 > Project: Spark > Issue Type: Sub-task > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available
[jira] [Created] (SPARK-46463) Reorganize `OpsOnDiffFramesGroupByExpandingTests`
Ruifeng Zheng created SPARK-46463: - Summary: Reorganize `OpsOnDiffFramesGroupByExpandingTests` Key: SPARK-46463 URL: https://issues.apache.org/jira/browse/SPARK-46463 Project: Spark Issue Type: Sub-task Components: PS, Tests Affects Versions: 4.0.0 Reporter: Ruifeng Zheng
[jira] [Assigned] (SPARK-46052) Remove unnecessary TaskScheduler.killAllTaskAttempts
[ https://issues.apache.org/jira/browse/SPARK-46052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-46052: -- Assignee: Apache Spark > Remove unnecessary TaskScheduler.killAllTaskAttempts > > > Key: SPARK-46052 > URL: https://issues.apache.org/jira/browse/SPARK-46052 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.0.3, 3.1.3, 3.2.4, 3.3.3, 3.4.1, 3.5.0 >Reporter: wuyi >Assignee: Apache Spark >Priority: Major > Labels: pull-request-available > > Spark has two functions to kill all tasks in a Stage: > * `cancelTasks`: Not only kills all the running tasks in all the stage > attempts but also aborts all the stage attempts. > * `killAllTaskAttempts`: Only kills all the running tasks in all the stage > attempts but won't abort the attempts. > However, there's no use case in Spark where a stage would launch new tasks > after all its tasks get killed. So I think we can replace > `killAllTaskAttempts` with `cancelTasks` directly.
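The difference between the two functions can be illustrated with a toy model. This is not Spark's scheduler code: `StageAttempt`, `ToyScheduler` and the snake_case method names are hypothetical stand-ins that only mirror the behavior described in the issue.

```python
from dataclasses import dataclass, field


@dataclass
class StageAttempt:
    """Toy stand-in for one attempt of a stage."""
    running_tasks: set = field(default_factory=set)
    aborted: bool = False


class ToyScheduler:
    def __init__(self):
        self.stages = {}  # stage id -> list of StageAttempt

    def kill_all_task_attempts(self, stage_id):
        # Kills running tasks but leaves every attempt alive,
        # so in principle new tasks could still be launched.
        for attempt in self.stages.get(stage_id, []):
            attempt.running_tasks.clear()

    def cancel_tasks(self, stage_id):
        # Kills running tasks AND aborts every attempt.
        self.kill_all_task_attempts(stage_id)
        for attempt in self.stages.get(stage_id, []):
            attempt.aborted = True


scheduler = ToyScheduler()
scheduler.stages[0] = [StageAttempt({"t0", "t1"}), StageAttempt({"t2"})]
scheduler.stages[1] = [StageAttempt({"t3"})]
scheduler.kill_all_task_attempts(0)
scheduler.cancel_tasks(1)
```

In this model the only observable difference is the `aborted` flag, which matches the issue's argument: if no stage ever launches tasks again after they are all killed, the non-aborting variant has no remaining use.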
[jira] [Assigned] (SPARK-46052) Remove unnecessary TaskScheduler.killAllTaskAttempts
[ https://issues.apache.org/jira/browse/SPARK-46052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-46052: -- Assignee: (was: Apache Spark) > Remove unnecessary TaskScheduler.killAllTaskAttempts > > > Key: SPARK-46052 > URL: https://issues.apache.org/jira/browse/SPARK-46052 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.0.3, 3.1.3, 3.2.4, 3.3.3, 3.4.1, 3.5.0 >Reporter: wuyi >Priority: Major > Labels: pull-request-available > > Spark has two functions to kill all tasks in a Stage: > * `cancelTasks`: Not only kills all the running tasks in all the stage > attempts but also aborts all the stage attempts. > * `killAllTaskAttempts`: Only kills all the running tasks in all the stage > attempts but won't abort the attempts. > However, there's no use case in Spark where a stage would launch new tasks > after all its tasks get killed. So I think we can replace > `killAllTaskAttempts` with `cancelTasks` directly.
[jira] [Resolved] (SPARK-46330) Loading of Spark UI blocks for a long time when HybridStore enabled
[ https://issues.apache.org/jira/browse/SPARK-46330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao resolved SPARK-46330. -- Fix Version/s: 3.4.3 3.5.1 4.0.0 Resolution: Fixed Issue resolved by pull request 44260 [https://github.com/apache/spark/pull/44260] > Loading of Spark UI blocks for a long time when HybridStore enabled > --- > > Key: SPARK-46330 > URL: https://issues.apache.org/jira/browse/SPARK-46330 > Project: Spark > Issue Type: Bug > Components: UI >Affects Versions: 3.1.2, 3.3.1 >Reporter: Zhou Yifan >Assignee: Zhou Yifan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.3, 3.5.1, 4.0.0 > > > In our SparkHistoryServer, we used these two properties to speed up Spark UI's > loading: > {code:java} > spark.history.store.hybridStore.enabled true > spark.history.store.hybridStore.maxMemoryUsage 16g {code} > Occasionally, we found it took minutes to load a small event log that usually > took seconds. > In the jstack output of SparkHistoryServer, we found that 4 threads were > blocked and waiting to lock the > *org.apache.spark.deploy.history.FsHistoryProvider* object monitor, which was > locked by thread "spark-history-task-0" closing a HybridStore. 
> {code:java} > "qtp791499503-2688947" #2688947 daemon prio=5 os_prio=0 > tid=0x7f4044042800 nid=0x8d98 waiting for monitor entry > [0x7f3f6476] > java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.spark.deploy.history.FsHistoryProvider.getAppUI(FsHistoryProvider.scala:386) > - waiting to lock <0x0004c64433f0> (a > org.apache.spark.deploy.history.FsHistoryProvider) > at > org.apache.spark.deploy.history.HistoryServer.getAppUI(HistoryServer.scala:194) > at > org.apache.spark.deploy.history.ApplicationCache.$anonfun$loadApplicationEntry$2(ApplicationCache.scala:182) > at > org.apache.spark.deploy.history.ApplicationCache$$Lambda$805/90086258.apply(Unknown > Source) > at > org.apache.spark.deploy.history.ApplicationCache.time(ApplicationCache.scala:154) > at > org.apache.spark.deploy.history.ApplicationCache.org$apache$spark$deploy$history$ApplicationCache$$loadApplicationEntry(ApplicationCache.scala:180) > at > org.apache.spark.deploy.history.ApplicationCache$$anon$1.load(ApplicationCache.scala:71) > at > org.apache.spark.deploy.history.ApplicationCache$$anon$1.load(ApplicationCache.scala:58) > at > org.sparkproject.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599) > at > org.sparkproject.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379) > at > org.sparkproject.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342) > - locked <0x00066effc3e8> (a > org.sparkproject.guava.cache.LocalCache$StrongAccessEntry) > at > org.sparkproject.guava.cache.LocalCache$Segment.get(LocalCache.java:2257) > at org.sparkproject.guava.cache.LocalCache.get(LocalCache.java:4000) > at org.sparkproject.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004) > at > org.sparkproject.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874) > at > org.apache.spark.deploy.history.ApplicationCache.get(ApplicationCache.scala:108) > at > 
org.apache.spark.deploy.history.ApplicationCache.withSparkUI(ApplicationCache.scala:120) > at > org.apache.spark.deploy.history.HistoryServer.org$apache$spark$deploy$history$HistoryServer$$loadAppUi(HistoryServer.scala:251) > at > org.apache.spark.deploy.history.HistoryServer$$anon$1.doGet(HistoryServer.scala:99) > "spark-history-task-0" #49 daemon prio=5 os_prio=0 tid=0x7f431e55b800 > nid=0x1ac6 in Object.wait() [0x7f41b2cc9000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > at java.lang.Thread.join(Thread.java:1252) > - locked <0x00063ccbc9f0> (a java.lang.Thread) > at java.lang.Thread.join(Thread.java:1326) > at > org.apache.spark.deploy.history.HybridStore.close(HybridStore.scala:106) > at org.apache.spark.status.AppStatusStore.close(AppStatusStore.scala:553) > at > org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$invalidateUI$1(FsHistoryProvider.scala:913) > at > org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$invalidateUI$1$adapted(FsHistoryProvider.scala:911) > at > org.apache.spark.deploy.history.FsHistoryProvider$$Lambda$416/229723341.apply(Unknown > Source) > at scala.Option.foreach(Option.scala:407) > at > org.apache.spark.deploy.history.FsHistoryProvider.invalidateUI(FsHistoryProvider.scala:911) > - locked <0x0004c64433f0> (a > org.apache.spark.deploy.history.FsHistoryProvider)
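The pattern visible in the two stack traces, one thread holding the provider's monitor while it joins a background thread and request threads queueing on the same monitor, can be reproduced in miniature. The sketch below is plain Python, not Spark code: `ToyProvider`, `invalidate_ui`, `get_app_ui` and `demo` are hypothetical names, and the sleep durations are arbitrary.

```python
import threading
import time


class ToyProvider:
    """Toy stand-in for an object whose methods share one monitor."""

    def __init__(self):
        self._monitor = threading.Lock()

    def invalidate_ui(self, background_writer):
        # Mirrors the "spark-history-task-0" trace: the monitor is held
        # for as long as the join on the background thread takes.
        with self._monitor:
            background_writer.join()

    def get_app_ui(self):
        # Mirrors the blocked request threads: returns how long the
        # caller had to wait before it could take the monitor.
        start = time.monotonic()
        with self._monitor:
            return time.monotonic() - start


def demo(write_seconds=0.3):
    provider = ToyProvider()
    # Simulated slow background write that close() has to wait for.
    writer = threading.Thread(target=time.sleep, args=(write_seconds,))
    writer.start()
    closer = threading.Thread(target=provider.invalidate_ui, args=(writer,))
    closer.start()
    time.sleep(0.05)  # give the closer time to take the monitor
    waited = provider.get_app_ui()
    closer.join()
    return waited
```

Calling `demo()` returns roughly the remaining write time, which is the delay a UI request would see. Waiting for the background thread outside the monitor would remove that delay in this toy model; whether that is how the linked pull request resolves it is not stated in this message.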
[jira] [Assigned] (SPARK-46330) Loading of Spark UI blocks for a long time when HybridStore enabled
[ https://issues.apache.org/jira/browse/SPARK-46330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao reassigned SPARK-46330: Assignee: Zhou Yifan > Loading of Spark UI blocks for a long time when HybridStore enabled > --- > > Key: SPARK-46330 > URL: https://issues.apache.org/jira/browse/SPARK-46330 > Project: Spark > Issue Type: Bug > Components: UI >Affects Versions: 3.1.2, 3.3.1 >Reporter: Zhou Yifan >Assignee: Zhou Yifan >Priority: Major > Labels: pull-request-available > > In our SparkHistoryServer, we used these two properties to speed up Spark UI's > loading: > {code:java} > spark.history.store.hybridStore.enabled true > spark.history.store.hybridStore.maxMemoryUsage 16g {code} > Occasionally, we found it took minutes to load a small event log that usually > took seconds. > In the jstack output of SparkHistoryServer, we found that 4 threads were > blocked and waiting to lock the > *org.apache.spark.deploy.history.FsHistoryProvider* object monitor, which was > locked by thread "spark-history-task-0" closing a HybridStore. 
> {code:java} > "qtp791499503-2688947" #2688947 daemon prio=5 os_prio=0 > tid=0x7f4044042800 nid=0x8d98 waiting for monitor entry > [0x7f3f6476] > java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.spark.deploy.history.FsHistoryProvider.getAppUI(FsHistoryProvider.scala:386) > - waiting to lock <0x0004c64433f0> (a > org.apache.spark.deploy.history.FsHistoryProvider) > at > org.apache.spark.deploy.history.HistoryServer.getAppUI(HistoryServer.scala:194) > at > org.apache.spark.deploy.history.ApplicationCache.$anonfun$loadApplicationEntry$2(ApplicationCache.scala:182) > at > org.apache.spark.deploy.history.ApplicationCache$$Lambda$805/90086258.apply(Unknown > Source) > at > org.apache.spark.deploy.history.ApplicationCache.time(ApplicationCache.scala:154) > at > org.apache.spark.deploy.history.ApplicationCache.org$apache$spark$deploy$history$ApplicationCache$$loadApplicationEntry(ApplicationCache.scala:180) > at > org.apache.spark.deploy.history.ApplicationCache$$anon$1.load(ApplicationCache.scala:71) > at > org.apache.spark.deploy.history.ApplicationCache$$anon$1.load(ApplicationCache.scala:58) > at > org.sparkproject.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599) > at > org.sparkproject.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379) > at > org.sparkproject.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342) > - locked <0x00066effc3e8> (a > org.sparkproject.guava.cache.LocalCache$StrongAccessEntry) > at > org.sparkproject.guava.cache.LocalCache$Segment.get(LocalCache.java:2257) > at org.sparkproject.guava.cache.LocalCache.get(LocalCache.java:4000) > at org.sparkproject.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004) > at > org.sparkproject.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874) > at > org.apache.spark.deploy.history.ApplicationCache.get(ApplicationCache.scala:108) > at > 
org.apache.spark.deploy.history.ApplicationCache.withSparkUI(ApplicationCache.scala:120) > at > org.apache.spark.deploy.history.HistoryServer.org$apache$spark$deploy$history$HistoryServer$$loadAppUi(HistoryServer.scala:251) > at > org.apache.spark.deploy.history.HistoryServer$$anon$1.doGet(HistoryServer.scala:99) > "spark-history-task-0" #49 daemon prio=5 os_prio=0 tid=0x7f431e55b800 > nid=0x1ac6 in Object.wait() [0x7f41b2cc9000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > at java.lang.Thread.join(Thread.java:1252) > - locked <0x00063ccbc9f0> (a java.lang.Thread) > at java.lang.Thread.join(Thread.java:1326) > at > org.apache.spark.deploy.history.HybridStore.close(HybridStore.scala:106) > at org.apache.spark.status.AppStatusStore.close(AppStatusStore.scala:553) > at > org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$invalidateUI$1(FsHistoryProvider.scala:913) > at > org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$invalidateUI$1$adapted(FsHistoryProvider.scala:911) > at > org.apache.spark.deploy.history.FsHistoryProvider$$Lambda$416/229723341.apply(Unknown > Source) > at scala.Option.foreach(Option.scala:407) > at > org.apache.spark.deploy.history.FsHistoryProvider.invalidateUI(FsHistoryProvider.scala:911) > - locked <0x0004c64433f0> (a > org.apache.spark.deploy.history.FsHistoryProvider) > at > org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$checkForLogs$7(FsHistoryProvider.scala:541) > at >
[jira] [Updated] (SPARK-46460) The filter of partition including cast function may lead the partition pruning to disable
[ https://issues.apache.org/jira/browse/SPARK-46460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhou Tong updated SPARK-46460: -- Summary: The filter of partition including cast function may lead the partition pruning to disable (was: The filter of partition includes cast function may lead the partition pruning to disable) > The filter of partition including cast function may lead the partition > pruning to disable > - > > Key: SPARK-46460 > URL: https://issues.apache.org/jira/browse/SPARK-46460 > Project: Spark > Issue Type: Improvement > Components: Optimizer, SQL >Affects Versions: 3.2.0 >Reporter: Zhou Tong >Priority: Minor > Labels: pull-request-available > Attachments: SPARK-46460.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > SQL: select * from test_db.test_table where day between > date_sub('2023-12-01',1) and '2023-12-03' > The physical plan of the SQL above applies a _cast_ function to the partition > column 'day', like this: {_}cast(day as date) > 2023-11-30{_}. In this > situation, Spark passes only the filter condition _day < "2023-12-03"_ to the > Hive Metastore (HMS), omitting the filter condition {_}cast(day as date) > > 2023-11-30{_}, which may degrade HMS performance if the Hive table has a > huge number of partitions. > > In this regard, a new rule may solve this problem. This rule can convert the > binary comparison _cast(day as date) > 2023-11-30_ to {_}day > > cast(2023-11-30 as string){_}. The right node is foldable, so the result is > {_}day > "2023-11-30"{_}, and the filter condition passed to HMS will be _day > > "2023-11-30" and_ _day < "2023-12-03"._
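The proposed rewrite can be sketched on a toy expression tree. This is plain Python, not Catalyst: the classes `Column`, `Cast`, `Literal`, `GreaterThan` and the function `unwrap_cast` are hypothetical, and the sketch assumes the string partition values are always canonical `yyyy-MM-dd` dates, since only then does string ordering agree with date ordering.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Cast:
    child: object
    to_type: str


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class GreaterThan:
    left: object
    right: object


def unwrap_cast(pred):
    """Rewrite cast(col as date) > <date literal> into col > '<yyyy-MM-dd>'.

    Moving the cast to the literal side makes the right-hand side a
    constant that can be folded, leaving a bare column comparison.
    Predicates of any other shape are returned unchanged.
    """
    if (
        isinstance(pred, GreaterThan)
        and isinstance(pred.left, Cast)
        and pred.left.to_type == "date"
        and isinstance(pred.left.child, Column)
        and isinstance(pred.right, Literal)
        and isinstance(pred.right.value, date)
    ):
        folded = Literal(pred.right.value.isoformat())  # cast(literal as string)
        return GreaterThan(pred.left.child, folded)
    return pred


before = GreaterThan(Cast(Column("day"), "date"), Literal(date(2023, 11, 30)))
after = unwrap_cast(before)
```

Here `after` compares the bare column `day` with the string `2023-11-30`, which is the form the issue says can be passed to HMS. A real rule would also need to cover the other comparison operators and to refuse the rewrite when the canonical-format assumption cannot be guaranteed.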
[jira] [Updated] (SPARK-28386) Cannot resolve ORDER BY columns with GROUP BY and HAVING
[ https://issues.apache.org/jira/browse/SPARK-28386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao updated SPARK-28386: - Fix Version/s: 4.0.0 > Cannot resolve ORDER BY columns with GROUP BY and HAVING > > > Key: SPARK-28386 > URL: https://issues.apache.org/jira/browse/SPARK-28386 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Yuming Wang >Assignee: Cheng Pan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > How to reproduce: > {code:sql} > CREATE TABLE test_having (a int, b int, c string, d string) USING parquet; > INSERT INTO test_having VALUES (0, 1, '', 'A'); > INSERT INTO test_having VALUES (1, 2, '', 'b'); > INSERT INTO test_having VALUES (2, 2, '', 'c'); > INSERT INTO test_having VALUES (3, 3, '', 'D'); > INSERT INTO test_having VALUES (4, 3, '', 'e'); > INSERT INTO test_having VALUES (5, 3, '', 'F'); > INSERT INTO test_having VALUES (6, 4, '', 'g'); > INSERT INTO test_having VALUES (7, 4, '', 'h'); > INSERT INTO test_having VALUES (8, 4, '', 'I'); > INSERT INTO test_having VALUES (9, 4, '', 'j'); > SELECT lower(c), count(c) FROM test_having > GROUP BY lower(c) HAVING count(*) > 2 > ORDER BY lower(c); > {code} > {noformat} > spark-sql> SELECT lower(c), count(c) FROM test_having > > GROUP BY lower(c) HAVING count(*) > 2 > > ORDER BY lower(c); > Error in query: cannot resolve '`c`' given input columns: [lower(c), > count(c)]; line 3 pos 19; > 'Sort ['lower('c) ASC NULLS FIRST], true > +- Project [lower(c)#158, count(c)#159L] >+- Filter (count(1)#161L > cast(2 as bigint)) > +- Aggregate [lower(c#7)], [lower(c#7) AS lower(c)#158, count(c#7) AS > count(c)#159L, count(1) AS count(1)#161L] > +- SubqueryAlias test_having > +- Relation[a#5,b#6,c#7,d#8] parquet > {noformat} > But it works when setting an alias: > {noformat} > spark-sql> SELECT lower(c) withAias, count(c) FROM test_having > > GROUP BY lower(c) HAVING count(*) > 2 > > ORDER BY withAias; > 3 > 4 > {noformat} -- 
[jira] [Resolved] (SPARK-28386) Cannot resolve ORDER BY columns with GROUP BY and HAVING
[ https://issues.apache.org/jira/browse/SPARK-28386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao resolved SPARK-28386. -- Assignee: Cheng Pan Resolution: Fixed Issue resolved by https://github.com/apache/spark/pull/44352 > Cannot resolve ORDER BY columns with GROUP BY and HAVING > > > Key: SPARK-28386 > URL: https://issues.apache.org/jira/browse/SPARK-28386 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Yuming Wang >Assignee: Cheng Pan >Priority: Major > Labels: pull-request-available > > How to reproduce: > {code:sql} > CREATE TABLE test_having (a int, b int, c string, d string) USING parquet; > INSERT INTO test_having VALUES (0, 1, '', 'A'); > INSERT INTO test_having VALUES (1, 2, '', 'b'); > INSERT INTO test_having VALUES (2, 2, '', 'c'); > INSERT INTO test_having VALUES (3, 3, '', 'D'); > INSERT INTO test_having VALUES (4, 3, '', 'e'); > INSERT INTO test_having VALUES (5, 3, '', 'F'); > INSERT INTO test_having VALUES (6, 4, '', 'g'); > INSERT INTO test_having VALUES (7, 4, '', 'h'); > INSERT INTO test_having VALUES (8, 4, '', 'I'); > INSERT INTO test_having VALUES (9, 4, '', 'j'); > SELECT lower(c), count(c) FROM test_having > GROUP BY lower(c) HAVING count(*) > 2 > ORDER BY lower(c); > {code} > {noformat} > spark-sql> SELECT lower(c), count(c) FROM test_having > > GROUP BY lower(c) HAVING count(*) > 2 > > ORDER BY lower(c); > Error in query: cannot resolve '`c`' given input columns: [lower(c), > count(c)]; line 3 pos 19; > 'Sort ['lower('c) ASC NULLS FIRST], true > +- Project [lower(c)#158, count(c)#159L] >+- Filter (count(1)#161L > cast(2 as bigint)) > +- Aggregate [lower(c#7)], [lower(c#7) AS lower(c)#158, count(c#7) AS > count(c)#159L, count(1) AS count(1)#161L] > +- SubqueryAlias test_having > +- Relation[a#5,b#6,c#7,d#8] parquet > {noformat} > But it works when setting an alias: > {noformat} > spark-sql> SELECT lower(c) withAias, count(c) FROM test_having > > GROUP BY lower(c) 
HAVING count(*) > 2 > > ORDER BY withAias; > 3 > 4 > {noformat}
[jira] [Updated] (SPARK-46462) Reorganize `OpsOnDiffFramesGroupByRollingTests`
[ https://issues.apache.org/jira/browse/SPARK-46462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46462: --- Labels: pull-request-available (was: ) > Reorganize `OpsOnDiffFramesGroupByRollingTests` > --- > > Key: SPARK-46462 > URL: https://issues.apache.org/jira/browse/SPARK-46462 > Project: Spark > Issue Type: Sub-task > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available
[jira] [Created] (SPARK-46462) Reorganize `OpsOnDiffFramesGroupByRollingTests`
Ruifeng Zheng created SPARK-46462: - Summary: Reorganize `OpsOnDiffFramesGroupByRollingTests` Key: SPARK-46462 URL: https://issues.apache.org/jira/browse/SPARK-46462 Project: Spark Issue Type: Sub-task Components: PS, Tests Affects Versions: 4.0.0 Reporter: Ruifeng Zheng
[jira] [Assigned] (SPARK-46399) Add exit status to the Application End event for the use of Spark Listener
[ https://issues.apache.org/jira/browse/SPARK-46399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan reassigned SPARK-46399: --- Assignee: Reza Safi > Add exit status to the Application End event for the use of Spark Listener > -- > > Key: SPARK-46399 > URL: https://issues.apache.org/jira/browse/SPARK-46399 > Project: Spark > Issue Type: New Feature > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: Reza Safi >Assignee: Reza Safi >Priority: Minor > Labels: pull-request-available > > Currently SparkListenerApplicationEnd only has a timestamp value and there is > no exit status recorded with it.
[jira] [Resolved] (SPARK-46399) Add exit status to the Application End event for the use of Spark Listener
[ https://issues.apache.org/jira/browse/SPARK-46399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan resolved SPARK-46399. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44340 [https://github.com/apache/spark/pull/44340] > Add exit status to the Application End event for the use of Spark Listener > -- > > Key: SPARK-46399 > URL: https://issues.apache.org/jira/browse/SPARK-46399 > Project: Spark > Issue Type: New Feature > Components: Spark Core > Affects Versions: 3.5.0 > Reporter: Reza Safi > Assignee: Reza Safi > Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Currently, SparkListenerApplicationEnd only has a timestamp value, and no exit status is recorded with it.
[jira] [Resolved] (SPARK-46272) Support CTAS using DSv2 sources
[ https://issues.apache.org/jira/browse/SPARK-46272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-46272. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44190 [https://github.com/apache/spark/pull/44190] > Support CTAS using DSv2 sources > --- > > Key: SPARK-46272 > URL: https://issues.apache.org/jira/browse/SPARK-46272 > Project: Spark > Issue Type: Sub-task > Components: SQL > Affects Versions: 4.0.0 > Reporter: Allison Wang > Assignee: Allison Wang > Priority: Major > Labels: pull-request-available > Fix For: 4.0.0
[jira] [Assigned] (SPARK-46272) Support CTAS using DSv2 sources
[ https://issues.apache.org/jira/browse/SPARK-46272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-46272: --- Assignee: Allison Wang > Support CTAS using DSv2 sources > --- > > Key: SPARK-46272 > URL: https://issues.apache.org/jira/browse/SPARK-46272 > Project: Spark > Issue Type: Sub-task > Components: SQL > Affects Versions: 4.0.0 > Reporter: Allison Wang > Assignee: Allison Wang > Priority: Major > Labels: pull-request-available