[jira] [Created] (SPARK-37532) RDD name could be very long and memory costly

2021-12-02 Thread Kent Yao (Jira)
Kent Yao created SPARK-37532: Summary: RDD name could be very long and memory costly Key: SPARK-37532 URL: https://issues.apache.org/jira/browse/SPARK-37532 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-37531) Use PyArrow 6.0.0 in Python 3.9 tests at GitHub Action job

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37531: Assignee: (was: Apache Spark) > Use PyArrow 6.0.0 in Python 3.9 tests at GitHub

[jira] [Assigned] (SPARK-37531) Use PyArrow 6.0.0 in Python 3.9 tests at GitHub Action job

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37531: Assignee: Apache Spark > Use PyArrow 6.0.0 in Python 3.9 tests at GitHub Action job >

[jira] [Commented] (SPARK-37531) Use PyArrow 6.0.0 in Python 3.9 tests at GitHub Action job

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452804#comment-17452804 ] Apache Spark commented on SPARK-37531: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Updated] (SPARK-37531) Use PyArrow 6.0.0 in Python 3.9 tests at GitHub Action job

2021-12-02 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37531: -- Summary: Use PyArrow 6.0.0 in Python 3.9 tests at GitHub Action job (was: Use PyArrow 6.0.0

[jira] [Created] (SPARK-37531) Use PyArrow 6.0.0 in PySpark UT GitHub Action job

2021-12-02 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-37531: - Summary: Use PyArrow 6.0.0 in PySpark UT GitHub Action job Key: SPARK-37531 URL: https://issues.apache.org/jira/browse/SPARK-37531 Project: Spark Issue

[jira] [Commented] (SPARK-37530) Spark reads many paths very slow though newAPIHadoopFile

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452795#comment-17452795 ] Apache Spark commented on SPARK-37530: -- User 'yaooqinn' has created a pull request for this issue:

[jira] [Commented] (SPARK-37530) Spark reads many paths very slow though newAPIHadoopFile

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452794#comment-17452794 ] Apache Spark commented on SPARK-37530: -- User 'yaooqinn' has created a pull request for this issue:

[jira] [Assigned] (SPARK-37530) Spark reads many paths very slow though newAPIHadoopFile

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37530: Assignee: Apache Spark > Spark reads many paths very slow though newAPIHadoopFile >

[jira] [Assigned] (SPARK-37530) Spark reads many paths very slow though newAPIHadoopFile

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37530: Assignee: (was: Apache Spark) > Spark reads many paths very slow though

[jira] [Created] (SPARK-37530) Spark reads many paths very slow though newAPIHadoopFile

2021-12-02 Thread Kent Yao (Jira)
Kent Yao created SPARK-37530: Summary: Spark reads many paths very slow though newAPIHadoopFile Key: SPARK-37530 URL: https://issues.apache.org/jira/browse/SPARK-37530 Project: Spark Issue Type:

[jira] [Commented] (SPARK-37510) Support TimedeltaIndex in pandas API on Spark

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452772#comment-17452772 ] Apache Spark commented on SPARK-37510: -- User 'xinrong-databricks' has created a pull request for

[jira] [Assigned] (SPARK-37510) Support TimedeltaIndex in pandas API on Spark

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37510: Assignee: Apache Spark > Support TimedeltaIndex in pandas API on Spark >

[jira] [Assigned] (SPARK-37510) Support TimedeltaIndex in pandas API on Spark

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37510: Assignee: (was: Apache Spark) > Support TimedeltaIndex in pandas API on Spark >

[jira] [Commented] (SPARK-37510) Support TimedeltaIndex in pandas API on Spark

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452770#comment-17452770 ] Apache Spark commented on SPARK-37510: -- User 'xinrong-databricks' has created a pull request for

[jira] [Assigned] (SPARK-37528) Support reorder tasks during scheduling by shuffle partition size in AQE

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37528: Assignee: (was: Apache Spark) > Support reorder tasks during scheduling by shuffle

[jira] [Assigned] (SPARK-37528) Support reorder tasks during scheduling by shuffle partition size in AQE

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37528: Assignee: Apache Spark > Support reorder tasks during scheduling by shuffle partition

[jira] [Commented] (SPARK-37528) Support reorder tasks during scheduling by shuffle partition size in AQE

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452769#comment-17452769 ] Apache Spark commented on SPARK-37528: -- User 'ulysses-you' has created a pull request for this

[jira] [Assigned] (SPARK-37529) Support K8s integration tests for Java 17

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37529: Assignee: Apache Spark (was: Kousuke Saruta) > Support K8s integration tests for Java

[jira] [Commented] (SPARK-37529) Support K8s integration tests for Java 17

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452767#comment-17452767 ] Apache Spark commented on SPARK-37529: -- User 'sarutak' has created a pull request for this issue:

[jira] [Assigned] (SPARK-37529) Support K8s integration tests for Java 17

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37529: Assignee: Kousuke Saruta (was: Apache Spark) > Support K8s integration tests for Java

[jira] [Created] (SPARK-37529) Support K8s integration tests for Java 17

2021-12-02 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-37529: -- Summary: Support K8s integration tests for Java 17 Key: SPARK-37529 URL: https://issues.apache.org/jira/browse/SPARK-37529 Project: Spark Issue Type:

[jira] [Updated] (SPARK-37528) Support reorder tasks during scheduling by shuffle partition size in AQE

2021-12-02 Thread XiDuo You (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] XiDuo You updated SPARK-37528: -- Description: Reordering tasks by input size can reduce the whole stage execution time. Assume the larger

[jira] [Updated] (SPARK-37521) insert overwrite table but the partition information stored in Metastore was not changed

2021-12-02 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-37521: - Priority: Major (was: Blocker) > insert overwrite table but the partition information stored

[jira] [Updated] (SPARK-37528) Support reorder tasks during scheduling by shuffle partition size in AQE

2021-12-02 Thread XiDuo You (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] XiDuo You updated SPARK-37528: -- Description: Reordering tasks by input size can reduce the whole stage execution time. Assume the larger

[jira] [Assigned] (SPARK-37526) Add Java17 PySpark daily test coverage

2021-12-02 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-37526: Assignee: Dongjoon Hyun (was: Apache Spark) > Add Java17 PySpark daily test coverage >

[jira] [Resolved] (SPARK-37526) Add Java17 PySpark daily test coverage

2021-12-02 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-37526. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34788

[jira] [Assigned] (SPARK-37526) Add Java17 PySpark daily test coverage

2021-12-02 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-37526: Assignee: Apache Spark > Add Java17 PySpark daily test coverage >

[jira] [Created] (SPARK-37528) Support reorder tasks during scheduling by shuffle partition size in AQE

2021-12-02 Thread XiDuo You (Jira)
XiDuo You created SPARK-37528: - Summary: Support reorder tasks during scheduling by shuffle partition size in AQE Key: SPARK-37528 URL: https://issues.apache.org/jira/browse/SPARK-37528 Project: Spark

[jira] [Commented] (SPARK-18105) LZ4 failed to decompress a stream of shuffled data

2021-12-02 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-18105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452730#comment-17452730 ] Yuming Wang commented on SPARK-18105: - Work around this issue by setting spark.io.compression.codec=zstd.

[jira] [Assigned] (SPARK-37519) Support Relation With LateralView

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37519: Assignee: Apache Spark > Support Relation With LateralView >

[jira] [Assigned] (SPARK-37519) Support Relation With LateralView

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37519: Assignee: (was: Apache Spark) > Support Relation With LateralView >

[jira] [Resolved] (SPARK-37524) We should drop all tables after testing dynamic partition pruning

2021-12-02 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-37524. -- Fix Version/s: 3.3.0 3.0.4 3.2.1

[jira] [Assigned] (SPARK-37524) We should drop all tables after testing dynamic partition pruning

2021-12-02 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-37524: Assignee: weixiuli > We should drop all tables after testing dynamic partition pruning >

[jira] [Commented] (SPARK-37527) Translate more standard aggregate functions for pushdown

2021-12-02 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452721#comment-17452721 ] jiaan.geng commented on SPARK-37527: I'm working on it. > Translate more standard aggregate functions

[jira] [Created] (SPARK-37527) Translate more standard aggregate functions for pushdown

2021-12-02 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-37527: -- Summary: Translate more standard aggregate functions for pushdown Key: SPARK-37527 URL: https://issues.apache.org/jira/browse/SPARK-37527 Project: Spark Issue

[jira] [Commented] (SPARK-37526) Add Java17 PySpark daily test coverage

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452720#comment-17452720 ] Apache Spark commented on SPARK-37526: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-37526) Add Java17 PySpark daily test coverage

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452719#comment-17452719 ] Apache Spark commented on SPARK-37526: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-37526) Add Java17 PySpark daily test coverage

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37526: Assignee: (was: Apache Spark) > Add Java17 PySpark daily test coverage >

[jira] [Assigned] (SPARK-37526) Add Java17 PySpark daily test coverage

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37526: Assignee: Apache Spark > Add Java17 PySpark daily test coverage >

[jira] [Created] (SPARK-37526) Add Java17 PySpark daily test coverage

2021-12-02 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-37526: - Summary: Add Java17 PySpark daily test coverage Key: SPARK-37526 URL: https://issues.apache.org/jira/browse/SPARK-37526 Project: Spark Issue Type:

[jira] [Updated] (SPARK-37512) Support TimedeltaIndex creation (from Series/Index) and TimedeltaIndex.astype

2021-12-02 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-37512: - Fix Version/s: 3.3.0 > Support TimedeltaIndex creation (from Series/Index) and

[jira] [Resolved] (SPARK-37512) Support TimedeltaIndex creation (from Series/Index) and TimedeltaIndex.astype

2021-12-02 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-37512. -- Assignee: Xinrong Meng Resolution: Fixed Fixed in

[jira] [Created] (SPARK-37525) Support basic operations of timedelta Series/Index

2021-12-02 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-37525: Summary: Support basic operations of timedelta Series/Index Key: SPARK-37525 URL: https://issues.apache.org/jira/browse/SPARK-37525 Project: Spark Issue

[jira] [Assigned] (SPARK-37524) We should drop all tables after testing dynamic partition pruning

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37524: Assignee: Apache Spark > We should drop all tables after testing dynamic partition

[jira] [Assigned] (SPARK-37524) We should drop all tables after testing dynamic partition pruning

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37524: Assignee: (was: Apache Spark) > We should drop all tables after testing dynamic

[jira] [Commented] (SPARK-37524) We should drop all tables after testing dynamic partition pruning

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452694#comment-17452694 ] Apache Spark commented on SPARK-37524: -- User 'weixiuli' has created a pull request for this issue:

[jira] [Created] (SPARK-37524) We should drop all tables after testing dynamic partition pruning

2021-12-02 Thread weixiuli (Jira)
weixiuli created SPARK-37524: Summary: We should drop all tables after testing dynamic partition pruning Key: SPARK-37524 URL: https://issues.apache.org/jira/browse/SPARK-37524 Project: Spark

[jira] [Resolved] (SPARK-37520) Add the startswith() and endswith() string functions

2021-12-02 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-37520. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34782

[jira] [Assigned] (SPARK-37504) pyspark should not pass all options to session states.

2021-12-02 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-37504: Assignee: angerszhu > pyspark should not pass all options to session states. >

[jira] [Resolved] (SPARK-37504) pyspark should not pass all options to session states.

2021-12-02 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-37504. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34757

[jira] [Commented] (SPARK-37508) Add CONTAINS() function

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452659#comment-17452659 ] Apache Spark commented on SPARK-37508: -- User 'AngersZh' has created a pull request for this

[jira] [Commented] (SPARK-37508) Add CONTAINS() function

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452658#comment-17452658 ] Apache Spark commented on SPARK-37508: -- User 'AngersZh' has created a pull request for this

[jira] [Commented] (SPARK-37392) Catalyst optimizer very time-consuming and memory-intensive with some "explode(array)"

2021-12-02 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452656#comment-17452656 ] Josh Rosen commented on SPARK-37392: When I ran this in {{spark-shell}} it triggered an OOM in

[jira] [Assigned] (SPARK-37523) Support optimize skewed partitions in Distribution and Ordering if numPartitions is not specified

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37523: Assignee: (was: Apache Spark) > Support optimize skewed partitions in Distribution

[jira] [Assigned] (SPARK-37523) Support optimize skewed partitions in Distribution and Ordering if numPartitions is not specified

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37523: Assignee: Apache Spark > Support optimize skewed partitions in Distribution and Ordering

[jira] [Commented] (SPARK-37523) Support optimize skewed partitions in Distribution and Ordering if numPartitions is not specified

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452574#comment-17452574 ] Apache Spark commented on SPARK-37523: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Created] (SPARK-37523) Support optimize skewed partitions in Distribution and Ordering if numPartitions is not specified

2021-12-02 Thread Huaxin Gao (Jira)
Huaxin Gao created SPARK-37523: -- Summary: Support optimize skewed partitions in Distribution and Ordering if numPartitions is not specified Key: SPARK-37523 URL: https://issues.apache.org/jira/browse/SPARK-37523

[jira] [Resolved] (SPARK-37522) Fix MultilayerPerceptronClassifierTest.test_raw_and_probability_prediction

2021-12-02 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-37522. --- Fix Version/s: 3.3.0 3.2.1 Resolution: Fixed Issue resolved by

[jira] [Assigned] (SPARK-37522) Fix MultilayerPerceptronClassifierTest.test_raw_and_probability_prediction

2021-12-02 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-37522: - Assignee: Dongjoon Hyun > Fix

[jira] [Commented] (SPARK-37522) Fix MultilayerPerceptronClassifierTest.test_raw_and_probability_prediction

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452530#comment-17452530 ] Apache Spark commented on SPARK-37522: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-37522) Fix MultilayerPerceptronClassifierTest.test_raw_and_probability_prediction

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37522: Assignee: Apache Spark > Fix

[jira] [Commented] (SPARK-37522) Fix MultilayerPerceptronClassifierTest.test_raw_and_probability_prediction

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452531#comment-17452531 ] Apache Spark commented on SPARK-37522: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-37522) Fix MultilayerPerceptronClassifierTest.test_raw_and_probability_prediction

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37522: Assignee: (was: Apache Spark) > Fix

[jira] [Created] (SPARK-37522) Fix MultilayerPerceptronClassifierTest.test_raw_and_probability_prediction

2021-12-02 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-37522: - Summary: Fix MultilayerPerceptronClassifierTest.test_raw_and_probability_prediction Key: SPARK-37522 URL: https://issues.apache.org/jira/browse/SPARK-37522

[jira] [Updated] (SPARK-37521) insert overwrite table but the partition information stored in Metastore was not changed

2021-12-02 Thread jingxiong zhong (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jingxiong zhong updated SPARK-37521: Summary: insert overwrite table but the partition information stored in Metastore was not

[jira] [Commented] (SPARK-37521) insert overwrite table but didn't change the message of metastore

2021-12-02 Thread jingxiong zhong (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452521#comment-17452521 ] jingxiong zhong commented on SPARK-37521: - The schema of metastore's updated partition was not

[jira] [Updated] (SPARK-37521) insert overwrite table but didn't change the message of metastore

2021-12-02 Thread jingxiong zhong (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jingxiong zhong updated SPARK-37521: Description: I create a partitioned table in SparkSQL, insert a data entry, add a regular

[jira] [Resolved] (SPARK-37450) Spark SQL reads unnecessary nested fields (another type of pruning case)

2021-12-02 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh resolved SPARK-37450. - Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34701

[jira] [Assigned] (SPARK-37450) Spark SQL reads unnecessary nested fields (another type of pruning case)

2021-12-02 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh reassigned SPARK-37450: --- Assignee: L. C. Hsieh > Spark SQL reads unnecessary nested fields (another type of pruning

[jira] [Updated] (SPARK-37515) minRatePerPartition works as "max messages per partition per a batch" (it should be per seconds)

2021-12-02 Thread Sungpeo Kook (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sungpeo Kook updated SPARK-37515: - Component/s: Structured Streaming (was: Spark Core) > minRatePerPartition

[jira] [Assigned] (SPARK-37442) In AQE, wrong InMemoryRelation size estimation causes "Cannot broadcast the table that is larger than 8GB: 8 GB" failure

2021-12-02 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-37442: --- Assignee: Michael Chen > In AQE, wrong InMemoryRelation size estimation causes "Cannot

[jira] [Resolved] (SPARK-37442) In AQE, wrong InMemoryRelation size estimation causes "Cannot broadcast the table that is larger than 8GB: 8 GB" failure

2021-12-02 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-37442. - Fix Version/s: 3.3.0 3.2.1 Resolution: Fixed Issue resolved by pull

[jira] [Created] (SPARK-37521) insert overwrite table but didn't change the message of metastore

2021-12-02 Thread jingxiong zhong (Jira)
jingxiong zhong created SPARK-37521: --- Summary: insert overwrite table but didn't change the message of metastore Key: SPARK-37521 URL: https://issues.apache.org/jira/browse/SPARK-37521 Project:

[jira] [Commented] (SPARK-37520) Add the startswith() and endswith() string functions

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452376#comment-17452376 ] Apache Spark commented on SPARK-37520: -- User 'MaxGekk' has created a pull request for this issue:

[jira] [Assigned] (SPARK-37520) Add the startswith() and endswith() string functions

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37520: Assignee: Max Gekk (was: Apache Spark) > Add the startswith() and endswith() string

[jira] [Assigned] (SPARK-37520) Add the startswith() and endswith() string functions

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37520: Assignee: Apache Spark (was: Max Gekk) > Add the startswith() and endswith() string

[jira] [Created] (SPARK-37520) Add the startswith() and endswith() string functions

2021-12-02 Thread Max Gekk (Jira)
Max Gekk created SPARK-37520: Summary: Add the startswith() and endswith() string functions Key: SPARK-37520 URL: https://issues.apache.org/jira/browse/SPARK-37520 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-37517) Keep consistent order of columns with user specify for v1 table

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452355#comment-17452355 ] Apache Spark commented on SPARK-37517: -- User 'Peng-Lei' has created a pull request for this issue:

[jira] [Commented] (SPARK-37517) Keep consistent order of columns with user specify for v1 table

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452353#comment-17452353 ] Apache Spark commented on SPARK-37517: -- User 'Peng-Lei' has created a pull request for this issue:

[jira] [Assigned] (SPARK-37517) Keep consistent order of columns with user specify for v1 table

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37517: Assignee: (was: Apache Spark) > Keep consistent order of columns with user specify

[jira] [Assigned] (SPARK-37517) Keep consistent order of columns with user specify for v1 table

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37517: Assignee: Apache Spark > Keep consistent order of columns with user specify for v1 table

[jira] [Updated] (SPARK-37392) Catalyst optimizer very time-consuming and memory-intensive with some "explode(array)"

2021-12-02 Thread Francois MARTIN (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois MARTIN updated SPARK-37392: Component/s: Optimizer (was: Spark Core) > Catalyst optimizer very

[jira] [Assigned] (SPARK-37518) inject a early scan pushdown rule

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37518: Assignee: Apache Spark > inject a early scan pushdown rule >

[jira] [Commented] (SPARK-37518) inject a early scan pushdown rule

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452337#comment-17452337 ] Apache Spark commented on SPARK-37518: -- User 'beliefer' has created a pull request for this issue:

[jira] [Assigned] (SPARK-37518) inject a early scan pushdown rule

2021-12-02 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37518: Assignee: (was: Apache Spark) > inject a early scan pushdown rule >

[jira] [Updated] (SPARK-37518) inject a early scan pushdown rule

2021-12-02 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-37518: --- Summary: inject a early scan pushdown rule (was: inject a early scan push down rule) > inject a

[jira] [Updated] (SPARK-37519) Support Relation With LateralView

2021-12-02 Thread Tongwei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tongwei updated SPARK-37519: Description: {code:java} CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING);

[jira] [Updated] (SPARK-37519) Support Relation With LateralView

2021-12-02 Thread Tongwei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tongwei updated SPARK-37519: Description: {code:java} CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING);

[jira] [Updated] (SPARK-37519) Support Relation With LateralView

2021-12-02 Thread Tongwei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tongwei updated SPARK-37519: Description: ``` CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING); INSERT

[jira] [Created] (SPARK-37519) Support Relation With LateralView

2021-12-02 Thread Tongwei (Jira)
Tongwei created SPARK-37519: --- Summary: Support Relation With LateralView Key: SPARK-37519 URL: https://issues.apache.org/jira/browse/SPARK-37519 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-37518) inject a early scan push down rule

2021-12-02 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-37518: --- Description: Currently, Spark supports pushing down filters, aggregates and limit. All the job is

[jira] [Created] (SPARK-37518) inject a early scan push down rule

2021-12-02 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-37518: -- Summary: inject a early scan push down rule Key: SPARK-37518 URL: https://issues.apache.org/jira/browse/SPARK-37518 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-37517) Keep consistent order of columns with user specify for v1 table

2021-12-02 Thread PengLei (Jira)
PengLei created SPARK-37517: --- Summary: Keep consistent order of columns with user specify for v1 table Key: SPARK-37517 URL: https://issues.apache.org/jira/browse/SPARK-37517 Project: Spark Issue

[jira] [Updated] (SPARK-37286) Move compileAggregates from JDBCRDD to JdbcDialect

2021-12-02 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-37286: --- Description: Currently, the method compileAggregates is in JDBCRDD. But that is not reasonable, because

[jira] [Updated] (SPARK-37286) Move compileAggregates from JDBCRDD to JdbcDialect

2021-12-02 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-37286: --- Summary: Move compileAggregates from JDBCRDD to JdbcDialect (was: Move compileFilter and