[jira] [Resolved] (SPARK-31334) Use agg column in Having clause behave different with column type

2020-05-08 Thread angerszhu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angerszhu resolved SPARK-31334. --- Resolution: Fixed > Use agg column in Having clause behave different with column type >

[jira] [Issue Comment Deleted] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread angerszhu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angerszhu updated SPARK-31663: -- Comment: was deleted (was: cc [~dongjoon] [~XuanYuan] Seems this problem a little like

[jira] [Assigned] (SPARK-31668) Saving and loading HashingTF leads to hash function changed

2020-05-08 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-31668: -- Assignee: Weichen Xu > Saving and loading HashingTF leads to hash function changed >

[jira] [Created] (SPARK-31668) Saving and loading HashingTF leads to hash function changed

2020-05-08 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-31668: -- Summary: Saving and loading HashingTF leads to hash function changed Key: SPARK-31668 URL: https://issues.apache.org/jira/browse/SPARK-31668 Project: Spark

[jira] [Commented] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread angerszhu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103039#comment-17103039 ] angerszhu commented on SPARK-31663: --- cc [~dongjoon] [~XuanYuan] Seems this problem a little like

[jira] [Updated] (SPARK-31610) Expose hashFunc property in HashingTF

2020-05-08 Thread Xiangrui Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-31610: -- Issue Type: Bug (was: Improvement) > Expose hashFunc property in HashingTF >

[jira] [Updated] (SPARK-31610) Expose hashFunc property in HashingTF

2020-05-08 Thread Xiangrui Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-31610: -- Priority: Critical (was: Major) > Expose hashFunc property in HashingTF >

[jira] [Resolved] (SPARK-31611) Register NettyMemoryMetrics into Node Manager's metrics system

2020-05-08 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-31611. --- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28416

[jira] [Assigned] (SPARK-31611) Register NettyMemoryMetrics into Node Manager's metrics system

2020-05-08 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-31611: - Assignee: Manu Zhang > Register NettyMemoryMetrics into Node Manager's metrics system

[jira] [Commented] (SPARK-31667) Python side flatten the result dataframe of ANOVATest/ChisqTest/FValueTest

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102948#comment-17102948 ] Apache Spark commented on SPARK-31667: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-31667) Python side flatten the result dataframe of ANOVATest/ChisqTest/FValueTest

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-31667: Assignee: Apache Spark > Python side flatten the result dataframe of

[jira] [Commented] (SPARK-31667) Python side flatten the result dataframe of ANOVATest/ChisqTest/FValueTest

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102947#comment-17102947 ] Apache Spark commented on SPARK-31667: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-31667) Python side flatten the result dataframe of ANOVATest/ChisqTest/FValueTest

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-31667: Assignee: (was: Apache Spark) > Python side flatten the result dataframe of

[jira] [Commented] (SPARK-31627) Font style of Spark SQL DAG-viz is broken in Chrome on macOS

2020-05-08 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102944#comment-17102944 ] Dongjoon Hyun commented on SPARK-31627: --- Thank you, [~sarutak] and [~hyukjin.kwon]. +1, too. >

[jira] [Created] (SPARK-31667) Python side flatten the result dataframe of ANOVATest/ChisqTest/FValueTest

2020-05-08 Thread Huaxin Gao (Jira)
Huaxin Gao created SPARK-31667: -- Summary: Python side flatten the result dataframe of ANOVATest/ChisqTest/FValueTest Key: SPARK-31667 URL: https://issues.apache.org/jira/browse/SPARK-31667 Project:

[jira] [Updated] (SPARK-31666) Cannot map hostPath volumes to container

2020-05-08 Thread Stephen Hopper (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Hopper updated SPARK-31666: --- Description: I'm trying to mount additional hostPath directories as seen in a couple of

[jira] [Updated] (SPARK-31666) Cannot map hostPath volumes to container

2020-05-08 Thread Stephen Hopper (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Hopper updated SPARK-31666: --- Description: I'm trying to mount additional hostPath directories as seen in a couple of

[jira] [Created] (SPARK-31666) Cannot map hostPath volumes to container

2020-05-08 Thread Stephen Hopper (Jira)
Stephen Hopper created SPARK-31666: -- Summary: Cannot map hostPath volumes to container Key: SPARK-31666 URL: https://issues.apache.org/jira/browse/SPARK-31666 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-30267) avro deserializer: ArrayList cannot be cast to GenericData$Array

2020-05-08 Thread Gengliang Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102919#comment-17102919 ] Gengliang Wang commented on SPARK-30267: Hi [~tashoyan], could you provide a simple reproduce

[jira] [Reopened] (SPARK-30267) avro deserializer: ArrayList cannot be cast to GenericData$Array

2020-05-08 Thread Arseniy Tashoyan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arseniy Tashoyan reopened SPARK-30267: -- With Spark 3.0.0 preview 2, I have the following failure here: {code:java}

[jira] [Commented] (SPARK-20732) Copy cache data when node is being shut down

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-20732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102875#comment-17102875 ] Apache Spark commented on SPARK-20732: -- User 'holdenk' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2020-05-08 Thread Afroz Baig (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102837#comment-17102837 ] Afroz Baig edited comment on SPARK-29037 at 5/8/20, 7:11 PM: -

[jira] [Updated] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-31663: -- Description: Grouping sets with having clause returns the wrong result when the condition of

[jira] [Commented] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102850#comment-17102850 ] Dongjoon Hyun commented on SPARK-31663: --- All 2.x versions are added into the affected versions.

[jira] [Assigned] (SPARK-31665) Test parquet dictionary encoding of random dates/timestamps

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-31665: Assignee: (was: Apache Spark) > Test parquet dictionary encoding of random

[jira] [Updated] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-31663: -- Affects Version/s: 2.0.2 2.1.3 > Grouping sets with having clause

[jira] [Assigned] (SPARK-31665) Test parquet dictionary encoding of random dates/timestamps

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-31665: Assignee: Apache Spark > Test parquet dictionary encoding of random dates/timestamps >

[jira] [Commented] (SPARK-31665) Test parquet dictionary encoding of random dates/timestamps

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102848#comment-17102848 ] Apache Spark commented on SPARK-31665: -- User 'MaxGekk' has created a pull request for this issue:

[jira] [Updated] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-31663: -- Affects Version/s: (was: 2.4.4) (was: 2.4.3)

[jira] [Commented] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102844#comment-17102844 ] Dongjoon Hyun commented on SPARK-31663: --- Apache Spark 2.3.4 follows Hive syntaxes, but the result

[jira] [Commented] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102841#comment-17102841 ] Dongjoon Hyun commented on SPARK-31663: --- Also, with a changed syntax, this is reproduced in older

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2020-05-08 Thread Afroz Baig (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102837#comment-17102837 ] Afroz Baig commented on SPARK-29037: spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2

[jira] [Created] (SPARK-31665) Test parquet dictionary encoding of random dates/timestamps

2020-05-08 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31665: -- Summary: Test parquet dictionary encoding of random dates/timestamps Key: SPARK-31665 URL: https://issues.apache.org/jira/browse/SPARK-31665 Project: Spark

[jira] [Commented] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102829#comment-17102829 ] Dongjoon Hyun commented on SPARK-31663: --- The issue is that `b` is interpreted differently in

[jira] [Commented] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102825#comment-17102825 ] Dongjoon Hyun commented on SPARK-31663: --- I confirmed that this is a correctness issue since 2.4.0.

[jira] [Updated] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-31663: -- Affects Version/s: 2.4.0 2.4.1 2.4.2

[jira] [Updated] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-31663: -- Labels: correctness (was: ) > Grouping sets with having clause returns the wrong result >

[jira] [Resolved] (SPARK-31658) SQL UI doesn't show write commands of AQE plan

2020-05-08 Thread Xiao Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-31658. - Fix Version/s: 3.0.0 Resolution: Fixed > SQL UI doesn't show write commands of AQE plan >

[jira] [Assigned] (SPARK-31658) SQL UI doesn't show write commands of AQE plan

2020-05-08 Thread Xiao Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-31658: --- Assignee: Manu Zhang > SQL UI doesn't show write commands of AQE plan >

[jira] [Assigned] (SPARK-31664) Race in YARN scheduler shutdown leads to uncaught SparkException "Could not find CoarseGrainedScheduler"

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-31664: Assignee: Apache Spark > Race in YARN scheduler shutdown leads to uncaught

[jira] [Assigned] (SPARK-31664) Race in YARN scheduler shutdown leads to uncaught SparkException "Could not find CoarseGrainedScheduler"

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-31664: Assignee: (was: Apache Spark) > Race in YARN scheduler shutdown leads to uncaught

[jira] [Commented] (SPARK-31664) Race in YARN scheduler shutdown leads to uncaught SparkException "Could not find CoarseGrainedScheduler"

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102757#comment-17102757 ] Apache Spark commented on SPARK-31664: -- User 'baohe-zhang' has created a pull request for this

[jira] [Commented] (SPARK-31640) Support SHOW PARTITIONS for DataSource V2 tables

2020-05-08 Thread Burak Yavuz (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102746#comment-17102746 ] Burak Yavuz commented on SPARK-31640: - Hi [~younggyuchun], I'd take a look at how SHOW

[jira] [Commented] (SPARK-31640) Support SHOW PARTITIONS for DataSource V2 tables

2020-05-08 Thread YoungGyu Chun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102738#comment-17102738 ] YoungGyu Chun commented on SPARK-31640: --- Hi [~brkyvz], I am trying to sort out but I have a

[jira] [Updated] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Yuanjian Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanjian Li updated SPARK-31663: Description: Grouping sets with having clause returns the wrong result when the condition of

[jira] [Updated] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Yuanjian Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanjian Li updated SPARK-31663: Description: Grouping sets with having clause returns the wrong result when the condition of

[jira] [Updated] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Yuanjian Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanjian Li updated SPARK-31663: Labels: (was: correct) > Grouping sets with having clause returns the wrong result >

[jira] [Commented] (SPARK-31654) sequence producing inconsistent intervals for month step

2020-05-08 Thread Ramesh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102712#comment-17102712 ] Ramesh commented on SPARK-31654: [~roman_y], [~Ankitraj] It is working as expected ..

[jira] [Created] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Yuanjian Li (Jira)
Yuanjian Li created SPARK-31663: --- Summary: Grouping sets with having clause returns the wrong result Key: SPARK-31663 URL: https://issues.apache.org/jira/browse/SPARK-31663 Project: Spark

[jira] [Updated] (SPARK-31663) Grouping sets with having clause returns the wrong result

2020-05-08 Thread Yuanjian Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanjian Li updated SPARK-31663: Labels: correct (was: ) > Grouping sets with having clause returns the wrong result >

[jira] [Created] (SPARK-31664) Race in YARN scheduler shutdown leads to uncaught SparkException "Could not find CoarseGrainedScheduler"

2020-05-08 Thread Baohe Zhang (Jira)
Baohe Zhang created SPARK-31664: --- Summary: Race in YARN scheduler shutdown leads to uncaught SparkException "Could not find CoarseGrainedScheduler" Key: SPARK-31664 URL:

[jira] [Commented] (SPARK-31470) Introduce SORTED BY clause in CREATE TABLE statement

2020-05-08 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102635#comment-17102635 ] Yuming Wang commented on SPARK-31470: - [~rakson] I'm not sure if it can be 100% accepted by

[jira] [Updated] (SPARK-31662) Reading wrong dates from dictionary encoded columns in Parquet files

2020-05-08 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31662: --- Description: Write dates with dictionary encoding enabled to parquet files: {code:scala} Welcome to

[jira] [Commented] (SPARK-31662) Reading wrong dates from dictionary encoded columns in Parquet files

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102525#comment-17102525 ] Apache Spark commented on SPARK-31662: -- User 'MaxGekk' has created a pull request for this issue:

[jira] [Assigned] (SPARK-31662) Reading wrong dates from dictionary encoded columns in Parquet files

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-31662: Assignee: (was: Apache Spark) > Reading wrong dates from dictionary encoded columns

[jira] [Assigned] (SPARK-31662) Reading wrong dates from dictionary encoded columns in Parquet files

2020-05-08 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-31662: Assignee: Apache Spark > Reading wrong dates from dictionary encoded columns in Parquet

[jira] [Created] (SPARK-31662) Reading wrong dates from dictionary encoded columns in Parquet files

2020-05-08 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31662: -- Summary: Reading wrong dates from dictionary encoded columns in Parquet files Key: SPARK-31662 URL: https://issues.apache.org/jira/browse/SPARK-31662 Project: Spark

[jira] [Commented] (SPARK-31622) Test-jar in the Spark distribution

2020-05-08 Thread Ankit Raj Boudh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102466#comment-17102466 ] Ankit Raj Boudh commented on SPARK-31622: - [~hyukjin.kwon], please confirm this, then I will

[jira] [Commented] (SPARK-24193) Sort by disk when number of limit is big in TakeOrderedAndProjectExec

2020-05-08 Thread Xianjin YE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102458#comment-17102458 ] Xianjin YE commented on SPARK-24193: I used `df.rdd.collect` intentionally to trigger the problem as

[jira] [Commented] (SPARK-24193) Sort by disk when number of limit is big in TakeOrderedAndProjectExec

2020-05-08 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102439#comment-17102439 ] Wenchen Fan commented on SPARK-24193: - I think it's not a problem if you do `df.collect` instead of

[jira] [Commented] (SPARK-31104) Add documentation for all new Json Functions

2020-05-08 Thread Rakesh Raushan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102438#comment-17102438 ] Rakesh Raushan commented on SPARK-31104: [~hyukjin.kwon] We can mark this as resolved as this

[jira] [Commented] (SPARK-31470) Introduce SORTED BY clause in CREATE TABLE statement

2020-05-08 Thread Rakesh Raushan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102437#comment-17102437 ] Rakesh Raushan commented on SPARK-31470: If this is required by community and [~yumwang] has not

[jira] [Commented] (SPARK-31654) sequence producing inconsistent intervals for month step

2020-05-08 Thread Ankit Raj Boudh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102421#comment-17102421 ] Ankit Raj Boudh commented on SPARK-31654: - [~roman_y], I will raise a PR for this. > sequence

[jira] [Commented] (SPARK-31657) CSV Writer writes no header for empty DataFrames

2020-05-08 Thread Ankit Raj Boudh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102420#comment-17102420 ] Ankit Raj Boudh commented on SPARK-31657: - [~fpin], I will raise a PR for this > CSV Writer

[jira] [Commented] (SPARK-31588) merge small files may need more common setting

2020-05-08 Thread philipse (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102414#comment-17102414 ] philipse commented on SPARK-31588: -- Yes, the block size can be controlled in HDFS. I mean we just take

[jira] [Resolved] (SPARK-30385) WebUI occasionally throw IOException on stop()

2020-05-08 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-30385. - Fix Version/s: 3.1.0 Assignee: Kousuke Saruta Resolution: Fixed > WebUI

[jira] [Commented] (SPARK-16951) Alternative implementation of NOT IN to Anti-join

2020-05-08 Thread linna shuang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-16951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102336#comment-17102336 ] linna shuang commented on SPARK-16951: -- In the TPC-H test, we hit a performance issue with Q16, which used

[jira] [Resolved] (SPARK-31656) AFT blockify input vectors

2020-05-08 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31656. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28473

[jira] [Commented] (SPARK-24193) Sort by disk when number of limit is big in TakeOrderedAndProjectExec

2020-05-08 Thread Xianjin YE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102285#comment-17102285 ] Xianjin YE commented on SPARK-24193: Hi, [~jinxing6...@126.com] [~cloud_fan] the fallback config has