[jira] [Created] (SPARK-24430) CREATE VIEW with UNION statement: Failed to recognize predicate 'UNION'.

2018-05-30 Thread Volodymyr Glushak (JIRA)
Volodymyr Glushak created SPARK-24430: - Summary: CREATE VIEW with UNION statement: Failed to recognize predicate 'UNION'. Key: SPARK-24430 URL: https://issues.apache.org/jira/browse/SPARK-24430

[jira] [Updated] (SPARK-23754) StopIterator exception in Python UDF results in partial result

2018-05-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-23754: - Fix Version/s: 2.3.1 > StopIterator exception in Python UDF results in partial result >

[jira] [Assigned] (SPARK-23161) Add missing APIs to Python GBTClassifier

2018-05-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-23161: Assignee: Huaxin Gao > Add missing APIs to Python GBTClassifier >

[jira] [Resolved] (SPARK-23161) Add missing APIs to Python GBTClassifier

2018-05-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-23161. -- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21413

[jira] [Assigned] (SPARK-23901) Data Masking Functions

2018-05-30 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin reassigned SPARK-23901: - Assignee: Marco Gaido > Data Masking Functions > -- > >

[jira] [Resolved] (SPARK-23901) Data Masking Functions

2018-05-30 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-23901. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21246

[jira] [Commented] (SPARK-24430) CREATE VIEW with UNION statement: Failed to recognize predicate 'UNION'.

2018-05-30 Thread Volodymyr Glushak (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495374#comment-16495374 ] Volodymyr Glushak commented on SPARK-24430: --- I wonder, if that is legal code, and this request

[jira] [Assigned] (SPARK-24384) spark-submit --py-files with .py files doesn't work in client mode before context initialization

2018-05-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-24384: -- Assignee: Hyukjin Kwon > spark-submit --py-files with .py files doesn't work in

[jira] [Resolved] (SPARK-24384) spark-submit --py-files with .py files doesn't work in client mode before context initialization

2018-05-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-24384. Resolution: Fixed Fix Version/s: 2.3.1 2.4.0 Issue resolved by

[jira] [Resolved] (SPARK-24419) Upgrade SBT to 0.13.17 with Scala 2.10.7

2018-05-30 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai resolved SPARK-24419. - Resolution: Fixed > Upgrade SBT to 0.13.17 with Scala 2.10.7 >

[jira] [Updated] (SPARK-24417) Build and Run Spark on JDK9+

2018-05-30 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai updated SPARK-24417: Description: This is an umbrella JIRA for Apache Spark to support Java 9+ As Java 8 is going away soon,

[jira] [Assigned] (SPARK-24369) A bug when having multiple distinct aggregations

2018-05-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-24369: --- Assignee: Takeshi Yamamuro > A bug when having multiple distinct aggregations >

[jira] [Resolved] (SPARK-24369) A bug when having multiple distinct aggregations

2018-05-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24369. - Resolution: Fixed Fix Version/s: 2.3.1 2.4.0 Issue resolved by pull

[jira] [Commented] (SPARK-24091) Internally used ConfigMap prevents use of user-specified ConfigMaps carrying Spark configs files

2018-05-30 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495619#comment-16495619 ] Erik Erlandson commented on SPARK-24091: If we support user-supplied yaml, that may become a

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data when the analyzed plans are different after re-analyzing the plans

2018-05-30 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495695#comment-16495695 ] Li Jin commented on SPARK-24373: [~smilegator] Thank you for the suggestion. > "df.cache() df.count()"

[jira] [Created] (SPARK-24434) Support user-specified driver and executor pod templates

2018-05-30 Thread Yinan Li (JIRA)
Yinan Li created SPARK-24434: Summary: Support user-specified driver and executor pod templates Key: SPARK-24434 URL: https://issues.apache.org/jira/browse/SPARK-24434 Project: Spark Issue Type:

[jira] [Commented] (SPARK-24436) Add large dataset to examples sub-directory.

2018-05-30 Thread Varun Vishwanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495684#comment-16495684 ] Varun Vishwanathan commented on SPARK-24436: Going to add a parquet file with the data. >

[jira] [Commented] (SPARK-24434) Support user-specified driver and executor pod templates

2018-05-30 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495615#comment-16495615 ] Erik Erlandson commented on SPARK-24434: Is the template-based solution being explicitly favored

[jira] [Updated] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-05-30 Thread gagan taneja (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gagan taneja updated SPARK-24437: - Attachment: Screen Shot 2018-05-30 at 2.05.40 PM.png > Memory leak in UnsafeHashedRelation >

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data when the analyzed plans are different after re-analyzing the plans

2018-05-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495691#comment-16495691 ] Xiao Li commented on SPARK-24373: - [~icexelloss] This is still possible since the query plans are

[jira] [Created] (SPARK-24433) Add Spark R support

2018-05-30 Thread Yinan Li (JIRA)
Yinan Li created SPARK-24433: Summary: Add Spark R support Key: SPARK-24433 URL: https://issues.apache.org/jira/browse/SPARK-24433 Project: Spark Issue Type: New Feature Components:

[jira] [Created] (SPARK-24435) Support user-supplied YAML that can be merged with k8s pod descriptions

2018-05-30 Thread Erik Erlandson (JIRA)
Erik Erlandson created SPARK-24435: -- Summary: Support user-supplied YAML that can be merged with k8s pod descriptions Key: SPARK-24435 URL: https://issues.apache.org/jira/browse/SPARK-24435 Project:

[jira] [Resolved] (SPARK-24435) Support user-supplied YAML that can be merged with k8s pod descriptions

2018-05-30 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Erlandson resolved SPARK-24435. Resolution: Duplicate > Support user-supplied YAML that can be merged with k8s pod

[jira] [Created] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-05-30 Thread gagan taneja (JIRA)
gagan taneja created SPARK-24437: Summary: Memory leak in UnsafeHashedRelation Key: SPARK-24437 URL: https://issues.apache.org/jira/browse/SPARK-24437 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 0.10.0.1 to 1.1.0

2018-05-30 Thread Ted Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495580#comment-16495580 ] Ted Yu commented on SPARK-18057: There are compilation errors in KafkaTestUtils.scala against Kafka

[jira] [Commented] (SPARK-24434) Support user-specified driver and executor pod templates

2018-05-30 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495642#comment-16495642 ] Erik Erlandson commented on SPARK-24434: [~skonto] given the number of ideas that have gotten

[jira] [Created] (SPARK-24436) Add large dataset to examples sub-directory.

2018-05-30 Thread Varun Vishwanathan (JIRA)
Varun Vishwanathan created SPARK-24436: -- Summary: Add large dataset to examples sub-directory. Key: SPARK-24436 URL: https://issues.apache.org/jira/browse/SPARK-24436 Project: Spark

[jira] [Created] (SPARK-24432) Support for dynamic resource allocation

2018-05-30 Thread Yinan Li (JIRA)
Yinan Li created SPARK-24432: Summary: Support for dynamic resource allocation Key: SPARK-24432 URL: https://issues.apache.org/jira/browse/SPARK-24432 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-24432) Add support for dynamic resource allocation

2018-05-30 Thread Yinan Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yinan Li updated SPARK-24432: - Summary: Add support for dynamic resource allocation (was: Support for dynamic resource allocation) >

[jira] [Commented] (SPARK-24434) Support user-specified driver and executor pod templates

2018-05-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495601#comment-16495601 ] Stavros Kontopoulos commented on SPARK-24434: - [~liyinan926] I will work on a design

[jira] [Comment Edited] (SPARK-24434) Support user-specified driver and executor pod templates

2018-05-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495601#comment-16495601 ] Stavros Kontopoulos edited comment on SPARK-24434 at 5/30/18 7:55 PM:

[jira] [Commented] (SPARK-24434) Support user-specified driver and executor pod templates

2018-05-30 Thread Yinan Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495637#comment-16495637 ] Yinan Li commented on SPARK-24434: -- [~eje] That's a good question. I think we need to compare both and

[jira] [Updated] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-05-30 Thread gagan taneja (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gagan taneja updated SPARK-24437: - Attachment: Screen Shot 2018-05-30 at 2.07.22 PM.png > Memory leak in UnsafeHashedRelation >

[jira] [Updated] (SPARK-24316) Spark sql queries stall for column width more than 6k for parquet based table

2018-05-30 Thread Bimalendu Choudhary (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bimalendu Choudhary updated SPARK-24316: Summary: Spark sql queries stall for column width more than 6k for parquet based

[jira] [Commented] (SPARK-24427) Spark 2.2 - Exception occurred while saving table in spark. Multiple sources found for parquet

2018-05-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495982#comment-16495982 ] Hyukjin Kwon commented on SPARK-24427: -- Doesn't it sound like you specified multiple versions of Spark

[jira] [Commented] (SPARK-24436) Add large dataset to examples sub-directory.

2018-05-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495984#comment-16495984 ] Hyukjin Kwon commented on SPARK-24436: -- I don't think we should add a large dataset in the code

[jira] [Resolved] (SPARK-24436) Add large dataset to examples sub-directory.

2018-05-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24436. -- Resolution: Invalid > Add large dataset to examples sub-directory. >

[jira] [Created] (SPARK-24439) Add distanceMeasure to BisectingKMeans in PySpark

2018-05-30 Thread Huaxin Gao (JIRA)
Huaxin Gao created SPARK-24439: -- Summary: Add distanceMeasure to BisectingKMeans in PySpark Key: SPARK-24439 URL: https://issues.apache.org/jira/browse/SPARK-24439 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-24434) Support user-specified driver and executor pod templates

2018-05-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495779#comment-16495779 ] Stavros Kontopoulos edited comment on SPARK-24434 at 5/30/18 10:12 PM:

[jira] [Commented] (SPARK-24434) Support user-specified driver and executor pod templates

2018-05-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495779#comment-16495779 ] Stavros Kontopoulos commented on SPARK-24434: - [~eje] I agree, will give it a shot and try

[jira] [Commented] (SPARK-24439) Add distanceMeasure to BisectingKMeans in PySpark

2018-05-30 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495998#comment-16495998 ] Huaxin Gao commented on SPARK-24439: I will work on this. > Add distanceMeasure to BisectingKMeans

[jira] [Resolved] (SPARK-24276) semanticHash() returns different values for semantically the same IS IN

2018-05-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24276. - Resolution: Fixed Assignee: Maxim Gekk Fix Version/s: 2.4.0 > semanticHash() returns

[jira] [Assigned] (SPARK-24276) semanticHash() returns different values for semantically the same IS IN

2018-05-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-24276: --- Assignee: Marco Gaido (was: Maxim Gekk) > semanticHash() returns different values for

[jira] [Commented] (SPARK-23649) CSV schema inferring fails on some UTF-8 chars

2018-05-30 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495843#comment-16495843 ] Shixiong Zhu commented on SPARK-23649: -- [~cloud_fan] looks like this is fixed? > CSV schema

[jira] [Assigned] (SPARK-24333) Add fit with validation set to spark.ml GBT: Python API

2018-05-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24333: Assignee: Apache Spark > Add fit with validation set to spark.ml GBT: Python API >

[jira] [Assigned] (SPARK-24333) Add fit with validation set to spark.ml GBT: Python API

2018-05-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24333: Assignee: (was: Apache Spark) > Add fit with validation set to spark.ml GBT: Python

[jira] [Commented] (SPARK-24333) Add fit with validation set to spark.ml GBT: Python API

2018-05-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495904#comment-16495904 ] Apache Spark commented on SPARK-24333: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Created] (SPARK-24438) Empty strings and null strings are written to the same partition

2018-05-30 Thread Mukul Murthy (JIRA)
Mukul Murthy created SPARK-24438: Summary: Empty strings and null strings are written to the same partition Key: SPARK-24438 URL: https://issues.apache.org/jira/browse/SPARK-24438 Project: Spark

[jira] [Commented] (SPARK-16367) Wheelhouse Support for PySpark

2018-05-30 Thread Cyril Scetbon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495979#comment-16495979 ] Cyril Scetbon commented on SPARK-16367: --- Is there a better way to do it today? I see that ticket

[jira] [Created] (SPARK-24431) wrong areaUnderPR calculation in BinaryClassificationEvaluator

2018-05-30 Thread Xinyong Tian (JIRA)
Xinyong Tian created SPARK-24431: Summary: wrong areaUnderPR calculation in BinaryClassificationEvaluator Key: SPARK-24431 URL: https://issues.apache.org/jira/browse/SPARK-24431 Project: Spark

[jira] [Assigned] (SPARK-24337) Improve the error message for invalid SQL conf value

2018-05-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-24337: --- Assignee: (was: Xiao Li) > Improve the error message for invalid SQL conf value >

[jira] [Commented] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2018-05-30 Thread Teddy Choi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496131#comment-16496131 ] Teddy Choi commented on SPARK-21187: Hello [~bryanc], I'm working on Hive-Spark connector with

[jira] [Commented] (SPARK-24410) Missing optimization for Union on bucketed tables

2018-05-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496132#comment-16496132 ] Liang-Chi Hsieh commented on SPARK-24410: - We can verify the partition of union dataframe:

[jira] [Comment Edited] (SPARK-24410) Missing optimization for Union on bucketed tables

2018-05-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496132#comment-16496132 ] Liang-Chi Hsieh edited comment on SPARK-24410 at 5/31/18 5:39 AM: -- We

[jira] [Resolved] (SPARK-23649) CSV schema inferring fails on some UTF-8 chars

2018-05-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23649. - Resolution: Fixed Fix Version/s: 2.4.0 2.3.1 2.2.2

[jira] [Resolved] (SPARK-24337) Improve the error message for invalid SQL conf value

2018-05-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24337. - Resolution: Fixed Fix Version/s: 2.4.0 > Improve the error message for invalid SQL conf value >

[jira] [Commented] (SPARK-24424) Support ANSI-SQL compliant syntax for ROLLUP, CUBE and GROUPING SET

2018-05-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494754#comment-16494754 ] Xiao Li commented on SPARK-24424: - Also cc [~dkbiswal] > Support ANSI-SQL compliant syntax for ROLLUP,

[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources

2018-05-30 Thread Dilip Biswal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494753#comment-16494753 ] Dilip Biswal commented on SPARK-24423: -- [~smilegator] Thanks Sean for pinging me. I would like to

[jira] [Updated] (SPARK-24424) Support ANSI-SQL compliant syntax for ROLLUP, CUBE and GROUPING SET

2018-05-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24424: Description: Currently, our Group By clause follows Hive

[jira] [Commented] (SPARK-24395) Fix Behavior of NOT IN with Literals Containing NULL

2018-05-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494769#comment-16494769 ] Xiao Li commented on SPARK-24395: - I think Oracle returns a different answer. We should fix them. >

[jira] [Updated] (SPARK-24423) Add a new option `query` for JDBC sources

2018-05-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24423: Description: Currently, our JDBC connector provides the option `dbtable` for users to specify the

[jira] [Commented] (SPARK-24424) Support ANSI-SQL compliant syntax for ROLLUP, CUBE and GROUPING SET

2018-05-30 Thread Dilip Biswal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494771#comment-16494771 ] Dilip Biswal commented on SPARK-24424: -- [~smilegator] Thank you. I would like to give it a try. >

[jira] [Commented] (SPARK-23442) Reading from partitioned and bucketed table uses only bucketSpec.numBuckets partitions in all cases

2018-05-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494806#comment-16494806 ] Apache Spark commented on SPARK-23442: -- User 'wangyum' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23442) Reading from partitioned and bucketed table uses only bucketSpec.numBuckets partitions in all cases

2018-05-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23442: Assignee: Apache Spark > Reading from partitioned and bucketed table uses only

[jira] [Assigned] (SPARK-23442) Reading from partitioned and bucketed table uses only bucketSpec.numBuckets partitions in all cases

2018-05-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23442: Assignee: (was: Apache Spark) > Reading from partitioned and bucketed table uses

[jira] [Assigned] (SPARK-24331) Add arrays_overlap / array_repeat / map_entries

2018-05-30 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung reassigned SPARK-24331: Assignee: Marek Novotny > Add arrays_overlap / array_repeat / map_entries >

[jira] [Resolved] (SPARK-24331) Add arrays_overlap / array_repeat / map_entries

2018-05-30 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-24331. -- Resolution: Fixed Fix Version/s: 2.4.0 Target Version/s: 2.4.0 > Add

[jira] [Updated] (SPARK-24424) Support ANSI-SQL compliant syntax for ROLLUP, CUBE and GROUPING SET

2018-05-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24424: Description: Currently, our Group By clause follows Hive

[jira] [Commented] (SPARK-18165) Kinesis support in Structured Streaming

2018-05-30 Thread Vikram Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494732#comment-16494732 ] Vikram Agrawal commented on SPARK-18165: [~mail2sivan...@gmail.com] - This library has been

[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources

2018-05-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494739#comment-16494739 ] Xiao Li commented on SPARK-24423: - cc [~dkbiswal] Is your team interested in this task? > Add a new

[jira] [Created] (SPARK-24423) Add a new option `query` for JDBC sources

2018-05-30 Thread Xiao Li (JIRA)
Xiao Li created SPARK-24423: --- Summary: Add a new option `query` for JDBC sources Key: SPARK-24423 URL: https://issues.apache.org/jira/browse/SPARK-24423 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-24424) Support ANSI-SQL compliant syntax for ROLLUP, CUBE and GROUPING SET

2018-05-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24424: Description: Currently, our Group By clause follows Hive

[jira] [Created] (SPARK-24424) Support ANSI-SQL compliant syntax for ROLLUP, CUBE and GROUPING SET

2018-05-30 Thread Xiao Li (JIRA)
Xiao Li created SPARK-24424: --- Summary: Support ANSI-SQL compliant syntax for ROLLUP, CUBE and GROUPING SET Key: SPARK-24424 URL: https://issues.apache.org/jira/browse/SPARK-24424 Project: Spark

[jira] [Commented] (SPARK-20144) spark.read.parquet no long maintains ordering of the data

2018-05-30 Thread Unai Sarasola (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494810#comment-16494810 ] Unai Sarasola commented on SPARK-20144: --- But if you want to have exactly a copy from your data in

[jira] [Resolved] (SPARK-23754) StopIterator exception in Python UDF results in partial result

2018-05-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-23754. -- Resolution: Fixed Fix Version/s: 2.4.0 Fixed in

[jira] [Created] (SPARK-24425) Regression from 1.6 to 2.x - Spark no longer respects input partitions, unnecessary shuffle required

2018-05-30 Thread sam (JIRA)
sam created SPARK-24425: --- Summary: Regression from 1.6 to 2.x - Spark no longer respects input partitions, unnecessary shuffle required Key: SPARK-24425 URL: https://issues.apache.org/jira/browse/SPARK-24425

[jira] [Commented] (SPARK-24425) Regression from 1.6 to 2.x - Spark no longer respects input partitions, unnecessary shuffle required

2018-05-30 Thread Unai Sarasola (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494864#comment-16494864 ] Unai Sarasola commented on SPARK-24425: --- Totally agree with you Sam. It's up to the developer to

[jira] [Assigned] (SPARK-23754) StopIterator exception in Python UDF results in partial result

2018-05-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-23754: Assignee: Emilio Dorigatti > StopIterator exception in Python UDF results in partial

[jira] [Commented] (SPARK-20144) spark.read.parquet no long maintains ordering of the data

2018-05-30 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494852#comment-16494852 ] sam commented on SPARK-20144: - Regarding the original issue of sorting, I agree with [~srowen] in that it

[jira] [Commented] (SPARK-23904) Big execution plan cause OOM

2018-05-30 Thread Ruben Berenguel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494832#comment-16494832 ] Ruben Berenguel commented on SPARK-23904: - [~igreenfi] that's what I mean, removing the code

[jira] [Commented] (SPARK-24410) Missing optimization for Union on bucketed tables

2018-05-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494885#comment-16494885 ] Liang-Chi Hsieh commented on SPARK-24410: - I've done some experiments locally. But the results

[jira] [Comment Edited] (SPARK-24410) Missing optimization for Union on bucketed tables

2018-05-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494885#comment-16494885 ] Liang-Chi Hsieh edited comment on SPARK-24410 at 5/30/18 8:41 AM: -- I've

[jira] [Commented] (SPARK-23754) StopIterator exception in Python UDF results in partial result

2018-05-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494975#comment-16494975 ] Apache Spark commented on SPARK-23754: -- User 'e-dorigatti' has created a pull request for this

[jira] [Created] (SPARK-24427) Spark 2.2 - Exception occurred while saving table in spark. Multiple sources found for parquet

2018-05-30 Thread Ashok Rai (JIRA)
Ashok Rai created SPARK-24427: - Summary: Spark 2.2 - Exception occurred while saving table in spark. Multiple sources found for parquet Key: SPARK-24427 URL: https://issues.apache.org/jira/browse/SPARK-24427

[jira] [Updated] (SPARK-24428) Remove unused code and fix any related doc

2018-05-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-24428: Summary: Remove unused code and fix any related doc (was: Remove unused code and

[jira] [Created] (SPARK-24428) Remove unused code and Fix docs

2018-05-30 Thread Stavros Kontopoulos (JIRA)
Stavros Kontopoulos created SPARK-24428: --- Summary: Remove unused code and Fix docs Key: SPARK-24428 URL: https://issues.apache.org/jira/browse/SPARK-24428 Project: Spark Issue Type:

[jira] [Updated] (SPARK-24428) Remove unused code and fix any related doc

2018-05-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-24428: Description: There are some relics of previous refactoring: like:

[jira] [Commented] (SPARK-24395) Fix Behavior of NOT IN with Literals Containing NULL

2018-05-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495068#comment-16495068 ] Marco Gaido commented on SPARK-24395: - The main issue here is that {{(null, null) = (1, 2)}} in

[jira] [Created] (SPARK-24426) Unexpected combination of cache and join on DataFrame

2018-05-30 Thread Krzysztof Skulski (JIRA)
Krzysztof Skulski created SPARK-24426: - Summary: Unexpected combination of cache and join on DataFrame Key: SPARK-24426 URL: https://issues.apache.org/jira/browse/SPARK-24426 Project: Spark

[jira] [Updated] (SPARK-24426) Unexpected combination of cache and join on DataFrame

2018-05-30 Thread Krzysztof Skulski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krzysztof Skulski updated SPARK-24426: -- Description: I have unexpected results, when I cache DataFrame and try to do another

[jira] [Updated] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-05-30 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-24415: -- Description: Running with spark 2.3 on yarn and having task failures and blacklisting, the

[jira] [Updated] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-05-30 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-24415: -- Description: Running with spark 2.3 on yarn and having task failures and blacklisting, the

[jira] [Updated] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-05-30 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-24415: -- Description: Running with spark 2.3 on yarn and having task failures and blacklisting, the

[jira] [Assigned] (SPARK-24428) Remove unused code and fix any related doc

2018-05-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24428: Assignee: Apache Spark > Remove unused code and fix any related doc >

[jira] [Commented] (SPARK-24428) Remove unused code and fix any related doc

2018-05-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495184#comment-16495184 ] Apache Spark commented on SPARK-24428: -- User 'skonto' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24428) Remove unused code and fix any related doc

2018-05-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24428: Assignee: (was: Apache Spark) > Remove unused code and fix any related doc >

[jira] [Updated] (SPARK-24428) Remove unused code and fix any related doc in K8s module

2018-05-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-24428: Description: There are some relics of previous refactoring like:
