[jira] [Comment Edited] (SPARK-21989) createDataset and the schema of encoder class

2017-09-12 Thread Jen-Ming Chung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164152#comment-16164152 ] Jen-Ming Chung edited comment on SPARK-21989 at 9/13/17 5:27 AM: - Hi

[jira] [Comment Edited] (SPARK-21989) createDataset and the schema of encoder class

2017-09-12 Thread Jen-Ming Chung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164152#comment-16164152 ] Jen-Ming Chung edited comment on SPARK-21989 at 9/13/17 4:58 AM: - Hi

[jira] [Comment Edited] (SPARK-21989) createDataset and the schema of encoder class

2017-09-12 Thread Jen-Ming Chung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164152#comment-16164152 ] Jen-Ming Chung edited comment on SPARK-21989 at 9/13/17 4:56 AM: - Hi

[jira] [Commented] (SPARK-21989) createDataset and the schema of encoder class

2017-09-12 Thread Jen-Ming Chung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164152#comment-16164152 ] Jen-Ming Chung commented on SPARK-21989: Hi [~client.test], I write the above code in scala and

[jira] [Comment Edited] (SPARK-21989) createDataset and the schema of encoder class

2017-09-12 Thread Jen-Ming Chung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164152#comment-16164152 ] Jen-Ming Chung edited comment on SPARK-21989 at 9/13/17 4:55 AM: - Hi

[jira] [Commented] (SPARK-21027) Parallel One vs. Rest Classifier

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164120#comment-16164120 ] Apache Spark commented on SPARK-21027: -- User 'WeichenXu123' has created a pull request for this

[jira] [Created] (SPARK-21989) createDataset and the schema of encoder class

2017-09-12 Thread taiho choi (JIRA)
taiho choi created SPARK-21989: -- Summary: createDataset and the schema of encoder class Key: SPARK-21989 URL: https://issues.apache.org/jira/browse/SPARK-21989 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-17642) Support DESC FORMATTED TABLE COLUMN command to show column-level statistics

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164017#comment-16164017 ] Apache Spark commented on SPARK-17642: -- User 'wzhfy' has created a pull request for this issue:

[jira] [Commented] (SPARK-21513) SQL to_json should support all column types

2017-09-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164007#comment-16164007 ] Hyukjin Kwon commented on SPARK-21513: -- Thanks a lot too :). > SQL to_json should support all

[jira] [Commented] (SPARK-21513) SQL to_json should support all column types

2017-09-12 Thread Jia-Xuan Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164005#comment-16164005 ] Jia-Xuan Liu commented on SPARK-21513: -- [~jerryshao] Yes, thanks a lot. :) > SQL to_json should

[jira] [Commented] (SPARK-21513) SQL to_json should support all column types

2017-09-12 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164000#comment-16164000 ] Saisai Shao commented on SPARK-21513: - [~goldmedal] is it your correct JIRA name? [~hyukjin.kwon] I

[jira] [Assigned] (SPARK-21513) SQL to_json should support all column types

2017-09-12 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao reassigned SPARK-21513: --- Assignee: Jia-Xuan Liu > SQL to_json should support all column types >

[jira] [Commented] (SPARK-21513) SQL to_json should support all column types

2017-09-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163982#comment-16163982 ] Hyukjin Kwon commented on SPARK-21513: -- Hi [~jerryshao], would you mind if I ask you to set the contributor

[jira] [Commented] (SPARK-21866) SPIP: Image support in Spark

2017-09-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163976#comment-16163976 ] Joseph K. Bradley commented on SPARK-21866: --- 1. For the namespace, here are my thoughts: I

[jira] [Resolved] (SPARK-21513) SQL to_json should support all column types

2017-09-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21513. -- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18875

[jira] [Commented] (SPARK-21986) QuantileDiscretizer picks wrong split point for data with lots of 0's

2017-09-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163778#comment-16163778 ] Sean Owen commented on SPARK-21986: --- Yeah, I shouldn't say it's specific to a small data set size. The

[jira] [Commented] (SPARK-21986) QuantileDiscretizer picks wrong split point for data with lots of 0's

2017-09-12 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163768#comment-16163768 ] Barry Becker commented on SPARK-21986: -- But wait, the dataset I discovered the problem with was not

[jira] [Commented] (SPARK-15705) Spark won't read ORC schema from metastore for partitioned tables

2017-09-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163747#comment-16163747 ] Dongjoon Hyun commented on SPARK-15705: --- Hi, All. I'm tracking this bug. This seems to be fixed

[jira] [Commented] (SPARK-21988) Add default stats to StreamingExecutionRelation

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163714#comment-16163714 ] Apache Spark commented on SPARK-21988: -- User 'joseph-torres' has created a pull request for this

[jira] [Assigned] (SPARK-21988) Add default stats to StreamingExecutionRelation

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21988: Assignee: Apache Spark > Add default stats to StreamingExecutionRelation >

[jira] [Assigned] (SPARK-21988) Add default stats to StreamingExecutionRelation

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21988: Assignee: (was: Apache Spark) > Add default stats to StreamingExecutionRelation >

[jira] [Commented] (SPARK-21867) Support async spilling in UnsafeShuffleWriter

2017-09-12 Thread Sital Kedia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163705#comment-16163705 ] Sital Kedia commented on SPARK-21867: - [~rxin] - You are right, it is very tricky to get it right.

[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163695#comment-16163695 ] Apache Spark commented on SPARK-18838: -- User 'vanzin' has created a pull request for this issue:

[jira] [Created] (SPARK-21988) Add default stats to StreamingExecutionRelation

2017-09-12 Thread Jose Torres (JIRA)
Jose Torres created SPARK-21988: --- Summary: Add default stats to StreamingExecutionRelation Key: SPARK-21988 URL: https://issues.apache.org/jira/browse/SPARK-21988 Project: Spark Issue Type:

[jira] [Updated] (SPARK-21986) QuantileDiscretizer picks wrong split point for data with lots of 0's

2017-09-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21986: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) It's an approximate algorithm,

[jira] [Commented] (SPARK-21987) Spark 2.3 cannot read 2.2 event logs

2017-09-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163622#comment-16163622 ] Xiao Li commented on SPARK-21987: - Thanks for reporting this! We need to ensure Spark 2.3 still can

[jira] [Resolved] (SPARK-17701) Refactor DataSourceScanExec so its sameResult call does not compare strings

2017-09-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-17701. - Resolution: Fixed Assignee: Wenchen Fan Fix Version/s: 2.3.0 > Refactor

[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications

2017-09-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163585#comment-16163585 ] Marcelo Vanzin commented on SPARK-18085: I filed SPARK-21987 for the event log issue (lest we

[jira] [Created] (SPARK-21987) Spark 2.3 cannot read 2.2 event logs

2017-09-12 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-21987: -- Summary: Spark 2.3 cannot read 2.2 event logs Key: SPARK-21987 URL: https://issues.apache.org/jira/browse/SPARK-21987 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-18128) Add support for publishing to PyPI

2017-09-12 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-18128. - Resolution: Fixed Assignee: holdenk Fix Version/s: 2.2.0 We got the package name

[jira] [Resolved] (SPARK-21979) Improve QueryPlanConstraints framework

2017-09-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-21979. - Resolution: Fixed Assignee: Gengliang Wang Fix Version/s: 2.3.0 > Improve

[jira] [Resolved] (SPARK-18267) Distribute PySpark via Python Package Index (pypi)

2017-09-12 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-18267. - Resolution: Fixed Assignee: holdenk Fix Version/s: 2.2.0 > Distribute PySpark via Python

[jira] [Created] (SPARK-21986) QuantileDiscretizer picks wrong split point for data with lots of 0's

2017-09-12 Thread Barry Becker (JIRA)
Barry Becker created SPARK-21986: Summary: QuantileDiscretizer picks wrong split point for data with lots of 0's Key: SPARK-21986 URL: https://issues.apache.org/jira/browse/SPARK-21986 Project: Spark

[jira] [Resolved] (SPARK-18608) Spark ML algorithms that check RDD cache level for internal caching double-cache data

2017-09-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-18608. --- Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 Issue

[jira] [Updated] (SPARK-18608) Spark ML algorithms that check RDD cache level for internal caching double-cache data

2017-09-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18608: -- Target Version/s: 2.2.1, 2.3.0 > Spark ML algorithms that check RDD cache level for

[jira] [Assigned] (SPARK-18608) Spark ML algorithms that check RDD cache level for internal caching double-cache data

2017-09-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-18608: - Assignee: zhengruifeng > Spark ML algorithms that check RDD cache level for

[jira] [Assigned] (SPARK-21027) Parallel One vs. Rest Classifier

2017-09-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-21027: - Assignee: Ajay Saini > Parallel One vs. Rest Classifier >

[jira] [Resolved] (SPARK-21027) Parallel One vs. Rest Classifier

2017-09-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-21027. --- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 19110

[jira] [Updated] (SPARK-21985) PySpark PairDeserializer is broken for double-zipped RDDs

2017-09-12 Thread Stuart (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stuart updated SPARK-21985: --- Description: PySpark fails to deserialize double-zipped RDDs. For example, the following example used to

[jira] [Updated] (SPARK-21985) PySpark PairDeserializer is broken for double-zipped RDDs

2017-09-12 Thread Stuart (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stuart updated SPARK-21985: --- Description: PySpark fails to deserialize double-zipped RDDs. For example, the following example used to

[jira] [Updated] (SPARK-21985) PySpark PairDeserializer is broken for double-zipped RDDs

2017-09-12 Thread Stuart (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stuart updated SPARK-21985: --- Description: PySpark fails to deserialize double-zipped RDDs. For example, the following example used to

[jira] [Updated] (SPARK-21985) PySpark PairDeserializer is broken for double-zipped RDDs

2017-09-12 Thread Stuart (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stuart updated SPARK-21985: --- Description: PySpark fails to deserialize double-zipped RDDs. For example, the following example used to

[jira] [Updated] (SPARK-21985) PySpark PairDeserializer is broken for double-zipped RDDs

2017-09-12 Thread Stuart (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stuart updated SPARK-21985: --- Description: PySpark fails to deserialize double-zipped RDDs. For example, the following example used to

[jira] [Updated] (SPARK-21985) PySpark PairDeserializer is broken for double-zipped RDDs

2017-09-12 Thread Stuart (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stuart updated SPARK-21985: --- Description: PySpark fails to deserialize double-zipped RDDs. For example, the following example used to

[jira] [Updated] (SPARK-21985) PySpark PairDeserializer is broken for double-zipped RDDs

2017-09-12 Thread Stuart (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stuart updated SPARK-21985: --- Description: PySpark fails to deserialize double-zipped RDDs. For example, the following example used to

[jira] [Updated] (SPARK-21985) PySpark PairDeserializer is broken for double-zipped RDDs

2017-09-12 Thread Stuart (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stuart updated SPARK-21985: --- Description: PySpark fails to deserialize double-zipped RDDs. For example, the following example used to

[jira] [Created] (SPARK-21985) PySpark PairDeserializer is broken for double-zipped RDDs

2017-09-12 Thread Stuart (JIRA)
Stuart created SPARK-21985: -- Summary: PySpark PairDeserializer is broken for double-zipped RDDs Key: SPARK-21985 URL: https://issues.apache.org/jira/browse/SPARK-21985 Project: Spark Issue Type:

[jira] [Updated] (SPARK-21985) PySpark PairDeserializer is broken for double-zipped RDDs

2017-09-12 Thread Stuart (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stuart updated SPARK-21985: --- Description: PySpark fails to deserialize double-zipped RDDs. For example, the following example used to

[jira] [Resolved] (SPARK-17642) Support DESC FORMATTED TABLE COLUMN command to show column-level statistics

2017-09-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-17642. - Resolution: Fixed Assignee: Zhenhua Wang Fix Version/s: 2.3.0 > Support DESC FORMATTED

[jira] [Commented] (SPARK-21087) CrossValidator, TrainValidationSplit should preserve all models after fitting: Scala

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163136#comment-16163136 ] Apache Spark commented on SPARK-21087: -- User 'WeichenXu123' has created a pull request for this

[jira] [Commented] (SPARK-19634) Feature parity for descriptive statistics in MLlib

2017-09-12 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163106#comment-16163106 ] Seth Hendrickson commented on SPARK-19634: -- Is there a plan for moving the linear algorithms

[jira] [Commented] (SPARK-21809) Change Stage Page to use datatables to support sorting columns and searching

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163044#comment-16163044 ] Apache Spark commented on SPARK-21809: -- User 'pgandhi999' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21809) Change Stage Page to use datatables to support sorting columns and searching

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21809: Assignee: (was: Apache Spark) > Change Stage Page to use datatables to support

[jira] [Assigned] (SPARK-21809) Change Stage Page to use datatables to support sorting columns and searching

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21809: Assignee: Apache Spark > Change Stage Page to use datatables to support sorting columns

[jira] [Assigned] (SPARK-21610) Corrupt records are not handled properly when creating a dataframe from a file

2017-09-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-21610: Assignee: Jen-Ming Chung > Corrupt records are not handled properly when creating a

[jira] [Resolved] (SPARK-21610) Corrupt records are not handled properly when creating a dataframe from a file

2017-09-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21610. -- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 19199

[jira] [Created] (SPARK-21984) Use histogram stats in join estimation

2017-09-12 Thread Zhenhua Wang (JIRA)
Zhenhua Wang created SPARK-21984: Summary: Use histogram stats in join estimation Key: SPARK-21984 URL: https://issues.apache.org/jira/browse/SPARK-21984 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-17997) Aggregation function for counting distinct values for multiple intervals

2017-09-12 Thread Zhenhua Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhenhua Wang updated SPARK-17997: - Issue Type: Sub-task (was: New Feature) Parent: SPARK-21975 > Aggregation function for

[jira] [Updated] (SPARK-17997) Aggregation function for counting distinct values for multiple intervals

2017-09-12 Thread Zhenhua Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhenhua Wang updated SPARK-17997: - Affects Version/s: (was: 2.1.0) 2.3.0 > Aggregation function for

[jira] [Updated] (SPARK-17997) Aggregation function for counting distinct values for multiple intervals

2017-09-12 Thread Zhenhua Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhenhua Wang updated SPARK-17997: - Issue Type: New Feature (was: Sub-task) Parent: (was: SPARK-16026) > Aggregation

[jira] [Created] (SPARK-21983) Fix ANTLR 4.7 deprecations

2017-09-12 Thread Herman van Hovell (JIRA)
Herman van Hovell created SPARK-21983: - Summary: Fix ANTLR 4.7 deprecations Key: SPARK-21983 URL: https://issues.apache.org/jira/browse/SPARK-21983 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-21982) Set Locale to US in order to pass UtilsSuite when your jvm Locale is not US

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21982: Assignee: (was: Apache Spark) > Set Locale to US in order to pass UtilsSuite when

[jira] [Assigned] (SPARK-21982) Set Locale to US in order to pass UtilsSuite when your jvm Locale is not US

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21982: Assignee: Apache Spark > Set Locale to US in order to pass UtilsSuite when your jvm

[jira] [Commented] (SPARK-21982) Set Locale to US in order to pass UtilsSuite when your jvm Locale is not US

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162840#comment-16162840 ] Apache Spark commented on SPARK-21982: -- User 'Gschiavon' has created a pull request for this issue:

[jira] [Updated] (SPARK-21982) Set Locale to US in order to pass UtilsSuite when your jvm Locale is not US

2017-09-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21982: -- Target Version/s: (was: 2.2.0) Labels: (was: test) It'd be great to fix up any call

[jira] [Created] (SPARK-21982) Set Locale to US in order to pass UtilsSuite when your jvm Locale is not US

2017-09-12 Thread German Schiavon Matteo (JIRA)
German Schiavon Matteo created SPARK-21982: -- Summary: Set Locale to US in order to pass UtilsSuite when your jvm Locale is not US Key: SPARK-21982 URL: https://issues.apache.org/jira/browse/SPARK-21982

[jira] [Assigned] (SPARK-21981) Python API for ClusteringEvaluator

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21981: Assignee: Apache Spark > Python API for ClusteringEvaluator >

[jira] [Assigned] (SPARK-21981) Python API for ClusteringEvaluator

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21981: Assignee: (was: Apache Spark) > Python API for ClusteringEvaluator >

[jira] [Commented] (SPARK-21981) Python API for ClusteringEvaluator

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162814#comment-16162814 ] Apache Spark commented on SPARK-21981: -- User 'mgaido91' has created a pull request for this issue:

[jira] [Resolved] (SPARK-21942) DiskBlockManager crashing when a root local folder has been externally deleted by OS

2017-09-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21942. --- Resolution: Won't Fix It's a good useful discussion here. I think "won't fix" is the correct outcome

[jira] [Commented] (SPARK-21981) Python API for ClusteringEvaluator

2017-09-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162764#comment-16162764 ] Marco Gaido commented on SPARK-21981: - [~yanboliang] yes, thanks. I will post a PR asap, thank you.

[jira] [Commented] (SPARK-21981) Python API for ClusteringEvaluator

2017-09-12 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162759#comment-16162759 ] Yanbo Liang commented on SPARK-21981: - [~mgaido] Would you like to work on this? > Python API for

[jira] [Created] (SPARK-21981) Python API for ClusteringEvaluator

2017-09-12 Thread Yanbo Liang (JIRA)
Yanbo Liang created SPARK-21981: --- Summary: Python API for ClusteringEvaluator Key: SPARK-21981 URL: https://issues.apache.org/jira/browse/SPARK-21981 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-14516) Clustering evaluator

2017-09-12 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-14516. - Resolution: Fixed Fix Version/s: 2.3.0 Target Version/s: 2.3.0 > Clustering

[jira] [Commented] (SPARK-21978) schemaInference option not to convert strings with leading zeros to int/long

2017-09-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162749#comment-16162749 ] Hyukjin Kwon commented on SPARK-21978: -- Not sure. It sounds rather a niche use case. As a

[jira] [Commented] (SPARK-21980) References in grouping functions should be indexed with resolver

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162732#comment-16162732 ] Apache Spark commented on SPARK-21980: -- User 'DonnyZone' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21980) References in grouping functions should be indexed with resolver

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21980: Assignee: (was: Apache Spark) > References in grouping functions should be indexed

[jira] [Assigned] (SPARK-21980) References in grouping functions should be indexed with resolver

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21980: Assignee: Apache Spark > References in grouping functions should be indexed with resolver

[jira] [Assigned] (SPARK-21976) Fix wrong doc about Mean Absolute Error

2017-09-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-21976: - Assignee: Favio Vázquez Priority: Minor (was: Trivial) > Fix wrong doc about Mean Absolute

[jira] [Resolved] (SPARK-21976) Fix wrong doc about Mean Absolute Error

2017-09-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21976. --- Resolution: Fixed Fix Version/s: 2.1.2 2.3.0 2.2.1

[jira] [Updated] (SPARK-21980) References in grouping functions should be indexed with resolver

2017-09-12 Thread Feng Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feng Zhu updated SPARK-21980: - Description: In our spark-2.1 cluster, when users submit queries like {code:sql} select a, grouping(b),

[jira] [Created] (SPARK-21980) References in grouping functions should be indexed with resolver

2017-09-12 Thread Feng Zhu (JIRA)
Feng Zhu created SPARK-21980: Summary: References in grouping functions should be indexed with resolver Key: SPARK-21980 URL: https://issues.apache.org/jira/browse/SPARK-21980 Project: Spark

[jira] [Commented] (SPARK-21979) Improve QueryPlanConstraints framework

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162708#comment-16162708 ] Apache Spark commented on SPARK-21979: -- User 'gengliangwang' has created a pull request for this

[jira] [Assigned] (SPARK-21979) Improve QueryPlanConstraints framework

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21979: Assignee: Apache Spark > Improve QueryPlanConstraints framework >

[jira] [Assigned] (SPARK-21979) Improve QueryPlanConstraints framework

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21979: Assignee: (was: Apache Spark) > Improve QueryPlanConstraints framework >

[jira] [Created] (SPARK-21979) Improve QueryPlanConstraints framework

2017-09-12 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-21979: -- Summary: Improve QueryPlanConstraints framework Key: SPARK-21979 URL: https://issues.apache.org/jira/browse/SPARK-21979 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-09-12 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162646#comment-16162646 ] Takuya Ueshin commented on SPARK-21190: --- [~icexelloss] Thank you for your suggestion. I agree that

[jira] [Commented] (SPARK-21610) Corrupt records are not handled properly when creating a dataframe from a file

2017-09-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162575#comment-16162575 ] Apache Spark commented on SPARK-21610: -- User 'jmchung' has created a pull request for this issue:

[jira] [Commented] (SPARK-21926) Some transformers in spark.ml.feature fail when trying to transform streaming dataframes

2017-09-12 Thread Matthew Slipper (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162568#comment-16162568 ] Matthew Slipper commented on SPARK-21926: - I'm happy to take a stab at number 1 (b), if that is