[jira] [Commented] (SPARK-19778) alais cannot use in group by

2017-02-28 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889700#comment-15889700 ] Takeshi Yamamuro commented on SPARK-19778: -- {code} scala> Seq(("a", 0), ("b", 1)).toDF("key",

[jira] [Commented] (SPARK-19768) Error for both aggregate and non-aggregate queries in Structured Streaming - "This query does not support recovering from checkpoint location"

2017-02-28 Thread Amit Baghel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889698#comment-15889698 ] Amit Baghel commented on SPARK-19768: - I am using aggregate query with format="parquet" and

[jira] [Commented] (SPARK-5159) Thrift server does not respect hive.server2.enable.doAs=true

2017-02-28 Thread Shridhar Ramachandran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889677#comment-15889677 ] Shridhar Ramachandran commented on SPARK-5159: -- I have faced this issue as well, on both 1.6

[jira] [Comment Edited] (SPARK-5159) Thrift server does not respect hive.server2.enable.doAs=true

2017-02-28 Thread Shridhar Ramachandran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889677#comment-15889677 ] Shridhar Ramachandran edited comment on SPARK-5159 at 3/1/17 7:29 AM: --

[jira] [Commented] (SPARK-19758) Casting string to timestamp in inline table definition fails with AnalysisException

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889654#comment-15889654 ] Apache Spark commented on SPARK-19758: -- User 'viirya' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19758) Casting string to timestamp in inline table definition fails with AnalysisException

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19758: Assignee: (was: Apache Spark) > Casting string to timestamp in inline table

[jira] [Assigned] (SPARK-19758) Casting string to timestamp in inline table definition fails with AnalysisException

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19758: Assignee: Apache Spark > Casting string to timestamp in inline table definition fails

[jira] [Resolved] (SPARK-19633) FileSource read from FileSink

2017-02-28 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-19633. -- Resolution: Fixed Assignee: Liwei Lin Fix Version/s: 2.2.0 > FileSource read

[jira] [Commented] (SPARK-17495) Hive hash implementation

2017-02-28 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889645#comment-15889645 ] Tejas Patil commented on SPARK-17495: - >> Is it possible to figure out the hashing function based on

[jira] [Commented] (SPARK-19768) Error for both aggregate and non-aggregate queries in Structured Streaming - "This query does not support recovering from checkpoint location"

2017-02-28 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889639#comment-15889639 ] Shixiong Zhu commented on SPARK-19768: -- It should work for both aggregate and non-aggregate queries,

[jira] [Commented] (SPARK-19752) OrcGetSplits fails with 0 size files

2017-02-28 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889616#comment-15889616 ] Liang-Chi Hsieh commented on SPARK-19752: - From the log, looks like it is a problem in Hive? >

[jira] [Commented] (SPARK-19764) Executors hang with supposedly running task that are really finished.

2017-02-28 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889611#comment-15889611 ] Shixiong Zhu commented on SPARK-19764: -- These are master and workers. From the master log, you are

[jira] [Resolved] (SPARK-19460) Update dataset used in R documentation, examples to reduce warning noise and confusions

2017-02-28 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-19460. -- Resolution: Fixed Assignee: Miao Wang Fix Version/s: 2.2.0 Target

[jira] [Commented] (SPARK-6951) History server slow startup if the event log directory is large

2017-02-28 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889588#comment-15889588 ] Zheng Shao commented on SPARK-6951: --- Did we consider using a distributed store to solve the scalability

[jira] [Resolved] (SPARK-19572) Allow to disable hive in sparkR shell

2017-02-28 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-19572. -- Resolution: Fixed Assignee: Jeff Zhang Fix Version/s: 2.2.0

[jira] [Commented] (SPARK-19764) Executors hang with supposedly running task that are really finished.

2017-02-28 Thread Ari Gesher (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889566#comment-15889566 ] Ari Gesher commented on SPARK-19764: And here's the stuck Executor: {noformat} Full thread dump

[jira] [Comment Edited] (SPARK-19764) Executors hang with supposedly running task that are really finished.

2017-02-28 Thread Ari Gesher (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889564#comment-15889564 ] Ari Gesher edited comment on SPARK-19764 at 3/1/17 6:05 AM: Nothing like

[jira] [Commented] (SPARK-13669) Job will always fail in the external shuffle service unavailable situation

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889562#comment-15889562 ] Apache Spark commented on SPARK-13669: -- User 'jerryshao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-13669) Job will always fail in the external shuffle service unavailable situation

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13669: Assignee: Apache Spark > Job will always fail in the external shuffle service unavailable

[jira] [Assigned] (SPARK-13669) Job will always fail in the external shuffle service unavailable situation

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13669: Assignee: (was: Apache Spark) > Job will always fail in the external shuffle service

[jira] [Commented] (SPARK-19764) Executors hang with supposedly running task that are really finished.

2017-02-28 Thread Ari Gesher (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889564#comment-15889564 ] Ari Gesher commented on SPARK-19764: Nothing like that. Full logs in the attached tarball. Here's

[jira] [Updated] (SPARK-19764) Executors hang with supposedly running task that are really finished.

2017-02-28 Thread Ari Gesher (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ari Gesher updated SPARK-19764: --- Attachment: SPARK-19764.tgz > Executors hang with supposedly running task that are really finished.

[jira] [Updated] (SPARK-19779) structured streaming exist residual tmp file

2017-02-28 Thread Feng Gui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feng Gui updated SPARK-19779: - Description: The PR (https://github.com/apache/spark/pull/17012) can to fix restart a Structured

[jira] [Created] (SPARK-19779) structured streaming exist residual tmp file

2017-02-28 Thread Feng Gui (JIRA)
Feng Gui created SPARK-19779: Summary: structured streaming exist residual tmp file Key: SPARK-19779 URL: https://issues.apache.org/jira/browse/SPARK-19779 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-19776) Is the JavaKafkaWordCount example correct for Spark version 2.1?

2017-02-28 Thread Russell Abedin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Abedin updated SPARK-19776: --- Description: My question is

[jira] [Updated] (SPARK-19776) Is the JavaKafkaWordCount example correct for Spark version 2.1?

2017-02-28 Thread Russell Abedin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Abedin updated SPARK-19776: --- Summary: Is the JavaKafkaWordCount example correct for Spark version 2.1? (was: Is the

[jira] [Comment Edited] (SPARK-19768) Error for both aggregate and non-aggregate queries in Structured Streaming - "This query does not support recovering from checkpoint location"

2017-02-28 Thread Amit Baghel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889472#comment-15889472 ] Amit Baghel edited comment on SPARK-19768 at 3/1/17 4:14 AM: - Thanks

[jira] [Commented] (SPARK-19768) Error for both aggregate and non-aggregate queries in Structured Streaming - "This query does not support recovering from checkpoint location"

2017-02-28 Thread Amit Baghel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889472#comment-15889472 ] Amit Baghel commented on SPARK-19768: - Thanks [~zsxwing] for clarification. Documentation for

[jira] [Commented] (SPARK-16929) Speculation-related synchronization bottleneck in checkSpeculatableTasks

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889435#comment-15889435 ] Apache Spark commented on SPARK-16929: -- User 'jinxing64' has created a pull request for this issue:

[jira] [Created] (SPARK-19778) alais cannot use in group by

2017-02-28 Thread xukun (JIRA)
xukun created SPARK-19778: - Summary: alais cannot use in group by Key: SPARK-19778 URL: https://issues.apache.org/jira/browse/SPARK-19778 Project: Spark Issue Type: Improvement Components:

[jira] [Commented] (SPARK-17495) Hive hash implementation

2017-02-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889432#comment-15889432 ] Reynold Xin commented on SPARK-17495: - Is it possible to figure out the hashing function based on

[jira] [Commented] (SPARK-19754) Casting to int from a JSON-parsed float rounds instead of truncating

2017-02-28 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889419#comment-15889419 ] Takeshi Yamamuro commented on SPARK-19754: -- cc: [~hyukjin.kwon] what do u think this? > Casting

[jira] [Commented] (SPARK-14698) CREATE FUNCTION cloud not add function to hive metastore

2017-02-28 Thread Shawn Lavelle (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889416#comment-15889416 ] Shawn Lavelle commented on SPARK-14698: --- [~poseidon] Would you be willing (and still able) to

[jira] [Comment Edited] (SPARK-18769) Spark to be smarter about what the upper bound is and to restrict number of executor when dynamic allocation is enabled

2017-02-28 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889412#comment-15889412 ] Marcelo Vanzin edited comment on SPARK-18769 at 3/1/17 3:04 AM: bq. What

[jira] [Commented] (SPARK-18769) Spark to be smarter about what the upper bound is and to restrict number of executor when dynamic allocation is enabled

2017-02-28 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889412#comment-15889412 ] Marcelo Vanzin commented on SPARK-18769: bq. What do you mean by this? I mean that, if I

[jira] [Assigned] (SPARK-19777) Scan runningTasksSet when check speculatable tasks in TaskSetManager.

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19777: Assignee: (was: Apache Spark) > Scan runningTasksSet when check speculatable tasks in

[jira] [Assigned] (SPARK-19777) Scan runningTasksSet when check speculatable tasks in TaskSetManager.

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19777: Assignee: Apache Spark > Scan runningTasksSet when check speculatable tasks in

[jira] [Commented] (SPARK-19777) Scan runningTasksSet when check speculatable tasks in TaskSetManager.

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889401#comment-15889401 ] Apache Spark commented on SPARK-19777: -- User 'jinxing64' has created a pull request for this issue:

[jira] [Created] (SPARK-19777) Scan runningTasksSet when check speculatable tasks in TaskSetManager.

2017-02-28 Thread jin xing (JIRA)
jin xing created SPARK-19777: Summary: Scan runningTasksSet when check speculatable tasks in TaskSetManager. Key: SPARK-19777 URL: https://issues.apache.org/jira/browse/SPARK-19777 Project: Spark

[jira] [Assigned] (SPARK-19635) Feature parity for Chi-square hypothesis testing in MLlib

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19635: Assignee: Apache Spark (was: Joseph K. Bradley) > Feature parity for Chi-square

[jira] [Assigned] (SPARK-19635) Feature parity for Chi-square hypothesis testing in MLlib

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19635: Assignee: Joseph K. Bradley (was: Apache Spark) > Feature parity for Chi-square

[jira] [Commented] (SPARK-19635) Feature parity for Chi-square hypothesis testing in MLlib

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889386#comment-15889386 ] Apache Spark commented on SPARK-19635: -- User 'jkbradley' has created a pull request for this issue:

[jira] [Commented] (SPARK-18769) Spark to be smarter about what the upper bound is and to restrict number of executor when dynamic allocation is enabled

2017-02-28 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889373#comment-15889373 ] Yuming Wang commented on SPARK-18769: - How about this approach:

[jira] [Updated] (SPARK-19764) Executors hang with supposedly running task that are really finished.

2017-02-28 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-19764: Attachment: netty-6153.jpg something like this: !netty-6153.jpg! > Executors hang with supposedly

[jira] [Assigned] (SPARK-19740) Spark executor always runs as root when running on mesos

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19740: Assignee: (was: Apache Spark) > Spark executor always runs as root when running on

[jira] [Assigned] (SPARK-19740) Spark executor always runs as root when running on mesos

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19740: Assignee: Apache Spark > Spark executor always runs as root when running on mesos >

[jira] [Commented] (SPARK-19740) Spark executor always runs as root when running on mesos

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889331#comment-15889331 ] Apache Spark commented on SPARK-19740: -- User 'yanji84' has created a pull request for this issue:

[jira] [Commented] (SPARK-19211) Explicitly prevent Insert into View or Create View As Insert

2017-02-28 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889270#comment-15889270 ] Jiang Xingbo commented on SPARK-19211: -- I've been working on it this week and perhaps I'll submit a

[jira] [Assigned] (SPARK-19636) Feature parity for correlation statistics in MLlib

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19636: Assignee: Apache Spark (was: Tim Hunter) > Feature parity for correlation statistics in

[jira] [Assigned] (SPARK-19636) Feature parity for correlation statistics in MLlib

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19636: Assignee: Tim Hunter (was: Apache Spark) > Feature parity for correlation statistics in

[jira] [Commented] (SPARK-19636) Feature parity for correlation statistics in MLlib

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889220#comment-15889220 ] Apache Spark commented on SPARK-19636: -- User 'thunterdb' has created a pull request for this issue:

[jira] [Updated] (SPARK-19776) Is the JavaKafkaWordCount correct on Spark version 2.1

2017-02-28 Thread Russell Abedin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Abedin updated SPARK-19776: --- Summary: Is the JavaKafkaWordCount correct on Spark version 2.1 (was: JavaKafkaWordCount

[jira] [Updated] (SPARK-19776) JavaKafkaWordCount calls createStream on version 2.1

2017-02-28 Thread Russell Abedin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Abedin updated SPARK-19776: --- Affects Version/s: (was: 1.6.1) (was: 1.5.2)

[jira] [Updated] (SPARK-19776) JavaKafkaWordCount calls createStream on version 2.1

2017-02-28 Thread Russell Abedin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Abedin updated SPARK-19776: --- Description: My question is

[jira] [Comment Edited] (SPARK-19771) Support OR-AND amplification in Locality Sensitive Hashing (LSH)

2017-02-28 Thread Mingjie Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889180#comment-15889180 ] Mingjie Tang edited comment on SPARK-19771 at 3/1/17 12:30 AM: --- If we

[jira] [Assigned] (SPARK-19635) Feature parity for Chi-square hypothesis testing in MLlib

2017-02-28 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-19635: - Assignee: Joseph K. Bradley > Feature parity for Chi-square hypothesis testing

[jira] [Commented] (SPARK-19635) Feature parity for Chi-square hypothesis testing in MLlib

2017-02-28 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889190#comment-15889190 ] Joseph K. Bradley commented on SPARK-19635: --- That PR for trees looks pretty different. This

[jira] [Commented] (SPARK-19634) Feature parity for descriptive statistics in MLlib

2017-02-28 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889181#comment-15889181 ] Joseph K. Bradley commented on SPARK-19634: --- I'll assign this to [~timhunter] given the time

[jira] [Commented] (SPARK-19771) Support OR-AND amplification in Locality Sensitive Hashing (LSH)

2017-02-28 Thread Mingjie Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889180#comment-15889180 ] Mingjie Tang commented on SPARK-19771: -- If we follow the AND-OR framework, one optimization is here:

[jira] [Assigned] (SPARK-19634) Feature parity for descriptive statistics in MLlib

2017-02-28 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-19634: - Assignee: Timothy Hunter > Feature parity for descriptive statistics in MLlib >

[jira] [Updated] (SPARK-19776) JavaKafkaWordCount calls createStream on version 2.1

2017-02-28 Thread Russell Abedin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Abedin updated SPARK-19776: --- Issue Type: Question (was: Bug) > JavaKafkaWordCount calls createStream on version 2.1 >

[jira] [Created] (SPARK-19776) JavaKafkaWordCount calls createStream on version 2.1

2017-02-28 Thread Russell Abedin (JIRA)
Russell Abedin created SPARK-19776: -- Summary: JavaKafkaWordCount calls createStream on version 2.1 Key: SPARK-19776 URL: https://issues.apache.org/jira/browse/SPARK-19776 Project: Spark

[jira] [Updated] (SPARK-19382) Test sparse vectors in LinearSVCSuite

2017-02-28 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19382: -- Shepherd: Joseph K. Bradley > Test sparse vectors in LinearSVCSuite >

[jira] [Resolved] (SPARK-14503) spark.ml Scala API for FPGrowth

2017-02-28 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-14503. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 15415

[jira] [Commented] (SPARK-18579) spark-csv strips whitespace (pyspark)

2017-02-28 Thread Adrian Bridgett (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889120#comment-15889120 ] Adrian Bridgett commented on SPARK-18579: - Yep, that's right. Sorry - not sure why I didn't reply

[jira] [Commented] (SPARK-18579) spark-csv strips whitespace (pyspark)

2017-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889117#comment-15889117 ] Hyukjin Kwon commented on SPARK-18579: -- Oh, I overlooked. You meant it always strips the white

[jira] [Commented] (SPARK-19713) saveAsTable

2017-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889114#comment-15889114 ] Hyukjin Kwon commented on SPARK-19713: -- Hi [~balaramraju] Could you update the title? > saveAsTable

[jira] [Resolved] (SPARK-19729) Strange behaviour with reading csv with schema into dataframe

2017-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-19729. -- Resolution: Invalid I am resolving this as {{Invalid}}. Please reopen this if I was wrong with

[jira] [Resolved] (SPARK-16846) read.csv() option: "inferSchema" don't work

2017-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-16846. -- Resolution: Not A Problem If the schema is given, it does not infer the schema. > read.csv()

[jira] [Commented] (SPARK-17495) Hive hash implementation

2017-02-28 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889090#comment-15889090 ] Tejas Patil commented on SPARK-17495: - [~rxin]: >> 1. On the read side we shouldn't care which hash

[jira] [Resolved] (SPARK-19373) Mesos implementation of spark.scheduler.minRegisteredResourcesRatio looks at acquired cores rather than registerd cores

2017-02-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-19373. --- Resolution: Fixed Assignee: Michael Gummelt Fix Version/s: 2.2.0 > Mesos

[jira] [Commented] (SPARK-19373) Mesos implementation of spark.scheduler.minRegisteredResourcesRatio looks at acquired cores rather than registerd cores

2017-02-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889084#comment-15889084 ] Sean Owen commented on SPARK-19373: --- Resolved by https://github.com/apache/spark/pull/17045 > Mesos

[jira] [Commented] (SPARK-14503) spark.ml Scala API for FPGrowth

2017-02-28 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889085#comment-15889085 ] Joseph K. Bradley commented on SPARK-14503: --- Sorry for the slow reply. I actually haven't read

[jira] [Resolved] (SPARK-16512) No way to load CSV data without dropping whole rows when some of data is not matched with given schema

2017-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-16512. -- Resolution: Duplicate > No way to load CSV data without dropping whole rows when some of data

[jira] [Assigned] (SPARK-19769) Quickstart self-contained application instructions do not work with current sbt

2017-02-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-19769: - Assignee: Michael McCune > Quickstart self-contained application instructions do not work with

[jira] [Resolved] (SPARK-19521) Error with embedded line break (multi-line record) in csv file.

2017-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-19521. -- Resolution: Duplicate I am resolving this as a duplicate of SPARK-19610 as that one has a PR

[jira] [Resolved] (SPARK-17225) Support multiple null values in csv files

2017-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-17225. -- Resolution: Duplicate I am resolving this as a duplicate because that JIRA has a PR. >

[jira] [Resolved] (SPARK-14194) spark csv reader not working properly if CSV content contains CRLF character (newline) in the intermediate cell

2017-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-14194. -- Resolution: Duplicate I proposed to solve this via {{wholeFile}} option and it seems merged. I

[jira] [Resolved] (SPARK-19769) Quickstart self-contained application instructions do not work with current sbt

2017-02-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-19769. --- Resolution: Fixed Fix Version/s: 2.2.0 2.0.3 2.1.1

[jira] [Comment Edited] (SPARK-19764) Executors hang with supposedly running task that are really finished.

2017-02-28 Thread Ari Gesher (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889067#comment-15889067 ] Ari Gesher edited comment on SPARK-19764 at 2/28/17 11:04 PM: -- That was the

[jira] [Resolved] (SPARK-17224) Support skipping multiple header rows in csv

2017-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-17224. -- Resolution: Duplicate Now multiple header line can be dealt with by {{wholeFile}} option. Let

[jira] [Commented] (SPARK-19764) Executors hang with supposedly running task that are really finished.

2017-02-28 Thread Ari Gesher (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889067#comment-15889067 ] Ari Gesher commented on SPARK-19764: That was the log in the application directory on the driver

[jira] [Commented] (SPARK-16102) Use Record API from Univocity rather than current data cast API.

2017-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889058#comment-15889058 ] Hyukjin Kwon commented on SPARK-16102: -- Yes, let me check out this API and other APIs too. Let me

[jira] [Resolved] (SPARK-16103) Share a single Row for CSV data source rather than creating every time

2017-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-16103. -- Resolution: Duplicate yup, fixed in https://github.com/apache/spark/pull/16669 > Share a

[jira] [Commented] (SPARK-18389) Disallow cyclic view reference

2017-02-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889047#comment-15889047 ] Wenchen Fan commented on SPARK-18389: - [~jiangxb1987] are you working on it? > Disallow cyclic view

[jira] [Commented] (SPARK-19211) Explicitly prevent Insert into View or Create View As Insert

2017-02-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889048#comment-15889048 ] Wenchen Fan commented on SPARK-19211: - Hi [~jiangxb1987] are you working on it? > Explicitly prevent

[jira] [Updated] (SPARK-19775) Remove an obsolete `partitionBy().insertInto()` test case

2017-02-28 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-19775: -- Description: This issue removes [a test

[jira] [Commented] (SPARK-16103) Share a single Row for CSV data source rather than creating every time

2017-02-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889042#comment-15889042 ] Wenchen Fan commented on SPARK-16103: - seems it's already fixed? > Share a single Row for CSV data

[jira] [Commented] (SPARK-16103) Share a single Row for CSV data source rather than creating every time

2017-02-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889038#comment-15889038 ] Wenchen Fan commented on SPARK-16103: - Hi [~hyukjin.kwon] are you working on it? > Share a single

[jira] [Commented] (SPARK-16102) Use Record API from Univocity rather than current data cast API.

2017-02-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889036#comment-15889036 ] Wenchen Fan commented on SPARK-16102: - Hi [~hyukjin.kwon] are you working on it? > Use Record API

[jira] [Assigned] (SPARK-19774) StreamExecution should call stop() on sources when a stream fails

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19774: Assignee: Burak Yavuz (was: Apache Spark) > StreamExecution should call stop() on

[jira] [Assigned] (SPARK-19774) StreamExecution should call stop() on sources when a stream fails

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19774: Assignee: Apache Spark (was: Burak Yavuz) > StreamExecution should call stop() on

[jira] [Updated] (SPARK-19775) Remove an obsolete `partitionBy().insertInto()` test case

2017-02-28 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-19775: -- Description: This issue removes [a test

[jira] [Commented] (SPARK-19764) Executors hang with supposedly running task that are really finished.

2017-02-28 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889026#comment-15889026 ] Shixiong Zhu commented on SPARK-19764: -- [~agesher] driver-log-stderr.log is actually the executor

[jira] [Commented] (SPARK-19774) StreamExecution should call stop() on sources when a stream fails

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889032#comment-15889032 ] Apache Spark commented on SPARK-19774: -- User 'brkyvz' has created a pull request for this issue:

[jira] [Commented] (SPARK-14480) Remove meaningless StringIteratorReader for CSV data source for better performance

2017-02-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889030#comment-15889030 ] Wenchen Fan commented on SPARK-14480: - The regression has been fixed in

[jira] [Commented] (SPARK-19775) Remove an obsolete `partitionBy().insertInto()` test case

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889028#comment-15889028 ] Apache Spark commented on SPARK-19775: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-19775) Remove an obsolete `partitionBy().insertInto()` test case

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19775: Assignee: (was: Apache Spark) > Remove an obsolete `partitionBy().insertInto()` test

[jira] [Assigned] (SPARK-19775) Remove an obsolete `partitionBy().insertInto()` test case

2017-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19775: Assignee: Apache Spark > Remove an obsolete `partitionBy().insertInto()` test case >

[jira] [Resolved] (SPARK-14480) Remove meaningless StringIteratorReader for CSV data source for better performance

2017-02-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-14480. - Resolution: Fixed > Remove meaningless StringIteratorReader for CSV data source for better >
