[jira] [Commented] (SPARK-18508) Fix documentation for DateDiff

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678731#comment-15678731 ] Apache Spark commented on SPARK-18508: -- User 'rxin' has created a pull request for t

[jira] [Assigned] (SPARK-18508) Fix documentation for DateDiff

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18508: Assignee: Reynold Xin (was: Apache Spark) > Fix documentation for DateDiff >

[jira] [Assigned] (SPARK-18508) Fix documentation for DateDiff

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18508: Assignee: Apache Spark (was: Reynold Xin) > Fix documentation for DateDiff >

[jira] [Created] (SPARK-18508) Fix documentation for DateDiff

2016-11-18 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18508: --- Summary: Fix documentation for DateDiff Key: SPARK-18508 URL: https://issues.apache.org/jira/browse/SPARK-18508 Project: Spark Issue Type: Bug Compon

[jira] [Closed] (SPARK-18089) Remove CollectLimitExec operator

2016-11-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-18089. --- Resolution: Won't Fix > Remove CollectLimitExec operator > >

[jira] [Commented] (SPARK-18319) ML, Graph 2.1 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678230#comment-15678230 ] Joseph K. Bradley commented on SPARK-18319: --- I'd prefer not to open up Vector a

[jira] [Commented] (SPARK-18319) ML, Graph 2.1 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678231#comment-15678231 ] Joseph K. Bradley commented on SPARK-18319: --- Thanks [~yuhaoyan] for the audit!

[jira] [Resolved] (SPARK-18497) ForeachSink fails with "assertion failed: No plan for EventTimeWatermark"

2016-11-18 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-18497. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15934 [https://g

[jira] [Resolved] (SPARK-18505) Simplify AnalyzeColumnCommand

2016-11-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18505. - Resolution: Fixed Fix Version/s: 2.1.0 > Simplify AnalyzeColumnCommand > -

[jira] [Commented] (SPARK-18319) ML, Graph 2.1 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678218#comment-15678218 ] Joseph K. Bradley commented on SPARK-18319: --- I agree with the "probably ready t

[jira] [Comment Edited] (SPARK-18507) Major performance regression in SHOW PARTITIONS on partitioned Hive tables

2016-11-18 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678215#comment-15678215 ] Michael Allman edited comment on SPARK-18507 at 11/19/16 12:30 AM:

[jira] [Commented] (SPARK-18507) Major performance regression in SHOW PARTITIONS on partitioned Hive tables

2016-11-18 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678215#comment-15678215 ] Michael Allman commented on SPARK-18507: CC [~ekhliang] > Major performance regr

[jira] [Created] (SPARK-18507) Major performance regression in SHOW PARTITIONS on partitioned Hive tables

2016-11-18 Thread Michael Allman (JIRA)
Michael Allman created SPARK-18507: -- Summary: Major performance regression in SHOW PARTITIONS on partitioned Hive tables Key: SPARK-18507 URL: https://issues.apache.org/jira/browse/SPARK-18507 Projec

[jira] [Updated] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic

2016-11-18 Thread Heji Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heji Kim updated SPARK-18506: - Description: Our team is trying to upgrade to Spark 2.0.2/Kafka 0.10.1.0/spark-streaming-kafka-0-10_2.11

[jira] [Commented] (SPARK-18356) Issue + Resolution: Kmeans Spark Performances (ML package)

2016-11-18 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678196#comment-15678196 ] yuhao yang commented on SPARK-18356: Surely I would not mind. You're more than we

[jira] [Resolved] (SPARK-18477) Enable interrupts for HDFS in HDFSMetadataLog

2016-11-18 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-18477. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.3 Issue resolved by pull

[jira] [Updated] (SPARK-18477) Enable interrupts for HDFS in HDFSMetadataLog

2016-11-18 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-18477: -- Issue Type: Sub-task (was: Improvement) Parent: SPARK-8360 > Enable interrupts for HDF

[jira] [Created] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic

2016-11-18 Thread Heji Kim (JIRA)
Heji Kim created SPARK-18506: Summary: kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic Key: SPARK-18506 URL: https://issues.apache.org/jira/brows

[jira] [Closed] (SPARK-11613) Kinesis ASL should allow caller to set ClientConfiguration for socket timeouts and other connection setting

2016-11-18 Thread Heji Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heji Kim closed SPARK-11613. Resolution: Fixed > Kinesis ASL should allow caller to set ClientConfiguration for socket > timeouts and o

[jira] [Commented] (SPARK-18504) Scalar subquery with extra group by columns returning incorrect result

2016-11-18 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678001#comment-15678001 ] Nattavut Sutyanyong commented on SPARK-18504: - While we have SPARK-18455 to t

[jira] [Assigned] (SPARK-18504) Scalar subquery with extra group by columns returning incorrect result

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18504: Assignee: Apache Spark > Scalar subquery with extra group by columns returning incorrect r

[jira] [Assigned] (SPARK-18504) Scalar subquery with extra group by columns returning incorrect result

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18504: Assignee: (was: Apache Spark) > Scalar subquery with extra group by columns returning

[jira] [Commented] (SPARK-18504) Scalar subquery with extra group by columns returning incorrect result

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677991#comment-15677991 ] Apache Spark commented on SPARK-18504: -- User 'nsyca' has created a pull request for

[jira] [Commented] (SPARK-18188) Add checksum for block of broadcast

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677983#comment-15677983 ] Apache Spark commented on SPARK-18188: -- User 'davies' has created a pull request for

[jira] [Comment Edited] (SPARK-2984) FileNotFoundException on _temporary directory

2016-11-18 Thread Giuseppe Bonaccorso (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677971#comment-15677971 ] Giuseppe Bonaccorso edited comment on SPARK-2984 at 11/18/16 10:35 PM: -

[jira] [Commented] (SPARK-2984) FileNotFoundException on _temporary directory

2016-11-18 Thread Giuseppe Bonaccorso (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677971#comment-15677971 ] Giuseppe Bonaccorso commented on SPARK-2984: I'm facing the same issue with EM

[jira] [Updated] (SPARK-5992) Locality Sensitive Hashing (LSH)

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5992: - Summary: Locality Sensitive Hashing (LSH) (was: Locality Sensitive Hashing (LSH) for MLli

[jira] [Updated] (SPARK-18188) Add checksum for block of broadcast

2016-11-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-18188: --- Description: There is an understanding issue for a long time: https://issues.apache.org/jira/browse/

[jira] [Updated] (SPARK-18188) Add checksum for block of broadcast

2016-11-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-18188: --- Summary: Add checksum for block of broadcast (was: Add checksum for block in Spark) > Add checksum

[jira] [Closed] (SPARK-18000) Aggregation function for computing bins (distinct value, count) pairs for equi-width histograms

2016-11-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-18000. --- Resolution: Won't Fix Marking this as won't fix, since it looks like combination of count-min sketch

[jira] [Updated] (SPARK-16561) Potential numerical problem in MultivariateOnlineSummarizer min/max

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-16561: -- Summary: Potential numerical problem in MultivariateOnlineSummarizer min/max (was: Pot

[jira] [Updated] (SPARK-16831) PySpark CrossValidator reports incorrect avgMetrics

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-16831: -- Summary: PySpark CrossValidator reports incorrect avgMetrics (was: CrossValidator repo

[jira] [Assigned] (SPARK-18497) ForeachSink fails with "assertion failed: No plan for EventTimeWatermark"

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18497: Assignee: Shixiong Zhu (was: Apache Spark) > ForeachSink fails with "assertion failed: No

[jira] [Commented] (SPARK-18497) ForeachSink fails with "assertion failed: No plan for EventTimeWatermark"

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677932#comment-15677932 ] Apache Spark commented on SPARK-18497: -- User 'zsxwing' has created a pull request fo

[jira] [Assigned] (SPARK-18497) ForeachSink fails with "assertion failed: No plan for EventTimeWatermark"

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18497: Assignee: Apache Spark (was: Shixiong Zhu) > ForeachSink fails with "assertion failed: No

[jira] [Updated] (SPARK-18505) Simplify AnalyzeColumnCommand

2016-11-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18505: Description: I'm spending more time at the design & code level for cost-based optimizer now, and h

[jira] [Commented] (SPARK-18505) Simplify AnalyzeColumnCommand

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677895#comment-15677895 ] Apache Spark commented on SPARK-18505: -- User 'rxin' has created a pull request for t

[jira] [Assigned] (SPARK-18505) Simplify AnalyzeColumnCommand

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18505: Assignee: Reynold Xin (was: Apache Spark) > Simplify AnalyzeColumnCommand > -

[jira] [Assigned] (SPARK-18505) Simplify AnalyzeColumnCommand

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18505: Assignee: Apache Spark (was: Reynold Xin) > Simplify AnalyzeColumnCommand > -

[jira] [Assigned] (SPARK-18497) ForeachSink fails with "assertion failed: No plan for EventTimeWatermark"

2016-11-18 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-18497: Assignee: Shixiong Zhu > ForeachSink fails with "assertion failed: No plan for EventTimeWa

[jira] [Created] (SPARK-18505) Simplify AnalyzeColumnCommand

2016-11-18 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18505: --- Summary: Simplify AnalyzeColumnCommand Key: SPARK-18505 URL: https://issues.apache.org/jira/browse/SPARK-18505 Project: Spark Issue Type: Sub-task Co

[jira] [Updated] (SPARK-18422) Fix wholeTextFiles test to pass on Windows in JavaAPISuite

2016-11-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18422: -- Assignee: Hyukjin Kwon > Fix wholeTextFiles test to pass on Windows in JavaAPISuite > -

[jira] [Commented] (SPARK-18504) Scalar subquery with extra group by columns returning incorrect result

2016-11-18 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677849#comment-15677849 ] Herman van Hovell commented on SPARK-18504: --- Could you open a PR? > Scalar sub

[jira] [Resolved] (SPARK-18422) Fix wholeTextFiles test to pass on Windows in JavaAPISuite

2016-11-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18422. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15866 [https://github.co

[jira] [Commented] (SPARK-18504) Scalar subquery with extra group by columns returning incorrect result

2016-11-18 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677848#comment-15677848 ] Herman van Hovell commented on SPARK-18504: --- Is this a valid correlated scalar

[jira] [Updated] (SPARK-17363) fix MultivariateOnlineSummerizer.numNonZeros

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-17363: -- Summary: fix MultivariateOnlineSummerizer.numNonZeros (was: fix MultivariantOnlineSumm

[jira] [Updated] (SPARK-18504) Scalar subquery with extra group by columns returning incorrect result

2016-11-18 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nattavut Sutyanyong updated SPARK-18504: Summary: Scalar subquery with extra group by columns returning incorrect result (w

[jira] [Commented] (SPARK-18504) Scalar subquery with extra group by columns returning incorrect result

2016-11-18 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677765#comment-15677765 ] Nattavut Sutyanyong commented on SPARK-18504: - // Incorrect result Seq(1).toD

[jira] [Created] (SPARK-18504) Scalar subquery returning incorrect result

2016-11-18 Thread Nattavut Sutyanyong (JIRA)
Nattavut Sutyanyong created SPARK-18504: --- Summary: Scalar subquery returning incorrect result Key: SPARK-18504 URL: https://issues.apache.org/jira/browse/SPARK-18504 Project: Spark Issu

[jira] [Created] (SPARK-18503) Pre 2.0 spark driver/executor memory default unit is bytes, post 2.0 default unit is MB

2016-11-18 Thread Chris McCubbin (JIRA)
Chris McCubbin created SPARK-18503: -- Summary: Pre 2.0 spark driver/executor memory default unit is bytes, post 2.0 default unit is MB Key: SPARK-18503 URL: https://issues.apache.org/jira/browse/SPARK-18503

[jira] [Updated] (SPARK-18334) What hashDistance should MinHash use?

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18334: -- Target Version/s: (was: 2.1.0) > What hashDistance should MinHash use? >

[jira] [Updated] (SPARK-18334) What hashDistance should MinHash use?

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18334: -- Priority: Minor (was: Trivial) > What hashDistance should MinHash use? > -

[jira] [Commented] (SPARK-18334) MinHash should use binary hash distance

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677667#comment-15677667 ] Joseph K. Bradley commented on SPARK-18334: --- Adding a note: Per discussions on

[jira] [Updated] (SPARK-18334) What hashDistance should MinHash use?

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18334: -- Summary: What hashDistance should MinHash use? (was: MinHash should use binary hash di

[jira] [Updated] (SPARK-18339) Don't push down current_timestamp for filters in StructuredStreaming

2016-11-18 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18339: - Assignee: Tyson Condie > Don't push down current_timestamp for filters in StructuredStrea

[jira] [Updated] (SPARK-18339) Don't push down current_timestamp for filters in StructuredStreaming

2016-11-18 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18339: - Target Version/s: 2.1.0 (was: 2.2.0) > Don't push down current_timestamp for filters in

[jira] [Closed] (SPARK-18252) Improve serialized BloomFilter size

2016-11-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-18252. --- Resolution: Won't Fix > Improve serialized BloomFilter size > --- > >

[jira] [Commented] (SPARK-18252) Improve serialized BloomFilter size

2016-11-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677521#comment-15677521 ] Reynold Xin commented on SPARK-18252: - Thanks - going to close this. > Improve seri

[jira] [Resolved] (SPARK-18457) ORC and other columnar formats using HiveShim read all columns when doing a simple count

2016-11-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18457. - Resolution: Fixed Assignee: Andrew Ray Fix Version/s: 2.1.0 > ORC and other colum

[jira] [Resolved] (SPARK-18187) CompactibleFileStreamLog should not rely on "compactInterval" to detect a compaction batch

2016-11-18 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-18187. -- Resolution: Fixed Assignee: Tyson Condie Fix Version/s: 2.1.0 > CompactibleFile

[jira] [Commented] (SPARK-18187) CompactibleFileStreamLog should not rely on "compactInterval" to detect a compaction batch

2016-11-18 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677492#comment-15677492 ] Shixiong Zhu commented on SPARK-18187: -- Resolved by https://github.com/apache/spark/

[jira] [Closed] (SPARK-11785) When deployed against remote Hive metastore with lower versions, JDBC metadata calls throws exception

2016-11-18 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-11785. - Resolution: Fixed Fix Version/s: 2.1.0 > When deployed against remote Hive metasto

[jira] [Commented] (SPARK-11785) When deployed against remote Hive metastore with lower versions, JDBC metadata calls throws exception

2016-11-18 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677469#comment-15677469 ] Cheng Lian commented on SPARK-11785: But I'm not sure which PR fixes this issue, thou

[jira] [Updated] (SPARK-10643) Support remote application download in client mode spark submit

2016-11-18 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Gummelt updated SPARK-10643: Summary: Support remote application download in client mode spark submit (was: Support HDF

[jira] [Commented] (SPARK-10643) Support HDFS application download in client mode spark submit

2016-11-18 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677467#comment-15677467 ] Michael Gummelt commented on SPARK-10643: - It's not just HDFS. HTTP urls fail as

[jira] [Commented] (SPARK-11785) When deployed against remote Hive metastore with lower versions, JDBC metadata calls throws exception

2016-11-18 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677468#comment-15677468 ] Cheng Lian commented on SPARK-11785: Confirmed that this is no longer an issue for 2.

[jira] [Updated] (SPARK-18321) ML 2.1 QA: API: Java compatibility, docs

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18321: -- Assignee: Seth Hendrickson > ML 2.1 QA: API: Java compatibility, docs > ---

[jira] [Resolved] (SPARK-18321) ML 2.1 QA: API: Java compatibility, docs

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-18321. --- Resolution: Fixed Fix Version/s: 2.1.0 > ML 2.1 QA: API: Java compatibility, d

[jira] [Commented] (SPARK-18321) ML 2.1 QA: API: Java compatibility, docs

2016-11-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677435#comment-15677435 ] Joseph K. Bradley commented on SPARK-18321: --- I checked the diff between docs as

[jira] [Comment Edited] (SPARK-18251) DataSet API | RuntimeException: Null value appeared in non-nullable field when holding Option Case Class

2016-11-18 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677396#comment-15677396 ] Cheng Lian edited comment on SPARK-18251 at 11/18/16 6:38 PM: -

[jira] [Comment Edited] (SPARK-18251) DataSet API | RuntimeException: Null value appeared in non-nullable field when holding Option Case Class

2016-11-18 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677396#comment-15677396 ] Cheng Lian edited comment on SPARK-18251 at 11/18/16 6:37 PM: -

[jira] [Commented] (SPARK-18251) DataSet API | RuntimeException: Null value appeared in non-nullable field when holding Option Case Class

2016-11-18 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677396#comment-15677396 ] Cheng Lian commented on SPARK-18251: I'd prefer option 1 because of consistency of th

[jira] [Commented] (SPARK-18218) Optimize BlockMatrix multiplication, which may cause OOM and low parallelism usage problem in several cases

2016-11-18 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677367#comment-15677367 ] Burak Yavuz commented on SPARK-18218: - [~WeichenXu123] You are correct, this would be

[jira] [Commented] (SPARK-18134) SQL: MapType in Group BY and Joins not working

2016-11-18 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677266#comment-15677266 ] Herman van Hovell commented on SPARK-18134: --- There is not a political reason fo

[jira] [Commented] (SPARK-18134) SQL: MapType in Group BY and Joins not working

2016-11-18 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677252#comment-15677252 ] Herman van Hovell commented on SPARK-18134: --- In both cases you could use sorted

[jira] [Comment Edited] (SPARK-13913) DataFrame.withColumn fails when trying to replace existing column with dot in name

2016-11-18 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677144#comment-15677144 ] Barry Becker edited comment on SPARK-13913 at 11/18/16 5:02 PM: ---

[jira] [Commented] (SPARK-13913) DataFrame.withColumn fails when trying to replace existing column with dot in name

2016-11-18 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677144#comment-15677144 ] Barry Becker commented on SPARK-13913: -- I can still reproduce this using spark 1.6.3

[jira] [Commented] (SPARK-18249) StackOverflowError when saving dataset to parquet

2016-11-18 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677115#comment-15677115 ] Herman van Hovell commented on SPARK-18249: --- The good news is that this is not

[jira] [Commented] (SPARK-18202) Spark throws a mysterious system error when a Hive command has at least 100,000 results

2016-11-18 Thread Martin Petricek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677072#comment-15677072 ] Martin Petricek commented on SPARK-18202: - I encountered the same problem in 1.6.

[jira] [Commented] (SPARK-18134) SQL: MapType in Group BY and Joins not working

2016-11-18 Thread Christian Zorneck (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677059#comment-15677059 ] Christian Zorneck commented on SPARK-18134: --- Because there is no valid workarou

[jira] [Commented] (SPARK-14155) Hide UserDefinedType in Spark 2.0

2016-11-18 Thread Raghu Ganti (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677052#comment-15677052 ] Raghu Ganti commented on SPARK-14155: - Is there an update re this? Any timeline as to

[jira] [Commented] (SPARK-18252) Improve serialized BloomFilter size

2016-11-18 Thread Gregory SSI-YAN-KAI (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676986#comment-15676986 ] Gregory SSI-YAN-KAI commented on SPARK-18252: - I've worked on a custom implem

[jira] [Commented] (SPARK-18496) java.lang.AssertionError: assertion failed

2016-11-18 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676956#comment-15676956 ] Dongjoon Hyun commented on SPARK-18496: --- Hi, Harish. Do you mean the officially Apa

[jira] [Commented] (SPARK-10872) Derby error (XSDB6) when creating new HiveContext after restarting SparkContext

2016-11-18 Thread Michal W (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676944#comment-15676944 ] Michal W commented on SPARK-10872: -- I'm having the same issue on 1.6.1. It's quite incon

[jira] [Commented] (SPARK-18252) Improve serialized BloomFilter size

2016-11-18 Thread Aleksey Ponkin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676908#comment-15676908 ] Aleksey Ponkin commented on SPARK-18252: The only thing that can be improved, IMH

[jira] [Commented] (SPARK-18252) Improve serialized BloomFilter size

2016-11-18 Thread Aleksey Ponkin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676899#comment-15676899 ] Aleksey Ponkin commented on SPARK-18252: I did benchmarks(you can find it [here|

[jira] [Assigned] (SPARK-18448) SparkSession should implement java.lang.AutoCloseable like JavaSparkContext

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18448: Assignee: Apache Spark > SparkSession should implement java.lang.AutoCloseable like JavaSp

[jira] [Commented] (SPARK-18448) SparkSession should implement java.lang.AutoCloseable like JavaSparkContext

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676878#comment-15676878 ] Apache Spark commented on SPARK-18448: -- User 'srowen' has created a pull request for

[jira] [Assigned] (SPARK-18448) SparkSession should implement java.lang.AutoCloseable like JavaSparkContext

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18448: Assignee: (was: Apache Spark) > SparkSession should implement java.lang.AutoCloseable

[jira] [Created] (SPARK-18502) Spark does not handle columns that contain backquote (`)

2016-11-18 Thread Barry Becker (JIRA)
Barry Becker created SPARK-18502: Summary: Spark does not handle columns that contain backquote (`) Key: SPARK-18502 URL: https://issues.apache.org/jira/browse/SPARK-18502 Project: Spark Issu

[jira] [Commented] (SPARK-11977) Support accessing a DataFrame column using its name without backticks if the name contains '.'

2016-11-18 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676856#comment-15676856 ] Barry Becker commented on SPARK-11977: -- I would also like to know how to handle colu

[jira] [Comment Edited] (SPARK-18484) case class datasets - ability to specify decimal precision and scale

2016-11-18 Thread Damian Momot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673347#comment-15673347 ] Damian Momot edited comment on SPARK-18484 at 11/18/16 1:51 PM: ---

[jira] [Commented] (SPARK-18249) StackOverflowError when saving dataset to parquet

2016-11-18 Thread Damian Momot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676734#comment-15676734 ] Damian Momot commented on SPARK-18249: -- Just check that it also happens on 2.0.2 and

[jira] [Updated] (SPARK-18249) StackOverflowError when saving dataset to parquet

2016-11-18 Thread Damian Momot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damian Momot updated SPARK-18249: - Affects Version/s: 2.1.0 Summary: StackOverflowError when saving dataset to parquet

[jira] [Updated] (SPARK-18249) Spark 2.0.1 - StackOverflowError when saving dataset to parquet

2016-11-18 Thread Damian Momot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damian Momot updated SPARK-18249: - Affects Version/s: 2.0.2 > Spark 2.0.1 - StackOverflowError when saving dataset to parquet >

[jira] [Resolved] (SPARK-18393) DataFrame pivot output column names should respect aliases

2016-11-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18393. --- Resolution: Duplicate > DataFrame pivot output column names should respect aliases >

[jira] [Commented] (SPARK-18471) In treeAggregate, generate (big) zeros instead of sending them.

2016-11-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676386#comment-15676386 ] Apache Spark commented on SPARK-18471: -- User 'AnthonyTruchet' has created a pull req

[jira] [Resolved] (SPARK-12278) Move the shuffle related test case from Yarn module to Core module

2016-11-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-12278. --- Resolution: Not A Problem I don't see a suite by this name any more anyway. It sounds like this would

[jira] [Commented] (SPARK-18356) Issue + Resolution: Kmeans Spark Performances (ML package)

2016-11-18 Thread zakaria hili (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676337#comment-15676337 ] zakaria hili commented on SPARK-18356: -- if you don't mind , yes > Issue + Resolutio
