[jira] [Updated] (SPARK-14194) spark csv reader not working properly if CSV content contains CRLF character (newline) in the intermediate cell

2017-02-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-14194: -- Affects Version/s: 2.1.0 > spark csv reader not working properly if CSV content contains CRLF

[jira] [Created] (SPARK-19646) binaryRecords replicates records in scala API

2017-02-16 Thread BahaaEddin AlAila (JIRA)
BahaaEddin AlAila created SPARK-19646: - Summary: binaryRecords replicates records in scala API Key: SPARK-19646 URL: https://issues.apache.org/jira/browse/SPARK-19646 Project: Spark

[jira] [Updated] (SPARK-19645) structured streaming job restart bug

2017-02-16 Thread guifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guifeng updated SPARK-19645: Description: We are trying to use Structured Streaming in product, however currently there exists a

[jira] [Updated] (SPARK-19645) structured streaming job restart bug

2017-02-16 Thread guifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guifeng updated SPARK-19645: Description: We are trying to use Structured Streaming in product, however currently there exists a

[jira] [Updated] (SPARK-19645) structured streaming job restart bug

2017-02-16 Thread guifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guifeng updated SPARK-19645: Description: We are trying to use Structured Streaming in product, however currently there exists a

[jira] [Updated] (SPARK-19645) structured streaming job restart bug

2017-02-16 Thread guifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guifeng updated SPARK-19645: Summary: structured streaming job restart bug (was: structured streaming job restart) > structured

[jira] [Updated] (SPARK-19645) structured streaming job restart

2017-02-16 Thread guifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guifeng updated SPARK-19645: Description: We are trying to use Structured Streaming in product, however currently there exists a

[jira] [Updated] (SPARK-19645) structured streaming job restart

2017-02-16 Thread guifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guifeng updated SPARK-19645: Description: We are trying to use Structured Streaming in product, however currently there exists a

[jira] [Updated] (SPARK-19645) structured streaming job restart

2017-02-16 Thread guifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guifeng updated SPARK-19645: Description: We are trying to use Structured Streaming in product, however currently there exists a

[jira] [Created] (SPARK-19645) structured streaming job restart

2017-02-16 Thread guifeng (JIRA)
guifeng created SPARK-19645: --- Summary: structured streaming job restart Key: SPARK-19645 URL: https://issues.apache.org/jira/browse/SPARK-19645 Project: Spark Issue Type: Bug Components:

[jira] [Updated] (SPARK-19644) Memory leak in Spark Streaming

2017-02-16 Thread Deenbandhu Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deenbandhu Agarwal updated SPARK-19644: --- Affects Version/s: (was: 2.0.1) 2.0.2

[jira] [Updated] (SPARK-19644) Memory leak in Spark Streaming

2017-02-16 Thread Deenbandhu Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deenbandhu Agarwal updated SPARK-19644: --- Attachment: heapdump.png Snapshot of heap dump after 50 hours > Memory leak in

[jira] [Created] (SPARK-19644) Memory leak in Spark Streaming

2017-02-16 Thread Deenbandhu Agarwal (JIRA)
Deenbandhu Agarwal created SPARK-19644: -- Summary: Memory leak in Spark Streaming Key: SPARK-19644 URL: https://issues.apache.org/jira/browse/SPARK-19644 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-19641) JSON schema inference in DROPMALFORMED mode produces incorrect schema

2017-02-16 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871202#comment-15871202 ] Takeshi Yamamuro commented on SPARK-19641: -- okay, thanks! > JSON schema inference in

[jira] [Commented] (SPARK-19641) JSON schema inference in DROPMALFORMED mode produces incorrect schema

2017-02-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871196#comment-15871196 ] Hyukjin Kwon commented on SPARK-19641: -- Ah, thanks for cc'ing me. I happened to see the related

[jira] [Assigned] (SPARK-19557) Output parameters are not present in SQL Query Plan

2017-02-16 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-19557: --- Assignee: Wenchen Fan > Output parameters are not present in SQL Query Plan >

[jira] [Assigned] (SPARK-18120) QueryExecutionListener method doesnt' get executed for DataFrameWriter methods

2017-02-16 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-18120: --- Assignee: Wenchen Fan > QueryExecutionListener method doesnt' get executed for

[jira] [Resolved] (SPARK-19557) Output parameters are not present in SQL Query Plan

2017-02-16 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19557. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16962

[jira] [Resolved] (SPARK-18120) QueryExecutionListener method doesnt' get executed for DataFrameWriter methods

2017-02-16 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-18120. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16962

[jira] [Assigned] (SPARK-18352) Parse normal, multi-line JSON files (not just JSON Lines)

2017-02-16 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-18352: --- Assignee: Nathan Howell > Parse normal, multi-line JSON files (not just JSON Lines) >

[jira] [Resolved] (SPARK-18352) Parse normal, multi-line JSON files (not just JSON Lines)

2017-02-16 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-18352. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16386

[jira] [Commented] (SPARK-19638) Filter pushdown not working for struct fields

2017-02-16 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871162#comment-15871162 ] Takeshi Yamamuro commented on SPARK-19638: -- Filter pushdown depends on the data source, so

[jira] [Created] (SPARK-19643) Document how to use Spark/SparkR on Windows

2017-02-16 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-19643: Summary: Document how to use Spark/SparkR on Windows Key: SPARK-19643 URL: https://issues.apache.org/jira/browse/SPARK-19643 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-19641) JSON schema inference in DROPMALFORMED mode produces incorrect schema

2017-02-16 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871152#comment-15871152 ] Takeshi Yamamuro commented on SPARK-19641: -- Could you show us a simple query to reproduce this?

[jira] [Assigned] (SPARK-19556) Broadcast data is not encrypted when I/O encryption is on

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19556: Assignee: (was: Apache Spark) > Broadcast data is not encrypted when I/O encryption

[jira] [Assigned] (SPARK-19556) Broadcast data is not encrypted when I/O encryption is on

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19556: Assignee: Apache Spark > Broadcast data is not encrypted when I/O encryption is on >

[jira] [Commented] (SPARK-19573) Make NaN/null handling consistent in approxQuantile

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871114#comment-15871114 ] Apache Spark commented on SPARK-19573: -- User 'zhengruifeng' has created a pull request for this

[jira] [Assigned] (SPARK-19573) Make NaN/null handling consistent in approxQuantile

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19573: Assignee: (was: Apache Spark) > Make NaN/null handling consistent in approxQuantile >

[jira] [Assigned] (SPARK-19573) Make NaN/null handling consistent in approxQuantile

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19573: Assignee: Apache Spark > Make NaN/null handling consistent in approxQuantile >

[jira] [Comment Edited] (SPARK-19623) Take rows from DataFrame with empty first partition

2017-02-16 Thread Jaeboo Jung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870990#comment-15870990 ] Jaeboo Jung edited comment on SPARK-19623 at 2/17/17 2:15 AM: -- Increasing

[jira] [Updated] (SPARK-19642) Improve the security guarantee for rest api and ui

2017-02-16 Thread Genmao Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Genmao Yu updated SPARK-19642: -- Summary: Improve the security guarantee for rest api and ui (was: Improve the security guarantee for

[jira] [Updated] (SPARK-19642) Improve the security guarantee for rest api

2017-02-16 Thread Genmao Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Genmao Yu updated SPARK-19642: -- Description: As Spark gets more and more features, data may start leaking through other places (e.g.

[jira] [Updated] (SPARK-19642) Improve the security guarantee for rest api

2017-02-16 Thread Genmao Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Genmao Yu updated SPARK-19642: -- Description: As Spark gets more and more features, data may start leaking through other places (e.g.

[jira] [Commented] (SPARK-19642) Improve the security guarantee for rest api

2017-02-16 Thread Genmao Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871058#comment-15871058 ] Genmao Yu commented on SPARK-19642: --- cc [~ajbozarth], [~vanzin] and [~srowen] > Improve the security

[jira] [Created] (SPARK-19642) Improve the security guarantee for rest api

2017-02-16 Thread Genmao Yu (JIRA)
Genmao Yu created SPARK-19642: - Summary: Improve the security guarantee for rest api Key: SPARK-19642 URL: https://issues.apache.org/jira/browse/SPARK-19642 Project: Spark Issue Type:

[jira] [Commented] (SPARK-19625) Authorization Support(on all operations not only DDL) in Spark Sql version 2.1.0

2017-02-16 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-19625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871035#comment-15871035 ] 翟玉勇 commented on SPARK-19625: - for spark 1.5.2 has https://github.com/apache/spark/pull/10144

[jira] [Created] (SPARK-19641) JSON schema inference in DROPMALFORMED mode produces incorrect schema

2017-02-16 Thread Nathan Howell (JIRA)
Nathan Howell created SPARK-19641: - Summary: JSON schema inference in DROPMALFORMED mode produces incorrect schema Key: SPARK-19641 URL: https://issues.apache.org/jira/browse/SPARK-19641 Project:

[jira] [Commented] (SPARK-19497) dropDuplicates with watermark

2017-02-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871016#comment-15871016 ] Shixiong Zhu commented on SPARK-19497: -- [~samelamin] Thanks! I just submitted a PR. Could you help

[jira] [Commented] (SPARK-19640) Incorrect documentation for MLlib CountVectorizerModel for spark 1.5.2

2017-02-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871015#comment-15871015 ] yuhao yang commented on SPARK-19640: Thanks for reporting the issue. Feel free to send a fix if you

[jira] [Assigned] (SPARK-19497) dropDuplicates with watermark

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19497: Assignee: Shixiong Zhu (was: Apache Spark) > dropDuplicates with watermark >

[jira] [Commented] (SPARK-19497) dropDuplicates with watermark

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871010#comment-15871010 ] Apache Spark commented on SPARK-19497: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19497) dropDuplicates with watermark

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19497: Assignee: Apache Spark (was: Shixiong Zhu) > dropDuplicates with watermark >

[jira] [Commented] (SPARK-19640) Incorrect documentation for MLlib CountVectorizerModel for spark 1.5.2

2017-02-16 Thread Stephen Kinser (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871000#comment-15871000 ] Stephen Kinser commented on SPARK-19640: [~yuhaoyan] I saw that you were the one who originally

[jira] [Created] (SPARK-19640) Incorrect documentation for MLlib CountVectorizerModel for spark 1.5.2

2017-02-16 Thread Stephen Kinser (JIRA)
Stephen Kinser created SPARK-19640: -- Summary: Incorrect documentation for MLlib CountVectorizerModel for spark 1.5.2 Key: SPARK-19640 URL: https://issues.apache.org/jira/browse/SPARK-19640 Project:

[jira] [Commented] (SPARK-19623) Take rows from DataFrame with empty first partition

2017-02-16 Thread Jaeboo Jung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870990#comment-15870990 ] Jaeboo Jung commented on SPARK-19623: - Increasing driver memory can't clear this issue because memory

[jira] [Assigned] (SPARK-19639) Add spark.svmLinear example and update vignettes

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19639: Assignee: Apache Spark > Add spark.svmLinear example and update vignettes >

[jira] [Assigned] (SPARK-19639) Add spark.svmLinear example and update vignettes

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19639: Assignee: (was: Apache Spark) > Add spark.svmLinear example and update vignettes >

[jira] [Commented] (SPARK-19639) Add spark.svmLinear example and update vignettes

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870918#comment-15870918 ] Apache Spark commented on SPARK-19639: -- User 'wangmiao1981' has created a pull request for this

[jira] [Created] (SPARK-19639) Add spark.svmLinear example and update vignettes

2017-02-16 Thread Miao Wang (JIRA)
Miao Wang created SPARK-19639: - Summary: Add spark.svmLinear example and update vignettes Key: SPARK-19639 URL: https://issues.apache.org/jira/browse/SPARK-19639 Project: Spark Issue Type:

[jira] [Created] (SPARK-19638) Filter pushdown not working for struct fields

2017-02-16 Thread Nick Dimiduk (JIRA)
Nick Dimiduk created SPARK-19638: Summary: Filter pushdown not working for struct fields Key: SPARK-19638 URL: https://issues.apache.org/jira/browse/SPARK-19638 Project: Spark Issue Type:

[jira] [Created] (SPARK-19637) add to_json APIs to SQL

2017-02-16 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-19637: --- Summary: add to_json APIs to SQL Key: SPARK-19637 URL: https://issues.apache.org/jira/browse/SPARK-19637 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-19634) Feature parity for descriptive statistics in MLlib

2017-02-16 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870825#comment-15870825 ] Miao Wang commented on SPARK-19634: --- I can give it a try. Thanks! Miao > Feature parity for descriptive

[jira] [Commented] (SPARK-14658) when executor lost DagScheduer may submit one stage twice even if the first running taskset for this stage is not finished

2017-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870816#comment-15870816 ] Josh Rosen commented on SPARK-14658: Here are the logs from my reproduction, excerpted down to only the

[jira] [Assigned] (SPARK-19337) Documentation and examples for LinearSVC

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19337: Assignee: Apache Spark > Documentation and examples for LinearSVC >

[jira] [Commented] (SPARK-19337) Documentation and examples for LinearSVC

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870813#comment-15870813 ] Apache Spark commented on SPARK-19337: -- User 'hhbyyh' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19337) Documentation and examples for LinearSVC

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19337: Assignee: (was: Apache Spark) > Documentation and examples for LinearSVC >

[jira] [Assigned] (SPARK-18409) LSH approxNearestNeighbors should use approxQuantile instead of sort

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18409: Assignee: Apache Spark > LSH approxNearestNeighbors should use approxQuantile instead of

[jira] [Commented] (SPARK-18409) LSH approxNearestNeighbors should use approxQuantile instead of sort

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870791#comment-15870791 ] Apache Spark commented on SPARK-18409: -- User 'Yunni' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18409) LSH approxNearestNeighbors should use approxQuantile instead of sort

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18409: Assignee: (was: Apache Spark) > LSH approxNearestNeighbors should use approxQuantile

[jira] [Resolved] (SPARK-18286) Add Scala/Java/Python examples for MinHash and RandomProjection

2017-02-16 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Ni resolved SPARK-18286. Resolution: Fixed Fix Version/s: 2.2.0 > Add Scala/Java/Python examples for MinHash and

[jira] [Commented] (SPARK-19553) Add GroupedData.countApprox()

2017-02-16 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870780#comment-15870780 ] Nicholas Chammas commented on SPARK-19553: -- The utility of 1) would be being able to count items

[jira] [Updated] (SPARK-14658) when executor lost DagScheduer may submit one stage twice even if the first running taskset for this stage is not finished

2017-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-14658: --- Description: {code} 16/04/14 15:35:22 ERROR DAGSchedulerEventProcessLoop:

[jira] [Commented] (SPARK-14658) when executor lost DagScheduer may submit one stage twice even if the first running taskset for this stage is not finished

2017-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870777#comment-15870777 ] Josh Rosen commented on SPARK-14658: [~srowen], I think that [~yixiaohua] is right here: it looks

[jira] [Updated] (SPARK-14658) when executor lost DagScheduer may submit one stage twice even if the first running taskset for this stage is not finished

2017-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-14658: --- Affects Version/s: 2.2.0 2.0.0 2.1.0 > when executor

[jira] [Updated] (SPARK-14658) when executor lost DagScheduer may submit one stage twice even if the first running taskset for this stage is not finished

2017-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-14658: --- Component/s: (was: Spark Core) Scheduler > when executor lost DagScheduer may

[jira] [Reopened] (SPARK-14658) when executor lost DagScheduer may submit one stage twice even if the first running taskset for this stage is not finished

2017-02-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reopened SPARK-14658: > when executor lost DagScheduer may submit one stage twice even if the first > running taskset for

[jira] [Updated] (SPARK-19628) Duplicate Spark jobs in 2.1.0

2017-02-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-19628: -- Fix Version/s: (was: 2.0.1) > Duplicate Spark jobs in 2.1.0 > - > >

[jira] [Assigned] (SPARK-18450) Add AND-amplification to Locality Sensitive Hashing

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18450: Assignee: (was: Apache Spark) > Add AND-amplification to Locality Sensitive Hashing >

[jira] [Commented] (SPARK-18450) Add AND-amplification to Locality Sensitive Hashing

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870727#comment-15870727 ] Apache Spark commented on SPARK-18450: -- User 'Yunni' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19163) Lazy creation of the _judf

2017-02-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-19163: - Assignee: Maciej Szymkiewicz > Lazy creation of the _judf > -- > >

[jira] [Assigned] (SPARK-18450) Add AND-amplification to Locality Sensitive Hashing

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18450: Assignee: Apache Spark > Add AND-amplification to Locality Sensitive Hashing >

[jira] [Assigned] (SPARK-19586) Incorrect push down filter for double negative in SQL

2017-02-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-19586: - Assignee: Xiao Li > Incorrect push down filter for double negative in SQL >

[jira] [Assigned] (SPARK-16043) Prepare GenericArrayData implementation specialized for a primitive array

2017-02-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-16043: - Assignee: Kazuaki Ishizaki > Prepare GenericArrayData implementation specialized for a

[jira] [Created] (SPARK-19636) Feature parity for correlation statistics in MLlib

2017-02-16 Thread Timothy Hunter (JIRA)
Timothy Hunter created SPARK-19636: -- Summary: Feature parity for correlation statistics in MLlib Key: SPARK-19636 URL: https://issues.apache.org/jira/browse/SPARK-19636 Project: Spark Issue

[jira] [Created] (SPARK-19635) Feature parity for Chi-square hypothesis testing in MLlib

2017-02-16 Thread Timothy Hunter (JIRA)
Timothy Hunter created SPARK-19635: -- Summary: Feature parity for Chi-square hypothesis testing in MLlib Key: SPARK-19635 URL: https://issues.apache.org/jira/browse/SPARK-19635 Project: Spark

[jira] [Created] (SPARK-19634) Feature parity for descriptive statistics in MLlib

2017-02-16 Thread Timothy Hunter (JIRA)
Timothy Hunter created SPARK-19634: -- Summary: Feature parity for descriptive statistics in MLlib Key: SPARK-19634 URL: https://issues.apache.org/jira/browse/SPARK-19634 Project: Spark Issue

[jira] [Commented] (SPARK-19557) Output parameters are not present in SQL Query Plan

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870664#comment-15870664 ] Apache Spark commented on SPARK-19557: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19557) Output parameters are not present in SQL Query Plan

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19557: Assignee: Apache Spark > Output parameters are not present in SQL Query Plan >

[jira] [Assigned] (SPARK-19557) Output parameters are not present in SQL Query Plan

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19557: Assignee: (was: Apache Spark) > Output parameters are not present in SQL Query Plan >

[jira] [Commented] (SPARK-19208) MultivariateOnlineSummarizer performance optimization

2017-02-16 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870655#comment-15870655 ] Timothy Hunter commented on SPARK-19208: I put together the ideas in this thread into a document.

[jira] [Closed] (SPARK-19632) Allow configuring non-hive and non-local SessionState and ExternalCatalog

2017-02-16 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski closed SPARK-19632. - Resolution: Won't Fix > Allow configuring non-hive and non-local SessionState and

[jira] [Commented] (SPARK-19632) Allow configuring non-hive and non-local SessionState and ExternalCatalog

2017-02-16 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870649#comment-15870649 ] Robert Kruszewski commented on SPARK-19632: --- Thanks, I must have missed that. > Allow

[jira] [Commented] (SPARK-19534) Convert Java tests to use lambdas, Java 8 features

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870638#comment-15870638 ] Apache Spark commented on SPARK-19534: -- User 'srowen' has created a pull request for this issue:

[jira] [Commented] (SPARK-19632) Allow configuring non-hive and non-local SessionState and ExternalCatalog

2017-02-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870630#comment-15870630 ] Dongjoon Hyun commented on SPARK-19632: --- Hi, [~robert3005]. I just added a previous JIRA issue link

[jira] [Created] (SPARK-19633) FileSource read from FileSink

2017-02-16 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-19633: Summary: FileSource read from FileSink Key: SPARK-19633 URL: https://issues.apache.org/jira/browse/SPARK-19633 Project: Spark Issue Type: New

[jira] [Assigned] (SPARK-19632) Allow configuring non-hive and non-local SessionState and ExternalCatalog

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19632: Assignee: Apache Spark > Allow configuring non-hive and non-local SessionState and

[jira] [Assigned] (SPARK-19632) Allow configuring non-hive and non-local SessionState and ExternalCatalog

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19632: Assignee: (was: Apache Spark) > Allow configuring non-hive and non-local SessionState

[jira] [Commented] (SPARK-19632) Allow configuring non-hive and non-local SessionState and ExternalCatalog

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870617#comment-15870617 ] Apache Spark commented on SPARK-19632: -- User 'robert3005' has created a pull request for this issue:

[jira] [Commented] (SPARK-17302) Cannot set non-Spark SQL session variables in hive-site.xml, spark-defaults.conf, or using --conf

2017-02-16 Thread Abhishek Madav (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870616#comment-15870616 ] Abhishek Madav commented on SPARK-17302: I believe this is fixed as part of SPARK-15887. Could

[jira] [Updated] (SPARK-19628) Duplicate Spark jobs in 2.1.0

2017-02-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19628: - Component/s: (was: Spark Core) SQL > Duplicate Spark jobs in 2.1.0 >

[jira] [Created] (SPARK-19632) Allow configuring non-hive and non-local SessionState and ExternalCatalog

2017-02-16 Thread Robert Kruszewski (JIRA)
Robert Kruszewski created SPARK-19632: - Summary: Allow configuring non-hive and non-local SessionState and ExternalCatalog Key: SPARK-19632 URL: https://issues.apache.org/jira/browse/SPARK-19632

[jira] [Commented] (SPARK-18120) QueryExecutionListener method doesnt' get executed for DataFrameWriter methods

2017-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870586#comment-15870586 ] Apache Spark commented on SPARK-18120: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Commented] (SPARK-18891) Support for specific collection types

2017-02-16 Thread Michal Šenkýř (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870570#comment-15870570 ] Michal Šenkýř commented on SPARK-18891: --- Started my work on Map support as there is still no

[jira] [Commented] (SPARK-19614) add type-preserving null function

2017-02-16 Thread Nick Dimiduk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870525#comment-15870525 ] Nick Dimiduk commented on SPARK-19614: -- {{lit(null).cast(type)}} does exactly what I needed. Thanks

[jira] [Closed] (SPARK-19614) add type-preserving null function

2017-02-16 Thread Nick Dimiduk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk closed SPARK-19614. Resolution: Invalid > add type-preserving null function > - > >

[jira] [Updated] (SPARK-19617) Fix the race condition when starting and stopping a query quickly

2017-02-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19617: - Description: The streaming thread in StreamExecution uses the following ways to check if it

[jira] [Updated] (SPARK-19617) Fix the race condition when starting and stopping a query quickly

2017-02-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19617: - Summary: Fix the race condition when starting and stopping a query quickly (was: Fix a case

[jira] [Commented] (SPARK-19337) Documentation and examples for LinearSVC

2017-02-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870514#comment-15870514 ] yuhao yang commented on SPARK-19337: Sure, I'll send a PR today. > Documentation and examples for

[jira] [Commented] (SPARK-19615) Provide Dataset union convenience for divergent schema

2017-02-16 Thread Nick Dimiduk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870513#comment-15870513 ] Nick Dimiduk commented on SPARK-19615: -- IMHO, a union operation should be as generous as possible.

[jira] [Updated] (SPARK-19615) Provide Dataset union convenience for divergent schema

2017-02-16 Thread Nick Dimiduk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated SPARK-19615: - Priority: Major (was: Minor) > Provide Dataset union convenience for divergent schema >
