[jira] [Resolved] (SPARK-18645) spark-daemon.sh arguments error lead to throws Unrecognized option

2016-12-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18645. --- Resolution: Fixed Assignee: Yuming Wang Fix Version/s: 2.1.0 Resolved by

[jira] [Updated] (SPARK-18645) spark-daemon.sh arguments error lead to throws Unrecognized option

2016-12-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18645: -- Fix Version/s: (was: 2.1.0) 2.1.1 > spark-daemon.sh arguments error lead to

[jira] [Commented] (SPARK-17213) Parquet String Pushdown for Non-Eq Comparisons Broken

2016-12-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712038#comment-15712038 ] Hyukjin Kwon commented on SPARK-17213: -- It sounds like we should disable the filters for string and

[jira] [Created] (SPARK-18674) improve the error message of natural join

2016-12-01 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-18674: --- Summary: improve the error message of natural join Key: SPARK-18674 URL: https://issues.apache.org/jira/browse/SPARK-18674 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-18674) improve the error message of natural join

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18674: Assignee: Apache Spark (was: Wenchen Fan) > improve the error message of natural join >

[jira] [Commented] (SPARK-18674) improve the error message of natural join

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712130#comment-15712130 ] Apache Spark commented on SPARK-18674: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Updated] (SPARK-17213) Parquet String Pushdown for Non-Eq Comparisons Broken

2016-12-01 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski updated SPARK-17213: -- Affects Version/s: (was: 2.0.0) 2.1.0 > Parquet String

[jira] [Assigned] (SPARK-18674) improve the error message of natural join

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18674: Assignee: Apache Spark (was: Wenchen Fan) > improve the error message of natural join >

[jira] [Commented] (SPARK-16026) Cost-based Optimizer framework

2016-12-01 Thread Zhenhua Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712012#comment-15712012 ] Zhenhua Wang commented on SPARK-16026: -- [~rxin] Yes, I'll spend more time on CBO starting this Saturday.

[jira] [Assigned] (SPARK-18674) improve the error message of natural join

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18674: Assignee: Wenchen Fan (was: Apache Spark) > improve the error message of natural join >

[jira] [Assigned] (SPARK-18674) improve the error message of natural join

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18674: Assignee: Wenchen Fan (was: Apache Spark) > improve the error message of natural join >

[jira] [Commented] (SPARK-18650) race condition in FileScanRDD.scala

2016-12-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712358#comment-15712358 ] Sean Owen commented on SPARK-18650: --- I guess I'm surprised if the partitions are evaluated by multiple

[jira] [Updated] (SPARK-18620) Spark Streaming + Kinesis : Receiver MaxRate is violated

2016-12-01 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-18620: - Attachment: Apply_limit in_spark_with_my_patch.png > Spark Streaming + Kinesis :

[jira] [Updated] (SPARK-18620) Spark Streaming + Kinesis : Receiver MaxRate is violated

2016-12-01 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-18620: - Attachment: Apply_limit in_vanilla_spark.png > Spark Streaming + Kinesis : Receiver

[jira] [Updated] (SPARK-18498) Clean up HDFSMetadataLog API for better testing

2016-12-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18498: -- Assignee: Tyson Condie > Clean up HDFSMetadataLog API for better testing >

[jira] [Assigned] (SPARK-18675) CTAS for hive serde table should work for all hive versions

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18675: Assignee: Apache Spark (was: Wenchen Fan) > CTAS for hive serde table should work for

[jira] [Updated] (SPARK-18674) improve the error message of using join

2016-12-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-18674: Summary: improve the error message of using join (was: improve the error message of natural join)

[jira] [Created] (SPARK-18675) CTAS for hive serde table should work for all hive versions

2016-12-01 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-18675: --- Summary: CTAS for hive serde table should work for all hive versions Key: SPARK-18675 URL: https://issues.apache.org/jira/browse/SPARK-18675 Project: Spark

[jira] [Commented] (SPARK-18269) NumberFormatException when reading csv for a nullable column

2016-12-01 Thread Jork Zijlstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712431#comment-15712431 ] Jork Zijlstra commented on SPARK-18269: --- Thanks for the quick response. Eagerly awaiting the spark

[jira] [Assigned] (SPARK-18374) Incorrect words in StopWords/english.txt

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18374: Assignee: Apache Spark > Incorrect words in StopWords/english.txt >

[jira] [Commented] (SPARK-18675) CTAS for hive serde table should work for all hive versions

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712559#comment-15712559 ] Apache Spark commented on SPARK-18675: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18675) CTAS for hive serde table should work for all hive versions

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18675: Assignee: Wenchen Fan (was: Apache Spark) > CTAS for hive serde table should work for

[jira] [Assigned] (SPARK-18586) netty-3.8.0.Final.jar has vulnerability CVE-2014-3488 and CVE-2014-0193

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18586: Assignee: (was: Apache Spark) > netty-3.8.0.Final.jar has vulnerability CVE-2014-3488

[jira] [Commented] (SPARK-18586) netty-3.8.0.Final.jar has vulnerability CVE-2014-3488 and CVE-2014-0193

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712341#comment-15712341 ] Apache Spark commented on SPARK-18586: -- User 'srowen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18586) netty-3.8.0.Final.jar has vulnerability CVE-2014-3488 and CVE-2014-0193

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18586: Assignee: Apache Spark > netty-3.8.0.Final.jar has vulnerability CVE-2014-3488 and

[jira] [Updated] (SPARK-18620) Spark Streaming + Kinesis : Receiver MaxRate is violated

2016-12-01 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-18620: - Attachment: Apply_no_limit.png > Spark Streaming + Kinesis : Receiver MaxRate is

[jira] [Commented] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic

2016-12-01 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712457#comment-15712457 ] Cody Koeninger commented on SPARK-18506: Yes, amazon linux. No, not spark-ec2, just a spark

[jira] [Assigned] (SPARK-18475) Be able to provide higher parallelization for StructuredStreaming Kafka Source

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18475: Assignee: Apache Spark > Be able to provide higher parallelization for

[jira] [Assigned] (SPARK-18475) Be able to provide higher parallelization for StructuredStreaming Kafka Source

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18475: Assignee: (was: Apache Spark) > Be able to provide higher parallelization for

[jira] [Commented] (SPARK-18374) Incorrect words in StopWords/english.txt

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712533#comment-15712533 ] Apache Spark commented on SPARK-18374: -- User 'hhbyyh' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18374) Incorrect words in StopWords/english.txt

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18374: Assignee: (was: Apache Spark) > Incorrect words in StopWords/english.txt >

[jira] [Commented] (SPARK-17213) Parquet String Pushdown for Non-Eq Comparisons Broken

2016-12-01 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712707#comment-15712707 ] Cheng Lian commented on SPARK-17213: Agree that we should disable string and binary filter push down

[jira] [Updated] (SPARK-18676) Spark 2.x query plan data size estimation can crash join queries versus 1.x

2016-12-01 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-18676: --- Description: Commit [c481bdf|https://github.com/apache/spark/commit/c481bdf] significantly

[jira] [Commented] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2016-12-01 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712825#comment-15712825 ] Li Jin commented on SPARK-13534: [~bryanc], Allow me to introduce myself. I am Li Jin and I am working

[jira] [Updated] (SPARK-18617) Close "kryo auto pick" feature for Spark Streaming

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18617: Fix Version/s: 2.0.3 > Close "kryo auto pick" feature for Spark Streaming >

[jira] [Commented] (SPARK-18676) Spark 2.x query plan data size estimation can crash join queries versus 1.x

2016-12-01 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712823#comment-15712823 ] Michael Allman commented on SPARK-18676: cc [~davies] as author of

[jira] [Commented] (SPARK-18560) Receiver data can not be dataSerialized properly.

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712837#comment-15712837 ] Apache Spark commented on SPARK-18560: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Commented] (SPARK-18617) Close "kryo auto pick" feature for Spark Streaming

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712835#comment-15712835 ] Apache Spark commented on SPARK-18617: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17213) Parquet String Pushdown for Non-Eq Comparisons Broken

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17213: Assignee: Apache Spark (was: Cheng Lian) > Parquet String Pushdown for Non-Eq

[jira] [Updated] (SPARK-18419) `JDBCRelation.insert` should not remove Spark options

2016-12-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-18419: -- Affects Version/s: 2.0.2 > `JDBCRelation.insert` should not remove Spark options >

[jira] [Updated] (SPARK-18419) `JDBCRelation.insert` should not remove Spark options

2016-12-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-18419: -- Description: Currently, `JDBCRelation.insert` removes Spark options too early by mistakenly

[jira] [Updated] (SPARK-18419) `JDBCRelation.insert` should not remove Spark options

2016-12-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-18419: -- Description: Currently, `JDBCRelation.insert` removes Spark options too early by mistakenly

[jira] [Commented] (SPARK-18419) Fix `JDBCOptions.asConnectionProperties` to be case-insensitive

2016-12-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712770#comment-15712770 ] Sean Owen commented on SPARK-18419: --- [~dongjoon] given your comment on the PR, is this resolved? > Fix

[jira] [Commented] (SPARK-18419) Fix `JDBCOptions.asConnectionProperties` to be case-insensitive

2016-12-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712843#comment-15712843 ] Dongjoon Hyun commented on SPARK-18419: --- Sorry, I thought it's resolved. But, when I tried to

[jira] [Resolved] (SPARK-18674) improve the error message of using join

2016-12-01 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18674. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.3 > improve the

[jira] [Updated] (SPARK-18419) `JDBCRelation.insert` should not remove Spark options

2016-12-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-18419: -- Priority: Major (was: Minor) > `JDBCRelation.insert` should not remove Spark options >

[jira] [Updated] (SPARK-18419) `JDBCRelation.insert` should not remove Spark options

2016-12-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-18419: -- Summary: `JDBCRelation.insert` should not remove Spark options (was: Fix

[jira] [Resolved] (SPARK-9876) Upgrade parquet-mr to 1.8.1

2016-12-01 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-9876. --- Resolution: Fixed Fix Version/s: 2.1.0 > Upgrade parquet-mr to 1.8.1 >

[jira] [Created] (SPARK-18676) Spark 2.x query plan data size estimation can crash join queries versus 1.x

2016-12-01 Thread Michael Allman (JIRA)
Michael Allman created SPARK-18676: -- Summary: Spark 2.x query plan data size estimation can crash join queries versus 1.x Key: SPARK-18676 URL: https://issues.apache.org/jira/browse/SPARK-18676

[jira] [Resolved] (SPARK-18553) Executor loss may cause TaskSetManager to be leaked

2016-12-01 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-18553. Resolution: Fixed Fix Version/s: 1.6.4 > Executor loss may cause TaskSetManager to be

[jira] [Created] (SPARK-18677) Json path implementation fails to parse ['key']

2016-12-01 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-18677: - Summary: Json path implementation fails to parse ['key'] Key: SPARK-18677 URL: https://issues.apache.org/jira/browse/SPARK-18677 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-17213) Parquet String Pushdown for Non-Eq Comparisons Broken

2016-12-01 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian reassigned SPARK-17213: -- Assignee: Cheng Lian > Parquet String Pushdown for Non-Eq Comparisons Broken >

[jira] [Commented] (SPARK-17213) Parquet String Pushdown for Non-Eq Comparisons Broken

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712921#comment-15712921 ] Apache Spark commented on SPARK-17213: -- User 'liancheng' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17213) Parquet String Pushdown for Non-Eq Comparisons Broken

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17213: Assignee: Cheng Lian (was: Apache Spark) > Parquet String Pushdown for Non-Eq

[jira] [Closed] (SPARK-18641) Show databases NullPointerException while Sentry turned on

2016-12-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun closed SPARK-18641. - Resolution: Invalid I close this issue because the reported error message comes from Sentry

[jira] [Updated] (SPARK-18642) Spark SQL: Catalyst is scanning undesired columns

2016-12-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-18642: -- Affects Version/s: 1.6.3 > Spark SQL: Catalyst is scanning undesired columns >

[jira] [Created] (SPARK-18678) Skewed feature subsampling in Random forest

2016-12-01 Thread Bjoern Toldbod (JIRA)
Bjoern Toldbod created SPARK-18678: -- Summary: Skewed feature subsampling in Random forest Key: SPARK-18678 URL: https://issues.apache.org/jira/browse/SPARK-18678 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-18670) Limit the number of StreamingQueryListener.StreamProgressEvent when there is no data

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18670: Assignee: Shixiong Zhu (was: Apache Spark) > Limit the number of

[jira] [Commented] (SPARK-18670) Limit the number of StreamingQueryListener.StreamProgressEvent when there is no data

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713142#comment-15713142 ] Apache Spark commented on SPARK-18670: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18677) Json path implementation fails to parse ['key']

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18677: Assignee: Apache Spark > Json path implementation fails to parse ['key'] >

[jira] [Updated] (SPARK-18274) Memory leak in PySpark StringIndexer

2016-12-01 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18274: -- Target Version/s: 2.0.3, 2.1.0 (was: 2.0.3, 2.1.1, 2.2.0) > Memory leak in PySpark

[jira] [Commented] (SPARK-18588) KafkaSourceStressForDontFailOnDataLossSuite is flaky

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713179#comment-15713179 ] Apache Spark commented on SPARK-18588: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Updated] (SPARK-18274) Memory leak in PySpark StringIndexer

2016-12-01 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18274: -- Assignee: Sandeep Singh > Memory leak in PySpark StringIndexer >

[jira] [Updated] (SPARK-18274) Memory leak in PySpark StringIndexer

2016-12-01 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18274: -- Shepherd: Joseph K. Bradley > Memory leak in PySpark StringIndexer >

[jira] [Resolved] (SPARK-18274) Memory leak in PySpark StringIndexer

2016-12-01 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-18274. --- Resolution: Fixed Fix Version/s: 2.2.0 2.0.3

[jira] [Assigned] (SPARK-18670) Limit the number of StreamingQueryListener.StreamProgressEvent when there is no data

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18670: Assignee: Apache Spark (was: Shixiong Zhu) > Limit the number of

[jira] [Assigned] (SPARK-18677) Json path implementation fails to parse ['key']

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18677: Assignee: (was: Apache Spark) > Json path implementation fails to parse ['key'] >

[jira] [Commented] (SPARK-18677) Json path implementation fails to parse ['key']

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712996#comment-15712996 ] Apache Spark commented on SPARK-18677: -- User 'rdblue' has created a pull request for this issue:

[jira] [Commented] (SPARK-18642) Spark SQL: Catalyst is scanning undesired columns

2016-12-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713081#comment-15713081 ] Dongjoon Hyun commented on SPARK-18642: --- Thank you for reporting, [~mohitgargk]. It seems to be the

[jira] [Commented] (SPARK-18641) Show databases NullPointerException while Sentry turned on

2016-12-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713103#comment-15713103 ] Dongjoon Hyun commented on SPARK-18641: --- Thank you for reporting, [~zhangqw] But, I'm wondering

[jira] [Commented] (SPARK-18618) SparkR model predict should support type as a argument

2016-12-01 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713358#comment-15713358 ] Joseph K. Bradley commented on SPARK-18618: --- [~yanboliang] Shall we get this into 2.1 as a fix

[jira] [Created] (SPARK-18679) Regression in file listing performance

2016-12-01 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18679: -- Summary: Regression in file listing performance Key: SPARK-18679 URL: https://issues.apache.org/jira/browse/SPARK-18679 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-18679) Regression in file listing performance

2016-12-01 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18679: --- Affects Version/s: 2.1.0 > Regression in file listing performance >

[jira] [Updated] (SPARK-18679) Regression in file listing performance

2016-12-01 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18679: --- Component/s: SQL > Regression in file listing performance > -- >

[jira] [Issue Comment Deleted] (SPARK-18476) SparkR Logistic Regression should should support output original label.

2016-12-01 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18476: -- Comment: was deleted (was: [~wangmiao1981] This changes the output schema and is an

[jira] [Commented] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2016-12-01 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713340#comment-15713340 ] Bryan Cutler commented on SPARK-13534: -- Hi [~icexelloss], that sounds great! We could definitely

[jira] [Commented] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic

2016-12-01 Thread Heji Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713476#comment-15713476 ] Heji Kim commented on SPARK-18506: -- Breaking news: I finally found the source of the problem. Our

[jira] [Updated] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic when kafka-clients 0.10.0.1 is used

2016-12-01 Thread Heji Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heji Kim updated SPARK-18506: - Summary: kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on

[jira] [Commented] (SPARK-18476) SparkR Logistic Regression should should support output original label.

2016-12-01 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713515#comment-15713515 ] Miao Wang commented on SPARK-18476: --- spark.logit predict should output original label instead of a

[jira] [Updated] (SPARK-18291) SparkR glm predict should output original label when family = "binomial"

2016-12-01 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18291: -- Attachment: SparkR2.1decisionoutputschemaforGLMs.pdf I'm adding a little summary of

[jira] [Commented] (SPARK-18674) improve the error message of using join

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713321#comment-15713321 ] Apache Spark commented on SPARK-18674: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Commented] (SPARK-18131) Support returning Vector/Dense Vector from backend

2016-12-01 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713430#comment-15713430 ] Miao Wang commented on SPARK-18131: --- I can try to follow this discussion for an initial PR. > Support

[jira] [Commented] (SPARK-18476) SparkR Logistic Regression should should support output original label.

2016-12-01 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713296#comment-15713296 ] Joseph K. Bradley commented on SPARK-18476: --- [~wangmiao1981] This changes the output schema and

[jira] [Commented] (SPARK-18291) SparkR glm predict should output original label when family = "binomial"

2016-12-01 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713341#comment-15713341 ] Joseph K. Bradley commented on SPARK-18291: --- I just saw the comment at the end of the PR and

[jira] [Commented] (SPARK-18538) Concurrent Fetching DataFrameReader JDBC APIs Do Not Work

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713459#comment-15713459 ] Apache Spark commented on SPARK-18538: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Resolved] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic when kafka-clients 0.10.0.1 is used

2016-12-01 Thread Heji Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heji Kim resolved SPARK-18506. -- Resolution: Not A Problem Just another library incompatibility issue. We just downgraded the

[jira] [Updated] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic when kafka-clients 0.10.1.0 is used

2016-12-01 Thread Heji Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heji Kim updated SPARK-18506: - Summary: kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on

[jira] [Commented] (SPARK-18620) Spark Streaming + Kinesis : Receiver MaxRate is violated

2016-12-01 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712392#comment-15712392 ] Takeshi Yamamuro commented on SPARK-18620: -- I tried to fix this issue:

[jira] [Updated] (SPARK-18665) Spark ThriftServer jobs where are canceled are still “STARTED”

2016-12-01 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cen yuhai updated SPARK-18665: -- Description: I find that, some jobs are canceled, but the state are still "STARTED", I think this bug

[jira] [Resolved] (SPARK-18141) jdbc datasource read fails when quoted columns (eg:mixed case, reserved words) in source table are used in the filter.

2016-12-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-18141. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15662

[jira] [Commented] (SPARK-18681) Throw Filtering is supported only on partition keys of type string exception

2016-12-01 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713871#comment-15713871 ] Liang-Chi Hsieh commented on SPARK-18681: - Looks like you create two Jiras (SPARK-18680,

[jira] [Commented] (SPARK-18620) Spark Streaming + Kinesis : Receiver MaxRate is violated

2016-12-01 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713894#comment-15713894 ] Takeshi Yamamuro commented on SPARK-18620: -- yea, I'll make a pr in a day > Spark Streaming +

[jira] [Assigned] (SPARK-18679) Regression in file listing performance

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18679: Assignee: Apache Spark > Regression in file listing performance >

[jira] [Assigned] (SPARK-18679) Regression in file listing performance

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18679: Assignee: (was: Apache Spark) > Regression in file listing performance >

[jira] [Commented] (SPARK-18679) Regression in file listing performance

2016-12-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713826#comment-15713826 ] Apache Spark commented on SPARK-18679: -- User 'ericl' has created a pull request for this issue:

[jira] [Updated] (SPARK-18665) Spark ThriftServer jobs where are canceled are still “STARTED”

2016-12-01 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cen yuhai updated SPARK-18665: -- Description: I find that, some jobs are canceled, but the state are still "STARTED", I think this bug

[jira] [Updated] (SPARK-18665) Spark ThriftServer jobs where are canceled are still “STARTED”

2016-12-01 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cen yuhai updated SPARK-18665: -- Description: I find that, some jobs are canceled, but the state are still "STARTED", I think this bug

[jira] [Created] (SPARK-18684) Spark Executors off-heap memory usage keeps increasing while running spark streaming

2016-12-01 Thread Krishna Gandra (JIRA)
Krishna Gandra created SPARK-18684: -- Summary: Spark Executors off-heap memory usage keeps increasing while running spark streaming Key: SPARK-18684 URL: https://issues.apache.org/jira/browse/SPARK-18684

[jira] [Comment Edited] (SPARK-18684) Spark Executors off-heap memory usage keeps increasing while running spark streaming

2016-12-01 Thread Krishna Gandra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713855#comment-15713855 ] Krishna Gandra edited comment on SPARK-18684 at 12/2/16 3:02 AM: -

[jira] [Commented] (SPARK-18684) Spark Executors off-heap memory usage keeps increasing while running spark streaming

2016-12-01 Thread Krishna Gandra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713855#comment-15713855 ] Krishna Gandra commented on SPARK-18684: Executor off-heap size is keep increasing and eventually
