[jira] [Assigned] (SPARK-17534) Increase timeouts for DirectKafkaStreamSuite tests

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17534: Assignee: Apache Spark > Increase timeouts for DirectKafkaStreamSuite tests >

[jira] [Commented] (SPARK-17534) Increase timeouts for DirectKafkaStreamSuite tests

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490416#comment-15490416 ] Apache Spark commented on SPARK-17534: -- User 'a-roberts' has created a pull request for this issue:

[jira] [Commented] (SPARK-17480) CompressibleColumnBuilder inefficiently call gatherCompressibilityStats

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490397#comment-15490397 ] Apache Spark commented on SPARK-17480: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Created] (SPARK-17534) Increase timeouts for DirectKafkaStreamSuite tests

2016-09-14 Thread Adam Roberts (JIRA)
Adam Roberts created SPARK-17534: Summary: Increase timeouts for DirectKafkaStreamSuite tests Key: SPARK-17534 URL: https://issues.apache.org/jira/browse/SPARK-17534 Project: Spark Issue

[jira] [Assigned] (SPARK-17534) Increase timeouts for DirectKafkaStreamSuite tests

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17534: Assignee: (was: Apache Spark) > Increase timeouts for DirectKafkaStreamSuite tests >

[jira] [Commented] (SPARK-17496) missing int to float coercion in df.sample() signature

2016-09-14 Thread Max Moroz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490528#comment-15490528 ] Max Moroz commented on SPARK-17496: --- Agreed, I thought 1 means randomly permute the DataFrame like in

[jira] [Updated] (SPARK-17537) Improve performance for reading parquet schema

2016-09-14 Thread Yang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang updated SPARK-17537: -- Description: spark.read.parquet would issue a spark jobs to read parquet schema. When

[jira] [Commented] (SPARK-17510) Set Streaming MaxRate Independently For Multiple Streams

2016-09-14 Thread Jeff Nadler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490731#comment-15490731 ] Jeff Nadler commented on SPARK-17510: - I consider this to be a separate issue from the backpressure

[jira] [Created] (SPARK-17535) Performance Improvement of Signleton pattern in SparkContext

2016-09-14 Thread WangJianfei (JIRA)
WangJianfei created SPARK-17535: --- Summary: Performance Improvement of Signleton pattern in SparkContext Key: SPARK-17535 URL: https://issues.apache.org/jira/browse/SPARK-17535 Project: Spark

[jira] [Commented] (SPARK-17535) Performance Improvement of Signleton pattern in SparkContext

2016-09-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490610#comment-15490610 ] Sean Owen commented on SPARK-17535: --- This is called "double checked locking" and it doesn't actually

[jira] [Created] (SPARK-17536) Minor performance improvement to JDBC batch inserts

2016-09-14 Thread John Muller (JIRA)
John Muller created SPARK-17536: --- Summary: Minor performance improvement to JDBC batch inserts Key: SPARK-17536 URL: https://issues.apache.org/jira/browse/SPARK-17536 Project: Spark Issue

[jira] [Commented] (SPARK-15917) Define the number of executors in standalone mode with an easy-to-use property

2016-09-14 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490663#comment-15490663 ] Andrew Or commented on SPARK-15917: --- By the way is there a pull request? > Define the number of

[jira] [Commented] (SPARK-17535) Performance Improvement of Signleton pattern in SparkContext

2016-09-14 Thread WangJianfei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490711#comment-15490711 ] WangJianfei commented on SPARK-17535: - Oh sorry, I should use volatile as a class field, like this

[jira] [Created] (SPARK-17540) SparkR array serde cannot work correctly when array length == 0

2016-09-14 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-17540: -- Summary: SparkR array serde cannot work correctly when array length == 0 Key: SPARK-17540 URL: https://issues.apache.org/jira/browse/SPARK-17540 Project: Spark

[jira] [Assigned] (SPARK-17540) SparkR array serde cannot work correctly when array length == 0

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17540: Assignee: Apache Spark > SparkR array serde cannot work correctly when array length == 0

[jira] [Closed] (SPARK-17535) Performance Improvement of Signleton pattern in SparkContext

2016-09-14 Thread WangJianfei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangJianfei closed SPARK-17535. --- Resolution: Duplicate > Performance Improvement of Signleton pattern in SparkContext >

[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2016-09-14 Thread Abel Rincón (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490657#comment-15490657 ] Abel Rincón commented on SPARK-16742: - We at Stratio are working on this issue, Stratio design doc:

[jira] [Assigned] (SPARK-17537) Improve performance for reading parquet schema

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17537: Assignee: (was: Apache Spark) > Improve performance for reading parquet schema >

[jira] [Commented] (SPARK-17537) Improve performance for reading parquet schema

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490684#comment-15490684 ] Apache Spark commented on SPARK-17537: -- User 'yangw1234' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17537) Improve performance for reading parquet schema

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17537: Assignee: Apache Spark > Improve performance for reading parquet schema >

[jira] [Reopened] (SPARK-17535) Performance Improvement of Signleton pattern in SparkContext

2016-09-14 Thread WangJianfei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangJianfei reopened SPARK-17535: - > Performance Improvement of Signleton pattern in SparkContext >

[jira] [Created] (SPARK-17538) sqlContext.registerDataFrameAsTable is not working sometimes in spark 2.0

2016-09-14 Thread Srinivas Rishindra Pothireddi (JIRA)
Srinivas Rishindra Pothireddi created SPARK-17538: - Summary: sqlContext.registerDataFrameAsTable is not working sometimes in spark 2.0 Key: SPARK-17538 URL:

[jira] [Created] (SPARK-17539) Streaming Backpressure Starves DirectStream When Used In Combination With Receivers

2016-09-14 Thread Jeff Nadler (JIRA)
Jeff Nadler created SPARK-17539: --- Summary: Streaming Backpressure Starves DirectStream When Used In Combination With Receivers Key: SPARK-17539 URL: https://issues.apache.org/jira/browse/SPARK-17539

[jira] [Assigned] (SPARK-17540) SparkR array serde cannot work correctly when array length == 0

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17540: Assignee: (was: Apache Spark) > SparkR array serde cannot work correctly when array

[jira] [Commented] (SPARK-17540) SparkR array serde cannot work correctly when array length == 0

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490790#comment-15490790 ] Apache Spark commented on SPARK-17540: -- User 'WeichenXu123' has created a pull request for this

[jira] [Updated] (SPARK-17535) Performance Improvement of Signleton pattern in SparkContext

2016-09-14 Thread WangJianfei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangJianfei updated SPARK-17535: Description: I think the singleton pattern of SparkContext is inefficient if there are many

[jira] [Resolved] (SPARK-17409) Query in CTAS is Optimized Twice

2016-09-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17409. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15048

[jira] [Updated] (SPARK-17537) Improve performance for reading parquet schema

2016-09-14 Thread Yang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang updated SPARK-17537: -- Description: spark.read.parquet would issue a spark job to read parquet schema. When

[jira] [Updated] (SPARK-17537) Improve performance for reading parquet schema

2016-09-14 Thread Yang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang updated SPARK-17537: -- Description: spark.read.parquet would issue a spark jobs to read parquet schema. When

[jira] [Updated] (SPARK-17538) sqlContext.registerDataFrameAsTable is not working sometimes in pyspark 2.0.0

2016-09-14 Thread Srinivas Rishindra Pothireddi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srinivas Rishindra Pothireddi updated SPARK-17538: -- Summary: sqlContext.registerDataFrameAsTable is not working

[jira] [Updated] (SPARK-17538) sqlContext.registerDataFrameAsTable is not working sometimes in pyspark 2.0.0

2016-09-14 Thread Srinivas Rishindra Pothireddi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srinivas Rishindra Pothireddi updated SPARK-17538: -- Description: I have a production job in spark 1.6.2 that

[jira] [Commented] (SPARK-17535) Performance Improvement of Signleton pattern in SparkContext

2016-09-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490737#comment-15490737 ] Sean Owen commented on SPARK-17535: --- I think this happens to work out because the context is held in an

[jira] [Created] (SPARK-17537) Improve performance for reading parquet schema

2016-09-14 Thread Yang Wang (JIRA)
Yang Wang created SPARK-17537: - Summary: Improve performance for reading parquet schema Key: SPARK-17537 URL: https://issues.apache.org/jira/browse/SPARK-17537 Project: Spark Issue Type:

[jira] [Commented] (SPARK-17510) Set Streaming MaxRate Independently For Multiple Streams

2016-09-14 Thread Jeff Nadler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490746#comment-15490746 ] Jeff Nadler commented on SPARK-17510: - I filed SPARK-17539 for the backpressure bug. > Set

[jira] [Commented] (SPARK-15835) The read path of json doesn't support write path when schema contains Options

2016-09-14 Thread Chris Horn (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490836#comment-15490836 ] Chris Horn commented on SPARK-15835: You can work around this issue by providing the schema

[jira] [Commented] (SPARK-17541) fix some DDL bugs about table management when same-name temp view exists

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490922#comment-15490922 ] Apache Spark commented on SPARK-17541: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Resolved] (SPARK-17514) df.take(1) and df.limit(1).collect() perform differently in Python

2016-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17514. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Updated] (SPARK-17542) Compiler warning in UnsafeInMemorySorter class

2016-09-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17542: -- Issue Type: Improvement (was: Bug) There are unfortunately a number of warnings, and I don't think we

[jira] [Commented] (SPARK-17536) Minor performance improvement to JDBC batch inserts

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490849#comment-15490849 ] Apache Spark commented on SPARK-17536: -- User 'blue666man' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17536) Minor performance improvement to JDBC batch inserts

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17536: Assignee: (was: Apache Spark) > Minor performance improvement to JDBC batch inserts >

[jira] [Assigned] (SPARK-17541) fix some DDL bugs about table management when same-name temp view exists

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17541: Assignee: Wenchen Fan (was: Apache Spark) > fix some DDL bugs about table management

[jira] [Assigned] (SPARK-17541) fix some DDL bugs about table management when same-name temp view exists

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17541: Assignee: Apache Spark (was: Wenchen Fan) > fix some DDL bugs about table management

[jira] [Commented] (SPARK-17317) Add package vignette to SparkR

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490995#comment-15490995 ] Apache Spark commented on SPARK-17317: -- User 'junyangq' has created a pull request for this issue:

[jira] [Created] (SPARK-17542) Compiler warning in UnsafeInMemorySorter class

2016-09-14 Thread Frederick Reiss (JIRA)
Frederick Reiss created SPARK-17542: --- Summary: Compiler warning in UnsafeInMemorySorter class Key: SPARK-17542 URL: https://issues.apache.org/jira/browse/SPARK-17542 Project: Spark Issue

[jira] [Commented] (SPARK-17510) Set Streaming MaxRate Independently For Multiple Streams

2016-09-14 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491087#comment-15491087 ] Cody Koeninger commented on SPARK-17510: I use direct stream for multiple topic jobs where the

[jira] [Comment Edited] (SPARK-17510) Set Streaming MaxRate Independently For Multiple Streams

2016-09-14 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491087#comment-15491087 ] Cody Koeninger edited comment on SPARK-17510 at 9/14/16 6:12 PM: - I use

[jira] [Assigned] (SPARK-17536) Minor performance improvement to JDBC batch inserts

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17536: Assignee: Apache Spark > Minor performance improvement to JDBC batch inserts >

[jira] [Created] (SPARK-17543) Missing log4j config file for tests in common/network-shuffle

2016-09-14 Thread Frederick Reiss (JIRA)
Frederick Reiss created SPARK-17543: --- Summary: Missing log4j config file for tests in common/network-shuffle Key: SPARK-17543 URL: https://issues.apache.org/jira/browse/SPARK-17543 Project: Spark

[jira] [Created] (SPARK-17541) fix some DDL bugs about table management when same-name temp view exists

2016-09-14 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-17541: --- Summary: fix some DDL bugs about table management when same-name temp view exists Key: SPARK-17541 URL: https://issues.apache.org/jira/browse/SPARK-17541 Project:

[jira] [Commented] (SPARK-17510) Set Streaming MaxRate Independently For Multiple Streams

2016-09-14 Thread Jeff Nadler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491165#comment-15491165 ] Jeff Nadler commented on SPARK-17510: - Yes you're right - it's partly about differing rates but the

[jira] [Created] (SPARK-17544) Timeout waiting for connection from pool, DataFrame Reader's not closing S3 connections?

2016-09-14 Thread Brady Auen (JIRA)
Brady Auen created SPARK-17544: -- Summary: Timeout waiting for connection from pool, DataFrame Reader's not closing S3 connections? Key: SPARK-17544 URL: https://issues.apache.org/jira/browse/SPARK-17544

[jira] [Resolved] (SPARK-17511) Dynamic allocation race condition: Containers getting marked failed while releasing

2016-09-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-17511. --- Resolution: Fixed Assignee: Kishor Patil Fix Version/s: 2.1.0

[jira] [Commented] (SPARK-16534) Kafka 0.10 Python support

2016-09-14 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491107#comment-15491107 ] Maciej Bryński commented on SPARK-16534: [~rxin] Could you explain your decision? I think that

[jira] [Commented] (SPARK-17508) Setting weightCol to None in ML library causes an error

2016-09-14 Thread Evan Zamir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491186#comment-15491186 ] Evan Zamir commented on SPARK-17508: Honestly, if the documentation was just more explicit, users

[jira] [Commented] (SPARK-16424) Add support for Structured Streaming to the ML Pipeline API

2016-09-14 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491250#comment-15491250 ] holdenk commented on SPARK-16424: - Just an update - we have a really early proof of concept branch

[jira] [Commented] (SPARK-17508) Setting weightCol to None in ML library causes an error

2016-09-14 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491163#comment-15491163 ] Bryan Cutler commented on SPARK-17508: -- I had a similar discussion in this PR

[jira] [Resolved] (SPARK-10747) add support for NULLS FIRST|LAST in ORDER BY clause

2016-09-14 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-10747. --- Resolution: Fixed Assignee: Xin Wu Fix Version/s: 2.1.0 > add

[jira] [Comment Edited] (SPARK-17508) Setting weightCol to None in ML library causes an error

2016-09-14 Thread Evan Zamir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491186#comment-15491186 ] Evan Zamir edited comment on SPARK-17508 at 9/14/16 6:52 PM: - Honestly, if

[jira] [Comment Edited] (SPARK-17508) Setting weightCol to None in ML library causes an error

2016-09-14 Thread Evan Zamir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491186#comment-15491186 ] Evan Zamir edited comment on SPARK-17508 at 9/14/16 6:53 PM: - Honestly, if

[jira] [Commented] (SPARK-17510) Set Streaming MaxRate Independently For Multiple Streams

2016-09-14 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491389#comment-15491389 ] Cody Koeninger commented on SPARK-17510: Just for clarity's sake, compute time is far higher on

[jira] [Updated] (SPARK-17465) Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak

2016-09-14 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-17465: --- Assignee: Xing Shi > Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may

[jira] [Updated] (SPARK-17465) Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak

2016-09-14 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-17465: --- Fix Version/s: 2.1.0 2.0.1 > Inappropriate memory management in

[jira] [Commented] (SPARK-17510) Set Streaming MaxRate Independently For Multiple Streams

2016-09-14 Thread Jeff Nadler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491460#comment-15491460 ] Jeff Nadler commented on SPARK-17510: - That would be incredible, thank you very much for looking into

[jira] [Commented] (SPARK-17346) Kafka 0.10 support in Structured Streaming

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491487#comment-15491487 ] Apache Spark commented on SPARK-17346: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Created] (SPARK-17546) start-* scripts should use hostname --fqdn

2016-09-14 Thread Kevin Burton (JIRA)
Kevin Burton created SPARK-17546: Summary: start-* scripts should use hostname --fqdn Key: SPARK-17546 URL: https://issues.apache.org/jira/browse/SPARK-17546 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-17547) Temporary shuffle data files may be leaked following exception in write

2016-09-14 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-17547: -- Summary: Temporary shuffle data files may be leaked following exception in write Key: SPARK-17547 URL: https://issues.apache.org/jira/browse/SPARK-17547 Project: Spark

[jira] [Commented] (SPARK-17510) Set Streaming MaxRate Independently For Multiple Streams

2016-09-14 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491455#comment-15491455 ] Cody Koeninger commented on SPARK-17510: Ok, next time I get some free hacking time I can make a

[jira] [Assigned] (SPARK-17100) pyspark filter on a udf column after join gives java.lang.UnsupportedOperationException

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17100: Assignee: Davies Liu (was: Apache Spark) > pyspark filter on a udf column after join

[jira] [Assigned] (SPARK-17100) pyspark filter on a udf column after join gives java.lang.UnsupportedOperationException

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17100: Assignee: Apache Spark (was: Davies Liu) > pyspark filter on a udf column after join

[jira] [Commented] (SPARK-17100) pyspark filter on a udf column after join gives java.lang.UnsupportedOperationException

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491518#comment-15491518 ] Apache Spark commented on SPARK-17100: -- User 'davies' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17547) Temporary shuffle data files may be leaked following exception in write

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17547: Assignee: Josh Rosen (was: Apache Spark) > Temporary shuffle data files may be leaked

[jira] [Resolved] (SPARK-17472) Better error message for serialization failures of large objects in Python

2016-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17472. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15026

[jira] [Created] (SPARK-17545) Spark SQL Catalyst doesn't handle ISO 8601 date with colon in offset

2016-09-14 Thread Nathan Beyer (JIRA)
Nathan Beyer created SPARK-17545: Summary: Spark SQL Catalyst doesn't handle ISO 8601 date with colon in offset Key: SPARK-17545 URL: https://issues.apache.org/jira/browse/SPARK-17545 Project: Spark

[jira] [Commented] (SPARK-17510) Set Streaming MaxRate Independently For Multiple Streams

2016-09-14 Thread Jeff Nadler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491412#comment-15491412 ] Jeff Nadler commented on SPARK-17510: - Well... both streams use updateStateByKey. The session one

[jira] [Resolved] (SPARK-17465) Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak

2016-09-14 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-17465. Resolution: Fixed Fix Version/s: 1.6.3 Issue resolved by pull request 15022

[jira] [Commented] (SPARK-17114) Adding a 'GROUP BY 1' where first column is literal results in wrong answer

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491437#comment-15491437 ] Apache Spark commented on SPARK-17114: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Commented] (SPARK-17544) Timeout waiting for connection from pool, DataFrame Reader's not closing S3 connections?

2016-09-14 Thread Brady Auen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491445#comment-15491445 ] Brady Auen commented on SPARK-17544:

[jira] [Assigned] (SPARK-17346) Kafka 0.10 support in Structured Streaming

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17346: Assignee: (was: Apache Spark) > Kafka 0.10 support in Structured Streaming >

[jira] [Assigned] (SPARK-17547) Temporary shuffle data files may be leaked following exception in write

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17547: Assignee: Apache Spark (was: Josh Rosen) > Temporary shuffle data files may be leaked

[jira] [Commented] (SPARK-17547) Temporary shuffle data files may be leaked following exception in write

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491548#comment-15491548 ] Apache Spark commented on SPARK-17547: -- User 'JoshRosen' has created a pull request for this issue:

[jira] [Closed] (SPARK-17533) I think it's necessary to have an overrided method of union in sparkContext

2016-09-14 Thread WangJianfei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangJianfei closed SPARK-17533. --- Resolution: Won't Fix > I think it's necessary to have an overrided method of union in sparkContext

[jira] [Commented] (SPARK-17508) Setting weightCol to None in ML library causes an error

2016-09-14 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491841#comment-15491841 ] Bryan Cutler commented on SPARK-17508: -- To respond to [~srowen] question, I think it's reasonable to

[jira] [Commented] (SPARK-17544) Timeout waiting for connection from pool, DataFrame Reader's not closing S3 connections?

2016-09-14 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491938#comment-15491938 ] Josh Rosen commented on SPARK-17544: I found a similar issue from the {{spark-avro}} repository:

[jira] [Commented] (SPARK-15573) Backwards-compatible persistence for spark.ml

2016-09-14 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491974#comment-15491974 ] yuhao yang commented on SPARK-15573: This sounds feasible. Two primary work items as I see: 1. Find a

[jira] [Commented] (SPARK-16407) Allow users to supply custom StreamSinkProviders

2016-09-14 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491739#comment-15491739 ] Shixiong Zhu commented on SPARK-16407: -- Right now we don't want to add such typed API to

[jira] [Commented] (SPARK-16439) Incorrect information in SQL Query details

2016-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491744#comment-15491744 ] Davies Liu commented on SPARK-16439: The separator was added on purpose, otherwise it's very

[jira] [Commented] (SPARK-16407) Allow users to supply custom StreamSinkProviders

2016-09-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491752#comment-15491752 ] Reynold Xin commented on SPARK-16407: - This doesn't work in SQL, Python, etc. I like the general

[jira] [Updated] (SPARK-17549) InMemoryRelation doesn't scale to large tables

2016-09-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-17549: --- Attachment: spark-1.6.patch example_1.6_pre_patch.png

[jira] [Reopened] (SPARK-16439) Incorrect information in SQL Query details

2016-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reopened SPARK-16439: We could bring the separator back for better readability. > Incorrect information in SQL Query

[jira] [Created] (SPARK-17549) InMemoryRelation doesn't scale to large tables

2016-09-14 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-17549: -- Summary: InMemoryRelation doesn't scale to large tables Key: SPARK-17549 URL: https://issues.apache.org/jira/browse/SPARK-17549 Project: Spark Issue

[jira] [Assigned] (SPARK-16439) Incorrect information in SQL Query details

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16439: Assignee: Maciej Bryński (was: Apache Spark) > Incorrect information in SQL Query

[jira] [Commented] (SPARK-16439) Incorrect information in SQL Query details

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491766#comment-15491766 ] Apache Spark commented on SPARK-16439: -- User 'davies' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16439) Incorrect information in SQL Query details

2016-09-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16439: Assignee: Apache Spark (was: Maciej Bryński) > Incorrect information in SQL Query

[jira] [Commented] (SPARK-16407) Allow users to supply custom StreamSinkProviders

2016-09-14 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491772#comment-15491772 ] holdenk commented on SPARK-16407: - Right the simplest example where you need to use the typed API is with

[jira] [Commented] (SPARK-16407) Allow users to supply custom StreamSinkProviders

2016-09-14 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491784#comment-15491784 ] holdenk commented on SPARK-16407: - It's true it doesn't work in SQL - but I don't think the current

[jira] [Commented] (SPARK-16407) Allow users to supply custom StreamSinkProviders

2016-09-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491779#comment-15491779 ] Reynold Xin commented on SPARK-16407: - Actually I spoke too soon. I only read the code change and

[jira] [Commented] (SPARK-16407) Allow users to supply custom StreamSinkProviders

2016-09-14 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491787#comment-15491787 ] holdenk commented on SPARK-16407: - That's part of why I decided to just use the ForeachRDD sink as

[jira] [Updated] (SPARK-17549) InMemoryRelation doesn't scale to large tables

2016-09-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-17549: --- Attachment: spark-1.6-2.patch Just noticed there's already a more accurate count of the

[jira] [Created] (SPARK-17550) DataFrameWriter.partitionBy() should throw exception if column is not present in Dataframe

2016-09-14 Thread Aniket Kulkarni (JIRA)
Aniket Kulkarni created SPARK-17550: --- Summary: DataFrameWriter.partitionBy() should throw exception if column is not present in Dataframe Key: SPARK-17550 URL: https://issues.apache.org/jira/browse/SPARK-17550
