[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-24 Thread Guozhang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982377#comment-15982377 ] Guozhang Wang commented on SPARK-18057: --- Just adding the related KIP for the recently added client

[jira] [Resolved] (SPARK-20451) Filter out nested mapType datatypes from sort order in randomSplit

2017-04-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20451. - Resolution: Fixed Assignee: Sameer Agarwal Fix Version/s: 2.3.0

[jira] [Resolved] (SPARK-20453) Bump master branch version to 2.3.0-SNAPSHOT

2017-04-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-20453. - Resolution: Fixed Fix Version/s: 2.3.0 > Bump master branch version to 2.3.0-SNAPSHOT >

[jira] [Commented] (SPARK-20239) Improve HistoryServer ACL mechanism

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982299#comment-15982299 ] Apache Spark commented on SPARK-20239: -- User 'jerryshao' has created a pull request for this issue:

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982251#comment-15982251 ] Peng Meng commented on SPARK-20446: --- Yes, I compared with ML ALSModel.recommendAll. The data size is

[jira] [Resolved] (SPARK-20239) Improve HistoryServer ACL mechanism

2017-04-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-20239. Resolution: Fixed Assignee: Saisai Shao Fix Version/s: 2.2.0 > Improve

[jira] [Commented] (SPARK-20336) spark.read.csv() with wholeFile=True option fails to read non ASCII unicode characters

2017-04-24 Thread HanCheol Cho (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982218#comment-15982218 ] HanCheol Cho commented on SPARK-20336: -- Hi, [~original-brownbear] Thank you for your help. And can

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982180#comment-15982180 ] Shixiong Zhu commented on SPARK-18057: -- [~ijuma] it's not a regression. In Kafka 0.10.0.1, deleting

[jira] [Comment Edited] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982180#comment-15982180 ] Shixiong Zhu edited comment on SPARK-18057 at 4/25/17 12:34 AM: [~ijuma]

[jira] [Updated] (SPARK-20454) Improvement of ShortestPaths in Spark GraphX

2017-04-24 Thread Ji Dai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ji Dai updated SPARK-20454: --- Summary: Improvement of ShortestPaths in Spark GraphX (was: improvement of ShortestPaths in Spark GraphX)

[jira] [Updated] (SPARK-20454) Improvement of ShortestPaths in Spark GraphX

2017-04-24 Thread Ji Dai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ji Dai updated SPARK-20454: --- Description: The output of ShortestPaths is not enough. ShortestPaths in Graph/lib is currently in a simple

[jira] [Commented] (SPARK-18901) Require in LR LogisticAggregator is redundant

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982160#comment-15982160 ] Apache Spark commented on SPARK-18901: -- User 'wangmiao1981' has created a pull request for this

[jira] [Updated] (SPARK-20454) improvement of ShortestPaths in Spark GraphX

2017-04-24 Thread Ji Dai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ji Dai updated SPARK-20454: --- Summary: improvement of ShortestPaths in Spark GraphX (was: Concern about improvement of ShortestPaths in

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-24 Thread Ismael Juma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982153#comment-15982153 ] Ismael Juma commented on SPARK-18057: - [~helena], about KAFKA-4879, are you suggesting that it's a

[jira] [Assigned] (SPARK-9103) Tracking spark's memory usage

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-9103: --- Assignee: (was: Apache Spark) > Tracking spark's memory usage >

[jira] [Assigned] (SPARK-9103) Tracking spark's memory usage

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-9103: --- Assignee: Apache Spark > Tracking spark's memory usage > - > >

[jira] [Reopened] (SPARK-9103) Tracking spark's memory usage

2017-04-24 Thread Jose Soltren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jose Soltren reopened SPARK-9103: - Work in progress. > Tracking spark's memory usage > - > >

[jira] [Updated] (SPARK-20454) Concern about improvement of ShortestPaths in Spark GraphX

2017-04-24 Thread Ji Dai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ji Dai updated SPARK-20454: --- Description: The output of ShortestPaths is not enough. ShortestPaths in Graph/lib is currently in a simple

[jira] [Created] (SPARK-20454) Concern about improvement of ShortestPaths in Spark GraphX

2017-04-24 Thread Ji Dai (JIRA)
Ji Dai created SPARK-20454: -- Summary: Concern about improvement of ShortestPaths in Spark GraphX Key: SPARK-20454 URL: https://issues.apache.org/jira/browse/SPARK-20454 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-20450) Unexpected first-query schema inference cost with 2.1.1 RC

2017-04-24 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-20450. --- Resolution: Fixed Assignee: Eric Liang Fix Version/s: 2.1.1 >

[jira] [Comment Edited] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-24 Thread Ismael Juma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982077#comment-15982077 ] Ismael Juma edited comment on SPARK-18057 at 4/24/17 11:07 PM: --- Hi. A few

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-24 Thread Ismael Juma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982077#comment-15982077 ] Ismael Juma commented on SPARK-18057: - Hi. A few clarifications below. "Based on previous kafka

[jira] [Assigned] (SPARK-20453) Bump master branch version to 2.3.0-SNAPSHOT

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20453: Assignee: Apache Spark (was: Josh Rosen) > Bump master branch version to 2.3.0-SNAPSHOT

[jira] [Assigned] (SPARK-20453) Bump master branch version to 2.3.0-SNAPSHOT

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20453: Assignee: Josh Rosen (was: Apache Spark) > Bump master branch version to 2.3.0-SNAPSHOT

[jira] [Commented] (SPARK-20453) Bump master branch version to 2.3.0-SNAPSHOT

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982071#comment-15982071 ] Apache Spark commented on SPARK-20453: -- User 'JoshRosen' has created a pull request for this issue:

[jira] [Created] (SPARK-20453) Bump master branch version to 2.3.0-SNAPSHOT

2017-04-24 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-20453: -- Summary: Bump master branch version to 2.3.0-SNAPSHOT Key: SPARK-20453 URL: https://issues.apache.org/jira/browse/SPARK-20453 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-20452) Cancel a batch Kafka query and rerun the same DataFrame may cause ConcurrentModificationException

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20452: Assignee: Shixiong Zhu (was: Apache Spark) > Cancel a batch Kafka query and rerun the

[jira] [Commented] (SPARK-20452) Cancel a batch Kafka query and rerun the same DataFrame may cause ConcurrentModificationException

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982033#comment-15982033 ] Apache Spark commented on SPARK-20452: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20452) Cancel a batch Kafka query and rerun the same DataFrame may cause ConcurrentModificationException

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20452: Assignee: Apache Spark (was: Shixiong Zhu) > Cancel a batch Kafka query and rerun the

[jira] [Created] (SPARK-20452) Cancel a batch Kafka query and rerun the same DataFrame may cause ConcurrentModificationException

2017-04-24 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-20452: Summary: Cancel a batch Kafka query and rerun the same DataFrame may cause ConcurrentModificationException Key: SPARK-20452 URL: https://issues.apache.org/jira/browse/SPARK-20452

[jira] [Commented] (SPARK-20435) More thorough redaction of sensitive information from logs/UI, more unit tests

2017-04-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981947#comment-15981947 ] Marcelo Vanzin commented on SPARK-20435: bq. The user copies over the entire conf (say from

[jira] [Assigned] (SPARK-20451) Filter out nested mapType datatypes from sort order in randomSplit

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20451: Assignee: (was: Apache Spark) > Filter out nested mapType datatypes from sort order

[jira] [Assigned] (SPARK-20451) Filter out nested mapType datatypes from sort order in randomSplit

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20451: Assignee: Apache Spark > Filter out nested mapType datatypes from sort order in

[jira] [Commented] (SPARK-20451) Filter out nested mapType datatypes from sort order in randomSplit

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981926#comment-15981926 ] Apache Spark commented on SPARK-20451: -- User 'sameeragarwal' has created a pull request for this

[jira] [Created] (SPARK-20451) Filter out nested mapType datatypes from sort order in randomSplit

2017-04-24 Thread Sameer Agarwal (JIRA)
Sameer Agarwal created SPARK-20451: -- Summary: Filter out nested mapType datatypes from sort order in randomSplit Key: SPARK-20451 URL: https://issues.apache.org/jira/browse/SPARK-20451 Project:

[jira] [Assigned] (SPARK-4899) Support Mesos features: roles and checkpoints

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-4899: --- Assignee: Apache Spark > Support Mesos features: roles and checkpoints >

[jira] [Assigned] (SPARK-4899) Support Mesos features: roles and checkpoints

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-4899: --- Assignee: (was: Apache Spark) > Support Mesos features: roles and checkpoints >

[jira] [Commented] (SPARK-4899) Support Mesos features: roles and checkpoints

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981839#comment-15981839 ] Apache Spark commented on SPARK-4899: - User 'gkc2104' has created a pull request for this issue:

[jira] [Commented] (SPARK-4899) Support Mesos features: roles and checkpoints

2017-04-24 Thread Kamal Gurala (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981837#comment-15981837 ] Kamal Gurala commented on SPARK-4899: - https://github.com/apache/spark/pull/17750 > Support Mesos

[jira] [Commented] (SPARK-1359) SGD implementation is not efficient

2017-04-24 Thread yu peng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981794#comment-15981794 ] yu peng commented on SPARK-1359: I think by randomly shuffling partitions and doing gradient descent by

[jira] [Assigned] (SPARK-20449) Upgrade breeze version to 0.13.1

2017-04-24 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai reassigned SPARK-20449: --- Assignee: Yanbo Liang > Upgrade breeze version to 0.13.1 > > >

[jira] [Commented] (SPARK-20450) Unexpected first-query schema inference cost with 2.1.1 RC

2017-04-24 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981753#comment-15981753 ] Eric Liang commented on SPARK-20450: I'm not sure what you mean by new issue, but it's only in the

[jira] [Comment Edited] (SPARK-20450) Unexpected first-query schema inference cost with 2.1.1 RC

2017-04-24 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981753#comment-15981753 ] Eric Liang edited comment on SPARK-20450 at 4/24/17 7:40 PM: - I'm not sure

[jira] [Assigned] (SPARK-20450) Unexpected first-query schema inference cost with 2.1.1 RC

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20450: Assignee: (was: Apache Spark) > Unexpected first-query schema inference cost with

[jira] [Assigned] (SPARK-20450) Unexpected first-query schema inference cost with 2.1.1 RC

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20450: Assignee: Apache Spark > Unexpected first-query schema inference cost with 2.1.1 RC >

[jira] [Commented] (SPARK-20450) Unexpected first-query schema inference cost with 2.1.1 RC

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981750#comment-15981750 ] Apache Spark commented on SPARK-20450: -- User 'ericl' has created a pull request for this issue:

[jira] [Updated] (SPARK-20450) Unexpected first-query schema inference cost with 2.1.1 RC

2017-04-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-20450: -- Priority: Major (was: Blocker) [~ekhliang]: don't set Blocker. Is this actually a new issue? or a

[jira] [Closed] (SPARK-20440) Allow SparkR session and context to have delayed binding

2017-04-24 Thread Vinayak Joshi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayak Joshi closed SPARK-20440. - Resolution: Workaround > Allow SparkR session and context to have delayed binding >

[jira] [Commented] (SPARK-20440) Allow SparkR session and context to have delayed binding

2017-04-24 Thread Vinayak Joshi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981744#comment-15981744 ] Vinayak Joshi commented on SPARK-20440: --- The PR comments include a workaround for this need such

[jira] [Created] (SPARK-20450) Unexpected first-query schema inference cost with 2.1.1 RC

2017-04-24 Thread Eric Liang (JIRA)
Eric Liang created SPARK-20450: -- Summary: Unexpected first-query schema inference cost with 2.1.1 RC Key: SPARK-20450 URL: https://issues.apache.org/jira/browse/SPARK-20450 Project: Spark Issue

[jira] [Commented] (SPARK-20435) More thorough redaction of sensitive information from logs/UI, more unit tests

2017-04-24 Thread Mark Grover (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981730#comment-15981730 ] Mark Grover commented on SPARK-20435: - bq. I'm not saying redacting from logs is useless, but I'm

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-24 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981728#comment-15981728 ] Kazuaki Ishizaki commented on SPARK-20392: -- Thank you. I confirmed that blockbuster.csv is slow

[jira] [Commented] (SPARK-20312) query optimizer calls udf with null values when it doesn't expect them

2017-04-24 Thread Albert Meltzer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981694#comment-15981694 ] Albert Meltzer commented on SPARK-20312: [~maropu]: making the query a bit simpler might make the

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981648#comment-15981648 ] Nick Pentreath commented on SPARK-20446: By "compare to DataFrame implementation" I mean the

[jira] [Commented] (SPARK-7481) Add spark-hadoop-cloud module to pull in object store support

2017-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981602#comment-15981602 ] Steve Loughran commented on SPARK-7481: --- One thing I want to emphasise here is: I have no loyalty to

[jira] [Assigned] (SPARK-19812) YARN shuffle service fails to relocate recovery DB across NFS directories

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19812: Assignee: Thomas Graves (was: Apache Spark) > YARN shuffle service fails to relocate

[jira] [Assigned] (SPARK-19812) YARN shuffle service fails to relocate recovery DB across NFS directories

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19812: Assignee: Apache Spark (was: Thomas Graves) > YARN shuffle service fails to relocate

[jira] [Updated] (SPARK-19812) YARN shuffle service fails to relocate recovery DB across NFS directories

2017-04-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-19812: -- Summary: YARN shuffle service fails to relocate recovery DB across NFS directories (was: YARN

[jira] [Commented] (SPARK-19812) YARN shuffle service fails to relocate recovery DB across NFS directories

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981592#comment-15981592 ] Apache Spark commented on SPARK-19812: -- User 'tgravescs' has created a pull request for this issue:

[jira] [Resolved] (SPARK-20208) Document R fpGrowth support in vignettes, programming guide and code example

2017-04-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-20208. -- Resolution: Fixed Fix Version/s: 2.2.0 > Document R fpGrowth support in vignettes,

[jira] [Commented] (SPARK-20115) Fix DAGScheduler to recompute all the lost shuffle blocks when external shuffle service is unavailable

2017-04-24 Thread Juan Rodríguez Hortalá (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981585#comment-15981585 ] Juan Rodríguez Hortalá commented on SPARK-20115: SPARK-20178 is a discussion about how to

[jira] [Resolved] (SPARK-20438) R wrappers for split and repeat

2017-04-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-20438. -- Resolution: Fixed Assignee: Maciej Szymkiewicz Fix Version/s: 2.3.0

[jira] [Comment Edited] (SPARK-20107) Add spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version option to configuration.md

2017-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981552#comment-15981552 ] Steve Loughran edited comment on SPARK-20107 at 4/24/17 5:30 PM: - This

[jira] [Commented] (SPARK-20107) Add spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version option to configuration.md

2017-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981552#comment-15981552 ] Steve Loughran commented on SPARK-20107: This does not solve the problem you think it does, not

[jira] [Closed] (SPARK-20379) Allow setting SSL-related passwords through env variables

2017-04-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin closed SPARK-20379. -- Resolution: Not A Problem Just remembered that in 2.x (2.1 at least) users can reference env

[jira] [Comment Edited] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981148#comment-15981148 ] Peng Meng edited comment on SPARK-20446 at 4/24/17 4:18 PM: Thanks [~mlnick],

[jira] [Commented] (SPARK-11373) Add metrics to the History Server and providers

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981410#comment-15981410 ] Apache Spark commented on SPARK-11373: -- User 'steveloughran' has created a pull request for this

[jira] [Updated] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-24 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Barry Becker updated SPARK-20392: - Attachment: model_9756.zip blockbuster_fewCols.csv attaching

[jira] [Resolved] (SPARK-18901) Require in LR LogisticAggregator is redundant

2017-04-24 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-18901. - Resolution: Fixed Assignee: Miao Wang Fix Version/s: 2.2.0 > Require in LR

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-24 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981386#comment-15981386 ] Barry Becker commented on SPARK-20392: -- [~viirya] that is correct. If I reduce the dataset to just

[jira] [Commented] (SPARK-18791) Stream-Stream Joins

2017-04-24 Thread Saul Shanabrook (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981383#comment-15981383 ] Saul Shanabrook commented on SPARK-18791: - I am using Spark to process the results from genetic

[jira] [Commented] (SPARK-20449) Upgrade breeze version to 0.13.1

2017-04-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981321#comment-15981321 ] Sean Owen commented on SPARK-20449: --- What if anything are the compatibility issues? that's always the

[jira] [Assigned] (SPARK-20449) Upgrade breeze version to 0.13.1

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20449: Assignee: (was: Apache Spark) > Upgrade breeze version to 0.13.1 >

[jira] [Commented] (SPARK-20449) Upgrade breeze version to 0.13.1

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981270#comment-15981270 ] Apache Spark commented on SPARK-20449: -- User 'yanboliang' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20449) Upgrade breeze version to 0.13.1

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20449: Assignee: Apache Spark > Upgrade breeze version to 0.13.1 >

[jira] [Updated] (SPARK-20155) CSV-files with quoted quotes can't be parsed, if delimiter follows quoted quote

2017-04-24 Thread Rick Moritz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rick Moritz updated SPARK-20155: Description: According to https://tools.ietf.org/html/rfc4180#section-2: 7. If double-quotes

[jira] [Updated] (SPARK-20155) CSV-files with quoted quotes can't be parsed, if delimiter follows quoted quote

2017-04-24 Thread Rick Moritz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rick Moritz updated SPARK-20155: Description: According to https://tools.ietf.org/html/rfc4180#section-2: 7. If double-quotes

[jira] [Commented] (SPARK-20155) CSV-files with quoted quotes can't be parsed, if delimiter follows quoted quote

2017-04-24 Thread Rick Moritz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981185#comment-15981185 ] Rick Moritz commented on SPARK-20155: - Good info, thanks. I've added a link. > CSV-files with quoted

[jira] [Created] (SPARK-20449) Upgrade breeze version to 0.13.1

2017-04-24 Thread Yanbo Liang (JIRA)
Yanbo Liang created SPARK-20449: --- Summary: Upgrade breeze version to 0.13.1 Key: SPARK-20449 URL: https://issues.apache.org/jira/browse/SPARK-20449 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-20436) NullPointerException when restart from checkpoint file

2017-04-24 Thread Armin Braun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Armin Braun updated SPARK-20436: Description: I have written a Spark Streaming application which has two DStreams. The code is:

[jira] [Commented] (SPARK-20155) CSV-files with quoted quotes can't be parsed, if delimiter follows quoted quote

2017-04-24 Thread Armin Braun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981166#comment-15981166 ] Armin Braun commented on SPARK-20155: - [~RPCMoritz] take a look at what I just found:

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981148#comment-15981148 ] Peng Meng commented on SPARK-20446: --- Thanks [~mlnick], I also compared DataFrame Version ALS

[jira] [Commented] (SPARK-17159) Improve FileInputDStream.findNewFiles list performance

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981143#comment-15981143 ] Apache Spark commented on SPARK-17159: -- User 'steveloughran' has created a pull request for this

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981134#comment-15981134 ] Nick Pentreath commented on SPARK-20446: Also would be good to compare to the new {{DataFrame}}

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981129#comment-15981129 ] Nick Pentreath commented on SPARK-20446: Anyway I'd like to compare the approaches and see which

[jira] [Commented] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-24 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981124#comment-15981124 ] jin xing commented on SPARK-20426: -- [~jerryshao] I think lazy initialization can resolve this issue. I

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981122#comment-15981122 ] Nick Pentreath commented on SPARK-20446: The GC would come from the temp result array in the

[jira] [Assigned] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20426: Assignee: Apache Spark > OneForOneStreamManager occupies too much memory. >

[jira] [Assigned] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20426: Assignee: (was: Apache Spark) > OneForOneStreamManager occupies too much memory. >

[jira] [Commented] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981117#comment-15981117 ] Apache Spark commented on SPARK-20426: -- User 'jinxing64' has created a pull request for this issue:

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981115#comment-15981115 ] Peng Meng commented on SPARK-20446: --- I think you said: https://github.com/apache/spark/pull/9980 Maybe

[jira] [Assigned] (SPARK-20448) Document how FileInputDStream works with object storage

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20448: Assignee: (was: Apache Spark) > Document how FileInputDStream works with object

[jira] [Assigned] (SPARK-20448) Document how FileInputDStream works with object storage

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20448: Assignee: Apache Spark > Document how FileInputDStream works with object storage >

[jira] [Commented] (SPARK-20448) Document how FileInputDStream works with object storage

2017-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981094#comment-15981094 ] Apache Spark commented on SPARK-20448: -- User 'steveloughran' has created a pull request for this

[jira] [Commented] (SPARK-17159) Improve FileInputDStream.findNewFiles list performance

2017-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981081#comment-15981081 ] Steve Loughran commented on SPARK-17159: pulled out documentation into separate JIRA,

[jira] [Created] (SPARK-20448) Document how FileInputDStream works with object storage

2017-04-24 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-20448: -- Summary: Document how FileInputDStream works with object storage Key: SPARK-20448 URL: https://issues.apache.org/jira/browse/SPARK-20448 Project: Spark

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981066#comment-15981066 ] Nick Pentreath commented on SPARK-20446: This is really a duplicate of

[jira] [Commented] (SPARK-20155) CSV-files with quoted quotes can't be parsed, if delimiter follows quoted quote

2017-04-24 Thread Rick Moritz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981068#comment-15981068 ] Rick Moritz commented on SPARK-20155: - Why shouldn't we change the default escape character, when

[jira] [Created] (SPARK-20447) spark mesos scheduler suppress call

2017-04-24 Thread Pavel Plotnikov (JIRA)
Pavel Plotnikov created SPARK-20447: --- Summary: spark mesos scheduler suppress call Key: SPARK-20447 URL: https://issues.apache.org/jira/browse/SPARK-20447 Project: Spark Issue Type: Wish

[jira] [Resolved] (SPARK-20155) CSV-files with quoted quotes can't be parsed, if delimiter follows quoted quote

2017-04-24 Thread Armin Braun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Armin Braun resolved SPARK-20155. - Resolution: Won't Fix > CSV-files with quoted quotes can't be parsed, if delimiter follows
