[jira] [Issue Comment Deleted] (SPARK-10795) FileNotFoundException while deploying pyspark job on cluster

2016-02-22 Thread Carlos Bribiescas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlos Bribiescas updated SPARK-10795: -- Comment: was deleted (was: What is the command you use when this happens? I had this

[jira] [Commented] (SPARK-13437) Add InternalColumn

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157574#comment-15157574 ] Apache Spark commented on SPARK-13437: -- User 'kiszk' has created a pull request for this issue:

[jira] [Assigned] (SPARK-13437) Add InternalColumn

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13437: Assignee: (was: Apache Spark) > Add InternalColumn > -- > >

[jira] [Commented] (SPARK-11381) Replace example code in mllib-linear-methods.md using include_example

2016-02-22 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157608#comment-15157608 ] Xusen Yin commented on SPARK-11381: --- [~devaraj.k] Are you interested in working on it? > Replace

[jira] [Commented] (SPARK-13288) [1.6.0] Memory leak in Spark streaming

2016-02-22 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157725#comment-15157725 ] JESSE CHEN commented on SPARK-13288: The code is simple but I can't share the data per legal

[jira] [Commented] (SPARK-13046) Partitioning looks broken in 1.6

2016-02-22 Thread Julien Baley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157517#comment-15157517 ] Julien Baley commented on SPARK-13046: -- Well, my call to:

[jira] [Comment Edited] (SPARK-10795) FileNotFoundException while deploying pyspark job on cluster

2016-02-22 Thread Carlos Bribiescas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157683#comment-15157683 ] Carlos Bribiescas edited comment on SPARK-10795 at 2/22/16 8:57 PM:

[jira] [Commented] (SPARK-10795) FileNotFoundException while deploying pyspark job on cluster

2016-02-22 Thread Carlos Bribiescas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157683#comment-15157683 ] Carlos Bribiescas commented on SPARK-10795: --- Using this command spark-submit --master

[jira] [Assigned] (SPARK-13437) Add InternalColumn

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13437: Assignee: Apache Spark > Add InternalColumn > -- > > Key:

[jira] [Commented] (SPARK-13288) [1.6.0] Memory leak in Spark streaming

2016-02-22 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157700#comment-15157700 ] Shixiong Zhu commented on SPARK-13288: -- Do you have a simple reproducer? > [1.6.0] Memory leak in

[jira] [Updated] (SPARK-11624) Spark SQL CLI will set sessionstate twice

2016-02-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11624: - Priority: Critical (was: Major) > Spark SQL CLI will set sessionstate twice >

[jira] [Updated] (SPARK-11624) Spark SQL CLI will set sessionstate twice

2016-02-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11624: - Target Version/s: 1.6.1 > Spark SQL CLI will set sessionstate twice >

[jira] [Commented] (SPARK-13046) Partitioning looks broken in 1.6

2016-02-22 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157500#comment-15157500 ] Yin Huai commented on SPARK-13046: -- [~julien.baley] I tried the following code {code}

[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS

2016-02-22 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157526#comment-15157526 ] Yin Huai commented on SPARK-10109: -- InternalParquetRecordWriter's close method can only be called once.

[jira] [Updated] (SPARK-12583) spark shuffle fails with mesos after 2mins

2016-02-22 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12583: -- Target Version/s: 2.0.0 (was: 1.6.1) > spark shuffle fails with mesos after 2mins >

[jira] [Updated] (SPARK-13436) Add parameter drop to subsetting oeprator

2016-02-22 Thread Oscar D. Lara Yejas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oscar D. Lara Yejas updated SPARK-13436: Description: Parameter drop allows returning a vector/data.frame accordingly if

[jira] [Updated] (SPARK-13436) Add parameter drop to subsetting oeprator

2016-02-22 Thread Oscar D. Lara Yejas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oscar D. Lara Yejas updated SPARK-13436: Issue Type: Task (was: Bug) > Add parameter drop to subsetting oeprator >

[jira] [Created] (SPARK-13437) Add InternalColumn

2016-02-22 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-13437: Summary: Add InternalColumn Key: SPARK-13437 URL: https://issues.apache.org/jira/browse/SPARK-13437 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-13298) DAG visualization does not render correctly for jobs

2016-02-22 Thread Alex Bozarth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Bozarth updated SPARK-13298: - Attachment: no-dag.png Using your reproducer I saw the same js error but the dag image was

[jira] [Created] (SPARK-13436) Add parameter drop to subsetting oeprator

2016-02-22 Thread Oscar D. Lara Yejas (JIRA)
Oscar D. Lara Yejas created SPARK-13436: --- Summary: Add parameter drop to subsetting oeprator Key: SPARK-13436 URL: https://issues.apache.org/jira/browse/SPARK-13436 Project: Spark

[jira] [Commented] (SPARK-13436) Add parameter drop to subsetting oeprator

2016-02-22 Thread Oscar D. Lara Yejas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157544#comment-15157544 ] Oscar D. Lara Yejas commented on SPARK-13436: - I'm working on this one > Add parameter drop

[jira] [Commented] (SPARK-10795) FileNotFoundException while deploying pyspark job on cluster

2016-02-22 Thread Carlos Bribiescas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157612#comment-15157612 ] Carlos Bribiescas commented on SPARK-10795: --- What is the command you use when this happens? I

[jira] [Commented] (SPARK-13298) DAG visualization does not render correctly for jobs

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157637#comment-15157637 ] Apache Spark commented on SPARK-13298: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Commented] (SPARK-13298) DAG visualization does not render correctly for jobs

2016-02-22 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157633#comment-15157633 ] Shixiong Zhu commented on SPARK-13298: -- Thanks for the reproducer. Just submitted a PR to fix it. >

[jira] [Assigned] (SPARK-13298) DAG visualization does not render correctly for jobs

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13298: Assignee: Apache Spark (was: Shixiong Zhu) > DAG visualization does not render correctly

[jira] [Assigned] (SPARK-13298) DAG visualization does not render correctly for jobs

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13298: Assignee: Shixiong Zhu (was: Apache Spark) > DAG visualization does not render correctly

[jira] [Assigned] (SPARK-13298) DAG visualization does not render correctly for jobs

2016-02-22 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-13298: Assignee: Shixiong Zhu > DAG visualization does not render correctly for jobs >

[jira] [Updated] (SPARK-12546) Writing to partitioned parquet table can fail with OOM

2016-02-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12546: - Assignee: Michael Armbrust Target Version/s: 1.6.1 Priority:

[jira] [Assigned] (SPARK-12546) Writing to partitioned parquet table can fail with OOM

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12546: Assignee: Michael Armbrust (was: Apache Spark) > Writing to partitioned parquet table

[jira] [Assigned] (SPARK-12546) Writing to partitioned parquet table can fail with OOM

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12546: Assignee: Apache Spark (was: Michael Armbrust) > Writing to partitioned parquet table

[jira] [Commented] (SPARK-12546) Writing to partitioned parquet table can fail with OOM

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157605#comment-15157605 ] Apache Spark commented on SPARK-12546: -- User 'marmbrus' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-10795) FileNotFoundException while deploying pyspark job on cluster

2016-02-22 Thread Carlos Bribiescas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157683#comment-15157683 ] Carlos Bribiescas edited comment on SPARK-10795 at 2/22/16 8:58 PM:

[jira] [Comment Edited] (SPARK-10795) FileNotFoundException while deploying pyspark job on cluster

2016-02-22 Thread Carlos Bribiescas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157683#comment-15157683 ] Carlos Bribiescas edited comment on SPARK-10795 at 2/22/16 8:58 PM:

[jira] [Resolved] (SPARK-10749) Support multiple roles with Spark Mesos dispatcher

2016-02-22 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-10749. --- Resolution: Fixed Fix Version/s: 2.0.0 > Support multiple roles with Spark Mesos dispatcher >

[jira] [Updated] (SPARK-12757) Use reference counting to prevent blocks from being evicted during reads

2016-02-22 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-12757: -- Target Version/s: 2.0.0 > Use reference counting to prevent blocks from being evicted during reads >

[jira] [Updated] (SPARK-13436) Add parameter drop to subsetting oeprator

2016-02-22 Thread Oscar D. Lara Yejas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oscar D. Lara Yejas updated SPARK-13436: Issue Type: Sub-task (was: Task) Parent: SPARK-9315 > Add parameter drop

[jira] [Updated] (SPARK-13436) Add parameter drop to subsetting operator [

2016-02-22 Thread Oscar D. Lara Yejas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oscar D. Lara Yejas updated SPARK-13436: Summary: Add parameter drop to subsetting operator [ (was: Add parameter drop to

[jira] [Updated] (SPARK-13298) DAG visualization does not render correctly for jobs

2016-02-22 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-13298: -- Target Version/s: 1.6.1, 2.0.0 > DAG visualization does not render correctly for jobs >

[jira] [Commented] (SPARK-13289) Word2Vec generate infinite distances when numIterations>5

2016-02-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156785#comment-15156785 ] Nick Pentreath commented on SPARK-13289: [~daiqi5477] could you try your experiments again

[jira] [Commented] (SPARK-13026) Umbrella: Allow user to specify initial model when training

2016-02-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156803#comment-15156803 ] Nick Pentreath commented on SPARK-13026: [~holdenk] is this JIRA necessary, as it duplicates

[jira] [Resolved] (SPARK-12632) Make Parameter Descriptions Consistent for PySpark MLlib FPM and Recommendation

2016-02-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-12632. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11186

[jira] [Updated] (SPARK-13334) ML KMeansModel/BisectingKMeansModel should be set parent

2016-02-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13334: --- Assignee: Yanbo Liang > ML KMeansModel/BisectingKMeansModel should be set parent >

[jira] [Resolved] (SPARK-13334) ML KMeansModel/BisectingKMeansModel should be set parent

2016-02-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-13334. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11214

[jira] [Commented] (SPARK-13431) Maven build fails due to: Method code too large! in Catalyst

2016-02-22 Thread Zhichao Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158471#comment-15158471 ] Zhichao Zhang commented on SPARK-13431: I can't build it with sbt because downloading jar from

[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-02-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-13448: -- Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can

[jira] [Resolved] (SPARK-13429) Unify Logistic Regression convergence tolerance of ML & MLlib

2016-02-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-13429. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11299

[jira] [Updated] (SPARK-13429) Unify Logistic Regression convergence tolerance of ML & MLlib

2016-02-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-13429: -- Assignee: Yanbo Liang > Unify Logistic Regression convergence tolerance of ML & MLlib >

[jira] [Updated] (SPARK-12363) PowerIterationClustering test case failed if we deprecated KMeans.setRuns

2016-02-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-12363: -- Fix Version/s: 2.0.0 1.6.1 1.5.3 >

[jira] [Resolved] (SPARK-12363) PowerIterationClustering test case failed if we deprecated KMeans.setRuns

2016-02-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-12363. --- Resolution: Fixed Fix Version/s: (was: 1.6.1) (was: 1.5.3)

[jira] [Resolved] (SPARK-13355) Replace GraphImpl.fromExistingRDDs by Graph

2016-02-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-13355. --- Resolution: Fixed > Replace GraphImpl.fromExistingRDDs by Graph >

[jira] [Updated] (SPARK-13355) Replace GraphImpl.fromExistingRDDs by Graph

2016-02-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-13355: -- Fix Version/s: 1.6.2 2.0.0 1.5.3 1.4.2

[jira] [Updated] (SPARK-13446) Spark need to support reading data from HIve 2.0.0 metastore

2016-02-22 Thread Lifeng Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lifeng Wang updated SPARK-13446: Description: Spark provides the HiveContext class to read data from the Hive metastore directly. While it

[jira] [Created] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-02-22 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-13448: - Summary: Document MLlib behavior changes in Spark 2.0 Key: SPARK-13448 URL: https://issues.apache.org/jira/browse/SPARK-13448 Project: Spark Issue Type:

[jira] [Commented] (SPARK-12363) PowerIterationClustering test case failed if we deprecated KMeans.setRuns

2016-02-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158488#comment-15158488 ] Xiangrui Meng commented on SPARK-12363: --- I think it is still useful to backport the fix. People may

[jira] [Updated] (SPARK-12746) ArrayType(_, true) should also accept ArrayType(_, false)

2016-02-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-12746: -- Fix Version/s: 1.6.2 > ArrayType(_, true) should also accept ArrayType(_, false) >

[jira] [Resolved] (SPARK-13257) Refine naive Bayes example code

2016-02-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-13257. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11125

[jira] [Updated] (SPARK-13433) The standalone server should limit the count of cores and memory for running Drivers

2016-02-22 Thread lichenglin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lichenglin updated SPARK-13433: --- Description: I have a 16-core cluster. A running driver uses at least 1 core, maybe more. When I

[jira] [Updated] (SPARK-13434) Reduce Spark RandomForest memory footprint

2016-02-22 Thread Ewan Higgs (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewan Higgs updated SPARK-13434: --- Attachment: rf-heap-usage.png JConsole output of memory use with 1.3G file. > Reduce Spark

[jira] [Commented] (SPARK-13434) Reduce Spark RandomForest memory footprint

2016-02-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156987#comment-15156987 ] Sean Owen commented on SPARK-13434: --- I'm missing what you're proposing -- what is the opportunity to

[jira] [Commented] (SPARK-13433) The standalone server should limit the count of cores and memory for running Drivers

2016-02-22 Thread lichenglin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157050#comment-15157050 ] lichenglin commented on SPARK-13433: It's something like deadlock. driver use all cores> application

[jira] [Commented] (SPARK-13431) Maven build fails due to: Method code too large! in Catalyst

2016-02-22 Thread Iulian Dragos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157044#comment-15157044 ] Iulian Dragos commented on SPARK-13431: --- Well, the class file format isn't embedding any names

[jira] [Updated] (SPARK-13433) The standalone server should limit the count of cores and memory for running Drivers

2016-02-22 Thread lichenglin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lichenglin updated SPARK-13433: --- Description: I have a 16-core cluster. When I submit a lot of jobs to the standalone server in

[jira] [Created] (SPARK-13435) Add Weighted Cohen's kappa to MulticlassMetrics

2016-02-22 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-13435: Summary: Add Weighted Cohen's kappa to MulticlassMetrics Key: SPARK-13435 URL: https://issues.apache.org/jira/browse/SPARK-13435 Project: Spark Issue Type:

[jira] [Commented] (SPARK-13431) Maven build fails due to: Method code too large! in Catalyst

2016-02-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156991#comment-15156991 ] Sean Owen commented on SPARK-13431: --- Shade doesn't do optimization but it certainly does several

[jira] [Created] (SPARK-13434) Reduce Spark RandomForest memory footprint

2016-02-22 Thread Ewan Higgs (JIRA)
Ewan Higgs created SPARK-13434: -- Summary: Reduce Spark RandomForest memory footprint Key: SPARK-13434 URL: https://issues.apache.org/jira/browse/SPARK-13434 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-13433) The standalone server should limit the count of cores and memory for running Drivers

2016-02-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-13433. --- Resolution: Not A Problem I don't see that this is a problem. You're saying that if you use all your

[jira] [Updated] (SPARK-13435) Add Weighted Cohen's kappa to MulticlassMetrics

2016-02-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13435: -- Priority: Minor (was: Major) I'm neutral on it... it's not widely used and easy to compute from the

[jira] [Assigned] (SPARK-13435) Add Weighted Cohen's kappa to MulticlassMetrics

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13435: Assignee: Apache Spark > Add Weighted Cohen's kappa to MulticlassMetrics >

[jira] [Assigned] (SPARK-13435) Add Weighted Cohen's kappa to MulticlassMetrics

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13435: Assignee: (was: Apache Spark) > Add Weighted Cohen's kappa to MulticlassMetrics >

[jira] [Commented] (SPARK-10420) Implementing Reactive Streams based Spark Streaming Receiver

2016-02-22 Thread Shyam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157001#comment-15157001 ] Shyam commented on SPARK-10420: --- Hi all, is there any activity around this ticket? > Implementing Reactive

[jira] [Commented] (SPARK-13435) Add Weighted Cohen's kappa to MulticlassMetrics

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157000#comment-15157000 ] Apache Spark commented on SPARK-13435: -- User 'zhengruifeng' has created a pull request for this

[jira] [Commented] (SPARK-13431) Maven build fails due to: Method code too large! in Catalyst

2016-02-22 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156882#comment-15156882 ] Herman van Hovell commented on SPARK-13431: --- The good/bad news is that I can reproduce this

[jira] [Comment Edited] (SPARK-13431) Maven build fails due to: Method code too large! in Catalyst

2016-02-22 Thread Iulian Dragos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156902#comment-15156902 ] Iulian Dragos edited comment on SPARK-13431 at 2/22/16 1:29 PM: In my

[jira] [Updated] (SPARK-12379) Copy GBT implementation to spark.ml

2016-02-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-12379: --- Assignee: Seth Hendrickson > Copy GBT implementation to spark.ml >

[jira] [Commented] (SPARK-13431) Maven build fails due to: Method code too large! in Catalyst

2016-02-22 Thread Iulian Dragos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156902#comment-15156902 ] Iulian Dragos commented on SPARK-13431: --- In my experience the 'shade' plugin does a lot more than

[jira] [Created] (SPARK-13433) The standalone should limit the count of Running Drivers

2016-02-22 Thread lichenglin (JIRA)
lichenglin created SPARK-13433: -- Summary: The standalone should limit the count of Running Drivers Key: SPARK-13433 URL: https://issues.apache.org/jira/browse/SPARK-13433 Project: Spark Issue

[jira] [Updated] (SPARK-13434) Reduce Spark RandomForest memory footprint

2016-02-22 Thread Ewan Higgs (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewan Higgs updated SPARK-13434: --- Attachment: heap-usage.log Heap usage of RandomForest sampled with {{jmap -histo:live }} every 5

[jira] [Commented] (SPARK-13434) Reduce Spark RandomForest memory footprint

2016-02-22 Thread Ewan Higgs (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156921#comment-15156921 ] Ewan Higgs commented on SPARK-13434: SPARK-3728 is titled with a similar intent to this, but the

[jira] [Assigned] (SPARK-13390) Java Spark createDataFrame with List parameter bug

2016-02-22 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-13390: Assignee: Shixiong Zhu > Java Spark createDataFrame with List parameter bug >

[jira] [Updated] (SPARK-13438) Remove dash from save output paths

2016-02-22 Thread Peter Ableda (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Ableda updated SPARK-13438: - Description: The current implementation uses the above schema for the saveAsTextFiles,

[jira] [Resolved] (SPARK-13413) Remove SparkContext.metricsSystem

2016-02-22 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-13413. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11282

[jira] [Commented] (SPARK-13433) The standalone server should limit the count of cores and memory for running Drivers

2016-02-22 Thread lichenglin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157802#comment-15157802 ] lichenglin commented on SPARK-13433: I know the property 'spark.driver.cores'. What I want to limit

[jira] [Assigned] (SPARK-13438) Remove by default dash from output paths

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13438: Assignee: (was: Apache Spark) > Remove by default dash from output paths >

[jira] [Assigned] (SPARK-13438) Remove by default dash from output paths

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13438: Assignee: Apache Spark > Remove by default dash from output paths >

[jira] [Commented] (SPARK-13438) Remove by default dash from output paths

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157868#comment-15157868 ] Apache Spark commented on SPARK-13438: -- User 'peterableda' has created a pull request for this

[jira] [Commented] (SPARK-12042) Python API for mllib.stat.test.StreamingTest

2016-02-22 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157885#comment-15157885 ] Xusen Yin commented on SPARK-12042: --- I'll work on it. > Python API for mllib.stat.test.StreamingTest >

[jira] [Updated] (SPARK-13025) Allow user to specify the initial model when training LogisticRegression

2016-02-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-13025: Issue Type: Bug (was: Sub-task) Parent: (was: SPARK-13026) > Allow user to specify the

[jira] [Commented] (SPARK-13026) Umbrella: Allow user to specify initial model when training

2016-02-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157913#comment-15157913 ] holdenk commented on SPARK-13026: - yah we can probably close this and move SPARK-13025 under SPARK-11136.

[jira] [Updated] (SPARK-13025) Allow user to specify the initial model when training LogisticRegression

2016-02-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-13025: Issue Type: Sub-task (was: Bug) Parent: SPARK-13026 > Allow user to specify the initial model

[jira] [Closed] (SPARK-13433) The standalone server should limit the count of cores and memory for running Drivers

2016-02-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen closed SPARK-13433. - The driver uses 1 core. If it can't schedule, nothing can. This is not deadlock, it's just resource

[jira] [Commented] (SPARK-13434) Reduce Spark RandomForest memory footprint

2016-02-22 Thread Ewan Higgs (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157105#comment-15157105 ] Ewan Higgs commented on SPARK-13434: Hi Sean, This is using jmap with the {{-histo:live}} argument

[jira] [Commented] (SPARK-13431) Maven build fails due to: Method code too large! in Catalyst

2016-02-22 Thread Iulian Dragos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157119#comment-15157119 ] Iulian Dragos commented on SPARK-13431: --- Probably the easiest fix is to break some grammar elements

[jira] [Commented] (SPARK-13433) The standalone server should limit the count of cores and memory for running Drivers

2016-02-22 Thread lichenglin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157118#comment-15157118 ] lichenglin commented on SPARK-13433: But when? When something else frees up resources? All the cores

[jira] [Commented] (SPARK-13219) Pushdown predicate propagation in SparkSQL with join

2016-02-22 Thread Evan Chan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157292#comment-15157292 ] Evan Chan commented on SPARK-13219: --- [~smilegator] [~doodlegum] what is the URL to the latest patch?

[jira] [Assigned] (SPARK-13266) Python DataFrameReader converts None to "None" instead of null

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13266: Assignee: Apache Spark > Python DataFrameReader converts None to "None" instead of null >

[jira] [Assigned] (SPARK-13266) Python DataFrameReader converts None to "None" instead of null

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13266: Assignee: (was: Apache Spark) > Python DataFrameReader converts None to "None"

[jira] [Commented] (SPARK-13266) Python DataFrameReader converts None to "None" instead of null

2016-02-22 Thread mathieu longtin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157260#comment-15157260 ] mathieu longtin commented on SPARK-13266: - https://github.com/apache/spark/pull/11305 > Python

[jira] [Commented] (SPARK-13266) Python DataFrameReader converts None to "None" instead of null

2016-02-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157258#comment-15157258 ] Apache Spark commented on SPARK-13266: -- User 'mathieulongtin' has created a pull request for this

[jira] [Issue Comment Deleted] (SPARK-13266) Python DataFrameReader converts None to "None" instead of null

2016-02-22 Thread mathieu longtin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mathieu longtin updated SPARK-13266: Comment: was deleted (was: https://github.com/apache/spark/pull/11305) > Python

[jira] [Commented] (SPARK-13431) Maven build fails due to: Method code too large! in Catalyst

2016-02-22 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157129#comment-15157129 ] Stavros Kontopoulos commented on SPARK-13431: - The Java Virtual Machine specification limits
