[jira] [Created] (SPARK-3740) Use a compressed bitmap to track zero sized blocks in HighlyCompressedMapStatus

2014-09-30 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-3740: -- Summary: Use a compressed bitmap to track zero sized blocks in HighlyCompressedMapStatus Key: SPARK-3740 URL: https://issues.apache.org/jira/browse/SPARK-3740 Project:

[jira] [Updated] (SPARK-3740) Use a compressed bitmap to track zero sized blocks in HighlyCompressedMapStatus

2014-09-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-3740: --- Labels: starter (was: ) Use a compressed bitmap to track zero sized blocks in

[jira] [Commented] (SPARK-1860) Standalone Worker cleanup should not clean up running executors

2014-09-30 Thread Aaron Davidson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152839#comment-14152839 ] Aaron Davidson commented on SPARK-1860: --- The Executor could clean up its own jars

[jira] [Commented] (SPARK-3709) Executors don't always report broadcast block removal properly back to the driver

2014-09-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152845#comment-14152845 ] Apache Spark commented on SPARK-3709: - User 'rxin' has created a pull request for this

[jira] [Closed] (SPARK-3734) DriverRunner should not read SPARK_HOME from submitter's environment

2014-09-30 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-3734. Resolution: Fixed DriverRunner should not read SPARK_HOME from submitter's environment

[jira] [Closed] (SPARK-3734) DriverRunner should not read SPARK_HOME from submitter's environment

2014-09-30 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-3734. Resolution: Fixed Fix Version/s: 1.2.0 1.1.1 Target Version/s: 1.1.1,

[jira] [Reopened] (SPARK-3734) DriverRunner should not read SPARK_HOME from submitter's environment

2014-09-30 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or reopened SPARK-3734: -- DriverRunner should not read SPARK_HOME from submitter's environment

[jira] [Closed] (SPARK-3738) InsertIntoHiveTable can't handle strings with \n

2014-09-30 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian closed SPARK-3738. - Resolution: Invalid False alarm, it's because of Hive's default SerDe, which uses '\n' as record

[jira] [Commented] (SPARK-3711) Optimize where in clause filter queries

2014-09-30 Thread Yash Datta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152878#comment-14152878 ] Yash Datta commented on SPARK-3711: --- On a 2 node setup each machine config: 24 core

[jira] [Closed] (SPARK-2763) Add function of deleting applications under spark.eventLog.dir

2014-09-30 Thread meiyoula (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] meiyoula closed SPARK-2763. --- Resolution: Won't Fix Add function of deleting applications under spark.eventLog.dir

[jira] [Updated] (SPARK-3100) Spark RDD partitions are not running in the workers as per locality information given by each partition.

2014-09-30 Thread Ravindra Pesala (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated SPARK-3100: --- Description: I created a simple custom RDD (SampleRDD.scala)and created 4 splits for 4

[jira] [Updated] (SPARK-3100) Spark RDD partitions are not running in the workers as per locality information given by each partition.

2014-09-30 Thread Ravindra Pesala (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated SPARK-3100: --- Description: I created a simple custom RDD (SampleRDD.scala)and created 4 splits for 4

[jira] [Comment Edited] (SPARK-3100) Spark RDD partitions are not running in the workers as per locality information given by each partition.

2014-09-30 Thread Ravindra Pesala (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100664#comment-14100664 ] Ravindra Pesala edited comment on SPARK-3100 at 9/30/14 7:47 AM:

[jira] [Commented] (SPARK-3732) Yarn Client: Add option to NOT System.exit() at end of main()

2014-09-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152912#comment-14152912 ] Sean Owen commented on SPARK-3732: -- FWIW, I was also surprised in the past that there is

[jira] [Updated] (SPARK-3731) RDD caching stops working in pyspark after some time

2014-09-30 Thread Milan Straka (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Milan Straka updated SPARK-3731: Environment: Linux, 32bit, both local master and standalone mode (was: Linux, 32bit, standalone

[jira] [Updated] (SPARK-3731) RDD caching stops working in pyspark after some time

2014-09-30 Thread Milan Straka (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Milan Straka updated SPARK-3731: Environment: Linux, 32bit, both in local mode or in standalone cluster mode (was: Linux, 32bit,

[jira] [Created] (SPARK-3741) ConnectionManager.sendMessage may not propagate errors to MessageStatus

2014-09-30 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-3741: --- Summary: ConnectionManager.sendMessage may not propagate errors to MessageStatus Key: SPARK-3741 URL: https://issues.apache.org/jira/browse/SPARK-3741 Project: Spark

[jira] [Commented] (SPARK-3741) ConnectionManager.sendMessage may not propagate errors to MessageStatus

2014-09-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152957#comment-14152957 ] Apache Spark commented on SPARK-3741: - User 'zsxwing' has created a pull request for

[jira] [Created] (SPARK-3742) SparkUI page unable to open

2014-09-30 Thread meiyoula (JIRA)
meiyoula created SPARK-3742: --- Summary: SparkUI page unable to open Key: SPARK-3742 URL: https://issues.apache.org/jira/browse/SPARK-3742 Project: Spark Issue Type: Bug Components: Spark

[jira] [Updated] (SPARK-3742) SparkUI page unable to open

2014-09-30 Thread meiyoula (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] meiyoula updated SPARK-3742: Description: When run an application on yarn, the hyperlink on yarn page can't turn to sparkUI page. It

[jira] [Updated] (SPARK-3742) SparkUI page unable to open

2014-09-30 Thread meiyoula (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] meiyoula updated SPARK-3742: Description: When running an application on yarn, the hyperlink on yarn page can't jump to sparkUI page.

[jira] [Commented] (SPARK-3687) Spark hang while processing more than 100 sequence files

2014-09-30 Thread Ziv Huang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152985#comment-14152985 ] Ziv Huang commented on SPARK-3687: -- I run jps on worker node when it hangs. I see two

[jira] [Resolved] (SPARK-3532) Spark On FreeBSD. Snappy used by torrent broadcast fails to load native libs.

2014-09-30 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma resolved SPARK-3532. Resolution: Fixed Spark On FreeBSD. Snappy used by torrent broadcast fails to load native

[jira] [Commented] (SPARK-3733) Support for programmatically submitting Spark jobs

2014-09-30 Thread Matthew Farrellee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153142#comment-14153142 ] Matthew Farrellee commented on SPARK-3733: -- will you describe what you mean by

[jira] [Commented] (SPARK-3702) Standardize MLlib classes for learners, models

2014-09-30 Thread Christoph Sawade (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153173#comment-14153173 ] Christoph Sawade commented on SPARK-3702: - Great initiative. I really appreciate

[jira] [Comment Edited] (SPARK-3702) Standardize MLlib classes for learners, models

2014-09-30 Thread Christoph Sawade (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153173#comment-14153173 ] Christoph Sawade edited comment on SPARK-3702 at 9/30/14 1:59 PM:

[jira] [Commented] (SPARK-2620) case class cannot be used as key for reduce

2014-09-30 Thread matthias (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153283#comment-14153283 ] matthias commented on SPARK-2620: - I have it too with 1.1.0 case class cannot be used as

[jira] [Comment Edited] (SPARK-2620) case class cannot be used as key for reduce

2014-09-30 Thread matthias (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153283#comment-14153283 ] matthias edited comment on SPARK-2620 at 9/30/14 3:28 PM: -- I have

[jira] [Reopened] (SPARK-3007) Add Dynamic Partition support to Spark Sql hive

2014-09-30 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-3007: This was reverted based on causing large numbers of test failures. Add Dynamic Partition

[jira] [Reopened] (SPARK-2778) Add unit tests for Yarn integration

2014-09-30 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-2778: This has been reverted due to failing tests. Add unit tests for Yarn integration

[jira] [Comment Edited] (SPARK-1860) Standalone Worker cleanup should not clean up running executors

2014-09-30 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153382#comment-14153382 ] Matt Cheah edited comment on SPARK-1860 at 9/30/14 4:48 PM:

[jira] [Commented] (SPARK-1860) Standalone Worker cleanup should not clean up running executors

2014-09-30 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153382#comment-14153382 ] Matt Cheah commented on SPARK-1860: --- Cool, I see where you're coming from now. I'll whip

[jira] [Commented] (SPARK-3708) Backticks aren't handled correctly in aliases

2014-09-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153392#comment-14153392 ] Apache Spark commented on SPARK-3708: - User 'ravipesala' has created a pull request

[jira] [Commented] (SPARK-3366) Compute best splits distributively in decision tree

2014-09-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153396#comment-14153396 ] Apache Spark commented on SPARK-3366: - User 'chouqin' has created a pull request for

[jira] [Created] (SPARK-3743) noisy logging when context is stopped

2014-09-30 Thread Davies Liu (JIRA)
Davies Liu created SPARK-3743: - Summary: noisy logging when context is stopped Key: SPARK-3743 URL: https://issues.apache.org/jira/browse/SPARK-3743 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-2778) Add unit tests for Yarn integration

2014-09-30 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2778: --- Attachment: yarn-logs.txt I'm attaching logs from the bad test. Add unit tests for Yarn

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-30 Thread David Hall (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153412#comment-14153412 ] David Hall commented on SPARK-1405: --- Hi everyone, Sorry for taking so long for me to

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-30 Thread David Hall (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153422#comment-14153422 ] David Hall commented on SPARK-1405: --- I should also mention it needs less space. Gibbs

[jira] [Created] (SPARK-3744) FlumeStreamSuite will fail during port contention

2014-09-30 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-3744: -- Summary: FlumeStreamSuite will fail during port contention Key: SPARK-3744 URL: https://issues.apache.org/jira/browse/SPARK-3744 Project: Spark Issue

[jira] [Commented] (SPARK-2778) Add unit tests for Yarn integration

2014-09-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153429#comment-14153429 ] Marcelo Vanzin commented on SPARK-2778: --- Patrick, would it be possible to get all

[jira] [Created] (SPARK-3745) curl on maven search repo apache rat url returns search status, not jar file

2014-09-30 Thread shane knapp (JIRA)
shane knapp created SPARK-3745: -- Summary: curl on maven search repo apache rat url returns search status, not jar file Key: SPARK-3745 URL: https://issues.apache.org/jira/browse/SPARK-3745 Project:

[jira] [Updated] (SPARK-3745) curl on maven search repo apache rat url returns search status, not jar file

2014-09-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shane knapp updated SPARK-3745: --- Description: in spark/dev/check-license, there are four attempts to download the apache rat jar from

[jira] [Updated] (SPARK-3745) curl on maven search repo apache rat url returns search status, not jar file

2014-09-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shane knapp updated SPARK-3745: --- Description: in spark/dev/check-license, there are four attempts to download the apache rat jar from

[jira] [Updated] (SPARK-3745) curl on maven search repo apache rat url returns search status, not jar file

2014-09-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shane knapp updated SPARK-3745: --- Description: in spark/dev/check-license, there are four attempts to download the apache rat jar from

[jira] [Commented] (SPARK-3745) curl on maven search repo apache rat url returns search status, not jar file

2014-09-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153459#comment-14153459 ] shane knapp commented on SPARK-3745: https://github.com/apache/spark/pull/2596 curl

[jira] [Commented] (SPARK-3745) curl on maven search repo apache rat url returns search status, not jar file

2014-09-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153463#comment-14153463 ] Apache Spark commented on SPARK-3745: - User 'shaneknapp' has created a pull request

[jira] [Commented] (SPARK-3745) curl on maven search repo apache rat url returns search status, not jar file

2014-09-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153471#comment-14153471 ] shane knapp commented on SPARK-3745: it works. from the console log: {noformat}

[jira] [Updated] (SPARK-3745) curl on maven search repo (apache rat) url returns search status, not jar file

2014-09-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shane knapp updated SPARK-3745: --- Summary: curl on maven search repo (apache rat) url returns search status, not jar file (was: curl

[jira] [Resolved] (SPARK-3356) Document when RDD elements' ordering within partitions is nondeterministic

2014-09-30 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-3356. -- Resolution: Fixed Fix Version/s: 1.2.0 Document when RDD elements' ordering within

[jira] [Updated] (SPARK-3356) Document when RDD elements' ordering within partitions is nondeterministic

2014-09-30 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-3356: - Assignee: Sean Owen Document when RDD elements' ordering within partitions is nondeterministic

[jira] [Updated] (SPARK-3366) Compute best splits distributively in decision tree

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3366: - Target Version/s: 1.2.0 Compute best splits distributively in decision tree

[jira] [Commented] (SPARK-1860) Standalone Worker cleanup should not clean up running executors

2014-09-30 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153514#comment-14153514 ] Matt Cheah commented on SPARK-1860: --- The change I am going to make is that when the

[jira] [Updated] (SPARK-3627) spark on yarn reports success even though job fails

2014-09-30 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3627: - Affects Version/s: (was: 1.2.0) 1.1.0 spark on yarn reports success even

[jira] [Created] (SPARK-3746) Failure to lock hive client when creating tables

2014-09-30 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-3746: --- Summary: Failure to lock hive client when creating tables Key: SPARK-3746 URL: https://issues.apache.org/jira/browse/SPARK-3746 Project: Spark Issue

[jira] [Commented] (SPARK-3746) Failure to lock hive client when creating tables

2014-09-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153569#comment-14153569 ] Apache Spark commented on SPARK-3746: - User 'marmbrus' has created a pull request for

[jira] [Created] (SPARK-3747) TaskResultGetter could incorrectly abort a stage if it cannot get result for a specific task

2014-09-30 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-3747: -- Summary: TaskResultGetter could incorrectly abort a stage if it cannot get result for a specific task Key: SPARK-3747 URL: https://issues.apache.org/jira/browse/SPARK-3747

[jira] [Reopened] (SPARK-2063) Creating a SchemaRDD via sql() does not correctly resolve nested types

2014-09-30 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai reopened SPARK-2063: - I tried {code} sql("SELECT n.a FROM nestedData GROUP BY n.a ORDER BY n.a LIMIT 10") {code} It still cannot be

[jira] [Commented] (SPARK-2063) Creating a SchemaRDD via sql() does not correctly resolve nested types

2014-09-30 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153578#comment-14153578 ] Yin Huai commented on SPARK-2063: - Also, seems it is not duplicated with SPARK-3414.

[jira] [Created] (SPARK-3748) Log thread name in unit test logs

2014-09-30 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-3748: -- Summary: Log thread name in unit test logs Key: SPARK-3748 URL: https://issues.apache.org/jira/browse/SPARK-3748 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-3747) TaskResultGetter could incorrectly abort a stage if it cannot get result for a specific task

2014-09-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153593#comment-14153593 ] Apache Spark commented on SPARK-3747: - User 'rxin' has created a pull request for this

[jira] [Updated] (SPARK-3436) Streaming SVM

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3436: - Summary: Streaming SVM (was: [MLlib]Streaming SVM ) Streaming SVM --

[jira] [Updated] (SPARK-3486) Add PySpark support for Word2Vec

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3486: - Summary: Add PySpark support for Word2Vec (was: [MLlib]Add PySpark support for Word2Vec) Add

[jira] [Commented] (SPARK-3748) Log thread name in unit test logs

2014-09-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153615#comment-14153615 ] Apache Spark commented on SPARK-3748: - User 'rxin' has created a pull request for this

[jira] [Updated] (SPARK-3158) Avoid 1 extra aggregation for DecisionTree training

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3158: - Target Version/s: 1.2.0 Avoid 1 extra aggregation for DecisionTree training

[jira] [Updated] (SPARK-3158) Avoid 1 extra aggregation for DecisionTree training

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3158: - Priority: Major (was: Minor) Avoid 1 extra aggregation for DecisionTree training

[jira] [Updated] (SPARK-3161) Cache example-node map for DecisionTree training

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3161: - Priority: Major (was: Minor) Target Version/s: 1.2.0 Cache example-node map for

[jira] [Commented] (SPARK-3744) FlumeStreamSuite will fail during port contention

2014-09-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153643#comment-14153643 ] Apache Spark commented on SPARK-3744: - User 'srowen' has created a pull request for

[jira] [Updated] (SPARK-3165) DecisionTree does not use sparsity in data

2014-09-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3165: - Priority: Minor (was: Major) DecisionTree does not use sparsity in data

[jira] [Updated] (SPARK-3380) DecisionTree: overflow and precision in aggregation

2014-09-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3380: - Priority: Minor (was: Major) DecisionTree: overflow and precision in aggregation

[jira] [Commented] (SPARK-2692) Decision Tree API update

2014-09-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153647#comment-14153647 ] Joseph K. Bradley commented on SPARK-2692: -- Closing this since it is part of the

[jira] [Closed] (SPARK-2692) Decision Tree API update

2014-09-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-2692. Resolution: Duplicate This is superseded by the new API JIRA. Decision Tree API update

[jira] [Created] (SPARK-3749) Bugs in broadcast of large RDD

2014-09-30 Thread Davies Liu (JIRA)
Davies Liu created SPARK-3749: - Summary: Bugs in broadcast of large RDD Key: SPARK-3749 URL: https://issues.apache.org/jira/browse/SPARK-3749 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-3745) curl on maven search repo (apache rat) url returns search status, not jar file

2014-09-30 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3745. --- Resolution: Fixed Fix Version/s: 1.1.1 1.2.0 Issue resolved by pull request

[jira] [Updated] (SPARK-2463) Creating then stopping StreamingContext multiple times from shell generates duplicate Streaming tabs in UI

2014-09-30 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-2463: Summary: Creating then stopping StreamingContext multiple times from shell generates

[jira] [Commented] (SPARK-3745) curl on maven search repo (apache rat) url returns search status, not jar file

2014-09-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153716#comment-14153716 ] shane knapp commented on SPARK-3745: thanks josh! curl on maven search repo (apache

[jira] [Commented] (SPARK-2008) Enhance spark-ec2 to be able to add and remove slaves to an existing cluster

2014-09-30 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153723#comment-14153723 ] Nicholas Chammas commented on SPARK-2008: - [~shivaram] [~pwendell] - Is this a

[jira] [Created] (SPARK-3750) Log ulimit settings at warning if they are too low

2014-09-30 Thread Andrew Ash (JIRA)
Andrew Ash created SPARK-3750: - Summary: Log ulimit settings at warning if they are too low Key: SPARK-3750 URL: https://issues.apache.org/jira/browse/SPARK-3750 Project: Spark Issue Type:

[jira] [Commented] (SPARK-3749) Bugs in broadcast of large RDD

2014-09-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153742#comment-14153742 ] Apache Spark commented on SPARK-3749: - User 'davies' has created a pull request for

[jira] [Commented] (SPARK-1860) Standalone Worker cleanup should not clean up running executors

2014-09-30 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153752#comment-14153752 ] Andrew Ash commented on SPARK-1860: --- That matches my expectations for this ticket Matt

[jira] [Commented] (SPARK-1546) Add AdaBoost algorithm to Spark MLlib

2014-09-30 Thread Manish Amde (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153754#comment-14153754 ] Manish Amde commented on SPARK-1546: This work was stalled due to decision tree

[jira] [Commented] (SPARK-3627) spark on yarn reports success even though job fails

2014-09-30 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153758#comment-14153758 ] Andrew Or commented on SPARK-3627: -- Did this problem not exist in 1.1? I vaguely remember

[jira] [Commented] (SPARK-3702) Standardize MLlib classes for learners, models

2014-09-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153763#comment-14153763 ] Joseph K. Bradley commented on SPARK-3702: -- Thanks for taking a close look! *

[jira] [Commented] (SPARK-2008) Enhance spark-ec2 to be able to add and remove slaves to an existing cluster

2014-09-30 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153770#comment-14153770 ] Shivaram Venkataraman commented on SPARK-2008: -- This will be a very useful

[jira] [Commented] (SPARK-3627) spark on yarn reports success even though job fails

2014-09-30 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153777#comment-14153777 ] Thomas Graves commented on SPARK-3627: -- As far as I am aware the common cases worked

[jira] [Commented] (SPARK-3434) Distributed block matrix

2014-09-30 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153774#comment-14153774 ] Shivaram Venkataraman commented on SPARK-3434: -- I'll post a design doc by

[jira] [Updated] (SPARK-3627) spark on yarn reports success even though job fails

2014-09-30 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-3627: - Affects Version/s: (was: 1.1.0) 1.2.0 spark on yarn reports success

[jira] [Created] (SPARK-3751) DecisionTreeRunner functionality improvement

2014-09-30 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3751: Summary: DecisionTreeRunner functionality improvement Key: SPARK-3751 URL: https://issues.apache.org/jira/browse/SPARK-3751 Project: Spark Issue

[jira] [Commented] (SPARK-2247) Data frame (or Pandas) like API for structured data

2014-09-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153799#comment-14153799 ] Reynold Xin commented on SPARK-2247: I'm not sure about Sparkling Pandas. I think that

[jira] [Commented] (SPARK-2778) Add unit tests for Yarn integration

2014-09-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153813#comment-14153813 ] Apache Spark commented on SPARK-2778: - User 'vanzin' has created a pull request for

[jira] [Commented] (SPARK-3627) spark on yarn reports success even though job fails

2014-09-30 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153830#comment-14153830 ] Andrew Or commented on SPARK-3627: -- Ok, sounds good. spark on yarn reports success even

[jira] [Resolved] (SPARK-3744) FlumeStreamSuite will fail during port contention

2014-09-30 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-3744. -- Resolution: Fixed FlumeStreamSuite will fail during port contention

[jira] [Updated] (SPARK-3744) FlumeStreamSuite will fail during port contention

2014-09-30 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-3744: - Fix Version/s: 1.2.0 FlumeStreamSuite will fail during port contention

[jira] [Commented] (SPARK-3541) Improve ALS internal storage

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153911#comment-14153911 ] Xiangrui Meng commented on SPARK-3541: -- I put the implementation at

[jira] [Commented] (SPARK-3479) Have Jenkins show which tests failed in his GitHub messages

2014-09-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153914#comment-14153914 ] Apache Spark commented on SPARK-3479: - User 'nchammas' has created a pull request for

[jira] [Created] (SPARK-3752) Spark SQL needs more exhaustive tests for definite Hive UDF's

2014-09-30 Thread Vida Ha (JIRA)
Vida Ha created SPARK-3752: -- Summary: Spark SQL needs more exhaustive tests for definite Hive UDF's Key: SPARK-3752 URL: https://issues.apache.org/jira/browse/SPARK-3752 Project: Spark Issue Type:

[jira] [Commented] (SPARK-1547) Add gradient boosting algorithm to MLlib

2014-09-30 Thread Manish Amde (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154051#comment-14154051 ] Manish Amde commented on SPARK-1547: Adding Hirakendu's feedback on checkpointing

[jira] [Created] (SPARK-3753) Spark hive join results in empty with shared hive context

2014-09-30 Thread Hector Yee (JIRA)
Hector Yee created SPARK-3753: - Summary: Spark hive join results in empty with shared hive context Key: SPARK-3753 URL: https://issues.apache.org/jira/browse/SPARK-3753 Project: Spark Issue

[jira] [Resolved] (SPARK-3701) Some clean-up work after the refactoring of MLlib's SerDe for PySpark

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3701. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2548

[jira] [Created] (SPARK-3754) Spark Streaming fileSystem API is not callable from Java

2014-09-30 Thread holdenk (JIRA)
holdenk created SPARK-3754: -- Summary: Spark Streaming fileSystem API is not callable from Java Key: SPARK-3754 URL: https://issues.apache.org/jira/browse/SPARK-3754 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-3754) Spark Streaming fileSystem API is not callable from Java

2014-09-30 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-3754: - Assignee: Holden Karau Spark Streaming fileSystem API is not callable from Java
