[jira] [Commented] (SPARK-2633) support register spark listener to listener bus with Java API

2014-07-23 Thread Chengxiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071402#comment-14071402 ] Chengxiang Li commented on SPARK-2633: -- For Hive job status monitor, spark listener

[jira] [Updated] (SPARK-2630) Input data size of CoalescedRDD is incorrect

2014-07-23 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-2630: -- Summary: Input data size of CoalescedRDD is incorrect (was: Input data size goes overflow when size

[jira] [Created] (SPARK-2640) In local[N], free cores of the only executor should be touched by spark.task.cpus for every finish/start-up of tasks.

2014-07-23 Thread woshilaiceshide (JIRA)
woshilaiceshide created SPARK-2640: -- Summary: In local[N], free cores of the only executor should be touched by spark.task.cpus for every finish/start-up of tasks. Key: SPARK-2640 URL:

[jira] [Commented] (SPARK-2640) In local[N], free cores of the only executor should be touched by spark.task.cpus for every finish/start-up of tasks.

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071448#comment-14071448 ] Apache Spark commented on SPARK-2640: - User 'woshilaiceshide' has created a pull

[jira] [Updated] (SPARK-1516) Yarn Client should not call System.exit, should throw exception instead.

2014-07-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1516: - Assignee: John Zhao Yarn Client should not call System.exit, should throw exception instead.

[jira] [Updated] (SPARK-1935) Explicitly add commons-codec 1.5 as a dependency

2014-07-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1935: - Fix Version/s: 0.9.2 Explicitly add commons-codec 1.5 as a dependency

[jira] [Created] (SPARK-2641) Spark submit doesn't pick up executor instances from properties file

2014-07-23 Thread Kanwaljit Singh (JIRA)
Kanwaljit Singh created SPARK-2641: -- Summary: Spark submit doesn't pick up executor instances from properties file Key: SPARK-2641 URL: https://issues.apache.org/jira/browse/SPARK-2641 Project:

[jira] [Updated] (SPARK-2641) Spark submit doesn't pick up executor instances from properties file

2014-07-23 Thread Kanwaljit Singh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kanwaljit Singh updated SPARK-2641: --- Description: When running spark-submit in Yarn cluster mode, we provide properties file

[jira] [Updated] (SPARK-2641) Spark submit doesn't pick up executor instances from properties file

2014-07-23 Thread Kanwaljit Singh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kanwaljit Singh updated SPARK-2641: --- Description: When running spark-submit in Yarn cluster mode, we provide properties file

[jira] [Commented] (SPARK-2638) Improve concurrency of fetching Map outputs

2014-07-23 Thread Stephen Boesch (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071476#comment-14071476 ] Stephen Boesch commented on SPARK-2638: --- Upon examining the codebase further, I see

[jira] [Commented] (SPARK-2633) support register spark listener to listener bus with Java API

2014-07-23 Thread Chengxiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071478#comment-14071478 ] Chengxiang Li commented on SPARK-2633: -- add 2 more: # StageInfo class is not well

[jira] [Commented] (SPARK-2638) Improve concurrency of fetching Map outputs

2014-07-23 Thread Stephen Boesch (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071490#comment-14071490 ] Stephen Boesch commented on SPARK-2638: --- Other examples:

[jira] [Commented] (SPARK-2575) SVMWithSGD throwing Input Validation failed

2014-07-23 Thread navanee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071492#comment-14071492 ] navanee commented on SPARK-2575: yes Xiangrui Meng.. But if I use MLUtils.loadLibSVMFile

[jira] [Created] (SPARK-2642) Add jobId in web UI

2014-07-23 Thread YanTang Zhai (JIRA)
YanTang Zhai created SPARK-2642: --- Summary: Add jobId in web UI Key: SPARK-2642 URL: https://issues.apache.org/jira/browse/SPARK-2642 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-2618) use config spark.scheduler.priority for specifying TaskSet's priority on DAGScheduler

2014-07-23 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-2618: Description: We use Shark server to run interactive queries. Every SQL statement runs as a job. Sometimes we

[jira] [Updated] (SPARK-2643) Stages web ui has ERROR when pool name is None

2014-07-23 Thread YanTang Zhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YanTang Zhai updated SPARK-2643: Component/s: Web UI Stages web ui has ERROR when pool name is None

[jira] [Commented] (SPARK-2298) Show stage attempt in UI

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071542#comment-14071542 ] Apache Spark commented on SPARK-2298: - User 'rxin' has created a pull request for this

[jira] [Assigned] (SPARK-2644) Hive should not be enabled by default in the build.

2014-07-23 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma reassigned SPARK-2644: -- Assignee: Prashant Sharma Hive should not be enabled by default in the build.

[jira] [Commented] (SPARK-2644) Hive should not be enabled by default in the build.

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071556#comment-14071556 ] Apache Spark commented on SPARK-2644: - User 'ScrapCodes' has created a pull request

[jira] [Commented] (SPARK-2420) Change Spark build to minimize library conflicts

2014-07-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071571#comment-14071571 ] Sean Owen commented on SPARK-2420: -- I'd say it's already just as broken for those apps,

[jira] [Created] (SPARK-2645) Spark driver calls System.exit(50) after calling SparkContext.stop() the second time

2014-07-23 Thread Vlad Komarov (JIRA)
Vlad Komarov created SPARK-2645: --- Summary: Spark driver calls System.exit(50) after calling SparkContext.stop() the second time Key: SPARK-2645 URL: https://issues.apache.org/jira/browse/SPARK-2645

[jira] [Commented] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-23 Thread Ken Carlile (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071669#comment-14071669 ] Ken Carlile commented on SPARK-2282: Well, something didn't work quite right.. our

[jira] [Created] (SPARK-2646) log4j initialization not quite compatible with log4j 2.x

2014-07-23 Thread Sean Owen (JIRA)
Sean Owen created SPARK-2646: Summary: log4j initialization not quite compatible with log4j 2.x Key: SPARK-2646 URL: https://issues.apache.org/jira/browse/SPARK-2646 Project: Spark Issue Type:

[jira] [Created] (SPARK-2647) DAGScheduler plugs others when processing one JobSubmitted event

2014-07-23 Thread YanTang Zhai (JIRA)
YanTang Zhai created SPARK-2647: --- Summary: DAGScheduler plugs others when processing one JobSubmitted event Key: SPARK-2647 URL: https://issues.apache.org/jira/browse/SPARK-2647 Project: Spark

[jira] [Commented] (SPARK-2646) log4j initialization not quite compatible with log4j 2.x

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071745#comment-14071745 ] Apache Spark commented on SPARK-2646: - User 'srowen' has created a pull request for

[jira] [Commented] (SPARK-2647) DAGScheduler plugs others when processing one JobSubmitted event

2014-07-23 Thread YanTang Zhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071755#comment-14071755 ] YanTang Zhai commented on SPARK-2647: - I've created PR:

[jira] [Commented] (SPARK-2647) DAGScheduler plugs others when processing one JobSubmitted event

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071754#comment-14071754 ] Apache Spark commented on SPARK-2647: - User 'YanTangZhai' has created a pull request

[jira] [Commented] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-23 Thread Ken Carlile (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071767#comment-14071767 ] Ken Carlile commented on SPARK-2282: Merging just the two files also did not work. I

[jira] [Commented] (SPARK-2604) Spark Application hangs on yarn in edge case scenario of executor memory requirement

2014-07-23 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071786#comment-14071786 ] Thomas Graves commented on SPARK-2604: -- Yes we should be adding the overhead in at

[jira] [Updated] (SPARK-2484) By default does not run hive compatibility tests

2014-07-23 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma updated SPARK-2484: --- Assignee: Guoqiang Li By default does not run hive compatibility tests

[jira] [Closed] (SPARK-2644) Hive should not be enabled by default in the build.

2014-07-23 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma closed SPARK-2644. -- Resolution: Duplicate Hive should not be enabled by default in the build.

[jira] [Updated] (SPARK-2484) Build should not run hive compatibility tests by default.

2014-07-23 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma updated SPARK-2484: --- Summary: Build should not run hive compatibility tests by default. (was: By default does

[jira] [Created] (SPARK-2648) through shuffling blocksByAddress avoid much reducers to fetch data from a executor at a time

2014-07-23 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-2648: --- Summary: through shuffling blocksByAddress avoid much reducers to fetch data from a executor at a time Key: SPARK-2648 URL: https://issues.apache.org/jira/browse/SPARK-2648

[jira] [Created] (SPARK-2649) EC2: Ganglia-httpd broken on hvm based machines like r3.4xlarge

2014-07-23 Thread npanj (JIRA)
npanj created SPARK-2649: Summary: EC2: Ganglia-httpd broken on hvm based machines like r3.4xlarge Key: SPARK-2649 URL: https://issues.apache.org/jira/browse/SPARK-2649 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-2609) Log thread ID when spilling ExternalAppendOnlyMap

2014-07-23 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2609. -- Resolution: Fixed Log thread ID when spilling ExternalAppendOnlyMap

[jira] [Updated] (SPARK-2609) Log thread ID when spilling ExternalAppendOnlyMap

2014-07-23 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2609: - Assignee: Andrew Or Log thread ID when spilling ExternalAppendOnlyMap

[jira] [Commented] (SPARK-2632) Importing a method of class in Spark REPL causes the REPL to pulls in unnecessary stuff.

2014-07-23 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072044#comment-14072044 ] Yin Huai commented on SPARK-2632: - Seems the exception triggered by importing a method of

[jira] [Updated] (SPARK-2632) Importing a method of class in Spark REPL causes the REPL to pulls in unnecessary stuff.

2014-07-23 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-2632: Priority: Major (was: Blocker) Importing a method of class in Spark REPL causes the REPL to pulls in

[jira] [Resolved] (SPARK-2640) In local[N], free cores of the only executor should be touched by spark.task.cpus for every finish/start-up of tasks.

2014-07-23 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2640. -- Resolution: Fixed Fix Version/s: 1.1.0 In local[N], free cores of the only executor

[jira] [Updated] (SPARK-2640) In local[N], free cores of the only executor should be touched by spark.task.cpus for every finish/start-up of tasks.

2014-07-23 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2640: - Priority: Minor (was: Major) In local[N], free cores of the only executor should be touched by

[jira] [Updated] (SPARK-2640) In local[N], free cores of the only executor should be touched by spark.task.cpus for every finish/start-up of tasks.

2014-07-23 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2640: - Assignee: woshilaiceshide In local[N], free cores of the only executor should be touched by

[jira] [Updated] (SPARK-2277) Make TaskScheduler track whether there's host on a rack

2014-07-23 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2277: - Fix Version/s: 1.1.0 Make TaskScheduler track whether there's host on a rack

[jira] [Updated] (SPARK-2277) Make TaskScheduler track whether there's host on a rack

2014-07-23 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2277: - Assignee: Rui Li Make TaskScheduler track whether there's host on a rack

[jira] [Resolved] (SPARK-2277) Make TaskScheduler track whether there's host on a rack

2014-07-23 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2277. -- Resolution: Fixed Make TaskScheduler track whether there's host on a rack

[jira] [Created] (SPARK-2650) Wrong initial sizes for in-memory column buffers

2014-07-23 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-2650: --- Summary: Wrong initial sizes for in-memory column buffers Key: SPARK-2650 URL: https://issues.apache.org/jira/browse/SPARK-2650 Project: Spark Issue

[jira] [Resolved] (SPARK-2561) Repartitioning a SchemaRDD breaks resolution

2014-07-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2561. - Resolution: Fixed Fix Version/s: 1.0.2 1.1.0 Repartitioning a

[jira] [Updated] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-1630: Assignee: Davies Liu PythonRDDs don't handle nulls gracefully

[jira] [Updated] (SPARK-2650) Wrong initial sizes for in-memory column buffers

2014-07-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2650: Target Version/s: 1.1.0 Wrong initial sizes for in-memory column buffers

[jira] [Assigned] (SPARK-2569) Customized UDFs in hive not running with Spark SQL

2014-07-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reassigned SPARK-2569: --- Assignee: Michael Armbrust Customized UDFs in hive not running with Spark SQL

[jira] [Updated] (SPARK-2569) Customized UDFs in hive not running with Spark SQL

2014-07-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2569: Priority: Critical (was: Major) Target Version/s: 1.1.0 Customized UDFs in

[jira] [Commented] (SPARK-2576) slave node throws NoClassDefFoundError $line11.$read$ when executing a Spark QL query on HDFS CSV file

2014-07-23 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072076#comment-14072076 ] Yin Huai commented on SPARK-2576: - [~prashant_] I have also created a [REPL

[jira] [Created] (SPARK-2651) Add maven scalastyle plugin

2014-07-23 Thread Rahul Singhal (JIRA)
Rahul Singhal created SPARK-2651: Summary: Add maven scalastyle plugin Key: SPARK-2651 URL: https://issues.apache.org/jira/browse/SPARK-2651 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-2575) SVMWithSGD throwing Input Validation failed

2014-07-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072140#comment-14072140 ] Xiangrui Meng commented on SPARK-2575: -- loadLibSVMFile converts labels to binary by

[jira] [Commented] (SPARK-2651) Add maven scalastyle plugin

2014-07-23 Thread Rahul Singhal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072156#comment-14072156 ] Rahul Singhal commented on SPARK-2651: -- PR: https://github.com/apache/spark/pull/1550

[jira] [Commented] (SPARK-2651) Add maven scalastyle plugin

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072164#comment-14072164 ] Apache Spark commented on SPARK-2651: - User 'rahulsinghaliitd' has created a pull

[jira] [Commented] (SPARK-2567) Resubmitted stage sometimes remains as active stage in the web UI

2014-07-23 Thread Masayoshi TSUZUKI (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072161#comment-14072161 ] Masayoshi TSUZUKI commented on SPARK-2567: -- I noticed now but the cause of

[jira] [Commented] (SPARK-2642) Add jobId in web UI

2014-07-23 Thread Masayoshi TSUZUKI (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072171#comment-14072171 ] Masayoshi TSUZUKI commented on SPARK-2642: -- Is this the same as [SPARK-1362] ?

[jira] [Updated] (SPARK-1362) Web UI should provide page of showing statistics and stage list for a given job

2014-07-23 Thread Masayoshi TSUZUKI (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masayoshi TSUZUKI updated SPARK-1362: - Component/s: Web UI Web UI should provide page of showing statistics and stage list for

[jira] [Updated] (SPARK-2649) EC2: Ganglia-httpd broken on hvm based machines like r3.4xlarge

2014-07-23 Thread npanj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] npanj updated SPARK-2649: - Description: On EC2, the httpd daemon doesn't start (so Ganglia is not accessible) on HVM machines like r3.4xlarge(

[jira] [Updated] (SPARK-2649) EC2: Ganglia-httpd broken on hvm based machines like r3.4xlarge

2014-07-23 Thread npanj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] npanj updated SPARK-2649: - Description: On EC2, the httpd daemon doesn't start (so Ganglia is not accessible) on HVM machines like r3.4xlarge(

[jira] [Updated] (SPARK-2649) EC2: Ganglia-httpd broken on hvm based machines like r3.4xlarge

2014-07-23 Thread npanj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] npanj updated SPARK-2649: - Description: On EC2, the httpd daemon doesn't start (so Ganglia is not accessible) on HVM machines like r3.4xlarge(

[jira] [Commented] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072235#comment-14072235 ] Apache Spark commented on SPARK-1630: - User 'davies' has created a pull request for

[jira] [Commented] (SPARK-2420) Change Spark build to minimize library conflicts

2014-07-23 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072259#comment-14072259 ] Marcelo Vanzin commented on SPARK-2420: --- Hi Sean, I agree in part about the

[jira] [Commented] (SPARK-2569) Customized UDFs in hive not running with Spark SQL

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072258#comment-14072258 ] Apache Spark commented on SPARK-2569: - User 'marmbrus' has created a pull request for

[jira] [Created] (SPARK-2652) Turning default configurations for PySpark

2014-07-23 Thread Davies Liu (JIRA)
Davies Liu created SPARK-2652: - Summary: Turning default configurations for PySpark Key: SPARK-2652 URL: https://issues.apache.org/jira/browse/SPARK-2652 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-2653) Heap size should be the sum of driver.memory and executor.memory in local mode

2014-07-23 Thread Davies Liu (JIRA)
Davies Liu created SPARK-2653: - Summary: Heap size should be the sum of driver.memory and executor.memory in local mode Key: SPARK-2653 URL: https://issues.apache.org/jira/browse/SPARK-2653 Project:

[jira] [Created] (SPARK-2654) Leveled logging in PySpark

2014-07-23 Thread Davies Liu (JIRA)
Davies Liu created SPARK-2654: - Summary: Leveled logging in PySpark Key: SPARK-2654 URL: https://issues.apache.org/jira/browse/SPARK-2654 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-2655) Change the default logging level to WARN

2014-07-23 Thread Davies Liu (JIRA)
Davies Liu created SPARK-2655: - Summary: Change the default logging level to WARN Key: SPARK-2655 URL: https://issues.apache.org/jira/browse/SPARK-2655 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-2648) through shuffling blocksByAddress avoid much reducers to fetch data from a executor at a time

2014-07-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2648: --- Priority: Critical (was: Major) through shuffling blocksByAddress avoid much reducers to fetch

[jira] [Updated] (SPARK-2648) through shuffling blocksByAddress avoid much reducers to fetch data from a executor at a time

2014-07-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2648: --- Target Version/s: 1.1.0 Assignee: Lianhui Wang through shuffling blocksByAddress avoid

[jira] [Resolved] (SPARK-2588) Add some more DSLs.

2014-07-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2588. - Resolution: Fixed Fix Version/s: 1.1.0 Assignee: Takuya Ueshin Add some

[jira] [Resolved] (SPARK-1726) Tasks that fail to serialize remain in active stages forever.

2014-07-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-1726. - Resolution: Fixed Fix Version/s: 1.1.0 [~kayousterhout] reports this is fixed in

[jira] [Reopened] (SPARK-1726) Tasks that fail to serialize remain in active stages forever.

2014-07-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout reopened SPARK-1726: --- Tasks that fail to serialize remain in active stages forever.

[jira] [Resolved] (SPARK-2226) HAVING should be able to contain aggregate expressions that don't appear in the aggregation list.

2014-07-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2226. - Resolution: Fixed Fix Version/s: 1.1.0 HAVING should be able to contain

[jira] [Resolved] (SPARK-2569) Customized UDFs in hive not running with Spark SQL

2014-07-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2569. - Resolution: Fixed Fix Version/s: 1.1.0 Customized UDFs in hive not running with

[jira] [Resolved] (SPARK-2102) Caching with GENERIC column type causes query execution to slow down significantly

2014-07-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2102. - Resolution: Fixed Fix Version/s: 1.1.0 Caching with GENERIC column type causes

[jira] [Created] (SPARK-2656) Python version without support for exact sample size

2014-07-23 Thread Doris Xin (JIRA)
Doris Xin created SPARK-2656: Summary: Python version without support for exact sample size Key: SPARK-2656 URL: https://issues.apache.org/jira/browse/SPARK-2656 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2633) support register spark listener to listener bus with Java API

2014-07-23 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072579#comment-14072579 ] Marcelo Vanzin commented on SPARK-2633: --- So, being able to register listeners is

[jira] [Commented] (SPARK-2656) Python version without support for exact sample size

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072585#comment-14072585 ] Apache Spark commented on SPARK-2656: - User 'dorx' has created a pull request for this

[jira] [Created] (SPARK-2657) Use more compact data structures than ArrayBuffer in groupBy and cogroup

2014-07-23 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-2657: Summary: Use more compact data structures than ArrayBuffer in groupBy and cogroup Key: SPARK-2657 URL: https://issues.apache.org/jira/browse/SPARK-2657 Project:

[jira] [Commented] (SPARK-2657) Use more compact data structures than ArrayBuffer in groupBy and cogroup

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072608#comment-14072608 ] Apache Spark commented on SPARK-2657: - User 'mateiz' has created a pull request for

[jira] [Assigned] (SPARK-2574) Avoid allocating new ArrayBuffer in groupByKey's mergeCombiner

2014-07-23 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia reassigned SPARK-2574: Assignee: Matei Zaharia Avoid allocating new ArrayBuffer in groupByKey's mergeCombiner

[jira] [Commented] (SPARK-2574) Avoid allocating new ArrayBuffer in groupByKey's mergeCombiner

2014-07-23 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072605#comment-14072605 ] Matei Zaharia commented on SPARK-2574: -- I implemented this as part of

[jira] [Resolved] (SPARK-2549) Functions defined inside of other functions trigger failures

2014-07-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2549. Resolution: Fixed Issue resolved by pull request 1510

[jira] [Updated] (SPARK-2574) Avoid allocating new ArrayBuffer in groupByKey's mergeCombiner

2014-07-23 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2574: - Priority: Trivial (was: Major) Avoid allocating new ArrayBuffer in groupByKey's mergeCombiner

[jira] [Created] (SPARK-2658) HiveQL: 1 = true should evaluate to true

2014-07-23 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-2658: --- Summary: HiveQL: 1 = true should evaluate to true Key: SPARK-2658 URL: https://issues.apache.org/jira/browse/SPARK-2658 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2658) HiveQL: 1 = true should evaluate to true

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072629#comment-14072629 ] Apache Spark commented on SPARK-2658: - User 'marmbrus' has created a pull request for

[jira] [Created] (SPARK-2659) HiveQL: Division operator should always perform fractional division

2014-07-23 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-2659: --- Summary: HiveQL: Division operator should always perform fractional division Key: SPARK-2659 URL: https://issues.apache.org/jira/browse/SPARK-2659 Project:

[jira] [Commented] (SPARK-2659) HiveQL: Division operator should always perform fractional division

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072655#comment-14072655 ] Apache Spark commented on SPARK-2659: - User 'marmbrus' has created a pull request for

[jira] [Updated] (SPARK-2316) StorageStatusListener should avoid O(blocks) operations

2014-07-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2316: --- Target Version/s: 1.1.0 (was: 1.0.2) StorageStatusListener should avoid O(blocks)

[jira] [Updated] (SPARK-2316) StorageStatusListener should avoid O(blocks) operations

2014-07-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2316: --- Priority: Critical (was: Major) StorageStatusListener should avoid O(blocks) operations

[jira] [Commented] (SPARK-2458) Make failed application log visible on History Server

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072665#comment-14072665 ] Apache Spark commented on SPARK-2458: - User 'tsudukim' has created a pull request for

[jira] [Created] (SPARK-2660) Enable pretty-printing SchemaRDD Rows

2014-07-23 Thread Aaron Davidson (JIRA)
Aaron Davidson created SPARK-2660: - Summary: Enable pretty-printing SchemaRDD Rows Key: SPARK-2660 URL: https://issues.apache.org/jira/browse/SPARK-2660 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2010) Support for nested data in PySpark SQL

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072681#comment-14072681 ] Apache Spark commented on SPARK-2010: - User 'davies' has created a pull request for

[jira] [Updated] (SPARK-2648) Randomize order of executors when fetching shuffle blocks

2014-07-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2648: --- Summary: Randomize order of executors when fetching shuffle blocks (was: through shuffling

[jira] [Commented] (SPARK-2660) Enable pretty-printing SchemaRDD Rows

2014-07-23 Thread Larry Xiao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072687#comment-14072687 ] Larry Xiao commented on SPARK-2660: --- I think this one is suitable for newbie like me,

[jira] [Created] (SPARK-2661) Unpersist last RDD in bagel iteration

2014-07-23 Thread Adrian Wang (JIRA)
Adrian Wang created SPARK-2661: -- Summary: Unpersist last RDD in bagel iteration Key: SPARK-2661 URL: https://issues.apache.org/jira/browse/SPARK-2661 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-2662) Fix NPE for JsonProtocol

2014-07-23 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072689#comment-14072689 ] Guoqiang Li commented on SPARK-2662: PR: https://github.com/apache/spark/pull/1511

[jira] [Created] (SPARK-2662) Fix NPE for JsonProtocol

2014-07-23 Thread Guoqiang Li (JIRA)
Guoqiang Li created SPARK-2662: -- Summary: Fix NPE for JsonProtocol Key: SPARK-2662 URL: https://issues.apache.org/jira/browse/SPARK-2662 Project: Spark Issue Type: Bug Components:

[jira] [Commented] (SPARK-2568) RangePartitioner should go through the data only once

2014-07-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072729#comment-14072729 ] Apache Spark commented on SPARK-2568: - User 'mengxr' has created a pull request for
