[jira] [Commented] (SPARK-2574) Avoid allocating new ArrayBuffer in groupByKey's mergeCombiner

2014-07-19 Thread Sandeep Singh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067815#comment-14067815 ] Sandeep Singh commented on SPARK-2574: -- [~sandyr] we can rewrite mergeCombiners as (c

[jira] [Commented] (SPARK-2597) Improve the code related to Table Scan

2014-07-19 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067814#comment-14067814 ] Yin Huai commented on SPARK-2597: - Hive uses HiveInputFormat as the wrapper of different I

[jira] [Created] (SPARK-2597) Improve the code related to Table Scan

2014-07-19 Thread Yin Huai (JIRA)
Yin Huai created SPARK-2597: --- Summary: Improve the code related to Table Scan Key: SPARK-2597 URL: https://issues.apache.org/jira/browse/SPARK-2597 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-2524) missing document about spark.deploy.retainedDrivers

2014-07-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2524. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1443 [https://

[jira] [Updated] (SPARK-2524) missing document about spark.deploy.retainedDrivers

2014-07-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2524: --- Assignee: Lianhui Wang > missing document about spark.deploy.retainedDrivers > --

[jira] [Resolved] (SPARK-2587) Error message is incorrect in make-distribution.sh

2014-07-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2587. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1489 [https://

[jira] [Updated] (SPARK-2587) Error message is incorrect in make-distribution.sh

2014-07-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2587: --- Assignee: Mark Wagner > Error message is incorrect in make-distribution.sh >

[jira] [Commented] (SPARK-2226) HAVING should be able to contain aggregate expressions that don't appear in the aggregation list.

2014-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067765#comment-14067765 ] Apache Spark commented on SPARK-2226: - User 'willb' has created a pull request for thi

[jira] [Resolved] (SPARK-2596) Populate pull requests on JIRA automatically

2014-07-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2596. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1496 [https://

[jira] [Commented] (SPARK-1682) Add gradient descent w/o sampling and RDA L1 updater

2014-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067739#comment-14067739 ] Apache Spark commented on SPARK-1682: - User 'dongwang218' has created a pull request f

[jira] [Issue Comment Deleted] (SPARK-2596) Populate pull requests on JIRA automatically

2014-07-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2596: --- Comment: was deleted (was: This is a test: http://google.com) > Populate pull requests on JI

[jira] [Commented] (SPARK-2596) Populate pull requests on JIRA automatically

2014-07-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067731#comment-14067731 ] Patrick Wendell commented on SPARK-2596: This is a test: http://google.com > Popu

[jira] [Commented] (SPARK-1022) Add unit tests for kafka streaming

2014-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067730#comment-14067730 ] Apache Spark commented on SPARK-1022: - User 'tdas' has created a pull request for this

[jira] [Commented] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067729#comment-14067729 ] Apache Spark commented on SPARK-1630: - User 'kalpit' has created a pull request for th

[jira] [Commented] (SPARK-1597) Add a version of reduceByKey that takes the Partitioner as a second argument

2014-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067727#comment-14067727 ] Apache Spark commented on SPARK-1597: - User 'techaddict' has created a pull request fo

[jira] [Commented] (SPARK-1623) SPARK-1623. Broadcast cleaner should use getCanonicalPath when deleting files by name

2014-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067725#comment-14067725 ] Apache Spark commented on SPARK-1623: - User 'nsuthar' has created a pull request for t

[jira] [Created] (SPARK-2596) Populate pull requests on JIRA automatically

2014-07-19 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-2596: -- Summary: Populate pull requests on JIRA automatically Key: SPARK-2596 URL: https://issues.apache.org/jira/browse/SPARK-2596 Project: Spark Issue Type: Bu

[jira] [Commented] (SPARK-1795) Add recursive directory file search to fileInputStream

2014-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067722#comment-14067722 ] Apache Spark commented on SPARK-1795: - User 'patrickotoole' has created a pull request

[jira] [Commented] (SPARK-1612) Potential resource leaks in Utils.copyStream and Utils.offsetBytes

2014-07-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067720#comment-14067720 ] Patrick Wendell commented on SPARK-1612: A pull request has been posted for this i

[jira] [Issue Comment Deleted] (SPARK-1580) ALS: Estimate communication and computation costs given a partitioner

2014-07-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1580: --- Comment: was deleted (was: A pull request has been posted for this issue: Author: tmyklebu URL

[jira] [Commented] (SPARK-1581) Allow One Flume Avro RPC Server for Each Worker rather than Just One Worker

2014-07-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067719#comment-14067719 ] Patrick Wendell commented on SPARK-1581: A pull request has been posted for this i

[jira] [Commented] (SPARK-1580) ALS: Estimate communication and computation costs given a partitioner

2014-07-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067718#comment-14067718 ] Patrick Wendell commented on SPARK-1580: A pull request has been posted for this i

[jira] [Commented] (SPARK-1981) Add AWS Kinesis streaming support

2014-07-19 Thread Chris Fregly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067696#comment-14067696 ] Chris Fregly commented on SPARK-1981: - [~pwendell] is there anything I need to do wit

[jira] [Commented] (SPARK-2595) The driver run garbage collection, when the executor throws OutOfMemoryError exception

2014-07-19 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067653#comment-14067653 ] Guoqiang Li commented on SPARK-2595: Sorry I removed it. > The driver run garbage col

[jira] [Updated] (SPARK-2595) The driver run garbage collection, when the executor throws OutOfMemoryError exception

2014-07-19 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-2595: --- Description: [SPARK-1103|https://issues.apache.org/jira/browse/SPARK-1103] implementation GC-based c

[jira] [Comment Edited] (SPARK-2595) The driver run garbage collection, when the executor throws OutOfMemoryError exception

2014-07-19 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067653#comment-14067653 ] Guoqiang Li edited comment on SPARK-2595 at 7/19/14 7:45 PM: -

[jira] [Commented] (SPARK-2595) The driver run garbage collection, when the executor throws OutOfMemoryError exception

2014-07-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067651#comment-14067651 ] Patrick Wendell commented on SPARK-2595: I was not proposing that we should do thi

[jira] [Updated] (SPARK-2595) The driver run garbage collection, when the executor throws OutOfMemoryError exception

2014-07-19 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-2595: --- Description: [SPARK-1103|https://issues.apache.org/jira/browse/SPARK-1103] implementation GC-based c

[jira] [Created] (SPARK-2595) The driver run garbage collection, when the executor throws OutOfMemoryError exception

2014-07-19 Thread Guoqiang Li (JIRA)
Guoqiang Li created SPARK-2595: -- Summary: The driver run garbage collection, when the executor throws OutOfMemoryError exception Key: SPARK-2595 URL: https://issues.apache.org/jira/browse/SPARK-2595 Proj

[jira] [Commented] (SPARK-2591) Add config property to disable incremental collection used in Thrift server

2014-07-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067563#comment-14067563 ] Michael Armbrust commented on SPARK-2591: - We should benchmark this and make sure

[jira] [Created] (SPARK-2594) Add CACHE TABLE AS SELECT ...

2014-07-19 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-2594: --- Summary: Add CACHE TABLE AS SELECT ... Key: SPARK-2594 URL: https://issues.apache.org/jira/browse/SPARK-2594 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-2576) slave node throws NoClassDefFoundError $line11.$read$ when executing a Spark QL query on HDFS CSV file

2014-07-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2576: Target Version/s: 1.1.0 > slave node throws NoClassDefFoundError $line11.$read$ when execut

[jira] [Resolved] (SPARK-2591) Add config property to disable incremental collection used in Thrift server

2014-07-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2591. - Resolution: Duplicate > Add config property to disable incremental collection used in Thr

[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-07-19 Thread Helena Edelson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067540#comment-14067540 ] Helena Edelson commented on SPARK-2593: --- I should note that I'd be happy to do the c

[jira] [Created] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-07-19 Thread Helena Edelson (JIRA)
Helena Edelson created SPARK-2593: - Summary: Add ability to pass an existing Akka ActorSystem into Spark Key: SPARK-2593 URL: https://issues.apache.org/jira/browse/SPARK-2593 Project: Spark

[jira] [Commented] (SPARK-2420) Change Spark build to minimize library conflicts

2014-07-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067512#comment-14067512 ] Sean Owen commented on SPARK-2420: -- https://github.com/srowen/spark/commit/f111393131008b

[jira] [Commented] (SPARK-1997) Update breeze to version 0.8.1

2014-07-19 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067504#comment-14067504 ] Guoqiang Li commented on SPARK-1997: I'm sorry I came late. The breeze 0.8.1 jar has {{

[jira] [Commented] (SPARK-2226) HAVING should be able to contain aggregate expressions that don't appear in the aggregation list.

2014-07-19 Thread William Benton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067503#comment-14067503 ] William Benton commented on SPARK-2226: --- [~rxin], yes, and I'm mostly done. I'll po

[jira] [Commented] (SPARK-2552) Stabilize the computation of logistic function in pyspark

2014-07-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067449#comment-14067449 ] Xiangrui Meng commented on SPARK-2552: -- PR: https://github.com/apache/spark/pull/1493

[jira] [Commented] (SPARK-1997) Update breeze to version 0.8.1

2014-07-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067450#comment-14067450 ] Xiangrui Meng commented on SPARK-1997: -- PR: https://github.com/apache/spark/pull/940

[jira] [Commented] (SPARK-2495) Ability to re-create ML models

2014-07-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067445#comment-14067445 ] Xiangrui Meng commented on SPARK-2495: -- I sent out a PR for linear models: https://gi

[jira] [Commented] (SPARK-2552) Stabilize the computation of logistic function in pyspark

2014-07-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067419#comment-14067419 ] Xiangrui Meng commented on SPARK-2552: -- It is not necessary to check the ranges becau