[jira] [Commented] (SPARK-3481) HiveComparisonTest throws exception of org.apache.hadoop.hive.ql.metadata.HiveException: Database does not exist: default

2014-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129766#comment-14129766 ] Apache Spark commented on SPARK-3481: User 'chenghao-intel' has created a pull

[jira] [Created] (SPARK-3484) Can updateStateByKey hold only last 10 keys?

2014-09-11 Thread Yadong Qi (JIRA)
Yadong Qi created SPARK-3484: Summary: Can updateStateByKey hold only last 10 keys? Key: SPARK-3484 URL: https://issues.apache.org/jira/browse/SPARK-3484 Project: Spark Issue Type: Question

[jira] [Commented] (SPARK-558) Simplify run script by relying on sbt to launch app

2014-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129798#comment-14129798 ] Sean Owen commented on SPARK-558: Is this stale too? given that SBT is less used, I don't

[jira] [Commented] (SPARK-558) Simplify run script by relying on sbt to launch app

2014-09-11 Thread Ismael Juma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129801#comment-14129801 ] Ismael Juma commented on SPARK-558: Probably stale, yes. Simplify run script by

[jira] [Resolved] (SPARK-683) Spark 0.7 with Hadoop 1.0 does not work with current AMI's HDFS installation

2014-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-683. - Resolution: Fixed I think this is likely long since obsolete or fixed, since Spark, Hadoop and AMI Hadoop

[jira] [Commented] (SPARK-880) When built with Hadoop2, spark-shell and examples don't initialize log4j properly

2014-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129803#comment-14129803 ] Sean Owen commented on SPARK-880: This should be resolved/obsoleted by subsequent updates

[jira] [Created] (SPARK-3485) should check parameter type when find constructors

2014-09-11 Thread Adrian Wang (JIRA)
Adrian Wang created SPARK-3485: -- Summary: should check parameter type when find constructors Key: SPARK-3485 URL: https://issues.apache.org/jira/browse/SPARK-3485 Project: Spark Issue Type:

[jira] [Commented] (SPARK-3480) Throws out Not a valid command 'yarn-alpha/scalastyle' in dev/scalastyle for sbt build tool during 'Running Scala style checks'

2014-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129824#comment-14129824 ] Apache Spark commented on SPARK-3480: User 'jameszhouyi' has created a pull request

[jira] [Created] (SPARK-3486) Add PySpark support for Word2Vec

2014-09-11 Thread Liquan Pei (JIRA)
Liquan Pei created SPARK-3486: - Summary: Add PySpark support for Word2Vec Key: SPARK-3486 URL: https://issues.apache.org/jira/browse/SPARK-3486 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-3482) Allow symlinking to scripts (spark-shell, spark-submit, ...)

2014-09-11 Thread Radim Kolar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129850#comment-14129850 ] Radim Kolar commented on SPARK-3482: yes, SPARK-2960 is same issue. Allow symlinking

[jira] [Commented] (SPARK-2182) Scalastyle rule blocking unicode operators

2014-09-11 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129865#comment-14129865 ] Prashant Sharma commented on SPARK-2182: We will have to come up with a regex that

[jira] [Commented] (SPARK-2182) Scalastyle rule blocking unicode operators

2014-09-11 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129867#comment-14129867 ] Prashant Sharma commented on SPARK-2182: Or there is even better Regex for all

[jira] [Commented] (SPARK-3485) should check parameter type when find constructors

2014-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129870#comment-14129870 ] Apache Spark commented on SPARK-3485: User 'adrian-wang' has created a pull request

[jira] [Commented] (SPARK-3486) Add PySpark support for Word2Vec

2014-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129871#comment-14129871 ] Apache Spark commented on SPARK-3486: User 'Ishiihara' has created a pull request for

[jira] [Closed] (SPARK-3484) Can updateStateByKey hold only last 10 keys?

2014-09-11 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi closed SPARK-3484. Resolution: Not a Problem Can updateStateByKey hold only last 10 keys?

[jira] [Updated] (SPARK-3486) [MLlib]Add PySpark support for Word2Vec

2014-09-11 Thread Liquan Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liquan Pei updated SPARK-3486: -- Summary: [MLlib]Add PySpark support for Word2Vec (was: Add PySpark support for Word2Vec) [MLlib]Add

[jira] [Updated] (SPARK-3477) Clean up code in Yarn Client / ClientBase

2014-09-11 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-3477: - Description: With the addition of new features and supporting multiple versions of yarn the code

[jira] [Resolved] (SPARK-2140) yarn stable client doesn't properly handle MEMORY_OVERHEAD for AM

2014-09-11 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-2140. -- Resolution: Fixed Fix Version/s: (was: 1.0.1) (was:

[jira] [Comment Edited] (SPARK-3435) Distributed matrix multiplication

2014-09-11 Thread Gaurav Mishra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130008#comment-14130008 ] Gaurav Mishra edited comment on SPARK-3435 at 9/11/14 1:26 PM:

[jira] [Updated] (SPARK-2532) Fix issues with consolidated shuffle

2014-09-11 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-2532: -- Component/s: Shuffle Fix issues with consolidated shuffle

[jira] [Created] (SPARK-3489) support rdd.zip(rdd1, rdd2,...) with variable number of rdds as params

2014-09-11 Thread Mohit Jaggi (JIRA)
Mohit Jaggi created SPARK-3489: -- Summary: support rdd.zip(rdd1, rdd2,...) with variable number of rdds as params Key: SPARK-3489 URL: https://issues.apache.org/jira/browse/SPARK-3489 Project: Spark

[jira] [Updated] (SPARK-3158) Avoid 1 extra aggregation for DecisionTree training

2014-09-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3158: - Description: Improvement: computation Currently, the implementation does one unnecessary

[jira] [Created] (SPARK-3490) Alleviate port collisions during tests

2014-09-11 Thread Andrew Or (JIRA)
Andrew Or created SPARK-3490: Summary: Alleviate port collisions during tests Key: SPARK-3490 URL: https://issues.apache.org/jira/browse/SPARK-3490 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-3172) Distinguish between shuffle spill on the map and reduce side

2014-09-11 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130278#comment-14130278 ] Andrew Ash commented on SPARK-3172: Sandy do you mean distinguish between these two in

[jira] [Updated] (SPARK-2791) Fix committing, reverting and state tracking in shuffle file consolidation

2014-09-11 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-2791: -- Component/s: Shuffle Fix committing, reverting and state tracking in shuffle file consolidation

[jira] [Updated] (SPARK-1239) Don't fetch all map output statuses at each reducer during shuffles

2014-09-11 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-1239: -- Component/s: Shuffle Don't fetch all map output statuses at each reducer during shuffles

[jira] [Updated] (SPARK-3277) LZ4 compression cause the the ExternalSort exception

2014-09-11 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-3277: -- Component/s: Shuffle LZ4 compression cause the the ExternalSort exception

[jira] [Closed] (SPARK-3487) Remove unused import in ApplicationMaster

2014-09-11 Thread Kousuke Saruta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta closed SPARK-3487. - Resolution: Unresolved Remove unused import in ApplicationMaster

[jira] [Created] (SPARK-3491) Use pickle to serialize the data in MLlib Python

2014-09-11 Thread Davies Liu (JIRA)
Davies Liu created SPARK-3491: - Summary: Use pickle to serialize the data in MLlib Python Key: SPARK-3491 URL: https://issues.apache.org/jira/browse/SPARK-3491 Project: Spark Issue Type:

[jira] [Created] (SPARK-3492) Clean up Yarn integration code

2014-09-11 Thread Andrew Or (JIRA)
Andrew Or created SPARK-3492: Summary: Clean up Yarn integration code Key: SPARK-3492 URL: https://issues.apache.org/jira/browse/SPARK-3492 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-2933) Refactor and cleanup Yarn AM code

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2933: - Summary: Refactor and cleanup Yarn AM code (was: Cleanup unnecessary and duplicated code in Yarn module)

[jira] [Updated] (SPARK-2933) Cleanup unnecessary and duplicated code in Yarn module

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2933: - Issue Type: Sub-task (was: Improvement) Parent: SPARK-3492 Cleanup unnecessary and duplicated

[jira] [Updated] (SPARK-3421) StructField.toString should quote the name field to allow arbitrary character as struct field name

2014-09-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3421: Target Version/s: 1.2.0 (was: 1.1.0) StructField.toString should quote the name field to

[jira] [Resolved] (SPARK-2560) Create Spark SQL syntax reference

2014-09-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2560. - Resolution: Fixed Fix Version/s: 1.1.0 Assignee: Michael Armbrust The 1.1

[jira] [Commented] (SPARK-2560) Create Spark SQL syntax reference

2014-09-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130464#comment-14130464 ] Nicholas Chammas commented on SPARK-2560: [~marmbrus] - Would that be [this

[jira] [Commented] (SPARK-3172) Distinguish between shuffle spill on the map and reduce side

2014-09-11 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130475#comment-14130475 ] Sandy Ryza commented on SPARK-3172: I mean in the web UI (which will require

[jira] [Resolved] (SPARK-3047) add an option to use str in textFileRDD()

2014-09-11 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3047. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 1951

[jira] [Commented] (SPARK-2560) Create Spark SQL syntax reference

2014-09-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130492#comment-14130492 ] Michael Armbrust commented on SPARK-2560: Yeah, that coupled with the earlier

[jira] [Resolved] (SPARK-2917) Avoid CTAS creates table in logical plan analyzing.

2014-09-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2917. - Resolution: Fixed Fix Version/s: 1.2.0 Avoid CTAS creates table in logical plan

[jira] [Updated] (SPARK-1047) Ability to disable the spark ui server (unit tests)

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-1047: - Target Version/s: 1.2.0 Assignee: Andrew Or Ability to disable the spark ui server (unit

[jira] [Commented] (SPARK-1047) Ability to disable the spark ui server (unit tests)

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130521#comment-14130521 ] Andrew Or commented on SPARK-1047: Here's the more updated PR:

[jira] [Updated] (SPARK-3490) Alleviate port collisions during tests

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3490: - Target Version/s: 1.1.1, 1.2.0 (was: 1.1.1) Alleviate port collisions during tests

[jira] [Created] (SPARK-3493) Unrolling behavior is too aggressive in dropping blocks

2014-09-11 Thread Andrew Or (JIRA)
Andrew Or created SPARK-3493: Summary: Unrolling behavior is too aggressive in dropping blocks Key: SPARK-3493 URL: https://issues.apache.org/jira/browse/SPARK-3493 Project: Spark Issue Type:

[jira] [Updated] (SPARK-3493) Unrolling behavior is too aggressive in dropping blocks

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3493: - Priority: Major (was: Critical) Unrolling behavior is too aggressive in dropping blocks

[jira] [Updated] (SPARK-3493) Unrolling behavior is too aggressive in dropping blocks

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3493: - Issue Type: Bug (was: Improvement) Unrolling behavior is too aggressive in dropping blocks

[jira] [Commented] (SPARK-2560) Create Spark SQL syntax reference

2014-09-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130580#comment-14130580 ] Nicholas Chammas commented on SPARK-2560: I guess that's good for starters, but I

[jira] [Commented] (SPARK-3390) sqlContext.jsonRDD fails on a complex structure of JSON array and JSON object nesting

2014-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130645#comment-14130645 ] Apache Spark commented on SPARK-3390: User 'yhuai' has created a pull request for

[jira] [Commented] (SPARK-3492) Clean up Yarn integration code

2014-09-11 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130736#comment-14130736 ] Marcelo Vanzin commented on SPARK-3492: If you want to keep track, SPARK-3187 is

[jira] [Updated] (SPARK-3374) Spark on Yarn config cleanup

2014-09-11 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-3374: - Issue Type: Sub-task (was: Improvement) Parent: SPARK-3492 Spark on Yarn config cleanup

[jira] [Resolved] (SPARK-1891) Add admin acls to the Web UI

2014-09-11 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-1891. -- Resolution: Fixed Add admin acls to the Web UI

[jira] [Commented] (SPARK-3250) More Efficient Sampling

2014-09-11 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130809#comment-14130809 ] Erik Erlandson commented on SPARK-3250: I developed prototype iterator classes for

[jira] [Resolved] (SPARK-3390) sqlContext.jsonRDD fails on a complex structure of JSON array and JSON object nesting

2014-09-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-3390. - Resolution: Fixed Fix Version/s: 1.2.0 sqlContext.jsonRDD fails on a complex

[jira] [Updated] (SPARK-2669) Hadoop configuration is not localised when submitting job in yarn-cluster mode

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2669: - Affects Version/s: 1.0.0 Hadoop configuration is not localised when submitting job in yarn-cluster mode

[jira] [Updated] (SPARK-2669) Hadoop configuration is not localised when submitting job in yarn-cluster mode

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2669: - Component/s: YARN Hadoop configuration is not localised when submitting job in yarn-cluster mode

[jira] [Created] (SPARK-3494) DecisionTree overflow error in calculating maxMemoryUsage

2014-09-11 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3494: Summary: DecisionTree overflow error in calculating maxMemoryUsage Key: SPARK-3494 URL: https://issues.apache.org/jira/browse/SPARK-3494 Project: Spark

[jira] [Commented] (SPARK-1764) EOF reached before Python server acknowledged

2014-09-11 Thread Fi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130874#comment-14130874 ] Fi commented on SPARK-1764: FYI the workaround described in SPARK-2282 got me past the same EOF

[jira] [Updated] (SPARK-2058) SPARK_CONF_DIR should override all present configs

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2058: - Fix Version/s: (was: 1.0.1) (was: 1.1.0) SPARK_CONF_DIR should override all

[jira] [Updated] (SPARK-2058) SPARK_CONF_DIR should override all present configs

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2058: - Target Version/s: 1.1.1, 1.2.0 (was: 1.0.1, 1.1.1, 1.2.0) SPARK_CONF_DIR should override all present

[jira] [Updated] (SPARK-2058) SPARK_CONF_DIR should override all present configs

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2058: - Target Version/s: 1.0.1, 1.1.1, 1.2.0 (was: 1.0.1) SPARK_CONF_DIR should override all present configs

[jira] [Commented] (SPARK-2951) SerDeUtils.pythonToPairRDD fails on RDDs of pickled array.arrays in Python 2.6

2014-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130878#comment-14130878 ] Apache Spark commented on SPARK-2951: User 'davies' has created a pull request for

[jira] [Commented] (SPARK-3250) More Efficient Sampling

2014-09-11 Thread RJ Nowling (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130880#comment-14130880 ] RJ Nowling commented on SPARK-3250: Great work! If these performance improvements hold

[jira] [Resolved] (SPARK-1764) EOF reached before Python server acknowledged

2014-09-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-1764. --- Resolution: Fixed This is fixed by #2282 EOF reached before Python server acknowledged

[jira] [Updated] (SPARK-1764) EOF reached before Python server acknowledged

2014-09-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-1764: -- Fix Version/s: 1.1.0 EOF reached before Python server acknowledged

[jira] [Commented] (SPARK-3490) Alleviate port collisions during tests

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130916#comment-14130916 ] Andrew Or commented on SPARK-3490: Fixed in https://github.com/apache/spark/pull/2363

[jira] [Closed] (SPARK-1047) Ability to disable the spark ui server (unit tests)

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-1047. Resolution: Fixed Fix Version/s: 1.2.0 1.1.1 Ability to disable the spark ui

[jira] [Updated] (SPARK-3490) Alleviate port collisions during tests

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3490: - Fix Version/s: 1.2.0 1.1.1 Alleviate port collisions during tests

[jira] [Closed] (SPARK-3429) Don't include the empty string as a defaultAclUser

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-3429. Resolution: Fixed Fix Version/s: 1.2.0 1.1.1 Assignee: Andrew Ash

[jira] [Commented] (SPARK-2560) Create Spark SQL syntax reference

2014-09-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130929#comment-14130929 ] Michael Armbrust commented on SPARK-2560: Honestly for now I'd point people at

[jira] [Resolved] (SPARK-3462) parquet pushdown for unionAll

2014-09-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-3462. - Resolution: Fixed Fix Version/s: 1.2.0 parquet pushdown for unionAll

[jira] [Created] (SPARK-3495) Block replication fails continuously when the replication target node is dead

2014-09-11 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-3495: Summary: Block replication fails continuously when the replication target node is dead Key: SPARK-3495 URL: https://issues.apache.org/jira/browse/SPARK-3495 Project:

[jira] [Commented] (SPARK-2045) Sort-based shuffle implementation

2014-09-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130974#comment-14130974 ] Nicholas Chammas commented on SPARK-2045: The [1.1.0 release

[jira] [Commented] (SPARK-3495) Block replication fails continuously when the replication target node is dead

2014-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130978#comment-14130978 ] Apache Spark commented on SPARK-3495: User 'tdas' has created a pull request for this

[jira] [Created] (SPARK-3496) Block replication can by mistake choose driver BlockManager as a peer for replication

2014-09-11 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-3496: Summary: Block replication can by mistake choose driver BlockManager as a peer for replication Key: SPARK-3496 URL: https://issues.apache.org/jira/browse/SPARK-3496

[jira] [Commented] (SPARK-2482) Resolve sbt warnings during build

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131005#comment-14131005 ] Andrew Or commented on SPARK-2482: Fixed in https://github.com/apache/spark/pull/1330

[jira] [Closed] (SPARK-2482) Resolve sbt warnings during build

2014-09-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-2482. Resolution: Fixed Fix Version/s: 1.2.0 Assignee: Guoqiang Li Target Version/s:

[jira] [Created] (SPARK-3497) Report serialized size of task binary

2014-09-11 Thread Sandy Ryza (JIRA)
Sandy Ryza created SPARK-3497: - Summary: Report serialized size of task binary Key: SPARK-3497 URL: https://issues.apache.org/jira/browse/SPARK-3497 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2014-09-11 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao updated SPARK-2926: --- Attachment: Spark Shuffle Test Report(contd).pdf Add MR-style (merge-sort) SortShuffleReader for

[jira] [Updated] (SPARK-3416) Add matrix operations for large data set

2014-09-11 Thread Yu Ishikawa (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Ishikawa updated SPARK-3416: --- Description: I think matrix operations for large data set would be helpful. There is a method to

[jira] [Created] (SPARK-3498) Block always replicated to the same node

2014-09-11 Thread shenhong (JIRA)
shenhong created SPARK-3498: --- Summary: Block always replicated to the same node Key: SPARK-3498 URL: https://issues.apache.org/jira/browse/SPARK-3498 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-3499) Create Spark-based distcp utility

2014-09-11 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-3499: --- Summary: Create Spark-based distcp utility Key: SPARK-3499 URL: https://issues.apache.org/jira/browse/SPARK-3499 Project: Spark Issue Type: Wish

[jira] [Updated] (SPARK-3498) Block always replicated to the same node

2014-09-11 Thread shenhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated SPARK-3498: Description: When running a spark streaming job, we should replicate receiver blocks, but all the blocks

[jira] [Updated] (SPARK-3498) Block always replicated to the same node

2014-09-11 Thread shenhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated SPARK-3498: Description: When running a spark streaming job, we should replicate receiver blocks, but all the blocks

[jira] [Updated] (SPARK-3498) Block always replicated to the same node

2014-09-11 Thread shenhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated SPARK-3498: Description: When running a spark streaming job, we should replicate receiver blocks, but all the blocks

[jira] [Updated] (SPARK-3498) Block always replicated to the same node

2014-09-11 Thread shenhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated SPARK-3498: Description: When running a spark streaming job, we should replicate receiver blocks, but all the blocks

[jira] [Updated] (SPARK-3498) Block always replicated to the same node

2014-09-11 Thread shenhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated SPARK-3498: Description: When running a spark streaming job, we should replicate receiver blocks, but all the blocks

[jira] [Commented] (SPARK-3499) Create Spark-based distcp utility

2014-09-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131054#comment-14131054 ] Nicholas Chammas commented on SPARK-3499: I'm not sure if this type of request

[jira] [Updated] (SPARK-3377) Metrics can be accidentally aggregated against our intention

2014-09-11 Thread Kousuke Saruta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-3377: -- Summary: Metrics can be accidentally aggregated against our intention (was: Don't mix metrics

[jira] [Created] (SPARK-3500) SchemaRDD from jsonRDD() has not coalesce() method

2014-09-11 Thread Davies Liu (JIRA)
Davies Liu created SPARK-3500: - Summary: SchemaRDD from jsonRDD() has not coalesce() method Key: SPARK-3500 URL: https://issues.apache.org/jira/browse/SPARK-3500 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-3501) Hive SimpleUDF will create duplicated type cast which cause exception in constant folding

2014-09-11 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3501: Summary: Hive SimpleUDF will create duplicated type cast which cause exception in constant folding Key: SPARK-3501 URL: https://issues.apache.org/jira/browse/SPARK-3501

[jira] [Created] (SPARK-3502) SO_RCVBUF and SO_SNDBUF should be bootstrap childOption, not option

2014-09-11 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-3502: -- Summary: SO_RCVBUF and SO_SNDBUF should be bootstrap childOption, not option Key: SPARK-3502 URL: https://issues.apache.org/jira/browse/SPARK-3502 Project: Spark

[jira] [Commented] (SPARK-3501) Hive SimpleUDF will create duplicated type cast which cause exception in constant folding

2014-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131120#comment-14131120 ] Apache Spark commented on SPARK-3501: User 'chenghao-intel' has created a pull

[jira] [Created] (SPARK-3503) Disable thread local cache in PooledByteBufAllocator

2014-09-11 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-3503: -- Summary: Disable thread local cache in PooledByteBufAllocator Key: SPARK-3503 URL: https://issues.apache.org/jira/browse/SPARK-3503 Project: Spark Issue Type:

[jira] [Updated] (SPARK-3018) Release all ManagedBuffers upon task completion/failure

2014-09-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-3018: --- Description: BlockFetcherIterator retains ManagedBuffers returned by BlockClient.fetchBlocks. Those

[jira] [Updated] (SPARK-3002) Maintains a connection pool and reuse clients in BlockClientFactory

2014-09-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-3002: --- Summary: Maintains a connection pool and reuse clients in BlockClientFactory (was: Reuse Netty

[jira] [Updated] (SPARK-3500) SchemaRDD from jsonRDD() has not coalesce() method

2014-09-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-3500: -- Description: {code} sqlCtx.jsonRDD(sc.parallelize(['{foo:bar}', '{foo:baz}'])).coalesce(1)

[jira] [Updated] (SPARK-3500) SchemaRDD from jsonRDD() has not coalesce() method

2014-09-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-3500: -- Description: ``` sqlCtx.jsonRDD(sc.parallelize(['{foo:bar}', '{foo:baz}'])).coalesce(1) Py4JError: