[jira] [Updated] (SPARK-5743) java.lang.OutOfMemoryError: Java heap space with RandomForest

2015-02-11 Thread Guillaume Charhon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guillaume Charhon updated SPARK-5743: - Description: I am training a model with a RandomForest using this code snippet:

[jira] [Commented] (SPARK-5743) java.lang.OutOfMemoryError: Java heap space with RandomForest

2015-02-11 Thread Guillaume Charhon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316474#comment-14316474 ] Guillaume Charhon commented on SPARK-5743: -- You are right. I meant 104 GB. I have

[jira] [Commented] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316393#comment-14316393 ] DjvuLee commented on SPARK-5739: Yes, 1M may be enough for the k-means algorithm. But if we

[jira] [Commented] (SPARK-5743) java.lang.OutOfMemoryError: Java heap space with RandomForest

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316427#comment-14316427 ] Sean Owen commented on SPARK-5743: -- How much memory do executors get though? Your

[jira] [Commented] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316441#comment-14316441 ] Sean Owen commented on SPARK-5739: -- It's hard to say because it depends on d, k, runs,

[jira] [Created] (SPARK-5743) java.lang.OutOfMemoryError: Java heap space with RandomForest

2015-02-11 Thread Guillaume Charhon (JIRA)
Guillaume Charhon created SPARK-5743: Summary: java.lang.OutOfMemoryError: Java heap space with RandomForest Key: SPARK-5743 URL: https://issues.apache.org/jira/browse/SPARK-5743 Project: Spark

[jira] [Comment Edited] (SPARK-5743) java.lang.OutOfMemoryError: Java heap space with RandomForest

2015-02-11 Thread Guillaume Charhon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316474#comment-14316474 ] Guillaume Charhon edited comment on SPARK-5743 at 2/11/15 4:30 PM:

[jira] [Commented] (SPARK-1302) httpd doesn't start in spark-ec2 (cc2.8xlarge)

2015-02-11 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316541#comment-14316541 ] Shivaram Venkataraman commented on SPARK-1302: -- [~soid] Could you let us

[jira] [Commented] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316430#comment-14316430 ] DjvuLee commented on SPARK-5739: OK, got it. I will look at the code for more detail. I

[jira] [Updated] (SPARK-5655) YARN Auxiliary Shuffle service can't access shuffle files on Hadoop cluster configured in secure mode

2015-02-11 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5655: - Affects Version/s: 1.3.0 YARN Auxiliary Shuffle service can't access shuffle files on Hadoop cluster

[jira] [Commented] (SPARK-4553) query for parquet table with string fields in spark sql hive get binary result

2015-02-11 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316687#comment-14316687 ] Cheng Lian commented on SPARK-4553: --- We still need {{spark.sql.parquet.binaryAsString}}

[jira] [Created] (SPARK-5745) Allow to use custom TaskMetrics implementation

2015-02-11 Thread Jacek Lewandowski (JIRA)
Jacek Lewandowski created SPARK-5745: Summary: Allow to use custom TaskMetrics implementation Key: SPARK-5745 URL: https://issues.apache.org/jira/browse/SPARK-5745 Project: Spark Issue

[jira] [Created] (SPARK-5744) RDD.isEmpty fails when rdd contains empty partitions.

2015-02-11 Thread Tobias Bertelsen (JIRA)
Tobias Bertelsen created SPARK-5744: --- Summary: RDD.isEmpty fails when rdd contains empty partitions. Key: SPARK-5744 URL: https://issues.apache.org/jira/browse/SPARK-5744 Project: Spark

[jira] [Commented] (SPARK-4553) query for parquet table with string fields in spark sql hive get binary result

2015-02-11 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316664#comment-14316664 ] Yin Huai commented on SPARK-4553: - [~lian cheng] Should we close it (since setting

[jira] [Updated] (SPARK-5744) RDD.isEmpty fails when rdd contains empty partitions.

2015-02-11 Thread Tobias Bertelsen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tobias Bertelsen updated SPARK-5744: Description: The implementation of {{RDD.isEmpty()}} fails if there are empty partitions. It

[jira] [Commented] (SPARK-5744) RDD.isEmpty fails when rdd contains empty partitions.

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316722#comment-14316722 ] Apache Spark commented on SPARK-5744: - User 'tbertelsen' has created a pull request

[jira] [Commented] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316412#comment-14316412 ] Sean Owen commented on SPARK-5739: -- No, it should be able to operate on sparse vectors,

[jira] [Resolved] (SPARK-5733) Error Link in Pagination of HistoryPage when showing Incomplete Applications

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5733. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4523

[jira] [Updated] (SPARK-5733) Error Link in Pagination of HistoryPage when showing Incomplete Applications

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5733: - Priority: Minor (was: Major) Affects Version/s: (was: 1.3.0)

[jira] [Issue Comment Deleted] (SPARK-5516) ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-22] shutting down ActorSystem [sparkDriver] java.lang.OutOfMemo

2015-02-11 Thread Guillaume Charhon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guillaume Charhon updated SPARK-5516: - Comment: was deleted (was: I have the same error. I am using Debian 7 machines.)

[jira] [Commented] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316397#comment-14316397 ] Sean Owen commented on SPARK-5739: -- Yes, but you're talking about extremely sparse

[jira] [Commented] (SPARK-5722) Infer_schema_type incorrect for Integers in pyspark

2015-02-11 Thread Don Drake (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316940#comment-14316940 ] Don Drake commented on SPARK-5722: -- Hi, I've submitted 2 pull requests for branch-1.2 and

[jira] [Created] (SPARK-5746) INSERT OVERWRITE throws FileNotFoundException when the source and destination point to the same table.

2015-02-11 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-5746: - Summary: INSERT OVERWRITE throws FileNotFoundException when the source and destination point to the same table. Key: SPARK-5746 URL: https://issues.apache.org/jira/browse/SPARK-5746

[jira] [Commented] (SPARK-5722) Infer_schema_type incorrect for Integers in pyspark

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316936#comment-14316936 ] Apache Spark commented on SPARK-5722: - User 'dondrake' has created a pull request for

[jira] [Resolved] (SPARK-4648) Support COALESCE function in Spark SQL and HiveQL

2015-02-11 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-4648. - Resolution: Duplicate It has been resolved by https://github.com/apache/spark/pull/4057/ (the PR of

[jira] [Commented] (SPARK-5746) INSERT OVERWRITE throws FileNotFoundException when the source and destination point to the same table.

2015-02-11 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316891#comment-14316891 ] Cheng Lian commented on SPARK-5746: --- cc [~yhuai] INSERT OVERWRITE throws

[jira] [Commented] (SPARK-5736) Add executor log url to Executors page on Yarn

2015-02-11 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316949#comment-14316949 ] Sandy Ryza commented on SPARK-5736: --- Is this the same as SPARK-2450? Add executor log

[jira] [Updated] (SPARK-5745) Allow to use custom TaskMetrics implementation

2015-02-11 Thread Jacek Lewandowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated SPARK-5745: - Description: There can be various RDDs implemented and the {{TaskMetrics}} provides a

[jira] [Resolved] (SPARK-5454) [SQL] Self join with ArrayType columns problems

2015-02-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5454. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4520

[jira] [Resolved] (SPARK-5736) Add executor log url to Executors page on Yarn

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5736. -- Resolution: Duplicate Add executor log url to Executors page on Yarn

[jira] [Commented] (SPARK-2808) update kafka to version 0.8.2

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316933#comment-14316933 ] Apache Spark commented on SPARK-2808: - User 'koeninger' has created a pull request for

[jira] [Commented] (SPARK-5502) User guide for isotonic regression

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316861#comment-14316861 ] Apache Spark commented on SPARK-5502: - User 'zapletal-martin' has created a pull

[jira] [Resolved] (SPARK-5677) Python DataFrame API remaining tasks

2015-02-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-5677. Resolution: Fixed Fix Version/s: 1.3.0 Python DataFrame API remaining tasks

[jira] [Resolved] (SPARK-5734) Allow creating a DataFrame from local Python data

2015-02-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-5734. Resolution: Fixed Fix Version/s: 1.3.0 Allow creating a DataFrame from local Python data

[jira] [Commented] (SPARK-2808) update kafka to version 0.8.2

2015-02-11 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316937#comment-14316937 ] Cody Koeninger commented on SPARK-2808: --- I'm also kind of curious what the

[jira] [Resolved] (SPARK-3688) LogicalPlan can't resolve column correctly

2015-02-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-3688. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4524

[jira] [Commented] (SPARK-3688) LogicalPlan can't resolve column correctly

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317073#comment-14317073 ] Apache Spark commented on SPARK-3688: - User 'rxin' has created a pull request for this

[jira] [Created] (SPARK-5747) Review all Bash scripts for word splitting bugs

2015-02-11 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-5747: --- Summary: Review all Bash scripts for word splitting bugs Key: SPARK-5747 URL: https://issues.apache.org/jira/browse/SPARK-5747 Project: Spark Issue

[jira] [Created] (SPARK-5748) Improve Vectors.sqdist implementation

2015-02-11 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5748: Summary: Improve Vectors.sqdist implementation Key: SPARK-5748 URL: https://issues.apache.org/jira/browse/SPARK-5748 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-5749) Fix Bash word splitting bugs in compute-classpath.sh

2015-02-11 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-5749: --- Summary: Fix Bash word splitting bugs in compute-classpath.sh Key: SPARK-5749 URL: https://issues.apache.org/jira/browse/SPARK-5749 Project: Spark

[jira] [Commented] (SPARK-1302) httpd doesn't start in spark-ec2 (cc2.8xlarge)

2015-02-11 Thread Greg Temchenko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317063#comment-14317063 ] Greg Temchenko commented on SPARK-1302: --- Indeed, I used 1.2.0. Sorry for the false

[jira] [Updated] (SPARK-3688) LogicalPlan can't resolve column correctly

2015-02-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-3688: --- Assignee: Yi Tian LogicalPlan can't resolve column correctly

[jira] [Updated] (SPARK-5747) Review all Bash scripts for word splitting bugs

2015-02-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5747: Description: Triggered by [this

[jira] [Resolved] (SPARK-1302) httpd doesn't start in spark-ec2 (cc2.8xlarge)

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-1302. -- Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Shivaram Venkataraman httpd doesn't

[jira] [Updated] (SPARK-5746) INSERT OVERWRITE throws FileNotFoundException when the source and destination point to the same table.

2015-02-11 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-5746: Priority: Blocker (was: Major) INSERT OVERWRITE throws FileNotFoundException when the source and

[jira] [Comment Edited] (SPARK-2808) update kafka to version 0.8.2

2015-02-11 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317240#comment-14317240 ] koert kuipers edited comment on SPARK-2808 at 2/12/15 12:05 AM:

[jira] [Updated] (SPARK-5748) Improve Vectors.sqdist implementation

2015-02-11 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5748: - Description: Saw some regression of k-means in 1.3 performance tests. I think the problem is the

[jira] [Updated] (SPARK-5740) Change comment default value from empty string to null in DescribeCommand

2015-02-11 Thread Li Sheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Sheng updated SPARK-5740: Description: Change comment default value from empty string to null in DescribeCommand (was: Change

[jira] [Updated] (SPARK-5740) Change comment default value from empty string to null in DescribeCommand

2015-02-11 Thread Li Sheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Sheng updated SPARK-5740: Summary: Change comment default value from empty string to null in DescribeCommand (was: Change default

[jira] [Updated] (SPARK-5750) Document that ordering of elements in shuffled partitions is not deterministic across runs

2015-02-11 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5750: -- Summary: Document that ordering of elements in shuffled partitions is not deterministic across runs

[jira] [Created] (SPARK-5750) Document that ordering of elements in post-shuffle partitions is not deterministic across runs

2015-02-11 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-5750: - Summary: Document that ordering of elements in post-shuffle partitions is not deterministic across runs Key: SPARK-5750 URL: https://issues.apache.org/jira/browse/SPARK-5750

[jira] [Updated] (SPARK-5749) Fix Bash word splitting bugs in compute-classpath.sh

2015-02-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5749: Issue Type: Bug (was: Sub-task) Parent: (was: SPARK-5747) Fix Bash word

[jira] [Commented] (SPARK-2808) update kafka to version 0.8.2

2015-02-11 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317240#comment-14317240 ] koert kuipers commented on SPARK-2808: -- Scala 2.11, that's a good point, I didn't think

[jira] [Commented] (SPARK-5573) Support explode in DataFrame DSL

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317369#comment-14317369 ] Apache Spark commented on SPARK-5573: - User 'marmbrus' has created a pull request for

[jira] [Commented] (SPARK-3299) [SQL] Public API in SQLContext to list tables

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317400#comment-14317400 ] Apache Spark commented on SPARK-3299: - User 'yhuai' has created a pull request for

[jira] [Commented] (SPARK-4423) Improve foreach() documentation to avoid confusion between local- and cluster-mode behavior

2015-02-11 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317334#comment-14317334 ] Ilya Ganelin commented on SPARK-4423: - Hi [~pwendell] and [~joshrosen], how do you

[jira] [Comment Edited] (SPARK-4423) Improve foreach() documentation to avoid confusion between local- and cluster-mode behavior

2015-02-11 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317334#comment-14317334 ] Ilya Ganelin edited comment on SPARK-4423 at 2/12/15 1:46 AM: --

[jira] [Comment Edited] (SPARK-4423) Improve foreach() documentation to avoid confusion between local- and cluster-mode behavior

2015-02-11 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317334#comment-14317334 ] Ilya Ganelin edited comment on SPARK-4423 at 2/12/15 1:46 AM: --

[jira] [Created] (SPARK-5752) Don't implicitly convert RDDs directly to DataFrames

2015-02-11 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-5752: -- Summary: Don't implicitly convert RDDs directly to DataFrames Key: SPARK-5752 URL: https://issues.apache.org/jira/browse/SPARK-5752 Project: Spark Issue Type:

[jira] [Updated] (SPARK-5752) Don't implicitly convert RDDs directly to DataFrames

2015-02-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5752: --- Assignee: Reynold Xin Don't implicitly convert RDDs directly to DataFrames

[jira] [Updated] (SPARK-5752) Don't implicitly convert RDDs directly to DataFrames

2015-02-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5752: --- Target Version/s: 1.3.0 Don't implicitly convert RDDs directly to DataFrames

[jira] [Created] (SPARK-5753) add basic support to JDBCRDD for postgresql types: uuid, hstore, and array

2015-02-11 Thread Ricky Nguyen (JIRA)
Ricky Nguyen created SPARK-5753: --- Summary: add basic support to JDBCRDD for postgresql types: uuid, hstore, and array Key: SPARK-5753 URL: https://issues.apache.org/jira/browse/SPARK-5753 Project:

[jira] [Comment Edited] (SPARK-4423) Improve foreach() documentation to avoid confusion between local- and cluster-mode behavior

2015-02-11 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317334#comment-14317334 ] Ilya Ganelin edited comment on SPARK-4423 at 2/12/15 2:39 AM: --

[jira] [Commented] (SPARK-5753) add basic support to JDBCRDD for postgresql types: uuid, hstore, and array

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317464#comment-14317464 ] Apache Spark commented on SPARK-5753: - User 'lepfhty' has created a pull request for

[jira] [Commented] (SPARK-2808) update kafka to version 0.8.2

2015-02-11 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317304#comment-14317304 ] Saisai Shao commented on SPARK-2808: I'd like to upgrade Kafka to 0.8.2, currently in

[jira] [Created] (SPARK-5751) First test case of HiveThriftServer2Suite sometimes timeouts

2015-02-11 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-5751: - Summary: First test case of HiveThriftServer2Suite sometimes timeouts Key: SPARK-5751 URL: https://issues.apache.org/jira/browse/SPARK-5751 Project: Spark Issue

[jira] [Commented] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317384#comment-14317384 ] DjvuLee commented on SPARK-5739: Yes, I did not explain clearly. What I mean is that we can

[jira] [Comment Edited] (SPARK-4423) Improve foreach() documentation to avoid confusion between local- and cluster-mode behavior

2015-02-11 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317334#comment-14317334 ] Ilya Ganelin edited comment on SPARK-4423 at 2/12/15 1:43 AM: --

[jira] [Comment Edited] (SPARK-5159) Thrift server does not respect hive.server2.enable.doAs=true

2015-02-11 Thread Tao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315951#comment-14315951 ] Tao Wang edited comment on SPARK-5159 at 2/12/15 4:15 AM: -- I have

[jira] [Commented] (SPARK-5755) remove unnecessary Add for unary plus sign

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317584#comment-14317584 ] Apache Spark commented on SPARK-5755: - User 'adrian-wang' has created a pull request

[jira] [Updated] (SPARK-5606) Support plus sign in HiveContext

2015-02-11 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5606: --- Assignee: Yadong Qi Support plus sign in HiveContext

[jira] [Created] (SPARK-5761) Revamp StandaloneRestProtocolSuite

2015-02-11 Thread Andrew Or (JIRA)
Andrew Or created SPARK-5761: Summary: Revamp StandaloneRestProtocolSuite Key: SPARK-5761 URL: https://issues.apache.org/jira/browse/SPARK-5761 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-5759) ExecutorRunnable should catch YarnException while NMClient start container

2015-02-11 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-5759: --- Summary: ExecutorRunnable should catch YarnException while NMClient start container Key: SPARK-5759 URL: https://issues.apache.org/jira/browse/SPARK-5759 Project:

[jira] [Commented] (SPARK-5759) ExecutorRunnable should catch YarnException while NMClient start container

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317734#comment-14317734 ] Apache Spark commented on SPARK-5759: - User 'lianhuiwang' has created a pull request

[jira] [Commented] (SPARK-5761) Revamp StandaloneRestProtocolSuite

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317762#comment-14317762 ] Apache Spark commented on SPARK-5761: - User 'andrewor14' has created a pull request

[jira] [Commented] (SPARK-5760) StandaloneRestClient/Server error behavior is incorrect

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317760#comment-14317760 ] Apache Spark commented on SPARK-5760: - User 'andrewor14' has created a pull request

[jira] [Commented] (SPARK-5752) Don't implicitly convert RDDs directly to DataFrames

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317761#comment-14317761 ] Apache Spark commented on SPARK-5752: - User 'rxin' has created a pull request for this

[jira] [Commented] (SPARK-575) Maintain a cache of JARs on each node to avoid unnecessary copying

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317741#comment-14317741 ] Apache Spark commented on SPARK-575: User 'mengxr' has created a pull request for this

[jira] [Commented] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315901#comment-14315901 ] Sean Owen commented on SPARK-5739: -- You are generating 10,000,000-dimensional data, so

[jira] [Comment Edited] (SPARK-4879) Missing output partitions after job completes with speculative execution

2015-02-11 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315918#comment-14315918 ] Matt Cheah edited comment on SPARK-4879 at 2/11/15 10:00 AM: -

[jira] [Commented] (SPARK-5159) Thrift server does not respect hive.server2.enable.doAs=true

2015-02-11 Thread Tao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315951#comment-14315951 ] Tao Wang commented on SPARK-5159: - I have tested this on branch 1.2. Below are the results:

[jira] [Created] (SPARK-5736) Add executor log url

2015-02-11 Thread Hong Shen (JIRA)
Hong Shen created SPARK-5736: Summary: Add executor log url Key: SPARK-5736 URL: https://issues.apache.org/jira/browse/SPARK-5736 Project: Spark Issue Type: Bug Reporter: Hong Shen

[jira] [Commented] (SPARK-1302) httpd doesn't start in spark-ec2 (cc2.8xlarge)

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315757#comment-14315757 ] Sean Owen commented on SPARK-1302: -- It looks like this was basically resolved recently

[jira] [Resolved] (SPARK-5728) MQTTStreamSuite leaves behind ActiveMQ database files

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5728. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4517

[jira] [Resolved] (SPARK-4436) Debian packaging misses datanucleus jars

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-4436. -- Resolution: Won't Fix Per SPARK-5727, I believe the outstanding Debian issues should be closed.

[jira] [Resolved] (SPARK-3624) Failed to find Spark assembly in /usr/share/spark/lib for RELEASED debian packages

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-3624. -- Resolution: Won't Fix Per SPARK-5727, I believe the outstanding Debian issues should be closed.

[jira] [Resolved] (SPARK-2614) Add the spark-examples-xxx-.jar to the Debian packages created with mvn ... -Pdeb (using assembly/pom.xml)

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2614. -- Resolution: Won't Fix Per SPARK-5727, I believe the outstanding Debian issues should be closed. Add

[jira] [Commented] (SPARK-5677) Python DataFrame API remaining tasks

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315809#comment-14315809 ] Apache Spark commented on SPARK-5677: - User 'davies' has created a pull request for

[jira] [Commented] (SPARK-5734) Allow creating a DataFrame from local Python data

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315810#comment-14315810 ] Apache Spark commented on SPARK-5734: - User 'davies' has created a pull request for

[jira] [Resolved] (SPARK-665) Create RPM packages for Spark

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-665. - Resolution: Won't Fix Given SPARK-5727, I suggest this will also be WontFix, in favor of delegating this

[jira] [Updated] (SPARK-5732) Add an option to print the spark version in spark script

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5732: - Component/s: (was: Spark Core) Spark Submit Target Version/s: 1.3.0

[jira] [Commented] (SPARK-5727) Deprecate, remove Debian packaging

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315788#comment-14315788 ] Apache Spark commented on SPARK-5727: - User 'srowen' has created a pull request for

[jira] [Resolved] (SPARK-1799) Add init script to the debian packaging

2015-02-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-1799. -- Resolution: Won't Fix Per SPARK-5727, I believe the outstanding Debian issues should be closed. Add

[jira] [Updated] (SPARK-5736) Add executor log url to Executors page on Yarn

2015-02-11 Thread Hong Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Shen updated SPARK-5736: - Summary: Add executor log url to Executors page on Yarn (was: Add executor log url to Executors page)

[jira] [Updated] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-5739: --- Summary: Size exceeds Integer.MAX_VALUE in File Map (was: Size exceeds Integer.MAX_VALUE in FileMap) Size

[jira] [Commented] (SPARK-5736) Add executor log url to Executors page on Yarn

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315836#comment-14315836 ] Apache Spark commented on SPARK-5736: - User 'shenh062326' has created a pull request

[jira] [Commented] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315845#comment-14315845 ] DjvuLee commented on SPARK-5739: the data is generated by the example KMeansDataGenerator

[jira] [Commented] (SPARK-5740) Change default value of comment in DescribeCommand from null to “None”

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315872#comment-14315872 ] Apache Spark commented on SPARK-5740: - User 'OopsOutOfMemory' has created a pull

[jira] [Commented] (SPARK-5738) [SQL] Reuse mutable row for each record at jsonStringToRow

2015-02-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315906#comment-14315906 ] Apache Spark commented on SPARK-5738: - User 'yanbohappy' has created a pull request

[jira] [Comment Edited] (SPARK-5566) Tokenizer for mllib package

2015-02-11 Thread Augustin Borsu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313996#comment-14313996 ] Augustin Borsu edited comment on SPARK-5566 at 2/11/15 9:58 AM:
