[jira] [Resolved] (SPARK-2778) Add unit tests for Yarn integration

2014-09-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2778. Resolution: Fixed Fix Version/s: 1.2.0 Fixed by:

[jira] [Commented] (SPARK-3687) Spark hang while processing more than 100 sequence files

2014-09-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147465#comment-14147465 ] Patrick Wendell commented on SPARK-3687: Can you perform a jstack on the executor

[jira] [Resolved] (SPARK-3576) Provide script for creating the Spark AMI from scratch

2014-09-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3576. Resolution: Fixed This was fixed in spark-ec2 itself Provide script for creating the

[jira] [Updated] (SPARK-3288) All fields in TaskMetrics should be private and use getters/setters

2014-09-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3288: --- Assignee: (was: Andrew Or) All fields in TaskMetrics should be private and use

[jira] [Updated] (SPARK-3288) All fields in TaskMetrics should be private and use getters/setters

2014-09-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3288: --- Labels: starter (was: ) All fields in TaskMetrics should be private and use getters/setters

[jira] [Commented] (SPARK-3687) Spark hang while processing more than 100 sequence files

2014-09-25 Thread Ziv Huang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147504#comment-14147504 ] Ziv Huang commented on SPARK-3687: -- The following is the jstack dump of one executor when

[jira] [Created] (SPARK-3688) LogicalPlan can't resolve column correctlly

2014-09-25 Thread Yi Tian (JIRA)
Yi Tian created SPARK-3688: -- Summary: LogicalPlan can't resolve column correctlly Key: SPARK-3688 URL: https://issues.apache.org/jira/browse/SPARK-3688 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-3687) Spark hang while processing more than 100 sequence files

2014-09-25 Thread Ziv Huang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147513#comment-14147513 ] Ziv Huang commented on SPARK-3687: -- Just a few mins ago I ran a job twice, processing 203

[jira] [Resolved] (SPARK-3422) JavaAPISuite.getHadoopInputSplits isn't used anywhere

2014-09-25 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza resolved SPARK-3422. --- Resolution: Fixed JavaAPISuite.getHadoopInputSplits isn't used anywhere

[jira] [Commented] (SPARK-3688) LogicalPlan can't resolve column correctlly

2014-09-25 Thread Yi Tian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147520#comment-14147520 ] Yi Tian commented on SPARK-3688: As we know, Hive supports complex column datatypes like

[jira] [Comment Edited] (SPARK-3687) Spark hang while processing more than 100 sequence files

2014-09-25 Thread Ziv Huang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147513#comment-14147513 ] Ziv Huang edited comment on SPARK-3687 at 9/25/14 8:36 AM: --- Just

[jira] [Commented] (SPARK-3651) Consolidate executor maps in CoarseGrainedSchedulerBackend

2014-09-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147619#comment-14147619 ] Apache Spark commented on SPARK-3651: - User 'tigerquoll' has created a pull request

[jira] [Created] (SPARK-3689) FileLogger should create new instance of FileSystem regardless of it's scheme

2014-09-25 Thread Kousuke Saruta (JIRA)
Kousuke Saruta created SPARK-3689: - Summary: FileLogger should create new instance of FileSystem regardless of it's scheme Key: SPARK-3689 URL: https://issues.apache.org/jira/browse/SPARK-3689

[jira] [Commented] (SPARK-3689) FileLogger should create new instance of FileSystem regardless of it's scheme

2014-09-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147689#comment-14147689 ] Apache Spark commented on SPARK-3689: - User 'sarutak' has created a pull request for

[jira] [Commented] (SPARK-3682) Add helpful warnings to the UI

2014-09-25 Thread Arun Ahuja (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147730#comment-14147730 ] Arun Ahuja commented on SPARK-3682: --- We've been running into a lot of these issues so this

[jira] [Commented] (SPARK-3638) Commons HTTP client dependency conflict in extras/kinesis-asl module

2014-09-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147751#comment-14147751 ] Apache Spark commented on SPARK-3638: - User 'aniketbhatnagar' has created a pull

[jira] [Commented] (SPARK-3639) Kinesis examples set master as local

2014-09-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147758#comment-14147758 ] Apache Spark commented on SPARK-3639: - User 'aniketbhatnagar' has created a pull

[jira] [Created] (SPARK-3690) Closing shuffle writers we swallow more important exception

2014-09-25 Thread Egor Pakhomov (JIRA)
Egor Pakhomov created SPARK-3690: Summary: Closing shuffle writers we swallow more important exception Key: SPARK-3690 URL: https://issues.apache.org/jira/browse/SPARK-3690 Project: Spark

[jira] [Updated] (SPARK-3690) Closing shuffle writers we swallow more important exception

2014-09-25 Thread Egor Pakhomov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Egor Pakhomov updated SPARK-3690: - Description: ShuffleMapTask: line 75 {code:title=Bar.java|borderStyle=solid} case e: Exception

[jira] [Updated] (SPARK-3690) Closing shuffle writers we swallow more important exception

2014-09-25 Thread Egor Pakhomov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Egor Pakhomov updated SPARK-3690: - Description: ShuffleMapTask: line 75 {code:title=ShuffleMapTask|borderStyle=solid} case e:

[jira] [Commented] (SPARK-3678) Yarn app name reported in RM is different between cluster and client mode

2014-09-25 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147835#comment-14147835 ] Thomas Graves commented on SPARK-3678: -- Also note that the spark-submit --name

[jira] [Issue Comment Deleted] (SPARK-3687) Spark hang while processing more than 100 sequence files

2014-09-25 Thread Ziv Huang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ziv Huang updated SPARK-3687: - Comment: was deleted (was: Just a few mins ago I ran a job twice, processing 203 sequence files. Both

[jira] [Updated] (SPARK-3687) Spark hang while processing more than 100 sequence files

2014-09-25 Thread Ziv Huang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ziv Huang updated SPARK-3687: - Description: In my application, I read more than 100 sequence files to a JavaPairRDD, perform flatmap to

[jira] [Comment Edited] (SPARK-3687) Spark hang while processing more than 100 sequence files

2014-09-25 Thread Ziv Huang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147504#comment-14147504 ] Ziv Huang edited comment on SPARK-3687 at 9/25/14 3:09 PM: --- The

[jira] [Commented] (SPARK-3690) Closing shuffle writers we swallow more important exception

2014-09-25 Thread Egor Pakhomov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147864#comment-14147864 ] Egor Pakhomov commented on SPARK-3690: -- https://github.com/apache/spark/pull/2537

[jira] [Commented] (SPARK-3690) Closing shuffle writers we swallow more important exception

2014-09-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147867#comment-14147867 ] Apache Spark commented on SPARK-3690: - User 'epahomov' has created a pull request for

[jira] [Commented] (SPARK-2516) Bootstrapping

2014-09-25 Thread Yu Ishikawa (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147875#comment-14147875 ] Yu Ishikawa commented on SPARK-2516: Hi [~mengxr], I would like to work on this issue, if

[jira] [Commented] (SPARK-3633) Fetches failure observed after SPARK-2711

2014-09-25 Thread Arun Ahuja (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147918#comment-14147918 ] Arun Ahuja commented on SPARK-3633: --- Which timeout values were increased to work around

[jira] [Commented] (SPARK-3633) Fetches failure observed after SPARK-2711

2014-09-25 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147939#comment-14147939 ] Zhan Zhang commented on SPARK-3633: --- Increasing timeout does not help my case either. I

[jira] [Commented] (SPARK-3633) Fetches failure observed after SPARK-2711

2014-09-25 Thread Arun Ahuja (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148004#comment-14148004 ] Arun Ahuja commented on SPARK-3633: --- Also, which timeout setting was useful:

[jira] [Commented] (SPARK-3561) Native Hadoop/YARN integration for batch/ETL workloads

2014-09-25 Thread Mayank Bansal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148010#comment-14148010 ] Mayank Bansal commented on SPARK-3561: -- Hi guys, we at eBay are having some issues

[jira] [Commented] (SPARK-3633) Fetches failure observed after SPARK-2711

2014-09-25 Thread Nishkam Ravi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148026#comment-14148026 ] Nishkam Ravi commented on SPARK-3633: - Increasing the value of

[jira] [Commented] (SPARK-546) Support full outer join and multiple join in a single shuffle

2014-09-25 Thread Aaron Staple (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148052#comment-14148052 ] Aaron Staple commented on SPARK-546: Hi, I think there are two features requested in

[jira] [Created] (SPARK-3691) Provide a mini cluster for testing system built on Spark

2014-09-25 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created SPARK-3691: -- Summary: Provide a mini cluster for testing system built on Spark Key: SPARK-3691 URL: https://issues.apache.org/jira/browse/SPARK-3691 Project: Spark Issue

[jira] [Updated] (SPARK-2932) Move MasterFailureTest out of main source directory

2014-09-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2932: - Fix Version/s: 1.2.0 Move MasterFailureTest out of main source directory

[jira] [Resolved] (SPARK-2932) Move MasterFailureTest out of main source directory

2014-09-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-2932. -- Resolution: Fixed Move MasterFailureTest out of main source directory

[jira] [Commented] (SPARK-3691) Provide a mini cluster for testing system built on Spark

2014-09-25 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148071#comment-14148071 ] Xuefu Zhang commented on SPARK-3691: cc [~sandyr] Provide a mini cluster for testing

[jira] [Comment Edited] (SPARK-3691) Provide a mini cluster for testing system built on Spark

2014-09-25 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148071#comment-14148071 ] Xuefu Zhang edited comment on SPARK-3691 at 9/25/14 6:21 PM: -

[jira] [Comment Edited] (SPARK-1823) ExternalAppendOnlyMap can still OOM if one key is very large

2014-09-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148208#comment-14148208 ] Josh Rosen edited comment on SPARK-1823 at 9/25/14 7:42 PM:

[jira] [Commented] (SPARK-1823) ExternalAppendOnlyMap can still OOM if one key is very large

2014-09-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148208#comment-14148208 ] Josh Rosen commented on SPARK-1823: --- SPARK-3074 is a related issue for PySpark.

[jira] [Commented] (SPARK-2546) Configuration object thread safety issue

2014-09-25 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148244#comment-14148244 ] Andrew Ash commented on SPARK-2546: --- Another proposed fix: extend JobConf as a shim and

[jira] [Commented] (SPARK-1241) Support sliding in RDD

2014-09-25 Thread Frens Jan Rumph (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148281#comment-14148281 ] Frens Jan Rumph commented on SPARK-1241: Hi, I'm investigating use of Spark for

[jira] [Created] (SPARK-3692) RBF Kernel implementation to SVM

2014-09-25 Thread Ekrem Aksoy (JIRA)
Ekrem Aksoy created SPARK-3692: -- Summary: RBF Kernel implementation to SVM Key: SPARK-3692 URL: https://issues.apache.org/jira/browse/SPARK-3692 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-3692) RBF Kernel implementation to SVM

2014-09-25 Thread Ekrem Aksoy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ekrem Aksoy updated SPARK-3692: --- Description: Radial Basis Function is another type of kernel that can be used instead of linear

[jira] [Created] (SPARK-3693) Cached Hadoop RDD always return rows with the same value

2014-09-25 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created SPARK-3693: -- Summary: Cached Hadoop RDD always return rows with the same value Key: SPARK-3693 URL: https://issues.apache.org/jira/browse/SPARK-3693 Project: Spark Issue

[jira] [Updated] (SPARK-3693) Cached Hadoop RDD always return rows with the same value

2014-09-25 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated SPARK-3693: --- Description: While trying RDD caching, it's found that caching a Hadoop RDD causes data correctness

[jira] [Commented] (SPARK-3693) Cached Hadoop RDD always return rows with the same value

2014-09-25 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148295#comment-14148295 ] Xuefu Zhang commented on SPARK-3693: cc [~rxin], [~sandyr] Cached Hadoop RDD always

[jira] [Commented] (SPARK-3693) Cached Hadoop RDD always return rows with the same value

2014-09-25 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148303#comment-14148303 ] Sandy Ryza commented on SPARK-3693: --- Spark's documentation actually makes a note of

[jira] [Commented] (SPARK-3693) Cached Hadoop RDD always return rows with the same value

2014-09-25 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148306#comment-14148306 ] Reynold Xin commented on SPARK-3693: Just responded to you offline as well. This is a

[jira] [Resolved] (SPARK-3693) Cached Hadoop RDD always return rows with the same value

2014-09-25 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-3693. Resolution: Duplicate Cached Hadoop RDD always return rows with the same value

[jira] [Commented] (SPARK-3693) Cached Hadoop RDD always return rows with the same value

2014-09-25 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148318#comment-14148318 ] Xuefu Zhang commented on SPARK-3693: Thanks, guys. We are fine with the workaround.

[jira] [Resolved] (SPARK-3550) Disable automatic rdd caching in python api for relevant learners

2014-09-25 Thread Aaron Staple (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Staple resolved SPARK-3550. - Resolution: Fixed Disable automatic rdd caching in python api for relevant learners

[jira] [Commented] (SPARK-3550) Disable automatic rdd caching in python api for relevant learners

2014-09-25 Thread Aaron Staple (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148347#comment-14148347 ] Aaron Staple commented on SPARK-3550: - This has been addressed in another commit:

[jira] [Resolved] (SPARK-3488) cache deserialized python RDDs before iterative learning

2014-09-25 Thread Aaron Staple (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Staple resolved SPARK-3488. - Resolution: Won't Fix cache deserialized python RDDs before iterative learning

[jira] [Commented] (SPARK-3690) Closing shuffle writers we swallow more important exception

2014-09-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148365#comment-14148365 ] Josh Rosen commented on SPARK-3690: --- For additional context, here's the mailing list

[jira] [Resolved] (SPARK-3690) Closing shuffle writers we swallow more important exception

2014-09-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3690. --- Resolution: Fixed Issue resolved by pull request 2537 [https://github.com/apache/spark/pull/2537]

[jira] [Updated] (SPARK-3661) spark.driver.memory is ignored in cluster mode

2014-09-25 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3661: - Priority: Critical (was: Major) spark.driver.memory is ignored in cluster mode

[jira] [Updated] (SPARK-3682) Add helpful warnings to the UI

2014-09-25 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated SPARK-3682: -- Description: Spark has a zillion configuration options and a zillion different things that can go

[jira] [Commented] (SPARK-3682) Add helpful warnings to the UI

2014-09-25 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148379#comment-14148379 ] Sandy Ryza commented on SPARK-3682: --- Oops, that should have read increased. When a task

[jira] [Commented] (SPARK-2377) Create a Python API for Spark Streaming

2014-09-25 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148385#comment-14148385 ] Davies Liu commented on SPARK-2377: --- [~giwa] I have also started to work on this (based on your

[jira] [Created] (SPARK-3694) Allow printing object graph of tasks/RDD's with a debug flag

2014-09-25 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-3694: -- Summary: Allow printing object graph of tasks/RDD's with a debug flag Key: SPARK-3694 URL: https://issues.apache.org/jira/browse/SPARK-3694 Project: Spark

[jira] [Updated] (SPARK-3694) Allow printing object graph of tasks/RDD's with a debug flag

2014-09-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3694: --- Description: This would be useful for debugging extra references inside of RDD's Here is an

[jira] [Comment Edited] (SPARK-3032) Potential bug when running sort-based shuffle with sorting using TimSort

2014-09-25 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148411#comment-14148411 ] Andrew Ash edited comment on SPARK-3032 at 9/25/14 10:35 PM: -

[jira] [Commented] (SPARK-3032) Potential bug when running sort-based shuffle with sorting using TimSort

2014-09-25 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148411#comment-14148411 ] Andrew Ash commented on SPARK-3032: --- This bug prevents people from doing testing of

[jira] [Resolved] (SPARK-1484) MLlib should warn if you are using an iterative algorithm on non-cached data

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-1484. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2347

[jira] [Updated] (SPARK-1484) MLlib should warn if you are using an iterative algorithm on non-cached data

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1484: - Assignee: Aaron Staple MLlib should warn if you are using an iterative algorithm on non-cached

[jira] [Resolved] (SPARK-3584) sbin/slaves doesn't work when we use password authentication for SSH

2014-09-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3584. Resolution: Fixed Fix Version/s: 1.2.0 Assignee: Kousuke Saruta

[jira] [Commented] (SPARK-2377) Create a Python API for Spark Streaming

2014-09-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148512#comment-14148512 ] Apache Spark commented on SPARK-2377: - User 'davies' has created a pull request for

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148529#comment-14148529 ] Xiangrui Meng commented on SPARK-1405: -- [~Guoqiang Li] and [~pedrorodriguez], since

[jira] [Updated] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1405: - Assignee: Guoqiang Li (was: Xusen Yin) parallel Latent Dirichlet Allocation (LDA) atop of spark

[jira] [Updated] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1405: - Shepherd: Xiangrui Meng parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

[jira] [Commented] (SPARK-1241) Support sliding in RDD

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148549#comment-14148549 ] Xiangrui Meng commented on SPARK-1241: -- This is implemented in MLlib:

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-25 Thread Pedro Rodriguez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148555#comment-14148555 ] Pedro Rodriguez commented on SPARK-1405: [~mengxr], definitely a good idea to be

[jira] [Resolved] (SPARK-2634) MapOutputTrackerWorker.mapStatuses should be thread-safe

2014-09-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2634. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 1541

[jira] [Commented] (SPARK-2546) Configuration object thread safety issue

2014-09-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148645#comment-14148645 ] Josh Rosen commented on SPARK-2546: --- JobConf has a _ton_ of methods and it's not clear

[jira] [Created] (SPARK-3695) Enable to show host and port in block fetch failure

2014-09-25 Thread Adrian Wang (JIRA)
Adrian Wang created SPARK-3695: -- Summary: Enable to show host and port in block fetch failure Key: SPARK-3695 URL: https://issues.apache.org/jira/browse/SPARK-3695 Project: Spark Issue Type:

[jira] [Commented] (SPARK-3695) Enable to show host and port in block fetch failure

2014-09-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148656#comment-14148656 ] Apache Spark commented on SPARK-3695: - User 'adrian-wang' has created a pull request

[jira] [Commented] (SPARK-2546) Configuration object thread safety issue

2014-09-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148657#comment-14148657 ] Josh Rosen commented on SPARK-2546: --- A synchronization wrapper (whether written by hand

[jira] [Commented] (SPARK-2532) Fix issues with consolidated shuffle

2014-09-25 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148666#comment-14148666 ] Andrew Ash commented on SPARK-2532: --- [~pwendell] should we close this ticket and track

[jira] [Updated] (SPARK-3688) LogicalPlan can't resolve column correctlly

2014-09-25 Thread Yi Tian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Tian updated SPARK-3688: --- Description: How to reproduce this problem: create a table: {code} create table test (a string, b string);

[jira] [Resolved] (SPARK-3686) flume.SparkSinkSuite.Success is flaky

2014-09-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3686. Resolution: Fixed Resolved by: https://github.com/apache/spark/pull/2531