[jira] [Commented] (SPARK-3533) Add saveAsTextFileByKey() method to RDDs

2015-01-05 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265122#comment-14265122 ] Ilya Ganelin commented on SPARK-3533: - Hi all - I have that solution (using

[jira] [Commented] (SPARK-4927) Spark does not clean up properly during long jobs.

2014-12-31 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14262313#comment-14262313 ] Ilya Ganelin commented on SPARK-4927: - The below code reproduces the problem. Code

[jira] [Commented] (SPARK-4927) Spark does not clean up properly during long jobs.

2014-12-30 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261724#comment-14261724 ] Ilya Ganelin commented on SPARK-4927: - The below code can produce this issue. I've

[jira] [Comment Edited] (SPARK-4927) Spark does not clean up properly during long jobs.

2014-12-30 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261724#comment-14261724 ] Ilya Ganelin edited comment on SPARK-4927 at 12/31/14 12:33 AM:

[jira] [Comment Edited] (SPARK-4927) Spark does not clean up properly during long jobs.

2014-12-30 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261724#comment-14261724 ] Ilya Ganelin edited comment on SPARK-4927 at 12/31/14 12:32 AM:

[jira] [Comment Edited] (SPARK-4927) Spark does not clean up properly during long jobs.

2014-12-30 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261724#comment-14261724 ] Ilya Ganelin edited comment on SPARK-4927 at 12/31/14 12:33 AM:

[jira] [Comment Edited] (SPARK-4927) Spark does not clean up properly during long jobs.

2014-12-30 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261724#comment-14261724 ] Ilya Ganelin edited comment on SPARK-4927 at 12/31/14 12:33 AM:

[jira] [Issue Comment Deleted] (SPARK-4927) Spark does not clean up properly during long jobs.

2014-12-30 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Ganelin updated SPARK-4927: Comment: was deleted (was: The below code can produce this issue. I've also included some log

[jira] [Created] (SPARK-4927) Spark does not clean up properly during long jobs.

2014-12-22 Thread Ilya Ganelin (JIRA)
Ilya Ganelin created SPARK-4927: --- Summary: Spark does not clean up properly during long jobs. Key: SPARK-4927 URL: https://issues.apache.org/jira/browse/SPARK-4927 Project: Spark Issue Type:

[jira] [Commented] (SPARK-4779) PySpark Shuffle Fails Looking for Files that Don't Exist when low on Memory

2014-12-11 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242831#comment-14242831 ] Ilya Ganelin commented on SPARK-4779: - I've seen this issue on Scala as well. This

[jira] [Commented] (SPARK-3533) Add saveAsTextFileByKey() method to RDDs

2014-12-11 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242913#comment-14242913 ] Ilya Ganelin commented on SPARK-3533: - I am looking into a solution for this. Add

[jira] [Commented] (SPARK-4417) New API: sample RDD to fixed number of items

2014-12-08 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14238613#comment-14238613 ] Ilya Ganelin commented on SPARK-4417: - Hi, I'd like to work on this. Can someone

[jira] [Commented] (SPARK-4101) [MLLIB] Improve API in Word2Vec model

2014-12-01 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14229916#comment-14229916 ] Ilya Ganelin commented on SPARK-4101: - Hi Peter - did you have an algorithm in mind

[jira] [Comment Edited] (SPARK-4101) [MLLIB] Improve API in Word2Vec model

2014-12-01 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14229916#comment-14229916 ] Ilya Ganelin edited comment on SPARK-4101 at 12/1/14 3:48 PM: --

[jira] [Commented] (SPARK-4189) FileSegmentManagedBuffer should have a configurable memory map threshold

2014-12-01 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230579#comment-14230579 ] Ilya Ganelin commented on SPARK-4189: - Looking at the code I see // Just copy the

[jira] [Comment Edited] (SPARK-1962) Add RDD cache reference counting

2014-12-01 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230908#comment-14230908 ] Ilya Ganelin edited comment on SPARK-1962 at 12/2/14 3:16 AM: --

[jira] [Issue Comment Deleted] (SPARK-4101) [MLLIB] Improve API in Word2Vec model

2014-11-29 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Ganelin updated SPARK-4101: Comment: was deleted (was: If no-one is working on this I would be happy to knock this out. Thanks!

[jira] [Commented] (SPARK-3694) Allow printing object graph of tasks/RDD's with a debug flag

2014-11-28 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228580#comment-14228580 ] Ilya Ganelin commented on SPARK-3694: - Hi Patrick - I am working on it - I am just

[jira] [Commented] (SPARK-3694) Allow printing object graph of tasks/RDD's with a debug flag

2014-11-28 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228631#comment-14228631 ] Ilya Ganelin commented on SPARK-3694: - Tests are completed and I will be submitting a

[jira] [Commented] (SPARK-3080) ArrayIndexOutOfBoundsException in ALS for Large datasets

2014-11-14 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212822#comment-14212822 ] Ilya Ganelin commented on SPARK-3080: - Hi Xiangrui - I was not doing any sort of

[jira] [Commented] (SPARK-3694) Allow printing object graph of tasks/RDD's with a debug flag

2014-11-14 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212875#comment-14212875 ] Ilya Ganelin commented on SPARK-3694: - There is also task serialization that happens

[jira] [Commented] (SPARK-3080) ArrayIndexOutOfBoundsException in ALS for Large datasets

2014-10-29 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188354#comment-14188354 ] Ilya Ganelin commented on SPARK-3080: - Hello Xiangrui - happy to hear that you're on

[jira] [Commented] (SPARK-3080) ArrayIndexOutOfBoundsException in ALS for Large datasets

2014-10-29 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189039#comment-14189039 ] Ilya Ganelin commented on SPARK-3080: - Hi all - I have managed to make some

[jira] [Commented] (SPARK-3080) ArrayIndexOutOfBoundsException in ALS for Large datasets

2014-10-27 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14185846#comment-14185846 ] Ilya Ganelin commented on SPARK-3080: - I've seen the same error on a dataset of ~200

[jira] [Commented] (SPARK-3694) Allow printing object graph of tasks/RDD's with a debug flag

2014-10-18 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176119#comment-14176119 ] Ilya Ganelin commented on SPARK-3694: - Awesome. Thanks Patrick. Allow printing

[jira] [Commented] (SPARK-3694) Allow printing object graph of tasks/RDD's with a debug flag

2014-10-17 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14174981#comment-14174981 ] Ilya Ganelin commented on SPARK-3694: - Hello. I would like to work on this. Can you
