[GitHub] spark pull request: [SPARK-3453] [WIP] Refactor Netty module to us...

2014-09-11 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2330#discussion_r17463503 --- Diff: core/src/main/scala/org/apache/spark/network/netty/BlockClientFactory.scala --- @@ -0,0 +1,179 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-3453] [WIP] Refactor Netty module to us...

2014-09-11 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2330#discussion_r17463515 --- Diff: core/src/main/scala/org/apache/spark/network/netty/BlockClientFactory.scala --- @@ -0,0 +1,179 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-3453] [WIP] Refactor Netty module to us...

2014-09-11 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2330#discussion_r17463593 --- Diff: core/src/main/scala/org/apache/spark/network/netty/protocol.scala --- @@ -0,0 +1,260 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2304] tera sort example program for shu...

2014-09-11 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1242#issuecomment-55364233 I don't think we are going to merge this in Spark, unless there is huge demand from users... --- If your project is set up for it, you can reply to this email and have

[GitHub] spark pull request: [SPARK-3171] Don't print meaningless informati...

2014-09-11 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2078#issuecomment-55364261 @mridulm @JoshRosen Does this PR look good to you guys?

[GitHub] spark pull request: [SPARK-3469] Call all TaskCompletionListeners ...

2014-09-12 Thread rxin
Github user rxin closed the pull request at: https://github.com/apache/spark/pull/2343

[GitHub] spark pull request: [SPARK-3469] Call all TaskCompletionListeners ...

2014-09-12 Thread rxin
GitHub user rxin reopened a pull request: https://github.com/apache/spark/pull/2343 [SPARK-3469] Call all TaskCompletionListeners even if some fail This is necessary because we rely on this callback interface to clean resources up. The old behavior would lead to resource leaks

[GitHub] spark pull request: [SPARK-3469] Call all TaskCompletionListeners ...

2014-09-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2343#issuecomment-55371449 What do you think would be a good way to handle this? Perhaps throw back the first exception encountered? or throw an exception with all the error messages

[GitHub] spark pull request: [SPARK-3469] Call all TaskCompletionListeners ...

2014-09-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2343#issuecomment-55372290 Ok I pushed a new version that creates a TaskCompletionListenerException that contains all the error messages.
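
The behaviour discussed in this thread (run every completion listener even if some fail, then report all failures at once) can be sketched in a language-neutral way. This is a minimal Python illustration, not Spark's implementation: `ListenerError` and `run_all_listeners` are hypothetical names standing in for the TaskCompletionListenerException idea mentioned above.

```python
from typing import Callable, List


class ListenerError(Exception):
    """Hypothetical aggregate error carrying every listener failure message."""

    def __init__(self, messages: List[str]) -> None:
        self.messages = list(messages)
        super().__init__("; ".join(self.messages))


def run_all_listeners(listeners: List[Callable[[], None]]) -> None:
    """Invoke every listener; raise one aggregate error afterwards if any failed."""
    errors: List[str] = []
    for listener in listeners:
        try:
            listener()
        except Exception as exc:  # keep going so later cleanup still runs
            errors.append(str(exc))
    if errors:
        raise ListenerError(errors)
```

Collecting the messages and raising once at the end means a failing listener cannot prevent later listeners from releasing their resources, which is the leak the pull request description is concerned with.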

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2366#issuecomment-55372429 Isn't 1s cache span too low? How often will we get a cache hit if they expire in 1 sec?

[GitHub] spark pull request: [SPARK-1021] Defer the data-driven computation...

2014-09-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1689#issuecomment-55456438 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-1021] Defer the data-driven computation...

2014-09-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1689#issuecomment-55458226 @erikerlandson thanks for looking at this. A few questions: 1. After this pull request, does anything still use SimpleFutureAction? 2. If I understand

[GitHub] spark pull request: [SPARK-3427] [GraphX] Avoid active vertex trac...

2014-09-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2308#issuecomment-55461002 Thanks. Merging this.

[GitHub] spark pull request: [SPARK-3453] [WIP] Refactor Netty module to us...

2014-09-12 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2330#discussion_r17502935 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -185,26 +223,34 @@ final class ShuffleBlockFetcherIterator

[GitHub] spark pull request: [SPARK-3469] Make sure all TaskCompletionListe...

2014-09-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2343#issuecomment-55481920 Ok merging this in master. Thanks for taking a look.

[GitHub] spark pull request: [SQL][Docs] Update SQL programming guide to sh...

2014-09-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2374#issuecomment-55481956 Merging in master. Thanks.

[GitHub] spark pull request: [WIP][SPARK-3517]mapPartitions is not correct ...

2014-09-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2376#issuecomment-55482040 What is the problem here? Can you give an example or test case?

[GitHub] spark pull request: [SPARK-2947] DAGScheduler resubmit the stage i...

2014-09-12 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1877#discussion_r17510517 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1046,41 +1046,37 @@ class DAGScheduler( case FetchFailed

[GitHub] spark pull request: SPARK-3470 [CORE] [STREAMING] Add Closeable / ...

2014-09-12 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2346#discussion_r17510558 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala --- @@ -40,7 +41,9 @@ import org.apache.spark.rdd.{EmptyRDD, HadoopRDD

[GitHub] spark pull request: SPARK-3470 [CORE] [STREAMING] Add Closeable / ...

2014-09-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2346#issuecomment-55482552 @srowen any reason you did not add this to the Scala SparkContext?

[GitHub] spark pull request: [SPARK-1477]: Add the lifecycle interface

2014-09-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/991#issuecomment-55482578 Hey @witgo I thought about this more, and at this point I'm not sure if it is worth it to standardize this interface. The reason is we have a lot of services in Spark

[GitHub] spark pull request: [SPARK-1477]: Add the lifecycle interface

2014-09-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/991#issuecomment-55482579 Maybe we should just implement Closeable?

[GitHub] spark pull request: [WIP][SPARK-3517]mapPartitions is not correct ...

2014-09-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2376#issuecomment-55482915 Can you add a unit test? You can just create a RDD and serialize it to test the size.

[GitHub] spark pull request: [SPARK-3433][BUILD] Fix for Mima false-positiv...

2014-09-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2285#issuecomment-55686427 @ScrapCodes is this good to merge now?

[GitHub] spark pull request: [SPARK-3540] Add reboot-slaves functionality t...

2014-09-15 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/2404 [SPARK-3540] Add reboot-slaves functionality to the ec2 script Tested on a real cluster. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark

[GitHub] spark pull request: [SPARK-3433][BUILD] Fix for Mima false-positiv...

2014-09-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2285#issuecomment-55689081 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-3540] Add reboot-slaves functionality t...

2014-09-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2404#issuecomment-55692669 Merging in master. Thanks.

[GitHub] spark pull request: [SPARK-3540] Add reboot-slaves functionality t...

2014-09-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2404#issuecomment-55692659 @ScrapCodes any idea why random public classes are being reported, even though they have nothing to do with this PR?

[GitHub] spark pull request: [SPARK-3433][BUILD] Fix for Mima false-positiv...

2014-09-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2285#issuecomment-55692830 Ok merging this one. Thanks!

[GitHub] spark pull request: SPARK-2895: Add mapPartitionsWithContext relat...

2014-09-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2194#issuecomment-55692911 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2408#discussion_r17617004 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -111,10 +112,18 @@ final class ShuffleBlockFetcherIterator

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2408#discussion_r17617628 --- Diff: core/src/main/scala/org/apache/spark/network/ManagedBuffer.scala --- @@ -66,8 +67,15 @@ final class FileSegmentManagedBuffer(val file: File, val

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2408#issuecomment-55788663 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-1021] Defer the data-driven computation...

2014-09-16 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1689#issuecomment-55797086 Yea I don't think we need to fully solve 3 here. My main concern with this set of changes is 2, since a single badly behaved RDD can potentially block

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2408#issuecomment-55801829 Thanks. Merging in master.

[GitHub] spark pull request: [SPARK-3453] [WIP] Refactor Netty module to us...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2330#discussion_r17650983 --- Diff: core/src/main/scala/org/apache/spark/network/netty/BlockClientFactory.scala --- @@ -0,0 +1,182 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651529 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651524 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651575 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651616 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651635 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651668 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651760 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651763 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651774 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17678506 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,234 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17678497 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,234 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17678511 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,234 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17678541 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,234 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17678577 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,234 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-1455] [SPARK-3534] [Build] When possibl...

2014-09-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2420#issuecomment-55942318 This looks good to me, but I'm not very good at reviewing bash code (who is?). I guess we can merge and watch closely the next few PRs to see if they are functioning

[GitHub] spark pull request: [SPARK-1455] [SPARK-3534] [Build] When possibl...

2014-09-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2420#issuecomment-55942909 What's the dummy route?

[GitHub] spark pull request: [SPARK-1455] [SPARK-3534] [Build] When possibl...

2014-09-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2420#issuecomment-55943343 Yea that makes sense to me.

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17712911 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,234 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17713298 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,234 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17713674 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,234 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17713666 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,234 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-3597][Mesos] Implement `killTask`.

2014-09-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2453#issuecomment-56152819 Sorry for asking - but have you tested this on a real cluster?

[GitHub] spark pull request: [SPARK-3597][Mesos] Implement `killTask`.

2014-09-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2453#issuecomment-56152843 Oh and thanks for doing this!

[GitHub] spark pull request: [SPARK-3578] Fix upper bound in GraphGenerator...

2014-09-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2439#issuecomment-56153581 @jegonzal you should take a look :)

[GitHub] spark pull request: SPARK-2333 - spark_ec2 script should allow opt...

2014-09-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1899#discussion_r17812734 --- Diff: ec2/spark_ec2.py --- @@ -440,14 +449,29 @@ def launch_cluster(conn, opts, cluster_name): print Launched master in %s, regid = %s % (zone

[GitHub] spark pull request: SPARK-2333 - spark_ec2 script should allow opt...

2014-09-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1899#discussion_r17812732 --- Diff: ec2/spark_ec2.py --- @@ -440,14 +449,29 @@ def launch_cluster(conn, opts, cluster_name): print Launched master in %s, regid = %s % (zone

[GitHub] spark pull request: SPARK-3608 Break if the instance tag naming su...

2014-09-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2466#issuecomment-56243651 Jenkins, test this please.

[GitHub] spark pull request: SPARK-3608 Break if the instance tag naming su...

2014-09-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2466#issuecomment-56243659 LGTM pending Jenkins.

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-20 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/2470 [SPARK-3613] Record only average block size in MapStatus for large stages This changes the way we send MapStatus from executors back to driver for large stages (>2000 tasks). For large stages, we

[GitHub] spark pull request: SPARK-3608 Break if the instance tag naming su...

2014-09-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2466#issuecomment-56260810 Merging this in master and branch-1.1. Thanks!

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2470#issuecomment-56277279 It's more than that. We use estimated sizes to track the total size of outstanding fetches, and try to bound that to a certain size in case an executor sends too many

[GitHub] spark pull request: [SPARK-3446] Expose underlying job ids in Futu...

2014-09-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2337#issuecomment-56278413 The API is slightly awkward as you suggested. Is this intended to get job progress? If yes, maybe we can do that through the job group to get the list of job ids

[GitHub] spark pull request: stop, start and destroy require the EC2_REGION

2014-09-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2473#discussion_r17822066 --- Diff: docs/ec2-scripts.md --- @@ -137,11 +137,11 @@ cost you any EC2 cycles, but ***will*** continue to cost money for EBS storage

[GitHub] spark pull request: stop, start and destroy require the EC2_REGION

2014-09-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2473#issuecomment-56282092 Thanks. This is great to have. I left a tiny comment.

[GitHub] spark pull request: stop, start and destroy require the EC2_REGION

2014-09-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2473#discussion_r17823526 --- Diff: docs/ec2-scripts.md --- @@ -137,11 +137,11 @@ cost you any EC2 cycles, but ***will*** continue to cost money for EBS storage

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-21 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2470#issuecomment-56292164 It really depends on the number of zero-sized blocks. One thing we can possibly do is to create a compressed bitmap to track zero sized blocks, as discussed here: http

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-21 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2470#issuecomment-56292398 @lemire our requirements here are very simple. We just need to have a bitmap to track the position of zero-sized blocks in Spark shuffle. Things we need from the bitmap
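
The idea running through this thread (keep only an average block size, plus a bitmap marking which blocks are empty) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the code in the pull request: `summarize_block_sizes` and `estimated_size` are hypothetical names, the average is assumed to be taken over non-empty blocks only, and a plain integer stands in for the compressed bitmap the comments ask for.

```python
from typing import List, Tuple


def summarize_block_sizes(sizes: List[int]) -> Tuple[float, int]:
    """Return (average size of non-empty blocks, bitmap of empty-block positions).

    The bitmap is a plain Python int used as an uncompressed bit set: bit i is
    set when block i is empty. A compressed bitmap would replace it in practice.
    """
    empty_bitmap = 0
    total = 0
    non_empty = 0
    for i, size in enumerate(sizes):
        if size == 0:
            empty_bitmap |= 1 << i
        else:
            total += size
            non_empty += 1
    average = total / non_empty if non_empty else 0.0
    return average, empty_bitmap


def estimated_size(average: float, empty_bitmap: int, block_index: int) -> float:
    """Estimate one block's size: 0 for a known-empty block, else the average."""
    if (empty_bitmap >> block_index) & 1:
        return 0.0
    return average
```

For example, `summarize_block_sizes([0, 10, 30, 0])` yields an average of 20.0 and a bitmap with bits 0 and 3 set, so a reader can skip the two empty blocks and budget 20 bytes for each of the others.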

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2482#issuecomment-56309919 I don't think we can just wipe the old one out. At the very least, we need to deprecate it. Even that is debatable because some applications might prefer this async model

[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...

2014-09-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2482#discussion_r17827924 --- Diff: core/src/main/scala/org/apache/spark/RunAsyncResult.scala --- @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17829791 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -787,31 +789,88 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17829854 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -787,31 +789,88 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17829869 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -111,6 +111,8 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17829908 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -111,6 +111,8 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17829919 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -787,31 +789,88 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17829945 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -787,31 +789,88 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17829956 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -787,31 +789,88 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17829979 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -787,31 +789,88 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17830017 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -787,31 +789,88 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2366#discussion_r17830025 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -787,31 +789,88 @@ private[spark] class BlockManager

[GitHub] spark pull request: SPARK-3612. Executor shouldn't quit if heartbe...

2014-09-22 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2487#issuecomment-56341801 LGTM pending Jenkins.

[GitHub] spark pull request: stop, start and destroy require the EC2_REGION

2014-09-22 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2473#discussion_r17889800 --- Diff: docs/ec2-scripts.md --- @@ -137,11 +146,11 @@ cost you any EC2 cycles, but ***will*** continue to cost money for EBS storage

[GitHub] spark pull request: [SPARK-3495] Block replication fails continuou...

2014-09-23 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2366#issuecomment-56573235 If this is getting more complicated, we should consider standardizing the internal API and then building a separate service that properly handles all these issues

[GitHub] spark pull request: [SPARK-3632] ConnectionManager can run out of ...

2014-09-25 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2484#issuecomment-56777341 Hey sorry really busy this week. I will take a look at this next week. Added to my todo list.

[GitHub] spark pull request: [SPARK-3453] [WIP] Refactor Netty module to us...

2014-09-26 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2330#discussion_r18121216 --- Diff: core/src/main/scala/org/apache/spark/network/netty/BlockClientFactory.scala --- @@ -0,0 +1,182 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-26 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2470#issuecomment-57040450 Jenkins, test this please.

[GitHub] spark pull request: Minor cleanup to tighten visibility and remove...

2014-09-26 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/2555 Minor cleanup to tighten visibility and remove compilation warning. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark cleanup

[GitHub] spark pull request: SPARK-2761 refactor #maybeSpill into Spillable

2014-09-26 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2416#discussion_r18121813 --- Diff: core/src/main/scala/org/apache/spark/util/collection/Spillable.scala --- @@ -0,0 +1,111 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: SPARK-2761 refactor #maybeSpill into Spillable

2014-09-26 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2416#discussion_r18121818 --- Diff: core/src/main/scala/org/apache/spark/util/collection/Spillable.scala --- @@ -0,0 +1,111 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: SPARK-2761 refactor #maybeSpill into Spillable

2014-09-26 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2416#discussion_r18121816 --- Diff: core/src/main/scala/org/apache/spark/util/collection/Spillable.scala --- @@ -0,0 +1,111 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: SPARK-2761 refactor #maybeSpill into Spillable

2014-09-26 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2416#discussion_r18121824 --- Diff: core/src/main/scala/org/apache/spark/util/collection/Spillable.scala --- @@ -0,0 +1,111 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: SPARK-2761 refactor #maybeSpill into Spillable

2014-09-26 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2416#discussion_r18121825 --- Diff: core/src/main/scala/org/apache/spark/util/collection/Spillable.scala --- @@ -0,0 +1,111 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: SPARK-2761 refactor #maybeSpill into Spillable

2014-09-26 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2416#discussion_r18121834 --- Diff: core/src/main/scala/org/apache/spark/util/collection/Spillable.scala --- @@ -0,0 +1,111 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-26 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2470#issuecomment-57041883 Hm, MiMa failed even though MapStatus is private[spark]. I will add an exclude.

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-26 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r18121869 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala --- @@ -45,7 +45,8 @@ import org.apache.spark.util.Utils private[spark] abstract class
