[GitHub] [spark] tgravescs commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host

2019-11-26 Thread GitBox
tgravescs commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host URL: https://github.com/apache/spark/pull/25299#issuecomment-558642474 test this please This is an

[GitHub] [spark] tgravescs commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host

2019-11-25 Thread GitBox
tgravescs commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host URL: https://github.com/apache/spark/pull/25299#issuecomment-558299984 Definitely agree, sounds like a gain, and I've been wanting this for a while, so thanks @attilapiros

[GitHub] [spark] tgravescs commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host

2019-11-25 Thread GitBox
tgravescs commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host URL: https://github.com/apache/spark/pull/25299#issuecomment-558228346 ok that makes sense. thanks. So it seems we fetch all the local blocks in first and then

[GitHub] [spark] tgravescs commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host

2019-11-25 Thread GitBox
tgravescs commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host URL: https://github.com/apache/spark/pull/25299#issuecomment-558202319 I only skimmed this but overall looks good, like this approach. When fetching the host

[GitHub] [spark] tgravescs commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host

2019-09-04 Thread GitBox
tgravescs commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host URL: https://github.com/apache/spark/pull/25299#issuecomment-527909451 So the downside to using a disk is that, at least on YARN, containers may not see the same

[GitHub] [spark] tgravescs commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host

2019-08-01 Thread GitBox
tgravescs commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host URL: https://github.com/apache/spark/pull/25299#issuecomment-517397877 yes we had talked about this and I looked briefly, but like mentioned in the description you