[GitHub] [spark] maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location

2019-12-09 Thread GitBox
maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location URL: https://github.com/apache/spark/pull/26633#issuecomment-563323728 Thank you, @squito, for the feedback! > In fact this may be against the wishes of one particular spark application, but still

[GitHub] [spark] maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location

2019-12-04 Thread GitBox
maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location URL: https://github.com/apache/spark/pull/26633#issuecomment-561682124 retest this please

[GitHub] [spark] maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location

2019-12-03 Thread GitBox
maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location URL: https://github.com/apache/spark/pull/26633#issuecomment-561345400 @tgravescs Our benchmark comparing AQE w/ LSR (local shuffle reader) with AQE w/o LSR showed that before locality wait fix, there

[GitHub] [spark] maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location

2019-11-27 Thread GitBox
maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location URL: https://github.com/apache/spark/pull/26633#issuecomment-559134477 @attilapiros Very good point. I'll go thru all references of `host` as well as `TaskLocation.apply`.

[GitHub] [spark] maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location

2019-11-27 Thread GitBox
maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location URL: https://github.com/apache/spark/pull/26633#issuecomment-559133653 > why not fix locality for all RDDs as they can hit the same issue. So what fix exactly are you talking about here?

[GitHub] [spark] maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location

2019-11-26 Thread GitBox
maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location URL: https://github.com/apache/spark/pull/26633#issuecomment-558916448 @tgravescs > People are setting this to zero now anyway so changing default makes sense to me. Not sure how

[GitHub] [spark] maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location

2019-11-26 Thread GitBox
maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location URL: https://github.com/apache/spark/pull/26633#issuecomment-558850004 > I don't follow this logic how do you go from 200 output partitions to 40 tasks? I would expect 200 output partitions to have 200

[GitHub] [spark] maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location

2019-11-26 Thread GitBox
maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location URL: https://github.com/apache/spark/pull/26633#issuecomment-558804630 Changing the default locality wait time to 0 (or whatever it is) is based on the assumption that all workloads do not have serious

[GitHub] [spark] maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location

2019-11-26 Thread GitBox
maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location URL: https://github.com/apache/spark/pull/26633#issuecomment-558682192 Thanks for the feedback, @tgravescs! This is a workaround. A complete solution would be to bring the current locality fallback to task

[GitHub] [spark] maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location

2019-11-25 Thread GitBox
maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location URL: https://github.com/apache/spark/pull/26633#issuecomment-558389814 Sure, @jiangxb1987!

[GitHub] [spark] maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location

2019-11-25 Thread GitBox
maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location URL: https://github.com/apache/spark/pull/26633#issuecomment-558379888 I don't think there is a need to restrict it. Every RDD should "know" its own locality preference as well as the penalty for a