Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

Mark Hamstra Mon, 04 Mar 2019 15:24:03 -0800

I'll try to find some time, but it's really at a premium right now.

On Mon, Mar 4, 2019 at 3:17 PM Xiangrui Meng <men...@gmail.com> wrote:


>
>
> On Mon, Mar 4, 2019 at 3:10 PM Mark Hamstra <m...@clearstorydata.com>
> wrote:
>
>> :) Sorry, that was ambiguous. I was seconding Imran's comment.
>>
>
> Could you also help review Xingbo's design sketch and help evaluate the
> cost?
>
>
>>
>> On Mon, Mar 4, 2019 at 3:09 PM Xiangrui Meng <men...@gmail.com> wrote:
>>
>>>
>>>
>>> On Mon, Mar 4, 2019 at 1:56 PM Mark Hamstra <m...@clearstorydata.com>
>>> wrote:
>>>
>>>> +1
>>>>
>>>
>>> Mark, just to be clear, are you +1 on the SPIP or Imran's point?
>>>
>>>
>>>>
>>>> On Mon, Mar 4, 2019 at 12:52 PM Imran Rashid <im...@therashids.com>
>>>> wrote:
>>>>
>>>>> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng <men...@gmail.com> wrote:
>>>>>
>>>>>> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung <
>>>>>> felixcheun...@hotmail.com> wrote:
>>>>>>
>>>>>>> IMO upfront allocation is less useful. Specifically too expensive
>>>>>>> for large jobs.
>>>>>>>
>>>>>>
>>>>>> This is also an API/design discussion.
>>>>>>
>>>>>
>>>>> I agree with Felix -- this is more than just an API question.  It has
>>>>> a huge impact on the complexity of what you're proposing.  You might be
>>>>> proposing big changes to a core and brittle part of Spark, which is
>>>>> already short of experts.
>>>>>
>>>>
>>> To my understanding, Felix's comment is mostly on the user interfaces,
>>> stating upfront allocation is less useful, especially for large jobs. I
>>> agree that for large jobs we had better have dynamic allocation, which was
>>> mentioned in the YARN support section in the companion scoping doc. We
>>> restrict the new container type to initially requested to keep things
>>> simple. However, upfront allocation already meets the requirements of basic
>>> workflows like data + DL training/inference + data. Saying "it is less
>>> useful specifically for large jobs" kinda missed the fact that "it is super
>>> useful for basic use cases".
>>>
>>> Your comment is mostly on the implementation side, which IMHO is the
>>> KEY question to conclude this vote: does the design sketch sufficiently
>>> demonstrate that the internal changes to the Spark scheduler are manageable? I
>>> read Xingbo's design sketch and I think it is doable, which led to my +1.
>>> But I'm not an expert on the scheduler. So I would feel more confident if
>>> the design was reviewed by some scheduler experts. I also read the design
>>> sketch to support different cluster managers, which I think is less
>>> critical than the internal scheduler changes.
>>>
>>>
>>>>
>>>>> I don't see any value in having a vote on "does feature X sound cool?"
>>>>>
>>>>
>>> I believe no one would disagree. To prepare the companion doc, we went
>>> through several rounds of discussions to provide concrete stories such that
>>> the proposal is not just "cool".
>>>
>>>
>>>>
>>>>>
>>>>> We have to evaluate the potential benefit against the risks the feature
>>>>> brings and the continued maintenance cost.  We don't need super low-level
>>>>> details, but we have to have a sketch of the design to be able to make that
>>>>> tradeoff.
>>>>>
>>>>
>>> Could you review the design sketch from Xingbo, help evaluate the cost,
>>> and provide feedback?
>>>
>>>
>>
