Yes, we need to use the new fields everywhere and then deprecate the old fields.
On Mon, Apr 1, 2019 at 1:33 PM Kenneth Knowles <[email protected]> wrote:
>
> On Mon, Apr 1, 2019 at 8:59 AM Lukasz Cwik <[email protected]> wrote:
>
>> To clarify, docker isn't the only environment type we are using. We have
>> a process-based and an "existing" environment mode that don't fit the
>> current protobuf and are being worked around.
>
> Ah, understood.
>
>> The idea would be to move to a URN + payload model like our PTransforms
>> and coders, with a docker-specific one. Using the URN + payload would
>> allow us to have a versioned way to update the environment specifications
>> and deprecate/remove things that are ill defined.
>
> Makes sense to me. It looks like this migration path is already in place
> in `message Environment` in beam_runner_api.proto, with `message
> StandardEnvironments` enumerating some URNs and corresponding payload
> messages just below. So is the gap just getting the two portable runners
> to look at the new fields?
>
> Kenn
>
>> On Fri, Mar 29, 2019 at 6:41 PM Kenneth Knowles <[email protected]> wrote:
>>
>>> On Thu, Mar 28, 2019 at 9:30 AM Lukasz Cwik <[email protected]> wrote:
>>>
>>>> The intention is that these kinds of hints, such as CPU and/or memory,
>>>> should be embedded in the environment specification that is associated
>>>> with the transforms that need resource hints.
>>>>
>>>> The environment spec is woefully ill prepared, as it only has a docker
>>>> URL right now.
>>>
>>> FWIW I think this is actually "extremely well prepared" :-)
>>>
>>> Protobuf is great for adding fields when you need more, but removing is
>>> nearly impossible once deployed, so it is best to do the absolute
>>> minimum until you need to expand.
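[Editor's sketch of the URN + payload model discussed above. This is not the actual Beam proto or SDK code, just a minimal Python illustration of the pattern: each environment kind is identified by a URN, and all kind-specific settings live in an opaque payload that can be versioned independently. The URN strings mirror the ones enumerated in `StandardEnvironments`, but treat the rest as invented for illustration.]

```python
# Toy model of the URN + payload environment spec -- illustrative only,
# not Beam's real proto definitions or Python API.
from dataclasses import dataclass

# URNs in the style of StandardEnvironments in beam_runner_api.proto.
DOCKER_URN = "beam:env:docker:v1"
PROCESS_URN = "beam:env:process:v1"


@dataclass
class Environment:
    urn: str        # identifies which environment type this is
    payload: bytes  # opaque, type-specific configuration


def docker_environment(container_image: str) -> Environment:
    # A real implementation would serialize a docker-specific payload
    # message here; UTF-8 bytes stand in for that in this sketch.
    return Environment(urn=DOCKER_URN, payload=container_image.encode("utf-8"))


env = docker_environment("apache/beam_go_sdk:latest")
```

The point of the pattern is exactly the migration path Kenn describes: a runner dispatches on `urn`, so new environment kinds (process-based, "existing", etc.) can be added, versioned, or deprecated without changing the envelope message.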
>>>
>>> Kenn
>>>
>>>> On Thu, Mar 28, 2019 at 8:45 AM Robert Burke <[email protected]> wrote:
>>>>
>>>>> A question came over the beam-go slack that I wasn't able to answer,
>>>>> in particular for Dataflow*: is there a way to increase how much of a
>>>>> Portable FnAPI worker is dedicated to the SDK side vs. the Runner
>>>>> side?
>>>>>
>>>>> My assumption is that runners should manage it, and have the Runner
>>>>> Harness side be as lightweight as possible, to operate under
>>>>> reasonable memory bounds, allowing the user code more room to spread,
>>>>> since it's largely unknown.
>>>>>
>>>>> I saw there's the Provisioning API
>>>>> <https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_provision_api.proto#L52>
>>>>> which communicates resource limits to the SDK side, but is there a
>>>>> way to make the request (probably on job start-up) in the other
>>>>> direction?
>>>>>
>>>>> I imagine it has to do with the container boot code, but I have only
>>>>> vague knowledge of how that works at present.
>>>>>
>>>>> If there's a portable way to do it, that's ideal, but I suspect this
>>>>> will require a Dataflow-specific answer.
>>>>>
>>>>> Thanks!
>>>>> Robert B
>>>>>
>>>>> *Dataflow doesn't support the Go SDK, but the Go SDK supports
>>>>> Dataflow.
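[Editor's sketch of the split Robert asks about. No runner is documented here to use this exact policy; the function name, the fixed reservation, and the numbers are all invented for illustration. It just makes concrete the assumption in the thread: the runner harness keeps a small fixed budget and the SDK harness gets the rest, which is what the Provisioning API would then report to the SDK side.]

```python
# Hypothetical memory budgeting between runner harness and SDK harness.
# Policy and names are illustrative, not any runner's real behavior.
def split_memory_budget(total_mb: int, runner_reserved_mb: int = 512) -> dict:
    """Reserve a fixed slice for the runner harness; give the rest to the SDK."""
    if total_mb <= runner_reserved_mb:
        raise ValueError("container limit too small for the runner harness")
    return {
        "runner_harness_mb": runner_reserved_mb,
        "sdk_harness_mb": total_mb - runner_reserved_mb,
    }


# e.g. a 4096 MB container would leave 3584 MB for user code under this policy
budget = split_memory_budget(4096)
```

Going in the other direction, as Robert asks, would mean the SDK (or the job submission) influencing `runner_reserved_mb` or the container limit itself, which is why the answer likely lives in the runner-specific container boot/provisioning path rather than in a portable API today.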
