Re: Beam Python Portable Runner - Adding timeout to JobServer grpc calls

enrico canzonieri Fri, 09 Aug 2019 09:56:16 -0700

There seems to be consensus here around adding this feature. I filed
BEAM-7933 <https://issues.apache.org/jira/browse/BEAM-7933> and assigned it
to myself. @Robert I'll check the places where it makes sense to re-use the
timeout value for RPCs.
I should be able to publish a PR sometime around next week.


Thanks,
Enrico
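
A minimal sketch of the approach discussed in this thread: one timeout
value with a 60-second default, overridable by a flag, passed to the setup
RPCs. The flag name --job_server_request_timeout, the helper names, and
the FakeJobService stand-in are hypothetical and are not taken from the
Beam code base. The only gRPC behaviour relied on is that Python stub
methods accept a timeout= keyword argument in seconds.

```python
import argparse

# Assumed default, mirroring the 60-second HTTP client timeout cited in the thread.
DEFAULT_TIMEOUT_SECS = 60


def parse_timeout(argv):
    """Read the (hypothetical) timeout flag, ignoring unrelated pipeline args."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--job_server_request_timeout',
        type=int,
        default=DEFAULT_TIMEOUT_SECS,
        help='Seconds to wait for each JobServer request.')
    known_args, _ = parser.parse_known_args(argv)
    return known_args.job_server_request_timeout


class FakeJobService(object):
    """Stand-in for the JobService gRPC stub; it only records timeouts."""

    def __init__(self):
        self.calls = []

    def Prepare(self, request, timeout=None):
        self.calls.append(('Prepare', timeout))
        return 'preparation-id'

    def Run(self, request, timeout=None):
        self.calls.append(('Run', timeout))
        return 'job-id'


def submit(job_service, pipeline, timeout):
    # Setup RPC: bounded, so an unreachable JobServer fails instead of hanging.
    preparation_id = job_service.Prepare(pipeline, timeout=timeout)
    # Run is left unbounded here, following Kyle's assumption in the thread.
    return job_service.Run(preparation_id)
```

With a real gRPC stub in place of FakeJobService, a call that exceeds its
timeout raises grpc.RpcError with status DEADLINE_EXCEEDED, so
pipeline.run() would fail fast rather than hang. Whether Run should also
be bounded is one of the open questions above.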

On Fri, Aug 9, 2019 at 12:41 AM Robert Bradshaw <[email protected]> wrote:

> If we do provide a configuration value for this, I would make it have a
> fairly large default and re-use the flag for all RPCs of a similar nature,
> not tweak it for this particular service only.
>
> On Fri, Aug 9, 2019 at 2:58 AM Ahmet Altay <[email protected]> wrote:
>
>> Default plus a flag to override sounds reasonable. Although, from Dataflow
>> experience, I do not remember timeouts causing issues, and each newly added
>> flag adds complexity. What do others think?
>>
>> On Thu, Aug 8, 2019 at 11:38 AM Kyle Weaver <[email protected]> wrote:
>>
>>> If we do make a default, I still think it should be configurable via a
>>> flag. I can't think of why the prepare, stage artifact, job state, or job
>>> message requests might take more than 60 seconds, but you never know,
>>> particularly with artifact staging, which might be uploading artifacts to
>>> distributed storage.
>>>
>>> I assume the run request itself would not be subject to timeouts, as
>>> running the pipeline can be assumed to take significantly longer than the
>>> setup work.
>>>
>>> Kyle Weaver | Software Engineer | github.com/ibzib | [email protected]
>>>
>>>
>>> On Thu, Aug 8, 2019 at 11:20 AM Enrico Canzonieri <[email protected]>
>>> wrote:
>>>
>>>> Default timeout with no flag may work as well.
>>>> The main consideration here is whether some API calls may take longer
>>>> than 60 seconds because of the complexity of the user's Beam pipeline. E.g.,
>>>> could job_service.Prepare() take longer than 60 seconds if the given Beam
>>>> pipeline is extremely complex?
>>>>
>>>> Basically, if there are cases where user code may increase the call
>>>> duration to the point that the timeout prevents submitting the app
>>>> itself, then we should consider having a flag.
>>>>
>>>> On 2019/08/07 20:13:12, Ahmet Altay wrote:
>>>> > Could we pick a default timeout value instead of introducing a flag? We
>>>> > use 60 seconds as the default timeout for http client [1], we can do the
>>>> > same here.
>>>> >
>>>> > [1]
>>>> > https://github.com/apache/beam/blob/3a182d64c86ad038692800f5c343659ab0b935b0/sdks/python/apache_beam/internal/http_client.py#L32
>>>> >
>>>> > On Wed, Aug 7, 2019 at 11:53 AM enrico canzonieri wrote:
>>>> >
>>>> > > Hello,
>>>> > >
>>>> > > I noticed that the calls to the JobServer from the Python SDK do not
>>>> > > have timeouts. If I'm not mistaken, that means that the call to
>>>> > > pipeline.run() could hang forever if the JobServer is not running (or
>>>> > > failing to start). E.g., at
>>>> > > https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/portable_runner.py#L307
>>>> > > the call to Prepare() doesn't provide any timeout value, and the same
>>>> > > applies to other JobServer requests.
>>>> > > I was considering adding a --job-server-request-timeout to the
>>>> > > PortableOptions class to be used in the JobServer interactions inside
>>>> > > portable_runner.py.
>>>> > > Is there any specific reason why the timeout is not currently supported?
>>>> > > Does anybody have any objection to adding the jobserver timeout? I could
>>>> > > volunteer to file a ticket and submit a PR for this.
>>>> > >
>>>> > > Cheers,
>>>> > > Enrico Canzonieri
