Hi Sebastien,

Python Dataflow jobs typically have a startup time of about 2 minutes per
VM. About 1 minute of that is spinning up the VM, and another minute is
installing user code and dependencies. Using a custom container (when that
becomes available) might shave some time off the latter, but you would
spend additional time downloading the custom container. Unless your
pipeline takes a really long time to install its dependencies, custom
Docker images would not help much with startup time. Also note that startup
time is a fixed cost, so it is amortized over the run time of a long
pipeline.
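
To put rough numbers on the amortization point, here is a minimal sketch in
Python. The 60-second VM boot and install figures come from the estimate
above; the custom-image figures (10 s install, 40 s image download) are
made-up placeholders, not measurements, and the function names are only for
illustration.

```python
# Rough model of per-VM startup overhead; all numbers are illustrative.
VM_BOOT_S = 60       # ~1 minute to spin up the VM (estimate from this thread)
DEP_INSTALL_S = 60   # ~1 minute to install user code and dependencies


def startup_seconds(install_s=DEP_INSTALL_S, image_pull_s=0):
    """Total startup time: VM boot + dependency install + container download."""
    return VM_BOOT_S + install_s + image_pull_s


def overhead_fraction(run_s, startup_s):
    """Share of total wall-clock time that is spent on startup."""
    return startup_s / (startup_s + run_s)


default = startup_seconds()
# Hypothetical custom image: install drops to 10 s, but the pull costs 40 s.
custom = startup_seconds(install_s=10, image_pull_s=40)

for run_min in (5, 60, 600):
    run_s = run_min * 60
    print(f"{run_min:>4} min run: "
          f"default {overhead_fraction(run_s, default):.1%}, "
          f"custom {overhead_fraction(run_s, custom):.1%}")
```

Under these assumptions the two setups differ by only a few seconds, and the
startup share shrinks quickly as the pipeline's run time grows.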

Could you explain more about what you are observing, what dependencies you
are installing and what kind of an improvement you would like to get?

Thank you,
Ahmet

On Thu, Jul 27, 2017 at 9:03 AM, Valentyn Tymofieiev <[email protected]>
wrote:

> Hi Sebastien,
>
> Dataflow currently does not allow providing a custom-built
> worker_harness_container_image. A mechanism to provide custom [SDK harness]
> container images with the pipeline should be possible in the FnAPI world
> [1], which is a work in progress. I'm working on a document to discuss the
> role of containers in FnAPI and will post it in another thread. Stay tuned.
>
> To clarify, the PR mentioned by JB introduces containers that serve a
> different purpose. The focus of that PR is to provide a containerized
> version of the Beam SDK for development purposes.
>
> [1]: s.apache.org/beam-fn-api
>
> On Thu, Jul 27, 2017 at 8:57 AM, Jean-Baptiste Onofré <[email protected]>
> wrote:
>
>> Hi Seb,
>>
>> No, it's for 2.2.0 (in the best case).
>>
>> The 2.1.0 release process has already started.
>>
>> Regards
>> JB
>>
>> On 07/27/2017 05:45 PM, Morand, Sebastien wrote:
>>
>>> Hi,
>>>
>>> Thanks for your fast answer; that's really nice to know. As far as I
>>> understand, it's planned for 2.1.0. Can you confirm?
>>>
>>> Is there a release date for this so far?
>>>
>>> Regards,
>>>
>>> *Sébastien MORAND*
>>> Team Lead Solution Architect
>>> Technology & Operations / Digital Factory
>>> Veolia - Group Information Systems & Technology (IS&T)
>>> Cell.:+33 7 52 66 20 81 / Direct: +33 1 85 57 71 08
>>> Bureau 0144C (Ouest)
>>> 30, rue Madeleine-Vionnet - 93300 Aubervilliers, France
>>> www.veolia.com <http://www.veolia.com>
>>>
>>> On 27 July 2017 at 15:24, Jean-Baptiste Onofré <[email protected]> wrote:
>>>
>>>     Hi Seb,
>>>
>>>     We already have a PR about Dockerfiles ready:
>>>
>>>     https://github.com/apache/beam/pull/3651
>>>
>>>     You can take a look and test that.
>>>
>>>     Regards
>>>     JB
>>>
>>>     On 07/27/2017 03:22 PM, Morand, Sebastien wrote:
>>>
>>>         Hi,
>>>
>>>         In order to increase Dataflow startup speed, we would like to
>>>         provide our own Docker image.
>>>
>>>         I see worker_harness_container_image, which should be the right
>>>         thing, but how does it work?
>>>
>>>         Regards,
>>>
>>>         *Sébastien MORAND*
>>>         Team Lead Solution Architect
>>>         Technology & Operations / Digital Factory
>>>         Veolia - Group Information Systems & Technology (IS&T)
>>>         Cell.: +33 7 52 66 20 81 / Direct: +33 1 85 57 71 08
>>>         Bureau 0144C (Ouest)
>>>         30, rue Madeleine-Vionnet - 93300 Aubervilliers, France
>>>         www.veolia.com <http://www.veolia.com>
>>>
>>>
>>>         --------------------------------------------------------------
>>>         This e-mail transmission (message and any attached files) may
>>>         contain information that is proprietary, privileged and/or
>>>         confidential to Veolia Environnement and/or its affiliates and is
>>>         intended exclusively for the person(s) to whom it is addressed.
>>>         If you are not the intended recipient, please notify the sender
>>>         by return e-mail and delete all copies of this e-mail, including
>>>         all attachments. Unless expressly authorized, any use,
>>>         disclosure, publication, retransmission or dissemination of this
>>>         e-mail and/or of its attachments is strictly prohibited.
>>>
>>>         --------------------------------------------------------------
>>>
>>>
>>>     --
>>>     Jean-Baptiste Onofré
>>>     [email protected]
>>>     http://blog.nanthrax.net
>>>     Talend - http://www.talend.com
>>>
>>>
>>
>> --
>> Jean-Baptiste Onofré
>> [email protected]
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>
>
