Pinning setuptools is generally not a good practice. The reason is that at
installation time it might cause removal of the setuptools that is
being used to install packages.
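
As a rough sketch of how one might spot such a pin before submitting a job: the helper name `find_pinned` and the sample requirement list below are hypothetical, not part of Beam or pip, and the check only sees the lines it is given.

```python
import re

# Matches an exact pin such as "setuptools==36.0.1".
# Group 1 is the package name, group 2 is the pinned version.
_PIN = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*==\s*([^\s;#]+)")


def find_pinned(requirements, names=("setuptools",)):
    """Return (name, version) pairs for exact pins of the given package names."""
    wanted = {n.lower() for n in names}
    pins = []
    for line in requirements:
        match = _PIN.match(line)
        if match and match.group(1).lower() in wanted:
            pins.append((match.group(1), match.group(2)))
    return pins


if __name__ == "__main__":
    # Hypothetical requirement lines, for illustration only.
    sample = ["psycopg2>=2.7", "setuptools==36.0.1", "six"]
    print(find_pinned(sample))
```

A pin that arrives transitively, as a dependency of some other package, would not show up in a list like this; the output of `pip freeze` in a clean environment would be a more complete input.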

FWIW, Dataflow workers should have setuptools 33.1.1, which was released on
2017/01/16.
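
To compare what a given environment actually has against that number, something like the following could be run there. This is only a sketch: `version_tuple`, `installed_setuptools_version`, and `WORKER_SETUPTOOLS` are illustrative names, and the parsing is deliberately simple (leading digits of each dot-separated part), not a full version parser.

```python
WORKER_SETUPTOOLS = "33.1.1"  # the version quoted above; other worker images may differ


def version_tuple(version):
    """Turn '36.0.1' into (36, 0, 1); stop at the first part without leading digits."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_setuptools_version():
    """Return the installed setuptools version string, or None if it is missing."""
    try:
        import setuptools
    except ImportError:
        return None
    return setuptools.__version__


if __name__ == "__main__":
    local = installed_setuptools_version()
    if local is None:
        print("setuptools is not installed here")
    elif version_tuple(local) > version_tuple(WORKER_SETUPTOOLS):
        print("local setuptools %s is newer than %s" % (local, WORKER_SETUPTOOLS))
    else:
        print("local setuptools %s is not newer than %s" % (local, WORKER_SETUPTOOLS))
```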

Ahmet

On Tue, Jun 6, 2017 at 6:53 PM, Dmitry Demeshchuk <[email protected]>
wrote:

> Thanks, Ahmet, it really turned out that Stackdriver had more logs than
> just the Dataflow logs section.
>
> So, I ended up seeing this command, which fails consistently:
>
> I    Running setup.py install for dataflow: started
> I      Running setup.py install for dataflow: finished with status 'error'
> I      Complete output from command /usr/bin/python -u -c "import setuptools, 
> tokenize;__file__='/tmp/pip-bXyST4-build/setup.py';f=getattr(tokenize, 
> 'open', open)(__file__);code=f.read().replace('\r\n', 
> '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record 
> /tmp/pip-sHw6oI-record/install-record.txt --single-version-externally-managed 
> --compile:
> I      usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
> I         or: -c --help [cmd1 cmd2 ...]
> I         or: -c --help-commands
> I         or: -c cmd --help
> I
> I      error: option --single-version-externally-managed not recognized
> I
> I      ----------------------------------------
> I  Command "/usr/bin/python -u -c "import setuptools, 
> tokenize;__file__='/tmp/pip-bXyST4-build/setup.py';f=getattr(tokenize, 
> 'open', open)(__file__);code=f.read().replace('\r\n', 
> '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record 
> /tmp/pip-sHw6oI-record/install-record.txt --single-version-externally-managed 
> --compile" failed with error code 1 in /tmp/pip-bXyST4-build/
> I  /usr/local/bin/pip failed with exit status 1
>
>
> This seems to mean that the natively installed setuptools is too old, and
> that the failing command was generated by a newer version of setuptools
> (specifically, my project has setuptools==36.0.1 as a dependency of some
> package). I'm still digging through the Stackdriver logs, but so far I
> couldn't find the exact reason for the failure.
>
> Also talking to the Dataflow folks, maybe they'll have a better idea. I'll
> also try to compare this to the output of successful pipelines and see if
> it gives me any ideas.
>
> Thank you.
>
> On Tue, Jun 6, 2017 at 4:40 PM, Ahmet Altay <[email protected]> wrote:
>
>>
>>
>> On Tue, Jun 6, 2017 at 2:07 PM, Dmitry Demeshchuk <[email protected]>
>> wrote:
>>
>>> Hi Ahmet,
>>>
>>> Thanks a lot for pointing out that doc, I somehow missed it on the
>>> official Python SDK page!
>>>
>>> One thing that comes to my mind is that generally one should probably
>>> use the 'install' command in setuptools, not 'build', like it's done in
>>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/complete/juliaset/setup.py#L113.
>>> Reason being, the 'build' step seems to be executed on the original
>>> machine, not inside the runner's containers, while 'install' will be
>>> triggered inside of them. If I run a pipeline that uses setup.py with a
>>> "build" step, it fails due to being unable to "apt-get install libpq-dev"
>>> on a mac.
>>>
>>
>> Thank you. I believe this example should work similarly with the install
>> command. Also, if possible, please file a JIRA issue with your ideas and we
>> can work on improving things.
>>
>>
>>>
>>> I'm still trying to make it work with either build or install steps,
>>> talking to the Dataflow folks in parallel to get more understanding of what
>>> I'm doing wrong (Dataflow doesn't send out installation failure logs to
>>> Stackdriver, only runtime logs, so it seems).
>>>
>>
>> Have you tried looking at the worker-startup logs? All of the logs should
>> be in Stackdriver.
>>
>>
>>>
>>> On Tue, Jun 6, 2017 at 9:21 AM, Ahmet Altay <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> Please see Managing Python Pipeline Dependencies [1] for various ways
>>>> of installing additional dependencies. The section on non-Python
>>>> dependencies is relevant to your question.
>>>>
>>>> Thank you,
>>>> Ahmet
>>>>
>>>> [1] https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/
>>>>
>>>> On Mon, Jun 5, 2017 at 11:52 PM, Morand, Sebastien <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Interested too. It would be nice, for instance, to add an SFTP
>>>>> BoundedSource, but that needs paramiko compiled against the SSL library
>>>>> (and so ssl-dev installed).
>>>>>
>>>>> Regards,
>>>>>
>>>>> *Sébastien MORAND*
>>>>> Team Lead Solution Architect
>>>>> Technology & Operations / Digital Factory
>>>>> Veolia - Group Information Systems & Technology (IS&T)
>>>>> Cell.: +33 7 52 66 20 81 / Direct: +33 1 85 57 71 08
>>>>> Bureau 0144C (Ouest)
>>>>> 30, rue Madeleine-Vionnet - 93300 Aubervilliers, France
>>>>> www.veolia.com
>>>>>
>>>>> On 6 June 2017 at 08:01, Dmitry Demeshchuk <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi again, folks,
>>>>>>
>>>>>> How should I go about installing Python packages that need to be
>>>>>> built and/or require native dependencies like shared libraries or such?
>>>>>>
>>>>>> I guess I could potentially build the C-based modules using the same
>>>>>> version of kernel and glibc that Dataflow is running, but it doesn't
>>>>>> seem like there's any way to install shared libraries on these boxes,
>>>>>> right?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Dmitry Demeshchuk.
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Dmitry Demeshchuk.
>>>
>>
>>
>
>
> --
> Best regards,
> Dmitry Demeshchuk.
>
