Valentyn Tymofieiev commented on BEAM-3950:

We should start by allowing --sdk_location to point to a wheel file and
handling it correctly, without renaming it to a tar. This will unblock release
qualification of wheels and allow users to pass wheels if they want to,
although that alone is not a very convenient user experience.
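As a rough sketch of what "handle that correctly" could mean: keep a wheel's original file name when staging it, since pip relies on the version and platform tags encoded in wheel names, instead of renaming everything to the single tarball name. The staged tarball name `dataflow_python_sdk.tar` comes from the current runner behavior; the helper itself is hypothetical, not the actual Beam implementation.

```python
import os


def staged_sdk_name(sdk_location):
    """Pick the name to stage the SDK package under (hypothetical helper).

    Wheel file names encode version/ABI/platform tags that pip needs to
    select a compatible package, so they must be preserved as-is. Source
    distributions keep the fixed name the Dataflow worker expects today.
    """
    name = os.path.basename(sdk_location)
    if name.endswith(".whl"):
        return name
    return "dataflow_python_sdk.tar"
```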

If we want to stage a wheel by default, I think we should stage both the source
tarball and the wheel(s). The worker container should then decide which to use:
it can try the wheel and, if that does not work out, fall back to the tarball.
 * Custom containers may have a platform that is incompatible with the wheel we 
choose to stage.
 * Python 3 containers may choose to install the SDK from sources for some time,
until we start building Python 3 wheels.
 * Wheels may not be immediately recognized by Dataflow worker containers, 
although this is not critical if we can wait with SDK changes.
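The try-the-wheel-then-fall-back idea above can be sketched as follows. The file-name patterns and the install loop are assumptions for illustration, not the actual worker container logic:

```python
import subprocess
import sys


def sdk_install_candidates(file_names):
    """Order staged SDK artifacts: prefer wheels, fall back to source tarballs."""
    wheels = sorted(n for n in file_names if n.endswith(".whl"))
    tarballs = sorted(n for n in file_names if n.endswith(".tar.gz"))
    return wheels + tarballs


def install_staged_sdk(staged_files):
    """Try each staged package in preference order until one installs.

    A wheel built for an incompatible platform will fail to install,
    in which case we simply move on to the source tarball.
    """
    for candidate in sdk_install_candidates(staged_files):
        result = subprocess.run(
            [sys.executable, "-m", "pip", "install", candidate])
        if result.returncode == 0:
            return candidate
    raise RuntimeError("no staged SDK package could be installed")
```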

Starting from version 2.4, the Dataflow SDK should already be installing its
apache-beam[gcp] dependency on Dataflow workers from wheels.

> Dataflow Runner should supply a wheel version of Python SDK if it is available
> ------------------------------------------------------------------------------
>                 Key: BEAM-3950
>                 URL: https://issues.apache.org/jira/browse/BEAM-3950
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow, sdk-py-core
>            Reporter: Valentyn Tymofieiev
>            Assignee: Valentyn Tymofieiev
>            Priority: Major
