Thanks for the explanation, as well as the tip on Dataproc+Flink.  I'll
give that a whirl.
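
For reference, a rough sketch of how switching between the two suggested
setups might look in the Beam Python SDK. The helper names (beam_args, run),
the "local"/"flink" labels, and the localhost:8081 address are placeholders
made up for illustration; only the --runner and --flink_master flags and the
PipelineOptions class come from Beam itself, and none of this has been tried
against a Dataproc cluster.

```python
"""Sketch: choose Beam pipeline flags for a local or a Flink-backed run."""
from typing import List


def beam_args(target: str, flink_master: str = "localhost:8081") -> List[str]:
    """Return command-line style pipeline flags for the given target.

    The "local"/"flink" labels and the default address are hypothetical
    placeholders for this sketch, not Beam conventions.
    """
    if target == "local":
        # Direct runner: the pipeline runs in-process on this machine,
        # so a step can use whatever GPU the machine has.
        return ["--runner=DirectRunner"]
    if target == "flink":
        # Flink runner pointed at an already-running Flink cluster
        # (for example one started on Dataproc nodes with GPUs).
        return ["--runner=FlinkRunner", f"--flink_master={flink_master}"]
    raise ValueError(f"unknown target: {target!r}")


def run(target: str) -> None:
    """Build and run a toy pipeline on the chosen target."""
    # Imported lazily so beam_args() can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(beam_args(target))
    with beam.Pipeline(options=options) as p:
        (p
         | beam.Create([1, 2, 3])
         | beam.Map(lambda x: x * 2)  # stand-in for the GPU-backed step
         | beam.Map(print))
```

Calling run("local") would exercise the pipeline with the Direct runner;
run("flink") assumes a Flink cluster is reachable at the given address.
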
Cheers,
Chris

On Wed, Oct 2, 2019 at 11:10 AM Valentyn Tymofieiev <[email protected]>
wrote:

>
> Hi Chris,
>
> Dataflow does not support GPUs at the moment, but this feature is on our
> radar and we are considering it for future prioritization. Dataflow-on-GKE
> is also not supported.
>
> Currently, the Dataflow worker pool is homogeneous. In the future,
> resource annotations in the pipeline should be the way to go. As you noted,
> resource annotation support needs to happen in the Beam SDK. This feature is
> not tied to a particular functionality (GPUs) or a particular runner
> (Dataflow), and can be implemented in the Beam codebase.
>
> At the moment, you can try experimenting with the Direct runner on a single
> machine with a GPU, or try portable runners that use stand-alone
> infrastructure, for example the Beam Flink runner + Flink on a Dataproc
> cluster with GPUs.
>
> Thanks,
> Valentyn
>
> On Tue, Oct 1, 2019 at 11:24 AM Chris Roat <[email protected]> wrote:
>
>> While evaluating many tools for a project, I found Beam suits my needs
>> quite well from the abstraction point of view.  Both the dead-simple way to
>> scale up (and even down to single-machine for testing) and the ease of
>> moving between different runners are key.  Plus, I'm familiar with the
>> framework from having used Flume while at Google.
>>
>> One thing I'd find useful in the implementation is resource hints[1],
>> particularly to use GPUs for several parts of the processing.  Absent
>> hints and the ability to run easily on GPUs, I'd be happy to break up my
>> pipeline and just spin up all my machines with GPUs for the sub-pipelines
>> that need them.
>>
>> Some paths I'm considering:
>> - Find the easiest way to go from start-cluster-with-gpus (i.e. gcloud
>> container clusters ... --accelerator=...) to run-dataflow-on-said-cluster.
>> What would that be?
>> - Add an --accelerator option to PipelineOptions and implement it for
>> Dataflow.
>>
>> Thanks for any advice,
>> Chris
>>
>> [1] https://issues.apache.org/jira/browse/BEAM-2085
>>
>
