On Fri, Apr 17, 2020 at 2:45 PM Robert Bradshaw <rober...@google.com> wrote:

> Hi Holden!
>
> I agree with Kyle that it makes sense to have some caveat about Flink and
> Spark, though at this point they're not /that/ new (at least not Flink).
>
True, maybe "early-stage" would be better wording? The TFX PyBeam Flink
support isn't yet mature enough (I believe there is interest in integrating
it into Kubeflow, but it hasn't happened yet).

>
> I am curious what extra support Kubeflow is "missing" (or, conversely,
> what extra support it has for Dataflow that goes beyond just specifying a
> different runner) to the point that these runners are declared
> "unsupported." Or is it literally a matter of not providing user support?
>
So the Kubeflow TFX components (in
https://github.com/kubeflow/pipelines/tree/master/components) are limited
to local mode.

>
> On Fri, Apr 17, 2020 at 12:27 PM Kyle Weaver <kcwea...@google.com> wrote:
>
>> Hi Holden,
>>
>> The note on Flink & Spark support sounds reasonable to me. I am
>> optimistic about getting Flink + TFX + Kubeflow working fairly soon, but I
>> agree that we don't want to over-promise.
>>
>> I'm not so sure about the status of Dataflow here, perhaps someone else
>> can comment on that.
>>
>> Looking forward to the book :)
>>
>> Kyle
>>
>> On Fri, Apr 17, 2020 at 1:14 PM Holden Karau <hol...@pigscanfly.ca>
>> wrote:
>>
>>> Hi Apache Beam Developers,
>>>
>>> I'm working on a book about Kubeflow, which naturally has a section on
>>> TFX. I want to set users' expectations correctly so I wanted to know what
>>> y'all thought of this NOTE we were thinking of including in the early
>>> release:
>>>
>>> Apache Beam’s Python support outside of Google Cloud's Dataflow is
>>> relatively new. TFX is a Python tool, so scaling it depends on Apache
>>> Beam's Python support. You can scale your job by using the non-portable
>>> Dataflow component, but this requires changing your pipeline code and isn't
>>> supported by Kubeflow's current TFX components. As Apache Beam's support
>>> for Apache Flink & Spark improves, support may be added for scaling the TFX
>>> components in a portable manner.
>>>
>>> Does this sound reasonable to folks? I don't want to over-promise but I
>>> also don't want to scare people away given all of the progress that is
>>> being made in supporting the open-source runners with language portability.
>>>
>>> Cheers,
>>>
>>> Holden :)
>>>
>>> --
>>> Twitter: https://twitter.com/holdenkarau
>>> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>

-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
