Hi Ted - KafkaIO is not yet implemented using Splittable DoFns (it was
implemented before SDFs existed and hasn't been rewritten yet), but it will
be once more runners catch up on SDF support: currently Dataflow and Flink
have it. +Chamikara Jayalath <[email protected]> is currently working
on implementing it using SDFs in the Python SDK.
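
Since there is no KafkaIO example yet, here is a minimal sketch in plain
Python of the checkpoint-splitting idea from the quoted reply below: a
restriction [start_offset, infinity) is repeatedly split into
[start_offset, end_offset) and [end_offset, infinity). This is not Beam's
OffsetRangeTracker and it does not talk to Kafka; the names OffsetRange,
ToyOffsetTracker and read_batch are hypothetical, and a Python list stands
in for a partition log.

```python
import math


class OffsetRange:
    """Half-open range [start, stop); stop defaults to infinity for a tail read."""

    def __init__(self, start, stop=math.inf):
        if start > stop:
            raise ValueError("start must not exceed stop")
        self.start = start
        self.stop = stop

    def __repr__(self):
        return f"[{self.start}, {self.stop})"


class ToyOffsetTracker:
    """Hypothetical stand-in for a restriction tracker (not the Beam class)."""

    def __init__(self, restriction):
        self.restriction = restriction
        self.last_claimed = None

    def try_claim(self, offset):
        """Claim an offset; return False once it falls outside the restriction."""
        if offset < self.restriction.start:
            raise ValueError("offset is before the start of the restriction")
        if self.last_claimed is not None and offset <= self.last_claimed:
            raise ValueError("offsets must be claimed in increasing order")
        if offset >= self.restriction.stop:
            return False
        self.last_claimed = offset
        return True

    def checkpoint(self):
        """Shrink this restriction to the claimed part and return the residual."""
        if self.last_claimed is None:
            end = self.restriction.start
        else:
            end = self.last_claimed + 1
        residual = OffsetRange(end, self.restriction.stop)
        self.restriction = OffsetRange(self.restriction.start, end)
        return residual


def read_batch(log, restriction, max_records):
    """Read up to max_records from the simulated log, then checkpoint.

    Returns (records, primary, residual), where primary is the part that was
    processed and residual is what remains to be resumed later.
    """
    tracker = ToyOffsetTracker(restriction)
    records = []
    for offset in range(restriction.start, len(log)):
        if len(records) >= max_records or not tracker.try_claim(offset):
            break
        records.append(log[offset])
    residual = tracker.checkpoint()
    return records, tracker.restriction, residual
```

Calling read_batch(log, OffsetRange(0), 2) and then calling it again with
the returned residual walks through the log in pieces; the residual always
keeps an infinite end, so records that arrive later fall into an existing
restriction rather than needing a new one.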

On Thu, Mar 8, 2018 at 4:34 PM Ted Yu <[email protected]> wrote:

> Eugene:
> Very informative talk.
>
> I looked at:
>
> sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTrackerTest.java
>
> Is there some example showing how OffsetRangeTracker works with Kafka
> partition(s) ?
>
> Thanks
>
> On Thu, Mar 8, 2018 at 3:58 PM, Eugene Kirpichov <[email protected]>
> wrote:
>
>> Hi Thomas!
>>
>> In the case of tailing a Kafka partition, the restriction would be
>> [start_offset, infinity), and checkpointing would keep splitting it
>> into [start_offset, end_offset) and [end_offset, infinity).
>>
>> On Thu, Mar 8, 2018 at 3:52 PM Thomas Weise <[email protected]> wrote:
>>
>>> Eugene,
>>>
>>> I actually had one question regarding the application of SDF for the
>>> Kafka consumer. Reading through a topic partition can be parallelized by
>>> splitting a partition into multiple restrictions (for use cases where order
>>> does not matter). But how would the tail read be managed? I assume there
>>> would not be a new restriction whenever new records arrive (added latency)?
>>> The examples on slide 40 show an end offset for Kafka, but for a continuous
>>> read there wouldn't be an end offset?
>>>
>>> Thanks,
>>> Thomas
>>>
>>>
>>> On Thu, Mar 8, 2018 at 2:59 PM, Thomas Weise <[email protected]> wrote:
>>>
>>>> Great, thanks for sharing!
>>>>
>>>>
>>>> On Thu, Mar 8, 2018 at 12:16 PM, Eugene Kirpichov <[email protected]> wrote:
>>>>
>>>>> Oops that's just the template I used. Thanks for noticing, will
>>>>> regenerate the PDF and reupload when I get to it.
>>>>>
>>>>>
>>>>> On Thu, Mar 8, 2018, 11:59 AM Dan Halperin <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Looks like it was a good talk! Why is it Google Confidential &
>>>>>> Proprietary, though?
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>> On Thu, Mar 8, 2018 at 11:49 AM, Eugene Kirpichov <[email protected]> wrote:
>>>>>>
>>>>>>> Hey all,
>>>>>>>
>>>>>>> The slides for my talk yesterday at Strata San Jose
>>>>>>> https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63696
>>>>>>> have been posted on the talk page. They may be of interest to both
>>>>>>> users and IO authors.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>
