Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-06-01 Thread Matthias Baetens
Hey Ismaël,

That totally makes sense, I will work on this on the weekend and send a PR.

Cheers,
Matthias

On Thu, 31 May 2018 at 09:45 Ismaël Mejía  wrote:

> Great !
>
> Matthias, I think it makes more sense to have these guidelines on the Beam
> site than in a Google doc. Could you please submit a PR for this?
>
> On Thu, May 31, 2018 at 8:03 AM Matthias Baetens
>  wrote:
> >
> > Hey Eugene, hi all!
> >
> > Happy to say your talk is now on the Beam YouTube channel and can be
> > watched here.
> > It'd be great to see more of these on the channel so we can start
> > sharing them at meetups, conferences and other places and see this grow, so
> > don't hesitate to reach out to me (directly) if you want your video on the
> > channel. Find more about the process / guidelines here.
> >
> > Best,
> > Matthias
> >
> > On Mon, 14 May 2018 at 22:19 Eugene Kirpichov wrote:
> >>
> >> Hi Matthias,
> >>
> >> Thank you. Here is the raw file on Google Drive:
> >> https://drive.google.com/file/d/1gUoe6UrpNNO3ijYSgTvD5GBy-14Zx6dT/view?usp=sharing
> >> And yes, I have permission from O'Reilly/Strata to use this file
> >> whichever way I want, so it's ok to share on YouTube.
> >>
> >> On Fri, May 11, 2018 at 12:13 PM Matthias Baetens <
> >> baetensmatth...@gmail.com> wrote:
> >>>
> >>> Hey Eugene,
> >>>
> >>> Apologies for picking this up so late, but I could help uploading your
> >>> video to the Beam channel.
> >>> Are you able to send me the raw file and do you have sign-off to go
> >>> ahead with sharing it on YouTube?
> >>>
> >>> Thanks.
> >>> Matthias
> >>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-05-31 Thread Ismaël Mejía
Great !

Matthias, I think it makes more sense to have these guidelines on the Beam
site than in a Google doc. Could you please submit a PR for this?

On Thu, May 31, 2018 at 8:03 AM Matthias Baetens
 wrote:
>
> Hey Eugene, hi all!
>
> Happy to say your talk is now on the Beam YouTube channel and can be watched
> here.
> It'd be great to see more of these on the channel so we can start sharing
> them at meetups, conferences and other places and see this grow, so don't
> hesitate to reach out to me (directly) if you want your video on the channel.
> Find more about the process / guidelines here.
>
> Best,
> Matthias
>
> On Mon, 14 May 2018 at 22:19 Eugene Kirpichov  wrote:
>>
>> Hi Matthias,
>>
>> Thank you. Here is the raw file on Google Drive: 
>> https://drive.google.com/file/d/1gUoe6UrpNNO3ijYSgTvD5GBy-14Zx6dT/view?usp=sharing
>> And yes, I have permission from O'Reilly/Strata to use this file whichever 
>> way I want, so it's ok to share on YouTube.
>>
>> On Fri, May 11, 2018 at 12:13 PM Matthias Baetens 
>>  wrote:
>>>
>>> Hey Eugene,
>>>
>>> Apologies for picking this up so late, but I could help uploading your 
>>> video to the Beam channel.
>>> Are you able to send me the raw file and do you have sign-off to go ahead 
>>> with sharing it on YouTube?
>>>
>>> Thanks.
>>> Matthias
>>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-05-30 Thread Matthias Baetens
Hey Eugene, hi all!

Happy to say your talk is now on the Beam YouTube channel and can be
watched here.
It'd be great to see more of these on the channel so we can start sharing
them at meetups, conferences and other places and see this grow, so don't
hesitate to reach out to me (directly) if you want your video on the
channel. Find more about the process / guidelines here.

Best,
Matthias

On Mon, 14 May 2018 at 22:19 Eugene Kirpichov  wrote:

> Hi Matthias,
>
> Thank you. Here is the raw file on Google Drive:
> https://drive.google.com/file/d/1gUoe6UrpNNO3ijYSgTvD5GBy-14Zx6dT/view?usp=sharing
>
> And yes, I have permission from O'Reilly/Strata to use this file whichever
> way I want, so it's ok to share on YouTube.
>
> On Fri, May 11, 2018 at 12:13 PM Matthias Baetens <
> baetensmatth...@gmail.com> wrote:
>
>> Hey Eugene,
>>
>> Apologies for picking this up so late, but I could help uploading your
>> video to the Beam channel.
>> Are you able to send me the raw file and do you have sign-off to go ahead
>> with sharing it on YouTube?
>>
>> Thanks.
>> Matthias
>>
>> On Sat, 14 Apr 2018 at 21:45 Eugene Kirpichov 
>> wrote:
>>
>>> Hi all,
>>>
>>> The video is now available. I got it from my Strata account and I have
>>> permission to use and share it freely, so I published it on my own YouTube
>>> page (where there's nothing else...). Perhaps it makes sense to add to the
>>> Beam YouTube channel, but AFAIK only a PMC member can do that.
>>>
>>> https://www.youtube.com/watch?v=NIn9E5TVoCA
>>>
>>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-05-14 Thread Eugene Kirpichov
Hi Matthias,

Thank you. Here is the raw file on Google Drive:
https://drive.google.com/file/d/1gUoe6UrpNNO3ijYSgTvD5GBy-14Zx6dT/view?usp=sharing

And yes, I have permission from O'Reilly/Strata to use this file whichever
way I want, so it's ok to share on YouTube.

On Fri, May 11, 2018 at 12:13 PM Matthias Baetens 
wrote:

> Hey Eugene,
>
> Apologies for picking this up so late, but I could help uploading your
> video to the Beam channel.
> Are you able to send me the raw file and do you have sign-off to go ahead
> with sharing it on YouTube?
>
> Thanks.
> Matthias
>
> On Sat, 14 Apr 2018 at 21:45 Eugene Kirpichov 
> wrote:
>
>> Hi all,
>>
>> The video is now available. I got it from my Strata account and I have
>> permission to use and share it freely, so I published it on my own YouTube
>> page (where there's nothing else...). Perhaps it makes sense to add to the
>> Beam YouTube channel, but AFAIK only a PMC member can do that.
>>
>> https://www.youtube.com/watch?v=NIn9E5TVoCA
>>
>>
>> On Tue, Mar 13, 2018 at 3:33 AM James  wrote:
>>
>>> Very informative, thanks!
>>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-05-11 Thread Matthias Baetens
Hey Eugene,

Apologies for picking this up so late, but I could help uploading your
video to the Beam channel.
Are you able to send me the raw file and do you have sign-off to go ahead
with sharing it on YouTube?

Thanks.
Matthias

On Sat, 14 Apr 2018 at 21:45 Eugene Kirpichov  wrote:

> Hi all,
>
> The video is now available. I got it from my Strata account and I have
> permission to use and share it freely, so I published it on my own YouTube
> page (where there's nothing else...). Perhaps it makes sense to add to the
> Beam YouTube channel, but AFAIK only a PMC member can do that.
>
> https://www.youtube.com/watch?v=NIn9E5TVoCA
>
>
> On Tue, Mar 13, 2018 at 3:33 AM James  wrote:
>
>> Very informative, thanks!
>>
>> On Fri, Mar 9, 2018 at 4:49 PM Etienne Chauchot 
>> wrote:
>>
>>> Great !
>>>
>>> Thanks for sharing.
>>>
>>> Etienne
>>>
>>
>>> On Thursday, 08 March 2018 at 19:49 +, Eugene Kirpichov wrote:
>>>
>>> Hey all,
>>>
>>> The slides for my yesterday's talk at Strata San Jose
>>> https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63696
>>>  have
>>> been posted on the talk page. They may be of interest both to users and IO
>>> authors.
>>>
>>> Thanks.
>>>
>>> --


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-04-14 Thread Eugene Kirpichov
Hi all,

The video is now available. I got it from my Strata account and I have
permission to use and share it freely, so I published it on my own YouTube
page (where there's nothing else...). Perhaps it makes sense to add to the
Beam YouTube channel, but AFAIK only a PMC member can do that.

https://www.youtube.com/watch?v=NIn9E5TVoCA


On Tue, Mar 13, 2018 at 3:33 AM James  wrote:

> Very informative, thanks!
>
> On Fri, Mar 9, 2018 at 4:49 PM Etienne Chauchot 
> wrote:
>
>> Great !
>>
>> Thanks for sharing.
>>
>> Etienne
>>
>> On Thursday, 08 March 2018 at 19:49 +, Eugene Kirpichov wrote:
>>
>> Hey all,
>>
>> The slides for my yesterday's talk at Strata San Jose
>> https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63696
>>  have
>> been posted on the talk page. They may be of interest both to users and IO
>> authors.
>>
>> Thanks.
>>
>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-13 Thread James
Very informative, thanks!

On Fri, Mar 9, 2018 at 4:49 PM Etienne Chauchot 
wrote:

> Great !
>
> Thanks for sharing.
>
> Etienne
>
> On Thursday, 08 March 2018 at 19:49 +, Eugene Kirpichov wrote:
>
> Hey all,
>
> The slides for my yesterday's talk at Strata San Jose
> https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63696 
> have
> been posted on the talk page. They may be of interest both to users and IO
> authors.
>
> Thanks.
>
>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-09 Thread Etienne Chauchot
Great !
Thanks for sharing.
Etienne
On Thursday, 08 March 2018 at 19:49 +, Eugene Kirpichov wrote:
> Hey all,
> 
> The slides for my yesterday's talk at Strata San Jose
> https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63696
> have been posted on the talk page. They may be of interest both
> to users and IO authors.
> 
> Thanks.

Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Chamikara Jayalath
Great talk, Eugene.

Ted, will share more info on Kafka IO for Python soon :)

- Cham

On Thu, Mar 8, 2018 at 4:55 PM Ted Yu  wrote:

> I see.
>
> I have added myself as watcher on BEAM-3788.
>
> Thanks
>
> On Thu, Mar 8, 2018 at 4:51 PM, Eugene Kirpichov 
> wrote:
>
>> Hi Ted - KafkaIO is not yet implemented using Splittable DoFn's (it was
>> implemented before SDFs existed and hasn't been rewritten yet), but it will
>> be, once more runners catch up with the support: currently we have Dataflow
>> and Flink. +Chamikara Jayalath  is currently
>> working on implementing it using SDFs in the Python SDK.
>>
>> On Thu, Mar 8, 2018 at 4:34 PM Ted Yu  wrote:
>>
>>> Eugene:
>>> Very informative talk.
>>>
>>> I looked at:
>>>
>>> sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTrackerTest.java
>>>
>>> Is there some example showing how OffsetRangeTracker works with Kafka
>>> partition(s) ?
>>>
>>> Thanks
>>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Ted Yu
I see.

I have added myself as watcher on BEAM-3788.

Thanks

On Thu, Mar 8, 2018 at 4:51 PM, Eugene Kirpichov 
wrote:

> Hi Ted - KafkaIO is not yet implemented using Splittable DoFn's (it was
> implemented before SDFs existed and hasn't been rewritten yet), but it will
> be, once more runners catch up with the support: currently we have Dataflow
> and Flink. +Chamikara Jayalath  is currently
> working on implementing it using SDFs in the Python SDK.
>
> On Thu, Mar 8, 2018 at 4:34 PM Ted Yu  wrote:
>
>> Eugene:
>> Very informative talk.
>>
>> I looked at:
>> sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTrackerTest.java
>>
>> Is there some example showing how OffsetRangeTracker works with Kafka
>> partition(s) ?
>>
>> Thanks
>>
>> On Thu, Mar 8, 2018 at 3:58 PM, Eugene Kirpichov 
>> wrote:
>>
>>> Hi Thomas!
>>>
>>> In case of tailing a Kafka partition, the restriction would be
>>> [start_offset, infinity), and it would keep being split by checkpointing
>>> into [start_offset, end_offset) and [end_offset, infinity)
>>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Eugene Kirpichov
Hi Ted - KafkaIO is not yet implemented using Splittable DoFns (it was
implemented before SDFs existed and hasn't been rewritten yet), but it will
be once more runners catch up with the support: currently we have Dataflow
and Flink. +Chamikara Jayalath is currently working
on implementing it using SDFs in the Python SDK.

On Thu, Mar 8, 2018 at 4:34 PM Ted Yu  wrote:

> Eugene:
> Very informative talk.
>
> I looked at:
>
> sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTrackerTest.java
>
> Is there some example showing how OffsetRangeTracker works with Kafka
> partition(s) ?
>
> Thanks
>
> On Thu, Mar 8, 2018 at 3:58 PM, Eugene Kirpichov 
> wrote:
>
>> Hi Thomas!
>>
>> In case of tailing a Kafka partition, the restriction would be
>> [start_offset, infinity), and it would keep being split by checkpointing
>> into [start_offset, end_offset) and [end_offset, infinity)
>>
>> On Thu, Mar 8, 2018 at 3:52 PM Thomas Weise  wrote:
>>
>>> Eugene,
>>>
>>> I actually had one question regarding the application of SDF for the
>>> Kafka consumer. Reading through a topic partition can be parallel by
>>> splitting a partition into multiple restrictions (for use cases where order
>>> does not matter). But how would the tail read be managed? I assume there
>>> would not be a new restriction whenever new records arrive (added latency)?
>>> The examples on slide 40 show an end offset for Kafka, but for a continuous
>>> read there wouldn't be an end offset?
>>>
>>> Thanks,
>>> Thomas
>>>
>>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Ted Yu
Eugene:
Very informative talk.

I looked at:
sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTrackerTest.java

Is there some example showing how OffsetRangeTracker works with Kafka
partition(s) ?

Thanks

On Thu, Mar 8, 2018 at 3:58 PM, Eugene Kirpichov 
wrote:

> Hi Thomas!
>
> In case of tailing a Kafka partition, the restriction would be
> [start_offset, infinity), and it would keep being split by checkpointing
> into [start_offset, end_offset) and [end_offset, infinity)
>
> On Thu, Mar 8, 2018 at 3:52 PM Thomas Weise  wrote:
>
>> Eugene,
>>
>> I actually had one question regarding the application of SDF for the
>> Kafka consumer. Reading through a topic partition can be parallel by
>> splitting a partition into multiple restrictions (for use cases where order
>> does not matter). But how would the tail read be managed? I assume there
>> would not be a new restriction whenever new records arrive (added latency)?
>> The examples on slide 40 show an end offset for Kafka, but for a continuous
>> read there wouldn't be an end offset?
>>
>> Thanks,
>> Thomas
>>
>>
>> On Thu, Mar 8, 2018 at 2:59 PM, Thomas Weise  wrote:
>>
>>> Great, thanks for sharing!
>>>
>>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Eugene Kirpichov
Hi Thomas!

In the case of tailing a Kafka partition, the restriction would be
[start_offset, infinity), and it would keep being split by checkpointing
into [start_offset, end_offset) and [end_offset, infinity).
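
To make the splitting concrete, here is a rough sketch in plain Python. It
does not use Beam's actual API: OffsetRange and TailingOffsetTracker are
hypothetical names made up for this illustration, and the only assumption is
that offsets are claimed one by one before a checkpoint is taken.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class OffsetRange:
    """Half-open range [start, end); end is math.inf for a tailing read."""
    start: int
    end: float  # an integer offset, or math.inf when there is no end yet


class TailingOffsetTracker:
    """Hypothetical tracker for illustration only (not a Beam class)."""

    def __init__(self, restriction):
        self.restriction = restriction
        self.last_claimed = None

    def try_claim(self, offset):
        # A claim succeeds only while the offset is inside the restriction.
        if offset < self.restriction.start:
            raise ValueError("offset is before the start of the restriction")
        if offset >= self.restriction.end:
            return False
        self.last_claimed = offset
        return True

    def checkpoint(self):
        # Split at the first unclaimed offset: the primary part becomes
        # [start, end_offset) and the residual is [end_offset, original end).
        if self.last_claimed is None:
            end_offset = self.restriction.start
        else:
            end_offset = self.last_claimed + 1
        residual = OffsetRange(end_offset, self.restriction.end)
        self.restriction = OffsetRange(self.restriction.start, end_offset)
        return residual


# Tail a partition starting at offset 100, read three records, checkpoint.
tracker = TailingOffsetTracker(OffsetRange(100, math.inf))
for offset in (100, 101, 102):
    assert tracker.try_claim(offset)
residual = tracker.checkpoint()
print(tracker.restriction)  # OffsetRange(start=100, end=103)
print(residual)             # OffsetRange(start=103, end=inf)
```

The residual is again of the form [end_offset, infinity), so processing it
later and checkpointing again repeats the same split, which is why no new
restriction is needed each time records arrive.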

On Thu, Mar 8, 2018 at 3:52 PM Thomas Weise  wrote:

> Eugene,
>
> I actually had one question regarding the application of SDF for the Kafka
> consumer. Reading through a topic partition can be parallel by splitting a
> partition into multiple restrictions (for use cases where order does not
> matter). But how would the tail read be managed? I assume there would not
> be a new restriction whenever new records arrive (added latency)? The
> examples on slide 40 show an end offset for Kafka, but for a continuous
> read there wouldn't be an end offset?
>
> Thanks,
> Thomas
>
>
> On Thu, Mar 8, 2018 at 2:59 PM, Thomas Weise  wrote:
>
>> Great, thanks for sharing!
>>
>>
>> On Thu, Mar 8, 2018 at 12:16 PM, Eugene Kirpichov 
>> wrote:
>>
>>> Oops that's just the template I used. Thanks for noticing, will
>>> regenerate the PDF and reupload when I get to it.
>>>
>>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Thomas Weise
Eugene,

I actually had one question regarding the application of SDF for the Kafka
consumer. Reading through a topic partition can be parallelized by splitting
the partition into multiple restrictions (for use cases where order does not
matter). But how would the tail read be managed? I assume there would not
be a new restriction whenever new records arrive (added latency)? The
examples on slide 40 show an end offset for Kafka, but for a continuous
read there wouldn't be an end offset?

Thanks,
Thomas


On Thu, Mar 8, 2018 at 2:59 PM, Thomas Weise  wrote:

> Great, thanks for sharing!
>
>
> On Thu, Mar 8, 2018 at 12:16 PM, Eugene Kirpichov 
> wrote:
>
>> Oops that's just the template I used. Thanks for noticing, will
>> regenerate the PDF and reupload when I get to it.
>>
>>
>> On Thu, Mar 8, 2018, 11:59 AM Dan Halperin  wrote:
>>
>>> Looks like it was a good talk! Why is it Google Confidential &
>>> Proprietary, though?
>>>
>>> Dan
>>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Thomas Weise
Great, thanks for sharing!


On Thu, Mar 8, 2018 at 12:16 PM, Eugene Kirpichov 
wrote:

> Oops that's just the template I used. Thanks for noticing, will regenerate
> the PDF and reupload when I get to it.
>
>
> On Thu, Mar 8, 2018, 11:59 AM Dan Halperin  wrote:
>
>> Looks like it was a good talk! Why is it Google Confidential &
>> Proprietary, though?
>>
>> Dan
>>
>> On Thu, Mar 8, 2018 at 11:49 AM, Eugene Kirpichov 
>> wrote:
>>
>>> Hey all,
>>>
>>> The slides for my yesterday's talk at Strata San Jose
>>> https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63696
>>> have been posted on the talk page. They may be of interest both to users
>>> and IO authors.
>>>
>>> Thanks.
>>>
>>
>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Raghu Angadi
Terrific! Thanks Eugene. Just the slides themselves are so good, can't wait
for the video.
Do you know when the video might be available?


On Thu, Mar 8, 2018 at 12:16 PM Eugene Kirpichov 
wrote:

> Oops that's just the template I used. Thanks for noticing, will regenerate
> the PDF and reupload when I get to it.
>
> On Thu, Mar 8, 2018, 11:59 AM Dan Halperin  wrote:
>
>> Looks like it was a good talk! Why is it Google Confidential &
>> Proprietary, though?
>>
>> Dan
>>
>> On Thu, Mar 8, 2018 at 11:49 AM, Eugene Kirpichov 
>> wrote:
>>
>>> Hey all,
>>>
>>> The slides for my yesterday's talk at Strata San Jose
>>> https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63696
>>>  have
>>> been posted on the talk page. They may be of interest both to users and IO
>>> authors.
>>>
>>> Thanks.
>>>
>>
>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Kenneth Knowles
Love it. Great flashy title, too :-)

On Thu, Mar 8, 2018 at 12:16 PM Eugene Kirpichov 
wrote:

> Oops that's just the template I used. Thanks for noticing, will regenerate
> the PDF and reupload when I get to it.
>
> On Thu, Mar 8, 2018, 11:59 AM Dan Halperin  wrote:
>
>> Looks like it was a good talk! Why is it Google Confidential &
>> Proprietary, though?
>>
>> Dan
>>
>> On Thu, Mar 8, 2018 at 11:49 AM, Eugene Kirpichov 
>> wrote:
>>
>>> Hey all,
>>>
>>> The slides for my yesterday's talk at Strata San Jose
>>> https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63696
>>>  have
>>> been posted on the talk page. They may be of interest both to users and IO
>>> authors.
>>>
>>> Thanks.
>>>
>>
>>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Eugene Kirpichov
Oops that's just the template I used. Thanks for noticing, will regenerate
the PDF and reupload when I get to it.

On Thu, Mar 8, 2018, 11:59 AM Dan Halperin  wrote:

> Looks like it was a good talk! Why is it Google Confidential &
> Proprietary, though?
>
> Dan
>
> On Thu, Mar 8, 2018 at 11:49 AM, Eugene Kirpichov 
> wrote:
>
>> Hey all,
>>
>> The slides for my yesterday's talk at Strata San Jose
>> https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63696
>>  have
>> been posted on the talk page. They may be of interest both to users and IO
>> authors.
>>
>> Thanks.
>>
>
>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Lukasz Cwik
I really like slide 19:
Author: "I made a bigdata programming model"
Reader: "Cool, how does data get in and out?"
Author: "Brb"

On Thu, Mar 8, 2018 at 11:49 AM, Eugene Kirpichov 
wrote:

> Hey all,
>
> The slides for my yesterday's talk at Strata San Jose
> https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63696
> have been posted on the talk page. They may be of interest both to users and
> IO authors.
>
> Thanks.
>


Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Dan Halperin
Looks like it was a good talk! Why is it Google Confidential & Proprietary,
though?

Dan

On Thu, Mar 8, 2018 at 11:49 AM, Eugene Kirpichov 
wrote:

> Hey all,
>
> The slides for my yesterday's talk at Strata San Jose
> https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63696
> have been posted on the talk page. They may be of interest both to users and
> IO authors.
>
> Thanks.
>


"Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Eugene Kirpichov
Hey all,

The slides for my talk yesterday at Strata San Jose
https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63696
have been posted on the talk page. They may be of interest to both users and IO
authors.

Thanks.