Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-22 Thread Hequn Cheng
Hi Gyula, +1 for this feature. State bootstrapping and state analytics will be very helpful. On Thu, Aug 23, 2018 at 4:09 AM Gyula Fóra wrote: > Hi Shuyi, > > The tool allows you to convert a Flink DataSet containing the individual > state rows, (key, state) pairs, into a state for a streaming

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-22 Thread Gyula Fóra
Hi Shuyi, The tool allows you to convert a Flink DataSet containing the individual state rows, (key, state) pairs, into a state for a streaming operator. So if you want to bootstrap your state from data on HDFS, you would read the file in Flink, convert it to the required DataSet format, then
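A minimal sketch of the "convert it to the required format" step described above, written in plain Python rather than against the Flink DataSet API or the tool itself. The `key,value` record layout, the summing of values per key, and the name `to_key_state_pairs` are all assumptions made for illustration; none of them come from the proposal.

```python
from collections import defaultdict


def to_key_state_pairs(lines):
    """Fold raw 'key,value' records into one (key, state) pair per key.

    Hypothetical helper: the record layout and the additive state are
    assumptions for illustration, not part of the proposed tool.
    """
    state_by_key = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the input
        key, value = line.split(",", 1)
        state_by_key[key.strip()] += int(value)
    # One row per key, sorted so the result is deterministic.
    return sorted(state_by_key.items())


# Stand-in for lines read from a file (for example, one stored on HDFS).
records = ["user-1,3", "user-2,5", "user-1,4"]
pairs = to_key_state_pairs(records)
```

In an actual job the same shaping would presumably be done with DataSet transformations, so that each resulting row is one (key, state) pair for the streaming operator.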

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-22 Thread Shuyi Chen
+1 on the tooling. Also, you mentioned the state bootstrapping problem. Could you please elaborate on how we can leverage the tooling to solve state bootstrapping? I think this is a common problem in stream processing, and it would be great if the community could work on it. Thanks. Shuyi On Wed,

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-22 Thread Gyula Fóra
Thanks, I guess the first thing that would be a great help from anyone interested is to try it on some streaming state :) We have tested these tools at King to analyze, transform and perform some aggregations on our user-states. The major limitation is that it requires RocksDB

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-21 Thread Till Rohrmann
big +1 for this feature. A tool to get your state out of and into Flink will be tremendously helpful. On Mon, Aug 20, 2018 at 10:21 AM Aljoscha Krettek wrote: > +1 I'd like to have something like this in Flink a lot! > > > On 19. Aug 2018, at 11:57, Gyula Fóra wrote: > > > > Hi all! > > > >

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-20 Thread Aljoscha Krettek
+1 I'd like to have something like this in Flink a lot! > On 19. Aug 2018, at 11:57, Gyula Fóra wrote: > > Hi all! > > Thanks for the feedback and I'm happy there is some interest :) > Tomorrow I will start improving the proposal based on the feedback and will > get back to work. > > If you

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-19 Thread Gyula Fóra
Hi all! Thanks for the feedback and I'm happy there is some interest :) Tomorrow I will start improving the proposal based on the feedback and will get back to work. If you are interested in working together on this, please ping me and we can discuss some ideas/plans and how to share work. Cheers,

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-18 Thread Paris Carbone
+1 Might also be a good start to implement queryable stream state with snapshot isolation using that mechanism. Paris > On 17 Aug 2018, at 12:28, Gyula Fóra wrote: > > Hi All! > > I want to share with you a little project we have been working on at King > (with some help from some

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-17 Thread Jamie Grier
This is great, Gyula! A colleague here at Lyft has also done some work around bootstrapping DataStream programs and we've also talked a bit about doing this by running DataSet programs. On Fri, Aug 17, 2018 at 3:28 AM, Gyula Fóra wrote: > Hi All! > > I want to share with you a little project

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-17 Thread Gyula Fóra
Thanks for the feedback :) I agree that combining this with SQL would give an extremely nice layer to analyse the states. Our goal is to contribute this to Flink; I think this should live as part of the Flink project to make deeper integration possible in the long run. Of course a prerequisite

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-17 Thread Piotr Nowojski
Hi, A huge +1 from my side. I have found the lack of such a tool to be a big problem for the long-term maintainability of Flink jobs. In the long run, I would be delighted to see Flink SQL support for those things as well. Ad hoc analysis is one of the prime use cases of SQL. This tool would make

[Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-17 Thread Gyula Fóra
Hi All! I want to share with you a little project we have been working on at King (with some help from some dataArtisans folks). I think this would be a valuable addition to Flink and solve a bunch of outstanding production use-cases and headaches around state bootstrapping and state analytics.