Re: Bootstrapping multiple state within same operator

2023-03-24 Thread David Artiga
FFR] = null
override def snapshotState(context: FunctionSnapshotContext): Unit = {
}
override def initializeState(context: FunctionInitializationContext): Unit = {
  val fFRStateDescriptor = new ListStateDescriptor[FFR](

RE: Bootstrapping multiple state within same operator

2023-03-24 Thread Schwalbe Matthias
Not familiar with the implementation but thinking some options: - composable transformations - underlying MultiMap - ... On Wed, Mar 22, 2023

Re: Bootstrapping multiple state within same operator

2023-03-22 Thread David Artiga
I'm not familiar with the implementation, but here are some options to consider:
- composable transformations
- underlying MultiMap
- ...

On Wed, Mar 22, 2023 at 10:24 AM Hang Ruan wrote:
> Hi, David,
> I also read the code about the `SavepointWriter#withOperator`. The transformations are stored in a `Map` wh

Re: Bootstrapping multiple state within same operator

2023-03-22 Thread Hang Ruan
Hi, David, I also read the code of `SavepointWriter#withOperator`. The transformations are stored in a `Map` whose key is the `OperatorID`. I couldn't come up with a way to register multiple transformations for one operator with the provided API. Maybe we need a new type of `XXXStateBoots
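
To illustrate the limitation described here, the following is a minimal sketch in plain Java. It is not Flink's code: the class, method, and string names are all hypothetical. It only shows why a `Map` keyed by an operator identifier holds a single transformation per operator, and how a multimap (one of the options floated earlier in the thread) would keep several.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a transformation registry; NOT Flink's SavepointWriter.
class TransformationRegistrySketch {
    // One slot per operator id: a second registration replaces the first.
    private final Map<String, String> single = new HashMap<>();
    // Multimap variant: every registration for an operator id is kept, in order.
    private final Map<String, List<String>> multi = new HashMap<>();

    void register(String operatorId, String transformation) {
        single.put(operatorId, transformation);
        multi.computeIfAbsent(operatorId, k -> new ArrayList<>()).add(transformation);
    }

    String singleLookup(String operatorId) {
        return single.get(operatorId);
    }

    List<String> multiLookup(String operatorId) {
        return multi.getOrDefault(operatorId, List.of());
    }

    public static void main(String[] args) {
        TransformationRegistrySketch registry = new TransformationRegistrySketch();
        registry.register("my-operator", "bootstrap-list-state");
        registry.register("my-operator", "bootstrap-keyed-state");
        // The plain map only remembers the last registration.
        System.out.println(registry.singleLookup("my-operator"));
        // The multimap remembers both.
        System.out.println(registry.multiLookup("my-operator"));
    }
}
```

Whether a multimap would fit the real implementation depends on how the registered transformations are later merged into one operator state, which this sketch does not model.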

Re: Re: Bootstrapping the state

2018-07-23 Thread Fabian Hueske
Hi Henkka, You might want to consider implementing a dedicated job for state bootstrapping that uses the same operator UUID and state names. That might be easier than integrating the logic into your regular job. I think you have to use the monitoring file source because AFAIK it won't be possible

Re: Re: Bootstrapping the state

2018-07-22 Thread Henri Heiskanen
Hi, By state bootstrapping I mean loading the state with initial data before starting the actual job. For example, in our case I would like to load information such as the registration dates of our users (>5 years of data) so that I can enrich our event data in streaming (5 days retention). Before watc

Re: Re: Bootstrapping the state

2018-07-20 Thread Vino yang
Hi Henkka, If you want to customize the DataStream text source for your purpose, you can use a read counter: if the counter's value does not change within an interval, you can guess that all the data has been read. It's just an idea; you can choose another solution. About creating a savepoint automatically on jo
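
The read-counter idea could be sketched roughly as below, in plain Java with no Flink dependency. The class `ReadIdleDetector` and its parameters are hypothetical; timestamps are passed in explicitly so the logic is easy to follow and to test.

```java
// Hypothetical helper: reports "probably finished" once a read counter stops moving.
class ReadIdleDetector {
    private final long idleIntervalMillis;
    private long lastCount = -1;      // -1 means no count has been observed yet
    private long lastChangeMillis;

    ReadIdleDetector(long idleIntervalMillis, long startMillis) {
        this.idleIntervalMillis = idleIntervalMillis;
        this.lastChangeMillis = startMillis;
    }

    // Returns true if the count has been unchanged for at least idleIntervalMillis.
    boolean observe(long count, long nowMillis) {
        if (count != lastCount) {
            lastCount = count;
            lastChangeMillis = nowMillis;
            return false;
        }
        return nowMillis - lastChangeMillis >= idleIntervalMillis;
    }

    public static void main(String[] args) {
        ReadIdleDetector detector = new ReadIdleDetector(1000, 0);
        System.out.println(detector.observe(10, 0));    // false: first observation
        System.out.println(detector.observe(20, 900));  // false: count changed
        System.out.println(detector.observe(20, 1500)); // false: idle for only 600 ms
        System.out.println(detector.observe(20, 1900)); // true: idle for 1000 ms
    }
}
```

This is only a heuristic, as the message itself says: a source that stalls for longer than the interval looks the same as one that has finished.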

Re: Bootstrapping the state

2018-07-20 Thread Henri Heiskanen
Hi, Thanks. Just to clarify, where would you then invoke the savepoint creation? I basically need to know when all data is read, create a savepoint and then exit. I think I could just as well use PROCESS_CONTINUOUSLY, monitor the stream from outside, and then use the REST API to cancel with a savepoint. A
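
As a rough sketch of the "cancel with savepoint over REST" step, the Java snippet below only builds the HTTP request and does not send it. It assumes the `POST /jobs/:jobid/savepoints` endpoint with the `target-directory` and `cancel-job` fields, which should be checked against the REST API documentation of the Flink version in use. The base URL, job id, and savepoint directory are placeholders.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that builds (but does not send) a savepoint-and-cancel request.
class CancelWithSavepointRequest {
    // Builds the JSON body by hand; no escaping is done, so the directory
    // must not contain quotes or backslashes.
    static String body(String targetDirectory) {
        return "{\"target-directory\": \"" + targetDirectory + "\", \"cancel-job\": true}";
    }

    static HttpRequest build(String baseUrl, String jobId, String targetDirectory) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/jobs/" + jobId + "/savepoints"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(targetDirectory)))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        HttpRequest request = build("http://localhost:8081", "my-job-id", "/tmp/savepoints");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body("/tmp/savepoints"));
    }
}
```

Sending the request (for example with `java.net.http.HttpClient`) and polling for completion of the savepoint are left out, as is the decision of when the external monitor considers the input fully read.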

Re: Bootstrapping the state

2018-07-19 Thread vino yang
Hi Henkka, The behavior of the text file source is as expected. Flink will not keep your source task thread alive once it exits from its invoke method. That means you have to keep your source task alive yourself. To implement this, you should customize a text file source (implement the SourceFunction interface)

Re: Bootstrapping

2018-01-26 Thread Aljoscha Krettek
Hi, I see this coming up more and more often these days. For now, the solution of doing a savepoint and switching sources should work but I've had it in my head for a while now to add functionality for bootstrapping inputs in the API. An operator would read from the bootstrap stream (which is f

Re: Bootstrapping

2018-01-25 Thread Chen Qin
Hi Gregory, I had a similar issue when dealing with historical data. We chose Lambda and worked out a use-case-specific hand-off protocol. Unless the storage side can support replaying logs within a time range, streaming application authors still need to do extra work to implement a batching layer