> …FFR] = null
>
> override def snapshotState(context: FunctionSnapshotContext): Unit = {
> }
>
> override def initializeState(context: FunctionInitializationContext): Unit = {
>   val fFRStateDescriptor = new ListStateDescriptor[FFR](…
user@flink.apache.org
Subject: Re: Bootstrapping multiple state within same operator
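(For readability, here is a hedged reconstruction of the truncated snippet
above: a CheckpointedFunction registering operator list state. The FFR
payload, the "ffr-state" name, and the class name are guesses from the
fragment, not the original code.)

import org.apache.flink.api.common.state.{ListState, ListStateDescriptor}
import org.apache.flink.runtime.state.{FunctionInitializationContext, FunctionSnapshotContext}
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction

case class FFR(userId: String, value: Long) // assumed payload

class FFRFunction extends CheckpointedFunction {
  @transient private var ffrState: ListState[FFR] = null

  override def snapshotState(context: FunctionSnapshotContext): Unit = {
    // ListState is snapshotted by the state backend; nothing extra to do here.
  }

  override def initializeState(context: FunctionInitializationContext): Unit = {
    val fFRStateDescriptor = new ListStateDescriptor[FFR]("ffr-state", classOf[FFR])
    ffrState = context.getOperatorStateStore.getListState(fFRStateDescriptor)
  }
}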
Not familiar with the implementation but thinking some options:
- composable transformations
- underlying MultiMap
- ...
On Wed, Mar 22, 2023 at 10:24 AM Hang Ruan wrote:
Hi, David,
I also read the code of `SavepointWriter#withOperator`. The
transformations are stored in a `Map` whose key is `OperatorID`, so I
can't come up with a way to register multiple transformations for one
operator with the provided API.
Maybe we need a new type of `XXXStateBoots…
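(A hedged sketch of the usual workaround: since `withOperator` stores one
transformation per `OperatorID`, write every state for that operator from a
single `KeyedStateBootstrapFunction`. API shapes are from the
DataStream-based State Processor API around Flink 1.16; the FFR schema,
state names, and uid below are assumptions, not anything from this thread.)

import org.apache.flink.api.common.state.{ListState, ListStateDescriptor, ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.runtime.state.hashmap.HashMapStateBackend
import org.apache.flink.state.api.{OperatorTransformation, SavepointWriter}
import org.apache.flink.state.api.functions.KeyedStateBootstrapFunction
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment

case class FFR(userId: String, registeredAt: Long) // assumed schema

// One bootstrap function initializes both states, sidestepping the
// one-transformation-per-OperatorID limit of SavepointWriter#withOperator.
class MultiStateBootstrap extends KeyedStateBootstrapFunction[String, FFR] {
  @transient private var ffrState: ListState[FFR] = _
  @transient private var regDate: ValueState[java.lang.Long] = _

  override def open(parameters: Configuration): Unit = {
    ffrState = getRuntimeContext.getListState(
      new ListStateDescriptor[FFR]("ffr-state", classOf[FFR]))
    regDate = getRuntimeContext.getState(
      new ValueStateDescriptor[java.lang.Long]("registration-date", classOf[java.lang.Long]))
  }

  override def processElement(value: FFR,
      ctx: KeyedStateBootstrapFunction[String, FFR]#Context): Unit = {
    ffrState.add(value)
    regDate.update(value.registeredAt)
  }
}

object WriteBootstrapSavepoint {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val records = java.util.Arrays.asList(FFR("u1", 1577836800000L))

    val transformation = OperatorTransformation
      .bootstrapWith(env.fromCollection(records))
      .keyBy((r: FFR) => r.userId)
      .transform(new MultiStateBootstrap)

    SavepointWriter
      .newSavepoint(env, new HashMapStateBackend(), 128)
      .withOperator("enrich-users", transformation) // one transformation per uid
      .write("file:///tmp/bootstrap-savepoint")
    env.execute("write-bootstrap-savepoint")
  }
}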
Hi Henkka,
You might want to consider implementing a dedicated job for state
bootstrapping that uses the same operator UIDs and state names. That might
be easier than integrating the logic into your regular job.
I think you have to use the monitoring file source because AFAIK it won't
be possible…
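(A minimal sketch of that dedicated-job idea, under stated assumptions: a
hypothetical EnrichmentFunction, a CSV input path, and the uid
"enrich-users". The only real contract is that the uid and the state
descriptors match between the bootstrap job and the production job.)

import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.scala._

case class User(userId: String, registeredAt: Long) // assumed schema

// Registers the state the production job will later read; the descriptor
// name and the operator uid must match across both jobs so that a savepoint
// taken here is restorable by the real job.
class EnrichmentFunction extends RichMapFunction[User, User] {
  @transient private var regDate: ValueState[java.lang.Long] = _
  override def open(parameters: Configuration): Unit =
    regDate = getRuntimeContext.getState(
      new ValueStateDescriptor[java.lang.Long]("registration-date", classOf[java.lang.Long]))
  override def map(u: User): User = { regDate.update(u.registeredAt); u }
}

object BootstrapJob {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.readTextFile("file:///data/users.csv")              // assumed input
      .map { line => val f = line.split(','); User(f(0), f(1).toLong) }
      .keyBy(_.userId)
      .map(new EnrichmentFunction)
      .uid("enrich-users") // same uid in the production job
    env.execute("state-bootstrap")
    // Once all input is consumed, take a savepoint and start the real job from it.
  }
}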
Hi,
By state bootstrapping I mean loading the state with initial data before
starting the actual job. For example, in our case I would like to load
information like the registration date of our users (>5 years of data) so
that I can enrich our event data in streaming (5 days retention).
Before watc…
Hi Henkka, if you want to customize the DataStream text source for your
purpose, you can use a read counter: if the value of the counter does not
change within an interval, you can assume all the data has been read. Just
an idea; you can choose another solution. About creating a savepoint
automatically on jo…
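(A rough sketch of that read-counter idea, with all names assumed: count
emitted records in a metric and keep the source alive after the file is
exhausted, so an outside watcher can notice the counter has stopped moving
and then trigger cancel-with-savepoint.)

import org.apache.flink.configuration.Configuration
import org.apache.flink.metrics.Counter
import org.apache.flink.streaming.api.functions.source.{RichSourceFunction, SourceFunction}

class CountingTextSource(path: String) extends RichSourceFunction[String] {
  @volatile private var running = true
  @transient private var recordsRead: Counter = _

  override def open(parameters: Configuration): Unit =
    recordsRead = getRuntimeContext.getMetricGroup.counter("recordsRead")

  override def run(ctx: SourceFunction.SourceContext[String]): Unit = {
    val source = scala.io.Source.fromFile(path)
    try source.getLines().foreach { line =>
      ctx.getCheckpointLock.synchronized(ctx.collect(line))
      recordsRead.inc()
    } finally source.close()
    while (running) Thread.sleep(1000) // idle so the job survives until the savepoint
  }

  override def cancel(): Unit = running = false
}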
Hi,
Thanks. Just to clarify, where would you then invoke the savepoint
creation? I basically need to know when all data is read, create a
savepoint, and then exit. I think I could just as well use
PROCESS_CONTINUOUSLY, monitor the stream outside, and then use the REST
API to cancel with a savepoint.
A…
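(For reference, and hedging on the exact Flink version: the REST call for
that last step is POST /jobs/:jobid/savepoints with a JSON body like
{"target-directory": "...", "cancel-job": true}; newer releases also offer
POST /jobs/:jobid/stop for stop-with-savepoint.)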
Hi Henkka,
The behavior of the text file source meets expectations. Flink will not
keep your source task thread alive once it exits from its invoke method.
That means you should keep your source task alive. So to implement this,
you should customize a text file source (implement the SourceFunction
interface)…
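(The bare-bones version of that keep-alive pattern, similar to the counting
source sketched earlier minus the metric; names are assumed. The point is
to block instead of returning from run(), so the task stays RUNNING and a
savepoint is still possible.)

import org.apache.flink.streaming.api.functions.source.SourceFunction

class KeepAliveTextSource(path: String) extends SourceFunction[String] {
  @volatile private var running = true

  override def run(ctx: SourceFunction.SourceContext[String]): Unit = {
    val src = scala.io.Source.fromFile(path)
    try src.getLines().foreach(ctx.collect) finally src.close()
    while (running) Thread.sleep(1000) // returning here would finish the source task
  }

  override def cancel(): Unit = running = false
}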
Hi,
I see this coming up more and more often these days. For now, the solution
of doing a savepoint and switching sources should work, but I've had it in
my head for a while now to add functionality for bootstrapping inputs to
the API. An operator would read from the bootstrap stream (which is f…
Hi Gregory,
I have a similar issue when dealing with historical data. We chose Lambda
and worked out a use-case-specific hand-off protocol.
Unless the storage side can support replaying logs within a time range,
streaming application authors still need to do extra work to implement the
batching layer…