Thank you. Your instructions helped me solve this. You can read about my solution at https://github.com/minmay/flink-patterns/tree/master/bootstrap-keyed-state-into-stream
> On Aug 10, 2020, at 4:07 AM, orionemail <orionem...@protonmail.com> wrote:
>
> I recently was in the same situation as Marco. The docs do explain what you
> need to do, but without experience with Flink it might still not be obvious.
>
> What I did initially:
>
> Set up the job to run in a 'write a savepoint' mode by implementing a command
> line switch I could use when running the job:
>
> flink run somejob.jar -d /some/path
>
> When run with this switch, the code ran *only* what was required to set up
> a version of the state and write it to a savepoint.
>
> This worked and I was on my way.
>
> However, I then decided to split this out into a new Flink jar with the
> sole purpose of creating a savepoint. This is a cleaner approach in my case
> and also removes dependencies (my state was loaded from DynamoDB) that were
> only required in this one instance.
>
> Rebuilding the state from this application is intended to be done only
> once, with checkpoints/savepoints as the main approach going forward.
>
> Just remember to give your operators the same ID/name in both jobs so that
> the savepoint is compatible.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Monday, 10 August 2020 07:27, Tzu-Li Tai <tzuli...@gmail.com> wrote:
>
>> Hi,
>>
>> For the NullPointerException, what seems to be happening is that you are
>> setting NULL values in your MapState, which is not allowed by the API.
>>
>> Otherwise, the code that you showed for bootstrapping state seems to be
>> fine.
>>
>>> I have yet to find a working example that shows how to do both
>>> (bootstrapping state and start a streaming application with that state)
>>
>> Not entirely sure what you mean here by "doing both".
>> The savepoint written using the State Processor API (what you are doing in
>> the bootstrap() method) is a savepoint that may be restored from as you
>> would with a typical Flink streaming job restore.
>> So, usually the bootstrapping part happens as a batch "offline" job, while
>> you keep your streaming job as a separate job. What are you trying to
>> achieve with having both written within the same job?
>>
>> Cheers,
>> Gordon
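
For anyone landing on this thread with the same NullPointerException: below is a minimal sketch of the guard that Gordon's note about NULL values in MapState suggests, namely dropping null entries before they are written to state. It is not the code from the repository linked above. It uses a plain java.util.Map as a stand-in for Flink's MapState so that it runs without Flink on the classpath, and the class and method names (NullSafeStateEntries, withoutNulls) are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class NullSafeStateEntries {

    /**
     * Returns a copy of {@code source} without entries whose key or value is null.
     * Assumption: in a real bootstrap function, the surviving entries would be
     * written to the MapState instead of being collected into a new map.
     */
    public static <K, V> Map<K, V> withoutNulls(Map<K, V> source) {
        Map<K, V> cleaned = new LinkedHashMap<>();
        for (Map.Entry<K, V> entry : source.entrySet()) {
            // Skip anything that would put a null into state.
            if (entry.getKey() != null && entry.getValue() != null) {
                cleaned.put(entry.getKey(), entry.getValue());
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        // Hypothetical data as it might arrive from an external store.
        Map<String, Long> loaded = new HashMap<>();
        loaded.put("a", 1L);
        loaded.put("b", null); // this entry is filtered out

        System.out.println(withoutNulls(loaded)); // prints {a=1}
    }
}
```

Whether to drop null entries or replace them with a default value depends on what the streaming job expects to find in state after the restore.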