Re: Flink rolling upgrade support
No, this is the thread that answers my question:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Does-FlinkKafkaConsumer010-care-about-consumer-group-td14323.html

Moiz
(sent from phone)

On 19-Jul-2017, at 10:04 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> This was the other thread, right?
> http://search-hadoop.com/m/Flink/VkLeQ0dXIf1SkHpY?subj=Re+Does+job+restart+resume+from+last+known+internal+checkpoint+
Re: Flink rolling upgrade support
Yup! Thanks.

Moiz
(sent from phone)

On 19-Jul-2017, at 9:21 PM, Aljoscha Krettek wrote:
> This was now answered in your other thread, right?
>
> Best,
> Aljoscha
Does FlinkKafkaConsumer010 care about consumer group?
Below is a plan for a downtime-free upgrade of a Flink job. The downstream consumer of the Flink job is duplicate-proof.

Scenario 1:
1. Start Flink job A with consumer group G1 (a 12-slot job).
2. While job A is running, take a savepoint AS.
3. Start the newer version of the job, A', from savepoint AS with consumer group *G1* (again a 12-slot job).
4. Stop job A.

Scenario 2:
1. Start Flink job A with consumer group G1 (a 12-slot job).
2. While job A is running, take a savepoint AS.
3. Start the newer version of the job, A', from savepoint AS with consumer group *G2* (again a 12-slot job).
4. Stop job A.

Does it matter which consumer group job A' uses? The desired behavior is that during the window when both A and A' are running, all messages go to both jobs. (And of course job A' should start consuming from the offsets in the savepoint, not from the earliest offset.)
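A quick way to reason about the two scenarios is to model standard Kafka consumer-group delivery, which the scenarios above assume: within one group, each partition is delivered to exactly one consumer, while separate groups each receive every partition. Below is a minimal Python sketch of that rule (a conceptual model only, not the Kafka client or Flink connector API; all names are illustrative):

```python
# Conceptual model of Kafka consumer-group delivery (not the real client API).
# Rule: partitions are split among the consumers within a group,
# while every group independently sees all partitions.

def assign_partitions(partitions, groups):
    """Map each (group, consumer) pair to the partitions it receives.

    groups: dict of group_name -> list of consumer names in that group.
    """
    assignment = {}
    for group, consumers in groups.items():
        # Within a group, divide partitions round-robin among its consumers.
        for i, partition in enumerate(partitions):
            consumer = consumers[i % len(consumers)]
            assignment.setdefault((group, consumer), []).append(partition)
    return assignment

partitions = list(range(4))  # a topic with 4 partitions

# Scenario 1: jobs A and A' share group G1 -> partitions are sharded between them.
same_group = assign_partitions(partitions, {"G1": ["A", "A_prime"]})

# Scenario 2: A uses G1, A' uses G2 -> each job receives all partitions.
diff_groups = assign_partitions(partitions, {"G1": ["A"], "G2": ["A_prime"]})
```

Under this model, only Scenario 2 gives the desired "all messages to both jobs" behavior during the overlap window.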
Re: Flink rolling upgrade support
Aljoscha Krettek wrote:
> Hi,
> Zero-downtime updates are currently not supported. What is supported in
> Flink right now is a savepoint-shutdown-restore cycle. With this, you first
> draw a savepoint (which is essentially a checkpoint with some metadata),
> then you cancel your job, then you do whatever you need to do (update
> machines, update Flink, update the job), and restore from the savepoint.
>
> A possible solution for a zero-downtime update would be to take a savepoint,
> then start a second Flink job from that savepoint, then shut down the first
> job. With this, your data sinks would need to be able to handle being
> written to by 2 jobs at the same time, i.e. writes should probably be
> idempotent.
>
> This is the link to the savepoint doc:
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/savepoints.html
>
> Does that help?
>
> Cheers,
> Aljoscha
>
> On Fri, 16 Dec 2016 at 18:16 Andrew Hoblitzell wrote:
>
>> Hi. Does Apache Flink currently have support for zero downtime or the
>> ability to do rolling upgrades?
>>
>> If so, what are concerns to watch for and what best practices might
>> exist? Are there version management and data inconsistency issues to
>> watch for?

When a second job instance is started in parallel from a savepoint, my
incoming Kafka messages would get sharded between the 2 running instances of
the job (since they would both belong to the same consumer group). So when I
stop the older version of the job, I stand to lose data (in spite of the fact
that my downstream consumer is idempotent).

If I used a different consumer group for the new job version (and started it
from a savepoint), would the savepoint ensure that the second job instance
starts from the correct offsets? Do I need to do anything extra to make this
work (for example, set the uid on the source of the job)?

Thanks!
Moiz
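One way to picture the savepoint question: Flink's Kafka sources checkpoint their partition offsets as operator state, so on restore the read positions come from the savepoint itself rather than from whatever the consumer group last committed to Kafka. A minimal Python sketch of that restore rule (a conceptual model; `Source`, `savepoint`, and `restore` here are illustrative names, not Flink APIs):

```python
# Conceptual model: a source's read positions are part of its checkpointed state.
# On restore, positions come from the savepoint, not from the consumer group's
# committed offsets, so the restored job's group id does not affect where it resumes.

class Source:
    def __init__(self, group_id):
        self.group_id = group_id
        self.offsets = {}  # partition -> next offset to read

    def consume(self, partition, count):
        start = self.offsets.get(partition, 0)
        self.offsets[partition] = start + count

    def savepoint(self):
        # A savepoint snapshots the operator state (here, just the offsets).
        return dict(self.offsets)

    @classmethod
    def restore(cls, group_id, savepoint):
        # The restored job may use any group id; offsets still come from state.
        src = cls(group_id)
        src.offsets = dict(savepoint)
        return src

old_job = Source("G1")
old_job.consume(partition=0, count=100)
sp = old_job.savepoint()

# New job version, different consumer group, restored from the savepoint:
new_job = Source.restore("G2", sp)
```

In this model the new job resumes exactly where the savepoint was taken, regardless of its group id.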
Re: Long running time based Patterns
It'll definitely have a where clause; I just forgot to include it in the example, which was only meant to focus on the within clause. I am on 1.3 - I expect it'll be fixed by the time the stable release is out?

Thanks!
Moiz
(sent from phone)

On 04-May-2017, at 8:12 PM, Kostas Kloudas wrote:
> Hi Moiz,
>
> Are you on Flink 1.2 or 1.3?
>
> In Flink 1.2 (latest stable) there are no known issues, so this will work
> correctly. Keep in mind that without any conditions (where clauses), you
> will only get all possible 2-tuples of incoming elements, which could also
> be done with a simple process function, I would say.
>
> In Flink 1.3 (unreleased) there is this issue:
> https://issues.apache.org/jira/browse/FLINK-6445
>
> Thanks,
> Kostas
>
>> On May 4, 2017, at 1:45 PM, Moiz S Jinia wrote:
>>
>> Does Flink (with a persistent state backend such as RocksDB) work well
>> with long-running patterns of this type (running into days)?
>>
>> Pattern.begin("start").followedBy("end").within(Time.days(3))
>>
>> Are there gotchas here or things to watch out for?
>>
>> Thanks,
>> Moiz
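The within clause in the pattern above bounds how long a partial match may wait for its next event. A tiny Python sketch of that time-window rule (a simplified model of the matching semantics, not the Flink CEP API; it handles only the two-event start/end case):

```python
# Simplified model of begin("start").followedBy("end").within(window):
# an "end" event completes a match only if it arrives within `window`
# seconds of its "start" event; expired partial matches are discarded.

THREE_DAYS = 3 * 24 * 3600  # seconds

def match_within(events, window=THREE_DAYS):
    """events: list of (timestamp_seconds, name). Returns (start_ts, end_ts) matches."""
    matches = []
    pending_starts = []
    for ts, name in sorted(events):
        # Drop partial matches whose window has expired.
        pending_starts = [s for s in pending_starts if ts - s <= window]
        if name == "start":
            pending_starts.append(ts)
        elif name == "end" and pending_starts:
            # followedBy: other events may sit between the start and the end.
            matches.append((pending_starts.pop(0), ts))
    return matches

events = [
    (0, "start"),
    (2 * 24 * 3600, "end"),    # 2 days after the start -> within the window
    (10 * 24 * 3600, "start"),
    (15 * 24 * 3600, "end"),   # 5 days after its start -> window expired
]
result = match_within(events)
```

The model also shows why long windows imply long-lived state: every pending start must be kept (e.g. in RocksDB) until it either matches or its window expires.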