Re: Flink rolling upgrade support

2017-07-19 Thread Moiz Jinia
No. This is the thread that answers my question -

http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Does-FlinkKafkaConsumer010-care-about-consumer-group-td14323.html

Moiz

—
sent from phone

On 19-Jul-2017, at 10:04 PM, Ted Yu <yuzhih...@gmail.com> wrote:

This was the other thread, right?

http://search-hadoop.com/m/Flink/VkLeQ0dXIf1SkHpY?subj=Re+Does+job+restart+resume+from+last+known+internal+checkpoint+




Re: Flink rolling upgrade support

2017-07-19 Thread Moiz Jinia
Yup! Thanks.

Moiz

—
sent from phone

On 19-Jul-2017, at 9:21 PM, Aljoscha Krettek [via Apache Flink User Mailing 
List archive.] <ml+s2336050n14337...@n4.nabble.com> wrote:

This was now answered in your other thread, right?

Best,
Aljoscha


Does FlinkKafkaConsumer010 care about consumer group?

2017-07-19 Thread Moiz Jinia
Below is a plan for a downtime-free upgrade of a Flink job. The downstream
consumer of the Flink job is duplicate-proof.

Scenario 1 -
1. Start Flink job A with consumer group G1 (a 12-slot job).
2. While job A is running, take a savepoint AS.
3. Start the newer version of the job, A', from savepoint AS with consumer group
*G1* (again a 12-slot job).
4. Stop job A.

Scenario 2 -
1. Start Flink job A with consumer group G1 (a 12-slot job).
2. While job A is running, take a savepoint AS.
3. Start the newer version of the job, A', from savepoint AS with consumer group
*G2* (again a 12-slot job).
4. Stop job A.

Does it matter what consumer group job A' uses? The desired behavior is that
during the window when both A and A' are running, all messages should go to
both jobs. (And of course I want job A' to start consuming from the offsets in
the savepoint, not from the earliest.)
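
[Editor's note: for reference, a minimal sketch of the source setup these scenarios assume, using Flink 1.2/1.3-era APIs; the topic name, broker address, and uid are made up. The point of the thread linked above is that Flink's Kafka consumer assigns partitions to its subtasks itself rather than joining Kafka's consumer-group rebalancing, so both jobs see all messages in either scenario, and a job restored from a savepoint takes its start offsets from the savepoint state, not from the group.]

import java.util.Properties;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class UpgradeJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000); // checkpoint every minute so offsets land in state

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092"); // hypothetical broker
        // group.id is used for committing offsets back to Kafka and for the
        // start-from-group-offsets mode; it does NOT shard messages between
        // two running Flink jobs, because Flink assigns partitions itself.
        props.setProperty("group.id", "G1");

        env.addSource(new FlinkKafkaConsumer010<>("events", new SimpleStringSchema(), props))
            // A stable uid lets the offsets stored in savepoint AS be mapped
            // back onto this source when job A' is restored from it.
            .uid("kafka-source")
            .print();

        env.execute("job-a");
    }
}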








Re: Flink rolling upgrade support

2017-07-18 Thread Moiz Jinia
Aljoscha Krettek wrote
> Hi,
> zero-downtime updates are currently not supported. What is supported in
> Flink right now is a savepoint-shutdown-restore cycle. With this, you first
> draw a savepoint (which is essentially a checkpoint with some metadata),
> then you cancel your job, then you do whatever you need to do (update
> machines, update Flink, update the job), and restore from the savepoint.
> 
> A possible solution for a zero-downtime update would be to take a savepoint,
> then start a second Flink job from that savepoint, then shut down the first
> job. With this, your data sinks would need to be able to handle being
> written to by two jobs at the same time, i.e. writes should probably be
> idempotent.
> 
> This is the link to the savepoint doc:
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/savepoints.html
> 
> Does that help?
> 
> Cheers,
> Aljoscha
> 
> On Fri, 16 Dec 2016 at 18:16 Andrew Hoblitzell <ahoblitzell@...> wrote:
> 
>> Hi. Does Apache Flink currently have support for zero downtime or the
>> ability to do rolling upgrades?
>>
>> If so, what are concerns to watch for and what best practices might
>> exist? Are there version management and data inconsistency issues to
>> watch for?
>>

When a second job instance is started in parallel from a savepoint, my
incoming Kafka messages would get sharded between the two running instances of
the job (since they would both belong to the same consumer group). So when I
stop the older version of the job, I stand to lose data (in spite of the fact
that my downstream consumer is idempotent).

If I use a different consumer group for the new job version (and start it
from a savepoint), will the savepoint ensure that the second job instance
starts from the correct offset? Do I need to do anything extra to make this
work? (For example, set the uid on the source of the job.)

Thanks!
Moiz
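
[Editor's note: for reference, a minimal sketch of the kind of idempotent sink Aljoscha's two-job handover relies on. The Event type and the in-memory map standing in for a real store (Cassandra, HBase, a JDBC upsert, ...) are made up; the point is that writes keyed by a deterministic id can be replayed by both job versions without changing the final result.]

import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.flink.streaming.api.functions.sink.SinkFunction;

/** Hypothetical record type; the deterministic id is what makes writes idempotent. */
class Event implements Serializable {
    final String id;
    final String payload;
    Event(String id, String payload) { this.id = id; this.payload = payload; }
}

/**
 * Every write is an upsert keyed by the event id, so duplicate deliveries
 * during the handover window (old and new job both writing) leave the
 * store in the same final state.
 */
public class IdempotentUpsertSink implements SinkFunction<Event> {
    private static final Map<String, String> STORE = new ConcurrentHashMap<>();

    @Override
    public void invoke(Event event) throws Exception {
        STORE.put(event.id, event.payload); // upsert: replays overwrite with identical data
    }
}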





Re: Long running time based Patterns

2017-05-04 Thread Moiz Jinia
It'll definitely have a where clause - I just forgot to include it in the example
and meant to focus on the within clause.

I'm on 1.3 - expect it'll be fixed by the time the stable release is out?

Thanks!

Moiz

—
sent from phone

On 04-May-2017, at 8:12 PM, Kostas Kloudas  wrote:

Hi Moiz,

Are you on Flink 1.2 or 1.3?
In Flink 1.2 (the latest stable release) there are no known issues, so this will
work correctly.
Keep in mind that without any conditions (where-clauses), you will simply get
all possible 2-tuples of incoming elements, which could also be done with a
simple process function, I would say.

In Flink 1.3 (unreleased) there is this issue: 
https://issues.apache.org/jira/browse/FLINK-6445

Thanks,
Kostas

> On May 4, 2017, at 1:45 PM, Moiz S Jinia  wrote:
> 
> Does Flink (with a persistent state backend such as RocksDB) work well with
> long-running patterns of this type? (running into days)
> 
> Pattern.begin("start").followedBy("end").within(Time.days(3))
> 
> Are there any gotchas here or things to watch out for?
> 
> Thanks,
> Moiz
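
[Editor's note: for reference, a minimal sketch of such a pattern with the where-clauses added, assuming the Flink 1.3 CEP API (SimpleCondition conditions, and a select function that receives a Map<String, List<Event>>). The Event POJO and the "order"/"payment" type values are made up.]

import java.util.List;
import java.util.Map;

import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.time.Time;

public class LongRunningPattern {

    /** Hypothetical event POJO; public fields make it a valid Flink POJO type. */
    public static class Event {
        public String key;
        public String type;
    }

    public static DataStream<String> matches(DataStream<Event> input) {
        // "start" followedBy "end" within 3 days, with conditions so the
        // pattern matches specific event types rather than all 2-tuples.
        Pattern<Event, ?> pattern = Pattern.<Event>begin("start")
                .where(new SimpleCondition<Event>() {
                    @Override
                    public boolean filter(Event e) {
                        return "order".equals(e.type);
                    }
                })
                .followedBy("end")
                .where(new SimpleCondition<Event>() {
                    @Override
                    public boolean filter(Event e) {
                        return "payment".equals(e.type);
                    }
                })
                .within(Time.days(3));

        // Keying by the POJO field keeps each key's partial matches in keyed
        // state (RocksDB, if configured as the state backend), which is what
        // makes day-long within-windows feasible.
        return CEP.pattern(input.keyBy("key"), pattern)
                .select(new PatternSelectFunction<Event, String>() {
                    @Override
                    public String select(Map<String, List<Event>> match) {
                        return match.get("start").get(0).key + " matched within 3 days";
                    }
                });
    }
}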