event time and late events - documentation

2018-07-16 Thread Sofer, Tovi
Hi group, Can someone please elaborate on the comment at the end of section "Debugging Windows & Event Time"? Didn't understand it meaning. https://ci.apache.org/projects/flink/flink-docs-master/monitoring/debugging_event_time.html "Handling Event Time Stragglers Approach 1: Watermark stays late

FW: high availability with automated disaster recovery using zookeeper

2018-07-16 Thread Sofer, Tovi
Thank you Scott, Looks like a very elegant solution. How did you manage high availability in single data center? Thanks, Tovi From: Scott Kidder Sent: יום ו 13 יולי 2018 01:13 To: Sofer, Tovi [ICG-IT] Cc: user@flink.apache.org Subject: Re: high availability with automated disaster recovery

RE: high availability with automated disaster recovery using zookeeper

2018-07-10 Thread Sofer, Tovi
/projects/flink/flink-docs-release-1.5/ops/deployment/mesos.html [Not sure this is accurate, since it seems to contradict the image in link below https://mesosphere.com/blog/apache-flink-on-dcos-and-apache-mesos ] From: Sofer, Tovi [ICG-IT] Sent: יום ג 10 יולי 2018 20:04 To: 'Till Rohrmann' ; user Cc

RE: high availability with automated disaster recovery using zookeeper

2018-07-10 Thread Sofer, Tovi
-for-disaster-recovery/ ) Is this supported by Flink cluster on Mesos ? Thanks again Tovi From: Till Rohrmann Sent: יום ג 10 יולי 2018 10:11 To: Sofer, Tovi [ICG-IT] Cc: user Subject: Re: high availability with automated disaster recovery using zookeeper Hi Tovi, that is an interesting use case

high availability with automated disaster recovery using zookeeper

2018-07-09 Thread Sofer, Tovi
Hi all, We are now examining how to achieve high availability for Flink, and to support also automatic recovery in disaster scenario- when all DC goes down. We have DC1 which we usually want work to be done, and DC2 - which is more remote and we want work to go there only when DC1 is down. We

RE: kafka as recovery only source

2018-02-07 Thread Sofer, Tovi
Hi Fabian, Thank you for the suggestion. We will consider it. Would be glad to hear other ideas how to handle such requirement. Thanks again, Tovi From: Fabian Hueske [mailto:fhue...@gmail.com] Sent: יום ד 07 פברואר 2018 11:47 To: Sofer, Tovi [ICG-IT] <ts72...@imceu.eu.ssmb.com> Cc

kafka as recovery only source

2018-02-06 Thread Sofer, Tovi
Hi group, I wanted to get your suggestion on how to implement two requirements we have: * One is to read from external message queue (JMS) at very fast latency * Second is to support zero data loss, so that in case of restart and recovery, messages not checkpointed (and not

RE: Sync and Async checkpoint time

2018-01-31 Thread Sofer, Tovi
To: Sofer, Tovi [ICG-IT] <ts72...@imceu.eu.ssmb.com> Cc: user@flink.apache.org Subject: Re: Sync and Async checkpoint time Hi, this looks like the timer service is the culprit for this problem. Timers are currently not stored in the state backend, but in a separate on-heap data str

Sync and Async checkpoint time

2018-01-30 Thread Sofer, Tovi
Hi group, In our project we are using asynchronous FSStateBackend, and we are trying to move to distributed storage - currently S3. When using this storage we are experiencing issues of high backpressure and high latency, in comparison of local storage. We are trying to understand the reason,

RE: Two operators consuming from same stream

2018-01-04 Thread Sofer, Tovi
ect(pricesKeyedStream).flatMap(identityMapper); ds.flatMap(mapperA); ds.flatMap(mapperB); Regards, Timo Am 1/1/18 um 2:50 PM schrieb Sofer, Tovi : Hi group, We have the following graph below, on which we added metrics for latency calculation. We have two streams which are consumed by two o

Two operators consuming from same stream

2018-01-01 Thread Sofer, Tovi
Hi group, We have the following graph below, on which we added metrics for latency calculation. We have two streams which are consumed by two operators: * ordersStream and pricesStream - they are both consumed by two operators: CoMapperA and CoMapperB, each using connect. Initially

RE: slot group indication per operator

2017-12-11 Thread Sofer, Tovi
behavior. @Chesnay: Do you know if we can get this information? If not through the Web UI, maybe via REST? Do we have access to the full ExecutionGraph somewhere? Otherwise it might make sense to open an issue for this. Regards, Timo Am 12/5/17 um 4:25 PM schrieb Sofer, Tovi : Hi all, I am

RE: Testing CoFlatMap correctness

2017-12-10 Thread Sofer, Tovi
such scenario so that results are predictable, and that elements from main stream arrive after elements from control stream, or other way around. Thanks again, Tovi From: Kostas Kloudas [mailto:k.klou...@data-artisans.com] Sent: יום ה 07 דצמבר 2017 19:11 To: Sofer, Tovi [ICG-IT] <t

Testing CoFlatMap correctness

2017-12-07 Thread Sofer, Tovi
Hi group, What is the best practice for testing CoFlatMap operator correctness? We have two source functions, each emits data to stream, and a connect between them, and I want to make sure that when streamA element arrive before stream element, a certain behavior happens. How can I test this

slot group indication per operator

2017-12-05 Thread Sofer, Tovi
Hi all, I am trying to use the slot group feature, by having 'default' group and additional 'market' group. The purpose is to divide the resources equally between two sources and their following operators. I've set the slotGroup on the source of the market data. Can I assume that all following

RE: Negative values using latency marker

2017-11-05 Thread Sofer, Tovi
Hi Nico, Actually the run below is on my local machine, and both Kafka and flink run on it. Thanks and regards, Tovi -Original Message- From: Nico Kruber [mailto:n...@data-artisans.com] Sent: יום ו 03 נובמבר 2017 15:22 To: user@flink.apache.org Cc: Sofer, Tovi [ICG-IT] <t

Negative values using latency marker

2017-11-02 Thread Sofer, Tovi
Hi group, Can someone maybe elaborate how can latency gauge shown by latency marker be negative? 2017-11-02 18:54:56,842 INFO com.citi.artemis.flink.reporters.ArtemisReporter - [Flink-MetricRegistry-1] 127.0.0.1.taskmanager.f40ca57104c8642218e5b8979a380ee4.Flink Streaming Job.Sink:

RE: state size effects latency

2017-10-30 Thread Sofer, Tovi
From: Biplob Biswas [mailto:revolutioni...@gmail.com] Sent: יום ב 30 אוקטובר 2017 11:02 To: Sofer, Tovi [ICG-IT] <ts72...@imceu.eu.ssmb.com> Cc: Narendra Joshi <narendr...@gmail.com>; user <user@flink.apache.org> Subject: Re: state size effects latency Hi Tovi, This might s

RE: state size effects latency

2017-10-30 Thread Sofer, Tovi
Thank you Joshi. We are using currently FsStateBackend since in version 1.3 it supports async snapshots, and no RocksDB. Does anyone else has feedback on this issues? From: Narendra Joshi [mailto:narendr...@gmail.com] Sent: יום א 29 אוקטובר 2017 12:13 To: Sofer, Tovi [ICG-IT] <t

state size effects latency

2017-10-29 Thread Sofer, Tovi
Hi all, In our application we have a requirement to very low latency, preferably less than 5ms. We were able to achieve this so far, but when we start increasing the state size, we see distinctive decrease in latency. We have added MinPauseBetweenCheckpoints, and are using async snapshots. *

RE: kafka consumer parallelism

2017-10-03 Thread Sofer, Tovi
Hi Robert, I had similar issue. For me the problem was that the topic was auto created with one partition. You can alter it to have 5 partitions using kafka-topics command. Example: kafka-topics --alter --partitions 5 --topic fix --zookeeper localhost:2181 Regards, Tovi -Original

RE: Flink kafka consumer that read from two partitions in local mode

2017-09-26 Thread Sofer, Tovi
-up question – is it possible to create the topic with two partitions while creating the FlinkKafKaProducer? Since by default it seems to create it with one partition. Thanks and regards, Tovi From: Sofer, Tovi [ICG-IT] Sent: יום ב 25 ספטמבר 2017 17:18 To: 'Tzu-Li (Gordon) Tai'; Fabian Hueske Cc

RE: Flink kafka consumer that read from two partitions in local mode

2017-09-25 Thread Sofer, Tovi
e into default topic fix Thanks, Tovi From: Tzu-Li (Gordon) Tai [mailto:tzuli...@apache.org] Sent: יום ב 25 ספטמבר 2017 15:06 To: Sofer, Tovi [ICG-IT]; Fabian Hueske Cc: user Subject: RE: Flink kafka consumer that read from two partitions in local mode Hi Tovi, Your code seems to be correct,

RE: Flink kafka consumer that read from two partitions in local mode

2017-09-24 Thread Sofer, Tovi
Thank you Fabian. Fabian, Gordon, am I missing something in consumer setup? Should I configure consumer in some way to subscribe to two partitions? Thanks and regards, Tovi From: Fabian Hueske [mailto:fhue...@gmail.com] Sent: יום ג 19 ספטמבר 2017 22:58 To: Sofer, Tovi [ICG-IT] Cc: user; Tzu-Li

Flink kafka consumer that read from two partitions in local mode

2017-09-19 Thread Sofer, Tovi
Hi, I am trying to setup FlinkKafkaConsumer which reads from two partitions in local mode, using setParallelism=2. The producer writes to two partition (as it is shown in metrics report). But the consumer seems to read always from one partition only. Am I missing something in partition