Re: Measure Latency from source to sink

2018-06-25 Thread antonio saldivar
Thank you very much. I already did #2, but at the moment I print the output, since I am using a trigger alert and evaluating the window, it replaces the toString values with null or 0 and only prints the ones saved in my accumulator and the keyBy value. On Mon, Jun 25, 2018, 9:22 PM Hequn Cheng wrote: >

Re: Measure Latency from source to sink

2018-06-25 Thread Hequn Cheng
Hi antonio, I see two options to solve your problem. 1. Enable latency tracking[1]. But you have to pay attention to its mechanism; for example, a) the sources only *periodically* emit a special record, and b) the latency markers do not account for the time user records spend in operators
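For reference, a minimal sketch of enabling option 1 from the DataStream API (the 1000 ms interval is an arbitrary choice for illustration, not a recommendation from the thread):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LatencyTrackingSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Ask every source to emit a LatencyMarker every 1000 ms; the
        // markers surface as latency histograms in the metrics system.
        env.getConfig().setLatencyTrackingInterval(1000L);

        // ... build and execute the job as usual ...
    }
}
```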

high-availability.storageDir clean up?

2018-06-25 Thread Elias Levy
I noticed in one of our clusters that there are relatively old submittedJobGraph* and completedCheckpoint* files. I was wondering at what point it is safe to clean some of these up.

Measure Latency from source to sink

2018-06-25 Thread antonio saldivar
Hello, I am trying to measure the latency of each transaction traveling across the system. As a DataSource I have a Kafka consumer, and I would like to measure the time it takes from the Source to the Sink. Does anyone have an example? Thank you. Best Regards
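For reference, a minimal sketch of one way to measure this by hand: stamp each record with the wall-clock time at ingestion and take the difference in the sink. Types and names here are placeholders, and the numbers are only meaningful if source and sink clocks are synchronized.

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

public class SourceToSinkLatencySketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("txn-1", "txn-2") // stand-in for the Kafka source
            // Stamp each record with the ingestion time.
            .map(new MapFunction<String, Tuple2<String, Long>>() {
                @Override
                public Tuple2<String, Long> map(String value) {
                    return Tuple2.of(value, System.currentTimeMillis());
                }
            })
            // ... the actual transformations of the job go here ...
            .addSink(new SinkFunction<Tuple2<String, Long>>() {
                @Override
                public void invoke(Tuple2<String, Long> record) {
                    long latencyMs = System.currentTimeMillis() - record.f1;
                    System.out.println(record.f0 + " took " + latencyMs + " ms");
                }
            });

        env.execute("source-to-sink latency sketch");
    }
}
```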

Re: How to partition within same physical node in Flink

2018-06-25 Thread Vijay Balakrishnan
I see a .slotSharingGroup for SingleOutputStreamOperator, which can put parallel instances of operations in the same TM
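For reference, a minimal sketch of that API (the group name and the map operation are placeholders; operators inherit the slot sharing group of their inputs unless one is set explicitly):

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SlotSharingGroupSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("a", "b", "c")
            .map(new MapFunction<String, String>() {
                @Override
                public String map(String value) {
                    return value.toUpperCase();
                }
            })
            // Subtasks of this operator (and of downstream operators that
            // inherit the group) are placed in slots of group "compute".
            .slotSharingGroup("compute")
            .print();

        env.execute("slot sharing group sketch");
    }
}
```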

Re: How to partition within same physical node in Flink

2018-06-25 Thread Vijay Balakrishnan
Thanks, Fabian. Been reading your excellent book on Flink Streaming. Can't wait for more chapters. Attached a pic. [image: partition-by-cam-ts.jpg] I have records with seq# 1 and cam1 and cam2. I also have records with varying seq#'s. By partitioning on the cam field first (keyBy(cam)), I can get cam1
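For reference, a minimal sketch of the keyBy(cam) step being discussed (the (cam, seq#, payload) tuple layout is an assumption, not taken from the attached picture):

```java
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class PartitionByCamSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(
                Tuple3.of("cam1", 1, "frame-a"),
                Tuple3.of("cam2", 1, "frame-b"),
                Tuple3.of("cam1", 2, "frame-c"))
            // All records of one camera go to the same parallel task
            // instance, where per-camera state can order them by seq#.
            .keyBy(new KeySelector<Tuple3<String, Integer, String>, String>() {
                @Override
                public String getKey(Tuple3<String, Integer, String> record) {
                    return record.f0; // the cam field
                }
            })
            .print();

        env.execute("partition by cam sketch");
    }
}
```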

Re: Few question about upgrade from 1.4 to 1.5 flink ( some very basic )

2018-06-25 Thread Vishal Santoshi
I think all I need to add is web.port: 8081 and rest.port: 8081 to the JM flink conf? On Mon, Jun 25, 2018 at 10:46 AM, Vishal Santoshi wrote: > Another issue I saw with the flink cli... > > org.apache.flink.client.program.ProgramInvocationException: The program > execution failed: JobManager did
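If so, the corresponding flink-conf.yaml lines would look like this (a sketch based only on the keys named above; in 1.5, rest.port is the primary option and 8081 is its default):

```yaml
# flink-conf.yaml on the JobManager
rest.port: 8081
web.port: 8081
```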

Re: Few question about upgrade from 1.4 to 1.5 flink ( some very basic )

2018-06-25 Thread Vishal Santoshi
Another issue I saw with flink cli... org.apache.flink.client.program.ProgramInvocationException: The program execution failed: JobManager did not respond within 12 ms at org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:524) at

Re: Few question about upgrade from 1.4 to 1.5 flink ( some very basic )

2018-06-25 Thread Chesnay Schepler
The watermark issue is known and will be fixed in 1.5.1. On 25.06.2018 15:03, Vishal Santoshi wrote: Thank you. One addition: I do not see WM info on the UI (attached). Is this a known issue? The same pipe on our production has the WM (in fact, never had an issue with Watermarks not

Re: Multiple kafka consumers

2018-06-25 Thread zhangminglei
Hi Amol, yes, I think it is. But env.setParallelism(80) means that you set a global parallelism for all operators. Actually, which operators to configure depends on your job. Instead, just setting the source operator's parallelism is enough. Like below, it will be 80 Kafka consumers [also
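The code the message refers to is cut off above; a hedged reconstruction of the idea (the topic, bootstrap servers, and the Kafka 0.11 connector are placeholders for whatever the job actually uses):

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

public class KafkaSourceParallelismSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "my-group");

        env.addSource(new FlinkKafkaConsumer011<>(
                    "my-topic", new SimpleStringSchema(), props))
            // 80 source subtasks: one consumer per partition when the
            // topic has 80 partitions. Other operators keep the default
            // (or their own) parallelism.
            .setParallelism(80)
            .print();

        env.execute("kafka source parallelism sketch");
    }
}
```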

Re: Multiple kafka consumers

2018-06-25 Thread zhangminglei
Hi Amol, as @Sihua said. Also in my case, if the Kafka topic has 80 partitions, I will set the job's source operator parallelism to 80 as well. Cheers, Minglei > On Jun 25, 2018, at 5:39 PM, sihua zhou wrote: > > Hi Amol, > > I think if you set the parallelism of the source node equal to the number of > the

Re: Multiple kafka consumers

2018-06-25 Thread sihua zhou
Hi Amol, I think if you set the parallelism of the source node equal to the number of partitions of the Kafka topic, you could have one Kafka consumer per partition in your job. But if the number of Kafka partitions is dynamic, the 1:1 relationship might break. I think maybe

Re: Few question about upgrade from 1.4 to 1.5 flink ( some very basic )

2018-06-25 Thread Fabian Hueske
Hi Vishal, 1. I don't think a rolling update is possible. Flink 1.5.0 changed the process orchestration and how the processes communicate. IMO, the way to go is to start a Flink 1.5.0 cluster, take a savepoint on the running job, start from the savepoint on the new cluster and shut the old job down. 2.
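For reference, a hedged sketch of that procedure with the CLI (job ID, savepoint directory, and jar name are placeholders):

```sh
# Against the old 1.4 cluster: trigger a savepoint for the running job.
bin/flink savepoint <jobId> hdfs:///flink/savepoints

# Against the new 1.5.0 cluster: start the job from that savepoint.
bin/flink run -s hdfs:///flink/savepoints/savepoint-<id> my-job.jar

# Once the new job is running, shut the old one down.
bin/flink cancel <jobId>
```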

Re: How to partition within same physical node in Flink

2018-06-25 Thread Fabian Hueske
Hi, Flink distributes task instances to slots and does not expose physical machines. Records are partitioned to task instances by hash partitioning. It is also not possible to guarantee that the records in two different operators are sent to the same slot. Sharing information by side-passing it

Re: [BucketingSink] incorrect indexing of part files, when part suffix is specified

2018-06-25 Thread Rinat
Hi Minglei! Thanks for your reply. Really, I have no idea why everything works in your case; I have implemented unit tests in my PR which show that the problem exists. Please let me know which Flink version you use. The current fix targets the current master branch; here is an example of unit

Re: Custom Watermarks with Flink

2018-06-25 Thread Fabian Hueske
Hi, I would not encode this information in watermarks. Watermarks are rather an internal mechanism to reason about event time. Flink also generates watermarks internally, which makes the behavior less predictable. You could either inject special metadata records (which Flink handles just like
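A minimal sketch of the "special metadata record" alternative: wrap user records and control markers in one envelope type, so markers travel through the pipeline exactly like ordinary records (all names here are illustrative, not from the thread):

```java
// Envelope distinguishing ordinary data from injected metadata markers.
public class Envelope {
    public enum Kind { DATA, MARKER }

    public Kind kind;
    public String payload;   // user record, simplified to a String
    public long markerTime;  // set only on MARKER records

    public static Envelope data(String payload) {
        Envelope e = new Envelope();
        e.kind = Kind.DATA;
        e.payload = payload;
        return e;
    }

    public static Envelope marker(long time) {
        Envelope e = new Envelope();
        e.kind = Kind.MARKER;
        e.markerTime = time;
        return e;
    }
}
```

Downstream operators check kind and forward MARKER records unchanged, which keeps the behavior explicit instead of depending on Flink's internal watermark generation.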

Re: CEP: Different consuming strategies within a pattern

2018-06-25 Thread Dawid Wysakowicz
Hi Shailesh, It does not emit results because "followedBy" accepts only the first occurrence of a matching event. Therefore, in your case it only tries to construct the pattern with start(id=2). Try removing this event and you will see it matches the other one. If you want to try to construct a match with
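For reference, a hedged sketch of the followedBy semantics Dawid describes, alongside followedByAny, the variant that also considers later occurrences (the Event class and its id field are placeholders):

```java
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;

public class ContiguitySketch {

    // Placeholder event type for the sketch.
    public static class Event {
        public int id;
    }

    public static void main(String[] args) {
        // followedBy: relaxed contiguity, but only the first matching
        // occurrence after "start" is considered.
        Pattern<Event, ?> first = Pattern.<Event>begin("start")
                .where(new SimpleCondition<Event>() {
                    @Override
                    public boolean filter(Event e) { return e.id == 2; }
                })
                .followedBy("middle");

        // followedByAny: relaxed contiguity that also tries every later
        // matching occurrence, not just the first one.
        Pattern<Event, ?> any = Pattern.<Event>begin("start")
                .where(new SimpleCondition<Event>() {
                    @Override
                    public boolean filter(Event e) { return e.id == 2; }
                })
                .followedByAny("middle");
    }
}
```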

Re: Restore state from save point with add new flink sql

2018-06-25 Thread James (Jian Wu) [FDS Data Platform]
Hi Till: Thanks for your answer. So if I just add new SQL and do not modify the old SQL, then using the `-n`/`--allowNonRestoredState` option to restart the job can resume the old SQL state from savepoints? Regards James From: Till Rohrmann Date: Friday, June 15, 2018 at 8:13 PM To: "James (Jian Wu) [FDS Data
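For reference, a hedged sketch of such a restart with the CLI (savepoint path and jar name are placeholders):

```sh
# -n / --allowNonRestoredState skips savepoint state that no longer maps
# to any operator in the new program.
bin/flink run -s hdfs:///flink/savepoints/savepoint-<id> \
    --allowNonRestoredState my-job.jar
```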

Re: Some doubts related to Rocksdb state backed and checkpointing!

2018-06-25 Thread sihua zhou
Hi Ashwin, I think the questions here might be a bit general, and that could make it a bit hard to offer an answer that meets your expectations exactly. Could you please briefly outline your use case here, associated with the questions? That would definitely make it easier to offer a better

CEP: Different consuming strategies within a pattern

2018-06-25 Thread Shailesh Jain
Hi, I'm trying to detect a sequence like A followed by B, C, D, i.e. there is no strict contiguity between A and B, but strict contiguity between B, C, and D. Sample test case: https://gist.github.com/jainshailesh/57832683fb5137bd306e4844abd9ef86 testStrictFollowedByRelaxedContiguity passes, but
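For reference, a hedged sketch of such a mixed-contiguity pattern (the Event class, its type field, and the is() helper are placeholders, not taken from the linked gist):

```java
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;

public class MixedContiguitySketch {

    // Placeholder event type for the sketch.
    public static class Event {
        public String type;
    }

    private static SimpleCondition<Event> is(final String type) {
        return new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event e) { return type.equals(e.type); }
        };
    }

    public static void main(String[] args) {
        Pattern<Event, ?> pattern = Pattern.<Event>begin("a").where(is("A"))
                // relaxed: other events may occur between A and B
                .followedBy("b").where(is("B"))
                // strict: C must immediately follow B, D must follow C
                .next("c").where(is("C"))
                .next("d").where(is("D"));
    }
}
```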