Re: FlinkCEP latency/throughput

2017-05-19 Thread Dawid Wysakowicz
Hello Alfred,

Just a few thoughts from my side regarding latency. I think the first step
should be to define what "latency" really means for a CEP library.
The first thing that comes to my mind is the time between the arrival of the
event that should trigger a match (the one that completes the pattern) and
the actual time when the match is emitted (the select function is a good
place to take that timestamp, I think).
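
To make that definition concrete, here is a minimal plain-Java sketch (it
does not use any Flink API). It assumes each event carries the wall-clock
time at which it arrived. The names MatchLatency, TimedEvent and emitMillis
are hypothetical and chosen only for illustration; in a real job, emitMillis
would be read inside the select function.

```java
import java.util.List;

public class MatchLatency {

    /** Hypothetical wrapper: an event plus the wall-clock time it arrived. */
    public static final class TimedEvent {
        public final String name;
        public final long arrivalMillis;

        public TimedEvent(String name, long arrivalMillis) {
            this.name = name;
            this.arrivalMillis = arrivalMillis;
        }
    }

    /**
     * Latency of one match: emit time minus the arrival time of the event
     * that completed the pattern, taken here as the latest arrival in the match.
     */
    public static long latencyMillis(List<TimedEvent> match, long emitMillis) {
        if (match.isEmpty()) {
            throw new IllegalArgumentException("a match needs at least one event");
        }
        long lastArrival = Long.MIN_VALUE;
        for (TimedEvent e : match) {
            lastArrival = Math.max(lastArrival, e.arrivalMillis);
        }
        return emitMillis - lastArrival;
    }
}
```

With this definition the result does not depend on how long the pattern took
to complete, only on how long the library needed after the last event came in.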

I think Kostas was also referring to a similar kind of issue.

I hope this helps.

Cheers!

Dawid Wysakowicz

*Data/Software Engineer*

Skype: dawid_wys | Twitter: @OneMoreCoder

<http://getindata.com/>

2017-05-19 10:59 GMT+02:00 Sonex <alfredjens...@gmail.com>:

> Hello Kostas,
>
> thanks for your response. Regarding throughput, it makes sense.
>
> But there is still one question remaining: how can I measure the latency of
> my FlinkCEP application?
>
> Maybe you answered it, but I didn't quite get that. As for your point 2
> about measuring latency, the answer is yes: the first element in the
> matching pattern will inevitably wait longer than the last one.
>
> Thank you for your time!
>
>
>


Re: FlinkCEP latency/throughput

2017-05-17 Thread Dean Wampler
On Wed, May 17, 2017 at 10:34 AM, Kostas Kloudas <k.klou...@data-artisans.com> wrote:

> Hello Alfred,
>
> As a first general remark, Flink was not optimized for multicore deployments
> but rather for distributed environments. This implies overheads (serialization,
> communication etc), when compared to libs optimized for multicores. So there
> may be libraries that are better optimized for those settings if you are
> planning to use just a multicore machine.
>
> Now for your suggestion:
>
...

If you're interested in a multi-core option, check out Akka Streams or perhaps
the underlying Actor Model.



-- 
*Dean Wampler, Ph.D.*
VP, Fast Data Engineering



dean.wamp...@lightbend.com
@deanwampler 
https://www.linkedin.com/in/deanwampler
https://github.com/deanwampler


Re: FlinkCEP latency/throughput

2017-05-17 Thread Kostas Kloudas
Hello Alfred,

As a first general remark, Flink was not optimized for multicore deployments
but rather for distributed environments. This implies overheads (serialization,
communication etc), when compared to libs optimized for multicores. So there
may be libraries that are better optimized for those settings if you are
planning to use just a multicore machine.

Now for your suggestion:

> On May 16, 2017, at 6:03 PM, Sonex wrote:
> 
> Hello everyone,
> 
> I am testing some patterns with FlinkCEP and I want to measure latency and
> throughput when using 1 or more processing cores. How can I do that?
> 
> What I have done so far:
> Latency: Each time an event arrives, I store the system time
> (System.currentTimeMillis). When Flink calls the select function, which means
> we have a full pattern match, I take the system time again. The difference
> between the system time taken for the first event of the complex event and
> the system time taken when the function is called is the latency, for now.
> 

1) If you are using event time, then you are also accounting for internal
buffering and ordering of the incoming events.

2) I am not sure if measuring the time between the arrival of each element and
the moment its matching pattern is emitted makes much sense. In a long pattern,
the first element in the matching pattern will inevitably wait longer than the
last one, right?
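
A small plain-Java sketch of point 2 (no Flink API involved): it computes, for
each element of one match, how long that element waited until the match was
emitted. The class ElementWaits and the sample timestamps in the usage note
are made up for illustration.

```java
public class ElementWaits {

    /**
     * For each element of a match, the time it waited until emission:
     * waits[i] = emitMillis - arrivalMillis[i].
     */
    public static long[] waits(long[] arrivalMillis, long emitMillis) {
        long[] result = new long[arrivalMillis.length];
        for (int i = 0; i < arrivalMillis.length; i++) {
            result[i] = emitMillis - arrivalMillis[i];
        }
        return result;
    }
}
```

For arrivals at 1000, 6000 and 9000 ms and an emit at 9020 ms, the waits are
8020, 3020 and 20 ms. The first two values mostly reflect how far apart the
events arrived, not how fast the pattern matching is.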

> Throughput: I divide the total number of the events of the dataset by the
> time taken to complete the experiment.
> 
> 

For throughput, you could create a job that contains only a CEP pattern and a
sink that does nothing, and count the elements read by your source per minute.
If your source is not the bottleneck, then the CEP part of the pipeline is the
dominating factor (given that your sink just discards everything, it cannot
create backpressure).
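
The counting part could look roughly like the plain-Java sketch below. It is
not tied to any Flink API: ThroughputMeter is a hypothetical helper, and it
assumes the source calls record() once per element it reads and that the
caller supplies timestamps from System.nanoTime().

```java
import java.util.concurrent.atomic.AtomicLong;

public class ThroughputMeter {

    private static final double NANOS_PER_MINUTE = 60_000_000_000.0;

    private final AtomicLong count = new AtomicLong();
    private final long startNanos;

    /** startNanos: the time the measurement began, e.g. System.nanoTime(). */
    public ThroughputMeter(long startNanos) {
        this.startNanos = startNanos;
    }

    /** Call once for every element read by the source. */
    public void record() {
        count.incrementAndGet();
    }

    /** Elements per minute between startNanos and nowNanos. */
    public double perMinute(long nowNanos) {
        long elapsed = nowNanos - startNanos;
        if (elapsed <= 0) {
            return 0.0; // nothing measurable yet
        }
        return count.get() * NANOS_PER_MINUTE / elapsed;
    }
}
```

Reading 1000 elements in 30 seconds gives 2000 elements per minute. The number
only says something about the CEP operator if the source itself can deliver
elements faster than that.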

I hope this helps,
Kostas

FlinkCEP latency/throughput

2017-05-16 Thread Sonex
Hello everyone,

I am testing some patterns with FlinkCEP and I want to measure latency and
throughput when using 1 or more processing cores. How can I do that?

What I have done so far:
Latency: Each time an event arrives, I store the system time
(System.currentTimeMillis). When Flink calls the select function, which means
we have a full pattern match, I take the system time again. The difference
between the system time taken for the first event of the complex event and
the system time taken when the function is called is the latency, for now.

Throughput: I divide the total number of the events of the dataset by the
time taken to complete the experiment.


