Re: Optimizing multiple aggregate queries on a CEP using Flink

2018-02-15 Thread Sahil Arora
Thank you, Kostas, for your input. We will try to integrate an optimizer
into Flink and will get back to you if we get stuck.

Regards.

Sahil Arora
Final year B.Tech Undergrad | Indian Institute of Technology Mandi
Web: https://sahilarora535.github.io
LinkedIn: sahilarora535 
Ph: +91-8130506047


Re: Optimizing multiple aggregate queries on a CEP using Flink

2018-02-15 Thread Kostas Kloudas
Hi Sahil,

Currently CEP does not support multi-query optimizations out-of-the-box.
In some cases you can do manual optimizations to your code, but there is 
no optimizer involved.

Cheers,
Kostas



Re: Optimizing multiple aggregate queries on a CEP using Flink

2018-02-15 Thread Sahil Arora
Hi Timo,
Thanks a lot for the help. I look forward to a reply from Kostas
to get more clarity on this.


Sahil Arora
Final year B.Tech Undergrad | Indian Institute of Technology Mandi
Web: https://sahilarora535.github.io
LinkedIn: sahilarora535 
Ph: +91-8130506047


Re: Optimizing multiple aggregate queries on a CEP using Flink

2018-02-12 Thread Timo Walther

Hi Sahil,

I'm not a CEP expert, but I will loop in Kostas (in CC). In general, the
example that you described can easily be implemented with a ProcessFunction
[1]. A process function not only lets you keep state (like a count) but
also lets you set timers flexibly for specific use cases, so that
aggregations can be triggered/reused. So in general I would say that
implementing and testing such an algorithm is possible. How easily it can
be integrated into the CEP API, I don't know.
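
As a rough illustration of the state-plus-timer idea, here is a minimal
sketch in plain Python. It is not Flink code and does not use the Flink API:
the class SharedCountOperator and the method advance_watermark are made up
for this sketch, and process_element / on_timer only loosely mirror the
processElement / onTimer callbacks of a ProcessFunction. It assumes
one-minute panes, both queries evaluated once per minute, and no events
arriving after the watermark has passed their pane.

```python
from collections import defaultdict


class SharedCountOperator:
    """Plain-Python stand-in for a process function (hypothetical, not Flink)."""

    def __init__(self, pane_ms=60_000):
        self.pane_ms = pane_ms
        self.pane_counts = defaultdict(int)  # state: pane end time -> event count
        self.timers = set()                  # registered event-time timers
        self.output = []                     # (timer_ts, count_1min, count_2min)

    def process_element(self, event_ts):
        # Add the event to the one-minute pane that contains it.
        pane_end = (event_ts // self.pane_ms + 1) * self.pane_ms
        self.pane_counts[pane_end] += 1
        # The pane contributes to the results at its own end and one pane later.
        self.timers.update((pane_end, pane_end + self.pane_ms))

    def advance_watermark(self, watermark):
        # Fire, in order, every timer the watermark has reached.
        for ts in sorted(t for t in self.timers if t <= watermark):
            self.timers.discard(ts)
            self.on_timer(ts)

    def on_timer(self, ts):
        one_min = self.pane_counts.get(ts, 0)
        # Reuse: the two-minute count is this pane plus the previous pane.
        two_min = one_min + self.pane_counts.get(ts - self.pane_ms, 0)
        self.output.append((ts, one_min, two_min))
        # The previous pane is not needed by any later timer, so drop it.
        self.pane_counts.pop(ts - self.pane_ms, None)


op = SharedCountOperator()
for ts in (5_000, 20_000, 61_000, 130_000):  # event times in milliseconds
    op.process_element(ts)
op.advance_watermark(240_000)
print(op.output)
# [(60000, 2, 2), (120000, 1, 3), (180000, 1, 2), (240000, 0, 1)]
```

Each event is counted once, and the two-minute result is derived from two
stored pane counts instead of a second pass over the events.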


Regards,
Timo



[1] 
https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/process_function.html




Optimizing multiple aggregate queries on a CEP using Flink

2018-02-09 Thread Sahil Arora
Hi there,
We have been working on a project titled "Optimizing Multiple
Aggregate Queries over a Complex Event Processing Engine". The aim is to
optimize a group of queries. Take, for example, two queries: "how many cars
passed the post in the past 1 minute" and "how many cars passed the post in
the past 2 minutes". The naive and inefficient method is to answer each
query independently, one by one. The better way would be to minimize the
computation by reusing the answer to query 1 when answering query 2.
This is basically our aim: to minimize computation cost when we
have multiple aggregate queries in a CEP engine.
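
To make the reuse concrete, here is a minimal sketch in plain Python (no
CEP engine involved). The function names and the sample timestamps are
made up for illustration, and the sketch assumes both queries are evaluated
at one-minute boundaries, so that the 2-minute count is the sum of two
adjacent 1-minute counts.

```python
from collections import Counter


def per_minute_counts(timestamps_s):
    """Count events per one-minute bucket (bucket index = timestamp // 60)."""
    return Counter(ts // 60 for ts in timestamps_s)


def naive_counts(timestamps_s, minute):
    """Answer each query with its own scan over all events."""
    q1 = sum(1 for ts in timestamps_s if ts // 60 == minute)
    q2 = sum(1 for ts in timestamps_s if minute - 1 <= ts // 60 <= minute)
    return q1, q2


def shared_counts(buckets, minute):
    """Answer query 2 by reusing the answer to query 1."""
    q1 = buckets[minute]
    q2 = q1 + buckets[minute - 1]  # a missing bucket counts as 0
    return q1, q2


crossings = [3, 15, 59, 61, 100, 125]  # seconds at which a car passed the post
buckets = per_minute_counts(crossings)
print(naive_counts(crossings, 1))   # (2, 5)
print(shared_counts(buckets, 1))    # (2, 5)
```

The shared version touches each event once when building the buckets; after
that, each additional window length costs only a few additions.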

We have been searching for a platform that supports CEP, and Flink appears
to be one. Hence, it would be very helpful if we could get some
answers to the following questions:

1. Does Flink already have some method of optimizing multiple aggregate
queries?
2. Is it possible for us to implement / test such an algorithm in Flink
which considers multiple queries in a CEP, like having a database of SQL
queries and testing an algorithm of our choice?

Any other input that may help us solve the problem would be very
welcome.

Thanks a lot.
-- 
Sahil Arora
Final year B.Tech Undergrad | Indian Institute of Technology Mandi
Web: https://sahilarora535.github.io
LinkedIn: sahilarora535 
Ph: +91-8130506047