Re: Parallel CEP

2019-01-23 Thread Dian Fu
I'm afraid you cannot do that. The inputs having the same key should be 
processed by the same CEP operator. Otherwise the results will be 
nondeterministic and also be wrong.

Regards,
Dian

> 在 2019年1月24日,下午2:56,dhanuka ranasinghe  写道:
> 
> In this example key will be same. I am using 1 million messages with same key 
> for performance testing. But still I want to process them parallel. Can't I 
> use Split function and get a SplitStream for that purpose?
> 
> On Thu, Jan 24, 2019 at 2:49 PM Dian Fu  > wrote:
> Hi Dhanuka,
> 
> Does the KeySelector of Event::getTriggerID generate the same key for all the 
> inputs or only generate very few key values and these key values happen to be 
> hashed to the same downstream operator? You can print the results of 
> Event::getTriggerID to check if it's that case.
> 
> Regards,
> Dian
> 
>> 在 2019年1月24日,下午2:08,dhanuka ranasinghe > > 写道:
>> 
>> Hi Dian,
>> 
>> Thanks for the explanation. Please find the screen shot and source code for 
>> above mention use case. And in main issue is though I use KeyedStream , 
>> parallelism not apply properly.
>> Only one host is processing messages.
>> 
>> 
>> 
>> Cheers,
>> Dhanuka
>> 
>> On Thu, Jan 24, 2019 at 1:40 PM Dian Fu > > wrote:
>> Whether using KeyedStream depends on the logic of your job, i.e, whether you 
>> are looking for patterns for some partitions, i.e, patterns for a particular 
>> user. If so, you should partition the input data before the CEP operator. 
>> Otherwise, the input data should not be partitioned.
>> 
>> Regards,
>> Dian 
>> 
>>> 在 2019年1月24日,下午12:37,dhanuka ranasinghe >> > 写道:
>>> 
>>> Hi Dian,
>>> 
>>> I tried that but then kafkaproducer only produce to single partition and 
>>> only single flink host working while rest not contribute for processing . I 
>>> will share the code and screenshot
>>> 
>>> Cheers 
>>> Dhanuka
>>> 
>>> On Thu, 24 Jan 2019, 12:31 Dian Fu >>  wrote:
>>> Hi Dhanuka,
>>> 
>>> In order to make the CEP operator to run parallel, the input stream should 
>>> be KeyedStream. You can refer [1] for detailed information.
>>> 
>>> Regards,
>>> Dian
>>> 
>>> [1]: 
>>> https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns
>>>  
>>> 
>>> 
>>> > 在 2019年1月24日,上午10:18,dhanuka ranasinghe >> > > 写道:
>>> > 
>>> > Hi All,
>>> > 
>>> > Is there way to run CEP function parallel. Currently CEP run only 
>>> > sequentially
>>> > 
>>> > 
>>> > .
>>> > 
>>> > Cheers,
>>> > Dhanuka
>>> > 
>>> > -- 
>>> > Nothing Impossible,Creativity is more important than knowledge.
>>> 
>> 
>> 
>> 
>> -- 
>> Nothing Impossible,Creativity is more important than knowledge.
>> 
> 
> 
> 
> -- 
> Nothing Impossible,Creativity is more important than knowledge.



Re: Parallel CEP

2019-01-23 Thread dhanuka ranasinghe
In this example key will be same. I am using 1 million messages with same
key for performance testing. But still I want to process them parallel.
Can't I use Split function and get a SplitStream for that purpose?

On Thu, Jan 24, 2019 at 2:49 PM Dian Fu  wrote:

> Hi Dhanuka,
>
> Does the KeySelector of Event::getTriggerID generate the same key for all
> the inputs or only generate very few key values and these key values happen
> to be hashed to the same downstream operator? You can print the results of
> Event::getTriggerID to check if it's that case.
>
> Regards,
> Dian
>
> 在 2019年1月24日,下午2:08,dhanuka ranasinghe  写道:
>
> Hi Dian,
>
> Thanks for the explanation. Please find the screen shot and source code
> for above mention use case. And in main issue is though I use KeyedStream ,
> parallelism not apply properly.
> Only one host is processing messages.
>
> 
>
> Cheers,
> Dhanuka
>
> On Thu, Jan 24, 2019 at 1:40 PM Dian Fu  wrote:
>
>> Whether using KeyedStream depends on the logic of your job, i.e, whether
>> you are looking for patterns for some partitions, i.e, patterns for a
>> particular user. If so, you should partition the input data before the CEP
>> operator. Otherwise, the input data should not be partitioned.
>>
>> Regards,
>> Dian
>>
>> 在 2019年1月24日,下午12:37,dhanuka ranasinghe  写道:
>>
>> Hi Dian,
>>
>> I tried that but then kafkaproducer only produce to single partition and
>> only single flink host working while rest not contribute for processing . I
>> will share the code and screenshot
>>
>> Cheers
>> Dhanuka
>>
>> On Thu, 24 Jan 2019, 12:31 Dian Fu >
>>> Hi Dhanuka,
>>>
>>> In order to make the CEP operator to run parallel, the input stream
>>> should be KeyedStream. You can refer [1] for detailed information.
>>>
>>> Regards,
>>> Dian
>>>
>>> [1]:
>>> https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns
>>>
>>> > 在 2019年1月24日,上午10:18,dhanuka ranasinghe 
>>> 写道:
>>> >
>>> > Hi All,
>>> >
>>> > Is there way to run CEP function parallel. Currently CEP run only
>>> sequentially
>>> >
>>> > 
>>> > .
>>> >
>>> > Cheers,
>>> > Dhanuka
>>> >
>>> > --
>>> > Nothing Impossible,Creativity is more important than knowledge.
>>>
>>>
>>
>
> --
> Nothing Impossible,Creativity is more important than knowledge.
> 
>
>
>

-- 
Nothing Impossible,Creativity is more important than knowledge.


Re: Parallel CEP

2019-01-23 Thread Dian Fu
Hi Dhanuka,

Does the KeySelector of Event::getTriggerID generate the same key for all the 
inputs or only generate very few key values and these key values happen to be 
hashed to the same downstream operator? You can print the results of 
Event::getTriggerID to check if it's that case.

Regards,
Dian

> 在 2019年1月24日,下午2:08,dhanuka ranasinghe  写道:
> 
> Hi Dian,
> 
> Thanks for the explanation. Please find the screen shot and source code for 
> above mention use case. And in main issue is though I use KeyedStream , 
> parallelism not apply properly.
> Only one host is processing messages.
> 
> 
> 
> Cheers,
> Dhanuka
> 
> On Thu, Jan 24, 2019 at 1:40 PM Dian Fu  > wrote:
> Whether using KeyedStream depends on the logic of your job, i.e, whether you 
> are looking for patterns for some partitions, i.e, patterns for a particular 
> user. If so, you should partition the input data before the CEP operator. 
> Otherwise, the input data should not be partitioned.
> 
> Regards,
> Dian 
> 
>> 在 2019年1月24日,下午12:37,dhanuka ranasinghe > > 写道:
>> 
>> Hi Dian,
>> 
>> I tried that but then kafkaproducer only produce to single partition and 
>> only single flink host working while rest not contribute for processing . I 
>> will share the code and screenshot
>> 
>> Cheers 
>> Dhanuka
>> 
>> On Thu, 24 Jan 2019, 12:31 Dian Fu >  wrote:
>> Hi Dhanuka,
>> 
>> In order to make the CEP operator to run parallel, the input stream should 
>> be KeyedStream. You can refer [1] for detailed information.
>> 
>> Regards,
>> Dian
>> 
>> [1]: 
>> https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns
>>  
>> 
>> 
>> > 在 2019年1月24日,上午10:18,dhanuka ranasinghe > > > 写道:
>> > 
>> > Hi All,
>> > 
>> > Is there way to run CEP function parallel. Currently CEP run only 
>> > sequentially
>> > 
>> > 
>> > .
>> > 
>> > Cheers,
>> > Dhanuka
>> > 
>> > -- 
>> > Nothing Impossible,Creativity is more important than knowledge.
>> 
> 
> 
> 
> -- 
> Nothing Impossible,Creativity is more important than knowledge.
> 



Re: Parallel CEP

2019-01-23 Thread Dian Fu
Whether using KeyedStream depends on the logic of your job, i.e, whether you 
are looking for patterns for some partitions, i.e, patterns for a particular 
user. If so, you should partition the input data before the CEP operator. 
Otherwise, the input data should not be partitioned.

Regards,
Dian 

> 在 2019年1月24日,下午12:37,dhanuka ranasinghe  写道:
> 
> Hi Dian,
> 
> I tried that but then kafkaproducer only produce to single partition and only 
> single flink host working while rest not contribute for processing . I will 
> share the code and screenshot
> 
> Cheers 
> Dhanuka
> 
> On Thu, 24 Jan 2019, 12:31 Dian Fu   wrote:
> Hi Dhanuka,
> 
> In order to make the CEP operator to run parallel, the input stream should be 
> KeyedStream. You can refer [1] for detailed information.
> 
> Regards,
> Dian
> 
> [1]: 
> https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns
>  
> 
> 
> > 在 2019年1月24日,上午10:18,dhanuka ranasinghe  > > 写道:
> > 
> > Hi All,
> > 
> > Is there way to run CEP function parallel. Currently CEP run only 
> > sequentially
> > 
> > 
> > .
> > 
> > Cheers,
> > Dhanuka
> > 
> > -- 
> > Nothing Impossible,Creativity is more important than knowledge.
> 



Re: Parallel CEP

2019-01-23 Thread Dian Fu
Hi Dhanuka,

In order to make the CEP operator to run parallel, the input stream should be 
KeyedStream. You can refer [1] for detailed information.

Regards,
Dian

[1]: 
https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns

> 在 2019年1月24日,上午10:18,dhanuka ranasinghe  写道:
> 
> Hi All,
> 
> Is there way to run CEP function parallel. Currently CEP run only sequentially
> 
> 
> .
> 
> Cheers,
> Dhanuka
> 
> -- 
> Nothing Impossible,Creativity is more important than knowledge.



Re: Parallel CEP

2019-01-23 Thread dhanuka ranasinghe
Hi Dian,

I tried that but then kafkaproducer only produce to single partition and
only single flink host working while rest not contribute for processing . I
will share the code and screenshot

Cheers
Dhanuka

On Thu, 24 Jan 2019, 12:31 Dian Fu  Hi Dhanuka,
>
> In order to make the CEP operator to run parallel, the input stream should
> be KeyedStream. You can refer [1] for detailed information.
>
> Regards,
> Dian
>
> [1]:
> https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns
>
> > 在 2019年1月24日,上午10:18,dhanuka ranasinghe 
> 写道:
> >
> > Hi All,
> >
> > Is there way to run CEP function parallel. Currently CEP run only
> sequentially
> >
> > 
> > .
> >
> > Cheers,
> > Dhanuka
> >
> > --
> > Nothing Impossible,Creativity is more important than knowledge.
>
>


Parallel CEP

2019-01-23 Thread dhanuka ranasinghe
Hi All,

Is there way to run CEP function parallel. Currently CEP run only
sequentially

[image: flink-CEP.png]
.

Cheers,
Dhanuka

-- 
Nothing Impossible,Creativity is more important than knowledge.