Thanks Stephan for the explanation and everyone involved. You guys are awesome!
I’ll wait for your the next great release.


cheers!

Andrew


> On 10 Aug 2016, at 16:01, Stephan Ewen <se...@apache.org> wrote:
> 
> Hi!
> 
> In the above example the keySelector would run once before and once inside 
> the window operator. In that sense, the version below is a better way to do 
> it.
> 
> You can also create windows of 50 or max 100 ms by writing your own trigger. 
> Have a look at the count trigger. You can augment it by scheduling a time 
> callback for 100ms to trigger the window.
> https://github.com/apache/flink/blob/master//flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/triggers/Trigger.java
>  
> <https://github.com/apache/flink/blob/master//flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/triggers/Trigger.java>
> 
> 
> The better version of the "random key" program:
> 
> stream
>     .map(new MapFunction<SocialData, Tuple2<SocialData, Integer>>() {
>         private int key;
> 
>         @Override
>         public Tuple2<SocialData, Integer>map(SocialData data) {
>             if (++key >= 24) {
>                 key = 0;
>             }
>             return new Tuple2<>(key, data);
>         }
>     })
>     .keyBy(0)
>     .timeWindow(Time.milliseconds(100))
>     .apply(...)
> 
> 
> Greetings,
> Stephan
> 
> 
> 
> On Wed, Aug 10, 2016 at 3:54 PM, Andrew Ge Wu <andrew.ge...@eniro.com 
> <mailto:andrew.ge...@eniro.com>> wrote:
> Hi Stephan
> 
> Thanks for the explanation! We will stick to 1.0.3 to keep our code clean.
> In the workaround case, how does key selector instantiated? One instance per 
> window operator?
> By the way is there a way to create a hybrid window of count and time, like 
> 50 items or max process time 100ms?
> 
> 
> Thanks!
> 
> Andrew
>> On 10 Aug 2016, at 15:33, Stephan Ewen <se...@apache.org 
>> <mailto:se...@apache.org>> wrote:
>> 
>> Hi Andrew!
>> 
>> Here is the reason for what is happening with your job: 
>> 
>> You have used some sort of undocumented and unofficial corner case behavior 
>> of Flink 1.0.0, namely, using parallel windowAll().
>> Initially, windowAll() was supposed to not be parallel, but the system did 
>> not prevent to set a parallelism.
>> 
>> In Flink 1.0.0 it just happened that a parallel windowAll() behaved like a 
>> "window over stream partition".
>> In Flink 1.1.0, the parallel windowAll() really sends all data to one of the 
>> parallel operators, and the others are idle. Admittedly, Flink 1.1.0 should 
>> simply not allow to set a parallelism on windowAll() - we will fix that.
>> 
>> What we need to figure out now is how to have an adequate replacement for 
>> the "window over stream partition" use case. I think we need to add an 
>> explicit "windowPartition()" function for that case.
>> 
>> Until then, you could stay on Flink 1.0.3 or you can try and use instead of 
>> "windowAll()" a "keyBy().window()" operator and use an incrementing 
>> number%24 as a key (would not be perfectly balanced, but a temporary 
>> workaround):
>> 
>> stream
>>     .keyBy(new KeySelector<SocialData, Integer>() {
>>         private int key;
>> 
>>         @Override
>>         public Integer getKey(SocialData data) {
>>             if (++key >= 24) {
>>                 key = 0;
>>             }
>>             return key;
>>         }
>>     })
>>     .timeWindow(Time.milliseconds(100))
>>     .apply(...)
>> 
>> 
>> Sorry for the inconvenience!
>> 
>> Greetings,
>> Stephan
>> 
>> 
>> 
>> On Wed, Aug 10, 2016 at 1:15 PM, Andrew Ge Wu <andrew.ge...@eniro.com 
>> <mailto:andrew.ge...@eniro.com>> wrote:
>> Hi Aljoscha
>> 
>> We are not using state backend explicitly, recovery and state backend are 
>> pointed to file path.
>> See attached json file 
>> 
>> Confidentiality Notice: This e-mail transmission may contain confidential or 
>> legally privileged information that is intended only for the individual or 
>> entity named in the e-mail address. If you are not the intended recipient, 
>> you are hereby notified that any disclosure, copying, distribution, or 
>> reliance upon the contents of this e-mail is strictly prohibited and may be 
>> unlawful. If you have received this e-mail in error, please notify the 
>> sender immediately by return e-mail and delete all copies of this message.
>> 
>> Thanks for the help.
>> 
>> 
>> Best regards
>> 
>> 
>> Andrew
>> 
>>> On 10 Aug 2016, at 11:38, Aljoscha Krettek <aljos...@apache.org 
>>> <mailto:aljos...@apache.org>> wrote:
>>> 
>>> Oh, are you by any chance specifying a custom state backend for your job? 
>>> For example, RocksDBStateBackend.
>>> 
>>> Cheers,
>>> Aljoscha
>>> 
>>> On Wed, 10 Aug 2016 at 11:17 Aljoscha Krettek <aljos...@apache.org 
>>> <mailto:aljos...@apache.org>> wrote:
>>> Hi,
>>> could you maybe send us the output of "env.getExecutionPlan()". This would 
>>> help us better understand which operators are used exactly. (You can of 
>>> course remove any security sensitive stuff.)
>>> 
>>> Cheers,
>>> Aljoscha
>>> 
>>> On Tue, 9 Aug 2016 at 15:30 Andrew Ge Wu <andrew.ge...@eniro.com 
>>> <mailto:andrew.ge...@eniro.com>> wrote:
>>> Oh sorry missed that part, no, Im not explicitly set that.
>>> 
>>> 
>>>> On 09 Aug 2016, at 15:29, Aljoscha Krettek <aljos...@apache.org 
>>>> <mailto:aljos...@apache.org>> wrote:
>>>> 
>>>> Hi,
>>>> are you setting a StreamTimeCharacteristic, i.e. 
>>>> env.setStreamTimeCharacteristic?
>>>> 
>>>> Cheers,
>>>> Aljoscha
>>>> 
>>>> On Tue, 9 Aug 2016 at 14:52 Andrew Ge Wu <andrew.ge...@eniro.com 
>>>> <mailto:andrew.ge...@eniro.com>> wrote:
>>>> Hi Aljoscha
>>>> 
>>>> 
>>>> Plan attached, there are split streams and union operations around, but 
>>>> here is how windows are created
>>>> 
>>>> Confidentiality Notice: This e-mail transmission may contain confidential 
>>>> or legally privileged information that is intended only for the individual 
>>>> or entity named in the e-mail address. If you are not the intended 
>>>> recipient, you are hereby notified that any disclosure, copying, 
>>>> distribution, or reliance upon the contents of this e-mail is strictly 
>>>> prohibited and may be unlawful. If you have received this e-mail in error, 
>>>> please notify the sender immediately by return e-mail and delete all 
>>>> copies of this message.
>>>> 
>>>> Let me know if I’m doing something out of ordinary here.
>>>> 
>>>> 
>>>> 
>>>> Thanks!
>>>> 
>>>> 
>>>> Andrew
>>>>> On 09 Aug 2016, at 14:18, Aljoscha Krettek <aljos...@apache.org 
>>>>> <mailto:aljos...@apache.org>> wrote:
>>>>> 
>>>> 
>>>>> Hi,
>>>>> could you maybe post how exactly you specify the window? Also, did you 
>>>>> set a "stream time characteristic", for example EventTime?
>>>>> 
>>>>> That could help us pinpoint the problem.
>>>>> 
>>>>> Cheers,
>>>>> Aljoscha
>>>>> 
>>>> 
>>>>> On Tue, 9 Aug 2016 at 12:42 Andrew Ge Wu <andrew.ge...@eniro.com 
>>>>> <mailto:andrew.ge...@eniro.com>> wrote:
>>>> 
>>>>> I rolled back to 1.0.3
>>>> 
>>>>> If I understand this correctly, the peak when topology starts is because 
>>>>> it is trying to fill all the buffers, but I can not see that in 1.1.0.
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>>>> On 09 Aug 2016, at 12:10, Robert Metzger <rmetz...@apache.org 
>>>>>> <mailto:rmetz...@apache.org>> wrote:
>>>>>> 
>>>>> 
>>>>>> Which source are you using?
>>>>>> 
>>>>>> On Tue, Aug 9, 2016 at 11:50 AM, Andrew Ge Wu <andrew.ge...@eniro.com 
>>>>>> <mailto:andrew.ge...@eniro.com>> wrote:
>>>>>> Hi Robert
>>>>>> 
>>>>>> 
>>>>>> Thanks for the quick reply, I guess I’m one of the early birds.
>>>>>> Yes, it is much slower, I’m not sure why, I copied slaves, masters, 
>>>>>> log4j.properties and flink-conf.yaml directly from 1.0.3
>>>>>> I have parallelization 1 on my sources, I can increase that to achieve 
>>>>>> the same speed, but I’m interested to know why is that.
>>>>>> 
>>>>>> 
>>>>>> Thanks!
>>>>>> 
>>>>>> 
>>>>>> Andrew
>>>>>>> On 09 Aug 2016, at 11:47, Robert Metzger <rmetz...@apache.org 
>>>>>>> <mailto:rmetz...@apache.org>> wrote:
>>>>>>> 
>>>>>>> Hi Andrew,
>>>>>>> 
>>>>>>> here is the release announcement, with a list of all changes: 
>>>>>>> http://flink.apache.org/news/2016/08/08/release-1.1.0.html 
>>>>>>> <http://flink.apache.org/news/2016/08/08/release-1.1.0.html>, 
>>>>>>> http://flink.apache.org/blog/release_1.1.0-changelog.html 
>>>>>>> <http://flink.apache.org/blog/release_1.1.0-changelog.html>
>>>>>>> 
>>>>>>> What does the chart say? Are the results different? is Flink faster or 
>>>>>>> slower now?
>>>>>>> 
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Robert
>>>>>>> 
>>>>>>> On Tue, Aug 9, 2016 at 11:32 AM, Andrew Ge Wu <andrew.ge...@eniro.com 
>>>>>>> <mailto:andrew.ge...@eniro.com>> wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> We found out there is a new stable version released: 1.1.0 but we can 
>>>>>>> not find any release note.
>>>>>>> Do anyone know where to find it?
>>>>>>> 
>>>>>>> 
>>>>>>> We are experience some change of behavior, I’m not sure if it is 
>>>>>>> related.
>>>>>>> 
>>>>>>> <PastedGraphic-1.png>
>>>>>>> 
>>>>>>> Thanks
>>>>>>> 
>>>>>>> 
>>>>>>> Andrew
>>>>>>> 
>>>>>>> Confidentiality Notice: This e-mail transmission may contain 
>>>>>>> confidential or legally privileged information that is intended only 
>>>>>>> for the individual or entity named in the e-mail address. If you are 
>>>>>>> not the intended recipient, you are hereby notified that any 
>>>>>>> disclosure, copying, distribution, or reliance upon the contents of 
>>>>>>> this e-mail is strictly prohibited and may be unlawful. If you have 
>>>>>>> received this e-mail in error, please notify the sender immediately by 
>>>>>>> return e-mail and delete all copies of this message.
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Confidentiality Notice: This e-mail transmission may contain 
>>>>>> confidential or legally privileged information that is intended only for 
>>>>>> the individual or entity named in the e-mail address. If you are not the 
>>>>>> intended recipient, you are hereby notified that any disclosure, 
>>>>>> copying, distribution, or reliance upon the contents of this e-mail is 
>>>>>> strictly prohibited and may be unlawful. If you have received this 
>>>>>> e-mail in error, please notify the sender immediately by return e-mail 
>>>>>> and delete all copies of this message.
>>>>>> 
>>>>> 
>>>>> Confidentiality Notice: This e-mail transmission may contain confidential 
>>>>> or legally privileged information that is intended only for the 
>>>>> individual or entity named in the e-mail address. If you are not the 
>>>>> intended recipient, you are hereby notified that any disclosure, copying, 
>>>>> distribution, or reliance upon the contents of this e-mail is strictly 
>>>>> prohibited and may be unlawful. If you have received this e-mail in 
>>>>> error, please notify the sender immediately by return e-mail and delete 
>>>>> all copies of this message.
>>> 
>>> 
>>> Confidentiality Notice: This e-mail transmission may contain confidential 
>>> or legally privileged information that is intended only for the individual 
>>> or entity named in the e-mail address. If you are not the intended 
>>> recipient, you are hereby notified that any disclosure, copying, 
>>> distribution, or reliance upon the contents of this e-mail is strictly 
>>> prohibited and may be unlawful. If you have received this e-mail in error, 
>>> please notify the sender immediately by return e-mail and delete all copies 
>>> of this message.
>> 
>> 
>> 
> 
> 
> Confidentiality Notice: This e-mail transmission may contain confidential or 
> legally privileged information that is intended only for the individual or 
> entity named in the e-mail address. If you are not the intended recipient, 
> you are hereby notified that any disclosure, copying, distribution, or 
> reliance upon the contents of this e-mail is strictly prohibited and may be 
> unlawful. If you have received this e-mail in error, please notify the sender 
> immediately by return e-mail and delete all copies of this message.
> 


-- 
Confidentiality Notice: This e-mail transmission may contain confidential 
or legally privileged information that is intended only for the individual 
or entity named in the e-mail address. If you are not the intended 
recipient, you are hereby notified that any disclosure, copying, 
distribution, or reliance upon the contents of this e-mail is strictly 
prohibited and may be unlawful. If you have received this e-mail in error, 
please notify the sender immediately by return e-mail and delete all copies 
of this message.

Reply via email to