Thank you, guys. I was looking for a benchmark to add to an
"official" document as a reference. Somebody asked me to do this, and I
agree that the choice of grouping policy depends about 90% on the business
logic. I was just wondering if there was any "public result" I wasn't able
to find!

Thank you again.
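
The trade-off described in the quoted replies below (localOrShuffle wins only
while work stays evenly distributed) can be illustrated with a toy simulation.
This is a rough sketch, not a benchmark and not Storm code: the names
`simulate`, `PLACEMENT`, and `EMITS` are hypothetical, and it assumes shuffle
picks uniformly among all downstream tasks, while localOrShuffle picks
uniformly among downstream tasks in the emitter's own worker when any exist.

```python
import random
from collections import Counter

# Hypothetical layout: downstream task -> worker it runs in.
PLACEMENT = {"b0": "w0", "b1": "w0", "b2": "w1", "b3": "w1"}
# Hypothetical upstream tasks: name -> (worker, tuples emitted).
# The 100:1 skew mimics a variable 1-100 fan-out per input tuple.
EMITS = {"a0": ("w0", 10_000), "a1": ("w1", 100)}


def simulate(grouping, emits, placement, seed=0):
    """Return a Counter of tuples received by each downstream task."""
    rng = random.Random(seed)
    load = Counter({task: 0 for task in placement})
    all_tasks = sorted(placement)
    for worker, n_tuples in emits.values():
        # Downstream tasks that share a worker with this upstream task.
        local = [t for t in all_tasks if placement[t] == worker]
        if grouping == "local_or_shuffle" and local:
            targets = local      # stay in-process, no network hop
        else:
            targets = all_tasks  # plain shuffle across every task
        for _ in range(n_tuples):
            load[rng.choice(targets)] += 1
    return load


if __name__ == "__main__":
    for grouping in ("shuffle", "local_or_shuffle"):
        print(grouping, dict(simulate(grouping, EMITS, PLACEMENT)))
```

With this made-up input, local_or_shuffle leaves the two tasks on w0 with all
10,000 tuples from a0 while the tasks on w1 share only 100, whereas shuffle
spreads the load roughly evenly at the cost of sending about half of the
tuples to the other worker.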




On Wed, Mar 5, 2014 at 7:22 PM, Michael Rose <[email protected]> wrote:

> +1, localOrShuffle will be a winner, as long as it's evenly distributing
> work. If one tuple could produce, say, anywhere from 1 to 100 resultant
> tuples (and these results were expensive enough to process, e.g. IO), it
> might well be worth using shuffle instead of localOrShuffle.
>
> Michael Rose (@Xorlev <https://twitter.com/xorlev>)
> Senior Platform Engineer, FullContact <http://www.fullcontact.com/>
> [email protected]
>
>
> On Wed, Mar 5, 2014 at 11:19 AM, Nathan Leung <[email protected]> wrote:
>
>> In my experience on a 1 Gb network localOrShuffleGrouping was a clear
>> winner in terms of performance.  But I haven't tested with 10 Gb, and if
>> you have substantial business logic then that becomes a bigger factor than
>> serializing/transferring data on the network.  I think the performance of
>> any given grouping is too dependent on your business logic; it will be
>> difficult to quantify how well it performs in a canned benchmark.  And
>> sometimes your business logic will define a grouping for you (e.g. fields
>> grouping) whether it's the best performer or not.
>>
>>
>> On Wed, Mar 5, 2014 at 1:05 PM, Roberto Coluccio <
>> [email protected]> wrote:
>>
>>> Hello Michael, thanks for your feedback.
>>>
>>> I'm looking for a performance comparison. I know that not all the
>>> policies are "really comparable", but even obvious comparisons all listed
>>> together could be a useful reference.
>>>
>>> Roberto
>>>
>>>
>>> On Wed, Mar 5, 2014 at 6:58 PM, Michael Rose <[email protected]> wrote:
>>>
>>>> What kind of comparisons are you looking for? How they work
>>>> functionally?
>>>>
>>>> Michael Rose (@Xorlev <https://twitter.com/xorlev>)
>>>> Senior Platform Engineer, FullContact <http://www.fullcontact.com/>
>>>> [email protected]
>>>>
>>>>
>>>> On Wed, Mar 5, 2014 at 9:52 AM, Roberto Coluccio <
>>>> [email protected]> wrote:
>>>>
>>>>> Hello folks,
>>>>>
>>>>> I was unable to find any complete example (or, better, related work in
>>>>> the scientific literature) in which (almost) all the *stream grouping
>>>>> policies* have been used and compared. Do you have any reference you
>>>>> could please share with me?
>>>>>
>>>>> Thank you and best regards,
>>>>>
>>>>> Roberto Coluccio
>>>>>
>>>>
>>>>
>>>
>>
>
