[ 
https://issues.apache.org/jira/browse/FLINK-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154224#comment-15154224
 ] 

ASF GitHub Bot commented on FLINK-2237:
---------------------------------------

Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/1517#issuecomment-186219363
  
    OK, then let's keep the data in one partition for now. With var-length 
updates, this can fall back to a memory usage / combine behavior that is 
somewhat similar to the sort-based strategy: filling the memory with records 
and then emitting them (putting compaction aside).
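    
    To make the "fill the memory, then emit" behavior concrete, here is a 
minimal sketch. It is not the code from the PR: the class name 
BoundedHashCombiner, the maxEntries limit (a stand-in for the real memory 
budget), and the String/long record types are all assumptions made for 
illustration only.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a bounded hash combiner; not Flink's implementation. */
public class BoundedHashCombiner {
    // Assumption: an entry count stands in for the managed-memory budget.
    private final int maxEntries;
    private final Map<String, Long> table = new HashMap<>();
    private final List<Map.Entry<String, Long>> emitted = new ArrayList<>();
    private long inputCount = 0;

    public BoundedHashCombiner(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    /** Combines the record into the table; emits everything first if a new key does not fit. */
    public void add(String key, long value) {
        inputCount++;
        if (!table.containsKey(key) && table.size() >= maxEntries) {
            flush();
        }
        table.merge(key, value, Long::sum);
    }

    /** Emits all partial aggregates and clears the table. */
    public void flush() {
        for (Map.Entry<String, Long> e : table.entrySet()) {
            emitted.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        table.clear();
    }

    public List<Map.Entry<String, Long>> emitted() {
        return emitted;
    }

    /** Input / output ratio; higher means the combiner collapsed more records. */
    public double combineRate() {
        return emitted.isEmpty() ? 0.0 : (double) inputCount / emitted.size();
    }
}
```

    In this sketch a small maxEntries forces frequent flushes and lowers the 
input / output ratio, which is the effect that comparing against the 
sort-based strategy would have to quantify.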
    
    I'll review the PR once more and will run a few end-to-end benchmarks as well.
    What kind of benchmarks have you done so far? 
    - Did you check the combine rate (input / output ratio) compared to the 
sort-based strategy? 
    - How much memory did you use for tests (upper bound)? Did you vary the 
memory?
    - Have you checked heap memory consumption / GC activity compared to the 
sort-based strategy?
    
    It might take a few more days before I actually get to this, but it is on 
my list.
    
    Thanks,
    Fabian


> Add hash-based Aggregation
> --------------------------
>
>                 Key: FLINK-2237
>                 URL: https://issues.apache.org/jira/browse/FLINK-2237
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: Rafiullah Momand
>            Assignee: Gabor Gevay
>            Priority: Minor
>
> Aggregation functions at the moment are implemented in a sort-based way.
> How can we implement hash based Aggregation for Flink?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
