Github user vasia commented on the pull request:

    https://github.com/apache/flink/pull/1124#issuecomment-140146467
  
    Sure, I'm not saying you shouldn't have opened this PR. I'm trying to save 
you some time.
    
    So, my opinion is that in order for this to be useful, we need 2 things:
    - Understand the performance implications of the method. When is it 
beneficial? What's the memory overhead? What's the pre-processing overhead? How 
does it depend on the input? What if I give a wrong threshold? etc.
    - Make this as automatic as possible. Ideally, the user would only have to 
provide the combiner function and the threshold. Splitting and merging should be 
handled internally. Actually, we could probably find ways to determine the 
threshold automatically, too, based on the graph, the current load and the 
available memory.
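
    To make the second point concrete, here is a rough sketch of what "the user 
only gives the combiner function and the threshold" could look like. It is not 
the PR's code and not a Gelly API: it is plain Python, the name 
`aggregate_with_splitting` is hypothetical, and it assumes that "splitting" 
means cutting the values of a skewed vertex into chunks of at most `threshold` 
elements, and that the combiner is associative.

```python
from collections import defaultdict
from functools import reduce


def aggregate_with_splitting(messages, combiner, threshold):
    """Combine the values of each vertex, splitting skewed vertices internally.

    messages:  iterable of (vertex_id, value) pairs
    combiner:  associative binary function, e.g. operator.add or max (assumed)
    threshold: maximum number of values handled by one sub-vertex (assumed >= 1)
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")

    # Group the incoming values by target vertex.
    grouped = defaultdict(list)
    for vertex_id, value in messages:
        grouped[vertex_id].append(value)

    result = {}
    for vertex_id, values in grouped.items():
        # Split: each chunk plays the role of one sub-vertex and is
        # combined independently of the others.
        partials = [
            reduce(combiner, values[start:start + threshold])
            for start in range(0, len(values), threshold)
        ]
        # Merge: fold the partial results back into the original vertex.
        result[vertex_id] = reduce(combiner, partials)
    return result
```

    The caller never sees the sub-vertices; for an associative combiner the 
result is the same for any valid threshold, so the threshold only changes how 
the work is divided, not the answer.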
    
    These are not easy tasks and will need a lot of work. That's why I'm 
proposing to link to the current state of this method from the Gelly docs, so 
that we can get feedback. We could create something like a "research projects 
on Gelly" page, where we link to work in progress from our roadmap tasks.
    
    That's just my view, but let's see what other people think :)

