[ 
https://issues.apache.org/jira/browse/FLINK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742905#comment-14742905
 ] 

ASF GitHub Bot commented on FLINK-2661:
---------------------------------------

Github user vasia commented on the pull request:

    https://github.com/apache/flink/pull/1124#issuecomment-139959564
  
    Hi @andralungu,
    
    I was under the impression that we never really reached consensus on the 
mailing list thread regarding this addition.
    
    I definitely think we need to handle skewed graphs and you've done great 
work, but I wouldn't add this to Gelly in its current state.
    
    - This is a very recent method that has not been tested thoroughly (apart 
from the experiments in your thesis work). Its benefits and overheads are not 
yet well understood.
    - The API certainly needs rethinking. This should be a transparent method 
that would be easy to activate with a flag/option. Right now, it seems to me 
that it's too complicated to use and can easily allow erroneous implementations.
    
    For now I would suggest that we keep this in your personal repository and 
link to it from the Gelly documentation as an additional/experimental feature. 
Once we have a better understanding of the technique and have thought of a 
nicer API, we can reconsider adding it. What do you think?


> Add a Node Splitting Technique to Overcome the Limitations of Skewed Graphs
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-2661
>                 URL: https://issues.apache.org/jira/browse/FLINK-2661
>             Project: Flink
>          Issue Type: Task
>          Components: Gelly
>    Affects Versions: 0.10
>            Reporter: Andra Lungu
>            Assignee: Andra Lungu
>
> Skewed graphs pose unique challenges for computation models such as Gelly's 
> vertex-centric or GSA iterations. This is mainly because these approaches 
> process all vertices uniformly, regardless of their degree. 
> In vertex-centric, for instance, a skewed (high-degree) node takes longer to 
> process its neighbors than the other nodes in the graph. It acts as a 
> straggler, leaving the other nodes idle until it finishes its computation. 
> This issue can be mitigated by splitting a high-degree node into subnodes and 
> evenly distributing its edges among the resulting subvertices. The 
> computation will then be performed on the split vertex. 
> To this end, we should add a Splitting API on top of Gelly which can help:
> - determine skewed nodes 
> - split them
> - merge them back at the end of the computation, given a user-defined 
> combiner.
> To illustrate the usage of these methods, we should add an example as well as 
> a separate entry in the documentation.
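
The three steps listed in the issue description (determine skewed nodes, split
them, merge them back with a combiner) could be sketched roughly as follows.
This is plain Python, not Gelly code and not the API from the pull request: the
names SubVertex, find_skewed, split_vertices, merge_results and neighbor_sum,
as well as the threshold and fanout parameters, are hypothetical. The sketch
assumes out-degree as the skew measure, round-robin edge distribution, and a
single neighbor-sum aggregation standing in for a full vertex-centric or GSA
iteration.

```python
from collections import defaultdict
from functools import reduce
from operator import add
from typing import NamedTuple


class SubVertex(NamedTuple):
    """Hypothetical ID for the i-th piece of a split vertex."""
    parent: int
    index: int


def find_skewed(edges, threshold):
    """Return the vertices whose out-degree exceeds `threshold` (assumed skew test)."""
    degree = defaultdict(int)
    for src, _ in edges:
        degree[src] += 1
    return {v for v, d in degree.items() if d > threshold}


def split_vertices(edges, skewed, fanout):
    """Replace each skewed source by `fanout` subvertices, dealing edges round-robin."""
    next_slot = defaultdict(int)
    result = []
    for src, dst in edges:
        if src in skewed:
            result.append((SubVertex(src, next_slot[src] % fanout), dst))
            next_slot[src] += 1
        else:
            result.append((src, dst))
    return result


def neighbor_sum(edges):
    """Toy per-vertex computation: sum the IDs of each source's out-neighbors."""
    total = defaultdict(int)
    for src, dst in edges:
        total[src] += dst
    return dict(total)


def merge_results(partials, combiner):
    """Fold subvertex results back into their parent using a user-defined combiner."""
    grouped = defaultdict(list)
    for key, value in partials.items():
        parent = key.parent if isinstance(key, SubVertex) else key
        grouped[parent].append(value)
    return {v: reduce(combiner, values) for v, values in grouped.items()}


# Vertex 0 is a hub with six out-edges; vertex 1 has one.
edges = [(0, d) for d in range(1, 7)] + [(1, 2)]
skewed = find_skewed(edges, threshold=3)
split = split_vertices(edges, skewed, fanout=3)
merged = merge_results(neighbor_sum(split), add)
print(merged == neighbor_sum(edges))  # prints True
```

The merge step only gives the same answer as the unsplit computation when the
combiner is associative and commutative (as addition is here), which is one
reason the choice of combiner is left to the user in the proposal.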



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
