[ https://issues.apache.org/jira/browse/GIRAPH-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033404#comment-16033404 ]
ASF GitHub Bot commented on GIRAPH-1148:
----------------------------------------

Github user majakabiljo commented on a diff in the pull request:

    https://github.com/apache/giraph/pull/39#discussion_r119686989

    --- Diff: giraph-block-app/src/main/java/org/apache/giraph/block_app/library/Pieces.java ---
    @@ -320,6 +325,91 @@ public String toString() {
         }

       /**
    +   * Like reduceAndBroadcast, but uses array of handles for reducers and
    +   * broadcasts, to make it feasible and performant when values are large.
    +   * Each supplied value to reduce will be reduced in the handle defined by
    +   * handleHashSupplier%numHandles
    +   *
    +   * @param <S> Single value type, objects passed on workers
    +   * @param <R> Reduced value type
    +   * @param <I> Vertex id type
    +   * @param <V> Vertex value type
    +   * @param <E> Edge value type
    +   */
    +  public static
    +  <S, R extends Writable, I extends WritableComparable, V extends Writable,
    +  E extends Writable>
    +  Piece<I, V, E, NoMessage, Object> reduceAndBroadcastWithArrayOfHandles(
    +      final String name,
    +      final int numHandles,
    +      final ReduceOperation<S, R> reduceOp,
    +      final SupplierFromVertex<I, V, E, Long> handleHashSupplier,
    +      final SupplierFromVertex<I, V, E, S> valueSupplier,
    +      final ConsumerWithVertex<I, V, E, R> reducedValueConsumer) {
    +    return new Piece<I, V, E, NoMessage, Object>() {
    +      protected ArrayOfHandles.ArrayOfReducers<S, R> reducers;
    +      protected BroadcastArrayHandle<R> broadcasts;
    +
    +      private int getHandleIndex(Vertex<I, V, E> vertex) {
    +        return (int) Math.abs(handleHashSupplier.get(vertex) % numHandles);
    +      }
    +
    +      @Override
    +      public void registerReducers(
    +          final CreateReducersApi reduceApi, Object executionStage) {
    +        reducers = new ArrayOfHandles.ArrayOfReducers<>(
    +            numHandles,
    +            new Supplier<ReducerHandle<S, R>>() {
    +              @Override
    +              public ReducerHandle<S, R> get() {
    +                return reduceApi.createLocalReducer(reduceOp);
    --- End diff --

    Good catch, it didn't occur to me. I'll fix it not to reuse the same ReduceOperation object.

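The handle selection described in the Javadoc above (hash % numHandles, with Math.abs applied to the remainder) can be illustrated without Giraph. The following is only a minimal sketch: ShardedCountSketch, handleIndex, reduce, shard and shardFactory are hypothetical names, plain HashMaps stand in for the reducer handles, and none of this is the Giraph Block API. It counts component sizes into numHandles smaller maps instead of one large one, and it builds a fresh map per handle through a supplier, which is loosely analogous to not sharing one stateful object across all handles.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch: shards per-component counts across several maps. */
final class ShardedCountSketch {
  private final List<Map<Long, Long>> shards;

  ShardedCountSketch(int numHandles, Supplier<Map<Long, Long>> shardFactory) {
    if (numHandles <= 0) {
      throw new IllegalArgumentException("numHandles must be positive");
    }
    shards = new ArrayList<>(numHandles);
    for (int i = 0; i < numHandles; i++) {
      // Ask the factory once per handle so no two handles share state.
      shards.add(shardFactory.get());
    }
  }

  /**
   * Same arithmetic as getHandleIndex in the diff: the remainder lies
   * strictly between -numHandles and numHandles, so taking Math.abs of
   * the remainder (not of the hash) cannot overflow.
   */
  static int handleIndex(long hash, int numHandles) {
    return (int) Math.abs(hash % numHandles);
  }

  /** Adds one to the count of componentId in the shard chosen by its hash. */
  void reduce(long componentId) {
    int index = handleIndex(componentId, shards.size());
    shards.get(index).merge(componentId, 1L, Long::sum);
  }

  Map<Long, Long> shard(int index) {
    return shards.get(index);
  }

  public static void main(String[] args) {
    ShardedCountSketch counts = new ShardedCountSketch(4, HashMap::new);
    long[] componentIds = {5L, 5L, -7L, 8L, 5L};
    for (long id : componentIds) {
      counts.reduce(id);
    }
    for (int i = 0; i < 4; i++) {
      System.out.println("handle " + i + " -> " + counts.shard(i));
    }
  }
}
```

Note that Math.abs(hash % numHandles) maps -7 with four handles to index 3, whereas Math.floorMod(-7L, 4) would give 1. Both stay within range; they just distribute negative hashes differently.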
> Connected components - make calculate sizes work with large number of components
> --------------------------------------------------------------------------------
>
>                 Key: GIRAPH-1148
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-1148
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>
> Currently, if we have a graph with a large number of connected components,
> calculating the connected component sizes fails because the reducer becomes
> too large. Use an array of handles instead.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)