In aggregateMessagesWithActiveSet, Spark still has to read all edges. That means a fixed cost that scales with graph size is unavoidable in each Pregel-like iteration.
But what if I have to run nearly 100 iterations, and in the last 50 of them fewer than 0.1% of the vertices need to be updated? That fixed cost makes the job finish in an unacceptable amount of time.

Alcaid

2015-04-08 1:41 GMT+08:00 Ankur Dave <ankurd...@gmail.com>:
> We thought it would be better to simplify the interface, since the
> active set is a performance optimization but the result is identical
> to calling subgraph before aggregateMessages.
>
> The active set option is still there in the package-private method
> aggregateMessagesWithActiveSet. You can actually access it publicly
> via GraphImpl, though the API isn't guaranteed to be stable:
>
> graph.asInstanceOf[GraphImpl[VD,ED]].aggregateMessagesWithActiveSet(...)
>
> Ankur
>
> On Tue, Apr 7, 2015 at 2:56 AM, James <alcaid1...@gmail.com> wrote:
> > Hello,
> >
> > The old api of GraphX "mapReduceTriplets" has an optional parameter
> > "activeSetOpt: Option[(VertexRDD[_]" that limits the input of sendMessage.
> >
> > However, in the new api "aggregateMessages" I could not find this option.
> > Why is it not offered any more?
> >
> > Alcaid
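For reference, the workaround Ankur describes might look like the sketch below. This assumes Spark 1.x GraphX internals: the method is package-private, so the cast to GraphImpl and the exact parameter list are unstable implementation details, not a supported API. The function name `degreesOfActive` and the choice of message type are mine, just for illustration.

```scala
import org.apache.spark.graphx._
import org.apache.spark.graphx.impl.GraphImpl

// Sketch only: aggregateMessagesWithActiveSet is package-private, so this
// relies on GraphX internals that may change between releases.
// `graph` is an existing Graph[Double, Double]; `active` holds the vertices
// that still need updating in the current iteration.
def degreesOfActive(
    graph: Graph[Double, Double],
    active: VertexRDD[Double]): VertexRDD[Int] = {
  graph.asInstanceOf[GraphImpl[Double, Double]]
    .aggregateMessagesWithActiveSet[Int](
      ctx => ctx.sendToDst(1),              // one message per in-edge
      _ + _,                                // merge messages by summing
      TripletFields.None,                   // no vertex/edge attributes needed
      Some((active, EdgeDirection.Either))  // scan only edges touching active vertices
    )
}
```

Note that this only skips the sendMsg calls for inactive edges; as discussed above, the edge partitions themselves are still scanned, so the per-iteration fixed cost remains.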