It's probably easier if you re-phrase your complexity in terms of number of iterations and number of messages being produced per iteration, both dependent on your N. That should give you a better idea of feasibility with giraph.

## Advertising

On Thu, Mar 8, 2012 at 1:18 PM, Timmy Wilson <tim...@smarttypes.org> wrote: > In it's ideal form (no artificial information decay) the algo is n*n > > Every node communicates it's position to every other node > > From the node's perspective, it considers the positions of every other > node and it's relationship to the node before making a move > > If n*n will scale to 5M nodes on giraph then it's simple > > If not, we have to condense the messaging -- Barnes-Hut simulation > using an Octree or Quadtree structure is one option -- > http://en.wikipedia.org/wiki/Barnes%E2%80%93Hut_simulation -- this > reduces the algo to n*log(n), but adds an additional graph to consider > > > > On Wed, Mar 7, 2012 at 11:57 AM, Claudio Martella > <claudio.marte...@gmail.com> wrote: >> Ok, but why a bipartite graph? I can imagine you can have 2 or 3 >> coordinates associated with each vertices without actually having multiple >> vertex types. >> >> >> On Wednesday, March 7, 2012, Timmy Wilson <tim...@smarttypes.org> wrote: >>>> while i'd expect from linlog something like an (x,y) coordinate. is >>>> that correct? >>> >>> (x,y,z) is also common -- i think of it as extreme dimension reduction >>> -- after which you're free to use your favorite gis tool -- >>> http://postgis.refractions.net/ -- you can also pull in tools from >>> statistical mechanics/thermodynamics >>> >>> Without complicating the algo for efficiency -- the essence of the >>> energy model is simple -- all nodes repulse each other -- connected >>> nodes attract each other -- this works for any type of graph -- i'm >>> using a weighted directed graph -- in which case the weighs influence >>> the attractive force >>> >>> It's all very continuous/natural -- living in a 3d world we've build a >>> lot of tools/methods to process this kind of information >>> >>> The problem is without efficiency methods like Barnes-Hut we have >>> Cartesian(x,y) or Cartesian(x,y,z) -- because every vertex influences >>> every other vertex >>> >>> An efficient giraph implementation would need 2 layers -- a bipartite >>> graph -- assuming (x,y,z) reduction -- we would need an Octree graph >>> interacting dynamically w/ our graph of interest >>> >>> >>> On Wed, Mar 7, 2012 at 5:00 AM, Claudio Martella >>> <claudio.marte...@gmail.com> wrote: >>>> My personal take is that they do have similar function (they "extract" >>>> communities), but they have a general different type of output. in >>>> label propagation you'd end up with an id for each vertex (the vertex >>>> id that is the centroid for the community each vertex belongs to), >>>> while i'd expect from linlog something like an (x,y) coordinate. is >>>> that correct? >>>> >>>> On Wed, Mar 7, 2012 at 2:45 AM, Timmy Wilson <tim...@smarttypes.org> >> wrote: >>>>> Thank you everyone! >>>>> >>>>> I would love to see a comparison of force directed layouts >>>>> (specifically LinLog) and label propagation. >>>>> >>>>> I searched but alas nothing -- they seem to be oddly similar? >>>>> >>>>> The current, serial LinLog implementation -- >>>>> http://code.google.com/p/linloglayout/ -- uses Barnes-Hut simulation >>>>> -- n*log(n): >>>>> >>>>> http://en.wikipedia.org/wiki/Barnes%E2%80%93Hut_simulation >>>>> >>>>> I guess the root question is -- do you think it's reasonable to use >>>>> giraph for Barnes-Hut simulation? >>>>> >>>>> >>>>> >>>>> >>>>> On Tue, Mar 6, 2012 at 11:34 AM, Claudio Martella >>>>> <claudio.marte...@gmail.com> wrote: >>>>>> Hi, >>>>>> >>>>>> I'm not definitely familiar with the algorithm or implementation of >>>>>> LinLog, I've been just a user. It should be doable with Giraph if you >>>>>> can express it in terms of message-passing between vertices and >>>>>> without a dependency on a global view of the graph (except for the >>>>>> convergence criteria, such as total energy). >>>>>> >>>>>> Please consider that Giraph's data model is based on a directed graph, >>>>>> this should be a quite "interesting" constraint for you, if your >>>>>> implementation is going to modify energy associated with edges (you'd >>>>>> have two views over the undirected edge, one in each endpoint). >>>>>> >>>>>> In general, a good way of doing community analysis would be to look at >>>>>> algorithms that belong to the family of label-propagation clustering >>>>>> algorithms. >>>>>> >>>>>> >>>>>> Hope this helps, >>>>>> Claudio >>>>>> >>>>>> On Tue, Mar 6, 2012 at 3:28 PM, Timmy Wilson <tim...@smarttypes.org> >> wrote: >>>>>>> Hi giraph community, >>>>>>> >>>>>>> I'm interested in using giraph for distributed n-body simulation. >>>>>>> >>>>>>> Initially, i'm interested in force directed layouts -- ie, graph >> drawing: >>>>>>> >>>>>>> http://en.wikipedia.org/wiki/Force-based_algorithms_(graph_drawing) >>>>>>> >>>>>>> I'm interested specifically in Dr. Andreas Noack's LinLog energy model >>>>>>> -- which performs well w/ community detection: >>>>>>> >>>>>>> http://www.informatik.tu-cottbus.de/~an/GD/linlog.html >>>>>>> >>>>>>> I have a few examples of a serial implementation here: >>>>>>> >>>>>>> http://www.smarttypes.org/ >>>>>>> >>>>>>> The model maximizes the distance between all nodes while minimizing >>>>>>> the distance between connected nodes. >>>>>>> >>>>>>> Without getting into too much detail, i'm curious if anyone has >>>>>>> considered using giraph for force directed graph embedding (yet >>>>>>> another name for it)? >>>>>>> >>>>>>> I'm also considering something like http://www.mcs.anl.gov/petsc/ or >>>>>>> http://www.cs.cmu.edu/~scandal/alg/nbody.html -- which have fast >>> >> >> -- >> Claudio Martella >> claudio.marte...@gmail.com -- Claudio Martella claudio.marte...@gmail.com