Re: Landmarks in GraphX section of Spark API

2014-11-18 Thread Ankur Dave
At 2014-11-17 14:47:50 +0530, Deep Pradhan pradhandeep1...@gmail.com wrote:
 I was going through the graphx section in the Spark API in
 https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.graphx.lib.ShortestPaths$

 Here, I find the word landmark. Can anyone explain to me what is landmark
 means. Is it a simple English word or does it mean something else in graphx.

The landmarks in the context of the shortest-paths algorithm are just the 
vertices of interest. For each vertex in the graph, the algorithm will return 
the distance to each of the landmark vertices.

Ankur

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Landmarks in GraphX section of Spark API

2014-11-18 Thread Deep Pradhan
So landmark can contain just one vertex right?
Which algorithm has been used to compute the shortest path?

Thank You

On Tue, Nov 18, 2014 at 2:53 PM, Ankur Dave ankurd...@gmail.com wrote:

 At 2014-11-17 14:47:50 +0530, Deep Pradhan pradhandeep1...@gmail.com
 wrote:
  I was going through the graphx section in the Spark API in
 
 https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.graphx.lib.ShortestPaths$
 
  Here, I find the word landmark. Can anyone explain to me what is
 landmark
  means. Is it a simple English word or does it mean something else in
 graphx.

 The landmarks in the context of the shortest-paths algorithm are just
 the vertices of interest. For each vertex in the graph, the algorithm will
 return the distance to each of the landmark vertices.

 Ankur



Re: Landmarks in GraphX section of Spark API

2014-11-18 Thread Ankur Dave
At 2014-11-18 14:59:20 +0530, Deep Pradhan pradhandeep1...@gmail.com wrote:
 So landmark can contain just one vertex right?

Right.

 Which algorithm has been used to compute the shortest path?

It's distributed Bellman-Ford.

Ankur

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Landmarks in GraphX section of Spark API

2014-11-18 Thread Deep Pradhan
Does Bellman-Ford give the best solution?

On Tue, Nov 18, 2014 at 3:27 PM, Ankur Dave ankurd...@gmail.com wrote:

 At 2014-11-18 14:59:20 +0530, Deep Pradhan pradhandeep1...@gmail.com
 wrote:
  So landmark can contain just one vertex right?

 Right.

  Which algorithm has been used to compute the shortest path?

 It's distributed Bellman-Ford.

 Ankur



Re: Landmarks in GraphX section of Spark API

2014-11-18 Thread Ankur Dave
At 2014-11-18 15:29:08 +0530, Deep Pradhan pradhandeep1...@gmail.com wrote:
 Does Bellman-Ford give the best solution?

It gives the same solution as any other algorithm, since there's only one 
correct solution for shortest paths and it's guaranteed to find it eventually. 
There are probably faster distributed algorithms for single-source shortest 
paths, but I'm not familiar with them. As far as I can tell, distributed 
Bellman-Ford is the most widely-used one.

Ankur

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Landmarks in GraphX section of Spark API

2014-11-18 Thread Ankur Dave
At 2014-11-18 15:44:31 +0530, Deep Pradhan pradhandeep1...@gmail.com wrote:
 I meant to ask whether it gives the solution faster than other algorithms.

No, it's just that it's much simpler and easier to implement than the others. 
Section 5.2 of the Pregel paper [1] justifies using it for a graph (a binary 
tree) with 1 billion vertices on 300 machines:

More advanced parallel algorithms exist, e.g., Thorup [44] or the ∆-stepping
method [37], and have been used as the basis for special-purpose parallel
shortest paths implementations [12, 32]. Such advanced algorithms can also
be expressed in the Pregel framework. The simplicity of the implementation
in Figure 5, however, together with the already acceptable performance (see
Section 6), may appeal to users who can't do extensive tuning or
customization.

 What do you mean by distributed algorithms? Can we not use any algorithm on
 a distributed environment?

Any algorithm can be split up and run in a distributed environment, but because 
inter-node coordination is expensive, that can be very inefficient. Distributed 
algorithms in this context are ones that reduce coordination.

Ankur

[1] http://db.cs.berkeley.edu/cs286/papers/pregel-sigmod2010.pdf

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org