Repository: incubator-madlib
Updated Branches:
  refs/heads/master 67b69eb8a -> 6025c4b0d


Graph Docs: Update BFS design doc

Include write-up for BFS impplementation and design.

Closes #161


Project: http://git-wip-us.apache.org/repos/asf/incubator-madlib/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-madlib/commit/6025c4b0
Tree: http://git-wip-us.apache.org/repos/asf/incubator-madlib/tree/6025c4b0
Diff: http://git-wip-us.apache.org/repos/asf/incubator-madlib/diff/6025c4b0

Branch: refs/heads/master
Commit: 6025c4b0d0d1a83a522a6c08553eb1d0632329be
Parents: 67b69eb
Author: Rashmi Raghu <rra...@pivotal.io>
Authored: Thu Aug 3 16:15:32 2017 -0700
Committer: Nandish Jayaram <njaya...@apache.org>
Committed: Fri Aug 11 11:25:05 2017 -0700

----------------------------------------------------------------------
 doc/design/modules/graph.tex | 51 ++++++++++++++++++++++++++++++++++++++-
 doc/literature.bib           |  5 ++++
 2 files changed, 55 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-madlib/blob/6025c4b0/doc/design/modules/graph.tex
----------------------------------------------------------------------
diff --git a/doc/design/modules/graph.tex b/doc/design/modules/graph.tex
index f9183b3..c631e16 100644
--- a/doc/design/modules/graph.tex
+++ b/doc/design/modules/graph.tex
@@ -22,7 +22,7 @@
 \chapter[Graph]{Graph}
 
 \begin{moduleinfo}
-\item[Authors] \href{mailto:okis...@pivotal.io}{Orhan Kislal}, 
\href{mailto:njaya...@pivotal.io}{Nandish Jayaram}
+\item[Authors] \href{mailto:okis...@pivotal.io}{Orhan Kislal}, 
\href{mailto:njaya...@pivotal.io}{Nandish Jayaram}, 
\href{mailto:rra...@pivotal.io}{Rashmi Raghu}
 \item[History]
        \begin{modulehistory}
                \item[v0.1] Initial version, SSSP only.
@@ -30,6 +30,7 @@
         \item[v0.3] PageRank
         \item[v0.4] APSP
         \item[v0.5] Weakly Connected Components
+        \item[v0.6] Breadth First Search (BFS)
        \end{modulehistory}
 \end{moduleinfo}
 
@@ -616,3 +617,51 @@ associated with each vertex in $G$. The component ID of 
all the vertices
 in a component is equal to the smallest vertex ID in the corresponding
 subgraph.
 
+\section{Breadth-first Search} \label{sec:graph:bfs}
+
+Given a graph $G$ and a user-specified origin vertex $src$, this algorithm
+searches and discovers connected nodes in a graph in breadth-first order
+\cite{bfs_wikipedia}. The graph can be treated as either directed or
+undirected based on a parameter specified when calling the function.
+There is also a parameter to control the number of hops (edges) to traverse
+from the source vertex. If not specified, all nodes accessible from the
+source node will be discovered.
+
+\subsection{Implementation Details}
+\begin{algorithm}[Breadth First Search$(V, E, src)$] \label{alg:bfs:high}
+\begin{algorithmic}[1]
+    \State Set $dist \leftarrow 0$
+    \State Create $message$ table with $src$ vertex, $NULL$ parent, and $dist$
+    \State Copy $message$ table to output table $out$
+    \Repeat
+        \State Create $toupdate$ table using $out$ and $message$ tables
+        \State $dist \leftarrow dist + 1$
+        \State Update $message$ table with newly found candidate vertices, 
parent and $dist$
+        \State Copy $message$ table to $out$
+    \Until {There are no candidate vertices remaining in $message$ table}
+\end{algorithmic}
+\end{algorithm}
+
+The implementation details are similar to SSSP, albeit simpler. We only have to
+track the number of hops and not the sum of weights, but other requirements and
+restrictions such as finding the parent, distinct updates, etc. are common in
+both cases. The output table is initialized only to the $src$ vertex to begin
+with. A $message$ table is also maintained that contains the list of vertices
+to traverse and explore in the next iteration, which is also initialized with
+the $src$ vertex. BFS then runs iteratively until no more candidate vertices
+remain in the $message$ table, as outlined in~\ref{alg:bfs:high}.
+
+At every iteration, $toupdate$ table is updated with vertices that are 
neighbors
+of vertices in the $message$ table, that are not already visited in the past
+(one scan of the $out$ table reveals all the vertices that have already been
+visited in the past). All such newly found neighboring vertices in the current
+iteration will have one or more parents, based on how many vertices in the
+$message$ table have a direct edge to them. Each such vertex in the $message$
+table is marked as the parent of such newly found neighboring vertices in
+the $toupdate$ table.
+
+The $message$ table is then cleared and updated with the contents of $toupdate$
+table, except that for each new neighboring vertex considered, only one of the
+parents is recorded as its parent (the node with the smallest id among all
+parent nodes). The content of this updated $message$ is then copied
+to the $out$ table, and this process continues until $message$ table is empty.

http://git-wip-us.apache.org/repos/asf/incubator-madlib/blob/6025c4b0/doc/literature.bib
----------------------------------------------------------------------
diff --git a/doc/literature.bib b/doc/literature.bib
index aadd65b..225622d 100644
--- a/doc/literature.bib
+++ b/doc/literature.bib
@@ -949,3 +949,8 @@ Applied Survival Analysis},
     Title = {{MLP(III): Back-Propagation}},
     Author = {{Yu Hen Hu}}
 }
+
+@online{bfs_wikipedia,
+   title = {Breadth-first search},
+   url={https://en.wikipedia.org/wiki/Breadth-first_search}
+}
\ No newline at end of file

Reply via email to