Repository: incubator-madlib
Updated Branches:
  refs/heads/master 4fcb60ed8 -> 029f73b15


Misc doc changes, mostly graph related

Closes #150


Project: http://git-wip-us.apache.org/repos/asf/incubator-madlib/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-madlib/commit/029f73b1
Tree: http://git-wip-us.apache.org/repos/asf/incubator-madlib/tree/029f73b1
Diff: http://git-wip-us.apache.org/repos/asf/incubator-madlib/diff/029f73b1

Branch: refs/heads/master
Commit: 029f73b1567f50c012df71dea754a49125d6b049
Parents: 4fcb60e
Author: Frank McQuillan <fmcquil...@pivotal.io>
Authored: Fri Jul 14 13:38:02 2017 -0700
Committer: Orhan Kislal <okis...@pivotal.io>
Committed: Thu Jul 20 15:41:39 2017 -0700

----------------------------------------------------------------------
 doc/mainpage.dox.in                             |  2 +-
 src/ports/postgres/modules/graph/apsp.sql_in    | 15 +++++++-----
 src/ports/postgres/modules/graph/bfs.sql_in     | 25 ++++++++++++--------
 src/ports/postgres/modules/graph/sssp.sql_in    |  2 +-
 .../recursive_partitioning/decision_tree.sql_in |  2 +-
 5 files changed, 27 insertions(+), 19 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-madlib/blob/029f73b1/doc/mainpage.dox.in
----------------------------------------------------------------------
diff --git a/doc/mainpage.dox.in b/doc/mainpage.dox.in
index f64de15..de70d5d 100644
--- a/doc/mainpage.dox.in
+++ b/doc/mainpage.dox.in
@@ -129,7 +129,7 @@ complete matrix stored as a distributed table.
     @defgroup grp_apsp All Pairs Shortest Path
     @ingroup grp_graph
 
-    @defgroup grp_bfs Breadth-first Search
+    @defgroup grp_bfs Breadth-First Search
     @ingroup grp_graph
     
     @defgroup grp_pagerank PageRank

http://git-wip-us.apache.org/repos/asf/incubator-madlib/blob/029f73b1/src/ports/postgres/modules/graph/apsp.sql_in
----------------------------------------------------------------------
diff --git a/src/ports/postgres/modules/graph/apsp.sql_in b/src/ports/postgres/modules/graph/apsp.sql_in
index c8df70a..637afdf 100644
--- a/src/ports/postgres/modules/graph/apsp.sql_in
+++ b/src/ports/postgres/modules/graph/apsp.sql_in
@@ -46,12 +46,15 @@ The all pairs shortest paths (APSP) algorithm finds the length (summed weights)
 of the shortest paths between all pairs of vertices, such that the sum of the
 weights of the path edges is minimized.
 
-@note APSP is an expensive algorithm for run-time
+@warning APSP is an expensive algorithm for run-time
 because it finds the shortest path between all nodes
-in the graph.  The worst case run-time for this implementation
-is O(V^2 * E) where V is the number of vertices and E is the
-number of edges.  In practice, run-time will be generally be
-much less than this, depending on the graph.
+in the graph.  It is recommended that you start with a 
+small graph to get a sense of run-time for your use case, 
+then increase size carefully from there.  The worst case run-time 
+for this implementation is O(V^2 * E) where V is the 
+number of vertices and E is the number of edges.  In 
+practice, run-time will be generally be
+much less than this, but it depends on the graph.
 
 @anchor apsp
 @par APSP
@@ -112,7 +115,7 @@ table that keeps a record of the input parameters and is used by the path
 retrieval function described below.
 </dd>
 
-<dt>grouping_cols</dt>
+<dt>grouping_cols (optional)</dt>
 <dd>TEXT, default = NULL. List of columns used to group the input into discrete
 subgraphs. These columns must exist in the edge table. When this value is null,
 no grouping is used and a single APSP result is generated. </dd>

http://git-wip-us.apache.org/repos/asf/incubator-madlib/blob/029f73b1/src/ports/postgres/modules/graph/bfs.sql_in
----------------------------------------------------------------------
diff --git a/src/ports/postgres/modules/graph/bfs.sql_in b/src/ports/postgres/modules/graph/bfs.sql_in
index fd7a396..f4e8edc 100644
--- a/src/ports/postgres/modules/graph/bfs.sql_in
+++ b/src/ports/postgres/modules/graph/bfs.sql_in
@@ -23,7 +23,7 @@
  * @brief SQL functions for graph analytics
  * @date Jun 2017
  *
- * @sa Provides Breadth First Search graph algorithm.
+ * @sa Provides a breadth first search graph algorithm.
  *
  *//* ----------------------------------------------------------------------- */
 m4_include(`SQLCommon.m4')
@@ -32,7 +32,7 @@ m4_include(`SQLCommon.m4')
 
 <div class="toc"><b>Contents</b>
 <ul>
-<li><a href="#bfs">Breadth-first Search</a></li>
+<li><a href="#bfs">Breadth-First Search</a></li>
 <li><a href="#notes">Notes</a></li>
 <li><a href="#examples">Examples</a></li>
 <li><a href="#literature">Literature</a></li>
@@ -41,7 +41,7 @@ m4_include(`SQLCommon.m4')
 
 @brief Finds the nodes reachable from a given source vertex using a breadth-first approach.
 
-Given a graph and a source vertex, the Breadth-first Search (BFS) algorithm
+Given a graph and a source vertex, the breadth-first search (BFS) algorithm
 finds all nodes reachable from the source vertex by searching / traversing the graph
 in a breadth-first manner.
 
@@ -84,7 +84,7 @@ the form "name=value". The following parameters are supported for
 this string argument:
   - src (INTEGER): Name of the column containing the source vertex ids in the edge table.
   Default column name is 'src'.
-  (Not to be confused with the source_vertex argument passed to the BFS function)
+  (This is not to be confused with the 'source_vertex' argument passed to the BFS function.)
   - dest (INTEGER): Name of the column containing the destination vertex ids in
   the edge table. Default column name is 'dest'.
   
@@ -110,13 +110,18 @@ The output table will have the following columns (in addition to the grouping co
 A summary table named <out_table>_summary is also created. This is an internal table that keeps a record of the input parameters.
 </dd>
 
-<dt>max_distance</dt>
-<dd>INT, default = NULL. Maximum distance (number of edges) from source_vertex to search through in the graph.</dd>
+<dt>max_distance (optional)</dt>
+<dd>INT, default = NULL. Maximum distance to traverse 
+from the source vertex.  When this value is null, 
+traverses until reaches leaf node.  E.g., if set 
+to 1 will return only adjacent vertices, if set 
+to 7 will return vertices up to a maximum distance 
+of 7 vertices away.
 
-<dt>directed</dt>
+<dt>directed (optional)</dt>
 <dd>BOOLEAN, default = FALSE. If TRUE the graph will be treated as directed, else it will be treated as an undirected graph.</dd>
 
-<dt>grouping_cols</dt>
+<dt>grouping_cols (optional)</dt>
 <dd>TEXT, default = NULL. A comma-separated list of columns used to group the 
 input into discrete subgraphs. 
 These columns must exist in the edge table. When this value is NULL, no grouping is used
@@ -128,8 +133,8 @@ and a single BFS result is generated.
 @anchor notes
 @par Notes
 
-The graph_bfs function is a SQL implementation of the well-known Breadth-first 
-Search algorithm [1] modified appropriately for a relational database. It will 
+The graph_bfs function is a SQL implementation of the well-known breadth-first 
+search algorithm [1] modified appropriately for a relational database. It will 
 find any node in the graph reachable from the source_vertex only once. If a node
 is reachable by many different paths from the source_vertex (i.e. has more than
 one parent), then only one of those parents is present in the output table. 
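
The 'max_distance' and single-parent behaviour described in the bfs.sql_in hunks above can be illustrated outside the database. The following Python is a minimal sketch under stated assumptions, not MADlib's SQL implementation: the function name bfs_with_max_distance and the edge-list input are hypothetical, and distance is assumed to count edges from the source vertex.

```python
from collections import deque


def bfs_with_max_distance(edges, source_vertex, max_distance=None, directed=False):
    """Return {vertex: (dist, parent)} for vertices reachable from source_vertex.

    Hypothetical helper for illustration only; it is not part of MADlib.
    dist is the number of edges from source_vertex.
    """
    # Build adjacency lists; an undirected graph gets both directions.
    adjacency = {}
    for src, dest in edges:
        adjacency.setdefault(src, []).append(dest)
        if not directed:
            adjacency.setdefault(dest, []).append(src)

    result = {source_vertex: (0, None)}
    queue = deque([source_vertex])
    while queue:
        vertex = queue.popleft()
        dist = result[vertex][0]
        # Do not expand vertices that already sit at the distance limit.
        if max_distance is not None and dist >= max_distance:
            continue
        for neighbor in adjacency.get(vertex, []):
            # Each vertex is recorded once, so only its first parent is kept.
            if neighbor not in result:
                result[neighbor] = (dist + 1, vertex)
                queue.append(neighbor)
    return result


if __name__ == "__main__":
    sample_edges = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]
    print(bfs_with_max_distance(sample_edges, 1, max_distance=1, directed=True))
```

In this sketch, max_distance=1 yields only the source and its adjacent vertices, while max_distance=None continues until no unvisited vertices remain. Vertex 4 in the sample is reachable through both 2 and 3, yet only one parent is recorded, mirroring the note in the Notes section.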

http://git-wip-us.apache.org/repos/asf/incubator-madlib/blob/029f73b1/src/ports/postgres/modules/graph/sssp.sql_in
----------------------------------------------------------------------
diff --git a/src/ports/postgres/modules/graph/sssp.sql_in b/src/ports/postgres/modules/graph/sssp.sql_in
index fb0cdba..372f1fb 100644
--- a/src/ports/postgres/modules/graph/sssp.sql_in
+++ b/src/ports/postgres/modules/graph/sssp.sql_in
@@ -100,7 +100,7 @@ the following columns (in addition to the grouping columns):
 A summary table named <out_table>_summary is also created. This is an internal table that keeps a record of the input parameters and is used by the path function described below.
 </dd>
 
-<dt>grouping_cols</dt>
+<dt>grouping_cols (optional)</dt>
 <dd>TEXT, default = NULL. List of columns used to group the input into discrete subgraphs. These columns must exist in the edge table. When this value is null, no grouping is used and a single SSSP result is generated. </dd>
 </dl>
 

http://git-wip-us.apache.org/repos/asf/incubator-madlib/blob/029f73b1/src/ports/postgres/modules/recursive_partitioning/decision_tree.sql_in
----------------------------------------------------------------------
diff --git a/src/ports/postgres/modules/recursive_partitioning/decision_tree.sql_in b/src/ports/postgres/modules/recursive_partitioning/decision_tree.sql_in
index 92a123d..e8f37f8 100644
--- a/src/ports/postgres/modules/recursive_partitioning/decision_tree.sql_in
+++ b/src/ports/postgres/modules/recursive_partitioning/decision_tree.sql_in
@@ -186,7 +186,7 @@ tree_train(
     </table>
   </DD>
 
-  <DT>surrogate_params</DT>
+  <DT>surrogate_params (optional)</DT>
   <DD>TEXT. Comma-separated string of key-value pairs controlling the behavior
   of surrogate splits for each node. A surrogate variable is another predictor
   variable that is associated (correlated) with the primary predictor variable
