GitHub user njayaram2 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/178#discussion_r139215837
--- Diff: src/ports/postgres/modules/graph/hits.sql_in ---
@@ -0,0 +1,329 @@
+/* ----------------------------------------------------------------------- *//**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ *
+ *
+ * @file graph.sql_in
+ *
+ * @brief SQL functions for graph analytics
+ * @date Nov 2016
+ *
+ * @sa Provides various graph algorithms.
+ *
+ *//* ----------------------------------------------------------------------- */
+m4_include(`SQLCommon.m4')
+
+/**
+@addtogroup grp_hits
+
+<div class="toc"><b>Contents</b>
+<ul>
+<li><a href="#hits">HITS</a></li>
+<li><a href="#notes">Notes</a></li>
+<li><a href="#examples">Examples</a></li>
+<li><a href="#literature">Literature</a></li>
+</ul>
+</div>
+
+@brief Find the HITS scores(Authority and Hub) of all vertices in a directed graph.
+
+Given a graph, the HITS (Hyperlink-Induced Topic Search) algorithm outputs the
+authority score and hub score of every vertex, where authority estimates the
+value of the content of the page and hub estimates the value of its links to
+other pages. This algorithm was developed by Jon Kleinberg to rate web pages.
+
+@anchor hits
+@par HITS
+<pre class="syntax">
+hits( vertex_table,
+ vertex_id,
+ edge_table,
+ edge_args,
+ out_table,
+ max_iter,
+ threshold,
+ grouping_cols
+ )
+</pre>
+
+\b Arguments
+<dl class="arglist">
+<dt>vertex_table</dt>
+<dd>TEXT. Name of the table containing the vertex data for the graph. Must
+ contain the column specified in the 'vertex_id' parameter below.</dd>
+
+<dt>vertex_id</dt>
+<dd>TEXT, default = 'id'. Name of the column in 'vertex_table' containing
+ vertex ids. The vertex ids are of type INTEGER with no duplicates. They
+ do not need to be contiguous.</dd>
+
+<dt>edge_table</dt>
+<dd>TEXT. Name of the table containing the edge data. The edge table must
+contain columns for source vertex and destination vertex.</dd>
+
+<dt>edge_args</dt>
+<dd>TEXT. A comma-delimited string containing multiple named arguments of
+the form "name=value". The following parameters are supported for
+this string argument:
+ - src (INTEGER): Name of the column containing the source vertex ids in
+ the edge table. Default column name is 'src'.
+ - dest (INTEGER): Name of the column containing the destination vertex
+ ids in the edge table. Default column name is 'dest'.</dd>
+
+<dt>out_table</dt>
+<dd>TEXT. Name of the table to store the result of HITS. It will contain
+ a row for every vertex from 'vertex_table' with the following columns:
+ - vertex_id : The id of a vertex. Will use the input parameter 'vertex_id'
+ for column naming.
+ - authority : The vertex's Authority score.
+ - hub : The vertex's Hub score.</dd>
+
+A summary table is also created that contains information
+regarding the number of iterations required for convergence.
+It is named by adding the suffix '_summary' to the 'out_table'
+parameter.
+
+<dt>max_iter (optional) </dt>
+<dd>INTEGER, default: 100. The maximum number of iterations allowed. An
+ iteration consists of both Authority and Hub phases.</dd>
+
+<dt>threshold (optional) </dt>
+<dd>FLOAT8, default: (1/number of vertices * 1000). If the difference
+ between the values of both scores (Authority and Hub) for every vertex of
+ two consecutive iterations is smaller than 'threshold', or the iteration
+ number is larger than 'max_iter', the computation stops. If you set the
+ threshold to zero, then you will force the algorithm to run for the full
+ number of iterations specified in 'max_iter'. Threshold need to be set to
+ a value equal or less than 1 since both values (Authority and Hub) of nodes
+ are initialized as 1. Note that both Authority and Hub value difference
+ must be below threshold for the algorithm to stop.</dd>
+
+<dt>grouping_cols (optional)[not support yet]</dt>
+<dd>TEXT, default: NULL. A single column or a list of comma-separated columns
+ that divides the input data into discrete groups, resulting in one
+ distribution per group. When this value is NULL, no grouping is used and a
+ single model is generated for all data.
+ @note Expressions are not currently supported for 'grouping_cols'.
+ The grouping support will be added later.</dd>
+
+</dl>
+
+@anchor notes
+@par Notes
+
+1. The HITS algorithm is based on Kleinburg's paper [1].
--- End diff --
`Kleinburg` -> `Kleinberg`
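
For readers of the doc text quoted above, here is a minimal Python sketch of the iteration and stopping rule as the `max_iter` and `threshold` descriptions read. It is not the code from this PR. The function name `hits_sketch`, the in-memory edge list, and the L2 normalization after each phase are assumptions made for illustration; the quoted docs only state that both scores start at 1, that one iteration covers an Authority phase and a Hub phase, and that the loop stops once both per-vertex differences fall below `threshold` or `max_iter` is reached.

```python
import math


def hits_sketch(vertices, edges, max_iter=100, threshold=1e-6):
    """Illustrative HITS loop (hypothetical helper, not the MADlib function).

    vertices: non-empty list of integer vertex ids
    edges:    list of (src, dest) pairs
    Returns (authority, hub, iterations_run).
    """
    # Both scores are initialized to 1, as the quoted docs describe.
    auth = {v: 1.0 for v in vertices}
    hub = {v: 1.0 for v in vertices}
    iterations = 0

    for iterations in range(1, max_iter + 1):
        # Authority phase: sum the hub scores of vertices that point to v.
        new_auth = {v: 0.0 for v in vertices}
        for src, dest in edges:
            new_auth[dest] += hub[src]
        # Assumption: L2 normalization keeps the scores bounded.
        norm = math.sqrt(sum(x * x for x in new_auth.values())) or 1.0
        new_auth = {v: x / norm for v, x in new_auth.items()}

        # Hub phase: sum the updated authority scores of vertices v points to.
        new_hub = {v: 0.0 for v in vertices}
        for src, dest in edges:
            new_hub[src] += new_auth[dest]
        norm = math.sqrt(sum(x * x for x in new_hub.values())) or 1.0
        new_hub = {v: x / norm for v, x in new_hub.items()}

        # Largest per-vertex change for each score since the last iteration.
        delta_auth = max(abs(new_auth[v] - auth[v]) for v in vertices)
        delta_hub = max(abs(new_hub[v] - hub[v]) for v in vertices)
        auth, hub = new_auth, new_hub

        # Both differences must be below the threshold to stop early.
        if delta_auth < threshold and delta_hub < threshold:
            break

    return auth, hub, iterations


if __name__ == "__main__":
    authority, hub_score, n_iter = hits_sketch(
        [0, 1, 2], [(0, 1), (0, 2), (1, 2)]
    )
    print(authority, hub_score, n_iter)
```

Because the comparison is a strict "smaller than", a `threshold` of 0 never triggers the early exit, which matches the documented behavior of running for the full `max_iter` iterations.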
---