madlib git commit: add note to user docs on vec2cols about unequal arrays

fmcquillan Fri, 17 Aug 2018 13:38:58 -0700

Repository: madlib
Updated Branches:
  refs/heads/master a3b59356f -> 5e707f745



add note to user docs on vec2cols about unequal arrays


Project: http://git-wip-us.apache.org/repos/asf/madlib/repo
Commit: http://git-wip-us.apache.org/repos/asf/madlib/commit/5e707f74
Tree: http://git-wip-us.apache.org/repos/asf/madlib/tree/5e707f74
Diff: http://git-wip-us.apache.org/repos/asf/madlib/diff/5e707f74

Branch: refs/heads/master
Commit: 5e707f745c50343dd7395a3e8f86c04428210977
Parents: a3b5935
Author: Frank McQuillan <fmcquil...@pivotal.io>
Authored: Fri Aug 17 13:38:20 2018 -0700
Committer: Frank McQuillan <fmcquil...@pivotal.io>
Committed: Fri Aug 17 13:38:20 2018 -0700

----------------------------------------------------------------------
 .../postgres/modules/stats/correlation.sql_in    | 10 +++++-----
 .../postgres/modules/utilities/cols2vec.sql_in   |  4 ++--
 .../postgres/modules/utilities/vec2cols.sql_in   | 19 ++++++++++++-------
 3 files changed, 19 insertions(+), 14 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/madlib/blob/5e707f74/src/ports/postgres/modules/stats/correlation.sql_in
----------------------------------------------------------------------
diff --git a/src/ports/postgres/modules/stats/correlation.sql_in b/src/ports/postgres/modules/stats/correlation.sql_in
index 64ed27e..3bf3e46 100644
--- a/src/ports/postgres/modules/stats/correlation.sql_in
+++ b/src/ports/postgres/modules/stats/correlation.sql_in
@@ -222,7 +222,7 @@ SELECT * FROM example_data_output ORDER BY column_position;
 <pre class="result">
  column_position |  variable   |     temperature     | humidity 
 -----------------+-------------+---------------------+----------
-               1 | temperature |                   1 |         
+               1 | temperature |                   1 | 
                2 | humidity    | 0.00607993890408995 |        1
 (2 rows)
 </pre>
@@ -259,11 +259,11 @@ SELECT * FROM example_data_output ORDER BY day, column_position;
 <pre class="result">
  column_position |  variable   | day  |    temperature    | humidity 
 -----------------+-------------+------+-------------------+----------
-               1 | temperature | Mon  |                 1 |         
+               1 | temperature | Mon  |                 1 | 
                2 | humidity    | Mon  | 0.616876934548786 |        1
-               1 | temperature | Tues |                 1 |         
+               1 | temperature | Tues |                 1 | 
                2 | humidity    | Tues | 0.616876934548786 |        1
-               1 | temperature | Wed  |                 1 |         
+               1 | temperature | Wed  |                 1 | 
                2 | humidity    | Wed  | -0.28969669368457 |        1
 (6 rows)
 </pre>
@@ -315,7 +315,7 @@ SELECT * FROM example_data_output ORDER BY column_position;
 <pre class="result">
  column_position |  variable   |   temperature    |     humidity     
 -----------------+-------------+------------------+------------------
-               1 | temperature | 507.926664293343 |                 
+               1 | temperature | 507.926664293343 | 
                2 | humidity    | 2.40227839088644 | 307.359914560342
 (2 rows)
 </pre>

http://git-wip-us.apache.org/repos/asf/madlib/blob/5e707f74/src/ports/postgres/modules/utilities/cols2vec.sql_in
----------------------------------------------------------------------
diff --git a/src/ports/postgres/modules/utilities/cols2vec.sql_in b/src/ports/postgres/modules/utilities/cols2vec.sql_in
index 82a1f94..0c54ab5 100644
--- a/src/ports/postgres/modules/utilities/cols2vec.sql_in
+++ b/src/ports/postgres/modules/utilities/cols2vec.sql_in
@@ -82,8 +82,8 @@ values.</dd>
 
 <dt>list_of_features_to_exclude (optional)</dt>
 <dd>TEXT. Default NULL.
-Comma-separated string of column names to exclude from the feature array.  
-Typically used when 'list_of_features' is set to '*'.</dd>
+Comma-separated string of column names to exclude from the feature array.  Typically used 
+when 'list_of_features' is set to '*'.</dd>
 
 <dt>cols_to_output (optional)</dt>
 <dd>TEXT. Default NULL.

http://git-wip-us.apache.org/repos/asf/madlib/blob/5e707f74/src/ports/postgres/modules/utilities/vec2cols.sql_in
----------------------------------------------------------------------
diff --git a/src/ports/postgres/modules/utilities/vec2cols.sql_in b/src/ports/postgres/modules/utilities/vec2cols.sql_in
index 989074c..115e015 100644
--- a/src/ports/postgres/modules/utilities/vec2cols.sql_in
+++ b/src/ports/postgres/modules/utilities/vec2cols.sql_in
@@ -72,23 +72,28 @@ vec2cols(
 same name already exists, an error will be returned.</tt>
 
 <dt>vector_col</dt>
-<dd>TEXT. Name of the column containing the feature array.  
-Must be a one-dimensional array.</tt>
+<dd>TEXT. Name of the column containing the feature array.  Must be a one-dimensional array.</tt>
 
 <dt>feature_names (optional)</dt>
-<dd>TEXT[]. Array of names associated with the feature array.  
-Note that this array exists in the
-summary table created by the function 'cols2vec'.  
-If the 'feature_names' array is not specified,
+<dd>TEXT[]. Array of names associated with the feature array.  Note that 
+this array exists in the summary table created by the function 'cols2vec'. If 
+the 'feature_names' array is not specified,
 column names will be automatically generated of 
 the form 'f1, f2, ...fn'.</tt>
+@note If you specify the 'feature_names' parameter, you will get exactly that number of 
+feature columns in the 'output_table'.  It means feature arrays from the 'vector_col' may be 
+padded or truncated, if a particular feature array size does not match the target
+number of feature columns.  <br><br>If you do not specify the 'feature names' parameter, 
+the number of feature columns generated
+in the 'output_table' will be the maximum array size from 'vector_col'.
+Feature arrays that are less than this maximum will be padded.
 
 <dt>cols_to_output (optional)</dt>
 <dd>TEXT, default NULL. Comma-separated string of column names 
 from the source table to keep in the
 output table, in addition to the feature columns.  
 To keep all columns from the source table, use '*'.
-Note: total number of columns in a table cannot exceed the 
+The total number of columns in a table cannot exceed the 
 PostgreSQL limits.</tt>
 </dd>
 </dl>
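
The note added to vec2cols.sql_in above describes how unequal feature arrays map to output columns. As a rough illustration only, the rule can be sketched in plain Python. This is not MADlib code: the function name model_vec2cols is hypothetical, and the use of None as the padding value is an assumption (the note says "padded" without naming the fill value).

```python
from typing import List, Optional, Sequence, Tuple


def model_vec2cols(
    vectors: Sequence[Sequence[float]],
    feature_names: Optional[Sequence[str]] = None,
) -> Tuple[List[str], List[List[Optional[float]]]]:
    """Hypothetical model of the column-count rule described in the note.

    Assumption: short arrays are padded with None (standing in for NULL).
    """
    if feature_names is not None:
        # With feature_names given, the output has exactly that many columns.
        names = list(feature_names)
    else:
        # Without feature_names, the widest array sets the column count,
        # and names are generated as f1, f2, ..., fn.
        width = max((len(v) for v in vectors), default=0)
        names = [f"f{i}" for i in range(1, width + 1)]

    n = len(names)
    # Truncate arrays longer than n; pad arrays shorter than n.
    rows = [list(v[:n]) + [None] * (n - len(v)) for v in vectors]
    return names, rows


if __name__ == "__main__":
    data = [[1.0, 2.0, 3.0], [4.0]]
    print(model_vec2cols(data))              # widest array sets 3 columns
    print(model_vec2cols(data, ["a", "b"]))  # exactly 2 columns
```

With two names supplied, the three-element array is truncated and the one-element array is padded; with no names, only padding can occur because the widest array sets the column count.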
