[GitHub] madlib pull request #254: Enable grouping for minibatch preprocessing

2018-04-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/madlib/pull/254


---


[GitHub] madlib pull request #254: Enable grouping for minibatch preprocessing

2018-04-03 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/madlib/pull/254#discussion_r178684532
  
--- Diff: src/ports/postgres/modules/utilities/mean_std_dev_calculator.py_in ---
@@ -40,15 +41,27 @@ class MeanStdDevCalculator:
 self.dimension = dimension
 
 def get_mean_and_std_dev_for_ind_var(self):
-set_zero_std_to_one = True
-
 x_scaled_vals = utils_ind_var_scales(self.source_table,
  self.indep_var_array_str,
  self.dimension,
  self.schema_madlib,
- None, # do not dump the output to a temp table
- set_zero_std_to_one)
+ x_mean_table = None, # do not dump the output to a temp table
+ set_zero_std_to_one=True)
 x_mean_str = _array_to_string(x_scaled_vals["mean"])
 x_std_str = _array_to_string(x_scaled_vals["std"])
 
+if not x_mean_str or not x_std_str:
+plpy.error("mean/stddev for the independent variable"
+   "cannot be null")
+
 return x_mean_str, x_std_str
+
+def create_mean_std_table_for_ind_var_grouping(self, x_mean_table, grouping_cols):
+utils_ind_var_scales_grouping(self.source_table,
+ self.indep_var_array_str,
+ self.dimension,
+ self.schema_madlib,
+ grouping_cols,
+ x_mean_table,
+ set_zero_std_to_one = True,
+ create_temp_table = False)
--- End diff --

Could you please correct the indentation here?


---


[GitHub] madlib pull request #254: Enable grouping for minibatch preprocessing

2018-04-03 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/madlib/pull/254#discussion_r178684358
  
--- Diff: src/ports/postgres/modules/convex/utils_regularization.py_in ---
@@ -85,6 +86,8 @@ def utils_ind_var_scales_grouping(tbl_data, col_ind_var, dimension,
 x_mean_table,
 set_zero_std_to_one (optional, default is False. If set to true
  0.0 standard deviation values will be set to 1.0)
+create_temp_table If set to true, create a persistent instead of a temp
+  table, else create a temp table for x_mean
--- End diff --

Shouldn't this comment say that a temp table is created when this is set to
true, and a persistent table when it is set to false?
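
For illustration, here is a minimal sketch of the flag semantics suggested above, assuming the flag only decides whether the table is created as TEMP. The helpers `create_table_prefix` and `build_create_statement` are hypothetical names invented for this example and are not part of MADlib or of this pull request.

```python
def create_table_prefix(create_temp_table=True):
    """Return the SQL prefix used to create the x_mean table.

    Hypothetical helper (not MADlib code). It only illustrates the
    semantics proposed in the review comment:
      create_temp_table=True  -> temporary table
      create_temp_table=False -> persistent table
    """
    return "CREATE TEMP TABLE" if create_temp_table else "CREATE TABLE"


def build_create_statement(x_mean_table, select_sql, create_temp_table=True):
    """Assemble a CREATE TABLE ... AS statement as a plain string.

    Hypothetical helper; a real implementation would also need to
    quote or validate the table name before executing the SQL.
    """
    prefix = create_table_prefix(create_temp_table)
    return "{0} {1} AS {2}".format(prefix, x_mean_table, select_sql)
```

With these semantics, the call in the first diff that passes create_temp_table = False would produce a persistent x_mean table, which appears to be the intent for grouping.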


---