GitHub user fmcquillan99 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/239#discussion_r172920825
--- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in ---
@@ -543,6 +544,90 @@ SELECT * FROM output_table ORDER BY mainhue, name;
(25 rows)
</pre>
+-# To perform the balance sampling for independent groups, use the 'grouping_cols'
+parameter. Note below that each group (zone) has a different count of the classes (mainhue),
+with some groups not containing some class values.
+<pre class="syntax">
+DROP TABLE IF EXISTS output_table;
+SELECT madlib.balance_sample(
+ 'flags', -- Source table
+ 'output_table', -- Output table
+ 'mainhue', -- Class column
+ NULL, -- Uniform
+ NULL, -- Output table size
+ 'zone' -- No grouping
+);
--- End diff --
I think the comment on the `'zone'` argument should say "Group by zone" or something like that, not "No grouping".
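For example, the call could read as below. This is only a sketch of the suggestion: it assumes the `flags` table from the earlier examples in this file, keeps the arguments exactly as in the diff, and changes nothing but the last inline comment. The exact wording of that comment is up to you.

```sql
DROP TABLE IF EXISTS output_table;
SELECT madlib.balance_sample(
    'flags',         -- Source table
    'output_table',  -- Output table
    'mainhue',       -- Class column
    NULL,            -- Uniform
    NULL,            -- Output table size
    'zone'           -- Group by zone (was "No grouping")
);
```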
---