Github user fmcquillan99 commented on the issue:
https://github.com/apache/madlib/pull/223
Started testing, some early observations:
(1)
class_size default should be 'uniform'; it currently seems to be set to
'undersample'.
(2)
`
SELECT madlib.balance_sample(
'flags', -- Source table
'output_table', -- Output table
'mainhue', -- Class column
'red=7, blue=7'); -- Want 7 reds and 7 blues
`
This results in an error; I'm not sure why it rejects this class size:
`
InternalError: (psycopg2.InternalError) plpy.Error: Sample: Invalid class size (red=7, blue=7)! (plpython.c:4648)
CONTEXT: Traceback (most recent call last):
PL/Python function "balance_sample", line 23, in <module>
return balance_sample.balance_sample(**globals())
PL/Python function "balance_sample", line 62, in balance_sample
PL/Python function "balance_sample", line 851, in _validate_strs
PL/Python function "balance_sample", line 77, in _assert
PL/Python function "balance_sample"
[SQL: "SELECT madlib.balance_sample(\n 'flags', -- Source table\n 'output_table', -- Output table\n 'mainhue', -- Class column\n 'red=7, blue=7'); -- Want 7 reds and 7 blues"]
`
(3)
`
SELECT madlib.balance_sample(
'flags', -- Source table
'output_table', -- Output table
'mainhue', -- Class column
'red=.25', -- Want 25% red flags
20); -- Desire output table size
`
This also results in an error; I'm not sure why it rejects this class size:
`
InternalError: (psycopg2.InternalError) plpy.Error: Sample: Invalid class size (red=.25)! (plpython.c:4648)
CONTEXT: Traceback (most recent call last):
PL/Python function "balance_sample", line 23, in <module>
return balance_sample.balance_sample(**globals())
PL/Python function "balance_sample", line 62, in balance_sample
PL/Python function "balance_sample", line 851, in _validate_strs
PL/Python function "balance_sample", line 77, in _assert
PL/Python function "balance_sample"
[SQL: "SELECT madlib.balance_sample(\n 'flags', -- Source table\n 'output_table', -- Output table\n 'mainhue', -- Class column\n 'red=.25', -- Want 25%% red flags\n 20); -- Desire output table size"]
`
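For reference, here is a minimal sketch of how the class-size strings from (2) and (3) could be parsed. It rests on an assumption about the intended format (comma-separated `class=value` pairs, where integers are row counts and decimals are fractions). `parse_class_sizes` is a hypothetical helper written for illustration only; it is not MADlib's `_validate_strs` and may not match what the module actually does.

```python
import re

# Hypothetical pattern: one "name=value" pair, where value is an integer
# count (7) or a decimal fraction (.25 or 0.25). Assumed format, not MADlib's.
_PAIR = re.compile(r"^\s*([^=,\s]+)\s*=\s*(\d+\.?\d*|\.\d+)\s*$")


def parse_class_sizes(spec):
    """Parse a string such as 'red=7, blue=7' into {class_name: number}."""
    sizes = {}
    for item in spec.split(","):
        match = _PAIR.match(item)
        if match is None:
            # Mirror the wording of the error seen in the traceback.
            raise ValueError("Invalid class size (%s)!" % spec)
        name, raw = match.groups()
        # Values containing a decimal point are treated as fractions.
        sizes[name] = float(raw) if "." in raw else int(raw)
    return sizes
```

Under these assumptions, both `'red=7, blue=7'` and `'red=.25'` parse without error, while a string with no `=` is rejected.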
---