GitHub user fmcquillan99 commented on the issue:
https://github.com/apache/incubator-madlib/pull/93
Thank you for the PR.
It seems you would like this function to always output an svec, i.e.,
not support regular array output? That seems overly restrictive:
most MADlib functions that would consume the output of
encode_categorical_variables (e.g., regression) accept array inputs but not
necessarily svec inputs.
If we want to support svec, I suggest changing the array_output param
from BOOLEAN to a multi-valued type (no array, array, svec) to
support the intent of this PR.
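
To make the suggestion concrete, here is a rough Python sketch of a three-valued output option. It is not MADlib code: the mode names in OUTPUT_FORMATS and the helpers to_svec_text and format_indicators are hypothetical, and the svec text is only an approximation of the run-length '{counts}:{values}' layout.

```python
from itertools import groupby

# Hypothetical mode names; the comment above only proposes "no array, array, svec".
OUTPUT_FORMATS = ("column", "array", "svec")


def to_svec_text(values):
    """Run-length encode a dense vector as svec-style '{counts}:{values}' text."""
    runs = [(len(list(group)), value) for value, group in groupby(values)]
    counts = ",".join(str(count) for count, _ in runs)
    vals = ",".join(str(value) for _, value in runs)
    return "{" + counts + "}:{" + vals + "}"


def format_indicators(indicators, output_format="column"):
    """Return one row of indicator values in the requested format.

    `indicators` maps an indicator name to 0 or 1, in output order.
    """
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output_format: {output_format!r}")
    if output_format == "column":
        return dict(indicators)  # one entry per indicator column
    dense = list(indicators.values())
    if output_format == "array":
        return dense  # a single dense array
    return to_svec_text(dense)  # a single sparse, run-length encoded value
```

With a single multi-valued parameter, existing callers that need plain arrays keep working, and svec becomes an opt-in choice rather than the only output.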
array_output (optional)
BOOLEAN. default: FALSE. This parameter controls the output format of the
indicator variables. If FALSE, a column is created for each indicator variable.
PostgreSQL limits the number of columns in a table. If the total number of
indicator columns exceeds the limit, then make this parameter TRUE to combine
the indicator columns into an array. The order of the array is the same as
specified in 'categorical_cols'. A dictionary will be created when
'array_output' is TRUE to define an index into the array. The dictionary table
will be given the name of the 'output_table' appended by '_dictionary'.
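
As an illustration of the behaviour described above, the following Python sketch contrasts per-column output with array output plus a dictionary. It is a simplified model, not the MADlib implementation: build_dictionary, encode_row, and the "col_value" column naming are assumptions made for this example.

```python
def build_dictionary(rows, categorical_cols):
    """List (column, value) pairs; a pair's position is its array index.

    Columns follow the order given in categorical_cols; the values within
    a column are sorted here purely to make the example deterministic.
    """
    dictionary = []
    for col in categorical_cols:
        for value in sorted({row[col] for row in rows}):
            dictionary.append((col, value))
    return dictionary


def encode_row(row, dictionary, array_output=False):
    """Encode one row as indicator columns (default) or as a single array."""
    indicators = [1 if row[col] == value else 0 for col, value in dictionary]
    if array_output:
        return indicators
    # Hypothetical naming scheme: one output column per (column, value) pair.
    return {
        f"{col}_{value}": flag
        for (col, value), flag in zip(dictionary, indicators)
    }


rows = [
    {"color": "red", "size": "S"},
    {"color": "blue", "size": "L"},
]
dictionary = build_dictionary(rows, ["color", "size"])
as_columns = encode_row(rows[0], dictionary)
as_array = encode_row(rows[0], dictionary, array_output=True)
```

In array mode the dictionary is what lets a consumer map an array position back to the category it represents, which is why it is only needed when 'array_output' is TRUE.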