Encoding categorical variables

2016-10-14 Thread Frank McQuillan
For the encoding categorical variables module (http://madlib.incubator.apache.org/docs/latest/group__grp__data__prep.html), does anyone have suggestions for improvements we could make? For those not familiar with it, here is a video on how encoding categorical variables works: https://www.youtu

Re: Encoding categorical variables

2016-10-14 Thread Jarrod Vawdrey
Hey Frank, how are values containing special characters handled today? It is often not ideal to end up with column names that need double quotes to reference, since that complicates downstream scripts. A couple of features that would be useful: * Option to define resulting column names. Please see the pdltools implementation - the
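
To make the column-naming concern concrete, here is a minimal Python sketch of one-hot encoding that produces column names usable without double quotes. It is not MADlib's or pdltools' implementation. The function names (safe_column_name, one_hot_encode), the lowercase-and-underscore naming rule, and the numeric suffix used to resolve collisions are all assumptions made for illustration.

```python
import re


def safe_column_name(column, value):
    """Build an identifier-safe name such as 'color_dark_blue' (hypothetical rule)."""
    raw = f"{column}_{value}".lower()
    # Replace every run of characters outside [a-z0-9_] with one underscore.
    name = re.sub(r"[^a-z0-9_]+", "_", raw).strip("_")
    # Unquoted SQL identifiers cannot be empty or start with a digit.
    if not name or name[0].isdigit():
        name = "_" + name
    return name


def one_hot_encode(rows, column):
    """Replace `column` in each row (a dict) with one 0/1 indicator per distinct value."""
    values = sorted({row[column] for row in rows})
    names, used = {}, set()
    for value in values:
        base = safe_column_name(column, value)
        name, suffix = base, 2
        # Two raw values can sanitize to the same name; add a suffix to keep them distinct.
        while name in used:
            name = f"{base}_{suffix}"
            suffix += 1
        used.add(name)
        names[value] = name
    encoded = []
    for row in rows:
        out = {key: val for key, val in row.items() if key != column}
        for value in values:
            out[names[value]] = 1 if row[column] == value else 0
        encoded.append(out)
    return encoded


rows = [
    {"id": 1, "color": "Dark Blue"},
    {"id": 2, "color": "dark-blue"},
    {"id": 3, "color": "red/green"},
]
encoded = one_hot_encode(rows, "color")
```

Here "Dark Blue" and "dark-blue" both sanitize to color_dark_blue, so the second one gets a numeric suffix. An option to supply the resulting column names explicitly, as suggested above, would avoid relying on a rule like this.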