[ 
https://issues.apache.org/jira/browse/MADLIB-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706798#comment-15706798
 ] 

Frank McQuillan edited comment on MADLIB-1038 at 11/29/16 10:49 PM:
--------------------------------------------------------------------

I updated the requirements doc; it seems more or less complete for this go-around.

Please let me know if you see something that needs addressing.

Here is the old interface:

{code}
create_indicator_variables (
        source_table,
        output_table,
        categorical_cols,
        keep_null,              -- Optional
        distributed_by          -- Optional
)
{code}

Here is the proposed new interface:

{code}
encode_categorical_variables (
        source_table,
        output_table,
        categorical_cols,
        categorical_cols_to_exclude,    -- Optional
        row_id,                         -- Optional
        top,                            -- Optional
        value_to_drop,                  -- Optional
        keep_null,                      -- Optional
        array_output,                   -- Optional
        output_col_dictionary,          -- Optional
        distributed_by                  -- Optional
)
{code}
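
To make the intent of a few of the proposed arguments concrete, here is a minimal Python sketch of indicator (one-hot) encoding. It is not MADlib code and not the planned implementation. The helper name `encode_categorical`, the `<column>_<value>` output naming, and the behavior assigned to `keep_null` and `value_to_drop` are assumptions made for illustration; the attached requirements doc is the authority on what each argument actually does.

```python
def encode_categorical(rows, col, keep_null=False, value_to_drop=None):
    """Indicator-encode `col` in a list of dict rows (illustrative only)."""
    # Distinct non-null values, sorted so the output column order is stable.
    levels = sorted({r[col] for r in rows if r[col] is not None})
    # Assumed meaning of value_to_drop: omit one level as the reference level.
    if value_to_drop is not None:
        levels = [v for v in levels if v != value_to_drop]
    encoded = []
    for r in rows:
        # Carry over every column except the one being encoded.
        out = {k: v for k, v in r.items() if k != col}
        for v in levels:
            out[f"{col}_{v}"] = int(r[col] == v)
        # Assumed meaning of keep_null: give NULL its own indicator column.
        if keep_null:
            out[f"{col}_NULL"] = int(r[col] is None)
        encoded.append(out)
    return encoded


rows = [
    {"id": 1, "sex": "M"},
    {"id": 2, "sex": "F"},
    {"id": 3, "sex": None},
]
result = encode_categorical(rows, "sex", keep_null=True)
```

Under these assumptions, passing `value_to_drop="M"` would leave only a `sex_F` column, which is the usual way to avoid a redundant indicator in regression inputs.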




> Improvements to encoding categorical variables
> ----------------------------------------------
>
>                 Key: MADLIB-1038
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1038
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: Module: Utilities
>            Reporter: Frank McQuillan
>            Assignee: Rahul Iyer
>             Fix For: v1.10
>
>         Attachments: Encoding categorical variables requirements - 29 nov 2016.pdf
>
>
> For the module
> http://madlib.incubator.apache.org/docs/latest/group__grp__data__prep.html
> there are several improvements that can be made.
> Please see attached requirements document.


