[ https://issues.apache.org/jira/browse/MADLIB-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363275#comment-16363275 ]
ASF GitHub Bot commented on MADLIB-1202:
----------------------------------------

GitHub user jingyimei opened a pull request:

    https://github.com/apache/madlib/pull/234

    Create lower case column name in encode_categorical_variables()

    JIRA:MADLIB-1202

    The previous madlib.encode_categorical_variables() function generates
    column names containing upper-case characters in three cases:
    1. When top_values is specified, a column is created with the suffix
       __MISC__.
    2. When encode_nulls is set to True, a column is created with the
       suffix __NULL.
    3. When the original column is of boolean type, columns are created
       with the suffixes _True and _False.

    In these cases, users have to double-quote the column names in
    queries, which is inconvenient. This commit addresses this: all three
    scenarios now generate lower-case column names.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jingyimei/madlib encode_categorial_column_name_change

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/madlib/pull/234.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #234

----
commit 4d78f9425ffb76089d30bbe85cb3e07f1050268a
Author: Jingyi Mei <jmei@...>
Date:   2018-02-14T00:11:39Z

    Create lower case column name in encode_categorical_variables()

    JIRA:MADLIB-1202

----

> encode_categorical_variables() creates all lower case column names for boolean columns
> --------------------------------------------------------------------------------------
>
>                 Key: MADLIB-1202
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1202
>             Project: Apache MADlib
>          Issue Type: Improvement
>            Reporter: Jarrod Vawdrey
>            Assignee: Jingyi Mei
>            Priority: Minor
>             Fix For: v1.14
>
>
> It would be handy if encode_categorical_variables() created lower case
> column names for boolean columns, instead of names containing upper case
> characters, which require double quoting to query.
> Current implementation generates "<boolean column name>_True" and
> "<boolean column name>_False".
> Improvement: generate <boolean column name>_true and
> <boolean column name>_false.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
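
For readers wondering why the upper-case suffixes force double quoting: PostgreSQL folds unquoted identifiers to lower case, so a column stored as is_active_True cannot be reached by writing is_active_True without quotes. The sketch below models that rule in a simplified way (it ignores reserved keywords) and shows the effect of lower-casing the three suffixes named in this issue. The helper names and the base column names "is_active" and "sex" are hypothetical, and this is not the code from the pull request.

```python
import re

# Simplified model (an assumption for illustration): an identifier can be
# written without double quotes only if it is made of lower-case letters,
# digits, underscores and "$", and does not start with a digit or "$".
# Reserved keywords, which also need quoting, are ignored here.
_PLAIN_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_$]*$")


def needs_double_quotes(column_name):
    """Return True if the stored column name must be double-quoted in a query."""
    return _PLAIN_IDENTIFIER.match(column_name) is None


def lowercase_suffix(base, suffix):
    """Hypothetical helper: append the suffix in lower case to the base name."""
    return base + suffix.lower()


if __name__ == "__main__":
    # Suffixes taken from the issue text; base column names are made up.
    cases = [("sex", "__MISC__"), ("sex", "__NULL"),
             ("is_active", "_True"), ("is_active", "_False")]
    for base, suffix in cases:
        old_name = base + suffix
        new_name = lowercase_suffix(base, suffix)
        print(old_name, needs_double_quotes(old_name),
              new_name, needs_double_quotes(new_name))
```

Under this model every old-style name needs quoting and every lower-cased name does not, which is the convenience the issue asks for.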