[
https://issues.apache.org/jira/browse/MADLIB-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363275#comment-16363275
]
ASF GitHub Bot commented on MADLIB-1202:
----------------------------------------
GitHub user jingyimei opened a pull request:
https://github.com/apache/madlib/pull/234
Create lower case column name in encode_categorical_variables()
JIRA:MADLIB-1202
The previous madlib.encode_categorical_variables() function generates
column names containing upper-case characters in the following cases:
1. when you specify top_values, there will be a column name with the
suffix __MISC__
2. when you set encode_nulls to True, there will be a column name with
the suffix __NULL
3. when the original column is of boolean type, there will be column
names with the suffixes _True and _False
In the above cases, users have to double-quote the column names in
queries, which is not convenient.
This commit addresses this, and all three scenarios now generate
lower-case column names.
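To illustrate why the old names needed double quoting, here is a minimal
Python sketch that models how PostgreSQL treats unquoted identifiers
(they are folded to lower case, so a name containing upper-case letters
can only be reached when quoted). The column names flag and color and
both helper functions are hypothetical and are not part of MADlib; the
model is simplified and ignores reserved keywords.

```python
import re

# Simplified model (an assumption for illustration): an identifier can be
# used without double quotes only if it is already lower case and made of
# letters, digits, underscores or dollar signs, not starting with a digit.
_PLAIN_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_$]*")


def needs_double_quotes(column_name: str) -> bool:
    """Return True if the name cannot be referenced as an unquoted identifier."""
    return _PLAIN_IDENTIFIER.fullmatch(column_name) is None


def to_lower_case_name(column_name: str) -> str:
    """Hypothetical stand-in for the renaming described in this pull request."""
    return column_name.lower()


# Example names built from the suffixes listed above.
old_names = ["flag_True", "flag_False", "color__MISC__", "color__NULL"]

for old in old_names:
    new = to_lower_case_name(old)
    print(f"{old}: quoted={needs_double_quotes(old)} -> "
          f"{new}: quoted={needs_double_quotes(new)}")
```

Under this model every old name requires quoting and every lower-cased
name does not, which is the convenience this change is after.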
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jingyimei/madlib encode_categorial_column_name_change
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/madlib/pull/234.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #234
----
commit 4d78f9425ffb76089d30bbe85cb3e07f1050268a
Author: Jingyi Mei <jmei@...>
Date: 2018-02-14T00:11:39Z
Create lower case column name in encode_categorical_variables()
JIRA:MADLIB-1202
The previous madlib.encode_categorical_variables() function generates
column names containing upper-case characters in the following cases:
1. when you specify top_values, there will be a column name with the
suffix __MISC__
2. when you set encode_nulls to True, there will be a column name with
the suffix __NULL
3. when the original column is of boolean type, there will be column
names with the suffixes _True and _False
In the above cases, users have to double-quote the column names in
queries, which is not convenient.
This commit addresses this, and all three scenarios now generate
lower-case column names.
----
> encode_categorical_variables() creates all lower case column names for
> boolean columns
> --------------------------------------------------------------------------------------
>
> Key: MADLIB-1202
> URL: https://issues.apache.org/jira/browse/MADLIB-1202
> Project: Apache MADlib
> Issue Type: Improvement
> Reporter: Jarrod Vawdrey
> Assignee: Jingyi Mei
> Priority: Minor
> Fix For: v1.14
>
>
>
> It would be handy if encode_categorical_variables() created lower case column
> names for boolean columns instead of mixed case names that require double
> quoting to query.
> The current implementation generates "<boolean column name>_True" and
> "<boolean column name>_False".
> The improvement is to generate <boolean column name>_true and <boolean column
> name>_false.
>