[ 
https://issues.apache.org/jira/browse/MADLIB-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Schleich updated MADLIB-1050:
----------------------------------------
    Description: 
Hello, 
I am trying to dummy-encode categorical variables and feed the result to 
a linear regression model. My dataset, however, has more than 1664 categories, 
so the encoded output exceeds the number of columns Postgres allows in one 
table. Is there another way to encode dummy variables that does not require 
creating a new table? Perhaps the encoding could be streamlined into the 
regression model itself. 
Thank you for your help!

  was:
Hello, 
I am trying to use the dummy encoding for categorical variables and feed it to 
a linear regression model. but my dataset has more than 1664 categories, so 
Postgres cannot store it in one table. Is there any other way for encoding 
dummy variables that does not require the creation of a new table, perhaps the 
function can be streamlined into the regression model? 
Thank you for your help!


> Encoding of categorical variables limited to ~1600 columns? 
> ------------------------------------------------------------
>
>                 Key: MADLIB-1050
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1050
>             Project: Apache MADlib
>          Issue Type: Bug
>            Reporter: Maximilian Schleich
>
> Hello, 
> I am trying to dummy-encode categorical variables and feed the result to 
> a linear regression model. My dataset, however, has more than 1664 
> categories, so the encoded output exceeds the number of columns Postgres 
> allows in one table. Is there another way to encode dummy variables that 
> does not require creating a new table? Perhaps the encoding could be 
> streamlined into the regression model itself. 
> Thank you for your help!
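
One possible direction, sketched here under assumptions and not as MADlib's
own method: keep the indicators in a single array value per row instead of
one column per category. The table limit in question applies to the number of
columns, not to the length of an array stored in one column, so a row with
several thousand indicators still fits. The helper name dummy_encode below is
hypothetical, and the sketch assumes the regression step can take its
independent variables as one array-valued expression.

```python
def dummy_encode(values, drop_first=True):
    """Encode categorical values as one 0/1 vector per row.

    Hypothetical helper for illustration only; not a MADlib function.
    Returns the ordered category list and one indicator vector per input.
    """
    # Fix a stable category order so every row uses the same positions.
    categories = sorted(set(values))
    if drop_first:
        # Drop one reference category to avoid perfect collinearity
        # with the regression intercept.
        categories = categories[1:]
    position = {cat: i for i, cat in enumerate(categories)}

    rows = []
    for value in values:
        vector = [0] * len(categories)
        # The dropped reference category maps to the all-zero vector.
        if value in position:
            vector[position[value]] = 1
        rows.append(vector)
    return categories, rows


# Small usage example.
cats, rows = dummy_encode(["red", "green", "blue", "green"])

# A case beyond the column limit: 2000 categories still yield
# exactly one vector per row.
wide_cats, wide_rows = dummy_encode([f"c{i:04d}" for i in range(2000)])
```

Each vector can then be stored as a single array column, or built on the fly
in a query, so no 1600+ column table is ever materialized.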



