[ 
https://issues.apache.org/jira/browse/MADLIB-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikhil updated MADLIB-1213:
---------------------------
    Description: 
 The minibatch preprocessor currently does not support all expressions for 
independent and dependent variables.
 # Independent varname does not support any logical expression.
 # Dependent varname supports logical expressions only on numerical columns. 
For example, 'length > 1' is a valid expression, but creating an alias for it 
is not supported.
 # We might already support expressions that evaluate to an array, but we 
have not tested this.
 
This is the only kind of expression currently supported (a comparison on a 
numerical column, with no alias):
{code}
SELECT madlib.minibatch_preprocessor('minibatch_preprocessing_input', 
'minibatch_preprocessing_out',  'y > 10',  ' x1,x2', 4);
 {code}

Not supported:
{code}
SELECT madlib.minibatch_preprocessor('minibatch_preprocessing_input', 
'minibatch_preprocessing_out',  'y > 10 as foo',  'x1,x2', 4);
{code}

{code}
SELECT madlib.minibatch_preprocessor('minibatch_preprocessing_input', 
'minibatch_preprocessing_out',  'y=''F''',  'x1,x2', 4);
{code}
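
A possible workaround until such expressions are supported, offered as an 
untested sketch: materialize the expression as an ordinary column and pass 
that column name as the dependent varname. The table name 
minibatch_preprocessing_input_derived is hypothetical, and whether the 
preprocessor accepts the resulting boolean column has not been verified here.

```sql
-- Hypothetical workaround (untested): precompute the expression as a named column.
-- The derived table name is an assumption, not part of MADlib.
CREATE TABLE minibatch_preprocessing_input_derived AS
SELECT x1,
       x2,
       (y > 10) AS foo   -- for a text column, (y = 'F') AS foo would be the analogous step
FROM minibatch_preprocessing_input;

-- Pass the plain column name instead of the aliased expression.
SELECT madlib.minibatch_preprocessor('minibatch_preprocessing_input_derived',
'minibatch_preprocessing_out', 'foo', 'x1,x2', 4);
```

This costs an extra copy of the input table, so it is a stopgap rather than a 
substitute for supporting the expression directly.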

Open Questions:
1. What about expressions that evaluate to an array? We might already support 
these but have not tested them yet.
2. Do we need to support logical expressions for both independent and dependent 
varname?
3. If yes, to what extent?
4. Should the user be allowed to create an alias for logical expressions?
5. Other modules may also partially support logical expressions. Should we find 
out which ones?

  was:
* The minibatch preprocessor currently does not support all logical expressions 
for independent and dependent variables.
 # Independent varname does not support any logical expression.
 # Dependent varname supports logical expressions only on numerical columns. 
For example, 'length > 1' is a valid expression, but creating an alias for it 
is not supported.

This is the only kind of expression currently supported (a comparison on a 
numerical column, with no alias):
{code}
SELECT madlib.minibatch_preprocessor('minibatch_preprocessing_input', 
'minibatch_preprocessing_out',  'y > 10',  ' x1,x2', 4);
 {code}

Not supported:
{code}
SELECT madlib.minibatch_preprocessor('minibatch_preprocessing_input', 
'minibatch_preprocessing_out',  'y > 10 as foo',  'x1,x2', 4);
{code}

{code}
SELECT madlib.minibatch_preprocessor('minibatch_preprocessing_input', 
'minibatch_preprocessing_out',  'y=''F''',  'x1,x2', 4);
{code}

Open Questions:
1. What about expressions that evaluate to an array? We might already support 
these but have not tested them yet.
2. Do we need to support logical expressions for both independent and dependent 
varname?
3. If yes, to what extent?
4. Should the user be allowed to create an alias for logical expressions?
5. Other modules may also partially support logical expressions. Should we find 
out which ones?


> Support expressions for minibatch preprocessor
> ----------------------------------------------
>
>                 Key: MADLIB-1213
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1213
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: Module: Utilities
>            Reporter: Jingyi Mei
>            Priority: Major
>             Fix For: v1.14
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
