Github user njayaram2 commented on the issue:

    https://github.com/apache/madlib/pull/342
  
    @reductionista thank you for the comments.
    The existing `minibatch_preprocessor` module names its output columns 
`dependent_varname` and `independent_varname` instead of reusing the column 
names from the input table. We did the same here purely to conform with that 
module. The other module accepts expressions as input params (which may be why 
its output table uses different column names), while this module does not 
explicitly support expressions. So I agree with your point about the output 
table column names, but I am not sure how odd it would be for the two modules 
to differ. Maybe other folks could weigh in to help us decide. Also, this 
module (`minibatch_preprocessor_dl`) is at an early stage of development, so 
this is a good time to try out options.
    
    Regarding your comment on the ordering of the two input params (`x` and 
`y`):
    This follows the convention in every other MADlib module: the dependent 
variable comes first, followed by the independent variable, in the input 
parameter list. If you'd like it to be the opposite, it might be a good idea 
to start a separate thread on the community mailing list to discuss it. 
Changing the order only in this module would break conformity. BTW, the `2.0` 
release would be a good time to change it, since that release would break 
backward compatibility.

