Thank you. That is helpful. More specifically, I am trying to implement a regression of a form like this:
write_score = B0 + B1*gender + B2*log(math) + B3*log(read)

where a student's predicted writing score is a function of gender, the log of the math score, and the log of the reading score.

But in fact, what I am trying to understand is how to do feature engineering inside of PIO. I want to try various manipulations of the data to figure out what the best features are for a given model (log is a common example). I might want to try, for example, another regression like:

write_score = B0 + B2*(math - read)^2

where the writing score is a function of the squared difference between the math and reading scores.

I'd prefer to manipulate variables within the PIO engine because the servers that send the event data to PIO are "just dumb pipes", and I'd like to keep the "data science" logic outside of those pipes and inside of PIO as much as possible.

On Fri, Jan 20, 2017 at 12:45 PM Pat Ferrel <[email protected]> wrote:

> It would help to know what you are trying to implement.
>
> The datasource and preparator are used only during the input part of
> train; they pass data to the train method of your algorithm when you run
> `pio train`. The predict method does not use them at all. It may get data
> from the EventStore, but not through those other classes.
>
> If you need data to always be the log of some number you may want to take
> the log before it is sent to the EventServer so it will always be a log,
> even when it comes from the Query or out of the EventServer.
>
>
> On Jan 20, 2017, at 5:13 AM, Daniel Gabrieli <[email protected]>
> wrote:
>
> Hi,
>
> I am new to PIO.
>
> I have a variable called X that I would like to take the log of during
> training and then during prediction as well. Where is the appropriate
> place to put the log function?
>
> My guess is to override the "prepare" method; while I think the prepare
> method is called just before training, I am not clear whether it is also
> called before prediction.
>
> Do I call the log transformation again somewhere else so that it occurs
> during prediction? Possibly in the predict method?
>
> Thank you,
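
To make the question concrete, here is a rough sketch of the pattern I have in mind, written in plain Python rather than against PIO's actual classes. Each candidate feature set is a single function, and the same function is applied to the training events and to the incoming query, so the transformation cannot drift between `pio train` and prediction. Every name in it (`log_features`, `squared_diff_features`, `ScoreModel`, `solve_least_squares`) is hypothetical, the scores are made up, and gender is left out to keep it short. In a real engine I assume the equivalent call would sit both in the preparator and in the algorithm's predict method.

```python
import math


def log_features(record):
    """Features for: write = B0 + B2*log(math) + B3*log(read)."""
    return [math.log(record["math"]), math.log(record["read"])]


def squared_diff_features(record):
    """Features for: write = B0 + B2*(math - read)^2."""
    return [(record["math"] - record["read"]) ** 2]


def solve_least_squares(X, y):
    """Solve the normal equations (X^T X) b = X^T y by Gauss-Jordan elimination."""
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(n)]
        + [sum(row[i] * target for row, target in zip(X, y))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


class ScoreModel:
    """Hypothetical stand-in for an engine's train/predict pair."""

    def __init__(self, featurize):
        self.featurize = featurize
        self.coefficients = None

    def _design_row(self, record):
        # Intercept term followed by the engineered features.
        return [1.0] + self.featurize(record)

    def train(self, events):
        X = [self._design_row(e) for e in events]
        y = [e["write"] for e in events]
        self.coefficients = solve_least_squares(X, y)

    def predict(self, query):
        # The query goes through the same featurize function used in train.
        row = self._design_row(query)
        return sum(c * v for c, v in zip(self.coefficients, row))


# Made-up events, purely for illustration.
events = [
    {"math": 50, "read": 40, "write": 46},
    {"math": 60, "read": 65, "write": 61},
    {"math": 70, "read": 50, "write": 58},
    {"math": 45, "read": 45, "write": 47},
    {"math": 80, "read": 60, "write": 66},
]

model = ScoreModel(log_features)  # swap in squared_diff_features to compare
model.train(events)
print(model.predict({"math": 55, "read": 50}))
```

Swapping `log_features` for `squared_diff_features` is then a one-argument change, which is the kind of experimentation I'd like to be able to do inside the engine.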
