This feature will be useful only if training can be done at scale. There may be some models that can be built incrementally; do you know of any?

On Tue, Oct 20, 2015 at 11:37 AM Siyuan Hua <[email protected]> wrote:

> Hi Sandesh,
>
> This is not meant to scale up H2O itself. It is just a bridge between
> H2O and Apex. Currently, if you want to use Apex to prepare the data for
> H2O, you have to output the data to some file (e.g. on HDFS) and then
> manually start H2O to build the model.
> With this bridge you can build one pipeline to do the whole thing.
>
> Siyuan
>
> On Tue, Oct 20, 2015 at 10:56 AM, Sandesh Hegde <[email protected]> wrote:
>
> > How do you propose to handle the scalability required for H2O model
> > creation?
> >
> > On Tue, Oct 20, 2015 at 9:58 AM Siyuan Hua <[email protected]> wrote:
> >
> > > In ML model training, we discovered a pattern: Apex can be used to
> > > process raw data into feature data, and H2O then takes the feature
> > > data into its model training engine to train the model.
> > >
> > > But there is a gap between the two pipelines. I propose that we
> > > create an operator which feeds the processed data directly into H2O,
> > > or maybe starts a container for H2O and sends the data into it. That
> > > way, we could build a continuous online model training pipeline.
> > >
> > > I've created a JIRA here: https://malhar.atlassian.net/browse/MLHR-1875
> > >
> > > Feel free to share any thoughts.
> > >
> > > Best,
> > > Siyuan
