In ML model training, we have noticed a pattern: Apex can be used to
process raw data into feature data, and H2O then takes that feature data
into its model training engine to train the model.

But there is a gap between the two pipelines. I propose that we create
an operator that feeds the processed data directly into H2O, or
maybe starts a container for H2O and sends the data into it. That way, we
could build a continuous online model training pipeline.
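
To make the operator idea a bit more concrete, here is a rough Java sketch
of just the buffering part. It is only an illustration under my own
assumptions: FeatureBatcher and its trainerSink callback are hypothetical
names, the code uses no real Apex or H2O APIs, and how a batch actually
reaches H2O is left entirely to whatever the sink does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: collects processed feature rows and hands them
// to a trainer sink in fixed-size batches. Not an Apex or H2O class.
class FeatureBatcher {
    private final int batchSize;
    // Assumption: the sink stands in for "send this batch to H2O".
    private final Consumer<List<double[]>> trainerSink;
    private List<double[]> buffer = new ArrayList<>();

    FeatureBatcher(int batchSize, Consumer<List<double[]>> trainerSink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.trainerSink = trainerSink;
    }

    // Called once per processed feature row coming out of the pipeline.
    void process(double[] featureRow) {
        buffer.add(featureRow);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Hands any buffered rows to the sink and starts a fresh buffer,
    // e.g. at the end of a processing window.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        trainerSink.accept(buffer);
        buffer = new ArrayList<>();
    }
}
```

In a real operator, process() would be driven by the input port and
flush() by the end of a window, but that wiring is not shown here.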

I've created a JIRA ticket here: https://malhar.atlassian.net/browse/MLHR-1875

Feel free to share any thoughts.

Best,
Siyuan
