In ML model training, we have noticed a pattern: Apex can be used to process raw data into feature data, and H2O then takes that feature data into its model training engine to train the model.
However, there is a gap between the two pipelines. I propose that we create an operator that feeds the processed data directly into H2O, or perhaps starts a container for H2O and pushes the data into it. That way, we could build a continuous online model training pipeline.

I've created a JIRA here: https://malhar.atlassian.net/browse/MLHR-1875

Feel free to share any thoughts.

Best,
Siyuan
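
P.S. To make the idea a bit more concrete, here is a rough sketch of the bridging step, written as plain Python rather than as a real Apex operator (which would be Java). Everything in it is hypothetical: the class name `H2OBridgeOperator`, the `process`/`flush` methods, and the `sink` callback are placeholders of mine, not Apex or H2O APIs. The sink is simply where the actual hand-off to H2O would go; what that call looks like is left open.

```python
from typing import Callable, List, Sequence

FeatureRow = Sequence[float]


class H2OBridgeOperator:
    """Hypothetical bridge: buffers feature rows and hands them to a sink in batches."""

    def __init__(self, sink: Callable[[List[FeatureRow]], None], batch_size: int = 2):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.sink = sink              # assumption: the real sink would push the batch to H2O
        self.batch_size = batch_size
        self._buffer: List[FeatureRow] = []

    def process(self, row: FeatureRow) -> None:
        """Accept one processed feature row from the upstream pipeline."""
        self._buffer.append(row)
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Send any buffered rows downstream, e.g. at the end of a window."""
        if self._buffer:
            self.sink(list(self._buffer))
            self._buffer.clear()


# Stand-in sink that just records the batches it receives.
received: List[List[FeatureRow]] = []
op = H2OBridgeOperator(received.append, batch_size=2)
for row in [(1.0, 0.5), (0.2, 0.1), (0.9, 0.3)]:
    op.process(row)
op.flush()  # push the final partial batch
```

Batching is only one option here; the container approach would swap the sink for something that starts H2O and streams data into it.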
