Do you think it would be useful to separate those models and the model loader/writer code into a separate spark-ml-common jar, without any Spark platform dependencies, so users can load models trained by Spark ML in their own applications and run predictions?
Sincerely,

DB Tsai
----------------------------------------------------------
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D

On Wed, Nov 11, 2015 at 3:14 AM, Nirmal Fernando <nir...@wso2.com> wrote:
> As of now, we are basically serializing the ML model and then
> deserializing it for prediction at real time.
>
> On Wed, Nov 11, 2015 at 4:39 PM, Adrian Tanase <atan...@adobe.com> wrote:
>
>> I don't think this answers your question, but here's how you would
>> evaluate the model in real time in a streaming app:
>>
>> https://databricks.gitbooks.io/databricks-spark-reference-applications/content/twitter_classifier/predict.html
>>
>> Maybe you can find a way to extract portions of MLlib and run them
>> outside of Spark -- loading the precomputed model and calling
>> .predict on it...
>>
>> -adrian
>>
>> From: Andy Davidson
>> Date: Tuesday, November 10, 2015 at 11:31 PM
>> To: "user @spark"
>> Subject: thought experiment: use spark ML to real time prediction
>>
>> Let's say I have used Spark ML to train a linear model. I know I can
>> save and load the model to disk, but I am not sure how I can use the
>> model in a real-time environment. For example, I do not think I can
>> easily return a "prediction" to the client using Spark Streaming.
>> Also, for some applications the extra latency created by the batch
>> process might not be acceptable.
>>
>> If I was not using Spark, I would re-implement the model I trained in
>> my batch environment in a language like Java, and implement a REST
>> service that uses the model to create a prediction and return it to
>> the client. Many models make predictions using linear algebra, and
>> implementing prediction is relatively easy if you have a good
>> vectorized LA package. Is there a way to use a model I trained with
>> Spark ML outside of Spark?
>>
>> As a motivating example: even if it is possible to return data to the
>> client using Spark Streaming, I think the mini-batch latency would not
>> be acceptable for a high-frequency stock trading system.
>>
>> Kind regards,
>>
>> Andy
>>
>> P.S. The examples I have seen so far use Spark Streaming to
>> "preprocess" predictions. For example, a recommender system might use
>> what current users are watching to calculate "trending
>> recommendations". These are stored on disk and served up to users when
>> they use the "movie guide". If a recommendation were a couple of
>> minutes old, it would not affect the end user's experience.
>
> --
> Thanks & regards,
> Nirmal
>
> Team Lead - WSO2 Machine Learner
> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
> Mobile: +94715779733
> Blog: http://nirmalfdo.blogspot.com/
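[Editor's note: Andy's suggestion above -- re-implement the prediction step in plain Java once the model is trained -- can be sketched in a few lines. For a linear model, prediction is just a dot product plus an intercept. The class name and the weights below are placeholders for illustration; in practice you would export the coefficients and intercept from the model Spark saved to disk, rather than hard-code them.]

```java
import java.util.Arrays;

// Minimal sketch of serving a Spark-trained linear model outside Spark.
// The weights and intercept here are made-up placeholders; a real service
// would load them from wherever the trained model was exported.
public class LinearModelScorer {
    private final double[] weights;
    private final double intercept;

    public LinearModelScorer(double[] weights, double intercept) {
        this.weights = Arrays.copyOf(weights, weights.length);
        this.intercept = intercept;
    }

    // prediction = w . x + intercept
    public double predict(double[] features) {
        if (features.length != weights.length) {
            throw new IllegalArgumentException("feature length mismatch");
        }
        double sum = intercept;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * features[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        LinearModelScorer scorer =
            new LinearModelScorer(new double[] {0.5, -1.0, 2.0}, 0.1);
        // 0.1 + 0.5*1.0 + (-1.0)*2.0 + 2.0*3.0 = 4.6
        System.out.println(scorer.predict(new double[] {1.0, 2.0, 3.0}));
    }
}
```

Wrapping a class like this in a REST endpoint gives the low-latency serving path discussed in the thread, with no Spark runtime on the serving side.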