On Mon, Oct 12, 2015 at 2:17 PM, Fazlan Nazeem <[email protected]> wrote:
> Sure Nirmal.
>
> Thanks CD for pointing it out!
>
> On Mon, Oct 12, 2015 at 2:14 PM, Nirmal Fernando <[email protected]> wrote:
>
>> Excellent! Good catch! Fazlan please fix. exportAsPMMLModel may be?
>>
>> On Mon, Oct 12, 2015 at 2:10 PM, CD Athuraliya <[email protected]>
>> wrote:
>>
>>> To me buildPMMLModel(long modelId) sounds more like we are building (or
>>> training) the model. ExportPMML or something similar would sound more
>>> like the actual action IMO.
>>>
>>> On Mon, Oct 12, 2015 at 2:02 PM, Nirmal Fernando <[email protected]>
>>> wrote:
>>>
>>>> Hi CD/Vidura,
>>>>
>>>> On Mon, Oct 12, 2015 at 1:56 PM, CD Athuraliya <[email protected]>
>>>> wrote:
>>>>
>>>>> On Mon, Oct 12, 2015 at 12:36 PM, Vidura Gamini Abhaya <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi Fazlan,
>>>>>>
>>>>>> Please see my comments inline in blue.
>>>>>>
>>>>>>> No I am not planning to build the model from scratch. Once the
>>>>>>> serialized spark model is built, we can export it to PMML format. In
>>>>>>> other words, we are using the serialized model in order to build the
>>>>>>> PMML model.
>>>>>>
>>>>>> That's great.
>>>>>>
>>>>>>> If I have not mistaken what you are suggesting is let the user go
>>>>>>> through the normal workflow of model building and once it is done, give
>>>>>>> an option to the user to export it to PMML format(also for the models
>>>>>>> that have been already built)?
>>>>>
>>>>> Yes exactly! What we should not do IMO is asking the user to go
>>>>> through the whole workflow if he needs to export already created model in
>>>>> PMML.
>>>>
>>>> Can you please explain from where did you get this idea? If this idea
>>>> is there in Fazlan's content, we need to fix it.
>>>>
>>>>>> Yes, this is exactly what I meant.
>>>>>>
>>>>>>> @Vidura I will check on the run-time support, if that is possible
>>>>>>> that would be great.
>>>>>>
>>>>>> If it's supported, it'll be great.
>>>>>> If not we can still do it based on the model type but I think it'll be
>>>>>> a bit messy as the code wouldn't be as generic.
>>>>>>
>>>>>> Thanks and Regards,
>>>>>>
>>>>>> Vidura
>>>>>>
>>>>>>> On Mon, Oct 12, 2015 at 12:10 PM, Vidura Gamini Abhaya <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Fazlan,
>>>>>>>>
>>>>>>>> Are you planning to build a PMML model from the scratch (i.e going
>>>>>>>> through the entire flow to build an ML model) or is this to be used for
>>>>>>>> exporting a PMML out of an already built model?
>>>>>>>>
>>>>>>>> If it's the former, +1 to what CD mentioned on not asking user to
>>>>>>>> go through the entire ML workflow for PMML. My preference is also for
>>>>>>>> saving/exporting a model in PMML to be an option for the user, once a
>>>>>>>> model is built and for models that have already been built.
>>>>>>>>
>>>>>>>> @Fazlan - Can we find out whether the PMML export is possible at
>>>>>>>> runtime through a method or through the inheritance hierarchy? If so,
>>>>>>>> we could only make the export option visible on a UI, only for
>>>>>>>> supported models.
>>>>>>>>
>>>>>>> This option can be something similar to platform selection in typical
>>>>>>> software downloads where we have native model type and PMML.
>>>>>>>
>>>>>>>> Thanks and Regards,
>>>>>>>>
>>>>>>>> Vidura
>>>>>>>>
>>>>>>>> On 12 October 2015 at 11:33, CD Athuraliya <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I feel that asking user to go through the complete ML workflow for
>>>>>>>>> PMML is too demanding. Computationally this conversion should be less
>>>>>>>>> expensive compared to model training in real world use cases (since
>>>>>>>>> it's a mapping of model parameters from Java objects to XML AFAIK).
>>>>>>>>> And model training should be independent from the model format.
>>>>>>>>> Instead can't we support this conversion on demand?
>>>>>>>>> Or save in both formats for now? Once Spark starts supporting PMML
>>>>>>>>> for all algorithms we can go for Method 1 if it looks consistent
>>>>>>>>> throughout our ML life cycle.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> On Mon, Oct 12, 2015 at 11:09 AM, Fazlan Nazeem <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am working on redmine[1] regarding PMML support for Machine
>>>>>>>>>> Learner. Please provide your opinion on this design.
>>>>>>>>>> [1] https://redmine.wso2.com/issues/4303
>>>>>>>>>>
>>>>>>>>>> *Overview*
>>>>>>>>>>
>>>>>>>>>> Spark 1.5.1 (latest version) supports PMML model export for some
>>>>>>>>>> of the available models in Spark through MLlib.
>>>>>>>>>>
>>>>>>>>>> The table below outlines the MLlib models that can be exported to
>>>>>>>>>> PMML and their equivalent PMML model.
>>>>>>>>>>
>>>>>>>>>> MLlib model                     PMML model
>>>>>>>>>> KMeansModel                     ClusteringModel
>>>>>>>>>> LinearRegressionModel           RegressionModel (functionName="regression")
>>>>>>>>>> RidgeRegressionModel            RegressionModel (functionName="regression")
>>>>>>>>>> LassoModel                      RegressionModel (functionName="regression")
>>>>>>>>>> SVMModel                        RegressionModel (functionName="classification" normalizationMethod="none")
>>>>>>>>>> Binary LogisticRegressionModel  RegressionModel (functionName="classification" normalizationMethod="logit")
>>>>>>>>>>
>>>>>>>>>> Not all models available in MLlib can be exported to PMML as of
>>>>>>>>>> now.
>>>>>>>>>>
>>>>>>>>>> Goal
>>>>>>>>>>
>>>>>>>>>> 1. We need to save models generated by WSO2 ML (PMML supported
>>>>>>>>>>    models) in PMML format, so that those could be reused from PMML
>>>>>>>>>>    supported tools.
>>>>>>>>>>
>>>>>>>>>> How To
>>>>>>>>>>
>>>>>>>>>> If “clusters” is the trained model, we can do the following with
>>>>>>>>>> the PMML support.
>>>>>>>>>>
>>>>>>>>>> // Export the model to a String in PMML format
>>>>>>>>>> clusters.toPMML
>>>>>>>>>>
>>>>>>>>>> // Export the model to a local file in PMML format
>>>>>>>>>> clusters.toPMML("/tmp/kmeans.xml")
>>>>>>>>>>
>>>>>>>>>> // Export the model to a directory on a distributed file system
>>>>>>>>>> // in PMML format
>>>>>>>>>> clusters.toPMML(sc, "/tmp/kmeans")
>>>>>>>>>>
>>>>>>>>>> // Export the model to the OutputStream in PMML format
>>>>>>>>>> clusters.toPMML(System.out)
>>>>>>>>>>
>>>>>>>>>> For unsupported models, either you will not find a .toPMML method
>>>>>>>>>> or an IllegalArgumentException will be thrown.
>>>>>>>>>>
>>>>>>>>>> Design
>>>>>>>>>>
>>>>>>>>>> In the following diagram models highlighted in green can be
>>>>>>>>>> exported to PMML, but not the models highlighted in red. The diagram
>>>>>>>>>> illustrates algorithms supported by WSO2 Machine Learner.
>>>>>>>>>>
>>>>>>>>>> [image: Inline image 2]
>>>>>>>>>>
>>>>>>>>>> Method 1
>>>>>>>>>>
>>>>>>>>>> By default save the models in PMML if PMML export is supported,
>>>>>>>>>> using one of these supported options.
>>>>>>>>>>
>>>>>>>>>> 1. Export the model to a String in PMML format
>>>>>>>>>> 2. Export the model to a local file in PMML format
>>>>>>>>>> 3. Export the model to a directory on a distributed file system
>>>>>>>>>>    in PMML format
>>>>>>>>>> 4. Export the model to the OutputStream in PMML format
>>>>>>>>>>
>>>>>>>>>> Classes need to be modified (apart from UI)
>>>>>>>>>>
>>>>>>>>>> - SupervisedSparkModelBuilder
>>>>>>>>>> - UnsupervisedSparkModelBuilder
>>>>>>>>>>
>>>>>>>>>> e.g
>>>>>>>>>>
>>>>>>>>>> [image: Inline image 1]
>>>>>>>>>>
>>>>>>>>>> As of now the serialized models are saved in “models” folder.
>>>>>>>>>> The PMML models can also be saved in the same directory with a PMML
>>>>>>>>>> suffix.
>>>>>>>>>>
>>>>>>>>>> optional:
>>>>>>>>>>
>>>>>>>>>> After the model is generated let the user export the PMML model
>>>>>>>>>> to a chosen location through the UI.
>>>>>>>>>>
>>>>>>>>>> Method 2
>>>>>>>>>>
>>>>>>>>>> Add a *new REST API* to build models with PMML
>>>>>>>>>>
>>>>>>>>>> public Response buildPMMLModel(@PathParam("modelId") long modelId)
>>>>>>>>>>
>>>>>>>>>> in the backend we could add an additional argument to
>>>>>>>>>> "buildXModel" methods to decide whether to save the PMML model or
>>>>>>>>>> not.
>>>>>>>>>>
>>>>>>>>>> UI modifications also needed (An option for the user to choose
>>>>>>>>>> whether to build the PMML and to choose the path to save it)
>>>>>>>>>>
>>>>>>>>>> Identified classes need to be modified (apart from UI)
>>>>>>>>>>
>>>>>>>>>> - SupervisedSparkModelBuilder
>>>>>>>>>> - UnsupervisedSparkModelBuilder
>>>>>>>>>> - ModelApiV10
>>>>>>>>>>
>>>>>>>>>> *Conclusion*
>>>>>>>>>>
>>>>>>>>>> Currently we have decided to go with "Method 2" because of the
>>>>>>>>>> following reasons.
>>>>>>>>>>
>>>>>>>>>> - Not all models have PMML support in Spark.
>>>>>>>>>> - If we are to use anything apart from Spark MLlib, such as
>>>>>>>>>>   H2O, we will be depending on PMML support from H2O.
>>>>>>>>>> - With Method 1 we might be generating PMML models when users
>>>>>>>>>>   are not in need of it (useless computation).
>>>>>>>>>>
>>>>>>>>>> Please let me know if there is a better way to improve the
>>>>>>>>>> design.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Thanks & Regards,
>>>>>>>>>>
>>>>>>>>>> Fazlan Nazeem
>>>>>>>>>>
>>>>>>>>>> *Software Engineer*
>>>>>>>>>>
>>>>>>>>>> *WSO2 Inc*
>>>>>>>>>> Mobile : +94772338839
>>>>>>>>>> [email protected]
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> *CD Athuraliya*
>>>>>>>>> Software Engineer
>>>>>>>>> WSO2, Inc.
>>>>>>>>> lean . enterprise . middleware
>>>>>>>>> Mobile: +94 716288847
>>>>>>>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>>>>>>>> <https://twitter.com/cdathuraliya> | Blog
>>>>>>>>> <https://cdathuraliya.wordpress.com/>
>>>>>>>>>
>>>>>>>> --
>>>>>>>> Vidura Gamini Abhaya, Ph.D.
>>>>>>>> Director of Engineering
>>>>>>>> M:+94 77 034 7754
>>>>>>>> E: [email protected]
>>>>>>>>
>>>>>>>> WSO2 Inc. (http://wso2.com)
>>>>>>>> lean.enterprise.middleware
>>>>>>>>
>>>> --
>>>>
>>>> Thanks & regards,
>>>> Nirmal
>>>>
>>>> Team Lead - WSO2 Machine Learner
>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>> Mobile: +94715779733
>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>

--
*CD Athuraliya*
Software Engineer
WSO2, Inc.
lean . enterprise . middleware
Mobile: +94 716288847
LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
<https://twitter.com/cdathuraliya> | Blog
<https://cdathuraliya.wordpress.com/>
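
P.S. On Vidura's question in the thread about detecting PMML support at runtime through the inheritance hierarchy, and on exporting an already built model on demand: a minimal sketch of how that could look, assuming the supported models share a common exportable type. Everything named here is hypothetical and only for illustration. PmmlExportable is a stand-in for whatever common type Spark MLlib provides (its PMMLExportable trait looks like the candidate, but that needs to be confirmed against the Spark version we use), and KMeansLikeModel, UnsupportedModel, isPmmlExportable and exportAsPmml are made-up names, not existing WSO2 ML classes.

```java
import java.util.Optional;

class PmmlExportSketch {

    /** Hypothetical stand-in for a common "can export PMML" type. */
    interface PmmlExportable {
        String toPMML();
    }

    /** Hypothetical model type that supports PMML export (think k-means). */
    static final class KMeansLikeModel implements PmmlExportable {
        @Override
        public String toPMML() {
            // Placeholder document, not real exporter output.
            return "<PMML><ClusteringModel/></PMML>";
        }
    }

    /** Hypothetical model type without PMML export support. */
    static final class UnsupportedModel {
    }

    /** Runtime check; the UI could use this to show or hide the export option. */
    static boolean isPmmlExportable(Object model) {
        return model instanceof PmmlExportable;
    }

    /** Exports an already built model on demand, without retraining it. */
    static Optional<String> exportAsPmml(Object model) {
        if (model instanceof PmmlExportable) {
            return Optional.of(((PmmlExportable) model).toPMML());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(isPmmlExportable(new KMeansLikeModel()));
        System.out.println(isPmmlExportable(new UnsupportedModel()));
        System.out.println(exportAsPmml(new KMeansLikeModel()).orElse("not supported"));
    }
}
```

With a check like this the export stays separate from model building, so the REST method could simply load the serialized model by its ID and call the export, returning an error response for unsupported model types.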
_______________________________________________ Architecture mailing list [email protected] https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
