Hi Nirmal,

As CD pointed out, it was the name of the REST API for me as well.

Thanks and Regards,
Vidura

On 12 October 2015 at 14:02, Nirmal Fernando <[email protected]> wrote:

> Hi CD/Vidura,
>
> On Mon, Oct 12, 2015 at 1:56 PM, CD Athuraliya <[email protected]> wrote:
>
>> On Mon, Oct 12, 2015 at 12:36 PM, Vidura Gamini Abhaya <[email protected]> wrote:
>>
>>> Hi Fazlan,
>>>
>>> Please see my comments inline in blue.
>>>
>>>> No, I am not planning to build the model from scratch. Once the
>>>> serialized Spark model is built, we can export it to PMML format. In other
>>>> words, we are using the serialized model in order to build the PMML model.
>>>
>>> That's great.
>>>
>>>> If I am not mistaken, what you are suggesting is to let the user go
>>>> through the normal workflow of model building and, once it is done, give an
>>>> option to the user to export it to PMML format (also for the models that
>>>> have already been built)?
>>
>> Yes exactly! What we should not do, IMO, is ask the user to go through
>> the whole workflow if he needs to export an already created model in PMML.
>
> Can you please explain where you got this idea from? If this idea is
> there in Fazlan's content, we need to fix it.
>
>>> Yes, this is exactly what I meant.
>>>
>>>> @Vidura I will check on the run-time support; if that is possible, that
>>>> would be great.
>>>
>>> If it's supported, it'll be great. If not, we can still do it based on
>>> the model type, but I think it'll be a bit messy as the code wouldn't be as
>>> generic.
>>>
>>> Thanks and Regards,
>>>
>>> Vidura
>>>
>>>> On Mon, Oct 12, 2015 at 12:10 PM, Vidura Gamini Abhaya <[email protected]> wrote:
>>>>
>>>>> Hi Fazlan,
>>>>>
>>>>> Are you planning to build a PMML model from scratch (i.e. going
>>>>> through the entire flow to build an ML model) or is this to be used for
>>>>> exporting a PMML out of an already built model?
>>>>>
>>>>> If it's the former, +1 to what CD mentioned on not asking the user to go
>>>>> through the entire ML workflow for PMML. My preference is also for
>>>>> saving/exporting a model in PMML to be an option for the user, once a
>>>>> model is built and for models that have already been built.
>>>>>
>>>>> @Fazlan - Can we find out whether the PMML export is possible at
>>>>> runtime through a method or through the inheritance hierarchy? If so, we
>>>>> could make the export option visible in the UI only for supported
>>>>> models.
>>>>>
>>>>> Thanks and Regards,
>>>>>
>>>>> Vidura
>>>>>
>>>>> On 12 October 2015 at 11:33, CD Athuraliya <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I feel that asking the user to go through the complete ML workflow for
>>>>>> PMML is too demanding. Computationally this conversion should be less
>>>>>> expensive compared to model training in real world use cases (since it's
>>>>>> a mapping of model parameters from Java objects to XML AFAIK). And model
>>>>>> training should be independent from the model format. Instead can't we
>>>>>> support this conversion on demand? Or save in both formats for now? Once
>>>>>> Spark starts supporting PMML for all algorithms we can go for Method 1 if
>>>>>> it looks consistent throughout our ML life cycle.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Mon, Oct 12, 2015 at 11:09 AM, Fazlan Nazeem <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am working on redmine[1] regarding PMML support for Machine
>>>>>>> Learner. Please provide your opinion on this design.
>>>>>>> [1] https://redmine.wso2.com/issues/4303
>>>>>>>
>>>>>>> *Overview*
>>>>>>>
>>>>>>> Spark 1.5.1 (latest version) supports PMML model export for some of
>>>>>>> the available models in Spark through MLlib.
>>>>>>>
>>>>>>> The table below outlines the MLlib models that can be exported to
>>>>>>> PMML and their equivalent PMML model.
>>>>>>>
>>>>>>> MLlib model                     PMML model
>>>>>>> ------------------------------  ----------------------------------------------------------------------------
>>>>>>> KMeansModel                     ClusteringModel
>>>>>>> LinearRegressionModel           RegressionModel (functionName="regression")
>>>>>>> RidgeRegressionModel            RegressionModel (functionName="regression")
>>>>>>> LassoModel                      RegressionModel (functionName="regression")
>>>>>>> SVMModel                        RegressionModel (functionName="classification" normalizationMethod="none")
>>>>>>> Binary LogisticRegressionModel  RegressionModel (functionName="classification" normalizationMethod="logit")
>>>>>>>
>>>>>>> Not all models available in MLlib can be exported to PMML as of now.
>>>>>>>
>>>>>>> Goal
>>>>>>>
>>>>>>> 1. We need to save models generated by WSO2 ML (PMML-supported
>>>>>>>    models) in PMML format, so that they can be reused from
>>>>>>>    PMML-supported tools.
>>>>>>>
>>>>>>> How To
>>>>>>>
>>>>>>> If “clusters” is the trained model, we can do the following with the
>>>>>>> PMML support.
>>>>>>>
>>>>>>> // Export the model to a String in PMML format
>>>>>>> clusters.toPMML
>>>>>>>
>>>>>>> // Export the model to a local file in PMML format
>>>>>>> clusters.toPMML("/tmp/kmeans.xml")
>>>>>>>
>>>>>>> // Export the model to a directory on a distributed file system in
>>>>>>> // PMML format
>>>>>>> clusters.toPMML(sc,"/tmp/kmeans")
>>>>>>>
>>>>>>> // Export the model to the OutputStream in PMML format
>>>>>>> clusters.toPMML(System.out)
>>>>>>>
>>>>>>> For unsupported models, either you will not find a .toPMML method or
>>>>>>> an IllegalArgumentException will be thrown.
>>>>>>>
>>>>>>> Design
>>>>>>>
>>>>>>> In the following diagram, models highlighted in green can be exported
>>>>>>> to PMML, but not the models highlighted in red. The diagram illustrates
>>>>>>> algorithms supported by WSO2 Machine Learner.
>>>>>>>
>>>>>>> [image: Inline image 2]
>>>>>>>
>>>>>>> Method 1
>>>>>>>
>>>>>>> By default, save the models in PMML if PMML export is supported,
>>>>>>> using one of these supported options.
>>>>>>>
>>>>>>> 1. Export the model to a String in PMML format
>>>>>>> 2. Export the model to a local file in PMML format
>>>>>>> 3. Export the model to a directory on a distributed file system in
>>>>>>>    PMML format
>>>>>>> 4. Export the model to the OutputStream in PMML format
>>>>>>>
>>>>>>> Classes that need to be modified (apart from UI)
>>>>>>>
>>>>>>> - SupervisedSparkModelBuilder
>>>>>>> - UnsupervisedSparkModelBuilder
>>>>>>>
>>>>>>> e.g.
>>>>>>>
>>>>>>> [image: Inline image 1]
>>>>>>>
>>>>>>> As of now the serialized models are saved in the “models” folder. The
>>>>>>> PMML models can also be saved in the same directory with a PMML suffix.
>>>>>>>
>>>>>>> Optional:
>>>>>>>
>>>>>>> After the model is generated, let the user export the PMML model to a
>>>>>>> chosen location through the UI.
>>>>>>>
>>>>>>> Method 2
>>>>>>>
>>>>>>> Add a *new REST API* to build models with PMML
>>>>>>>
>>>>>>> public Response buildPMMLModel(@PathParam("modelId") long modelId)
>>>>>>>
>>>>>>> In the backend we could add an additional argument to the "buildXModel"
>>>>>>> methods to decide whether to save the PMML model or not.
>>>>>>>
>>>>>>> UI modifications are also needed (an option for the user to choose
>>>>>>> whether to build the PMML and to choose the path to save it).
>>>>>>>
>>>>>>> Identified classes that need to be modified (apart from UI)
>>>>>>>
>>>>>>> - SupervisedSparkModelBuilder
>>>>>>> - UnsupervisedSparkModelBuilder
>>>>>>> - ModelApiV10
>>>>>>>
>>>>>>> *Conclusion*
>>>>>>>
>>>>>>> Currently we have decided to go with "Method 2" because of the
>>>>>>> following reasons.
>>>>>>>
>>>>>>> - Not all models have PMML support in Spark.
>>>>>>> - If we are to use anything apart from Spark MLlib, such as H2O,
>>>>>>>   we will be depending on PMML support from H2O.
>>>>>>> - With Method 1 we might be generating PMML models when users
>>>>>>>   are not in need of it (useless computation).
>>>>>>>
>>>>>>> Please let me know if there is a better way to improve the design.
>>>>>>>
>>>>>>> --
>>>>>>> Thanks & Regards,
>>>>>>>
>>>>>>> Fazlan Nazeem
>>>>>>>
>>>>>>> *Software Engineer*
>>>>>>>
>>>>>>> *WSO2 Inc*
>>>>>>> Mobile : +94772338839
>>>>>>> [email protected]
>>>>>>
>>>>>> --
>>>>>> *CD Athuraliya*
>>>>>> Software Engineer
>>>>>> WSO2, Inc.
>>>>>> lean . enterprise . middleware
>>>>>> Mobile: +94 716288847
>>>>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>>>>> <https://twitter.com/cdathuraliya> | Blog
>>>>>> <https://cdathuraliya.wordpress.com/>
>>>>>
>>>>> --
>>>>> Vidura Gamini Abhaya, Ph.D.
>>>>> Director of Engineering
>>>>> M:+94 77 034 7754
>>>>> E: [email protected]
>>>>>
>>>>> WSO2 Inc. (http://wso2.com)
>>>>> lean.enterprise.middleware
>>>>
>>>> --
>>>> Thanks & Regards,
>>>>
>>>> Fazlan Nazeem
>>>>
>>>> *Software Engineer*
>>>>
>>>> *WSO2 Inc*
>>>> Mobile : +94772338839
>>>> [email protected]
>>>
>>> --
>>> Vidura Gamini Abhaya, Ph.D.
>>> Director of Engineering
>>> M:+94 77 034 7754
>>> E: [email protected]
>>>
>>> WSO2 Inc. (http://wso2.com)
>>> lean.enterprise.middleware
>>
>> --
>> *CD Athuraliya*
>> Software Engineer
>> WSO2, Inc.
>> lean . enterprise . middleware
>> Mobile: +94 716288847
>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>> <https://twitter.com/cdathuraliya> | Blog
>> <https://cdathuraliya.wordpress.com/>
>
> --
>
> Thanks & regards,
> Nirmal
>
> Team Lead - WSO2 Machine Learner
> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
> Mobile: +94715779733
> Blog: http://nirmalfdo.blogspot.com/

--
Vidura Gamini Abhaya, Ph.D.
Director of Engineering
M:+94 77 034 7754
E: [email protected]

WSO2 Inc. (http://wso2.com)
lean.enterprise.middleware
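
P.S. On the run-time support question in the quoted thread (whether PMML export can be detected through the inheritance hierarchy): below is a minimal sketch of what such a check might look like, not a tested implementation. As far as I know, Spark MLlib marks exportable models with the PMMLExportable trait in org.apache.spark.mllib.pmml. Everything in the sketch is a hypothetical stand-in so that it compiles without Spark: the PmmlExportable interface, the two model classes, and the PmmlExportCheck helper are illustrative names only, and the returned string is a placeholder rather than real PMML.

```java
/** Hypothetical stand-in for Spark's PMMLExportable trait (assumption, not the real type). */
interface PmmlExportable {
    String toPMML();
}

/** Hypothetical model type that supports PMML export (think of a k-means model). */
class ExportableModel implements PmmlExportable {
    @Override
    public String toPMML() {
        return "<PMML/>"; // placeholder string, not real PMML output
    }
}

/** Hypothetical model type without PMML export support. */
class NonExportableModel {
}

/** Illustrative helper; the class and method names are assumptions. */
final class PmmlExportCheck {
    private PmmlExportCheck() {
    }

    /** Returns true when the model's type hierarchy includes the export interface. */
    static boolean supportsPmml(Object model) {
        return model instanceof PmmlExportable;
    }

    /** Returns the PMML string, or throws if the model type cannot be exported. */
    static String export(Object model) {
        if (!(model instanceof PmmlExportable)) {
            throw new IllegalArgumentException(
                    "PMML export is not supported for " + model.getClass().getSimpleName());
        }
        return ((PmmlExportable) model).toPMML();
    }
}
```

With a check along these lines, the UI could show the export option only when supportsPmml(...) returns true, which keeps the code generic instead of branching on each model type.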
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
