Thanks Vidura!

On Mon, Oct 12, 2015 at 3:10 PM, Vidura Gamini Abhaya <[email protected]> wrote:
> Hi Nirmal,
>
> As CD pointed out, it was the name of the REST API for me as well.
>
> Thanks and Regards,
>
> Vidura
>
> On 12 October 2015 at 14:02, Nirmal Fernando <[email protected]> wrote:
>
>> Hi CD/Vidura,
>>
>> On Mon, Oct 12, 2015 at 1:56 PM, CD Athuraliya <[email protected]> wrote:
>>
>>> On Mon, Oct 12, 2015 at 12:36 PM, Vidura Gamini Abhaya <[email protected]> wrote:
>>>
>>>> Hi Fazlan,
>>>>
>>>> Please see my comments inline in blue.
>>>>
>>>>> No, I am not planning to build the model from scratch. Once the
>>>>> serialized Spark model is built, we can export it to PMML format. In other
>>>>> words, we are using the serialized model in order to build the PMML model.
>>>>
>>>> That's great.
>>>>
>>>>> If I am not mistaken, what you are suggesting is to let the user go
>>>>> through the normal workflow of model building and, once it is done, give
>>>>> the user an option to export it to PMML format (also for the models that
>>>>> have already been built)?
>>>
>>> Yes exactly! What we should not do, IMO, is ask the user to go through
>>> the whole workflow if he needs to export an already created model in PMML.
>>
>> Can you please explain where you got this idea from? If this idea is
>> there in Fazlan's content, we need to fix it.
>>
>>>> Yes, this is exactly what I meant.
>>>>
>>>>> @Vidura I will check on the run-time support; if that is possible, that
>>>>> would be great.
>>>>
>>>> If it's supported, it'll be great. If not, we can still do it based on
>>>> the model type, but I think it'll be a bit messy as the code wouldn't be as
>>>> generic.
>>>>
>>>> Thanks and Regards,
>>>>
>>>> Vidura
>>>>
>>>>> On Mon, Oct 12, 2015 at 12:10 PM, Vidura Gamini Abhaya <[email protected]> wrote:
>>>>>
>>>>>> Hi Fazlan,
>>>>>>
>>>>>> Are you planning to build a PMML model from scratch (i.e. going
>>>>>> through the entire flow to build an ML model), or is this to be used for
>>>>>> exporting PMML out of an already built model?
>>>>>>
>>>>>> If it's the former, +1 to what CD mentioned on not asking the user to go
>>>>>> through the entire ML workflow for PMML. My preference is also for
>>>>>> saving/exporting a model in PMML to be an option for the user, once a
>>>>>> model is built and for models that have already been built.
>>>>>>
>>>>>> @Fazlan - Can we find out whether PMML export is possible at
>>>>>> runtime, through a method or through the inheritance hierarchy? If so, we
>>>>>> could make the export option visible in the UI only for supported
>>>>>> models.
>>>>>>
>>>>>> Thanks and Regards,
>>>>>>
>>>>>> Vidura
>>>>>>
>>>>>> On 12 October 2015 at 11:33, CD Athuraliya <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I feel that asking the user to go through the complete ML workflow for
>>>>>>> PMML is too demanding. Computationally, this conversion should be less
>>>>>>> expensive than model training in real-world use cases (since it's a
>>>>>>> mapping of model parameters from Java objects to XML, AFAIK). And model
>>>>>>> training should be independent of the model format. Instead, can't we
>>>>>>> support this conversion on demand? Or save in both formats for now? Once
>>>>>>> Spark starts supporting PMML for all algorithms, we can go for Method 1
>>>>>>> if it looks consistent throughout our ML life cycle.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> On Mon, Oct 12, 2015 at 11:09 AM, Fazlan Nazeem <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am working on redmine[1] regarding PMML support for Machine
>>>>>>>> Learner. Please provide your opinion on this design.
>>>>>>>> [1] https://redmine.wso2.com/issues/4303
>>>>>>>>
>>>>>>>> *Overview*
>>>>>>>>
>>>>>>>> Spark 1.5.1 (latest version) supports PMML model export for some of
>>>>>>>> the available models in Spark through MLlib.
>>>>>>>>
>>>>>>>> The table below outlines the MLlib models that can be exported to
>>>>>>>> PMML and their equivalent PMML model.
>>>>>>>>
>>>>>>>> MLlib model                    | PMML model
>>>>>>>> -------------------------------+------------------------------------------
>>>>>>>> KMeansModel                    | ClusteringModel
>>>>>>>> LinearRegressionModel          | RegressionModel (functionName="regression")
>>>>>>>> RidgeRegressionModel           | RegressionModel (functionName="regression")
>>>>>>>> LassoModel                     | RegressionModel (functionName="regression")
>>>>>>>> SVMModel                       | RegressionModel (functionName="classification"
>>>>>>>>                                | normalizationMethod="none")
>>>>>>>> Binary LogisticRegressionModel | RegressionModel (functionName="classification"
>>>>>>>>                                | normalizationMethod="logit")
>>>>>>>>
>>>>>>>> Not all models available in MLlib can be exported to PMML as of now.
>>>>>>>>
>>>>>>>> *Goal*
>>>>>>>>
>>>>>>>> 1. We need to save models generated by WSO2 ML (PMML-supported
>>>>>>>>    models) in PMML format, so that they can be reused from tools
>>>>>>>>    that support PMML.
>>>>>>>>
>>>>>>>> *How To*
>>>>>>>>
>>>>>>>> If “clusters” is the trained model, we can do the following with
>>>>>>>> the PMML support.
>>>>>>>>
>>>>>>>> // Export the model to a String in PMML format
>>>>>>>> clusters.toPMML
>>>>>>>>
>>>>>>>> // Export the model to a local file in PMML format
>>>>>>>> clusters.toPMML("/tmp/kmeans.xml")
>>>>>>>>
>>>>>>>> // Export the model to a directory on a distributed file system in
>>>>>>>> // PMML format
>>>>>>>> clusters.toPMML(sc, "/tmp/kmeans")
>>>>>>>>
>>>>>>>> // Export the model to the OutputStream in PMML format
>>>>>>>> clusters.toPMML(System.out)
>>>>>>>>
>>>>>>>> For unsupported models, either you will not find a .toPMML method
>>>>>>>> or an IllegalArgumentException will be thrown.
>>>>>>>>
>>>>>>>> *Design*
>>>>>>>>
>>>>>>>> In the following diagram, models highlighted in green can be
>>>>>>>> exported to PMML, but not the models highlighted in red. The diagram
>>>>>>>> illustrates the algorithms supported by WSO2 Machine Learner.
>>>>>>>>
>>>>>>>> [image: Inline image 2]
>>>>>>>>
>>>>>>>> *Method 1*
>>>>>>>>
>>>>>>>> By default, save the models in PMML if PMML export is supported,
>>>>>>>> using one of these supported options.
>>>>>>>>
>>>>>>>> 1. Export the model to a String in PMML format
>>>>>>>> 2. Export the model to a local file in PMML format
>>>>>>>> 3. Export the model to a directory on a distributed file system in
>>>>>>>>    PMML format
>>>>>>>> 4. Export the model to the OutputStream in PMML format
>>>>>>>>
>>>>>>>> Classes that need to be modified (apart from the UI):
>>>>>>>>
>>>>>>>> - SupervisedSparkModelBuilder
>>>>>>>> - UnsupervisedSparkModelBuilder
>>>>>>>>
>>>>>>>> e.g.
>>>>>>>>
>>>>>>>> [image: Inline image 1]
>>>>>>>>
>>>>>>>> As of now, the serialized models are saved in the “models” folder. The
>>>>>>>> PMML models can also be saved in the same directory with a PMML suffix.
>>>>>>>>
>>>>>>>> Optional:
>>>>>>>>
>>>>>>>> After the model is generated, let the user export the PMML model to
>>>>>>>> a chosen location through the UI.
>>>>>>>>
>>>>>>>> *Method 2*
>>>>>>>>
>>>>>>>> Add a *new REST API* to build models with PMML:
>>>>>>>>
>>>>>>>> public Response buildPMMLModel(@PathParam("modelId") long modelId)
>>>>>>>>
>>>>>>>> In the backend, we could add an additional argument to the "buildXModel"
>>>>>>>> methods to decide whether to save the PMML model or not.
>>>>>>>>
>>>>>>>> UI modifications are also needed (an option for the user to choose
>>>>>>>> whether to build the PMML and to choose the path to save it).
>>>>>>>>
>>>>>>>> Identified classes that need to be modified (apart from the UI):
>>>>>>>>
>>>>>>>> - SupervisedSparkModelBuilder
>>>>>>>> - UnsupervisedSparkModelBuilder
>>>>>>>> - ModelApiV10
>>>>>>>>
>>>>>>>> *Conclusion*
>>>>>>>>
>>>>>>>> Currently we have decided to go with "Method 2" for the
>>>>>>>> following reasons.
>>>>>>>>
>>>>>>>> - Not all models have PMML support in Spark.
>>>>>>>> - If we are to use anything apart from Spark MLlib, such as
>>>>>>>>   H2O, we will be depending on PMML support from H2O.
>>>>>>>> - With Method 1 we might be generating PMML models when users
>>>>>>>>   do not need them (useless computation).
>>>>>>>>
>>>>>>>> Please let me know if there is a better way to improve the design.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Thanks & Regards,
>>>>>>>>
>>>>>>>> Fazlan Nazeem
>>>>>>>>
>>>>>>>> *Software Engineer*
>>>>>>>>
>>>>>>>> *WSO2 Inc*
>>>>>>>> Mobile : +94772338839
>>>>>>>> [email protected]
>>>>>>>
>>>>>>> --
>>>>>>> *CD Athuraliya*
>>>>>>> Software Engineer
>>>>>>> WSO2, Inc.
>>>>>>> lean . enterprise . middleware
>>>>>>> Mobile: +94 716288847
>>>>>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>>>>>> <https://twitter.com/cdathuraliya> | Blog
>>>>>>> <https://cdathuraliya.wordpress.com/>
>>>>>>
>>>>>> --
>>>>>> Vidura Gamini Abhaya, Ph.D.
>>>>>> Director of Engineering
>>>>>> M:+94 77 034 7754
>>>>>> E: [email protected]
>>>>>>
>>>>>> WSO2 Inc. (http://wso2.com)
>>>>>> lean.enterprise.middleware
>>>>>
>>>>> --
>>>>> Thanks & Regards,
>>>>>
>>>>> Fazlan Nazeem
>>>>>
>>>>> *Software Engineer*
>>>>>
>>>>> *WSO2 Inc*
>>>>> Mobile : +94772338839
>>>>> [email protected]
>>>>
>>>> --
>>>> Vidura Gamini Abhaya, Ph.D.
>>>> Director of Engineering
>>>> M:+94 77 034 7754
>>>> E: [email protected]
>>>>
>>>> WSO2 Inc. (http://wso2.com)
>>>> lean.enterprise.middleware
>>>
>>> --
>>> *CD Athuraliya*
>>> Software Engineer
>>> WSO2, Inc.
>>> lean . enterprise . middleware
>>> Mobile: +94 716288847
>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>> <https://twitter.com/cdathuraliya> | Blog
>>> <https://cdathuraliya.wordpress.com/>
>>
>> --
>>
>> Thanks & regards,
>> Nirmal
>>
>> Team Lead - WSO2 Machine Learner
>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>> Mobile: +94715779733
>> Blog: http://nirmalfdo.blogspot.com/
>
> --
> Vidura Gamini Abhaya, Ph.D.
> Director of Engineering
> M:+94 77 034 7754
> E: [email protected]
>
> WSO2 Inc. (http://wso2.com)
> lean.enterprise.middleware

--

Thanks & regards,
Nirmal

Team Lead - WSO2 Machine Learner
Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/
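
On the question raised earlier in the thread (whether PMML support can be detected at runtime through the inheritance hierarchy), here is a minimal sketch of how such a check might look. It assumes exportable models share a common marker type; Spark MLlib appears to provide one as the `org.apache.spark.mllib.pmml.PMMLExportable` trait, but that should be verified against the Spark release in use. To keep the sketch runnable without Spark on the classpath, `PmmlExportable`, `KMeansLikeModel`, `DecisionTreeLikeModel` and `PmmlExportCheck` are all hypothetical stand-ins, not WSO2 ML or Spark classes.

```java
// Hypothetical stand-in for a marker type such as Spark's PMMLExportable trait,
// declared locally so the sketch compiles without Spark.
interface PmmlExportable {
    String toPMML();
}

// Hypothetical model that supports PMML export.
class KMeansLikeModel implements PmmlExportable {
    @Override
    public String toPMML() {
        // Placeholder document, not real Spark output.
        return "<PMML><ClusteringModel/></PMML>";
    }
}

// Hypothetical model with no PMML export available.
class DecisionTreeLikeModel {
}

class PmmlExportCheck {
    // True when the model's type hierarchy includes the export marker type.
    static boolean isExportable(Object model) {
        return model instanceof PmmlExportable;
    }

    // Exports on demand from an already built model; rejects unsupported ones.
    static String exportOnDemand(Object model) {
        if (!isExportable(model)) {
            throw new IllegalArgumentException(
                "PMML export is not supported for " + model.getClass().getSimpleName());
        }
        return ((PmmlExportable) model).toPMML();
    }

    public static void main(String[] args) {
        Object[] models = { new KMeansLikeModel(), new DecisionTreeLikeModel() };
        for (Object m : models) {
            System.out.println(m.getClass().getSimpleName()
                + " exportable: " + isExportable(m));
        }
    }
}
```

A check like `isExportable` could drive whether the UI shows the export option for a given model, while `exportOnDemand` reflects the preference voiced in the thread for exporting an already built model without re-running the whole workflow.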
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
