Hi,

The feature is currently in working condition and has gone through a code review. The current implementation has a Download button; clicking it presents a dialog with two options, "Serialized" and "PMML". If "PMML" is clicked for an unsupported model type, the message "PMML download not supported for this model type" is shown to the user.
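For context, the gating behind that dialog could be sketched roughly as below. This is a hypothetical illustration, not the actual WSO2 ML code: the class, method, and algorithm-name strings are invented for the example, and the supported set mirrors the Spark 1.5.1 table quoted later in this thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch only: names are illustrative, not the actual WSO2 ML code.
public class PmmlDownloadGate {

    // Algorithms whose Spark 1.5.1 models can be exported to PMML
    // (per the table quoted later in this thread).
    private static final Set<String> PMML_SUPPORTED = new HashSet<>(Arrays.asList(
            "K_MEANS", "LINEAR_REGRESSION", "RIDGE_REGRESSION",
            "LASSO_REGRESSION", "SVM", "LOGISTIC_REGRESSION"));

    /** Decides whether the "PMML" option should be enabled for a model type. */
    public static boolean isPmmlSupported(String algorithmName) {
        return PMML_SUPPORTED.contains(algorithmName);
    }

    /** Message shown when the user requests PMML for an unsupported type. */
    public static String downloadMessage(String algorithmName) {
        return isPmmlSupported(algorithmName)
                ? "OK"
                : "PMML download not supported for this model type";
    }
}
```

Wherever the supported list ends up living (code, or a config file such as machine-learner.xml), keeping it in one place means only one spot to delete once Spark covers all models.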
However, there was a suggestion (to be discussed) to list the currently supported model types in machine-learner.xml and, via a new API that reads this XML, disable the "PMML" download option for unsupported model types. This could work as a temporary solution, but since Spark will eventually support all models, it would become redundant. Any thoughts on the best way to go?

On Tue, Oct 13, 2015 at 7:47 PM, Fazlan Nazeem <[email protected]> wrote:

> Hi all,
>
> After the discussion, the following points are to be incorporated into the
> design.
>
> - The "download" button in the model page will provide the user two
> options, namely "Download as serialized object" and "Download as PMML"
> - "Download as PMML" will trigger a REST API call to a newly
> implemented API which will handle this scenario
> - Enable or disable the "Download as PMML" option in the UI depending on
> whether the model can be exported as PMML (we will try to achieve this; if
> not possible, an appropriate message will be displayed for unsupported models)
> - The feature will only use the already built serialized models and
> convert them into PMML (not going through the complete ML workflow)
>
> Please add if I have missed anything.
>
> On Mon, Oct 12, 2015 at 3:16 PM, Nirmal Fernando <[email protected]> wrote:
>
>> Thanks Vidura!
>>
>> On Mon, Oct 12, 2015 at 3:10 PM, Vidura Gamini Abhaya <[email protected]>
>> wrote:
>>
>>> Hi Nirmal,
>>>
>>> As CD pointed out, it was the name of the REST API for me as well.
>>>
>>> Thanks and Regards,
>>>
>>> Vidura
>>>
>>> On 12 October 2015 at 14:02, Nirmal Fernando <[email protected]> wrote:
>>>
>>>> Hi CD/Vidura,
>>>>
>>>> On Mon, Oct 12, 2015 at 1:56 PM, CD Athuraliya <[email protected]>
>>>> wrote:
>>>>
>>>>> On Mon, Oct 12, 2015 at 12:36 PM, Vidura Gamini Abhaya <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi Fazlan,
>>>>>>
>>>>>> Please see my comments inline in blue.
>>>>>>
>>>>>>> No, I am not planning to build the model from scratch. Once the
>>>>>>> serialized Spark model is built, we can export it to PMML format. In
>>>>>>> other words, we are using the serialized model in order to build the
>>>>>>> PMML model.
>>>>>>
>>>>>> That's great.
>>>>>>
>>>>>>> If I am not mistaken, what you are suggesting is to let the user go
>>>>>>> through the normal workflow of model building and, once it is done,
>>>>>>> give the user an option to export it to PMML format (also for the
>>>>>>> models that have already been built)?
>>>>>
>>>>> Yes, exactly! What we should not do, IMO, is ask the user to go
>>>>> through the whole workflow if he needs to export an already created
>>>>> model in PMML.
>>>>
>>>> Can you please explain where you got this idea? If this idea is in
>>>> Fazlan's content, we need to fix it.
>>>>
>>>>>> Yes, this is exactly what I meant.
>>>>>>
>>>>>>> @Vidura I will check on the run-time support; if that is possible,
>>>>>>> that would be great.
>>>>>>
>>>>>> If it's supported, it'll be great. If not, we can still do it based on
>>>>>> the model type, but I think it'll be a bit messy as the code wouldn't
>>>>>> be as generic.
>>>>>>
>>>>>> Thanks and Regards,
>>>>>>
>>>>>> Vidura
>>>>>>
>>>>>>> On Mon, Oct 12, 2015 at 12:10 PM, Vidura Gamini Abhaya <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Fazlan,
>>>>>>>>
>>>>>>>> Are you planning to build a PMML model from scratch (i.e. going
>>>>>>>> through the entire flow to build an ML model), or is this to be used
>>>>>>>> for exporting a PMML out of an already built model?
>>>>>>>>
>>>>>>>> If it's the former, +1 to what CD mentioned on not asking the user
>>>>>>>> to go through the entire ML workflow for PMML.
My preference is also for
>>>>>>>> saving/exporting a model in PMML to be an option for the user, once
>>>>>>>> a model is built, and for models that have already been built.
>>>>>>>>
>>>>>>>> @Fazlan - Can we find out whether the PMML export is possible at
>>>>>>>> runtime, through a method or through the inheritance hierarchy? If
>>>>>>>> so, we could make the export option visible in the UI only for
>>>>>>>> supported models.
>>>>>>>>
>>>>>>>> Thanks and Regards,
>>>>>>>>
>>>>>>>> Vidura
>>>>>>>>
>>>>>>>> On 12 October 2015 at 11:33, CD Athuraliya <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I feel that asking the user to go through the complete ML workflow
>>>>>>>>> for PMML is too demanding. Computationally, this conversion should
>>>>>>>>> be less expensive compared to model training in real-world use
>>>>>>>>> cases (since it's a mapping of model parameters from Java objects
>>>>>>>>> to XML, AFAIK). And model training should be independent of the
>>>>>>>>> model format. Instead, can't we support this conversion on demand?
>>>>>>>>> Or save in both formats for now? Once Spark starts supporting PMML
>>>>>>>>> for all algorithms, we can go for Method 1 if it looks consistent
>>>>>>>>> throughout our ML life cycle.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> On Mon, Oct 12, 2015 at 11:09 AM, Fazlan Nazeem <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am working on redmine[1] regarding PMML support for Machine
>>>>>>>>>> Learner. Please provide your opinion on this design.
>>>>>>>>>> [1] https://redmine.wso2.com/issues/4303
>>>>>>>>>>
>>>>>>>>>> *Overview*
>>>>>>>>>>
>>>>>>>>>> Spark 1.5.1 (latest version) supports PMML model export for some
>>>>>>>>>> of the available models in Spark through MLlib.
>>>>>>>>>>
>>>>>>>>>> The table below outlines the MLlib models that can be exported to
>>>>>>>>>> PMML and their equivalent PMML model.
>>>>>>>>>>
>>>>>>>>>> MLlib model                    | PMML model
>>>>>>>>>> -------------------------------|-----------------------------------
>>>>>>>>>> KMeansModel                    | ClusteringModel
>>>>>>>>>> LinearRegressionModel          | RegressionModel (functionName="regression")
>>>>>>>>>> RidgeRegressionModel           | RegressionModel (functionName="regression")
>>>>>>>>>> LassoModel                     | RegressionModel (functionName="regression")
>>>>>>>>>> SVMModel                       | RegressionModel (functionName="classification" normalizationMethod="none")
>>>>>>>>>> Binary LogisticRegressionModel | RegressionModel (functionName="classification" normalizationMethod="logit")
>>>>>>>>>>
>>>>>>>>>> Not all models available in MLlib can be exported to PMML as of now.
>>>>>>>>>>
>>>>>>>>>> *Goal*
>>>>>>>>>>
>>>>>>>>>> 1. We need to save models generated by WSO2 ML (PMML-supported
>>>>>>>>>> models) in PMML format, so that those could be reused from
>>>>>>>>>> PMML-supporting tools.
>>>>>>>>>>
>>>>>>>>>> *How To*
>>>>>>>>>>
>>>>>>>>>> If "clusters" is the trained model, we can do the following with
>>>>>>>>>> the PMML support.
>>>>>>>>>>
>>>>>>>>>> // Export the model to a String in PMML format
>>>>>>>>>> clusters.toPMML
>>>>>>>>>>
>>>>>>>>>> // Export the model to a local file in PMML format
>>>>>>>>>> clusters.toPMML("/tmp/kmeans.xml")
>>>>>>>>>>
>>>>>>>>>> // Export the model to a directory on a distributed file system
>>>>>>>>>> // in PMML format
>>>>>>>>>> clusters.toPMML(sc, "/tmp/kmeans")
>>>>>>>>>>
>>>>>>>>>> // Export the model to the OutputStream in PMML format
>>>>>>>>>> clusters.toPMML(System.out)
>>>>>>>>>>
>>>>>>>>>> For unsupported models, either you will not find a .toPMML method
>>>>>>>>>> or an IllegalArgumentException will be thrown.
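On the runtime-detection question raised earlier in the thread: in Spark 1.4+ the exportable MLlib models mix in the `org.apache.spark.mllib.pmml.PMMLExportable` trait, so the check can be an `instanceof` test rather than a per-type list. A self-contained sketch of that pattern follows; it uses a local stand-in interface and dummy model classes instead of real Spark types, so those names are illustrative only.

```java
// Sketch of a runtime capability check. Spark's exportable models mix in the
// org.apache.spark.mllib.pmml.PMMLExportable trait (Spark 1.4+); a local
// stand-in interface and dummy model classes keep this example self-contained
// without a Spark dependency.
public class PmmlRuntimeCheck {

    /** Stand-in for org.apache.spark.mllib.pmml.PMMLExportable. */
    interface PMMLExportable {
        String toPMML();
    }

    /** Dummy model with PMML support, playing the role of e.g. KMeansModel. */
    static class SupportedModel implements PMMLExportable {
        public String toPMML() { return "<PMML/>"; }
    }

    /** Dummy model without PMML support, e.g. a random forest model. */
    static class UnsupportedModel { }

    /** The UI could enable "Download as PMML" based on this check. */
    public static boolean canExportToPMML(Object model) {
        return model instanceof PMMLExportable;
    }
}
```

Since the check works on the deserialized model object itself, it stays correct automatically as future Spark versions add PMML support to more model types.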
>>>>>>>>>> *Design*
>>>>>>>>>>
>>>>>>>>>> In the following diagram, models highlighted in green can be
>>>>>>>>>> exported to PMML, but not the models highlighted in red. The
>>>>>>>>>> diagram illustrates the algorithms supported by WSO2 Machine
>>>>>>>>>> Learner.
>>>>>>>>>>
>>>>>>>>>> [image: Inline image 2]
>>>>>>>>>>
>>>>>>>>>> *Method 1*
>>>>>>>>>>
>>>>>>>>>> By default, save the models in PMML if PMML export is supported,
>>>>>>>>>> using one of these supported options.
>>>>>>>>>>
>>>>>>>>>> 1. Export the model to a String in PMML format
>>>>>>>>>> 2. Export the model to a local file in PMML format
>>>>>>>>>> 3. Export the model to a directory on a distributed file system
>>>>>>>>>> in PMML format
>>>>>>>>>> 4. Export the model to the OutputStream in PMML format
>>>>>>>>>>
>>>>>>>>>> Classes that need to be modified (apart from the UI):
>>>>>>>>>>
>>>>>>>>>> - SupervisedSparkModelBuilder
>>>>>>>>>> - UnsupervisedSparkModelBuilder
>>>>>>>>>>
>>>>>>>>>> e.g.
>>>>>>>>>>
>>>>>>>>>> [image: Inline image 1]
>>>>>>>>>>
>>>>>>>>>> As of now, the serialized models are saved in the "models" folder.
>>>>>>>>>> The PMML models can also be saved in the same directory with a
>>>>>>>>>> PMML suffix.
>>>>>>>>>>
>>>>>>>>>> Optional:
>>>>>>>>>>
>>>>>>>>>> After the model is generated, let the user export the PMML model
>>>>>>>>>> to a chosen location through the UI.
>>>>>>>>>>
>>>>>>>>>> *Method 2*
>>>>>>>>>>
>>>>>>>>>> Add a *new REST API* to build models with PMML:
>>>>>>>>>>
>>>>>>>>>> public Response buildPMMLModel(@PathParam("modelId") long modelId)
>>>>>>>>>>
>>>>>>>>>> In the backend, we could add an additional argument to the
>>>>>>>>>> "buildXModel" methods to decide whether to save the PMML model or
>>>>>>>>>> not.
>>>>>>>>>>
>>>>>>>>>> UI modifications are also needed (an option for the user to choose
>>>>>>>>>> whether to build the PMML model, and to choose the path to save
>>>>>>>>>> it).
>>>>>>>>>>
>>>>>>>>>> Identified classes that need to be modified (apart from the UI):
>>>>>>>>>>
>>>>>>>>>> - SupervisedSparkModelBuilder
>>>>>>>>>> - UnsupervisedSparkModelBuilder
>>>>>>>>>> - ModelApiV10
>>>>>>>>>>
>>>>>>>>>> *Conclusion*
>>>>>>>>>>
>>>>>>>>>> Currently we have decided to go with "Method 2" for the following
>>>>>>>>>> reasons.
>>>>>>>>>>
>>>>>>>>>> - Not all models have PMML support in Spark.
>>>>>>>>>> - If we are to use anything apart from Spark MLlib, such as H2O,
>>>>>>>>>> we will be depending on PMML support from H2O.
>>>>>>>>>> - With Method 1, we might be generating PMML models when users are
>>>>>>>>>> not in need of them (useless computation).
>>>>>>>>>>
>>>>>>>>>> Please let me know if there is a better way to improve the design.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Thanks & Regards,
>>>>>>>>>>
>>>>>>>>>> Fazlan Nazeem
>>>>>>>>>>
>>>>>>>>>> *Software Engineer*
>>>>>>>>>>
>>>>>>>>>> *WSO2 Inc*
>>>>>>>>>> Mobile : +94772338839
>>>>>>>>>> [email protected]
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> *CD Athuraliya*
>>>>>>>>> Software Engineer
>>>>>>>>> WSO2, Inc.
>>>>>>>>> lean . enterprise . middleware
>>>>>>>>> Mobile: +94 716288847
>>>>>>>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>>>>>>>> <https://twitter.com/cdathuraliya> | Blog
>>>>>>>>> <https://cdathuraliya.wordpress.com/>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Vidura Gamini Abhaya, Ph.D.
>>>>>>>> Director of Engineering
>>>>>>>> M: +94 77 034 7754
>>>>>>>> E: [email protected]
>>>>>>>>
>>>>>>>> WSO2 Inc.
(http://wso2.com)
>>>>>>>> lean.enterprise.middleware
>>>>
>>>> --
>>>> Thanks & regards,
>>>> Nirmal
>>>>
>>>> Team Lead - WSO2 Machine Learner
>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>> Mobile: +94715779733
>>>> Blog: http://nirmalfdo.blogspot.com/

--
Thanks & Regards,

Fazlan Nazeem

*Software Engineer*

*WSO2 Inc*
Mobile : +94772338839
[email protected]
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
