Excellent! Good catch! Fazlan, please fix. exportAsPMMLModel, maybe?
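
To make the naming point concrete, here is a minimal Java sketch. It assumes a hypothetical in-memory store of already built models; `ExportNamingSketch`, `StoredModel`, `MODELS` and the `pmml` field are illustrative names, not WSO2 ML classes. It only shows why an export-style name fits: the call reads a model that already exists and never trains anything.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

public class ExportNamingSketch {

    // Hypothetical stand-in for a model that has already been trained and stored.
    static final class StoredModel {
        final String name;
        final String pmml; // assumed: PMML text derived from the serialized model

        StoredModel(String name, String pmml) {
            this.name = name;
            this.pmml = pmml;
        }
    }

    // Hypothetical model store keyed by model ID.
    static final Map<Long, StoredModel> MODELS = new HashMap<>();

    static {
        MODELS.put(1L, new StoredModel("kmeans-demo", "<PMML/>"));
    }

    // Export-style operation: looks up an existing model, no training involved.
    static String exportAsPMMLModel(long modelId) {
        StoredModel model = MODELS.get(modelId);
        if (model == null) {
            throw new NoSuchElementException("No built model with ID " + modelId);
        }
        return model.pmml;
    }

    public static void main(String[] args) {
        System.out.println(exportAsPMMLModel(1L));
    }
}
```

An unknown model ID fails fast here instead of triggering a build, which is the behaviour the "export" name suggests.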

On Mon, Oct 12, 2015 at 2:10 PM, CD Athuraliya <[email protected]> wrote:

> To me buildPMMLModel(long modelId) sounds more like we are building (or
> training) the model. ExportPMML or something similar would sound more
> like the actual action IMO.
>
> On Mon, Oct 12, 2015 at 2:02 PM, Nirmal Fernando <[email protected]> wrote:
>
>> Hi CD/Vidura,
>>
>> On Mon, Oct 12, 2015 at 1:56 PM, CD Athuraliya <[email protected]>
>> wrote:
>>
>>>
>>>
>>> On Mon, Oct 12, 2015 at 12:36 PM, Vidura Gamini Abhaya <[email protected]>
>>> wrote:
>>>
>>>> Hi Fazlan,
>>>>
>>>> Please see my comments inline in blue.
>>>>
>>>>>
>>>>> No I am not planning to build the model from scratch. Once the
>>>>> serialized spark model is built, we can export it to PMML format. In other
>>>>> words, we are using the serialized model in order to build the PMML model.
>>>>>
>>>>
>>>> That's great.
>>>>
>>>>> If I am not mistaken, what you are suggesting is to let the user go
>>>>> through the normal workflow of model building and, once it is done, give
>>>>> the user an option to export it to PMML format (also for models that
>>>>> have already been built)?
>>>>>
>>>>
>>> Yes, exactly! What we should not do, IMO, is ask the user to go through
>>> the whole workflow if they need to export an already created model in PMML.
>>>
>>
>> Can you please explain where you got this idea from? If this idea is
>> there in Fazlan's content, we need to fix it.
>>
>>
>>>> Yes, this is exactly what I meant.
>>>>
>>>>
>>>>> @Vidura I will check on the run-time support; if that is possible, that
>>>>> would be great.
>>>>>
>>>>
>>>> If it's supported, it'll be great. If not we can still do it based on
>>>> the model type but I think it'll be a bit messy as the code wouldn't be as
>>>> generic.
>>>>
>>>>
>>>> Thanks and Regards,
>>>>
>>>> Vidura
>>>>
>>>>
>>>>
>>>>>
>>>>> On Mon, Oct 12, 2015 at 12:10 PM, Vidura Gamini Abhaya <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi Fazlan,
>>>>>>
>>>>>> Are you planning to build a PMML model from scratch (i.e. going
>>>>>> through the entire flow to build an ML model) or is this to be used for
>>>>>> exporting a PMML out of an already built model?
>>>>>>
>>>>>> If it's the former, +1 to what CD mentioned on not asking the user to go
>>>>>> through the entire ML workflow for PMML. My preference is also for
>>>>>> saving/exporting a model in PMML to be an option for the user, once a
>>>>>> model is built and for models that have already been built.
>>>>>>
>>>>>> @Fazlan - Can we find out whether the PMML export is possible at
>>>>>> runtime through a method or through the inheritance hierarchy? If so, we
>>>>>> could make the export option visible on the UI only for supported
>>>>>> models.
>>>>>>
>>>>>> Thanks and Regards,
>>>>>>
>>>>>> Vidura
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 12 October 2015 at 11:33, CD Athuraliya <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I feel that asking the user to go through the complete ML workflow for
>>>>>>> PMML is too demanding. Computationally, this conversion should be less
>>>>>>> expensive than model training in real-world use cases (since it's a
>>>>>>> mapping of model parameters from Java objects to XML, AFAIK). And model
>>>>>>> training should be independent of the model format. Instead, can't we
>>>>>>> support this conversion on demand? Or save in both formats for now? Once
>>>>>>> Spark starts supporting PMML for all algorithms, we can go for Method 1
>>>>>>> if it looks consistent throughout our ML life cycle.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> On Mon, Oct 12, 2015 at 11:09 AM, Fazlan Nazeem <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am working on redmine[1] regarding PMML support for Machine
>>>>>>>> Learner. Please provide your opinion on this design.
>>>>>>>> [1]https://redmine.wso2.com/issues/4303
>>>>>>>>
>>>>>>>> *Overview*
>>>>>>>>
>>>>>>>> Spark 1.5.1 (latest version) supports PMML model export for some of
>>>>>>>> the available models in Spark through MLlib.
>>>>>>>>
>>>>>>>> The table below outlines the MLlib models that can be exported to
>>>>>>>> PMML and their equivalent PMML model.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> MLlib model                    | PMML model
>>>>>>>> -------------------------------+------------------------------------------------
>>>>>>>> KMeansModel                    | ClusteringModel
>>>>>>>> LinearRegressionModel          | RegressionModel (functionName="regression")
>>>>>>>> RidgeRegressionModel           | RegressionModel (functionName="regression")
>>>>>>>> LassoModel                     | RegressionModel (functionName="regression")
>>>>>>>> SVMModel                       | RegressionModel (functionName="classification"
>>>>>>>>                                | normalizationMethod="none")
>>>>>>>> Binary LogisticRegressionModel | RegressionModel (functionName="classification"
>>>>>>>>                                | normalizationMethod="logit")
>>>>>>>>
>>>>>>>> Not all models available in MLlib can be exported to PMML as of now.
>>>>>>>>
>>>>>>>> Goal
>>>>>>>>
>>>>>>>>    1. We need to save models generated by WSO2 ML (PMML-supported
>>>>>>>>    models) in PMML format, so that they can be reused from tools that
>>>>>>>>    support PMML.
>>>>>>>>
>>>>>>>> How To
>>>>>>>>
>>>>>>>> If “clusters” is the trained model, we can do the following with
>>>>>>>> the PMML support.
>>>>>>>>
>>>>>>>> // Export the model to a String in PMML format
>>>>>>>> clusters.toPMML
>>>>>>>>
>>>>>>>> // Export the model to a local file in PMML format
>>>>>>>> clusters.toPMML("/tmp/kmeans.xml")
>>>>>>>>
>>>>>>>> // Export the model to a directory on a distributed file system in
>>>>>>>> // PMML format
>>>>>>>> clusters.toPMML(sc, "/tmp/kmeans")
>>>>>>>>
>>>>>>>> // Export the model to the OutputStream in PMML format
>>>>>>>> clusters.toPMML(System.out)
>>>>>>>>
>>>>>>>> For unsupported models, either you will not find a .toPMML method
>>>>>>>> or an IllegalArgumentException will be thrown.
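
A minimal Java sketch of handling both cases above. It assumes a hypothetical `PmmlExportable` interface standing in for the `toPMML` capability (in Spark MLlib this comes from the `org.apache.spark.mllib.pmml.PMMLExportable` trait, which the sketch deliberately does not depend on). `PmmlSupportSketch`, `tryExport` and the two model classes are illustrative names only.

```java
import java.util.Optional;

public class PmmlSupportSketch {

    // Hypothetical stand-in for the capability of exporting a model to PMML.
    interface PmmlExportable {
        String toPMML();
    }

    // Model type that supports export.
    static final class ClusterModel implements PmmlExportable {
        @Override
        public String toPMML() {
            return "<PMML><ClusteringModel/></PMML>";
        }
    }

    // Model type that exposes toPMML but rejects its current configuration,
    // mirroring the IllegalArgumentException case.
    static final class MulticlassModel implements PmmlExportable {
        @Override
        public String toPMML() {
            throw new IllegalArgumentException("PMML export not supported for this model");
        }
    }

    // Returns the PMML text, or empty when the model cannot be exported.
    static Optional<String> tryExport(Object model) {
        if (!(model instanceof PmmlExportable)) {
            return Optional.empty(); // no toPMML method at all
        }
        try {
            return Optional.of(((PmmlExportable) model).toPMML());
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // method exists but export is rejected
        }
    }

    public static void main(String[] args) {
        System.out.println(tryExport(new ClusterModel()).isPresent());
        System.out.println(tryExport(new MulticlassModel()).isPresent());
        System.out.println(tryExport("not a model").isPresent());
    }
}
```

A check of this shape could also drive the UI, so that the export option is shown only for models that can actually be exported.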
>>>>>>>>
>>>>>>>> Design
>>>>>>>>
>>>>>>>> In the following diagram models highlighted in green can be
>>>>>>>> exported to PMML, but not the models highlighted in red. The diagram
>>>>>>>> illustrates algorithms supported by WSO2 Machine Learner.
>>>>>>>>
>>>>>>>> [image: Inline image 2]
>>>>>>>>
>>>>>>>> Method 1
>>>>>>>>
>>>>>>>> By default save the models in PMML if PMML export is supported,
>>>>>>>> using one of these supported options.
>>>>>>>>
>>>>>>>> 1.  Export the model to a String in PMML format
>>>>>>>> 2.  Export the model to a local file in PMML format
>>>>>>>> 3.  Export the model to a directory on a distributed file system in
>>>>>>>> PMML format
>>>>>>>> 4.  Export the model to the OutputStream in PMML format
>>>>>>>>
>>>>>>>> Classes that need to be modified (apart from the UI)
>>>>>>>>
>>>>>>>>    - SupervisedSparkModelBuilder
>>>>>>>>    - UnsupervisedSparkModelBuilder
>>>>>>>>
>>>>>>>>
>>>>>>>> e.g
>>>>>>>>
>>>>>>>> [image: Inline image 1]
>>>>>>>>
>>>>>>>> As of now, the serialized models are saved in the “models” folder. The
>>>>>>>> PMML models can also be saved in the same directory with a PMML suffix.
>>>>>>>>
>>>>>>>> optional:
>>>>>>>>
>>>>>>>> After the model is generated let the user export the PMML model to
>>>>>>>> a chosen location through the UI.
>>>>>>>>
>>>>>>>> Method 2
>>>>>>>>
>>>>>>>> Add a *new REST API* to build models with PMML
>>>>>>>>
>>>>>>>> public Response buildPMMLModel(@PathParam("modelId") long modelId)
>>>>>>>>
>>>>>>>> In the backend, we could add an additional argument to the
>>>>>>>> "buildXModel" methods to decide whether to save the PMML model or not.
>>>>>>>>
>>>>>>>> UI modifications are also needed (an option for the user to choose
>>>>>>>> whether to build the PMML and to choose the path to save it).
>>>>>>>>
>>>>>>>> Identified classes that need to be modified (apart from the UI)
>>>>>>>>
>>>>>>>>    - SupervisedSparkModelBuilder
>>>>>>>>    - UnsupervisedSparkModelBuilder
>>>>>>>>    - ModelApiV10
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *Conclusion*
>>>>>>>>
>>>>>>>> Currently we have decided to go with "Method 2" for the
>>>>>>>> following reasons.
>>>>>>>>
>>>>>>>>    - Not all models have PMML support in Spark.
>>>>>>>>    - If we are to use anything apart from Spark MLlib, such as
>>>>>>>>    H2O, we will be depending on PMML support from H2O.
>>>>>>>>    - With Method 1 we might be generating PMML models when users
>>>>>>>>    are not in need of it (useless computation).
>>>>>>>>
>>>>>>>>  Please let me know if there is a better way to improve the design.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Thanks & Regards,
>>>>>>>>
>>>>>>>> Fazlan Nazeem
>>>>>>>>
>>>>>>>> *Software Engineer*
>>>>>>>>
>>>>>>>> *WSO2 Inc*
>>>>>>>> Mobile : +94772338839
>>>>>>>> [email protected]
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> *CD Athuraliya*
>>>>>>> Software Engineer
>>>>>>> WSO2, Inc.
>>>>>>> lean . enterprise . middleware
>>>>>>> Mobile: +94 716288847 <94716288847>
>>>>>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>>>>>> <https://twitter.com/cdathuraliya> | Blog
>>>>>>> <https://cdathuraliya.wordpress.com/>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Vidura Gamini Abhaya, Ph.D.
>>>>>> Director of Engineering
>>>>>> M:+94 77 034 7754
>>>>>> E: [email protected]
>>>>>>
>>>>>> WSO2 Inc. (http://wso2.com)
>>>>>> lean.enterprise.middleware
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks & Regards,
>>>>>
>>>>> Fazlan Nazeem
>>>>>
>>>>> *Software Engineer*
>>>>>
>>>>> *WSO2 Inc*
>>>>> Mobile : +94772338839
>>>>> [email protected]
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Vidura Gamini Abhaya, Ph.D.
>>>> Director of Engineering
>>>> M:+94 77 034 7754
>>>> E: [email protected]
>>>>
>>>> WSO2 Inc. (http://wso2.com)
>>>> lean.enterprise.middleware
>>>>
>>>
>>>
>>>
>>> --
>>> *CD Athuraliya*
>>> Software Engineer
>>> WSO2, Inc.
>>> lean . enterprise . middleware
>>> Mobile: +94 716288847 <94716288847>
>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>> <https://twitter.com/cdathuraliya> | Blog
>>> <https://cdathuraliya.wordpress.com/>
>>>
>>
>>
>>
>> --
>>
>> Thanks & regards,
>> Nirmal
>>
>> Team Lead - WSO2 Machine Learner
>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>> Mobile: +94715779733
>> Blog: http://nirmalfdo.blogspot.com/
>>
>>
>>
>
>
> --
> *CD Athuraliya*
> Software Engineer
> WSO2, Inc.
> lean . enterprise . middleware
> Mobile: +94 716288847 <94716288847>
> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
> <https://twitter.com/cdathuraliya> | Blog
> <https://cdathuraliya.wordpress.com/>
>



-- 

Thanks & regards,
Nirmal

Team Lead - WSO2 Machine Learner
Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
