Sure Nirmal.

Thanks CD for pointing it out!
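
On the naming, here is a minimal sketch of what an export-style operation could look like. It only looks up a model that was built earlier and converts it, with no training involved. Everything in it is an assumption for illustration: the class ExportNamingSketch, the PmmlCapable interface, and the in-memory map are hypothetical stand-ins, not the actual WSO2 ML or Spark classes. The method name exportAsPMMLModel is the one suggested in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: every type here is a hypothetical stand-in, not WSO2 ML or Spark code.
class ExportNamingSketch {

    /** Stand-in for a trained model that can serialize itself to PMML. */
    interface PmmlCapable {
        String toPMML();
    }

    /** Stand-in for the store of already-built models, keyed by model id. */
    private final Map<Long, Object> builtModels = new HashMap<>();

    void register(long modelId, Object model) {
        builtModels.put(modelId, model);
    }

    /**
     * Exports an existing model as PMML. Nothing is trained here: the method
     * only looks up a model that was built earlier and converts it.
     */
    String exportAsPMMLModel(long modelId) {
        Object model = builtModels.get(modelId);
        if (model == null) {
            throw new IllegalArgumentException("No built model with id " + modelId);
        }
        if (!(model instanceof PmmlCapable)) {
            throw new UnsupportedOperationException(
                    "PMML export is not supported for model " + modelId);
        }
        return ((PmmlCapable) model).toPMML();
    }

    public static void main(String[] args) {
        ExportNamingSketch service = new ExportNamingSketch();
        // A lambda stands in for a PMML-capable model built earlier.
        service.register(1L, (PmmlCapable) () -> "<PMML/>");
        System.out.println(service.exportAsPMMLModel(1L));
    }
}
```

With a name like this, the signature itself says that the call converts an existing model rather than building one.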

On Mon, Oct 12, 2015 at 2:14 PM, Nirmal Fernando <[email protected]> wrote:

> Excellent! Good catch! Fazlan, please fix. exportAsPMMLModel, maybe?
>
> On Mon, Oct 12, 2015 at 2:10 PM, CD Athuraliya <[email protected]>
> wrote:
>
>> To me buildPMMLModel(long modelId) sounds more like we are building (or
>> training) the model. ExportPMML or something similar would sound more
>> like the actual action IMO.
>>
>> On Mon, Oct 12, 2015 at 2:02 PM, Nirmal Fernando <[email protected]> wrote:
>>
>>> Hi CD/Vidura,
>>>
>>> On Mon, Oct 12, 2015 at 1:56 PM, CD Athuraliya <[email protected]>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Oct 12, 2015 at 12:36 PM, Vidura Gamini Abhaya <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Fazlan,
>>>>>
>>>>> Please see my comments inline in blue.
>>>>>
>>>>>>
>>>>>> No, I am not planning to build the model from scratch. Once the
>>>>>> serialized Spark model is built, we can export it to PMML format. In
>>>>>> other words, we are using the serialized model in order to build the
>>>>>> PMML model.
>>>>>>
>>>>>
>>>>> That's great.
>>>>>
>>>>>> If I am not mistaken, what you are suggesting is to let the user go
>>>>>> through the normal workflow of model building and, once it is done, give
>>>>>> the user an option to export it to PMML format (also for models that
>>>>>> have already been built)?
>>>>>>
>>>>>
>>>> Yes, exactly! What we should not do, IMO, is ask the user to go through
>>>> the whole workflow if they need to export an already created model in PMML.
>>>>
>>>
>>> Can you please explain where you got this idea? If this idea is
>>> there in Fazlan's content, we need to fix it.
>>>
>>>
>>>>> Yes, this is exactly what I meant.
>>>>>
>>>>>
>>>>>> @Vidura I will check on the run-time support, if that is possible
>>>>>> that would be great.
>>>>>>
>>>>>
>>>>> If it's supported, it'll be great. If not we can still do it based on
>>>>> the model type but I think it'll be a bit messy as the code wouldn't be as
>>>>> generic.
>>>>>
>>>>>
>>>>> Thanks and Regards,
>>>>>
>>>>> Vidura
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> On Mon, Oct 12, 2015 at 12:10 PM, Vidura Gamini Abhaya <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Fazlan,
>>>>>>>
>>>>>>> Are you planning to build a PMML model from scratch (i.e., going
>>>>>>> through the entire flow to build an ML model), or is this to be used for
>>>>>>> exporting PMML out of an already built model?
>>>>>>>
>>>>>>> If it's the former, +1 to what CD mentioned on not asking the user to go
>>>>>>> through the entire ML workflow for PMML. My preference is also for
>>>>>>> saving/exporting a model in PMML to be an option for the user, both once
>>>>>>> a model is built and for models that have already been built.
>>>>>>>
>>>>>>> @Fazlan - Can we find out whether the PMML export is possible at
>>>>>>> runtime through a method or through the inheritance hierarchy? If so, we
>>>>>>> could make the export option visible in the UI only for supported
>>>>>>> models.
>>>>>>>
>>>>>>> Thanks and Regards,
>>>>>>>
>>>>>>> Vidura
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 12 October 2015 at 11:33, CD Athuraliya <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I feel that asking the user to go through the complete ML workflow for
>>>>>>>> PMML is too demanding. Computationally, this conversion should be less
>>>>>>>> expensive than model training in real-world use cases (since it's a
>>>>>>>> mapping of model parameters from Java objects to XML, AFAIK). And model
>>>>>>>> training should be independent of the model format. Instead, can't we
>>>>>>>> support this conversion on demand? Or save in both formats for now?
>>>>>>>> Once Spark starts supporting PMML for all algorithms, we can go for
>>>>>>>> Method 1 if it looks consistent throughout our ML life cycle.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> On Mon, Oct 12, 2015 at 11:09 AM, Fazlan Nazeem <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am working on redmine[1] regarding PMML support for Machine
>>>>>>>>> Learner. Please provide your opinion on this design.
>>>>>>>>> [1]https://redmine.wso2.com/issues/4303
>>>>>>>>>
>>>>>>>>> *Overview*
>>>>>>>>>
>>>>>>>>> Spark 1.5.1 (the latest version) supports PMML model export for some
>>>>>>>>> of the available models in Spark through MLlib.
>>>>>>>>>
>>>>>>>>> The table below outlines the MLlib models that can be exported to
>>>>>>>>> PMML and their equivalent PMML model.
>>>>>>>>>
>>>>>>>>> MLlib model                    | PMML model
>>>>>>>>> -------------------------------+-----------------------------------------------
>>>>>>>>> KMeansModel                    | ClusteringModel
>>>>>>>>> LinearRegressionModel          | RegressionModel (functionName="regression")
>>>>>>>>> RidgeRegressionModel           | RegressionModel (functionName="regression")
>>>>>>>>> LassoModel                     | RegressionModel (functionName="regression")
>>>>>>>>> SVMModel                       | RegressionModel (functionName="classification"
>>>>>>>>>                                |   normalizationMethod="none")
>>>>>>>>> Binary LogisticRegressionModel | RegressionModel (functionName="classification"
>>>>>>>>>                                |   normalizationMethod="logit")
>>>>>>>>>
>>>>>>>>> Not all models available in MLlib can be exported to PMML as of
>>>>>>>>> now.
>>>>>>>>>
>>>>>>>>> *Goal*
>>>>>>>>>
>>>>>>>>>    1. We need to save models generated by WSO2 ML (PMML-supported
>>>>>>>>>    models) in PMML format, so that they can be reused from
>>>>>>>>>    PMML-supported tools.
>>>>>>>>>
>>>>>>>>> *How To*
>>>>>>>>>
>>>>>>>>> If “clusters” is the trained model, we can do the following with
>>>>>>>>> the PMML support.
>>>>>>>>>
>>>>>>>>> // Export the model to a String in PMML format
>>>>>>>>> clusters.toPMML
>>>>>>>>>
>>>>>>>>> // Export the model to a local file in PMML format
>>>>>>>>> clusters.toPMML("/tmp/kmeans.xml")
>>>>>>>>>
>>>>>>>>> // Export the model to a directory on a distributed file system in
>>>>>>>>> // PMML format
>>>>>>>>> clusters.toPMML(sc, "/tmp/kmeans")
>>>>>>>>>
>>>>>>>>> // Export the model to the OutputStream in PMML format
>>>>>>>>> clusters.toPMML(System.out)
>>>>>>>>>
>>>>>>>>> For unsupported models, either you will not find a .toPMML method
>>>>>>>>> or an IllegalArgumentException will be thrown.
>>>>>>>>>
>>>>>>>>> *Design*
>>>>>>>>>
>>>>>>>>> In the following diagram models highlighted in green can be
>>>>>>>>> exported to PMML, but not the models highlighted in red. The diagram
>>>>>>>>> illustrates algorithms supported by WSO2 Machine Learner.
>>>>>>>>>
>>>>>>>>> [image: Inline image 2]
>>>>>>>>>
>>>>>>>>> *Method 1*
>>>>>>>>>
>>>>>>>>> By default, save the models in PMML if PMML export is supported,
>>>>>>>>> using one of these supported options.
>>>>>>>>>
>>>>>>>>> 1. Export the model to a String in PMML format
>>>>>>>>> 2. Export the model to a local file in PMML format
>>>>>>>>> 3. Export the model to a directory on a distributed file system
>>>>>>>>>    in PMML format
>>>>>>>>> 4. Export the model to the OutputStream in PMML format
>>>>>>>>>
>>>>>>>>> Classes that need to be modified (apart from the UI):
>>>>>>>>>
>>>>>>>>>    - SupervisedSparkModelBuilder
>>>>>>>>>    - UnsupervisedSparkModelBuilder
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> e.g
>>>>>>>>>
>>>>>>>>> [image: Inline image 1]
>>>>>>>>>
>>>>>>>>> As of now, the serialized models are saved in the “models” folder. The
>>>>>>>>> PMML models can also be saved in the same directory with a PMML
>>>>>>>>> suffix.
>>>>>>>>>
>>>>>>>>> Optional:
>>>>>>>>>
>>>>>>>>> After the model is generated, let the user export the PMML model to
>>>>>>>>> a chosen location through the UI.
>>>>>>>>>
>>>>>>>>> *Method 2*
>>>>>>>>>
>>>>>>>>> Add a *new REST API* to build models with PMML:
>>>>>>>>>
>>>>>>>>> public Response buildPMMLModel(@PathParam("modelId") long modelId)
>>>>>>>>>
>>>>>>>>> In the backend, we could add an additional argument to the
>>>>>>>>> "buildXModel" methods to decide whether to save the PMML model or not.
>>>>>>>>>
>>>>>>>>> UI modifications are also needed (an option for the user to choose
>>>>>>>>> whether to build the PMML model and to choose the path to save it).
>>>>>>>>>
>>>>>>>>> Identified classes that need to be modified (apart from the UI):
>>>>>>>>>
>>>>>>>>>    - SupervisedSparkModelBuilder
>>>>>>>>>    - UnsupervisedSparkModelBuilder
>>>>>>>>>    - ModelApiV10
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *Conclusion*
>>>>>>>>>
>>>>>>>>> Currently, we have decided to go with "Method 2" for the
>>>>>>>>> following reasons.
>>>>>>>>>
>>>>>>>>>    - Not all models have PMML support in Spark.
>>>>>>>>>    - If we are to use anything apart from Spark MLlib, such as
>>>>>>>>>    H2O, we will be depending on PMML support from H2O.
>>>>>>>>>    - With Method 1, we might be generating PMML models when users
>>>>>>>>>    do not need them (wasted computation).
>>>>>>>>>
>>>>>>>>>  Please let me know if there is a better way to improve the design.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Thanks & Regards,
>>>>>>>>>
>>>>>>>>> Fazlan Nazeem
>>>>>>>>>
>>>>>>>>> *Software Engineer*
>>>>>>>>>
>>>>>>>>> *WSO2 Inc*
>>>>>>>>> Mobile : +94772338839
>>>>>>>>> [email protected]
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> *CD Athuraliya*
>>>>>>>> Software Engineer
>>>>>>>> WSO2, Inc.
>>>>>>>> lean . enterprise . middleware
>>>>>>>> Mobile: +94 716288847 <94716288847>
>>>>>>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>>>>>>> <https://twitter.com/cdathuraliya> | Blog
>>>>>>>> <https://cdathuraliya.wordpress.com/>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Vidura Gamini Abhaya, Ph.D.
>>>>>>> Director of Engineering
>>>>>>> M:+94 77 034 7754
>>>>>>> E: [email protected]
>>>>>>>
>>>>>>> WSO2 Inc. (http://wso2.com)
>>>>>>> lean.enterprise.middleware
>>>>>>>
>>>
>>>
>>>
>>> --
>>>
>>> Thanks & regards,
>>> Nirmal
>>>
>>> Team Lead - WSO2 Machine Learner
>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>> Mobile: +94715779733
>>> Blog: http://nirmalfdo.blogspot.com/
>>>
>>>
>>>

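On Vidura's earlier question about detecting PMML support at run time, below is a minimal sketch of a type-based check through the inheritance hierarchy. It assumes the supported models share a common marker type. Spark MLlib has a PMMLExportable trait that plays this role, but the PmmlExportable interface, the two model classes, and the helper methods here are hypothetical stand-ins so that the example runs without Spark on the classpath.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: the types below are hypothetical stand-ins, not WSO2 ML or Spark classes.
class PmmlSupportCheckSketch {

    /** Stand-in for a marker type such as Spark's PMMLExportable trait. */
    interface PmmlExportable {
        String toPMML();
    }

    /** Stand-in for a model type that has the export capability. */
    static final class ExportableModel implements PmmlExportable {
        @Override
        public String toPMML() {
            return "<PMML/>";
        }
    }

    /** Stand-in for a model type without PMML export. */
    static final class PlainModel {
    }

    /** Type-based check through the inheritance hierarchy. */
    static boolean supportsPmmlExport(Object model) {
        return model instanceof PmmlExportable;
    }

    /** Names of the models for which the UI could show an export option. */
    static List<String> exportableModelNames(Map<String, Object> modelsByName) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Object> entry : modelsByName.entrySet()) {
            if (supportsPmmlExport(entry.getValue())) {
                names.add(entry.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, Object> models = new LinkedHashMap<>();
        models.put("model-a", new ExportableModel());
        models.put("model-b", new PlainModel());
        System.out.println(exportableModelNames(models));
    }
}
```

Since the design mail notes that an IllegalArgumentException can still be thrown for some unsupported models, a type check like this may be necessary but not sufficient, and the export call itself would still need error handling.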

-- 
Thanks & Regards,

Fazlan Nazeem

*Software Engineer*

*WSO2 Inc*
Mobile : +94772338839
[email protected]
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
