On Mon, Oct 12, 2015 at 2:17 PM, Fazlan Nazeem <[email protected]> wrote:

> Sure Nirmal.
>
> Thanks CD for pointing it out!
>
> On Mon, Oct 12, 2015 at 2:14 PM, Nirmal Fernando <[email protected]> wrote:
>
>> Excellent! Good catch! Fazlan please fix. exportAsPMMLModel may be?
>>
>> On Mon, Oct 12, 2015 at 2:10 PM, CD Athuraliya <[email protected]>
>> wrote:
>>
>>> To me buildPMMLModel(long modelId) sounds more like we are building (or
>>> training) the model. ExportPMML or something similar would sound more
>>> like the actual action IMO.
>>>
>>> On Mon, Oct 12, 2015 at 2:02 PM, Nirmal Fernando <[email protected]>
>>> wrote:
>>>
>>>> Hi CD/Vidura,
>>>>
>>>> On Mon, Oct 12, 2015 at 1:56 PM, CD Athuraliya <[email protected]>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Oct 12, 2015 at 12:36 PM, Vidura Gamini Abhaya <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi Fazlan,
>>>>>>
>>>>>> Please see my comments inline in blue.
>>>>>>
>>>>>>>
>>>>>>> No, I am not planning to build the model from scratch. Once the
>>>>>>> serialized Spark model is built, we can export it to PMML format. In
>>>>>>> other words, we are using the serialized model in order to build the
>>>>>>> PMML model.
>>>>>>>
>>>>>>
>>>>>> That's great.
>>>>>>
>>>>>>> If I am not mistaken, what you are suggesting is to let the user go
>>>>>>> through the normal workflow of model building and, once it is done, give
>>>>>>> the user an option to export it to PMML format (also for the models that
>>>>>>> have already been built)?
>>>>>>>
>>>>>>
>>>>> Yes, exactly! What we should not do, IMO, is ask the user to go
>>>>> through the whole workflow if they need to export an already created
>>>>> model in PMML.
>>>>>
>>>>
>>>> Can you please explain where you got this idea? If this idea
>>>> is there in Fazlan's content, we need to fix it.
>>>>
>>>>
>>>>>> Yes, this is exactly what I meant.
>>>>>>
>>>>>>
>>>>>>> @Vidura I will check on the run-time support, if that is possible
>>>>>>> that would be great.
>>>>>>>
>>>>>>
>>>>>> If it's supported, it'll be great. If not, we can still do it based on
>>>>>> the model type, but I think it'll be a bit messy as the code wouldn't be
>>>>>> as generic.
>>>>>>
>>>>>>
>>>>>> Thanks and Regards,
>>>>>>
>>>>>> Vidura
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On Mon, Oct 12, 2015 at 12:10 PM, Vidura Gamini Abhaya <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Fazlan,
>>>>>>>>
>>>>>>>> Are you planning to build a PMML model from scratch (i.e., going
>>>>>>>> through the entire flow to build an ML model) or is this to be used for
>>>>>>>> exporting PMML out of an already built model?
>>>>>>>>
>>>>>>>> If it's the former, +1 to what CD mentioned on not asking the user to
>>>>>>>> go through the entire ML workflow for PMML. My preference is also for
>>>>>>>> saving/exporting a model in PMML to be an option for the user, once a
>>>>>>>> model is built and for models that have already been built.
>>>>>>>>
>>>>>>>> @Fazlan - Can we find out whether the PMML export is possible at
>>>>>>>> runtime through a method or through the inheritance hierarchy? If so,
>>>>>>>> we could make the export option visible in the UI only for supported
>>>>>>>> models.
>>>>>>>>
>>>>>>>
This option can be something similar to platform selection in typical
software downloads where we have native model type and PMML.

>
>>>>>>>> Thanks and Regards,
>>>>>>>>
>>>>>>>> Vidura
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12 October 2015 at 11:33, CD Athuraliya <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I feel that asking the user to go through the complete ML workflow for
>>>>>>>>> PMML is too demanding. Computationally, this conversion should be less
>>>>>>>>> expensive than model training in real-world use cases (since it's a
>>>>>>>>> mapping of model parameters from Java objects to XML, AFAIK). And model
>>>>>>>>> training should be independent of the model format. Instead, can't we
>>>>>>>>> support this conversion on demand? Or save in both formats for now?
>>>>>>>>> Once Spark starts supporting PMML for all algorithms, we can go for
>>>>>>>>> Method 1 if it looks consistent throughout our ML life cycle.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> On Mon, Oct 12, 2015 at 11:09 AM, Fazlan Nazeem <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am working on redmine[1] regarding PMML support for Machine
>>>>>>>>>> Learner. Please provide your opinion on this design.
>>>>>>>>>> [1]https://redmine.wso2.com/issues/4303
>>>>>>>>>>
>>>>>>>>>> *Overview*
>>>>>>>>>>
>>>>>>>>>> Spark 1.5.1 (latest version) supports PMML model export for some
>>>>>>>>>> of the available models in Spark through MLlib.
>>>>>>>>>>
>>>>>>>>>> The table below outlines the MLlib models that can be exported to
>>>>>>>>>> PMML and their equivalent PMML model.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> MLlib model                    | PMML model
>>>>>>>>>> -------------------------------+-------------------------------------------
>>>>>>>>>> KMeansModel                    | ClusteringModel
>>>>>>>>>> LinearRegressionModel          | RegressionModel (functionName="regression")
>>>>>>>>>> RidgeRegressionModel           | RegressionModel (functionName="regression")
>>>>>>>>>> LassoModel                     | RegressionModel (functionName="regression")
>>>>>>>>>> SVMModel                       | RegressionModel (functionName="classification"
>>>>>>>>>>                                | normalizationMethod="none")
>>>>>>>>>> Binary LogisticRegressionModel | RegressionModel (functionName="classification"
>>>>>>>>>>                                | normalizationMethod="logit")
>>>>>>>>>>
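
To make the mapping in the table concrete, here is a minimal sketch of how the parameters of a linear model could be written out as a PMML RegressionModel fragment. It is not Spark's exporter: the class and method names are hypothetical, the field names "field_i" and "target" are illustrative, and a complete PMML document would also need the PMML root element, a Header and a DataDictionary.

```java
class PmmlRegressionSketch {

    /**
     * Renders weights and intercept as a simplified RegressionModel fragment.
     * Hypothetical helper for illustration only; not part of Spark or WSO2 ML.
     */
    static String toRegressionFragment(double[] weights, double intercept) {
        StringBuilder xml = new StringBuilder();
        xml.append("<RegressionModel functionName=\"regression\">\n");
        xml.append("  <MiningSchema>\n");
        for (int i = 0; i < weights.length; i++) {
            xml.append("    <MiningField name=\"field_").append(i).append("\"/>\n");
        }
        xml.append("    <MiningField name=\"target\" usageType=\"target\"/>\n");
        xml.append("  </MiningSchema>\n");
        // One RegressionTable carries the intercept and one predictor per weight.
        xml.append("  <RegressionTable intercept=\"").append(intercept).append("\">\n");
        for (int i = 0; i < weights.length; i++) {
            xml.append("    <NumericPredictor name=\"field_").append(i)
               .append("\" coefficient=\"").append(weights[i]).append("\"/>\n");
        }
        xml.append("  </RegressionTable>\n");
        xml.append("</RegressionModel>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        System.out.print(toRegressionFragment(new double[] {0.5, -1.25}, 2.0));
    }
}
```

The point of the sketch is only that export is a walk over already computed parameters, with no training data involved.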
>>>>>>>>>> Not all models available in MLlib can be exported to PMML as of
>>>>>>>>>> now.
>>>>>>>>>>
>>>>>>>>>> Goal
>>>>>>>>>>
>>>>>>>>>>    1. We need to save models generated by WSO2 ML (PMML-supported
>>>>>>>>>>    models) in PMML format, so that those could be reused from
>>>>>>>>>>    PMML-supported tools.
>>>>>>>>>>
>>>>>>>>>> How To
>>>>>>>>>>
>>>>>>>>>> If “clusters” is the trained model, we can do the following with
>>>>>>>>>> the PMML support.
>>>>>>>>>>
>>>>>>>>>> // Export the model to a String in PMML format
>>>>>>>>>> clusters.toPMML
>>>>>>>>>>
>>>>>>>>>> // Export the model to a local file in PMML format
>>>>>>>>>> clusters.toPMML("/tmp/kmeans.xml")
>>>>>>>>>>
>>>>>>>>>> // Export the model to a directory on a distributed file system in PMML format
>>>>>>>>>> clusters.toPMML(sc,"/tmp/kmeans")
>>>>>>>>>>
>>>>>>>>>> // Export the model to the OutputStream in PMML format
>>>>>>>>>> clusters.toPMML(System.out)
>>>>>>>>>>
>>>>>>>>>> For unsupported models, either you will not find a .toPMML method
>>>>>>>>>> or an IllegalArgumentException will be thrown.
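
A rough sketch of how both failure modes could be handled in one place. In Spark MLlib the exportable models mix in the PMMLExportable trait (org.apache.spark.mllib.pmml); the code below uses a stand-in interface so that it compiles without Spark, and every class name in it is hypothetical.

```java
import java.util.Optional;

class PmmlSupportCheck {

    /** Returns the PMML string, or empty if the model cannot be exported. */
    static Optional<String> exportIfSupported(Object model) {
        // Case 1: the type has no export method at all.
        if (!(model instanceof PmmlExportable)) {
            return Optional.empty();
        }
        try {
            return Optional.of(((PmmlExportable) model).toPMML());
        } catch (IllegalArgumentException e) {
            // Case 2: the type is exportable in general, but not this variant.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(exportIfSupported(new ExportableClusters()).isPresent());
        System.out.println(exportIfSupported(new TreeModel()).isPresent());
        System.out.println(exportIfSupported(new MulticlassLogistic()).isPresent());
    }
}

/** Stand-in for Spark's PMMLExportable trait (assumption: only toPMML() is needed). */
interface PmmlExportable {
    String toPMML();
}

/** Hypothetical model that supports export. */
class ExportableClusters implements PmmlExportable {
    @Override
    public String toPMML() {
        return "<PMML><ClusteringModel/></PMML>";
    }
}

/** Hypothetical model whose export is rejected at call time. */
class MulticlassLogistic implements PmmlExportable {
    @Override
    public String toPMML() {
        throw new IllegalArgumentException("PMML export not supported for this variant");
    }
}

/** Hypothetical model with no export method. */
class TreeModel {
}
```

A check like this could also drive the UI, so that the export option is shown only when a value is present.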
>>>>>>>>>> Design
>>>>>>>>>>
>>>>>>>>>> In the following diagram models highlighted in green can be
>>>>>>>>>> exported to PMML, but not the models highlighted in red. The diagram
>>>>>>>>>> illustrates algorithms supported by WSO2 Machine Learner.
>>>>>>>>>>
>>>>>>>>>> [image: Inline image 2]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Method 1
>>>>>>>>>>
>>>>>>>>>> By default save the models in PMML if PMML export is supported,
>>>>>>>>>> using one of these supported options.
>>>>>>>>>>
>>>>>>>>>> 1. Export the model to a String in PMML format
>>>>>>>>>> 2. Export the model to a local file in PMML format
>>>>>>>>>> 3. Export the model to a directory on a distributed file system
>>>>>>>>>>    in PMML format
>>>>>>>>>> 4. Export the model to the OutputStream in PMML format
>>>>>>>>>>
>>>>>>>>>> Classes that need to be modified (apart from the UI)
>>>>>>>>>>
>>>>>>>>>>    - SupervisedSparkModelBuilder
>>>>>>>>>>    - UnsupervisedSparkModelBuilder
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> e.g
>>>>>>>>>>
>>>>>>>>>> [image: Inline image 1]
>>>>>>>>>>
>>>>>>>>>> As of now the serialized models are saved in the “models” folder. The
>>>>>>>>>> PMML models can also be saved in the same directory with a PMML
>>>>>>>>>> suffix.
>>>>>>>>>>
>>>>>>>>>> optional:
>>>>>>>>>>
>>>>>>>>>> After the model is generated let the user export the PMML model
>>>>>>>>>> to a chosen location through the UI.
>>>>>>>>>>
>>>>>>>>>> Method 2
>>>>>>>>>>
>>>>>>>>>> Add a *new REST API* to build models with PMML
>>>>>>>>>>
>>>>>>>>>> public Response buildPMMLModel(@PathParam("modelId") long modelId)
>>>>>>>>>>
>>>>>>>>>> In the backend we could add an additional argument to the
>>>>>>>>>> "buildXModel" methods to decide whether to save the PMML model or
>>>>>>>>>> not.
>>>>>>>>>>
>>>>>>>>>> UI modifications are also needed (an option for the user to choose
>>>>>>>>>> whether to build the PMML and to choose the path to save it).
>>>>>>>>>>
>>>>>>>>>> Identified classes that need to be modified (apart from the UI)
>>>>>>>>>>
>>>>>>>>>>    - SupervisedSparkModelBuilder
>>>>>>>>>>    - UnsupervisedSparkModelBuilder
>>>>>>>>>>    - ModelApiV10
>>>>>>>>>>
>>>>>>>>>>
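
A minimal sketch of what the backend behind such an endpoint might look like if the export runs against an already built model looked up by its id, rather than retraining. All names here are hypothetical and not WSO2 ML's API; the in-memory map stands in for loading the stored serialized model, and the nested interface stands in for a model type that can render itself as PMML.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

class OnDemandPmmlExport {

    /** Stand-in for a model type that can render itself as PMML. */
    interface Exportable {
        String toPMML();
    }

    // Hypothetical registry; a real backend would deserialize the stored model.
    private final Map<Long, Object> builtModels = new HashMap<>();

    void register(long modelId, Object model) {
        builtModels.put(modelId, model);
    }

    /** Exports an existing model without touching the training workflow. */
    String exportPmml(long modelId) {
        Object model = builtModels.get(modelId);
        if (model == null) {
            throw new NoSuchElementException("No built model with id " + modelId);
        }
        if (!(model instanceof Exportable)) {
            throw new UnsupportedOperationException(
                    "PMML export is not supported for model " + modelId);
        }
        return ((Exportable) model).toPMML();
    }

    public static void main(String[] args) {
        OnDemandPmmlExport service = new OnDemandPmmlExport();
        service.register(1L, (Exportable) () -> "<PMML/>");
        service.register(2L, new Object());

        System.out.println(service.exportPmml(1L));
        try {
            service.exportPmml(2L);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A REST layer could translate the two exceptions into "not found" and "not supported" responses, which keeps the model builders themselves unchanged.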
>>>>>>>>>>
>>>>>>>>>> *Conclusion*
>>>>>>>>>>
>>>>>>>>>> Currently we have decided to go with "Method 2" because of the
>>>>>>>>>> following reasons.
>>>>>>>>>>
>>>>>>>>>>    - Not all models have PMML support in Spark.
>>>>>>>>>>    - If we are to use anything apart from Spark MLlib, such as
>>>>>>>>>>    H2O, we will be depending on PMML support from H2O.
>>>>>>>>>>    - With Method 1 we might be generating PMML models when users
>>>>>>>>>>    are not in need of it (useless computation).
>>>>>>>>>>
>>>>>>>>>>  Please let me know if there is a better way to improve the
>>>>>>>>>> design.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Thanks & Regards,
>>>>>>>>>>
>>>>>>>>>> Fazlan Nazeem
>>>>>>>>>>
>>>>>>>>>> *Software Engineer*
>>>>>>>>>>
>>>>>>>>>> *WSO2 Inc*
>>>>>>>>>> Mobile : +94772338839
>>>>>>>>>> [email protected]
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> *CD Athuraliya*
>>>>>>>>> Software Engineer
>>>>>>>>> WSO2, Inc.
>>>>>>>>> lean . enterprise . middleware
>>>>>>>>> Mobile: +94 716288847 <94716288847>
>>>>>>>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>>>>>>>> <https://twitter.com/cdathuraliya> | Blog
>>>>>>>>> <https://cdathuraliya.wordpress.com/>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Vidura Gamini Abhaya, Ph.D.
>>>>>>>> Director of Engineering
>>>>>>>> M:+94 77 034 7754
>>>>>>>> E: [email protected]
>>>>>>>>
>>>>>>>> WSO2 Inc. (http://wso2.com)
>>>>>>>> lean.enterprise.middleware
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Thanks & Regards,
>>>>>>>
>>>>>>> Fazlan Nazeem
>>>>>>>
>>>>>>> *Software Engineer*
>>>>>>>
>>>>>>> *WSO2 Inc*
>>>>>>> Mobile : +94772338839
>>>>>>> [email protected]
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Vidura Gamini Abhaya, Ph.D.
>>>>>> Director of Engineering
>>>>>> M:+94 77 034 7754
>>>>>> E: [email protected]
>>>>>>
>>>>>> WSO2 Inc. (http://wso2.com)
>>>>>> lean.enterprise.middleware
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> *CD Athuraliya*
>>>>> Software Engineer
>>>>> WSO2, Inc.
>>>>> lean . enterprise . middleware
>>>>> Mobile: +94 716288847 <94716288847>
>>>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>>>> <https://twitter.com/cdathuraliya> | Blog
>>>>> <https://cdathuraliya.wordpress.com/>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Thanks & regards,
>>>> Nirmal
>>>>
>>>> Team Lead - WSO2 Machine Learner
>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>> Mobile: +94715779733
>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> *CD Athuraliya*
>>> Software Engineer
>>> WSO2, Inc.
>>> lean . enterprise . middleware
>>> Mobile: +94 716288847 <94716288847>
>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>> <https://twitter.com/cdathuraliya> | Blog
>>> <https://cdathuraliya.wordpress.com/>
>>>
>>
>>
>>
>> --
>>
>> Thanks & regards,
>> Nirmal
>>
>> Team Lead - WSO2 Machine Learner
>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>> Mobile: +94715779733
>> Blog: http://nirmalfdo.blogspot.com/
>>
>>
>>
>
>
> --
> Thanks & Regards,
>
> Fazlan Nazeem
>
> *Software Engineer*
>
> *WSO2 Inc*
> Mobile : +94772338839
> [email protected]
>



-- 
*CD Athuraliya*
Software Engineer
WSO2, Inc.
lean . enterprise . middleware
Mobile: +94 716288847 <94716288847>
LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
<https://twitter.com/cdathuraliya> | Blog
<https://cdathuraliya.wordpress.com/>
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
