On Mon, Oct 12, 2015 at 12:36 PM, Vidura Gamini Abhaya <[email protected]>
wrote:

> Hi Fazlan,
>
> Please see my comments inline in blue.
>
>>
>> No I am not planning to build the model from scratch. Once the serialized
>> spark model is built, we can export it to PMML format. In other words, we
>> are using the serialized model in order to build the PMML model.
>>
>
> That's great.
>
>> If I am not mistaken, what you are suggesting is to let the user go through
>> the normal workflow of model building and, once it is done, give the user
>> an option to export it to PMML format (also for models that have already
>> been built)?
>>
>
Yes, exactly! What we should not do, IMO, is ask the user to go through the
whole workflow if they only need to export an already created model to PMML.
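
To make the on-demand idea concrete, here is a minimal, self-contained sketch of
exporting an already built model without rebuilding it. Everything in it is an
assumption made for illustration: PmmlExportable is a hypothetical stand-in for
whatever Spark exposes, StoredKMeans is a dummy model that emits placeholder
XML rather than valid PMML, and the ".pmml" suffix and plain Java serialization
are guesses at how the "models" folder could be handled.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical stand-in for a model type that can render itself as PMML.
interface PmmlExportable {
    String toPMML();
}

// Dummy model for the sketch; a real one would be the serialized Spark model.
class StoredKMeans implements Serializable, PmmlExportable {
    private static final long serialVersionUID = 1L;
    final int clusterCount;

    StoredKMeans(int clusterCount) {
        this.clusterCount = clusterCount;
    }

    @Override
    public String toPMML() {
        // Placeholder markup only, not a valid PMML document.
        return "<PMML><ClusteringModel numberOfClusters=\"" + clusterCount + "\"/></PMML>";
    }
}

class OnDemandPmmlExport {
    // Demo helper: serializes a model into a fresh temporary "models" directory.
    static Path save(Serializable model, String fileName) {
        try {
            Path file = Files.createTempDirectory("models").resolve(fileName);
            try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
                out.writeObject(model);
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loads an already built model and writes its PMML next to it.
    // Returns the PMML path, or empty if the model type cannot be exported.
    static Optional<Path> export(Path serializedModel) {
        try {
            Object model;
            try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(serializedModel))) {
                model = in.readObject();
            }
            if (!(model instanceof PmmlExportable)) {
                return Optional.empty();
            }
            Path target = serializedModel.resolveSibling(serializedModel.getFileName() + ".pmml");
            Files.write(target, ((PmmlExportable) model).toPMML().getBytes(StandardCharsets.UTF_8));
            return Optional.of(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Unknown model class", e);
        }
    }

    public static void main(String[] args) {
        Path modelFile = save(new StoredKMeans(3), "kmeans-model");
        System.out.println(export(modelFile));
    }
}
```

The point of the sketch is only that the export step reads the stored model and
never touches the training workflow.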

>
> Yes, this is exactly what I meant.
>
>
>> @Vidura I will check on the run-time support; if that is possible, that
>> would be great.
>>
>
> If it's supported, it'll be great. If not, we can still do it based on the
> model type, but I think it'll be a bit messy as the code wouldn't be as
> generic.
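
A rough sketch of the generic run-time check discussed here, as opposed to
switching on the model type. Spark MLlib appears to provide a PMMLExportable
trait (org.apache.spark.mllib.pmml) that exportable models mix in, but that
should be verified against 1.5.1. The interface and model classes below are
hypothetical stand-ins so that the example runs without Spark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical marker for exportable models; assumed to mirror the role of
// Spark's PMMLExportable trait.
interface PmmlExportable {
    String toPMML();
}

// Hypothetical model classes used only for this illustration.
class DemoKMeansModel implements PmmlExportable {
    @Override
    public String toPMML() {
        return "<PMML/>"; // placeholder, not valid PMML
    }
}

class DemoUnsupportedModel {
}

class ExportOptionFilter {
    // Generic check: no per-algorithm switch statement is needed.
    static boolean supportsPmml(Object model) {
        return model instanceof PmmlExportable;
    }

    // Names of the models for which a UI would show the export option.
    static List<String> exportableNames(List<Object> models) {
        List<String> names = new ArrayList<>();
        for (Object model : models) {
            if (supportsPmml(model)) {
                names.add(model.getClass().getSimpleName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Object> models = Arrays.asList(new DemoKMeansModel(), new DemoUnsupportedModel());
        System.out.println(exportableNames(models));
    }
}
```

With a check like this, newly supported model types would pick up the export
option without any change to the filtering code.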
>
>
> Thanks and Regards,
>
> Vidura
>
>
>
>>
>> On Mon, Oct 12, 2015 at 12:10 PM, Vidura Gamini Abhaya <[email protected]>
>> wrote:
>>
>>> Hi Fazlan,
>>>
>>> Are you planning to build a PMML model from scratch (i.e., going
>>> through the entire flow to build an ML model), or is this to be used for
>>> exporting PMML out of an already built model?
>>>
>>> If it's the former, +1 to what CD mentioned about not asking the user to go
>>> through the entire ML workflow for PMML. My preference is also for
>>> saving/exporting a model in PMML to be an option for the user, both once a
>>> model is built and for models that have already been built.
>>>
>>> @Fazlan - Can we find out at runtime whether PMML export is possible,
>>> through a method or through the inheritance hierarchy? If so, we could make
>>> the export option visible in the UI only for supported models.
>>>
>>> Thanks and Regards,
>>>
>>> Vidura
>>>
>>>
>>>
>>> On 12 October 2015 at 11:33, CD Athuraliya <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I feel that asking the user to go through the complete ML workflow for
>>>> PMML is too demanding. Computationally, this conversion should be less
>>>> expensive than model training in real-world use cases (since it's a
>>>> mapping of model parameters from Java objects to XML, AFAIK). And model
>>>> training should be independent of the model format. Instead, can't we
>>>> support this conversion on demand? Or save in both formats for now? Once
>>>> Spark starts supporting PMML for all algorithms, we can go for Method 1 if
>>>> it looks consistent throughout our ML life cycle.
>>>>
>>>> Thanks
>>>>
>>>> On Mon, Oct 12, 2015 at 11:09 AM, Fazlan Nazeem <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am working on redmine[1] regarding PMML support for Machine Learner.
>>>>> Please provide your opinion on this design.
>>>>> [1]https://redmine.wso2.com/issues/4303
>>>>>
>>>>> *Overview*
>>>>>
>>>>> Spark 1.5.1 (the latest version) supports PMML model export for some of
>>>>> the available models in Spark through MLlib.
>>>>>
>>>>> The table below outlines the MLlib models that can be exported to PMML
>>>>> and their equivalent PMML model.
>>>>>
>>>>> MLlib model                    | PMML model
>>>>> -------------------------------+--------------------------------------------
>>>>> KMeansModel                    | ClusteringModel
>>>>> LinearRegressionModel          | RegressionModel (functionName="regression")
>>>>> RidgeRegressionModel           | RegressionModel (functionName="regression")
>>>>> LassoModel                     | RegressionModel (functionName="regression")
>>>>> SVMModel                       | RegressionModel (functionName="classification" normalizationMethod="none")
>>>>> Binary LogisticRegressionModel | RegressionModel (functionName="classification" normalizationMethod="logit")
>>>>>
>>>>> Not all models available in MLlib can be exported to PMML as of now.
>>>>>
>>>>> Goal
>>>>>
>>>>>    1. We need to save models generated by WSO2 ML (those that support
>>>>>    PMML) in PMML format, so that they can be reused by tools that
>>>>>    support PMML.
>>>>>
>>>>> How To
>>>>>
>>>>> If “clusters” is the trained model, we can do the following with the
>>>>> PMML support.
>>>>>
>>>>> // Export the model to a String in PMML format
>>>>> clusters.toPMML
>>>>>
>>>>> // Export the model to a local file in PMML format
>>>>> clusters.toPMML("/tmp/kmeans.xml")
>>>>>
>>>>> // Export the model to a directory on a distributed file system in PMML format
>>>>> clusters.toPMML(sc, "/tmp/kmeans")
>>>>>
>>>>> // Export the model to the OutputStream in PMML format
>>>>> clusters.toPMML(System.out)
>>>>>
>>>>> For unsupported models, either you will not find a .toPMML method or
>>>>> an IllegalArgumentException will be thrown.
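
The two failure modes above (no toPMML method at all, or an
IllegalArgumentException at export time) could be handled in one place, as in
this hedged sketch. PmmlExportable and DemoLogisticModel are hypothetical
stand-ins, and the rule that only the binary case exports is an assumption
taken from the "Binary LogisticRegressionModel" row of the table.

```java
import java.util.Optional;

// Hypothetical stand-in for a model type that offers a toPMML method.
interface PmmlExportable {
    String toPMML();
}

// Hypothetical model whose export depends on how it was configured.
class DemoLogisticModel implements PmmlExportable {
    final int numClasses;

    DemoLogisticModel(int numClasses) {
        this.numClasses = numClasses;
    }

    @Override
    public String toPMML() {
        if (numClasses != 2) {
            throw new IllegalArgumentException("only binary models can be exported");
        }
        return "<PMML/>"; // placeholder, not valid PMML
    }
}

class SafePmmlExport {
    // Returns the PMML text, or empty when the model cannot be exported.
    static Optional<String> tryExport(Object model) {
        if (!(model instanceof PmmlExportable)) {
            return Optional.empty(); // the type has no toPMML at all
        }
        try {
            return Optional.of(((PmmlExportable) model).toPMML());
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // exportable type, unsupported instance
        }
    }

    public static void main(String[] args) {
        System.out.println(tryExport(new DemoLogisticModel(2)).isPresent());
        System.out.println(tryExport(new DemoLogisticModel(3)).isPresent());
    }
}
```

Returning an empty result in both cases lets the caller treat "cannot export"
uniformly, whatever the underlying reason.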
>>>>>
>>>>> Design
>>>>>
>>>>> In the following diagram models highlighted in green can be exported
>>>>> to PMML, but not the models highlighted in red. The diagram illustrates
>>>>> algorithms supported by WSO2 Machine Learner.
>>>>>
>>>>> [image: Inline image 2]
>>>>>
>>>>> Method 1
>>>>>
>>>>> By default save the models in PMML if PMML export is supported, using
>>>>> one of these supported options.
>>>>>
>>>>> 1. Export the model to a String in PMML format
>>>>> 2. Export the model to a local file in PMML format
>>>>> 3. Export the model to a directory on a distributed file system in
>>>>>    PMML format
>>>>> 4. Export the model to the OutputStream in PMML format
>>>>>
>>>>> Classes that need to be modified (apart from the UI):
>>>>>
>>>>>    - SupervisedSparkModelBuilder
>>>>>    - UnsupervisedSparkModelBuilder
>>>>>
>>>>>
>>>>> e.g.
>>>>>
>>>>> [image: Inline image 1]
>>>>>
>>>>> As of now the serialized models are saved in “models” folder. The PMML
>>>>> models can also be saved in the same directory with a PMML suffix.
>>>>>
>>>>> optional:
>>>>>
>>>>> After the model is generated let the user export the PMML model to a
>>>>> chosen location through the UI.
>>>>>
>>>>> Method 2
>>>>>
>>>>> Add a *new REST API* to build models with PMML
>>>>>
>>>>> public Response buildPMMLModel(@PathParam("modelId") long modelId)
>>>>>
>>>>> In the backend, we could add an additional argument to the "buildXModel"
>>>>> methods to decide whether to save the PMML model or not.
>>>>>
>>>>> UI modifications are also needed (an option for the user to choose
>>>>> whether to build the PMML and to choose the path to save it).
>>>>>
>>>>> Identified classes that need to be modified (apart from the UI):
>>>>>
>>>>>    - SupervisedSparkModelBuilder
>>>>>    - UnsupervisedSparkModelBuilder
>>>>>    - ModelApiV10
>>>>>
>>>>>
>>>>> *Conclusion*
>>>>>
>>>>> Currently we have decided to go with "Method 2" because of the
>>>>> following reasons.
>>>>>
>>>>>    - Not all models have PMML support in Spark.
>>>>>    - If we are to use anything apart from Spark MLlib, such as H2O,
>>>>>    we will be depending on PMML support from H2O.
>>>>>    - With Method 1 we might be generating PMML models when users are
>>>>>    not in need of it (useless computation).
>>>>>
>>>>>  Please let me know if there is a better way to improve the design.
>>>>>
>>>>> --
>>>>> Thanks & Regards,
>>>>>
>>>>> Fazlan Nazeem
>>>>>
>>>>> *Software Engineer*
>>>>>
>>>>> *WSO2 Inc*
>>>>> Mobile : +94772338839
>>>>> [email protected]
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> *CD Athuraliya*
>>>> Software Engineer
>>>> WSO2, Inc.
>>>> lean . enterprise . middleware
>>>> Mobile: +94 716288847
>>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>>> <https://twitter.com/cdathuraliya> | Blog
>>>> <https://cdathuraliya.wordpress.com/>
>>>>
>>>
>>>
>>>
>>> --
>>> Vidura Gamini Abhaya, Ph.D.
>>> Director of Engineering
>>> M:+94 77 034 7754
>>> E: [email protected]
>>>
>>> WSO2 Inc. (http://wso2.com)
>>> lean.enterprise.middleware
>>>
>>
>>
>>
>> --
>> Thanks & Regards,
>>
>> Fazlan Nazeem
>>
>> *Software Engineer*
>>
>> *WSO2 Inc*
>> Mobile : +94772338839
>> [email protected]
>>
>
>
>
> --
> Vidura Gamini Abhaya, Ph.D.
> Director of Engineering
> M:+94 77 034 7754
> E: [email protected]
>
> WSO2 Inc. (http://wso2.com)
> lean.enterprise.middleware
>



-- 
*CD Athuraliya*
Software Engineer
WSO2, Inc.
lean . enterprise . middleware
Mobile: +94 716288847
LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
<https://twitter.com/cdathuraliya> | Blog
<https://cdathuraliya.wordpress.com/>
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
