Hi Fazlan,

Are you planning to build a PMML model from scratch (i.e. going through
the entire flow to build an ML model), or is this to be used for exporting
PMML out of an already built model?

If it's the former, +1 to what CD mentioned about not asking the user to go
through the entire ML workflow for PMML. My preference is also for
saving/exporting a model in PMML to be an option for the user, both right
after a model is built and for models that have already been built.

@Fazlan - Can we find out at runtime whether PMML export is possible for a
model, through a method or through the inheritance hierarchy? If so, we could
make the export option visible in the UI only for supported models.
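
To illustrate the inheritance-based check: as far as I know, the MLlib models
that support export mix in the org.apache.spark.mllib.pmml.PMMLExportable
trait, so an instanceof test may be enough to decide whether to show the
option. The sketch below is only an assumption of how that could look. It uses
a local stand-in interface and stub model classes (all hypothetical names) so
that it runs without Spark; in ML the check would be against the real trait.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PmmlSupportCheck {

    // Hypothetical stand-in for Spark's PMMLExportable trait,
    // used here so the sketch runs without a Spark dependency.
    interface PmmlExportable {
        String toPMML();
    }

    // Stub for a model type that supports PMML export.
    static class ExportableModelStub implements PmmlExportable {
        @Override
        public String toPMML() {
            return "<PMML/>"; // placeholder string, not real PMML output
        }
    }

    // Stub for a model type without PMML export support.
    static class NonExportableModelStub {
    }

    // Returns true if the model exposes PMML export through its type hierarchy.
    static boolean supportsPmmlExport(Object model) {
        return model instanceof PmmlExportable;
    }

    public static void main(String[] args) {
        Map<String, Object> models = new LinkedHashMap<>();
        models.put("exportable stub", new ExportableModelStub());
        models.put("non-exportable stub", new NonExportableModelStub());

        // The UI would show the export option only where the check passes.
        for (Map.Entry<String, Object> entry : models.entrySet()) {
            System.out.println(entry.getKey() + " -> show export option: "
                    + supportsPmmlExport(entry.getValue()));
        }
    }
}
```

A type check alone would probably not catch a model whose type is exportable
but whose configuration is not (Fazlan's mail mentions an
IllegalArgumentException for unsupported models), so the export call itself
would likely still need a guard.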

Thanks and Regards,

Vidura



On 12 October 2015 at 11:33, CD Athuraliya <[email protected]> wrote:

> Hi,
>
> I feel that asking the user to go through the complete ML workflow for PMML
> is too demanding. Computationally, this conversion should be less expensive
> than model training in real-world use cases (since it's a mapping of
> model parameters from Java objects to XML, AFAIK). Model training should
> also be independent of the model format. Instead, can't we support this
> conversion on demand? Or save in both formats for now? Once Spark starts
> supporting PMML for all algorithms, we can go for Method 1 if it looks
> consistent throughout our ML life cycle.
>
> Thanks
>
> On Mon, Oct 12, 2015 at 11:09 AM, Fazlan Nazeem <[email protected]> wrote:
>
>> Hi,
>>
>> I am working on redmine[1] regarding PMML support for Machine Learner.
>> Please provide your opinion on this design.
>> [1]https://redmine.wso2.com/issues/4303
>>
>> *Overview*
>>
>> Spark 1.5.1 (the latest version) supports PMML model export for some of
>> the models available in Spark through MLlib.
>>
>> The table below outlines the MLlib models that can be exported to PMML
>> and their equivalent PMML model.
>>
>> MLlib model                    | PMML model
>> -------------------------------+---------------------------------------------
>> KMeansModel                    | ClusteringModel
>> LinearRegressionModel          | RegressionModel (functionName="regression")
>> RidgeRegressionModel           | RegressionModel (functionName="regression")
>> LassoModel                     | RegressionModel (functionName="regression")
>> SVMModel                       | RegressionModel (functionName="classification" normalizationMethod="none")
>> Binary LogisticRegressionModel | RegressionModel (functionName="classification" normalizationMethod="logit")
>>
>> Not all models available in MLlib can be exported to PMML as of now.
>>
>> Goal
>>
>>    1. We need to save models generated by WSO2 ML (PMML-supported models)
>>    in PMML format, so that they can be reused from tools that support PMML.
>>
>> How To
>>
>> If “clusters” is the trained model, we can do the following with the PMML
>> support.
>>
>> // Export the model to a String in PMML format
>> clusters.toPMML
>>
>> // Export the model to a local file in PMML format
>> clusters.toPMML("/tmp/kmeans.xml")
>>
>> // Export the model to a directory on a distributed file system in PMML format
>> clusters.toPMML(sc, "/tmp/kmeans")
>>
>> // Export the model to the OutputStream in PMML format
>> clusters.toPMML(System.out)
>>
>> For unsupported models, either you will not find a .toPMML method or an
>> IllegalArgumentException will be thrown.
>>
>> Design
>>
>> In the following diagram models highlighted in green can be exported to
>> PMML, but not the models highlighted in red. The diagram illustrates
>> algorithms supported by WSO2 Machine Learner.
>>
>> [image: Inline image 2]
>>
>> Method 1
>>
>> By default save the models in PMML if PMML export is supported, using one
>> of these supported options.
>>
>> 1. Export the model to a String in PMML format
>> 2. Export the model to a local file in PMML format
>> 3. Export the model to a directory on a distributed file system in PMML
>>    format
>> 4. Export the model to the OutputStream in PMML format
>>
>> Classes that need to be modified (apart from the UI)
>>
>>    - SupervisedSparkModelBuilder
>>    - UnsupervisedSparkModelBuilder
>>
>>
>> e.g.
>>
>> [image: Inline image 1]
>>
>> As of now, the serialized models are saved in the “models” folder. The
>> PMML models can also be saved in the same directory with a PMML suffix.
>>
>> Optional:
>>
>> After the model is generated, let the user export the PMML model to a
>> chosen location through the UI.
>>
>> Method 2
>>
>> Add a *new REST API* to build models with PMML
>>
>> public Response buildPMMLModel(@PathParam("modelId") long modelId)
>>
>> In the backend, we could add an additional argument to the "buildXModel"
>> methods to decide whether to save the PMML model or not.
>>
>> UI modifications are also needed (an option for the user to choose whether
>> to build the PMML model and to choose the path to save it).
>>
>> Identified classes that need to be modified (apart from the UI)
>>
>>    - SupervisedSparkModelBuilder
>>    - UnsupervisedSparkModelBuilder
>>    - ModelApiV10
>>
>>
>>
>> *Conclusion*
>>
>> Currently we have decided to go with "Method 2" for the following
>> reasons.
>>
>>    - Not all models have PMML support in Spark.
>>    - If we are to use anything apart from Spark MLlib, such as H2O, we
>>    will be depending on PMML support from H2O.
>>    - With Method 1 we might be generating PMML models when users do not
>>    need them (wasted computation).
>>
>>  Please let me know if there is a better way to improve the design.
>>
>> --
>> Thanks & Regards,
>>
>> Fazlan Nazeem
>>
>> *Software Engineer*
>>
>> *WSO2 Inc*
>> Mobile : +94772338839
>> [email protected]
>>
>
>
>
> --
> *CD Athuraliya*
> Software Engineer
> WSO2, Inc.
> lean . enterprise . middleware
> Mobile: +94 716288847
> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
> <https://twitter.com/cdathuraliya> | Blog
> <https://cdathuraliya.wordpress.com/>
>



-- 
Vidura Gamini Abhaya, Ph.D.
Director of Engineering
M:+94 77 034 7754
E: [email protected]

WSO2 Inc. (http://wso2.com)
lean.enterprise.middleware
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture