Hi Vidura,

No, I am not planning to build the model from scratch. Once the serialized
Spark model is built, we can export it to PMML format. In other words, we
use the serialized model to build the PMML model.
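
A rough sketch of that on-demand export, for illustration only. It assumes the
model was stored with plain Java serialization and that exportable models
share a common type exposing toPMML(). The PmmlExportable interface, the
StubModel class and the ".pmml" suffix are hypothetical stand-ins, not WSO2 ML
or Spark classes.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SerializedModelToPmml {

    /** Hypothetical stand-in for a model type that offers toPMML(). */
    interface PmmlExportable {
        String toPMML();
    }

    /** Stub model, present only to make the sketch runnable. */
    static class StubModel implements Serializable, PmmlExportable {
        private static final long serialVersionUID = 1L;

        @Override
        public String toPMML() {
            return "<PMML version=\"4.2\"/>";
        }
    }

    /** Reads an already-built serialized model and writes its PMML next to it. */
    static Path exportToPmml(Path serializedModel) throws IOException, ClassNotFoundException {
        Object model;
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(serializedModel))) {
            model = in.readObject();
        }
        if (!(model instanceof PmmlExportable)) {
            throw new IllegalArgumentException("PMML export is not supported for this model");
        }
        // Same directory as the serialized model, with an assumed ".pmml" suffix.
        Path pmmlFile = serializedModel.resolveSibling(serializedModel.getFileName() + ".pmml");
        Files.write(pmmlFile, ((PmmlExportable) model).toPMML().getBytes(StandardCharsets.UTF_8));
        return pmmlFile;
    }

    public static void main(String[] args) throws Exception {
        // Serialize a stub first so there is an "already built" model to export.
        Path modelFile = Files.createTempFile("stub-model", ".bin");
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(modelFile))) {
            out.writeObject(new StubModel());
        }
        System.out.println("PMML written to " + exportToPmml(modelFile));
    }
}
```

No training step is involved here; the export only needs the stored model.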

If I am not mistaken, what you are suggesting is to let the user go through
the normal workflow of model building and, once it is done, give the user an
option to export the model to PMML format (also for models that have already
been built)?
+1

@CD I hope that is what you also meant.

@Vidura I will check on the run-time support; if that is possible, that
would be great.
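
A minimal sketch of such a run-time check through the type hierarchy. As far
as I can tell, Spark MLlib marks exportable models with the
org.apache.spark.mllib.pmml.PMMLExportable trait; the interface and stub
classes below are hypothetical stand-ins so that the example runs without
Spark on the classpath.

```java
public class PmmlSupportCheck {

    /** Hypothetical stand-in for Spark's PMMLExportable trait. */
    interface PmmlExportable {
        String toPMML();
    }

    /** Stub for a model type that can be exported (e.g. a k-means model). */
    static class ExportableStubModel implements PmmlExportable {
        @Override
        public String toPMML() {
            return "<PMML/>";
        }
    }

    /** Stub for a model type without PMML export. */
    static class NonExportableStubModel {
    }

    /** Returns true when the model's type hierarchy includes the export interface. */
    static boolean supportsPmml(Object model) {
        return model instanceof PmmlExportable;
    }

    public static void main(String[] args) {
        System.out.println(supportsPmml(new ExportableStubModel()));    // true
        System.out.println(supportsPmml(new NonExportableStubModel())); // false
    }
}
```

The UI could call something like supportsPmml to decide whether to show the
export option. Since some models expose toPMML but throw an
IllegalArgumentException, the type check alone may not be sufficient.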


On Mon, Oct 12, 2015 at 12:10 PM, Vidura Gamini Abhaya <[email protected]>
wrote:

> Hi Fazlan,
>
> Are you planning to build a PMML model from scratch (i.e. going through
> the entire flow to build an ML model), or is this to be used for exporting
> PMML out of an already built model?
>
> If it's the former, +1 to what CD mentioned on not asking the user to go
> through the entire ML workflow for PMML. My preference is also for
> saving/exporting a model in PMML to be an option for the user, both once a
> model is built and for models that have already been built.
>
> @Fazlan - Can we find out whether PMML export is possible at runtime,
> through a method or through the inheritance hierarchy? If so, we could
> make the export option visible on the UI only for supported models.
>
> Thanks and Regards,
>
> Vidura
>
>
>
> On 12 October 2015 at 11:33, CD Athuraliya <[email protected]> wrote:
>
>> Hi,
>>
>> I feel that asking the user to go through the complete ML workflow for PMML
>> is too demanding. Computationally, this conversion should be less expensive
>> than model training in real-world use cases (since it's a mapping of
>> model parameters from Java objects to XML, AFAIK). And model training should
>> be independent of the model format. Instead, can't we support this
>> conversion on demand? Or save in both formats for now? Once Spark starts
>> supporting PMML for all algorithms, we can go for Method 1 if it looks
>> consistent throughout our ML life cycle.
>>
>> Thanks
>>
>> On Mon, Oct 12, 2015 at 11:09 AM, Fazlan Nazeem <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I am working on the Redmine issue [1] regarding PMML support for Machine
>>> Learner. Please provide your opinion on this design.
>>> [1] https://redmine.wso2.com/issues/4303
>>>
>>> *Overview*
>>>
>>> Spark 1.5.1 (the latest version) supports PMML model export for some of
>>> the models available in Spark through MLlib.
>>>
>>> The table below outlines the MLlib models that can be exported to PMML
>>> and their equivalent PMML model.
>>>
>>> MLlib model                    | PMML model
>>> -------------------------------+------------------------------------------
>>> KMeansModel                    | ClusteringModel
>>> LinearRegressionModel          | RegressionModel (functionName="regression")
>>> RidgeRegressionModel           | RegressionModel (functionName="regression")
>>> LassoModel                     | RegressionModel (functionName="regression")
>>> SVMModel                       | RegressionModel (functionName="classification"
>>>                                | normalizationMethod="none")
>>> Binary LogisticRegressionModel | RegressionModel (functionName="classification"
>>>                                | normalizationMethod="logit")
>>>
>>> Not all models available in MLlib can be exported to PMML as of now.
>>>
>>> *Goal*
>>>
>>>    1. We need to save models generated by WSO2 ML (PMML-supported models)
>>>    in PMML format, so that they can be reused by tools that support PMML.
>>>
>>> *How To*
>>>
>>> If “clusters” is the trained model, we can do the following with the
>>> PMML support.
>>>
>>> // Export the model to a String in PMML format
>>> clusters.toPMML
>>>
>>> // Export the model to a local file in PMML format
>>> clusters.toPMML("/tmp/kmeans.xml")
>>>
>>> // Export the model to a directory on a distributed file system in PMML format
>>> clusters.toPMML(sc, "/tmp/kmeans")
>>>
>>> // Export the model to the OutputStream in PMML format
>>> clusters.toPMML(System.out)
>>>
>>> For unsupported models, either you will not find a .toPMML method or an
>>> IllegalArgumentException will be thrown.
>>>
>>> *Design*
>>>
>>> In the following diagram models highlighted in green can be exported to
>>> PMML, but not the models highlighted in red. The diagram illustrates
>>> algorithms supported by WSO2 Machine Learner.
>>>
>>> [image: Inline image 2]
>>>
>>> Method 1
>>>
>>> By default, save the models in PMML if PMML export is supported, using
>>> one of these supported options.
>>>
>>> 1. Export the model to a String in PMML format
>>> 2. Export the model to a local file in PMML format
>>> 3. Export the model to a directory on a distributed file system in PMML format
>>> 4. Export the model to the OutputStream in PMML format
>>>
>>> Classes that need to be modified (apart from the UI)
>>>
>>>    - SupervisedSparkModelBuilder
>>>    - UnsupervisedSparkModelBuilder
>>>
>>>
>>> e.g.
>>>
>>> [image: Inline image 1]
>>>
>>> As of now, the serialized models are saved in the “models” folder. The
>>> PMML models can also be saved in the same directory with a PMML suffix.
>>>
>>> Optional:
>>>
>>> After the model is generated let the user export the PMML model to a
>>> chosen location through the UI.
>>>
>>> Method 2
>>>
>>> Add a *new REST API* to build models with PMML
>>>
>>> public Response buildPMMLModel(@PathParam("modelId") long modelId)
>>>
>>> In the backend, we could add an additional argument to the "buildXModel"
>>> methods to decide whether to save the PMML model or not.
>>>
>>> UI modifications are also needed (an option for the user to choose whether
>>> to build the PMML and to choose the path to save it).
>>>
>>> Identified classes that need to be modified (apart from the UI)
>>>
>>>    - SupervisedSparkModelBuilder
>>>    - UnsupervisedSparkModelBuilder
>>>    - ModelApiV10
>>>
>>>
>>>
>>> *Conclusion*
>>>
>>> Currently, we have decided to go with "Method 2" for the following
>>> reasons.
>>>
>>>    - Not all models have PMML support in Spark.
>>>    - If we are to use anything apart from Spark MLlib, such as H2O, we
>>>    will be depending on PMML support from H2O.
>>>    - With Method 1, we might be generating PMML models when users do
>>>    not need them (wasted computation).
>>>
>>> Please let me know if there is a better way to improve the design.
>>>
>>> --
>>> Thanks & Regards,
>>>
>>> Fazlan Nazeem
>>>
>>> *Software Engineer*
>>>
>>> *WSO2 Inc*
>>> Mobile : +94772338839
>>> [email protected]
>>>
>>
>>
>>
>> --
>> *CD Athuraliya*
>> Software Engineer
>> WSO2, Inc.
>> lean . enterprise . middleware
>> Mobile: +94 716288847 <94716288847>
>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>> <https://twitter.com/cdathuraliya> | Blog
>> <https://cdathuraliya.wordpress.com/>
>>
>
>
>
> --
> Vidura Gamini Abhaya, Ph.D.
> Director of Engineering
> M:+94 77 034 7754
> E: [email protected]
>
> WSO2 Inc. (http://wso2.com)
> lean.enterprise.middleware
>



-- 
Thanks & Regards,

Fazlan Nazeem

*Software Engineer*

*WSO2 Inc*
Mobile : +94772338839
[email protected]
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture