How about Spark?
Its MLlib library includes many common machine learning algorithms and provides a Java API.
On Jun 13, 2016 01:26, "Gaurav gupta" <gupta.gaurav0...@gmail.com> wrote:

>
> Hi All,
>
>
>
> Could you please guide me on how to *create and execute* machine
> learning/statistical models (regression, decision trees, k-means
> clustering, naive Bayes, scorecard/linear/logistic regression, GBM, GLM,
> etc.) in a *Java/JVM-based application* (in production)?
>
>
>
> We have an ETL-style, Java-based product in which one can do most of the
> data preparation steps for machine learning, such as data ingestion from
> JDBC, files, HDFS, NoSQL stores, etc., as well as joins and aggregations
> (which are required for feature engineering). We now want to add analytics
> capabilities using machine learning/statistical modeling.
>
>
>
> Right now, we are using JPMML-Evaluator
> <https://github.com/jpmml/jpmml-evaluator> to score models created in
> PMML format using R and Python (and KNIME), but this needs three separate
> and unconnected steps:
>
>  1- Prepare the data in our Java/JVM application and save the sample
> data (training and test sets) to a CSV file or a DB. *<JAVA/JVM BASED
> application>*
>
>  *2- Create a machine learning model in R and Python (and KNIME) and
> export it in PMML 4.2 format. <NON JAVA BASED>*
>
>  3- Import/deploy the PMML in our Java-based application and use
> JPMML-Evaluator to execute it in production. *<JAVA BASED>*
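Step 1 above could be sketched roughly as follows. This is only a minimal illustration using the Java standard library: the class name TrainTestCsvExport, the 80/20 split, and the assumption that each row is already a formatted CSV line are all hypothetical and not taken from the product described here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper: splits prepared rows into training and test sets and writes them as CSV. */
final class TrainTestCsvExport {

    private TrainTestCsvExport() {}

    /**
     * Shuffles the rows with a fixed seed (so the split is reproducible) and
     * returns two lists: index 0 is the training set, index 1 is the test set.
     */
    static List<List<String>> split(List<String> rows, double trainFraction, long seed) {
        if (trainFraction < 0.0 || trainFraction > 1.0) {
            throw new IllegalArgumentException("trainFraction must be between 0 and 1");
        }
        List<String> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * trainFraction);
        List<String> train = new ArrayList<>(shuffled.subList(0, cut));
        List<String> test = new ArrayList<>(shuffled.subList(cut, shuffled.size()));
        return List.of(train, test);
    }

    /** Writes a header line followed by the rows; Files.write uses UTF-8 by default. */
    static void write(Path file, String header, List<String> rows) throws IOException {
        List<String> lines = new ArrayList<>(rows.size() + 1);
        lines.add(header);
        lines.addAll(rows);
        Files.write(file, lines);
    }
}
```

Calling split(rows, 0.8, 42L) and then write(...) once per part would produce the two files that the modeling tool in step 2 reads. Rows containing commas or quotes would need proper CSV escaping, which this sketch leaves out.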
>
>
>
> I am sure this is a common problem in machine learning, as Java is
> generally preferred over Python or R in production. Could you suggest
> better approaches to *create* as well as *execute* a Python/scikit-learn
> based machine learning model in a JVM-based application?
>
>
>
> What are your thoughts on achieving steps #2 and #3 more seamlessly in a
> JVM-based application, without compromising *performance and usability*:
>
>
>
> 1-     Call a Java program which internally calls the Python scikit-learn
> script
> <http://stackoverflow.com/questions/12738827/how-can-i-call-scikit-learn-classifiers-from-java>
> (under the hood) to create a model in PMML
> <https://github.com/jpmml/jpmml-sklearn> and then use JPMML-Evaluator. To
> the user it would appear to be a single JVM-based application (better
> usability). I am not sure what the limitations and shortcomings of using
> PMML are, as not all features are supported in jpmml-sklearn
> <https://github.com/jpmml/jpmml-sklearn>.
>
> 2-     Call a Java program which internally calls the Python script, do
> the model creation as well as execution in an external Python
> environment, and serialize the model and the results to a file/CSV or an
> in-memory DB (or a cache, like Hazelcast) from where the parent Java
> application will fetch the results. From what I have read, I can't use
> Jython for executing scikit-learn models
> <http://stackoverflow.com/questions/12738827/how-can-i-call-scikit-learn-classifiers-from-java>
> .
>
> 3-     Can I use Jep <https://github.com/mrj0/jep> (Embed Python in Java)
> to embed CPython in the JVM? Has anybody tried it for scikit-learn models?
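The "Java program which internally calls the Python script" part of options 1 and 2 above could look roughly like this. It is only a sketch using the standard ProcessBuilder API: the class name PythonTrainingLauncher, the script name, and the argument order (training CSV, then PMML output path) are assumptions for illustration, and the Python script itself is not shown.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs an external Python training script from the JVM. */
final class PythonTrainingLauncher {

    private PythonTrainingLauncher() {}

    /**
     * Builds the command line for the external script.
     * The argument order (training CSV, then PMML output file) is an assumption.
     */
    static List<String> buildCommand(String pythonExe, Path script, Path trainingCsv, Path pmmlOut) {
        return List.of(pythonExe, script.toString(), trainingCsv.toString(), pmmlOut.toString());
    }

    /**
     * Starts the process, forwards its stdout/stderr to this JVM's console,
     * and returns the exit code. The process is killed if it exceeds the timeout.
     */
    static int run(List<String> command, long timeoutSeconds) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO()
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            throw new IOException("Timed out after " + timeoutSeconds + " s: " + command);
        }
        return process.exitValue();
    }
}
```

A caller would pass something like buildCommand("python", Paths.get("train_model.py"), trainCsv, pmmlFile) to run(...) and, on exit code 0, load the resulting PMML file with JPMML-Evaluator. Passing the arguments as a list rather than one shell string avoids quoting problems on both Windows and non-Windows platforms, at the cost of requiring a Python installation next to the JVM.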
>
>
>
> Alternatively, I could explore using Mahout or Weka, which are Java-based
> machine learning libraries, in my JVM-based application. (I need to support
> both Windows and non-Windows platforms.)
>
>
>
> I am also exploring H2O.ai, which is Java-based. Has anybody tried it?
>
>
> Regards
>
> Gaurav
>
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>