Valeriy Avanesov created SPARK-23437:
----------------------------------------

             Summary: Distributed Gaussian Process Regression for MLlib
                 Key: SPARK-23437
                 URL: https://issues.apache.org/jira/browse/SPARK-23437
             Project: Spark
          Issue Type: New Feature
          Components: ML, MLlib
    Affects Versions: 2.2.1
            Reporter: Valeriy Avanesov


Gaussian Process Regression (GP) is a well-known black-box non-linear 
regression approach [1]. For years the approach remained inapplicable to large 
samples due to its cubic computational complexity; however, more recent 
techniques (Sparse GP) reduce this to complexity linear in the sample size. 
The field continues to attract the interest of researchers: several papers 
devoted to GP were presented at NIPS 2017. 

Unfortunately, the non-parametric regression techniques that ship with MLlib 
are restricted to tree-based approaches.

I propose to create and include an implementation (which I am going to work on) 
of the so-called robust Bayesian Committee Machine (rBCM) proposed and 
investigated in [2].
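
To make the proposal concrete, here is a minimal sketch of the aggregation 
step of the robust Bayesian Committee Machine (rBCM) as described in [2]: 
each expert is a GP trained on one partition of the data, and the per-expert 
predictive means and variances at a test point are combined using weights 
beta_k = 0.5 * (log prior_var - log var_k). The function name rbcm_combine and 
its arguments are hypothetical and only illustrate the formula; they are not 
part of MLlib or of any planned API, and a zero-mean prior is assumed.

```python
import math


def rbcm_combine(expert_means, expert_vars, prior_var):
    """Combine per-expert GP predictions at a single test point (rBCM).

    expert_means / expert_vars: predictive mean and variance of each expert.
    prior_var: prior variance of the GP at the test point (zero-mean prior).
    Returns the combined (mean, variance).
    """
    precision = 0.0       # accumulates sum_k beta_k / var_k
    weighted_mean = 0.0   # accumulates sum_k beta_k * mean_k / var_k
    beta_sum = 0.0
    for mean_k, var_k in zip(expert_means, expert_vars):
        # Weight: half the log-ratio of prior to posterior variance, so an
        # expert that learned nothing about this point gets weight zero.
        beta_k = 0.5 * (math.log(prior_var) - math.log(var_k))
        precision += beta_k / var_k
        weighted_mean += beta_k * mean_k / var_k
        beta_sum += beta_k
    # Correction term that makes the prediction fall back to the prior
    # when all experts are uninformative.
    precision += (1.0 - beta_sum) / prior_var
    variance = 1.0 / precision
    return variance * weighted_mean, variance
```

In a distributed setting only the per-expert (mean, variance) pairs would need 
to be collected for each test point, which is what makes the scheme a natural 
fit for partitioned data.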

[1] Carl Edward Rasmussen and Christopher K. I. Williams. 2005. _Gaussian 
Processes for Machine Learning (Adaptive Computation and Machine Learning)_. 
The MIT Press.

[2] Marc Peter Deisenroth and Jun Wei Ng. 2015. Distributed Gaussian processes. 
In _Proceedings of the 32nd International Conference on Machine Learning - 
Volume 37_ (ICML'15), Francis Bach and David Blei (Eds.), Vol. 37. JMLR.org, 
1481-1490.


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

