Valeriy Avanesov created SPARK-23437:
----------------------------------------
Summary: Distributed Gaussian Process Regression for MLlib
Key: SPARK-23437
URL: https://issues.apache.org/jira/browse/SPARK-23437
Project: Spark
Issue Type: New Feature
Components: ML, MLlib
Affects Versions: 2.2.1
Reporter: Valeriy Avanesov
Gaussian Process Regression (GP) is a well-known black-box non-linear
regression approach [1]. For years the approach remained inapplicable to large
samples due to its cubic computational complexity. More recent techniques
(Sparse GP), however, bring the cost down to linear in the sample size. The
field continues to attract interest from researchers: several papers devoted
to GP were presented at NIPS 2017.
Unfortunately, the non-parametric regression techniques that ship with MLlib
are restricted to tree-based approaches.
I propose to create and include an implementation (which I am going to work on)
of the so-called robust Bayesian Committee Machine proposed and investigated
in [2].
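For readers unfamiliar with the method, here is a minimal sketch of the
aggregation step of the robust Bayesian Committee Machine as described in [2]:
each expert is a GP trained on its own partition of the data, and the experts'
predictions at a test point are merged with entropy-based weights. This is
plain Python for illustration only, not the proposed MLlib API. The function
name rbcm_combine and its parameters are hypothetical, and a zero-mean GP
prior is assumed.

```python
import math


def rbcm_combine(means, variances, prior_variance):
    """Merge per-expert GP predictions at one test point (rBCM rule, [2]).

    means, variances: predictive mean and variance from each expert.
    prior_variance:   GP prior variance at the same test point.
    Assumes a zero-mean prior and variances no larger than prior_variance.
    """
    # beta_k: entropy difference between the prior and expert k's posterior.
    # An expert that learned nothing (variance == prior) gets weight 0.
    betas = [0.5 * (math.log(prior_variance) - math.log(v)) for v in variances]

    # Combined precision: weighted expert precisions plus a prior correction
    # term, so the result falls back to the prior when all weights are 0.
    precision = sum(b / v for b, v in zip(betas, variances))
    precision += (1.0 - sum(betas)) / prior_variance
    variance = 1.0 / precision

    # Combined mean: precision-weighted average of the expert means.
    mean = variance * sum(b * m / v for b, m, v in zip(betas, means, variances))
    return mean, variance
```

Each expert only inverts the kernel matrix of its own partition, so the
experts can be trained independently on separate workers and only their
per-point means and variances need to be combined.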
[1] Carl Edward Rasmussen and Christopher K. I. Williams. 2005. _Gaussian
Processes for Machine Learning (Adaptive Computation and Machine Learning)_.
The MIT Press.
[2] Marc Peter Deisenroth and Jun Wei Ng. 2015. Distributed Gaussian processes.
In _Proceedings of the 32nd International Conference on Machine Learning -
Volume 37_ (ICML'15), Francis Bach and David Blei (Eds.), Vol. 37. JMLR.org,
1481-1490.