GitHub user p4nna opened a pull request: https://github.com/apache/flink/pull/3625

    [FLINK-5785] Add an Imputer for preparing data

    Provides an Imputer for sparse DataSets of Vectors.
    Replaces missing values with the mean, median, or most frequent value of each vector or of each dimension.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/p4nna/flink ml-Imputer-edits

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3625.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3625

----
commit f2875ac5890564213d5f055d710976d1fede3962
Author: p4nna <b...@dbs.ifi.lmu.de>
Date:   2017-03-27T09:47:39Z

    Add files via upload

commit 8e6909b52dad34d6c4cd6c84618616ac50cd83d1
Author: p4nna <b...@dbs.ifi.lmu.de>
Date:   2017-03-27T09:49:59Z

    Test for Imputer class

    Two testclasses which test the functions implemented in the new imputer class. One for the rowwise imputing over all vectors and one for the vectorwise imputing

commit 0c420a84c136b330135ce180db04d899b5a6f54c
Author: p4nna <b...@dbs.ifi.lmu.de>
Date:   2017-03-27T09:56:51Z

    removed unused imports and methods

commit 9136607e84a0297bb4fb24a53bad9950b86bf116
Author: p4nna <b...@dbs.ifi.lmu.de>
Date:   2017-03-27T15:58:37Z

    Imputer was added

    adds missing values in sparse DataSets of Vectors

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
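
For readers unfamiliar with imputation, the three strategies named in the pull request description (mean, median, most frequent value), applied either per dimension across all vectors or within each vector, can be sketched in plain Python. This is only an illustration and not the code from the pull request: the function names `impute` and `_fill_value`, the `strategy` and `per_dimension` parameters, and the use of NaN to mark a missing value are all assumptions made for this sketch.

```python
import math
from collections import Counter
from statistics import mean, median


def _fill_value(values, strategy):
    """Compute the replacement from the observed (non-missing) values."""
    if strategy == "mean":
        return mean(values)
    if strategy == "median":
        return median(values)
    if strategy == "most_frequent":
        # Ties are broken in favour of the value seen first.
        return Counter(values).most_common(1)[0][0]
    raise ValueError(f"unknown strategy: {strategy}")


def impute(vectors, strategy="mean", per_dimension=True):
    """Return a copy of `vectors` with NaN entries replaced.

    Assumption for this sketch: a missing value is encoded as NaN.
    per_dimension=True  -> statistic of each dimension across all vectors.
    per_dimension=False -> statistic of each vector's own entries.
    """
    groups = list(zip(*vectors)) if per_dimension else vectors
    fills = []
    for group in groups:
        observed = [x for x in group if not math.isnan(x)]
        # If nothing was observed, there is nothing to impute from: keep NaN.
        fills.append(_fill_value(observed, strategy) if observed else math.nan)
    return [
        [fills[j if per_dimension else i] if math.isnan(x) else x
         for j, x in enumerate(row)]
        for i, row in enumerate(vectors)
    ]
```

With `per_dimension=True` the replacement for a missing entry depends on the other vectors in the data set, while `per_dimension=False` uses only the entries of the vector itself; this appears to correspond to the two cases the commit message distinguishes, although the exact semantics in the pull request may differ.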