GitHub user tillrohrmann opened a pull request:
https://github.com/apache/flink/pull/704
[FLINK-2050] Introduces new pipelining mechanism for FlinkML
This PR introduces the new pipelining mechanism for FlinkML. To make
pipelines applicable to different input types, the algorithm logic and the
state of a pipeline operator have been separated. The logic is now kept in
implicit values, which the Scala compiler selects automatically based on
the input and output types of the pipeline operators and the input data.
The operator itself now keeps the model data that is trained in the fit
phase. Thus, the algorithm no longer returns a distinct model.
The pipelining allows, for example, a pipeline which scales vectors to work
on the `Vector` type as well as the `LabeledVector` type, even though the two
types are unrelated. The only requirement is that implicit values implementing
the algorithm for those types are available. This approach is similar to the
mechanism found in the Breeze library.
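To illustrate the idea, here is a minimal sketch of the type class pattern described above. It is not the code in this PR: `Vec`, `LabeledVec`, `FitOperation`, `TransformOperation`, `Scaler` and `PipelineSketch` are hypothetical names chosen for the example, and a plain `Seq` stands in for Flink's `DataSet` so that the snippet runs without Flink on the classpath.

```scala
// Hypothetical stand-ins for the two unrelated input types.
final case class Vec(values: Seq[Double])
final case class LabeledVec(label: Double, vector: Vec)

// The algorithm logic lives in type classes, one instance per supported input type.
trait FitOperation[Self, In] {
  def fit(instance: Self, input: Seq[In]): Unit
}

trait TransformOperation[Self, In, Out] {
  def transform(instance: Self, input: Seq[In]): Seq[Out]
}

// The operator only holds state: the model data trained in the fit phase.
// No separate model object is returned.
class Scaler {
  var maxAbs: Option[Double] = None

  // The compiler picks the implicit operation that matches the input type.
  def fit[In](input: Seq[In])(implicit op: FitOperation[Scaler, In]): Unit =
    op.fit(this, input)

  def transform[In, Out](input: Seq[In])(
      implicit op: TransformOperation[Scaler, In, Out]): Seq[Out] =
    op.transform(this, input)
}

object Scaler {
  // Stores the largest absolute component; falls back to 1.0 for all-zero or empty input.
  private def train(s: Scaler, vs: Seq[Vec]): Unit = {
    val m = vs.flatMap(_.values).foldLeft(0.0)((acc, x) => math.max(acc, math.abs(x)))
    s.maxAbs = Some(if (m == 0.0) 1.0 else m)
  }

  // Divides every component by the stored maximum.
  private def scale(s: Scaler, v: Vec): Vec = {
    val m = s.maxAbs.getOrElse(sys.error("fit must be called before transform"))
    Vec(v.values.map(_ / m))
  }

  implicit val fitVec: FitOperation[Scaler, Vec] =
    new FitOperation[Scaler, Vec] {
      def fit(s: Scaler, in: Seq[Vec]): Unit = train(s, in)
    }

  implicit val fitLabeledVec: FitOperation[Scaler, LabeledVec] =
    new FitOperation[Scaler, LabeledVec] {
      def fit(s: Scaler, in: Seq[LabeledVec]): Unit = train(s, in.map(_.vector))
    }

  implicit val transformVec: TransformOperation[Scaler, Vec, Vec] =
    new TransformOperation[Scaler, Vec, Vec] {
      def transform(s: Scaler, in: Seq[Vec]): Seq[Vec] = in.map(scale(s, _))
    }

  implicit val transformLabeledVec: TransformOperation[Scaler, LabeledVec, LabeledVec] =
    new TransformOperation[Scaler, LabeledVec, LabeledVec] {
      def transform(s: Scaler, in: Seq[LabeledVec]): Seq[LabeledVec] =
        in.map(lv => lv.copy(vector = scale(s, lv.vector)))
    }
}

object PipelineSketch {
  // Fits on plain vectors, then transforms labeled vectors with the same operator.
  def demo(): Seq[LabeledVec] = {
    val scaler = new Scaler
    scaler.fit(Seq(Vec(Seq(2.0, -4.0))))
    scaler.transform(Seq(LabeledVec(1.0, Vec(Seq(2.0, -4.0)))))
  }
}
```

In this sketch the same `Scaler` instance is fitted on `Vec` data and applied to `LabeledVec` data. Calling `fit` or `transform` with a type that has no implicit operation in scope fails at compile time, not at runtime.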
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tillrohrmann/flink pipeline
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/704.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #704
----
commit 4e5118b10cb7525e19147d49a5fdc6da3aae639c
Author: Till Rohrmann <[email protected]>
Date: 2015-05-05T13:04:32Z
[FLINK-2050] [ml] Introduces new pipelining mechanism using implicit
classes to wrap the algorithm logic
commit da7d0bfe3a0780b386fcb9b0640513c32ee7bbab
Author: Till Rohrmann <[email protected]>
Date: 2015-05-20T11:49:52Z
[FLINK-2050] [ml] Ports existing ML algorithms to new pipeline mechanism
Adds pipeline comments
Adds pipeline IT case
----