[
https://issues.apache.org/jira/browse/FLINK-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15808074#comment-15808074
]
ASF GitHub Bot commented on FLINK-5426:
---------------------------------------
GitHub user Fokko opened a pull request:
https://github.com/apache/flink/pull/3081
[FLINK-5426] Clean up the Flink Machine Learning library
Hi guys,
I would like to contribute to the Flink ML library. I took the liberty of
cleaning up some of the code and improving the scaladoc. Besides that, I've
implemented #3077 to get more familiar with the Flink API, and I would love to
contribute more in the future, in particular to the machine learning library.
If you have any questions, please let me know. Let me know if improvements
to the ML library are appreciated in general.
- [x] General
- The pull request references the related JIRA issue ("[FLINK-XXX] Jira
title text")
- The pull request addresses only one issue
- Each commit in the PR has a meaningful commit message (including the
JIRA id)
- [x] Documentation
- Documentation has been added for new functionality
- Old documentation affected by the pull request has been updated
- JavaDoc for public methods has been added
- [x] Tests & Build
- Functionality added by the pull request is covered by tests
- `mvn clean verify` has been executed successfully locally or a Travis
build has passed
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/Fokko/flink fd-cleanup
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/3081.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3081
----
commit 013b22d7bcaf48c8e96983295fcc455faf0aa94b
Author: Fokko Driesprong <[email protected]>
Date: 2017-01-06T20:34:53Z
Removed duplicate tests, improved scaladoc and naming, fixed typos in
scaladoc, introduced and improved use of constants, improved test-case naming.
----
> Clean up the Flink Machine Learning library
> -------------------------------------------
>
> Key: FLINK-5426
> URL: https://issues.apache.org/jira/browse/FLINK-5426
> Project: Flink
> Issue Type: Improvement
> Components: Machine Learning Library
> Reporter: Fokko Driesprong
>
> Hi Guys,
> I would like to clean up the Machine Learning library. A lot of the code in
> the ML Library does not conform to the original contribution guide. For
> example:
> Duplicate tests with different names but exactly the same test case:
> https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala#L148
> https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala#L164
> Many multi-line test cases:
> https://github.com/Fokko/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/math/DenseVectorSuite.scala
> Misuse of constants:
> https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/math/DenseMatrix.scala#L58
> Please allow me to clean this up. I'm looking forward to contributing more
> code, especially to the ML part. I have been a contributor to Apache Spark
> and am happy to extend the codebase with new distributed algorithms and make
> the codebase more mature.
> Cheers, Fokko