[
https://issues.apache.org/jira/browse/FLINK-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900541#comment-14900541
]
ASF GitHub Bot commented on FLINK-1719:
---------------------------------------
GitHub user JonathanH5 opened a pull request:
https://github.com/apache/flink/pull/1156
Pull Request
This pull request is related to
[FLINK-1719](https://issues.apache.org/jira/browse/FLINK-1719).
Multinomial Naive Bayes has been implemented (cc @tillrohrmann), and
different ideas proposed by other authors were incorporated.
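For readers unfamiliar with the technique, the following is a minimal sketch of multinomial Naive Bayes with add-one (Laplace) smoothing. It is plain Python, not Flink's Scala API and not the code from this pull request; the function names `fit` and `predict` and the tuple-based model layout are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def fit(documents):
    """Train on (label, list_of_words) pairs.

    Returns (log_prior, log_likelihood, vocabulary). This layout is a
    hypothetical one chosen for the sketch, not the pull request's model.
    """
    class_doc_counts = Counter()         # number of documents per class
    word_counts = defaultdict(Counter)   # word occurrences per class
    vocabulary = set()
    for label, words in documents:
        class_doc_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)

    total_docs = sum(class_doc_counts.values())
    log_prior = {c: math.log(n / total_docs)
                 for c, n in class_doc_counts.items()}

    log_likelihood = {}
    for c in class_doc_counts:
        # Add-one smoothing: every vocabulary word gets one pseudo-count.
        denominator = sum(word_counts[c].values()) + len(vocabulary)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + 1) / denominator)
            for w in vocabulary
        }
    return log_prior, log_likelihood, vocabulary


def predict(model, words):
    """Return the class with the highest log posterior score."""
    log_prior, log_likelihood, vocabulary = model
    scores = {}
    for c in log_prior:
        # Words never seen during training are skipped.
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][w] for w in words if w in vocabulary
        )
    return max(scores, key=scores.get)


training_data = [
    ("spam", ["buy", "cheap", "pills"]),
    ("spam", ["cheap", "offer"]),
    ("ham", ["meeting", "tomorrow"]),
    ("ham", ["project", "meeting", "notes"]),
]
model = fit(training_data)
print(predict(model, ["cheap", "pills"]))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together.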
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/JonathanH5/flink pullrequest
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/1156.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1156
----
commit aec4cf0b378247e479991f5356f169703ab8ee45
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-05-10T21:06:27Z
Added a first version of the Naive Bayes Classifier, it works.
commit d216a71edcf525b70ce76310ec122b1dcd72c6c6
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-08T16:11:43Z
First steps to convert to new MLL done
commit ab43bb2686afeed31793a4018b40e26a52c8d4c4
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-09T15:14:36Z
Small changes for talk with Till
commit b5952dbf222ccbb18c96d6ab626236fe1505e203
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-14T09:35:57Z
NaiveB now working with new MLL Layout
commit 1e8dfdf7494f5f57289551a434b400a92f35edb3
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-15T15:52:25Z
Removal of old code
commit 2bcee72bf185fde3e32dbb8e3fb0bc3c8fa73c05
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-16T13:53:29Z
Renamed to MultinomialNaiveBayes, improved code comments, created class for
automatic benchmarking: MultinomalNaiveBayesRuns
commit ddadbb0299f9116bf0c2acb6b11c5f26a4bd9e10
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-22T14:55:41Z
Added the first two possibilities to choose from, enhanced code comments
and code structure
commit e8f5b7dd1496edd235b6b40d5462077059002ebd
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-23T10:45:10Z
Added Possibility 3 and improved code comments by a lot (kind of done)
commit f7af3e06c95d58b38edae827974b182082b3d22a
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-23T12:44:41Z
Added tests for all possibilities that use data provided by the Collection
class
commit 42921e9c911fae996bac2aee4dd32a6b0ee7d3e7
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-24T09:51:22Z
Duplicated MultinomialNaiveBayes class and renamed it to
MultinomialNaiveBayesJoinedModel. Both classes (and tests) now do exactly the
same (also same line numbers), only the name differs.
commit a5e9c80c2214c178f3ca7c87b6e9e763409f90e0
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-24T11:36:09Z
MultinomialNaiveBayes now stores its data in two different models -> class
related and word related, results are the same but it seems to be faster than
MultinomialNaiveBayesJoinedModel, tests already work
commit a8e62cfc46eae87aece575896ea494c02bc48a11
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-27T12:04:35Z
Resolved 404 Scala style errors
commit c90b25d1f8de933400a6a69c307f28cbec317bb5
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-11T22:52:49Z
First incorporation of SR1, only Schneider so far, works but tests show that
accuracy for webkb is 10 percent worse
commit 8c15e7c0bc8a014840baa866b51b62edce2846ae
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-14T15:53:00Z
Added SR1 = 2, results seem weird. Also added first code for a Transformer
that applies feature selection
commit bb46a2951501d9ecbdb3161c306177fde751e770
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-19T14:09:10Z
Improved CRQ and some other things
commit 32c05d8d8860205e4e81f05a193980958a69d1b8
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-19T14:12:48Z
Removed changes for SR1 = 2 from the Fit Operation because nothing needs to
be changed there
commit ed843c6a95429a8522d71436c97f1ee0a7c8b159
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-20T11:21:02Z
Added SR=1
commit 509184692f4352e5d228897bfde8564a35163d39
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-21T15:28:49Z
Added R1
commit 98d1dee42c77d73e5da32246c6d7bbf9c8ac6f2e
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-25T12:44:26Z
Resolved systematic error when calculating SR1=1, SR1=2 and R1=1
commit dd4acacb18e01aa44e708d84724a51c96a705872
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-27T11:50:25Z
Version I used for testing the theory improvements
commit 06534f0d517219981577f678f8668c90be81bdab
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-28T10:55:03Z
Small changes
commit bab1ecb076e56b8227af84303e22e4beb6751e5c
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-09-21T11:15:49Z
Join with Huge
commit 2c70bf41f5d5f2e9536ce52995cae1e294776347
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-09-21T11:22:47Z
Cleanup for pull request
----
> Add naive Bayes classification algorithm to machine learning library
> --------------------------------------------------------------------
>
> Key: FLINK-1719
> URL: https://issues.apache.org/jira/browse/FLINK-1719
> Project: Flink
> Issue Type: New Feature
> Components: Machine Learning Library
> Reporter: Till Rohrmann
> Assignee: Jonathan Hasenburg
> Labels: ML
>
> Add naive Bayes algorithm to Flink's machine learning library as a basic
> classification algorithm. Maybe we can incorporate some of the improvements
> developed by [Karl-Michael
> Schneider|http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.59.2085&rep=rep1&type=pdf],
> [Sang-Bum Kim et
> al.|http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=1704799] or
> [Jason Rennie et
> al.|http://people.csail.mit.edu/jrennie/papers/icml03-nb.pdf] into the
> implementation.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)