GitHub user JonathanH5 opened a pull request:
https://github.com/apache/flink/pull/1156
Pull Request
This pull request is related to
[FLINK-1719](https://issues.apache.org/jira/browse/FLINK-1719).
Multinomial Naive Bayes has been implemented (cc @tillrohrmann), and
different ideas proposed by other authors have been incorporated.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/JonathanH5/flink pullrequest
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/1156.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1156
----
commit aec4cf0b378247e479991f5356f169703ab8ee45
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-05-10T21:06:27Z
Added a first version of the Naive Bayes Classifier, it works.
commit d216a71edcf525b70ce76310ec122b1dcd72c6c6
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-08T16:11:43Z
First steps to convert to new MLL done
commit ab43bb2686afeed31793a4018b40e26a52c8d4c4
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-09T15:14:36Z
Small changes for talk with Till
commit b5952dbf222ccbb18c96d6ab626236fe1505e203
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-14T09:35:57Z
NaiveB now working with new MLL Layout
commit 1e8dfdf7494f5f57289551a434b400a92f35edb3
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-15T15:52:25Z
Removal of old code
commit 2bcee72bf185fde3e32dbb8e3fb0bc3c8fa73c05
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-16T13:53:29Z
Renamed to MultinomialNaiveBayes, improved code comments, created class for
automatic benchmarking: MultinomalNaiveBayesRuns
commit ddadbb0299f9116bf0c2acb6b11c5f26a4bd9e10
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-22T14:55:41Z
Added the first two possibilities to choose from, enhanced code comments
and code structure
commit e8f5b7dd1496edd235b6b40d5462077059002ebd
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-23T10:45:10Z
Added Possibility 3 and improved code comments by a lot (kind of done)
commit f7af3e06c95d58b38edae827974b182082b3d22a
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-23T12:44:41Z
Added tests for all possibilities that use data provided by the Collection
class
commit 42921e9c911fae996bac2aee4dd32a6b0ee7d3e7
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-24T09:51:22Z
Duplicated MultinomialNaiveBayes class and renamed it to
MultinomialNaiveBayesJoinedModel. Both classes (and tests) now do exactly the
same (also same line numbers); only the name differs.
commit a5e9c80c2214c178f3ca7c87b6e9e763409f90e0
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-24T11:36:09Z
MultinomialNaiveBayes now stores its data in two different models
(class-related and word-related). Results are the same, but it seems to be
faster than MultinomialNaiveBayesJoinedModel. Tests already work.
commit a8e62cfc46eae87aece575896ea494c02bc48a11
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-07-27T12:04:35Z
Resolved 404 Scala style errors
commit c90b25d1f8de933400a6a69c307f28cbec317bb5
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-11T22:52:49Z
First incorporation of SR1, only Schneider so far; works, but tests show that
accuracy for webkb is 10 percent worse
commit 8c15e7c0bc8a014840baa866b51b62edce2846ae
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-14T15:53:00Z
Added SR1 = 2, results seem weird. Also added first code for a Transformer
that applies feature selection
commit bb46a2951501d9ecbdb3161c306177fde751e770
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-19T14:09:10Z
Improved CRQ and some other things
commit 32c05d8d8860205e4e81f05a193980958a69d1b8
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-19T14:12:48Z
Removed changes for SR1 = 2 from the Fit Operation because nothing needs to
be changed there
commit ed843c6a95429a8522d71436c97f1ee0a7c8b159
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-20T11:21:02Z
Added SR=1
commit 509184692f4352e5d228897bfde8564a35163d39
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-21T15:28:49Z
Added R1
commit 98d1dee42c77d73e5da32246c6d7bbf9c8ac6f2e
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-25T12:44:26Z
Resolved systematic error when calculating SR1=1, SR1=2 and R1=1
commit dd4acacb18e01aa44e708d84724a51c96a705872
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-27T11:50:25Z
Version I used for testing the theory improvements
commit 06534f0d517219981577f678f8668c90be81bdab
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-08-28T10:55:03Z
Small changes
commit bab1ecb076e56b8227af84303e22e4beb6751e5c
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-09-21T11:15:49Z
Join with Huge
commit 2c70bf41f5d5f2e9536ce52995cae1e294776347
Author: Jonathan Hasenburg <[email protected]>
Date: 2015-09-21T11:22:47Z
Cleanup for pull request
----