This is an automated email from the ASF dual-hosted git repository.

joern pushed a change to branch fixed_target_length
in repository https://gitbox.apache.org/repos/asf/opennlp-sandbox.git.


      at 8c74466  Compute max target length instead of fixed value in normalizer

This branch includes the following new commits:

     new 1542aa1  OPENNLP-132 Added sandbox folder
     new 7449178  OPENNLP-206 Created folder for corpus server
     new de7270c  OPENNLP-206 Initial corpus server check in
     new 0cb1372  OPENNLP-209 Created wikinews-importer folder
     new 64fa6a9  OPENNLP-209 Simple tool to upload wikinews xmi files
     new 335ecf7  OPENNLP-206 Added support to parse CASes and a method to 
create a corpus
     new 4334479  OPENNLP-209 Added one more util to create a corpus in the 
corpus server
     new 0935bb4  OPENNLP-206 Added folder for Corpus Server Client
     new e723d04  OPENNLP-206 Created a tools project for the corpus server
     new 8e12c0f  OPENNLP-206 Renamed from client to tools
     new 61a6cfd  OPENNLP-206 Added the usual suspects to svn:ignore
     new d29c6cd  OPENNLP-211 First version of wikinews importer, based on code 
contributed by Olivier Grisel. Thanks.
     new dbf560c  OPENNLP-208 Added derby storage support
     new 308f24a  OPENNLP-235 Created project folder for Cas Editor plugin
     new 7dd2e29  OPENNLP-235 Added initial project structure
     new 8ac39b3  OPENNLP-235 Added basic pom based on a pom from an uima 
eclipse project
     new 29b99b9  OPENNLP-235 Added OpenNLP Plugin class
     new 57d1407  OPENNLP-235 Added preference initializer
     new 8c1c3b2  OPENNLP-235 Added preference page
     new a3e8f36  OPENNLP-235 Added name finder view classes
     new f7ef4a0  OPENNLP-235 Added columns to the table viewer and it shows 
now a sample entity
     new 5f1fd2d  OPENNLP-235 Added the name finder to the content provider. 
Now it runs once and provides the table with a list of names.
     new 3932004  OPENNLP-235 Added confirmed column to the table, and a 
confirmed field to Entity.
     new 4475d44  OPENNLP-235 Added confirm action. Right now only creates an 
annotation.
     new 2a6260f  OPENNLP-235 Name finder is now restricted to existing 
annotations.
     new d7139fc  OPENNLP-235 Added entity merging, and name finder is 
triggered on changes.
     new 7c0c01c  OPENNLP-235 searchEntities now searches for intersecting 
entities, instead of entities with identical bounds.
     new 1918c9c  OPENNLP-235 Added a sorter to the table.
     new f1735c1  OPENNLP-207 Initial lucas indexing support
     new 025e479  OPENNLP-207 Initial lucas indexing support
     new 45286a5  OPENNLP-207 Initial lucas indexing support
     new 2f70982  OPENNLP-207 Improved lucas integration
     new 74b3d69  OPENNLP-235 Fixed mistake in sequence validation
     new 896c44f  OPENNLP-235 Renamed project from opennlp-caseditor-plugin to 
caseditor-opennlp-plugin
     new e9bcd69  OPENNLP-249 Now indexes the full article text, and person and 
organization annotations.
     new 53cc528  OPENNLP-250 Now using prepared statement. All chars are 
correctly escaped now.
     new 08b2c9e  OPENNLP-249 Changed searcher to reflect new mapping file.
     new ac293f1  OPENNLP-249 Changed return type of search method.
     new 36829e7  OPENNLP-249 Changed return type of search method.
     new 23b2d61  OPENNLP-210 Initial check in of task queue interfaces
     new ca69a45  OPENNLP-254 Index is now either created or appended.
     new 0174dee  OPENNLP-251 Added search tool
     new f025676  OPENNLP-210 Initial task queue support
     new 0330e72  OPENNLP-235 Fixed a bug in the restricted sequence validation 
code
     new 4cdd129  OPENNLP-252 Created project folder
     new 1f12022  OPENNLP-252 Created source folder
     new 0b8e527  OPENNLP-252 Copied pom.xml from caseditor opennlp plugin and 
updated names
     new 9e9eb70  OPENNLP-252 Changed a name
     new 8ec349e  OPENNLP-235 Added code which selects and reveals the text in 
the Annotation Editor of the selected, not confirmed entity.
     new 7c21336  OPENNLP-235 Now is forced to detect every confirmed entity 
token as an entity. This dramatically boosts the recall of the name finder.
     new 8756404  OPENNLP-276 Added test corpus server instance. Thanks to 
Tommaso Teofili for providing this class as part of a patch he attached to 
OPENNLP-261.
     new ee51940  OPENNLP-261 Added project folder for the 
corpus-server-connector
     new f950dbd  OPENNLP-261 Added empty source folder
     new 4d0c083  OPENNLP-261 Added pom.xml file based on pom from 
corpus-server-tools
     new 0fa0a4d  OPENNLP-261 Added empty test source folder
     new 899c117  OPENNLP-261 Added empty test source folder
     new a41c1eb  OPENNLP-261 Added test descriptors. Thanks to Tommaso Teofili 
for contributing this.
     new d0b66f2  OPENNLP-261 Added collection reader, cas consumer and tests. 
Thanks to Tommaso Teofili for contributing this.
     new 39ec8b8  OPENNLP-281 First draft of a corpus backup tool
     new 81c5bdc  OPENNLP-281 Also created a tool to create a task queue
     new 155b59e  OPENNLP-281 Fixed path handling
     new 5ed2738  OPENNLP-277 Fixed bug in id handling, a relative id was used 
without re-basing it.
     new ef3e240  OPENNLP-277 Removed unused set, which was added for debugging.
     new ad37b65  OPENNLP-285 Added fix to load model from configured path 
instead from classpath
     new 00d97b3  OPENNLP-252 Added generated Activator class
     new 13904df  OPENNLP-252 Added dummy view, to test that it can be created.
     new 50910c3  OPENNLP-252 Updated activator class name.
     new 85d7f0e  OPENNLP-252 Added corpus server editor input, there a 
resource is identified by an URL and cas id
     new a8ae2b5  OPENNLP-252 First experimental code to open an Annotation 
Editor on a CAS from the Corpus Server.
     new 0b23401  OPENNLP-277 Updated Lucas dependency to released version.
     new 7f3a638  OPENNLP-252 Now it can pull the next CAS from the server.
     new 8336a91  OPENNLP-252 Added support to save a CAS back to the server, 
based on Cas Editor code.
     new a71f959  OPENNLP-252 Updated document provider extension to work with 
updated Cas Editor.
     new a5aec01  OPENNLP-252 Fixed formating.
     new 3f44de9  OPENNLP-252 Fixed view ids.
     new f7e4f78  OPENNLP-252 Added equals and hashCode methods.
     new debcea8  OPENNLP-252 Settings are now saved in memory
     new e79e1ea  OPENNLP-252 Fixed equals.
     new 5e3b9b6  OPENNLP-252 Updated implementation to work with changed 
CasDocumentProvider.
     new 965b491  OPENNLP-252 Fixed handling of editor annotation status.
     new 1a5592c  OPENNLP-252 Now only keeps n items in the history. History 
view now contains IEditorInput elements and supports opening via the opening 
listener.
     new 62cfe02  OPENNLP-235 Stump for the sentence detector view
     new 8d449bb  OPENNLP-235 Moved settings for token and sentence type config 
to OpenNLP pref page.
     new 9dd3b3d  OPENNLP-235 Moved confirm action to inner class.
     new a8eb227  OPENNLP-235 Added sentence detector job.
     new 76ddd68  OPENNLP-235 Now sets the preference store.
     new 2fcc861  OPENNLP-235 Initial check in.
     new 6be5754  OPENNLP-235 Instance variables are now local variables.
     new 1920625  OPENNLP-235 Added preference field for paragraph annotation 
type
     new 4f0cb52  OPENNLP-235 Moved comparator to separate class
     new ae55ee9  OPENNLP-235 Moved confirm action to separate class
     new 46b99c0  OPENNLP-235 Declared methods as public
     new 585077f  OPENNLP-235 Moved inner classes to separate files
     new 39e7653  OPENNLP-235 Added dummy tokenizer view.
     new d569cc5  OPENNLP-235 Removed call to add FS to CAS, because that is 
already done by the ICasDocument
     new e9ecc3f  OPENNLP-235 Improved inputChanged implementation, and added 
some comments.
     new 2a700fe  OPENNLP-235 Added very basic sentence detector support.
     new 47c24ac  OPENNLP-235 Added wrong field.
     new 74b52d6  OPENNLP-235 Pages are now grouped under OpenNLP.
     new 8ec1e8a  OPENNLP-235 Changed merge logic slightly
     new 4e4d3cb  OPENNLP-235 Changed merge logic slightly
     new 68a4121  OPENNLP-235 Added setter for entity text
     new 583d569  OPENNLP-235 Improved entity confirmation listener
     new 47153bd  OPENNLP-235 Removed unused imports
     new 9b24a7b  OPENNLP-235 Added dummy for tokenizer preference page.
     new 0b9b097  OPENNLP-235 Added edit to entity list selection support based 
on Cas Editor code.
     new d37d393  OPENNLP-235 Entity List now provides AnnotationFS selection 
also.
     new 325e636  OPENNLP-235 Hooked up Quick Annotate Action command to the 
confirm action, now a user can confirm a potential annotation by pressing Enter.
     new 8dca767  OPENNLP-235 Sentence detection can now be restricted to 
paragraph annotations.
     new 6af2d1c  OPENNLP-235 Added key short cut to confirm action.
     new 3909d95  OPENNLP-235 Added tokenizer job.
     new e6b9774  OPENNLP-235 Added tokenizer constants.
     new 0876bf9  OPENNLP-235 Name Finder now supports multiple sentence types.
     new c2eba79  OPENNLP-235 Extended the Span object to add a confidence 
score.
     new 31c9ebe  OPENNLP-235 Moved name finder integration code to a separate 
class.
     new 4d616b3  OPENNLP-235 First support for multiple models.
     new f2c4224  No jira, added a comment.
     new 6a2af99  OPENNLP-299 Index mapping can now be defined per corpus.
     new efe9c15  OPENNLP-299 Index mapping can now be defined per corpus.
     new 62c100b  OPENNLP-299 Added index mapping file.
     new 469a654  OPENNLP-299 Index mapping can now be defined per corpus.
     new fe5848b  No jira, added comment.
     new 6d8b69e  OPENNLP-300 Added ability to import a single xmi file or a 
folder of xmi files.
     new 17b8a43  OPENNLP-299 Fixed handling of index mapping file.
     new d046b57  OPENNLP-299 Fixed handling of index mapping file.
     new 2e93edd  OPENNLP-261 Converted the consumer to read from the Corpus 
Server via its RESTful API instead.
     new af708e9  OPENNLP-261 Added collection reader xml file which is derived 
from Apache UIMAs example FileSystemCollectionReader.xml.
     new 1683e39  OPENNLP-261 Adapted stump collection reader xml to work with 
CScollectionReader class.
     new 7a438d7  OPENNLP-261 Renamed collection reader
     new 42c6209  OPENNLP-261 Inserted sample addresses
     new b5efcea  OPENNLP-261 Added analysis engine xml file which is derived 
from Apache UIMAs example RegExAnnotator.xml.
     new 672edd1  OPENNLP-261 Renamed CSCasConsumer to CSCasWriter
     new 9bd2a95  OPENNLP-261 Adapted descriptor to work with CSCasWriter.
     new 544f7d2  OPENNLP-302 Now the generated OSGi MANIFEST.MF is included in 
the created jar file.
     new ead91a6  OPENNLP-302 Fixed path to MANIFEST.MF file.
     new e2ebebf  OPENNLP-235 It is now possible to create annotations 
depending on the entity type.
     new 657ad30  OPENNLP-235 Enabled verified name restriction again.
     new 0bc012d  OPENNLP-235 Enabled recall boosting for already confirmed 
names again.
     new 2935b30  OPENNLP-235 Now also uses type to decide if two entities are 
equal.
     new 926963f  OPENNLP-235 Improved matching of types, now also considers 
the type.
     new 3943759  OPENNLP-299 Added JSON configuration
     new 85d3f66  OPENNLP-252 Added a Corpus Explorer View to browse the 
contents of a corpus
     new e8263b5  OPENNLP-310 First changes to use new preference store. Also 
added a bit of error handling to avoid exceptions, but further work is needed.
     new 9a1f343  OPENNLP-310 Moved action to show preference dialog to a 
separate action. Added action to sentence detector view tool bar.
     new 1c14df1  OPENNLP-312 Confirmed entities are now directly removed from 
the list of new potential entities.
     new 080a72a  OPENNLP-310 Removed preferences pages
     new ceb9da3  OPENNLP-314 Updated UIMA version to 2.4.0-SNAPSHOT from 
2.3.2-SNAPSHOT and changed jar file name.
     new 47c5f85  OPENNLP-314 Updated UIMA version to 2.4.0-SNAPSHOT from 
2.3.2-SNAPSHOT.
     new 8bd79ae  OPENNLP-310 Fixed Sentence Detector Preference Page, it was 
accidentally showing the Name Finder Preference Page.
     new 46dfa33  OPENNLP-310 Removed unused method call to retrieve the 
plugins preference store.
     new 3f4ddbd  OPENNLP-310 Updated content provider to be compatible with 
latest Cas Editor changes in UIMA-2245.
     new 2d40573  OPENNLP-310 Added call to document provider to save changed 
settings!
     new d5366df  OPENNLP-315 Added log methods.
     new 34369d7  OPENNLP-321 Added code to locally store type system 
preferences.
     new c472b55  OPENNLP-310 Updated to be provide with new session preference 
store.
     new bb5be7f  OPENNLP-253 Created project for new relevance component.
     new 240285e  OPENNLP-253 Added source folders.
     new 348eb99  OPENNLP-253 Initial check in of contribution from Boris 
Galitsky. Thanks for contributing.
     new a3e1798  OPENNLP-315 Added first error reporting to the name finder 
view.
     new a7643e9  OPENNLP-315 Added error reporting to the sentence detector 
view.
     new 50753d2  OPENNLP-312 Improved selection handling after a possible 
entity is removed from the list through confirmation.
     new 353cf1a  OPENNLP-312 Fixed setting of jobs system status.
     new f242d0f  OPENNLP-303 Added a containing constraint. Class is taken from 
OpenNLP UIMA Integration.
     new 6ddcc83  OPENNLP-303 Now uses token annotations from input CAS instead 
of simple tokenizer.
     new 51a412b  OPENNLP-315 Now reports an error if sentence annotation is 
invalid.
     new 18c7772  OPENNLP-315 Now correctly closes input stream during model 
loading.
     new 05ee426  OPENNLP-310 Added name space to preference keys.
     new 5090e53  OPENNLP-310 Changing properties now triggers a name finder 
run.
     new 24f533a  OPENNLP-310 Added an option to enable / disable confirmed 
name recall boosting
     new b1ac975  OPENNLP-326 Now multiple paragraph types can be configured.
     new 44a0b17  OPENNLP-323 Fixed dependencies, and compatibility changed to 
work with 1.5.2. Applied patch provided by Boris Galitsky. Thanks for providing 
the patch.
     new 280b41c  OPENNLP-323 Added missing AL 2.0 headers. Thanks to Boris 
Galitsky for providing a patch.
     new fad10e2  OPENNLP-323 Replaced log4j with java.util.log. Thanks to 
Boris Galitsky for providing a patch.
     new 0b41238  OPENNLP-330 Fixed junit tests, so they work with opennlp 
1.5.2. Thanks to Boris Galitsky for providing a patch.
     new 1bd7c1b  OPENNLP-330 Fixed junit tests, so they work with opennlp 
1.5.2. Thanks to Boris Galitsky for providing a patch.
     new c48473e  OPENNLP-323 Replaced log4j with java.util.log. Thanks to 
Boris Galitsky for providing a patch.
     new b46c4c7  OPENNLP-331 Added functions substituting POS taggers by 
Parser POS. Thanks to Boris Galitsky for providing a patch.
     new 8431406  No jira, added eclipse files to svn:ignore.
     new 670c219  OPENNLP-313 Added a button to do tokenization once.
     new e0f4d5d  OPENNLP-345 Now handles the case correctly where nothing is 
selected in the entity list.
     new 8190469  OPENNLP-339 If the query parameter is specified the queue 
will be created or reset.
     new bfdc73e  OPENNLP-311 Initial http model loading support.
     new 1240979  OPENNLP-348 Turned open button into an open listener
     new 1543d0d  OPENNLP-346 Now remembers last used server. Renamed Activator 
to CorpusServerPlugin.
     new 6f19af7  OPENNLP-347 Replaced text query field with combo query field 
which even remembers the last used queries.
     new 65527ef  OPENNLP-350 Corpus Explorer should not query server in the UI 
thread
     new dcffa81  OPENNLP-351 Now enter or selection from the query combo 
triggers a search.
     new cdd1988  OPENNLP-328 Now considers sentences which are already in the 
CAS. Sentence detection is triggered automatically.
     new f6ff3b2  OPENNLP-353 New label provider shows begin and end of a 
sentence.
     new 1b60558  OPENNLP-354 Specified timeout and fixed error handling.
     new 28854d9  No jira, added suppress warnings for raw types
     new 36a49c3  OPENNLP-356 Now uses cas editor listener to listen for input 
changes. If input is changed view is refreshed.
     new 5467a41  OPENNLP-356 Updated to work with the latest version of the 
editor listener.
     new 1646bf8  OPENNLP-356 Updated change listener, to use common base class.
     new c452da2  No jira, added suppress warnings.
     new 4c0343e  OPENNLP-356 Fixed confirm action to always use current 
document.
     new 9a5f3e6  OPENNLP-337 Moved Porter Stemmer to opennlp.tools.stemmer 
package. Thanks to Boris Galitsky for providing a patch.
     new bbebdc4  OPENNLP-356 Now uses new CasEditorView as a base which 
handles input changes by re-creation of the view page.
     new 16fcd30  OPENNLP-392 Now maintains the selection after confirmation.
     new 2c2b434  OPENNLP-387 Demonstration on how similarity component 
improves search accuracy. Thanks to Boris Galitsky for providing a patch.
     new 31670cd  OPENNLP-387 Added missing AL header, and formated code.
     new 1c4ede5  OPENNLP-392 Now maintains the selection after confirmation.
     new e6b35e8  OPENNLP-392 Improved selection handling after one was skipped.
     new 550e9db  OPENNLP-401 Selection is only changed if the user confirms an 
annotation from the name finder view.
     new 4db48cf  OPENNLP-408 Now only updates selection when sentence detector 
view is active.
     new d95b777  OPENNLP-405 Now ordering also depends on confidence.
     new f1a265b  OPENNLP-410 Refactored Entity class into PotentialAnnotation 
and removed old Entity left overs.
     new 90b7004  OPENNLP-410 Added javadoc comment.
     new c78818d  No jira, changed OpenNLP version to 1.5.2.
     new 04d3744  OPENNLP-319 Added new type list field editor
     new 328e935  OPENNLP-319 Minor layout improvements.
     new 6cb4f21  OPENNLP-319 Added new field editor for UIMA types.
     new b6024c1  OPENNLP-319 Added two new field editors to name finder 
preference page to make configuration easier.
     new e053af6  OPENNLP-324 Added configuration options and added initial 
capital letter filter.
     new fd7c143  Creating folder for new OpenNLP ML rewrite.
     new 79692b2  OPENNLP-39 Branched opennlp-maxent for opennlp-ml
     new 1c7bbb7  OPENNLP-39 Deleted incorrectly copied opennlp-maxent branch.
     new ff93001  OPENNLP-39 Branched opennlp-maxent for opennlp-ml
     new 558e359  OPENNLP-119 Moved classes to new location in new project 
structure
     new db43e73  OPENNLP-119 Fixed package declarations.
     new bf93714  Updated test code to have org.apache.ml package prefix. 
Updated POM. OPENNLP-416
     new fe66ee6  OPENNLP-319 Now correctly used Modify event instead of 
Selection event.
     new ff11418  No jira, added missing AL header.
     new 6f2a1a2  repaired appropriateness criterion for sentence inclusion 
into generated content
     new d4ff792  demonstration how sensitive syntactic match is compared to 
bag-of-words approach Key: OPENNLP-413
     new 5fc74ef  OPENNLP-414.txt
     new cbc2068  OPENNLP-277 Index is now reused and re-opened on index 
change. Analyzer was changed from whitespace to standard.
     new 17313bd  OPENNLP-419 readme.txt + more code comments for similarity 
component
     new 81bcd7e  Move opennlp to tlp as per INFRA-4456
     new a95aaf7  OPENNLP-411 Updated UIMA dependency to 2.4.0.
     new bee2db2  OPENNLP-411 Updated UIMA dependency to 2.4.0.
     new 39198fc  No jira, added missing mention of mapping file to help 
message.
     new 5495a91  OPENNLP-457 Type System is now resolved before it is sent to 
the Corpus Server.
     new 3416312  No jira, fixed formating.
     new fc41455  OPENNLP-458 Now checks that corpus exists instead of just 
assuming that it exists.
     new f318026  OPENNLP-458 Removed left over debug message.
     new 3ecbacf  OPENNLP-458 getCorpus now returns null if corpus does not 
exist.
     new 89b8cfc  OPENNLP-459 Now searcher is correctly chosen for corpus.
     new 56672af  OPENNLP-459 Searcher is now closed on shutdown!
     new ee18433  No jira, removed unused imports.
     new b0bb0eb  No jira, added wtp eclipse config file to svn ignore.
     new 28cedc6  OPENNLP-460 Updated to fit implementation.
     new 6882a66  OPENNLP-461 Fixed a bug in query remembering code.
     new f962a49  OPENNLP-462 Added support to exclude annotation types from 
intersecting with recommended sentences. Existing sentences are now handled via 
the new exclude logic.
     new eddbf23  OPENNLP-462 Fixed NPE when no exclusion types are specified.
     new 5f8a96d  OPENNLP-464 Model loading error is now reported to the user.
     new 2a87904  OPENNLP-465 Now it is triggered by preferences changes as 
well.
     new 990215d  OPENNLP-462 Fixed code to exclude sentences.
     new 9b453bb  OPENNLP-467 Now logs creation of a queue.
     new 35e94cb  OPENNLP-468 Added lowercase filter to better work with 
standard analyzer
     new 615a267  OPENNLP-468 Created a constant for the id field.
     new 8af4465  No jira, fixed formating.
     new 2b87455  OPENNLP-340 Added support to remove a CAS from a corpus.
     new b5bd305  OPENNLP-475 Changed CorpusSever.getTypeSystem to return a 
byte array.
     new cebd3d8  OPENNLP-472 Created new project for corpus-server 
implementation.
     new b10a4ec  OPENNLP-472 Created folder structure.
     new b854621  OPENNLP-472 Copying impl classes from corpus-server project.
     new 40c67ce  OPENNLP-472 Copying impl classes from corpus-server project.
     new 03d6585  OPENNLP-472 Fixed package declaration.
     new 84ad141  OPENNLP-472 Added pom.xml and Activator class. Fixed imports 
in copied implementation classes.
     new cdd4e70  OPENNLP-472 Moved implementation over to corpus-server-impl 
and changed build to produce an OSGi bundle.
     new ca10b92  OPENNLP-472 Added moved resources
     new 0ae521a  OPENNLP-476 Added OSGi classes
     new 14b8d3d  OPENNLP-480 Created tagging server project folder in sandbox.
     new 6493ee1  OPENNLP-480 Created initial structure for tagging server.
     new 0e04d71  OPENNLP-476 Added OSGi bundle and the application for the 
rest servlet
     new 61c0f91  OPENNLP-420 to speed up similarity computation, store parsing 
results in a hash, so that if a sentence has been parsed, chunked and prepared 
for matching once, we store it in a hash. when the Processor is instantiated, 
hash is deserialized. When the processor is closed, this hash is serialized.
     new f0e1f0f  OPENNLP-420: cached parsing results for junits *.dat file
     new 4390c27  OPENNLP-419 write a doc which will introduce potential users 
to the Component
     new 39e8132   OPENNLP-436 Auto Taxonomy Learner for Search Relevance 
Improvement based on Similarity
     new 149bb28  resources for OPENNLP-436 Auto Taxonomy Learner for Search 
Relevance Improvement based on Similarity
     new b87df8e  test for OPENNLP-436 Auto Taxonomy Learner for Search 
Relevance Improvement based on Similarity
     new 48d89f2  OPENNLP-489 Now always uses end marker which comes first in 
article. Thanks to Prokopis Prokopidis for providing a patch.
     new eecf0ac  OPENNLP-420: cached parsing results for junits *.dat file now 
caches into CSV file instead of java object serialization
     new b1ad93d  OPENNLP-420: cached parsing results for junits *.dat file now 
caches into CSV file instead of java object serialization added cache in CSV 
format: sentence_parseObject.CSV
     new 20d048b  formatting fixed by applying template OPENNLP-419 write a doc 
which will introduce potential users to the Component
     new ec9aa61  OpenNLP OPENNLP-497 create maven script, release notes
     new 1747178  OpenNLP OPENNLP-497 create maven script, release notes
     new d692dce  OPENNLP-480 First draft of POS Tagging service.
     new e073aa2  OPENNLP-480 First draft of Name Finder service.
     new ff786e1  OPENNLP-480 Added sample model bundle and feature xml for 
easier installation.
     new f58dbc4  OPENNLP-476 Added features.xml files to ease the installation 
into an OSGi Runtime such as Apache Karaf.
     new fe50c98  No jira, added argument validation.
     new 5a80b5a  OPENNLP-513 Added support to drop a corpus
     new 8f2ee87  OPENNLP-472 Adjusted packages imports to work with OSGi
     new a1c8635  OPENNLP-518 Models can now be loaded from file URLs as well.
     new a96dac5  OPENNLP-480 Added initial support for tokenizer and sentence 
detector, updated name finder and pos tagger.
     new dc3b53a  OPENNLP-480 Added initial support for tokenizer and sentence 
detector, updated name finder and pos tagger.
     new 3c6e8b2  OPENNLP-480 Updated and extended sample configuration.
     new e3a22b8  
     new eb2a5f8  OPENNLP-480 Added first draft of simple web demo.
     new 68c3bfb  OPENNLP-480 Fixed CSS problems.
     new 6237da8  OPENNLP-480 Fixed bug in offset handling.
     new 46f4e5c  No jira, fixed formating.
     new 12a5e2d  OPENNLP-528 Added method to replace the type system of a 
corpus.
     new 15287ae  OPENNLP-528 Added method to replace the type system of a 
corpus.
     new 9b2979c  OPENNLP-528 Added support to resolve the replaced type system.
     new 72302ac  OPENNLP-531 Output directory must now be passed in as an 
argument.
     new f5a5143  OPENNLP-532 Added start script.
     new 2801904  No jira, added annotation status feature structure.
     new e5615a7  OPENNLP-532 Added start script.
     new b71c207  OPENNLP-532 Renamed script.
     new 1cd3d45  OPENNLP-261 Implemented CAS write support.
     new 54f8ff6  OPENNLP-261 Added sample to perform sentence detection and 
tokenization via a CPE.
     new 5ba9519  OPENNLP-261 Added sample to train a person name finder.
     new d7eef46  OPENNLP-537: make an access to generic search engines to 
demonstrate search results re-ranking
     new 2ce90b3  OPENNLP-538 Another illustration for similarity component: 
converting natural language task into Java code
     new ea5e96b  OPENNLP-497 Fixed maven script to build distributions
     new 82ec363  OPENNLP-497 Fixed maven script to build distributions 
Re-built cache for runs w/o models
     new 0d914c2  OPENNLP-497 Fixed maven script to build distributions better 
handling of cases where models are unavailable
     new 57b19b5  OPENNLP-540 SOLR request handler for search results 
re-ranking based on 'Similarity'
     new 7501fd8  OPENNLP-497 Fixed maven script to build distributions updated 
parsing cache for junits
     new 01f2232  OPENNLP-497 Fixed maven script to build distributions added 
caching for search engine api calls
     new 0cd102a  OPENNLP-497 Fixed maven script to build distributions updated 
thresh for tests
     new 7bfed99  OPENNLP-497 Fixed maven script to build distributions only 
use cached external web search engine results
     new bab831a  copied from Apache Tika project
     new 2c4184e  OPENNLP-497 Fixed maven script to build distributions fixed 
[WARNING] 
/opennlp-similarity/src/main/java/opennlp/tools/nl2code/NL2ObjCreateAssign.java:210:
 warning: unmappable character for encoding UTF-8
     new 4ae829c  OPENNLP-497 Fixed maven script to build distributions 
assembly.xml is copied from Tika
     new be734cd  OPENNLP-497 Fixed maven script to build distributions 
assembly plugin is added
     new 55ab905  OPENNLP-497 Fixed maven script to build distributions moved 
license & notice files to the root
     new fe4c1a2  OPENNLP-497 Fixed maven script to build distributions fixed  
license & notice
     new 1dfeaa3  OPENNLP-497 Fixed maven script to build distributions fixed  
license & notice
     new c518d24  <?xml version="1.0" encoding="UTF-8"?>
     new aa6d29c  <?xml version="1.0" encoding="UTF-8"?>
     new 928cc2e  pom.xml
     new f6492bc  pom.xml
     new 7512e44  
src/main/java/opennlp/tools/similarity/apps/WebSearchEngineResultsScraper.java
     new b7952a1  [maven-release-plugin] prepare release 
opennlp-similarity-0.0.1
     new 3746842  bing api
     new 34b5e71   OPENNLP-555 Now throws a Core Exception if document couldn't 
be saved.
     new 733e4a3  opennlp-548 bing api
     new 48d6812  opennlp-548 bing api
     new 409673f  OPENNLP-575 Created the new coref project folder
     new a6a17c1  OPENNLP-575 Created src folder and opennlp.tools package.
     new c76719b  OPENNLP-575 Copied coref component main code over to sandbox 
project.
     new 2b3ebb6  OPENNLP-575 Added cmdline package
     new ca4db4f  OPENNLP-575 Copied cmd line tools over to opennlp-coref
     new bbde4e3  OPENNLP-575 Added initial pom file to build to coref 
component.
     new 3d70692  OPENNLP-575 Created lang folder.
     new 8c84ec8  OPENNLP-575 Copied the old English coref command line tools
     new 9375e8c  OPENNLP-575 Created formats folder.
     new 0358d6b  
     new adf992f  OPENNLP-575 Copied coref over to sandbox.
     new 5f18ee8  OPENNLP-575 Copied coref over to sandbox.
     new 8c814c8  OPENNLP-585 Added a Brat NER tagging service.
     new 95089e6  OPENNLP-585 Added a Brat NER tagging service.
     new 698a60a  Prototype of a tool to allow users to create models from  of 
a set of known entities based on their own data in the form of sentences. See 
the Example class in the .v2 package.
     new 25d225d  Prototype of a tool to allow users to create models from  of 
a set of known entities based on their own data in the form of sentences. See 
the Example class in the .v2 package.
     new 55c5a17  Changed to use interface sig rather than impl
     new 6ea8536  OPENNLP-607 Changed header template. Referenced new jira 
ticket
     new 961dab4  OPENNLP-611 POM with 1.7 build tags.
     new 543b97a  OPENNLP-614 Moved all GeoEntityLinker impl classes to 
sandbox. Called this module addons as a place to consolidate useful addons to 
the base opennlp modules.
     new eb46984  OPENNLP-614 Moved all GeoEntityLinker impl classes to 
sandbox. Called this module addons as a place to consolidate useful addons to 
the base opennlp modules.
     new c570240  OPENNLP-615 Added a scoring impl that utilizes a doccat model 
to help with toponym resolution. The ModelBasedScorer also contains two static 
methods for training the model based on the CountryContext information used by 
the GeoEntityLinker.
     new 8ed861d  OPENNLP-615 Cleaned up javadocs and header info in 
ModelBasedScorer
     new 17332b2  OPENNLP-579 Added a SetupUtils class so users can get the 
Lucene indexes and Country Doccat models built very easily. Also many other 
small efficiencies.
     new 4b726f5  OPENNLP-579 Fixed a bug in the GazateerIndexer. Refined the 
SetupUtils.
     new a682acf  OPENNLP-579 Added simple caching to improve performance.
     new ea98ffc  OPENNLP-607 Cleaned up comments and fixed a bug that was 
giving the output model the wrong type in some cases
     new 822b6bc  OPENNLP-621 Fixed errors and changed all appropriate imports 
to opennlp.tools.ml. Builds but no testing done yet.
     new f838a60  OPENNLP-614 Fixed a bug in the GeoEntityLinker. No gaz lookup 
was being performed if no country context was found.
     new 9c65f04  OPENNLP-607 Fixed many issues. Added default file-based impls 
for all interfaces, and created a util class wrapper to allow for easy use of 
the default implementations.
     new eea1e78  OPENNLP-607 Fixed many issues. Added default file-based impls 
for all interfaces, and created a util class wrapper to allow for easy use of 
the default implementations.
     new b866ac1  OPENNLP-626 Integrated Arabic, Russian, Thai, and Farsi 
analyzer usage to GazateerIndexer. Still need to add support for query time 
analyzer usage via a language code overload or language detector...
     new 3bc5f31  OPENNLP-628
     new 747ff23  OPENNLP-626 renamed packages for consistency in addons, also 
made  small efficiencies
     new 574a936  OPENNLP-626 renamed packages for consistency in addons, also 
made  small efficiencies
     new 26e7d94  OPENNLP-607 renamed packages for consistency in addons, also 
made the framework generic with file based implementations
     new e66257d  OPENNLP-607 renamed packages for consistency in addons, also 
made the framework generic with file based implementations
     new 8ddfe01  renamed directory
     new b2d22cb  renamed directory
     new a13ab82  OPENNLP-574 Moved from addons to sandbox to mature there.
     new a7a34fd  OPENNLP-636 Updated Trainer constructor usage. Init 
parameters are now passed in via the init method and not via the constructor.
     new 8747699  OPENNLP-661 The OpenNLP machine learning code was integrated 
into the opennlp-tools project. The opennlp-ml project should be removed from 
the sandbox.
     new af92a7d  OPENNLP-657 Initial pull of the nlp-utils provided by Tommaso 
Teofili. Thanks for contributing.
     new 5daae29  OPENNLP-666 Support for strict CFGs non terminal rules 
expansion. Thanks to Tommaso Teofili for providing a patch.
     new 627d985  OPENNLP-666 Added RuleTest. Thanks to Tommaso Teofili for 
providing a patch.
     new 531ccb8  OPENNLP-713 - fixed some javadocs, using generics in ngrams 
utils, added more tests to cfg and language modeling packages
     new 20ad591  OPENNLP-723 - pcfg support in sandbox (nlp-utils)
     new 113a657  OPENNLP-723 - fixed cky method, minor fixes to formatting
     new 8f6c185  OPENNLP-752 Added the summarizer contribution. Thanks to Ram 
Soma for contributing it.
     new 11f9145  Added initial version of the wsd component. Thanks to Anthony 
Beylerian and Mondher Bouazizi for the contribution.
     new c5d321f  OPENNLP-758 Added a pom to make it build with maven
     new c717f6b  OPENNLP-758 Formatted the code according to OpenNLP code 
conventions
     new b276f56  OPENNLP-757 Added some headers and fixed some issues. Thanks 
to Mondher Bouazizi for providing a patch.
     new 2b2d892  OPENNLP-758 Applied clean up patch. Thanks to  Anthony 
Beylerian for providing a patch.
     new 7b94e5f  No jira, set eol-style property to native.
     new 3f541e2  OPENNLP-757 Applying bulk patch. Thanks to Mondher Bouazizi 
for providing a patch!
     new 500915b  OPENNLP-790 First iteration of the evaluator, testing on 
basic lesk, will need to validate and check the different performances. Thanks 
to Anthony Beylerian for providing a patch.
     new 359c3a5  OPENNLP-790 Removed unused variables. Changed the output 
format to : [Source SenseKey Score] each WSDisambiguator is assumed to have at 
least [Source SenseKey] as output for each disambiguation. In the case of Lesk 
and other unsupervised approaches with scores, the score can be provided as 
extra output. For now only the highest scoring disambiguated sense is 
considered in evaluation.
     new ffafc92  OPENNLP-790 - Fix for the IMS approach to Support SemCor 3.0 
data - The output format is now [Source SenseKey] so it corresponds to that of 
Lesk. - Removed some unused variables. - Added Some parameters to let the user 
select the source of data he wants to use. - Implemented the IMS Evaluator. - 
Added and clarified some parts of the documentation.
     new 77f56ce  OPENNLP-802 The WSDisambiguator needs a baseline to compare 
the implemented approaches with. Lesk presents a good baseline, however 
Senseval and Semeval workshops demonstrated that MFS presents a better and more 
challenging baseline.
     new ce14617  OPENNLP-758 Updated Lesk with new data readers and added MFS 
in case no overlaps are found (similar to the simplified version). Thanks to 
Anthony Beylerian for providing a patch.
     new 4b4bd99  OPENNLP-791  Reads the mentioned clustering files, could also 
switch to objectstream. Thanks to Anthony Beylerian for providing a patch.
     new 9d4b861  OPENNLP-802
     new da26bd6  OPENNLP-758 fixes for parameters
     new d8abd31  OPENNLP-804 Updated opennlp-tools from 1.6.0-SNAPSHOT to 
1.6.0.
     new ff5d685  OPENNLP-801
     new 689952b  OPENNLP-794
     new 329b0df  OPENNLP-801 Also includes some more cleanups. Thanks to 
Anthony Beylerian for providing a patch!
     new 729117f  OPENNLP-801 1- IMS now no longer does the pre-processing 
steps (The user will have to introduce them). Thanks to Mondher Bouazizi  for 
providing a patch!
     new 6637500  OPENNLP-807 We have worked on the integration of the existing 
approaches.
     new 61943f6  OPENNLP-796 The two readers now return 
ObjectStream<WSDSample>. Thanks to Mondher Bouazizi for providing a patch.
     new afda4bf  Removed classes marked for removal
     new 600c541  Removed classes marked for removal
     new d8aa2a1  Added missing commons lang dependency
     new 806c27d  Fixed code formatting
     new 5580818  Commented junit Assert call to make it compile with maven
     new d81166b  OPENNLP-791 WordNet based clusters patch, uses ME for now 
will have to modify for other classifiers. Thanks to Anthony Beylerian for 
providing a patch!
     new 172e891  OPENNLP-792 Added class javadoc. Thanks to Anthony Beylerian 
for providing a patch.
     new 98eb743  OPENNLP-713 - slightly enhanced some tests
     new 74d93c5  OPENNLP-713 - slightly enhanced some tests, made Hypothesis 
immutable
     new 93ddafb  OPENNLP-713 - pcfg#toString should result in same parser CLI 
output
     new fbbf803  OPENNLP-817 - added a CFG runner (with samples), added pcfg 
parse rules / cfg capabilities
     new 8faad08  OPENNLP-817 - switch to j7, added missing AL header, added 
runner test, tweaked parse rules method to adjust probs
     new c0197a5  The geocoder was moved to the addons area quite some time back
     new 1bb45db  OPENNLP-821 Moved mallet addon from my github repository to 
here
     new ce88c13  OPENNLP-821 Now builds and runs with 1.6.0
     new 092bff6  added unit tests, corrected some mistakes, need more unit 
tests
     new 77c5552  removed useless classes/folder
     new d167ee7  fixed method name
     new 9c1a75c  moved MFS and Lesk into main package moved IMS and OSCC into 
main package as contextGenerators
     new 6ce823b  
     new 1b75157  updated tests
     new 0498fe3  OPENNLP-850 Add ner brat annotation service
     new 206ef95  OPENNLP-850 Update dependencies to work with the uber jar
     new 4e121df  OPENNLP-850 Fix type in tokenizer init error message
     new 7009f23  OPENNLP-843 - moved contextgen implementations to top dir, 
need to make a common model and params for supervised approaches
     new 0f08de2  OPENNLP-843 - grouped the two supervised techniques into a 
common one with different context generators, the default context generator is 
from the IMS approach, updated the unit tests,  need to remove the useless 
classes.
     new f40736d  OPENNLP-843 - removed the unnecessary files
     new bf255a3  OPENNLP-827 fix for evaluator to check for non empty 
instances from senseval data
     new 552afea  OPENNLP-864 Rename name finder annotator classes
     new 67e5ed4  OPENNLP-866 Add optional argument for server port
     new dce84c0  OPENNLP-860 Add .gitignore file
     new 4350f64  Move brat annotator to opennlp.git
     new ad4195b  Whitespace test commit
     new 9aa270c  merge from bgalitsky's own git repo
     new 1f97041  merge from bgalitsky's own git repo
     new 2707f66  removed stanford nlp refs
     new 96c088b  Add first draft of dl name finder
     new a63ec16  OPENNLP-1009 - added initial RNN and StackedRNN impls from 
Yay lab, minor fixes
     new 6bfb15f  OPENNLP-1009 - minor improvements / fixes
     new fe2b1d9  fixed adagrad update for (s)rnn, added rmsprop to srnn
     new 6f0659f  removed useless state update, minor fixes
     new a80f29b  text sequence classification using Glove and RNN/LSTMs
     new a1c8692  OPENNLP-1106: Make it compile with 1.6.0, update java to 8 
and checkstyle fixes
     new 7f2076c  Refactored and implemented DocCat API
     new e5c4676  Removed test CLI parameters for Main method
     new cba153e  OPENNLP-1111: Adding initial EC2 scripts for testing.
     new 7c6bb48  Merge pull request #3 from thammegowda/glove-rnn-classifier
     new a0fa9d0  OPENNLP-1111: Improving the CloudFormation template for 
OpenNLP testing on AWS.
     new 9c78236  OPENNLP-1111: Making tests on EC2 automated.
     new f764dbd  OPENNLP-1009 - minor updates to (s)rnn parameters, rnn now 
using rmsprop
     new 8e234db  OPENNLP-1009 - wrong test file
     new be049da  Update DL4J/ND4J to 0.9.1
     new c7fcaa3  OPENNLP-1009 - less epochs for (s)RNNs tests
     new 84bf608  OPENNLP-1009 - added NeuralDocCatTest, currently fails at 
loading model
     new 4bde702  OPENNLP-1009 - switch to opennlp-tools 1.8.3 release
     new a08a73e  added tensorflow NER prediction PoC
     new 87a75a7  Merge pull request #10 from thygesen/tfnerpoc
     new 23c0ebb  added files for test
     new c0a14f6  Merge pull request #11 from thygesen/tfnerpoc
     new 9a14494  Add TF training code for name finder
     new efc1051  Replace hard coded paths with args
     new 23c95bd  Add AL 2.0 header to Java source files
     new 60eb80e  Remove incorrectly placed space in tag name
     new 788e73a  Map chars to indices 0..n instead of using ord(c)
     new 6294dfa  Write mapping dicts to disk
     new f8db193  Name placeholders and variables for use from Java API
     new 12b2c65  Fix loading of dicts by removing GZIP decompressor
     new bd30e1a  Adjust operation names to namefinder.py
     new 0005382  Write model to disk after training
     new 2abc214  Write correct dict into char_dict.txt
     new e483e9f  Adjust encoding to match BioCodec (Java)
     new 19d046d  Implement the TokenNameFinder interface
     new 77a39ad  Disable dropout for inference
     new a1899bb  Add missing return parameter to fix compile error
     new 5df185c  Rename module to match folder name
     new c126158  Rename packages to org.apache.opennlp.namefinder
     new 8dc9495  added vector size to NameFinder + only save model if improved 
+ stop training if not improved for 5 iteration
     new c136c85  Adjust settings to match namefinder.py trainer
     new 5e9fc9b  Add constructor to load all resources from Input Streams
     new faee815  Call close on Tensor objects to release memory
     new 8f24fc6  Add first version of namecat poc
     new 5e401cc  Move namefinder.py to namefinder folder
     new 04da946  Add Java API for namecat and more Randomize training data, 
add dropout, add test eval
     new f5f9377  Add split.py to split training data into pieces
     new 7f33b3f  Compute ntags based on label dict size
     new cb36083  Extract vector size from embeddings file
     new 8b09a57  OPENNLP-1009 - upgrade to dl4j 1.0.0-beta2
     new fd42cb4  Merge pull request #20 from tteofili/OPENNLP-1009a
     new 6f38bee  Add first draft of normalizer trainer
     new 30067a7  Add first draft of normalizer Java API
     new 9c02da7  Make batch size for normalizer inference dynamic
     new d187edf  Remove end marker from output seq
     new a850904  Remove hard coded seq length
     new 00a8fdf  Write model and dictionaries into zip package
     new f746c57  Add train dropout to normalizer
     new 5e8d0da  Name Finder Trainer now writes zip package with vocab files 
inside
     new fa9de88  Namecat Trainer now writes zip package with vocab files inside
     new 6804801  Fix error when empty String array is passed in
     new 2c0121d  Add a script to generate date normalization data
     new 92a30e8  Add year only dates to date generator
     new af594df  Add char dropout to handle unknown chars to normalizer
     new 199f756  Replace hard coded train, dev and test file names with args
     new 8c74466  Compute max target length instead of fixed value in normalizer

The 506 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

