Classification: UK OFFICIAL

Hi,

All the information that we've published on Baleen is either in the README on
GitHub, on the GitHub wiki pages, or built into Baleen itself. If you download
a release version from GitHub and run it with no configuration, you'll be able
to access all the built-in documentation, which includes Getting Started
information, development and usage guides, and full Javadoc.

If you have any specific questions I'll do my best to answer them, but 
otherwise I hope the above helps.

James

-----Original Message-----
From: buddha [mailto:[email protected]]
Sent: 15 December 2015 14:20
To: [email protected]
Subject: Re: [UK OFFICIAL] Baleen 2.1 Released

Hello Mr. Baker,

Do you have any more supporting information on Baleen? Perhaps a website? I
don’t see one referenced on GitHub.

Thanks,
b

~~~~~
May All Your Sequences Converge

> On Dec 15, 2015, at 3:40 AM, Baker James D <[email protected]> wrote:
> 
> Classification: UK OFFICIAL
> Morning all,
> 
> A new version of Baleen, the UIMA based entity extraction and text analytics 
> framework developed by Dstl (part of the UK Ministry of Defence) has been 
> released. This version includes the following improvements:
> 
> * New Annotator: MongoStemming uses a gazetteer and stemming to perform a
>   pseudo-fuzzy match and find gazetteer terms in different tenses and plurals
> * New Cleaner: MergeAdjacent will merge adjacent entities of the same type
> * New Content Extractor: CsvContentExtractor splits CSV fields into content
>   and metadata
> * New Collection Reader: LineReader will read a single file into multiple
>   documents by line
> * New REST API to get configuration parameters for components (e.g.
>   annotators)
> * Significant changes to the way gazetteer annotators work, including
>   changing from RadixTrees to MultiMaps and implementing the Aho-Corasick
>   algorithm, resulting in performance improvements for large gazetteers in
>   the order of 100s
> * Lots of bug fixes and minor improvements
> 
> The latest release is available on GitHub: 
> https://github.com/dstl/baleen
> 
> Any feedback, suggestions, comments, issues and code contributions are 
> welcome! We're keen for people to help us improve it so that it's a useful 
> tool for a wide range of people.
> 
> James
> 
> "This e-mail and any attachment(s) is intended for the recipient only.   Its 
> unauthorised use, 
> disclosure, storage or copying is not permitted.  Communications with 
> Dstl are monitored and/or recorded for system efficiency and other 
> lawful purposes, including business intelligence, business metrics and 
> training.  Any views or opinions expressed in this e-mail do not necessarily 
> reflect Dstl policy."
> 
> "If you are not the intended recipient, please remove it from your 
> system and notify the author of the email and [email protected]"

