Hi!
On Mon, Sep 28, 2015 at 02:31:03PM +0100, Baker James D wrote:
> I would like to draw your attention to a text analytics framework that has
> just been released by Dstl (part of the UK Ministry of Defence). It uses UIMA
> as part of its underlying architecture but provides additional functionality
> on top of that, and simplifies much of the user configuration and experience,
> as well as the development process. A number of collection readers,
> annotators and consumers are included as part of the framework.
>
> The tool is called Baleen, and is released under Apache Software License 2.
>
> There is more information about the tool on the press release
> (https://www.gov.uk/government/news/dstl-adds-to-open-source-software), and
> on the GitHub page (https://github.com/dstl/baleen).
Thanks for the heads up. However, I haven't found any clear summary
of what the framework is capable of right now - I think you might want
to expand the generic description a bit with some examples and
use cases. I have been looking around a bit, and it seems that e.g.
https://github.com/dstl/baleen/blob/master/baleen/baleen-annotators/src/main/java/uk/gov/dstl/baleen/annotators/cleaners/MergeAdjacentQuantities.java
is something that could be pretty useful, but you might want to make
the capabilities easier to discover to get more users / contributors.
Best,
Petr Baudis