I'll take a look at the Leipzig project; I'm not familiar with it. The idea, though, is to let users wire up whatever data they have rather than tie the tool to any particular input format. Right now the tool just produces OpenNLP format. However, I can write a LeipzigSentenceProvider or LeipzigKnownEntityProvider impl, and it would work with the framework as is. Thanks
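
For illustration, here is a rough sketch of what a LeipzigSentenceProvider could look like. The SentenceProvider interface and its getSentences() method are hypothetical stand-ins, not the framework's actual API, and the parsing assumes the Leipzig sentence files carry one sentence per line in the form "<id><TAB><sentence>"; that would need checking against the real data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: the framework's real provider contract may differ.
interface SentenceProvider {
    List<String> getSentences();
}

// Sketch only. Assumes each input line looks like "<id>\t<sentence>".
class LeipzigSentenceProvider implements SentenceProvider {
    private final Reader source;

    LeipzigSentenceProvider(Reader source) {
        this.source = source;
    }

    @Override
    public List<String> getSentences() {
        List<String> sentences = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab < 0) {
                    continue; // no id column, so not a sentence line
                }
                String sentence = line.substring(tab + 1).trim();
                if (!sentence.isEmpty()) {
                    sentences.add(sentence);
                }
            }
        } catch (IOException e) {
            // Wrap so callers are not forced to handle a checked exception.
            throw new UncheckedIOException(e);
        }
        return sentences;
    }
}
```

Taking a Reader instead of a file path keeps the provider independent of where the data lives, which fits the goal of letting users wire up whatever source they have.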

On Fri, Oct 11, 2013 at 6:13 AM, Jörn Kottmann <[email protected]> wrote:

> On 10/11/2013 11:51 AM, Mark G wrote:
>
>> Thanks Joern. Good question about license.... I wrote a web crawler and it
>> polls a bunch of RSS news feeds (google news and BBC mainly) as well as
>> wikipedia and then recursively scrapes to N depth on them. So.... It's hard
>> to say what the license would be, I will look deeper, and maybe only use
>> the wiki data.
>>
>
> The Leipzig project is doing something similar for many languages, maybe
> it would be good solution to just make it work with their data format.
>
> What do you think?
>
> Jörn
>
