Thanks, Joern. Good question about the license. I wrote a web crawler that
polls a number of RSS news feeds (mainly Google News and BBC) as well as
Wikipedia, and then recursively scrapes the linked pages to a depth of N.
So it's hard to say what the license would be. I will look into it further,
and may end up using only the Wikipedia data.
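
For anyone curious, a depth-limited crawl like the one described above could look roughly like this in Python. This is only a minimal sketch, not the actual crawler: the names `crawl`, `fetch`, and `LINK_RE` are hypothetical, the regex-based link extraction is a simplification, and it walks the links breadth-first with a queue rather than by literal recursion, so that each page is reached at its shallowest depth.

```python
import re
from collections import deque
from urllib.parse import urljoin

# Simplified link extraction (assumption: double-quoted href attributes only).
LINK_RE = re.compile(r'href="([^"#]+)"')


def crawl(seeds, fetch, max_depth):
    """Collect pages reachable from the seed URLs within max_depth links.

    fetch is a caller-supplied function (hypothetical here) that takes a URL
    and returns the page text. Returns a dict mapping URL -> page text.
    """
    pages = {}
    queue = deque((url, 0) for url in seeds)
    while queue:
        url, depth = queue.popleft()
        # Skip pages already collected and pages beyond the depth limit.
        if url in pages or depth > max_depth:
            continue
        text = fetch(url)
        pages[url] = text
        # Queue every outgoing link one level deeper, resolved against the page URL.
        for href in LINK_RE.findall(text):
            queue.append((urljoin(url, href), depth + 1))
    return pages
```

Because the pages collected this way come from many different sites, each one may carry its own license, which is why restricting the seeds to Wikipedia would make the licensing question much simpler.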
Thanks


On Fri, Oct 11, 2013 at 3:17 AM, Jörn Kottmann <[email protected]> wrote:

> On 10/10/2013 06:54 PM, Mark G wrote:
>
>> Thanks. I am also working on a rapid model builder framework that I would
>> like you to look at. I posted a description earlier but have had no
>> feedback yet. I was thinking I could check it into the sandbox so everyone
>> can run it, along with a file-based implementation that includes a file of
>> ~200K sentences.
>> This tool would allow users to specify a file of sentences from their
>> data, a file (dictionary) of known named entities, and a blacklist file
>> (for false positive reduction) in order to build a model for a specific
>> entity type.
>>
>
> +1 I posted feedback to this on the user list.
>
> Just go ahead and open a Jira issue for it, and then add it to the sandbox.
>
> What is the license of the sentence file?
>
> Jörn
>
