Re: BRAT

2018-02-01 Thread Daniel Russ
I don’t know of any wiki with info specifically using BRAT data. However, first read: https://opennlp.apache.org/docs/1.8.4/manual/opennlp.html#tools.namefind.training.tool Then issue the command: ope

Re: [VOTE] Apache OpenNLP 1.8.2 Release Candidate 2

2017-09-12 Thread Daniel Russ
+1 Daniel > On Sep 12, 2017, at 2:37 AM, Suneel Marthi wrote: > > +1 binding > > On Tue, Sep 12, 2017 at 8:10 AM, Tommaso Teofili > wrote: > >> +1 >> >> Tommaso >> >> Il giorno lun 11 set 2017 alle ore 09:12 Joern Kottmann < >> kottm...@gmail.com> >> ha scritto: >> >>> Hi Folks, >>> >>>

Re: Cache

2017-09-05 Thread Daniel Russ
Again, you should send this to the users list, not the dev mailing list. Have you tried adding an instance variable (e.g. numWords) that you update when you call “createFeature”? You need to be concerned with thread safety if you do this on more than 1 thread, but you can synchronize only the part of the code
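
A minimal sketch of the suggestion in this message, assuming a stand-alone class: CountingFeatureGenerator and its createFeature signature are hypothetical and do not implement any OpenNLP interface. Only the update of the shared counter is synchronized; the feature computation itself works on local state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a feature generator (not an OpenNLP class).
class CountingFeatureGenerator {
    private final Object lock = new Object();
    private long numWords = 0; // guarded by lock

    List<String> createFeature(String[] tokens, int index) {
        // Feature computation uses only local state, so it needs no lock.
        List<String> features = new ArrayList<>();
        features.add("w=" + tokens[index].toLowerCase());

        // Synchronize only the part that touches shared state.
        synchronized (lock) {
            numWords++;
        }
        return features;
    }

    long getNumWords() {
        // Read under the same lock so other threads see the latest value.
        synchronized (lock) {
            return numWords;
        }
    }
}
```

For a plain counter like this, java.util.concurrent.atomic.AtomicLong would be an alternative that avoids the explicit lock.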

Re: DictionaryNameFinder

2017-09-05 Thread Daniel Russ
Hi Manoj, Please send your question to the users list, not the dev list. I believe the dictionaryNameFinder is passed a dictionary of names and if a name appears in the dictionary, it is marked as found. Otherwise, no name is found. It is not a statistical model. The two methods you des
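
The lookup behaviour described here can be illustrated with a small sketch, under the assumption that matching is exact and case-sensitive. SimpleDictionaryLookup is a hypothetical class written for illustration, not the OpenNLP DictionaryNameFinder API; it marks a token sequence as a name only if that sequence is in the dictionary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of dictionary-based name finding (no statistical model involved).
class SimpleDictionaryLookup {
    private final Set<List<String>> names = new HashSet<>();
    private int maxLen = 0; // longest dictionary entry, in tokens

    void addName(String... tokens) {
        names.add(Arrays.asList(tokens));
        maxLen = Math.max(maxLen, tokens.length);
    }

    // Returns [start, end) token offsets, preferring the longest match at each position.
    List<int[]> find(String[] tokens) {
        List<String> all = Arrays.asList(tokens);
        List<int[]> spans = new ArrayList<>();
        int i = 0;
        while (i < tokens.length) {
            int matchedEnd = -1;
            for (int len = Math.min(maxLen, tokens.length - i); len >= 1; len--) {
                if (names.contains(all.subList(i, i + len))) {
                    matchedEnd = i + len;
                    break;
                }
            }
            if (matchedEnd > 0) {
                spans.add(new int[] {i, matchedEnd});
                i = matchedEnd; // continue after the matched name
            } else {
                i++; // token is not in the dictionary, so no name is found here
            }
        }
        return spans;
    }
}
```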

Re: [VOTE] Apache OpenNLP 1.8.2 Release Candidate

2017-09-05 Thread Daniel Russ
+1 binding (Thank Jörn and Suneel for the help) Daniel > On Sep 4, 2017, at 11:08 PM, Suneel Marthi wrote: > > +1 binding > > On Mon, Sep 4, 2017 at 5:41 PM, Joern Kottmann wrote: > >> Hi Folks, >> >> >> I have posted a first release candidate for the Apache OpenNLP 1.8.2 >> release and it

Re: Early stopping NameFinderME

2017-08-25 Thread Daniel Russ
Jörn, Currently, GISTrainer has a private static final variable LLThreshold, which controls whether the change in the log likelihood between two iterations is too small. We could make this a parameter. I am concerned about using the accuracy to train the model. If we use accuracy, the weight spac
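
A rough sketch of what a configurable log-likelihood threshold could look like. This is not the GISTrainer code: the class LogLikelihoodStopping, the method train, and the parameter llThreshold are assumptions made for illustration, and the training step is abstracted as a function that returns the log likelihood after each iteration.

```java
import java.util.function.IntToDoubleFunction;

// Hypothetical early-stopping loop driven by the change in log likelihood.
class LogLikelihoodStopping {
    // Runs at most maxIterations steps and returns the number of iterations executed.
    static int train(IntToDoubleFunction logLikelihoodAt, int maxIterations, double llThreshold) {
        double previous = Double.NEGATIVE_INFINITY;
        for (int i = 1; i <= maxIterations; i++) {
            double current = logLikelihoodAt.applyAsDouble(i);
            // Stop once the improvement is smaller than the threshold;
            // this also stops when the log likelihood decreases.
            if (i > 1 && current - previous < llThreshold) {
                return i;
            }
            previous = current;
        }
        return maxIterations;
    }
}
```

Passing the threshold as an argument is the point of the sketch: the caller, rather than a hard-coded constant, decides how small a change counts as converged.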

Re: Spelling correction

2017-07-01 Thread Daniel Russ
Damiano, There is a lot of research on spelling correction. Here is a paper from a group out of the National Library of Medicine https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2137159/ . They also have a product called GSpell https://

Re: [GitHub] opennlp pull request #231: Adding sentiment analysis code to OpenNLP: OPENNL...

2017-06-15 Thread Daniel Russ
Hi, I tried to take a look at the pull request, but it is 1677 commits behind apache:master. Can you please rebase your code? Thank you. Daniel > On Jun 15, 2017, at 1:19 PM, amensiko wrote: > > GitHub user amensiko opened a pull request: > >https://github.com/apache/opennlp/pull/231

Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-12 Thread Daniel Russ
> > 2017-05-12 9:48 GMT-03:00 Joern Kottmann : > >> The vote is still open and we won't close it before the entire active PMC >> voted or the time passed. >> >> Jörn >> >> On Fri, May 12, 2017 at 2:29 PM, Daniel Russ wrote: >> >>

Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate 2

2017-05-12 Thread Daniel Russ
Even though we have enough binding votes to release, can I have a few hours to complete testing of my code with 1.8.0RC2 before release? Daniel On May 11, 2017 12:38 PM, "Joern Kottmann" wrote: > The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP > 1.8.0 Release Candidate 2.

Re: [VOTE] Apache OpenNLP 1.8.0 Release Candidate

2017-05-11 Thread Daniel Russ
-1 because of DictionaryLemmatizer bug OPENNLP-1056 Daniel > On May 9, 2017, at 2:41 PM, Joern Kottmann wrote: > > The Apache OpenNLP PMC would like to call for a Vote on Apache OpenNLP > 1.8.0 Release Candidate 1. > > The RC 1 distributables can be downloaded from here: > https://repository.a

Re: Problem in passing feature generator for NameFinderCrossValidation

2017-04-21 Thread Daniel Russ
Hi Saurabh, I am a little confused why you need a byte[]. Can't you do this: 1. split your data into 5-folds. (it doesn’t have to be 5, but it is a more concrete example) 2. train on 4 folds. test on 1. (run 5 times changing the test set) 3. look at the average agreement. I am a little di
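
The three steps listed in this message can be sketched in plain Java. This is an illustration under stated assumptions, not the OpenNLP cross-validation tooling: KFoldSketch, split, and crossValidate are hypothetical names, items are assigned to folds round-robin, and the caller supplies a trainAndScore function that trains on one list and returns a score for the other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical k-fold cross-validation helper.
class KFoldSketch {
    // Step 1: split the data into k folds (item i goes to fold i % k).
    static <T> List<List<T>> split(List<T> data, int k) {
        List<List<T>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < data.size(); i++) {
            folds.get(i % k).add(data.get(i));
        }
        return folds;
    }

    // Steps 2 and 3: train on k-1 folds, test on the remaining one, and average the k scores.
    static <T> double crossValidate(List<T> data, int k,
                                    BiFunction<List<T>, List<T>, Double> trainAndScore) {
        List<List<T>> folds = split(data, k);
        double total = 0.0;
        for (int test = 0; test < k; test++) {
            List<T> train = new ArrayList<>();
            for (int f = 0; f < k; f++) {
                if (f != test) {
                    train.addAll(folds.get(f));
                }
            }
            total += trainAndScore.apply(train, folds.get(test));
        }
        return total / k;
    }
}
```

If the samples are ordered (for example by document), shuffling or grouping them before the split may be needed so that each fold is representative.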