Re: Name Finder trainer default settings

Joern Kottmann Tue, 07 Feb 2017 07:09:07 -0800

We can actually train a CRF from Mallet with the existing infrastructure,
and the code should still work (there may be minor issues). I tried that
but couldn't get better results. We should maybe get this code
(mallet-addon) into good shape again and then see what the issues could be.
There might be some folks out there who want to do this for research or
other cases. I believe one of my issues was bad hyperparameters.


With perceptron you can get good results quickly, out of the box.

Jörn



On Tue, Feb 7, 2017 at 3:59 PM, Russ, Daniel (NIH/CIT) [E] <
dr...@mail.nih.gov> wrote:

> It would be interesting to compare the results of OpenNLP’s perceptron
> trained models, GIS trained models, and a vanilla CRF implementation (i.e.
> not specifically trained for a task).  We can then make a better decision
> on whether we should spend the effort to implement a CRF.  Every once in a
> while we see people ask “what can I do?”.  Maybe the answer should be…
> given an ObjectStream<Event> or DataIndexer, train a CRFModel that extends
> AbstractModel.  Your training class must extend AbstractEventTrainer and be
> serializable using AbstractModelWriter.
>
> Just my 2 cents.
> Daniel
>
> On 2/7/17, 9:51 AM, "Damiano Porta" <damianopo...@gmail.com> wrote:
>
>     I have good results with perceptron, but +1 for CRF
>
>     2017-02-07 15:42 GMT+01:00 Russ, Daniel (NIH/CIT) [E] <
> dr...@mail.nih.gov>:
>
>     > Hi Jörn,
>     >
>     >
>     >
>     > I think the best entity recognition systems use CRFs.  At some point
>     > we might want to consider adding them.  As you know, ME classifiers
>     > suffer from the label bias problem (see Lafferty et al.
>     > <http://repository.upenn.edu/cgi/viewcontent.cgi?article=1162&context=cis_papers>).
>     > CRFs deal with that issue.  I believe that perceptrons suffer from the
>     > same problem.  If you think the results are better, I have no problem.
>     > I think that our long-term goal should be to add a CRF, and make it
>     > the default for the NameFinder.
>     >
>     >
>     >
>     > Daniel
>     >
>     >
>     >
>     >
>     >
>     > On 2/6/17, 12:40 PM, "Joern Kottmann" <kottm...@gmail.com> wrote:
>     >
>     >
>     >
>     >     Hello all,
>     >
>     >
>     >
>     >     I would like to propose switching the default training algorithm
>     >     from maxent GIS to perceptron for the Name Finder. In all the data
>     >     sets I tried, perceptron performs better than maxent GIS, and I
>     >     believe that would be a much more sensible default.
>     >
>     >     A user can always override the default by providing the algorithm
>     >     parameter for training.
>     >
>     >
>     >
>     >     What do you think?
>     >
>     >
>     >
>     >     Jörn
>     >
>     >
>     >
>     >
>     >
>
>
>
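The override mentioned in the original proposal (providing the algorithm parameter for training) can be sketched as a small training parameters file. The key names (`Algorithm`, `Iterations`, `Cutoff`) and the values `PERCEPTRON` and `MAXENT` are given here as I recall them from OpenNLP 1.x and should be checked against the release in use. The numeric values are only illustrative, and the class and method names are hypothetical. The sketch uses only `java.util.Properties`, not the OpenNLP API itself.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of OpenNLP: reads a params file in
// Java properties syntax and reports which algorithm it selects.
public class TrainerParamsSketch {

    // Assumed contents of a training parameters file that selects the
    // perceptron trainer instead of relying on the built-in default.
    static final String PARAMS = String.join("\n",
            "Algorithm=PERCEPTRON",
            "Iterations=100",
            "Cutoff=5");

    // Parse properties-style text into a Properties object.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Return the selected algorithm. The "MAXENT" fallback mirrors the
    // default discussed in this thread and is an assumption of the sketch.
    static String algorithm(Properties props) {
        return props.getProperty("Algorithm", "MAXENT");
    }

    public static void main(String[] args) {
        System.out.println(algorithm(parse(PARAMS)));
    }
}
```

A file with these contents would typically be handed to the command line trainer through its `-params` option, so the choice of default only matters for users who do not supply one.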
