Hi Ben,

I completely agree with your points. The dictionary is just a booster for cases where there is not enough training data. I wanted to avoid repeating data in the dictionary, so when there is no exact match for a particular entry, we could fall back to the next best match available in the dictionary.
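
A minimal sketch of such a fallback, in Python for brevity. It is not OpenNLP code and does not use the Dictionary Name Finder API. The names `best_match`, `is_ordered_subsequence` and `min_coverage` are hypothetical, and "next best" is assumed here to mean the entry that contains the input tokens in the same order and covers the largest share of that entry.

```python
from typing import Optional, Sequence, Tuple

Entry = Tuple[str, ...]


def is_ordered_subsequence(candidate: Sequence[str], entry: Sequence[str]) -> bool:
    """Return True if every candidate token appears in entry, in the same order."""
    remaining = iter(entry)
    # `tok in remaining` advances the iterator, so order is enforced.
    return all(tok in remaining for tok in candidate)


def best_match(
    candidate: Sequence[str],
    dictionary: Sequence[Entry],
    min_coverage: float = 0.5,  # hypothetical threshold, tune for your data
) -> Optional[Tuple[Entry, float]]:
    """Return (entry, coverage) for an exact match, else the best partial match.

    Coverage is len(candidate) / len(entry). Partial matches below
    min_coverage are rejected to limit false positives from single tokens.
    """
    cand = tuple(tok.lower() for tok in candidate)
    if not cand:
        return None

    best: Optional[Entry] = None
    best_coverage = 0.0
    for entry in dictionary:
        normalized = tuple(tok.lower() for tok in entry)
        if normalized == cand:
            return entry, 1.0  # exact match always wins
        if is_ordered_subsequence(cand, normalized):
            coverage = len(cand) / len(normalized)
            if coverage >= min_coverage and coverage > best_coverage:
                best, best_coverage = entry, coverage

    return (best, best_coverage) if best is not None else None


# Hypothetical dictionary entries, one tuple of tokens per name.
names = [("John", "Fitzgerald", "Kennedy"), ("Jane", "Doe")]
two_of_three = best_match(["John", "Kennedy"], names)
```

With this rule, a two-token input can match a three-token entry without adding the shorter form to the dictionary, while a single token such as "John" falls below the threshold and is rejected. The threshold is a trade-off: lowering it finds more partial matches but also admits more false positives.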

Thanks.
Manoj.

On Tue, Oct 10, 2017 at 11:38 PM, Benedict Holland <benedict.m.holl...@gmail.com> wrote:

> Hi Manoj,
>
> Couldn't you just add the 2 token name out of the 3? If the order matters,
> always have the more specific first and go to less specific. What you are
> describing is a problem specifically associated with dictionary lookups:
> that unless there is an exact match, nothing will match. Dictionaries are
> prone to Type 1 errors if entries like yours are missing from the
> dictionary and Type 2 errors in the context of a name matching but it isn't
> a name. I ran into a problem today where text matched Dec, Jan, Mar, April.
> Jan was a name in the dictionary lookup.
>
> This is why you should probably switch to an ME model (or at the very
> least, an adaptive mode) as soon as you have the training data. You train
> the ME model to recognize contextually a name, rather than specifying that
> only these words are names. The more training data, the better and more
> accurate your results.
>
> Thanks,
> ~Ben
>
> On Tue, Oct 10, 2017 at 7:32 AM, Manoj B. Narayanan <manojb.narayanan2...@gmail.com> wrote:
>
> > Hi all,
> >
> > The present Dictionary Name Finder matches the tokens in the same order as
> > given in the dictionary (XML). Is it possible to define how the match
> > should occur.
> >
> > For Example,
> >
> > Say, I have 3 tokens as an entry. But the input contains only 2 tokens out
> > of the 3. In this case, the Dictionary Name Finder will not match. If we
> > can define our own matching algorithm, it would be useful.
> >
> > If it is already present as a feature please guide me on how to use it.
> > Else, please consider this as a suggestion.
> >
> > Thanks,
> > Manoj