There is also an n-gram feature generator that can be used with the
Name Finder. Give it a try to establish a baseline on your data; you
can then tune it and test different feature generation strategies.
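
To make the single-word case concrete, here is a minimal plain-Java
sketch of word n-gram extraction. It does not use the OpenNLP API; the
class name NGramSketch, the method ngrams, and the minGram/maxGram
parameters are hypothetical names chosen for illustration. It only
shows that when the minimum n-gram length is 2 or more, a one-token
entry such as "Oscar" produces no n-gram features at all. Whether that
is what happens in your setup depends on how your feature generator is
configured, which I am assuming here rather than stating.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NGramSketch {

    // Returns every word n-gram with minGram <= n <= maxGram, joined by spaces.
    // Hypothetical helper for illustration; not part of OpenNLP.
    static List<String> ngrams(String[] tokens, int minGram, int maxGram) {
        List<String> features = new ArrayList<>();
        for (int n = minGram; n <= maxGram; n++) {
            // A window of length n fits only while start + n <= tokens.length.
            for (int start = 0; start + n <= tokens.length; start++) {
                features.add(String.join(" ",
                        Arrays.copyOfRange(tokens, start, start + n)));
            }
        }
        return features;
    }

    public static void main(String[] args) {
        String[] award = {"Oscar"};
        String[] movie = {"Shape", "of", "Water"};

        // Minimum length 2: the single token yields no features.
        System.out.println(ngrams(award, 2, 3)); // []
        // Minimum length 1: the unigram "Oscar" becomes a feature.
        System.out.println(ngrams(award, 1, 3)); // [Oscar]
        // Multi-word entries produce features either way.
        System.out.println(ngrams(movie, 1, 2)); // [Shape, of, Water, Shape of, of Water]
    }
}
```

If the minimum length turns out to be the cause, including unigrams
(a minimum length of 1) is the obvious thing to test first.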

Jörn

On Mon, Mar 5, 2018 at 11:57 AM, Manoj B. Narayanan
<manojb.narayanan2...@gmail.com> wrote:
> Hi Manjunath,
>
> The best way is to go with NER.
>
> I don't get what you mean by N-gram feature analysis. Would be helpful if
> you could elaborate.
>
> From your example I see all are exact matches. So I suggest you go with a
> Dictionary Name Finder.
>
> Thanks,
> Manoj.
>
> On Mon, Mar 5, 2018 at 4:16 PM, manjunath nakshathri <nakshat...@gmail.com>
> wrote:
>
>> Hello There,
>>
>> We are using OpenNLP for document categorization with n-gram features to
>> categorize our incoming text. For example:
>>
>> "The shape of water and Frances McDormand rule oscar 2018"
>>
>> Given this sentence we would like to arrive at :
>>
>> Shape of Water : Movie
>> Frances McDormand : Actress
>>
>> This we are able to achieve with the following document categorization
>> training data and with the ngram features;
>>
>> Movie Shape of Water
>> Actress Frances McDormand
>>
>> *What is not working:*
>> If we try to categorize a single word say Oscar as an award category, we
>> are not able to. Any idea how we can get this working?
>>
>> *Target training data*
>> Movie Shape of Water
>> Actress Frances McDormand
>> Award Oscar
>>
>> *Desired Output :*
>> Shape of Water : Movie
>> Frances McDormand : Actress
>> Oscar: Award
>>
>> Implementation details :
>> OpenNLP version : 1.8.4
>> Training algorithm used : Naive Bayes
>> Iterations set : 100
>>
>> *General Questions*
>> Q : Why can't we use NER?
>> A : We need n-gram feature analysis, which is not possible in NER.
>>
>> Q : Are we going to build our own training data ?
>> A : Yes
>>
>> Really appreciate any help towards solving this issue.
>>
>> --
>> Thanks and Regards
>> Manjunath
>>
>
>
>
> --
> Regards,
> Manoj.
