Hi Manjunath,

The best approach here is named entity recognition (NER).

I don't follow what you mean by N-gram feature analysis. It would help if
you could elaborate.

From your example, all the entities are exact matches, so I suggest you go
with a Dictionary Name Finder.
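
To make the idea concrete, here is a minimal plain-Java sketch of exact-match
dictionary lookup over tokens. It is not OpenNLP code: the class
DictionaryLookupSketch and its add/find methods are hypothetical names made up
for illustration, and it assumes whitespace tokenization and case-insensitive
matching. In practice you would use the DictionaryNameFinder class that OpenNLP
ships in opennlp.tools.namefind instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not part of the OpenNLP API.
final class DictionaryLookupSketch {

    // Maps a lower-cased, single-spaced phrase ("shape of water") to its label ("Movie").
    private final Map<String, String> entries = new HashMap<>();
    // Token count of the longest phrase, so lookups never try longer windows.
    private int maxTokens = 0;

    void add(String label, String phrase) {
        String[] tokens = phrase.trim().toLowerCase(Locale.ROOT).split("\\s+");
        entries.put(String.join(" ", tokens), label);
        maxTokens = Math.max(maxTokens, tokens.length);
    }

    // Returns "matched text : label" for each exact match,
    // trying the longest window first at every token position.
    List<String> find(String sentence) {
        String[] tokens = sentence.trim().split("\\s+");
        List<String> found = new ArrayList<>();
        int i = 0;
        while (i < tokens.length) {
            int matched = 0;
            for (int len = Math.min(maxTokens, tokens.length - i); len >= 1; len--) {
                String candidate = String.join(" ", Arrays.copyOfRange(tokens, i, i + len));
                String label = entries.get(candidate.toLowerCase(Locale.ROOT));
                if (label != null) {
                    found.add(candidate + " : " + label);
                    matched = len;
                    break;
                }
            }
            // Skip past a match, or advance by one token if nothing matched here.
            i += Math.max(matched, 1);
        }
        return found;
    }

    public static void main(String[] args) {
        DictionaryLookupSketch finder = new DictionaryLookupSketch();
        finder.add("Movie", "Shape of Water");
        finder.add("Actress", "Frances McDormand");
        finder.add("Award", "Oscar");
        for (String hit : finder.find("The shape of water and Frances McDormand rule oscar 2018")) {
            System.out.println(hit);
        }
    }
}
```

Because every window length down to one token is tried, a single-word entry
such as Oscar is handled the same way as a multi-word one.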

Thanks,
Manoj.

On Mon, Mar 5, 2018 at 4:16 PM, manjunath nakshathri <nakshat...@gmail.com>
wrote:

> Hello There,
>
> We are using opennlp for document categorization with Ngram Features to
> categorize our incoming text. For example :
>
> "The shape of water and Frances McDormand rule oscar 2018"
>
> Given this sentence we would like to arrive at :
>
> Shape of Water : Movie
> Frances McDormand : Actress
>
> This we are able to achieve with the following document categorization
> training data and with the ngram features;
>
> Movie Shape of Water
> Actress Frances McDormand
>
> *What is not working:*
> If we try to categorize a single word say Oscar as an award category, we
> are not able to. Any idea how we can get this working?
>
> *Target training data*
> Movie Shape of Water
> Actress Frances McDormand
> Award Oscar
>
> *Desired Output :*
> Shape of Water : Movie
> Frances McDormand : Actress
> Oscar: Award
>
> Implementation details :
> Open NLP version : 1.8.4
> Training Algorithm used : Naive Bayes
> Iterations set : 100
>
> *General Questions*
> Q : Why can't we use NER ?
> A : We need ngram feature analysis which is not possible in NER.
>
> Q : Are we going to build our own training data ?
> A : Yes
>
> Really appreciate any help towards solving this issue.
>
> --
> Thanks and Regards
> Manjunath
>



