My problem: I need to index texts against a list of relevant words. Doing this by hand for a number of small samples made me think that ranking the relevance of a text by the frequency of listed words that occur as subjects or verbs gives somewhat better results than simply counting the frequency of every word that matches the list.
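For concreteness, the simple baseline mentioned above (counting every word that matches the list) can be sketched roughly as follows. This is only an illustration under stated assumptions: the names count_listed_words and relevance are hypothetical, the tokenizer is a crude regular expression, matching is case-insensitive on whole words, and nothing here attempts the subject/verb recognition I am asking about.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each listed word occurs in a text.
# Matching is case-insensitive and on whole tokens only.
sub count_listed_words {
    my ($text, @wanted) = @_;
    my %count;
    $count{ lc $_ } = 0 for @wanted;

    # Crude tokenizer (an assumption): runs of letters and apostrophes.
    for my $token ($text =~ /[A-Za-z']+/g) {
        my $word = lc $token;
        $count{$word}++ if exists $count{$word};
    }
    return \%count;
}

# Hypothetical score: listed-word occurrences divided by total tokens.
sub relevance {
    my ($text, @wanted) = @_;
    my @tokens = $text =~ /[A-Za-z']+/g;
    return 0 unless @tokens;

    my $counts = count_listed_words($text, @wanted);
    my $hits   = 0;
    $hits += $_ for values %$counts;
    return $hits / @tokens;
}
```

Calling relevance($text, @list) on each text and sorting by the result would give the plain frequency ranking; the ranking I am after would count a hit only when the word is a subject or a verb.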
My question: is there a module that would help with recognizing the syntactic parts (subject, verb and so on) of a text written in English?
Searching with Google mostly turned up course outlines and assignments. On CPAN I found only the Lingua::* modules; if one of them fits what I need, I probably missed it among the more than 600 hits ...
Thank you,
Emil Perhinschi