Thanks for the suggestion. Can you point me to a specific NLTK tool or
approach, with an example, if you have already used it?



On Tue, Jun 14, 2011 at 12:57 PM, Venkatraman S <venka...@gmail.com> wrote:

> On Tue, Jun 14, 2011 at 12:07 PM, Gopalakrishnan Subramani <
> gopalakrishnan.subram...@gmail.com> wrote:
>
> > Jayalalithaa meets PM, DMK watches closely
> > Jaya to meet PM today in New Delhi
> > Jaya-PM meet, 'jittery' DMK watches on Times
> >
> > How do I do this in Python? I think the NLTK toolkit is too large for me
> > to learn and use. Is there any other fun and simpler way to do it?
> >
>
> 1) NLTK is pretty simple. You can do duplicate detection with it quite
> easily; look for sample code.
>
> 2) Generate keywords from the content and check the correlation between
> documents.
>
> 3) For headlines alone: do substring matching? (But this would ignore the
> semantics of the text, e.g. 'Jayalalitha was last seen in Kodagu estate'
> and 'Real estate would get a boost under Jayalalitha' would be put in the
> same category.)
>
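
For what it's worth, here is a minimal sketch of suggestion 2 applied to the
headlines above, using only the standard library rather than NLTK. The
stopword list, the function names and the choice of Jaccard overlap as the
"correlation" measure are all assumptions made for illustration; nothing in
this thread prescribes them.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "to", "in", "on", "of", "and",
             "was", "would", "get", "under", "today"}

def keywords(headline):
    """Lowercase the headline, split it into words, and drop stopwords."""
    words = re.findall(r"[a-z]+", headline.lower())
    return {w for w in words if w not in STOPWORDS}

def similarity(a, b):
    """Jaccard overlap: shared keywords divided by all distinct keywords."""
    ka, kb = keywords(a), keywords(b)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)

headlines = [
    "Jayalalithaa meets PM, DMK watches closely",
    "Jaya to meet PM today in New Delhi",
    "Jaya-PM meet, 'jittery' DMK watches on Times",
]

# Score every pair of headlines; a higher score means more shared keywords.
for i in range(len(headlines)):
    for j in range(i + 1, len(headlines)):
        score = similarity(headlines[i], headlines[j])
        print(f"{score:.2f}  {headlines[i]!r} vs {headlines[j]!r}")

# The two headlines from point 3 share 'jayalalitha' and 'estate' although
# they are about different things, so plain word overlap cannot separate them.
print(similarity("Jayalalitha was last seen in KOdagu estate",
                 "Real estate would get a boost under Jayalalitha"))
```

Note that plain word overlap treats 'Jaya' and 'Jayalalithaa' as different
keywords, and it has the weakness described in point 3, so any threshold for
calling two headlines duplicates would have to be tuned on real data.
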
> -V
> http://blizzardzblogs.blogspot.com/
> _______________________________________________
> BangPypers mailing list
> BangPypers@python.org
> http://mail.python.org/mailman/listinfo/bangpypers
>