Hi,

I need to do text clustering in a natural language processing
context, where word order is important. Initially, I will use an
n-gram model (with n = 3).
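
To make concrete what I mean by word n-grams with n = 3, here is a
minimal sketch in plain Java. It does not use any Mahout classes; the
class name TrigramSketch, the method names, and the simple
letters-only tokenizer are all assumptions made up for illustration,
not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TrigramSketch {

    // Lowercase the text and split it on runs of non-letter characters.
    // This is a deliberately naive tokenizer (ASCII letters only).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Slide a window of n consecutive tokens over the list and join each
    // window with a single space, so word order inside a window is kept.
    static List<String> nGrams(List<String> tokens, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            grams.add(String.join(" ", tokens.subList(i, i + n)));
        }
        return grams;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("The quick brown fox jumps");
        // Print each trigram on its own line.
        for (String gram : nGrams(tokens, 3)) {
            System.out.println(gram);
        }
    }
}
```

Each trigram would then be treated as one feature (one vector
dimension) instead of a single word, which is how I hope to keep some
local word order in the clustering input.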

In Mahout, the Vector and SequenceFileFormat representations do not
take contextual information into account (as I understand it). I
know I might need to modify both of them, but is there a bag-of-words
implementation and a stop list that I can use?

Thanks,
Neel Sheyal
