I think you would get further if you used the identities of the posters. That would let you build an authoritativeness score as well as augment your topic models.
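As a rough sketch of what I mean by an authoritativeness score: assume each message has been reduced to a (thread_id, author) pair in posting order, and that the first message in a thread is the question. Both are assumptions about your data, and the function name and the scoring rule are purely illustrative. A poster's score is then the fraction of threads in which they replied to somebody else's question.

```python
from collections import defaultdict


def authority_scores(messages):
    """Score each poster by the share of threads they answered.

    messages: iterable of (thread_id, author) tuples in posting order.
    Assumption: the first message seen for a thread is the question.
    """
    asker = {}                   # thread_id -> author of the first message
    answered = defaultdict(set)  # author -> threads they replied to

    for thread_id, author in messages:
        if thread_id not in asker:
            asker[thread_id] = author
        elif author != asker[thread_id]:
            # Replies to one's own thread are not counted as answers.
            answered[author].add(thread_id)

    total_threads = len(asker)
    return {author: len(threads) / total_threads
            for author, threads in answered.items()}


if __name__ == "__main__":
    sample = [
        ("t1", "alice"), ("t1", "bob"),
        ("t2", "carol"), ("t2", "bob"), ("t2", "carol"),
        ("t3", "bob"), ("t3", "alice"),
    ]
    for author, score in sorted(authority_scores(sample).items()):
        print(author, round(score, 2))
```

Something this crude would favor prolific posters over accurate ones, so it is only a starting feature. The same per-author score could be attached to each message as an extra input to the topic models.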
It would also be helpful to build a model that identifies helpful answers from the replies they receive. Ken Krugler showed a simple version of this in his HUG talk, starting at roughly slide 9:

http://www.slideshare.net/sh1mmer/the-bixo-web-mining-toolkit

(Please ignore the final scores so that I don't have to blush.)

On Tue, Feb 22, 2011 at 9:15 PM, Stefan Henß <[email protected]> wrote:

> - Build categorization solely based on the conversation's texts (by
>   clustering).
