One way would be to filter out common words, create a set of the remaining
words in the question, then for each possible answer create a set of words
and check for a superset:

    keywords = get_word_set(question) - common_words
    probable_answers = [answer for answer in answers
                        if keywords <= get_word_set(answer)]

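To make that concrete, here is a minimal runnable sketch. The bodies of get_word_set and common_words are assumptions for illustration only (a regex tokenizer and a tiny hand-picked stop-word list), and find_probable_answers is a hypothetical wrapper name; a real bot would likely need a fuller stop-word list and language-aware tokenizing.

```python
import re

# Hypothetical stop-word list, for illustration only; a real one would be larger.
common_words = {"a", "an", "the", "is", "are", "of", "in", "to",
                "what", "how", "do", "i"}


def get_word_set(text):
    """Return the set of lowercased words found in text (assumed tokenizer)."""
    return set(re.findall(r"\w+", text.lower()))


def find_probable_answers(question, answers):
    """Return the answers that contain every non-common word of the question."""
    keywords = get_word_set(question) - common_words
    if not keywords:
        # An empty set is a subset of every set, so without this guard
        # every entry would count as an answer.
        return []
    # `keywords <= s` is True when s is a superset of keywords.
    return [answer for answer in answers if keywords <= get_word_set(answer)]
```

Note that this is looser than requiring the question's exact text to appear in the entry: word order and common words are ignored, so it is best treated as a cheap first-pass filter.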
2009/1/22 benny daon <[email protected]>

> Hi all,
> I'm working on the next release of http://www.tzafim.org a GPLv2 project.
> This release will have a bot that scans many feeds (probably using
> feedparser) checking for new entries that answer one of the questions on the
> site. A question is up to 256 chars long and to be considered an *answer*
> an entry must have the question's text in it.
> I'd appreciate any ideas on how to do it efficiently, or better yet code
> snippets I can integrate.
>
> Thanks,
>
> Benny
> --
> "Your task is not to foresee the future, but to enable it."
> http://tuzig.com/daonb
>
> _______________________________________________
> Python-il mailing list
> [email protected]
> http://hamakor.org.il/cgi-bin/mailman/listinfo/python-il
>

--
Check out my blog: http://orip.org
