I've run the data through Whoosh, and now the hardest part is culling the words. For example, these are the top 10 word counts:
(u'django', 15051), (u'have', 4066), (u'your', 3770), (u'us', 3311), (u'python', 2738), (u'some', 2713), (u'site', 2501), (u'code', 2359), (u'like', 2335), (u'project', 2327)
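A naive first pass, offered only as a sketch and not something that exists in the project, would be to filter the counts against a stop list and a minimum word length. The stop list and the `cull` helper below are hypothetical names chosen for illustration; the counts are the ten listed above.

```python
# Top-10 (word, count) pairs copied from the message above.
word_counts = [
    ("django", 15051), ("have", 4066), ("your", 3770), ("us", 3311),
    ("python", 2738), ("some", 2713), ("site", 2501), ("code", 2359),
    ("like", 2335), ("project", 2327),
]

# Hypothetical, hand-picked stop list; a real one would be much longer.
STOP_WORDS = frozenset({"have", "your", "us", "some", "like"})


def cull(counts, stop_words=STOP_WORDS, min_length=3):
    """Keep pairs whose word is not a stop word and has at least min_length characters."""
    return [
        (word, count)
        for word, count in counts
        if word.lower() not in stop_words and len(word) >= min_length
    ]


candidates = cull(word_counts)
for word, count in candidates:
    print(word, count)
```

This keeps django, python, site, code and project. Several of those are probably still too generic to be useful tags on a Django site, so a stop list alone does not settle the question.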
Any ideas on how to sort out the relevant tags?

On Jun 25, 4:36 pm, benny daon wrote:
> Hi all,
> I've got a project going with the aim of improving djangoproject.com.
> So far I've forked the original code, cleaned it up, added buildout so
> installation will be a breeze, and added django-south so we can easily
> upgrade the database.
> Jacob KM sent me a link to a dump of the current database, which I included
> in the migration script so the code pulls the dump and uses it to create the
> database and add all the rows. There are almost 5000 rows in the model,
> pointing to Django-related posts. The next step is to extract common tags
> from the title and summary fields of the FeedItem.
> A friend recommended I use Solr or Lucene for this job, which makes sense. My
> issue is that I have never used them before. If you know what needs to be done
> and have some time, please assign this ticket,
> http://bitbucket.org/daonb/django-website/issue/3/, to yourself, fork the
> code, do it, and send me a 'pull request'.
>
> Thanks,
>
> Benny.
>
> BTW - there's much more to do in this project. Please feel free to open
> tickets with suggestions/bugs or, better yet, send code. Jacob said he will
> use it in the live site.
