I noticed a couple of issues with the Google Code page. They are language issues, such as improper capitalization and some grammar. Any critic will spot these right away, so I did :-)
- The description under the title of the page should say "... English Wikipedia..."
- I corrected the description for you. Go through it and change whatever you want to:

This project uses Python (for a converter that turns MediaWiki-format text into the corresponding HTML page), Django (as a web server, to fix up CSS and other details), and a PostgreSQL database to locate each article and its span. The basic working module was taken from this article: http://users.softlab.ece.ntua.gr/~ttsiod/buildWikipediaOffline.html; the difference is the Python parser for the XML->HTML conversion. So in place of the PHP setup that Wikipedia uses, this application uses Python. The idea that remains intact is to take the XML dump provided by the MediaWiki site, break it into small files using bzip2recover, and then create a database that maps each article to its location in the list of files. For now, I am using the dump from 24th July 2008. For me, this dump, the database and the rest of the configuration take around 4.6G (3.9G of XML files, a ~700MB PostgreSQL dump, and some CSS and JS files taken from the MediaWiki site itself).

Great job, baali :-P

cheers
pratul

--
Incoming! Freed.in 2009, 20-21 February.
dum vivimus, vivamus.
http://pratul.in
_______________________________________________
ilugd mailing list
http://frodo.hserus.net/mailman/listinfo/ilugd
Archives at: http://news.gmane.org/gmane.user-groups.linux.delhi
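
P.S. In case it helps readers of the page, here is a rough sketch of the indexing idea from the description: scan the small files produced by bzip2recover and record which file each article title appears in. This is only my reading of the approach, not the project's actual code. The function names and table layout are made up for illustration, sqlite3 stands in for PostgreSQL so the sketch runs without a server, and it ignores articles that cross a file boundary (the real database also stores the article's span).

```python
import bz2
import re
import sqlite3
from pathlib import Path

# Matches the <title> element of a page in the MediaWiki XML dump.
TITLE_RE = re.compile(rb"<title>(.*?)</title>")


def build_title_index(chunk_dir, db_path=":memory:"):
    """Record, for every article title, the chunk file its <title> tag is in.

    chunk_dir is assumed to hold the *.bz2 pieces written by bzip2recover.
    sqlite3 is used here only as a stand-in for the PostgreSQL database.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles (title TEXT, chunk TEXT)"
    )
    for chunk in sorted(Path(chunk_dir).glob("*.bz2")):
        # Each recovered piece is a small standalone bzip2 file.
        data = bz2.decompress(chunk.read_bytes())
        rows = [
            (m.group(1).decode("utf-8", "replace"), chunk.name)
            for m in TITLE_RE.finditer(data)
        ]
        conn.executemany("INSERT INTO articles VALUES (?, ?)", rows)
    conn.commit()
    return conn


def find_chunk(conn, title):
    """Return the name of the chunk file holding the title, or None."""
    row = conn.execute(
        "SELECT chunk FROM articles WHERE title = ?", (title,)
    ).fetchone()
    return row[0] if row else None
```

Serving a page would then mean looking up the title with find_chunk, decompressing just that one small file, and passing the article's wikitext to the Python converter.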
