I noticed a couple of issues with the Google Code page...
These are language issues like improper capitalization and some
grammar slips. Any critic would spot these right away. So I did :-)

 - The description under the title of the page should say "... English
Wikipedia..."

 - I corrected the description for you. Go through it and change
whatever you want.

This project uses Python (for a converter that turns MediaWiki-format
text into the corresponding HTML page), Django (as a web server, to
fix CSS and other stuff), and a PostgreSQL database to locate each
article and its span.

The basic working module was taken from this article:
http://users.softlab.ece.ntua.gr/~ttsiod/buildWikipediaOffline.html
The difference is the Python parser for XML->HTML conversion. So in
place of the PHP kind of setup which Wikipedia uses, this application
uses Python. The idea that remains intact is to take the XML dump
provided by the MediaWiki site, break it into small files using
bzip2recover, and then create a database of each article and its
location in the list of files. For now, I am using the dump from 24th
July 2008. For me, this dump, the database and the rest of the
configuration take around 4.6 GB (3.9 GB of XML files, a ~700 MB
PostgreSQL dump, and some CSS and JS files taken from the MediaWiki
site itself).
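
To make the "database of each article and its location" idea concrete,
here is a rough sketch of how such an index could be built. It is not
the project's actual code. It assumes the chunks written by
bzip2recover sit in one directory with names like rec00001<name>.bz2,
and it uses a plain dict where the project uses PostgreSQL. The
function name build_title_index is made up for illustration. It only
records the chunk where a title first appears, not the article's full
span.

```python
import bz2
import re
from pathlib import Path

# Matches a <title>...</title> element that sits on a single line of the dump.
TITLE_RE = re.compile(r"<title>(.*?)</title>")


def build_title_index(chunk_dir):
    """Map each article title to the name of the chunk file where it first appears.

    Hypothetical helper: a dict stands in for the PostgreSQL table.
    """
    index = {}
    # Assumed naming: bzip2recover output files start with "rec" and end in ".bz2".
    for chunk in sorted(Path(chunk_dir).glob("rec*.bz2")):
        # Each chunk is an independent bzip2 stream, so it can be read on its own.
        with bz2.open(chunk, "rt", encoding="utf-8", errors="replace") as fh:
            for line in fh:
                match = TITLE_RE.search(line)
                if match:
                    # Keep the first chunk seen for a title; later hits are ignored.
                    index.setdefault(match.group(1), chunk.name)
    return index
```

Looking up an article is then just index["Some title"], which gives the
chunk file to decompress and parse. A real setup would store these
rows in the database instead of keeping them in memory.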

Great job, baali :-P

cheers
pratul

-- 
Incoming! Freed.in 2009, 20-21 February.
dum vivimus, vivamus.
http://pratul.in

_______________________________________________
ilugd mailinglist -- [email protected]
http://frodo.hserus.net/mailman/listinfo/ilugd
Archives at: http://news.gmane.org/gmane.user-groups.linux.delhi 
http://www.mail-archive.com/[email protected]/