Hi everyone, we need your help.

We are from Python Argentina, and we are working with educ.ar and the Wikimedia Foundation to adapt our cdpedia project into a DVD holding the entire Spanish Wikipedia, which will soon be sent to Argentinian schools.

Hernán and Diego are the two interns tasked with updating the data that cdpedia uses to build the CD (it currently uses a static HTML dump dated June 2008), but they are running into problems while trying to make an up-to-date static HTML dump of es-wikipedia.

I'm ccing this list of people because I'm sure you've faced similar issues when making your offline Wikipedias, or because you may know someone who can help us.

Following is an email from Hernán describing the problems he's found.

Thanks!
--
alecu - Python Argentina

2010/4/30 Hernan Olivera <[email protected]>:

Hi everybody,

I've been working on making an up-to-date static HTML dump of the Spanish Wikipedia, to use as a basis for the DVD.

I've followed the procedures detailed in the pages below, which were used to generate the current (and out-of-date) static HTML dumps:

1) installing and setting up a MediaWiki instance
2) importing the XML from [6] with mwdumper
3) exporting the static HTML with MediaWiki's tool

The procedure finishes without throwing any errors, but the XML import produces malformed HTML pages with visible wikimarkup.

We really need a successful import of the Spanish XML dumps into a MediaWiki instance so that we can produce the up-to-date static HTML dump.

Links to the info I used:

[0] http://www.mediawiki.org/wiki/Manual:Installation_guide/es
[1] http://www.mediawiki.org/wiki/Manual:Running_MediaWiki_on_Ubuntu
[2] http://en.wikipedia.org/wiki/Wikipedia_database
[3] http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps
[4] http://meta.wikimedia.org/wiki/Importing_a_Wikipedia_database_dump_into_MediaWiki
[5] http://meta.wikimedia.org/wiki/Data_dumps
[6] http://dumps.wikimedia.org/eswiki/20100331/
[7] http://www.mediawiki.org/wiki/Alternative_parsers
(among others)

Cheers,
--
Hernan Olivera

PS: Unfortunately I didn't write down every step in detail. I did many more tests than what I wrote here.
To make a detailed report I'd like to go through the procedure again, writing down every option (and checking whether I missed something). I'm finishing installing a server just for this, because these processes take forever and they blocked other tasks while I was running these tests.

2009/10/23 Samuel Klein <[email protected]>:
> Jimbo - thanks for the spur to clean up the existing work.
>
> All - Let's start by cleaning up the mailing lists and setting a few
> short-term goals :-) It's a good sign that we have both charity and love
> converging to make something happen.
>
> * For all-platform all-purpose wikireaders, let's use
> [email protected], as we discussed a month ago in the aftermath of
> Wikimania (Erik, were you going to set this up? I think we agreed to
> deprecate wiki-offline-reader-l and replace it with offline-l.)
>
> * For wikireaders such as WikiBrowse and Infoslicer on the XO, please
> continue to use [email protected]
>
>
> I would like to see WikiBrowse become the 'sugarized' version of a reader
> that combines the best of that and the openZim work. A standalone DVD or
> USB drive that comes with its own search tools would be another version of
> the same. As far as merging codebases goes, I don't think the WikiBrowse
> developers are invested in the name.
>
> I think we have a good first cut at selecting articles, weeding out stubs,
> and including thumbnail images. Maybe someone working on openZim can
> suggest how to merge the search processes, and that file format seems
> unambiguously better.
>
> Kul - perhaps part of the work you've been helping along for standalone
> usb-key snapshots would be useful here.
>
>
> Please continue to update this page with your thoughts and progress!
> http://meta.wikimedia.org/wiki/Offline_readers
>
> SJ
>
>
> 2009/10/23 Iris Fernández <[email protected]>
>>
>> On Fri, Oct 23, 2009 at 1:37 PM, Jimmy Wales <[email protected]> wrote:
>> >
>> > My dream is quite simple: a DVD that can be shipped to millions of
>> > people with an all-free-software solution for reading Wikipedia in Spanish.
>> > It should have a decent search solution, doesn't have to be perfect, but it
>> > should be full-text. It should be reasonably fast, but super-perfect is not
>> > a consideration.
>> >
>>
>> Hello! I am an educator, not a programmer. I can help selecting
>> articles or developing categories related to school issues.
>
> Iris - you know the main page of WikiBrowse that you see when the reader
> first loads? You could help with a new version of that page. Madeleine
> (copied here) worked on the first one, but your thoughts on improving it
> would be welcome.
>
>
_______________________________________________
dev-l mailing list
[email protected]
https://intern.openzim.org/mailman/listinfo/dev-l
