Thank you as always for this work.
It is enormously helpful, for casual analysis as well as deep research.  SJ
On Feb 6, 2015 12:37 AM, "Federico Leva (Nemo)" <nemow...@gmail.com> wrote:

> I just published https://archive.org/details/wikia_dump_20141219 :
>
> ----
>
> Snapshot of all the known Wikia dumps. Where a Wikia public dump was
> missing, we produced one ourselves. 9 broken wikis, as well as lyricswikia
> and some wikis for which dumpgenerator.py failed, are still missing; some
> Wikia XML files are incorrectly terminated and probably incomplete.
>
> In detail, this item contains dumps for 268 902 wikis in total: 21 636 full
> dumps produced by Wikia, 247 266 full XML dumps produced by us, and 5610
> image dumps produced by Wikia. Up to 60 752 wikis are missing.
> Nonetheless, this is the most complete Wikia dump ever produced.
>
> ----
>
> We appreciate help to:
> * verify the quality of the data (for the Wikia dumps I only checked that
> the gzipped files decompress cleanly; for the WikiTeam dumps, only XML
> well-formedness, see https://github.com/WikiTeam/wikiteam/issues/214 );
> a rough sketch of both checks is pasted after this list;
> * figure out what's going on for those 60k missing wikis
> https://github.com/WikiTeam/wikiteam/commit/a1921f0919c7b44cfef967f5d07ea4953b0a736d ;
> * improve dumpgenerator.py management of huge XML files
> https://github.com/WikiTeam/wikiteam/issues/8 ;
> * fix anything else! https://github.com/WikiTeam/wikiteam/issues
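>
> As a starting point for the quality checks above, here is a minimal,
> untested sketch using only the Python standard library; the command-line
> file names are just examples, and it assumes Wikia dumps arrive as .gz
> files and WikiTeam dumps as uncompressed XML:
>
> #!/usr/bin/env python
> # Verify gzip integrity and XML well-formedness of dump files.
> import gzip
> import sys
> import xml.etree.ElementTree as ET
>
> def check_gzip(path):
>     # Read the whole stream; a truncated or corrupt gzip raises an error.
>     try:
>         with gzip.open(path, 'rb') as f:
>             while f.read(1024 * 1024):
>                 pass
>         return True
>     except (OSError, IOError, EOFError) as e:
>         print('%s: bad gzip (%s)' % (path, e))
>         return False
>
> def check_xml(path):
>     # Stream-parse so huge XML dumps do not need to fit in memory.
>     try:
>         for _event, elem in ET.iterparse(path):
>             elem.clear()  # discard elements we are done with
>         return True
>     except ET.ParseError as e:
>         print('%s: not well-formed (%s)' % (path, e))
>         return False
>
> if __name__ == '__main__':
>     for name in sys.argv[1:]:
>         ok = check_gzip(name) if name.endswith('.gz') else check_xml(name)
>         print('%s: %s' % (name, 'OK' if ok else 'BROKEN'))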
>
> For all updates on Wikia dumps, please watchlist/subscribe to the feed of:
> http://archiveteam.org/index.php?title=Wikia (notable update: future
> Wikia dumps will be 7z).
>
> Nemo
>
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
