Emilio,

I'm very interested in making your XML dump processing work easier.  If you
file any bugs against the old[1] or new[2] libraries, I'll be quick to turn
around on them.

1. https://bitbucket.org/halfak/wikimedia-utilities
2. https://github.com/halfak/mediawiki-utilities

-Aaron


On Mon, May 12, 2014 at 10:30 AM, Morten Wang <[email protected]> wrote:

> Hi Emilio,
>
> You're probably aware of it, but one way to handle your own installs is to
> use virtual environments: https://virtualenv.pypa.io/en/latest/
>
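
As a rough illustration of the virtual environment suggestion, here is a minimal sketch. It assumes Python 3 and uses the standard-library venv module rather than the virtualenv tool linked above, so it would only suit the newer, Python 3 library. The create_env helper and the ~/venvs/dumps location are hypothetical names chosen for the example.

```python
import venv
from pathlib import Path


def create_env(target, with_pip=True):
    """Create a virtual environment at `target` and return its path.

    `target` is a hypothetical location; any writable directory works.
    """
    target = Path(target).expanduser()
    # with_pip=True asks venv to bootstrap pip into the new environment
    venv.create(target, with_pip=with_pip)
    return target


if __name__ == "__main__":
    env_dir = create_env("~/venvs/dumps")
    print("Created", env_dir)
```

Once the environment exists, activating it (source ~/venvs/dumps/bin/activate) makes pip install packages into your own directory, so no admin is needed.
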
> BTW, the Python utilities you pointed to are now deprecated in favour of a
> newer version, but the newer version is Python 3.x only:
> http://pythonhosted.org/mediawiki-utilities/
>
> I have the older version of his utilities installed in my virtual
> environment. When I processed the English dump about a month ago I used
> tools-dev for testing and then submitted jobs to the job servers when it
> was ready, running over the smaller split files of the dump for
> parallelisation and less memory usage.
>
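
The split-file approach described above could look roughly like the sketch below: one small script that streams a single split file, so each file can be submitted as its own job. It uses only the Python 3 standard library, not either of the utilities libraries, and the count_pages name and the task itself (counting pages) are placeholders for whatever processing you actually need.

```python
import bz2
import sys
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace: '{ns}page' becomes 'page'."""
    return tag.rsplit("}", 1)[-1]


def count_pages(path):
    """Stream one split dump file and count its <page> elements."""
    # Split dump files are usually bz2-compressed; plain XML also works
    opener = bz2.open if path.endswith(".bz2") else open
    pages = 0
    with opener(path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if local_name(elem.tag) == "page":
                pages += 1
                # Drop the finished page's content to keep memory use low
                elem.clear()
    return pages


if __name__ == "__main__":
    # Each job passes one (or a few) split files on the command line
    for path in sys.argv[1:]:
        print(path, count_pages(path))
```

Because every invocation handles only the files it is given, parallelism comes from submitting one job per split file, and memory stays bounded by the size of a single page rather than the whole dump.
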
> From what I've heard the newer library is considerably faster than the 2.x
> version, but I haven't yet had a project where I could test that.
>
>
> Regards,
> Morten
>
>
>
> On 11 May 2014 13:10, Emilio J. Rodríguez-Posada <[email protected]> wrote:
>
>> Hi;
>>
>> I would like to process some Wikipedia dumps. Is tools-dev the right place
>> for this? I don't see Wikimedia Utilities[1] available there.
>>
>> Do I have to install it myself, or is this a task for an admin?
>>
>> Regards
>>
>> [1] https://bitbucket.org/halfak/wikimedia-utilities/wiki/Home
>>
>> _______________________________________________
>> Labs-l mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/labs-l
>>
>>
>
