Hello

2009/10/3 Santiago M. Mola <[email protected]>:
> Hello,
>
> While writing a bot, I had to discard redirected pages from the XML
> dump. In order to be able to do it early, I modified xmlreader.py to
> parse the <redirect /> tag and add it to XmlEntry. I'm attaching the
> patch, which is not extensively tested.
>
> I haven't updated the regex_parse method since it looks outdated
> anyway (it tries to create an XmlEntry with different arguments than
> usual).
>

Thanks for your contribution!

I tweaked the patch a bit:
- I renamed the "redirect" attribute to "isredirect"
- Logic cleanup
- Adding a relevant test in tests/test_xmlreader.py

The resulting patch is in trunk as r7366

--
Nicolas Dumazet — NicDumZ

_______________________________________________
Pywikipedia-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l

Reply via email to