Hey guys, I'm writing a Python program that deals with a dump of Wikipedia. I've parsed the XML dump into elements (title, id, text, etc.) but am looking for a way to parse the text (within the <text> tags) to deal with the wikitext markup. I'm on Windows and mwlib is taking a lot of effort to get working, so I thought I'd ask whether this is something it can do, or whether I should just continue writing a parser myself. What I'm really looking for is something that parses wikitext the way that, say, lxml parses XML - returning a list or tree of elements that I can then work with.
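To make "a list of elements" a bit more concrete, here is a rough sketch of the kind of output I mean. It is only an illustration, not mwlib's API: the names TOKEN_RE and tokenize_wikitext are made up, and it only understands [[links]] and '''bold''', nothing like full wikitext (no templates, tables, nesting, etc.).

```python
import re

# Hypothetical pattern covering only two constructs:
#   [[target]] or [[target|label]]  and  '''bold'''
TOKEN_RE = re.compile(
    r"\[\[(?P<target>[^\]|]+)(?:\|(?P<label>[^\]]+))?\]\]"
    r"|'''(?P<bold>.+?)'''"
)


def tokenize_wikitext(text):
    """Return a flat list of (kind, value) tuples for a tiny wikitext subset."""
    elements = []
    pos = 0
    for match in TOKEN_RE.finditer(text):
        # Anything between the previous match and this one is plain text.
        if match.start() > pos:
            elements.append(("text", text[pos:match.start()]))
        if match.group("target") is not None:
            # A link without an explicit label displays its target.
            label = match.group("label") or match.group("target")
            elements.append(("link", (match.group("target"), label)))
        else:
            elements.append(("bold", match.group("bold")))
        pos = match.end()
    # Trailing plain text after the last match.
    if pos < len(text):
        elements.append(("text", text[pos:]))
    return elements


if __name__ == "__main__":
    sample = "'''Python''' is a [[programming language|language]]."
    for element in tokenize_wikitext(sample):
        print(element)
```

A flat list like this is about the simplest shape that would work for me; a proper tree (so that bold text can contain links, and so on) would be better, which is why I'd rather use an existing parser than grow this by hand.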
Thanks a lot for any advice.
