Hi, thanks for your reply! I've been able to use parseString, but how do I find template databases? Are they downloadable from Wikipedia?
Thanks!

On Feb 9, 12:10 pm, Osipov <[email protected]> wrote:
> On Feb 5, 20:02, Alex Rades <[email protected]> wrote:
>
> > Hi,
> > I'm trying to do some very basic parsing of Wikipedia articles, but I
> > can't find an entry point in the mwlib documentation :) The only thing
> > I can find is the help for the various mw-* commands. Am I
> > missing something?
>
> > Basically, I need to extract the Infobox (the first table on the right
> > of the article) of a given article, and the first paragraph of the
> > article. Is that possible with mwlib?
>
> > Thank you very much
>
> If nothing has changed in the last 3 months, the only documentation is the
> source code of mwlib, which can be found (on Linux) at
> /usr/lib/python*/site-packages/mwlib*/mwlib/ if you installed it with
> easy_install.
>
> See old_uparser.py which is the source code of the various mw-*
> commands.
> I think that to extract the Infobox, you need to parse the article using Python:
> from mwlib import uparser
> a=uparser.parseString(<title>, wikidb=<database>, raw=<raw>)
>
> This function returns a tree representation of the article, where you can
> find what you want. <database> is used to substitute templates, <raw>
> is the article source code.
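
For comparison, here is a rough sketch of the original goal (Infobox plus first paragraph) that works directly on the raw article source using only the Python standard library. It does not use mwlib and does not expand templates, so it needs no template database. The function names extract_infobox and first_paragraph and the SAMPLE text are made up for illustration, and the brace matching is deliberately naive, so treat it as a starting point under those assumptions, not as how mwlib does it.

```python
import re

# Hypothetical sample article source, only for illustration.
SAMPLE = """{{Infobox person
| name = Ada Lovelace
| born = {{birth date|1815|12|10}}
}}
'''Ada Lovelace''' was an English mathematician.

== Life ==
She worked with Charles Babbage.
"""


def extract_infobox(raw):
    """Return the raw text of the first {{Infobox ...}} template, or None."""
    match = re.search(r"\{\{\s*infobox", raw, re.IGNORECASE)
    if match is None:
        return None
    start = match.start()
    depth = 0
    i = start
    # Walk the text and count nested {{ ... }} pairs until the
    # template that opened at `start` is closed again.
    while i < len(raw) - 1:
        pair = raw[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                return raw[start:i]
        else:
            i += 1
    return None  # braces never balanced


def first_paragraph(raw):
    """Return the first block of plain text after removing the Infobox."""
    infobox = extract_infobox(raw)
    body = raw.replace(infobox, "", 1) if infobox else raw
    for block in body.split("\n\n"):
        block = block.strip()
        # Skip empty blocks, other templates, images and section headings.
        if block and not block.startswith(("{{", "[[File:", "[[Image:", "=")):
            return block
    return None


if __name__ == "__main__":
    print(extract_infobox(SAMPLE))
    print(first_paragraph(SAMPLE))
```

The Infobox comes back as unparsed wikitext, so its fields would still need to be split on "|" and "=" by hand, and nested templates inside it are left as they are.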
