On Feb 5, 20:02, Alex Rades <[email protected]> wrote:
> Hi,
> I'm trying to do a very basic parsing of Wikipedia articles but I
> can't find an entry point in the mwlib documentation :) The only thing
> that I'm finding is the help of the various mw-* commands. Am I
> missing something?
>
> Basically, I need to extract the Infobox (the first table on the right
> of the article) of a given article, and the first paragraph of the
> article. Is it possible with mwlib?
>
> Thank you very much

Unless something has changed in the last 3 months, the only documentation
is the mwlib source code itself. On Linux, if you installed it with
easy_install, it can be found at
/usr/lib/python*/site-packages/mwlib*/mwlib/.

See old_uparser.py, which is the source code of the various mw-*
commands.
I think that to extract the Infobox you need to parse the article in
Python:
  from mwlib import uparser
  a = uparser.parseString(<title>, wikidb=<database>, raw=<raw>)

This function returns a tree representation of the article, in which you
can find what you want. <database> is used to expand templates, and <raw>
is the article's wikitext source.
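
Once you have the tree, finding the first table and the first paragraph
is a plain depth-first walk. Below is a rough sketch of that idea, not
mwlib's own API: it uses small stand-in node classes so it runs without
mwlib installed. It assumes (please check against the source) that real
mwlib nodes expose a `children` list, can be told apart by class name
such as "Table" and "Paragraph", and keep their text in `caption`. The
helpers walk, first_of and text_of are hypothetical names of mine.

```python
# Stand-in node classes so the sketch runs without mwlib installed.
# ASSUMPTION: real mwlib nodes also expose a `children` list, store text in
# `caption`, and use class names like "Table" and "Paragraph".
class Node(object):
    def __init__(self, *children, **kw):
        self.children = list(children)
        self.caption = kw.get("caption", "")

class Article(Node): pass
class Table(Node): pass
class Paragraph(Node): pass
class Text(Node): pass


def walk(node):
    """Yield node and all of its descendants, depth-first, in document order."""
    yield node
    for child in getattr(node, "children", []):
        for descendant in walk(child):
            yield descendant


def first_of(tree, class_name):
    """Return the first node whose class is named class_name, or None."""
    for node in walk(tree):
        if node.__class__.__name__ == class_name:
            return node
    return None


def text_of(node):
    """Concatenate the captions of all Text nodes found under node."""
    return "".join(n.caption for n in walk(node)
                   if n.__class__.__name__ == "Text")


# Toy tree standing in for the result of uparser.parseString(...).
tree = Article(
    Table(Text(caption="Born: 1879")),
    Paragraph(Text(caption="Albert Einstein was a physicist.")),
    Paragraph(Text(caption="Second paragraph.")),
)

infobox = first_of(tree, "Table")      # first table in document order
intro = first_of(tree, "Paragraph")    # first paragraph in document order
print(text_of(infobox))
print(text_of(intro))
```

With a real article you would pass the value returned by
uparser.parseString instead of the toy tree. Note that the first table is
only a guess at the Infobox; it may be some other table if the article
has no Infobox or if templates were not expanded.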