Re: [mwlib] Re: An mwlib tutorial/examples?

Ralf Schmitt Mon, 04 Jan 2010 02:26:28 -0800

Nazarius Kappertaal <[email protected]> writes:

>
> I mostly just need to see how to parse and extract data from the tree
> that is created when one parses a mediawiki article - links/images/
> tables -> currently doing it with a variety of regexes (multi-pass is
> bad).
>
> A concrete tutorial/walkthrough would be akin to Dive Into Python,
> centred around some type of complex, featured article -> en.wikipedia
> -> DNA/World War II.
>
> And just run down the most used aspects of mwlib, using the featured
> article.
>
> Nothing long, just some code that works, and shows (via comments/
> descriptions) how one can use mwlib in real life. Make it easier for
> outsiders.
>


The following is a basic example.


#!/usr/bin/env python
# Build the input archive first:
#   mw-zip -x -c :en -o acdc.zip AC/DC

from mwlib import wiki
from mwlib.refine import core

# Open the zip archive and parse the article into a node tree.
env = wiki.makewiki("acdc.zip")
a = env.wiki.getParsedArticle("AC/DC")

# Collect every level-2 section and dump the first child of each.
sections = core.walknodel([a], lambda x: x.tagname == "@section" and x.level == 2)
for s in sections:
    core.show(s.children[:1])

print("------------")
# Dump every node whose type is t_complex_link.
for k in core.walknodel([a], lambda x: x.type == core.T.t_complex_link):
    core.show(k)
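
For readers without mwlib installed, the filtering idea used above (walk a tree and keep the nodes for which a predicate returns true) can be sketched on a toy tree. This is only an illustration: `Node` and `walk_nodes` are hypothetical stand-ins, not mwlib classes or functions, and the traversal order and the choice to keep descending into matching nodes are assumptions of this sketch, not a description of what `core.walknodel` does internally.

```python
# Toy stand-in for a parsed-article node; NOT mwlib's real node class.
class Node(object):
    def __init__(self, type, tagname=None, level=None, children=None):
        self.type = type
        self.tagname = tagname
        self.level = level
        self.children = children or []


def walk_nodes(nodes, predicate):
    """Return, in pre-order, every node in the given trees that satisfies predicate."""
    found = []
    # Explicit stack instead of recursion; reversed so the first node is popped first.
    stack = list(reversed(nodes))
    while stack:
        node = stack.pop()
        if predicate(node):
            found.append(node)
        # Assumption of this sketch: keep descending even below a matching node.
        stack.extend(reversed(node.children))
    return found


# A tiny hand-built "article": one level-2 section and one level-3 section.
article = Node("article", children=[
    Node("section", tagname="@section", level=2, children=[
        Node("caption"),
        Node("link"),
    ]),
    Node("section", tagname="@section", level=3, children=[
        Node("link"),
    ]),
])

# Same style of predicates as in the mwlib example, applied to the toy tree.
sections = walk_nodes([article], lambda n: n.tagname == "@section" and n.level == 2)
links = walk_nodes([article], lambda n: n.type == "link")
```

With this toy tree, `sections` holds the single level-2 section and `links` holds both link nodes. Swapping the lambda is all it takes to pull out a different kind of node, which is what makes a single predicate-driven pass a workable replacement for several regex passes.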



- Ralf

