On Fri, Aug 6, 2010 at 10:18 AM, Léa Massiot <[email protected]> wrote:
> Are you sure this will be able to extract the
> introductory paragraph (only) which is not in any
> section... (because it is not trivial).
>
> There is only one example I could find at
> http://code.pediapress.com/wiki/wiki/mwlib
> ... which is not so easy to understand by the way...
>
> Cheers,
> --
> Lmhelp
>

mwlib is a fairly full-featured parser for wikitext. It is not documented, but by using Python's built-in dir() and help() functions you can easily navigate its methods. It creates a parse tree from which you can reconstruct the plain text. Once you have the plain text, extracting paragraphs is straightforward.

_______________________________________________
MediaWiki-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
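
To make the "introductory paragraph only" part concrete, here is a rough sketch. Note that it does not use mwlib at all: it is a swapped-in, plain-regex approach that works directly on raw wikitext and assumes section headings follow the usual "== Title ==" form at the start of a line. The names HEADING, lead_paragraphs and sample are made up for illustration. Wiki markup (bold quotes, links, templates) is left in the result, so a real parser such as mwlib would still be needed to turn it into clean plain text.

```python
import re

# Assumption: a section heading is a line such as "== History ==" (two or more "=").
HEADING = re.compile(r"^={2,}.*?={2,}\s*$", re.MULTILINE)


def lead_paragraphs(wikitext):
    """Return the paragraphs that come before the first section heading."""
    match = HEADING.search(wikitext)
    # Everything before the first heading is the lead; with no heading, use it all.
    lead = wikitext[:match.start()] if match else wikitext
    # Paragraphs are separated by one or more blank lines.
    return [p.strip() for p in re.split(r"\n\s*\n", lead) if p.strip()]


# Hypothetical sample page, not taken from any real wiki.
sample = """'''Foo''' is an example article.

It has a second lead paragraph.

== History ==
Text inside a section.
"""

for paragraph in lead_paragraphs(sample):
    print(paragraph)
```

The first element of the returned list is the introductory paragraph; anything under a heading is ignored.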
