Re: [Mediawiki-l] Wikitext grammar

BPJ Mon, 09 Aug 2010 11:03:11 -0700

2010-08-07 20:24, lmhelp wrote:
>
>> So why not use the "real" parser?
>
> Exactly. Where can it be found, please?
>
> Thanks and all the best,
> --
> Lmhelp


Fetch the HTML from wikipedia.org with something like wget
(playing nicely and using delays!) and then extract the
first <p> element with something which parses the HTML
into a tree. I've done that using Perl with HTML::Tree.
Generally a regular expression like /<p\b.+?<\/p>/s (the
/s modifier lets "." match across newlines) might do the
extraction just as well, and cheaper and faster, if you're
just after the first <p> element!
Really cheap, I know!
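
For illustration, here is a rough sketch of both extraction
approaches. It is in Python rather than the Perl I actually
used, and it relies only on the standard library: the regex
is the one above with DOTALL standing in for /s, and
html.parser is an event-based stand-in for a real tree
builder such as HTML::Tree. The names first_p_regex,
first_p_text and FirstParagraph are made up for this
example, and fetching the page (wget with delays) is left
out on purpose.

```python
import re
from html.parser import HTMLParser

# Non-greedy match from the first <p ...> to the nearest </p>.
# re.DOTALL lets "." cross newlines; re.IGNORECASE also accepts <P>.
FIRST_P = re.compile(r"<p\b.+?</p>", re.DOTALL | re.IGNORECASE)


def first_p_regex(html):
    """Return the first <p>...</p> span as raw HTML, or None if absent."""
    match = FIRST_P.search(html)
    return match.group(0) if match else None


class FirstParagraph(HTMLParser):
    """Collect the text inside the first <p> element and ignore the rest."""

    def __init__(self):
        super().__init__()
        self.in_p = False   # True while inside the first <p>
        self.done = False   # True once the first <p> has been closed
        self.parts = []     # text chunks seen inside the first <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p" and not self.done and not self.in_p:
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == "p" and self.in_p:
            self.in_p = False
            self.done = True

    def handle_data(self, data):
        if self.in_p:
            self.parts.append(data)


def first_p_text(html):
    """Return the text of the first <p> element, or None if none was closed."""
    parser = FirstParagraph()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts) if parser.done else None
```

The regex version returns the raw markup including inner
tags; the parser version returns only the text. Neither is
robust against every kind of broken HTML, so treat this as
a starting point.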

/BP

_______________________________________________
MediaWiki-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
