If you only need to extract the first paragraph of Wikipedia articles, no problem.
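
A rough sketch in Python of the filtering rules quoted below might look like this. It is only one possible reading, not a tested parser for real Wikipedia dumps: the function names `strip_balanced` and `first_paragraph` and the `skip_bullets` flag are made up for illustration, and it removes the balanced {{...}} and [[...]] spans themselves rather than dropping every line that contains one.

```python
def strip_balanced(text, open_tok, close_tok):
    """Remove every balanced open_tok ... close_tok span, nested ones included."""
    out = []
    depth = 0  # how many unclosed open_tok we are currently inside
    i = 0
    while i < len(text):
        if text.startswith(open_tok, i):
            depth += 1
            i += len(open_tok)
        elif depth > 0 and text.startswith(close_tok, i):
            depth -= 1
            i += len(close_tok)
        else:
            if depth == 0:
                out.append(text[i])  # keep only text outside any span
            i += 1
    return "".join(out)


def first_paragraph(wikitext, skip_bullets=False):
    """Return the first run of ordinary text lines, joined with spaces."""
    # Spans may cover several lines, so strip them before splitting into lines.
    text = strip_balanced(wikitext, "{{", "}}")
    text = strip_balanced(text, "[[", "]]")

    paragraph = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            if paragraph:
                break  # a blank line ends the paragraph once it has started
            continue
        if stripped in ("__TOC__", "__NOTOC__"):
            continue
        if stripped.startswith("="):
            if paragraph:
                break  # a heading also ends the paragraph
            continue
        if skip_bullets and stripped.startswith("*"):
            continue
        paragraph.append(stripped)
    return " ".join(paragraph)
```

Lines beginning with '*' are kept by default, following Katharina's remark in the quoted message; passing `skip_bullets=True` gives the behaviour from the original list instead.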

2010/8/6 Katharina Wolkwitz <[email protected]>

> Hi,
>
> On 05.08.2010 16:47, lmhelp2 wrote:
> >
> > Thank you!
> >
> > So here is the list I have for the moment:
> > I need to ignore lines:
> > - containing: {{...}}
> >           => possibly spreading over several lines,
> >           => being possibly nested {{... {{ ... }} ... }}.
> > - containing: [[...]]
> >           => being possibly nested [[... [[ ... ]] ... ]].
> > - equal to: __TOC__
> > - equal to: __NOTOC__
> > - beginning with the '=' character
> > - beginning with the '*' character
> I don't think you should ignore lines beginning with the '*' character -
> those may include the wanted first paragraph of the text, as the '*' is
> just a way of formatting the page...
>
> Greetings
> Katharina
>
> _______________________________________________
> MediaWiki-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
>



-- 
{+}Nevinho
Come join the Collaborative Movement http://sextapoetica.com.br !!