[libreoffice-users] LO Writer and regex - finding "everything" but one thing

gordom Thu, 21 May 2015 00:41:04 -0700

Hello everybody.
Probably the title of this post is not very clear, sorry for that ;).

I have a bunch of text (HTML code) and need to find tags together with their classes, ids, styles (if any), etc. I'm doing this using the following regexes:

<p(.*?)> or (<p([^>]+))>

My text follows this pattern:

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Aliquam mi sapien, rutrum eget sem vel, semper efficitur.<a href="xyz.html" class="topiclink">vitae velit</a>

Donec fringilla sapien vitae interdum volutpat.

Cras nec orci non dolor ultrices luctus sit amet vitae velit.

The problem is that I need to find every occurrence of the tag except those with one certain class (i.e. I want to skip paragraph tags of this class). I don't know how to write a regex exclusion that is treated as a string, not as a set of individual characters. I tried to use back-references, with no success. I want to use regex because the tag classes to be avoided are different on each page (but they follow a certain pattern), and the job should be as automatic as possible (the code should be as versatile as possible).
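To make the goal concrete, here is a minimal sketch of one possible approach, a negative lookahead, written in Python rather than in Writer's Find & Replace dialog. The sample markup and the class names "topictext" and "intro" are hypothetical, invented only for illustration. Writer's regex search is based on ICU, which as far as I know accepts the same `(?!...)` syntax, but that is an assumption to verify in the dialog itself.

```python
import re

# Hypothetical sample: the class names "topictext" and "intro" are made up
# for illustration; they do not come from the real pages.
html = (
    '<p class="topictext">Lorem ipsum</p>\n'
    '<p class="intro" id="a1">Aliquam mi sapien</p>\n'
    '<p>Donec fringilla</p>\n'
    '<pre>not a paragraph</pre>\n'
)

# (?![^>]*class="topic) is a negative lookahead: it rejects the tag when the
# whole string class="topic appears anywhere before the closing ">".
# \b keeps <pre> and other tags that merely start with "p" from matching.
pattern = re.compile(r'<p\b(?![^>]*class="topic)([^>]*)>')

matches = [m.group(0) for m in pattern.finditer(html)]
print(matches)
```

The lookahead consumes no characters, so the capturing group still holds the tag's attributes. The excluded text is a prefix ("topic"), so every class that starts with it is skipped, which may suit classes that differ per page but share a pattern.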

I would appreciate any help. Kind regards,

gordom

