Rob Dixon wrote:
> Dave Cardwell wrote:
>> Hello there, I'm having trouble constructing a regular expression that
>> would do the following:
>> FOO...
>> ...followed by anything but BAR (non-greedy)...
>> ...followed by BAZ (captured)...
>> ...followed by anything but BAR (greedy)...
>> ...followed by BAR
>> I've been looking at zero-width negative look-ahead, but I haven't
>> used this area of regular expressions before so I'm struggling. A
>> solution or prod in the right direction would be lovely.
> Please show us the real problem. I know you mean to clarify, but your
> summary is so ambiguous that understanding it becomes the most difficult
> part of providing a solution.
> Thanks,
> Rob
I was afraid of that, sorry. I'm using HTML::Parser to scan through a
document, but I need to do one quick manipulation first that depends on
seeing the document as a whole (rather than token by token, as
HTML::Parser does).
Rather than attempting to fit all of the real work in a regular
expression, I thought it best to simply mark the element with a custom
attribute that HTML::Parser could pick up later.
To that end, I need to find an <a> (BAZ) that contains just plain text,
somewhere between an opening <td> (FOO) and the closest closing </td>
(BAR), i.e. something along the lines of:
s%
<td([^>]*>
{not </td>}*?
<a[^>]*>[\w\s]+</a>
{not </td>}*?
</td>)
%<td foo="1"$1%gismx;
It's the {not </td>} bits I'm having difficulty with.
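One possible way to write the {not </td>} bits is to apply a zero-width
negative look-ahead one character at a time: (?:(?!</td>).) matches any
single character that does not begin a closing </td>. What follows is
only a minimal sketch, assuming cells are not nested and that the
markup is regular enough for a regex to cope with. The names mark_cells
and $not_td_close are made up for illustration, and \b has been added
after <td and <a so that longer tag names are not matched by accident.

```perl
use strict;
use warnings;

# One character, provided it does not start a closing </td> tag.
# /s lets the dot match newlines; /i ignores the case of the tag name.
my $not_td_close = qr{(?:(?!</td>).)}is;

# Hypothetical helper: adds foo="1" to every <td> whose content includes
# an <a> element holding only word and whitespace characters.
sub mark_cells {
    my ($html) = @_;
    $html =~ s{
        <td\b                     # opening cell tag (FOO)
        (                         # capture everything after the tag name
            [^>]*>                # rest of the opening tag
            $not_td_close*?       # anything but the closing tag, non-greedy
            <a\b[^>]*>            # opening anchor tag
            [\w\s]+               # plain text only (BAZ)
            </a>
            $not_td_close*        # anything but the closing tag, greedy
            </td>                 # the closest closing cell tag (BAR)
        )
    }{<td foo="1"$1}gisx;
    return $html;
}

my $doc = '<td>no link</td><td class="x"> <a href="/y">Plain text</a> </td>';
print mark_cells($doc), "\n";
```

Because the look-ahead is tested before every character, neither run
can step over a </td>, so a match always stays inside a single cell.
Note that the replacement writes the tag name back in lower case, even
if the source had <TD>.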
--
Best wishes,
Dave Cardwell.
http://perlprogrammer.co.uk/