At 17:07 +0100 1/13/11, Marek Stepanek wrote:
>
>Thank you for your reply. Whether I did not understand your suggestion,
>or you did not understand my problem :-)
>
>The problem was iterating over many "naked" (with out any html-tags)
>text in a large text file and tag these found occurrences, and replace
>the line breaks with <br>. Meanwhile I found the solution, but I don't
>know, why this is working with a while-loop, and not with a
>foreach-loop. Since I am looking into Perl - this is now about 10 years
>- I have had always this comprehension problem between while and foreach
>constructs. Also if this is not on topic for this list, could somebody
>explain me this difference?
>
>When using the debugger with the following script and the following
>example, it is working. Used as a filter in BBEdit, the last paragraph
>is not tagged. Strange! That means, my filter is probably not right ...
>
>#!/usr/bin/perl
>
>use strict;
>use warnings;
>
>$/ = undef;
>$_ = <>;
>
>
>while ($_ =~ m,(</p>\s+[^<]+?<p),g) {
>       my $paragraf = $1;
>       my $orig  = $1;
>       $paragraf =~ s,(</p>)\n?,$1<p class="links_normal">,;
>       $paragraf =~ s,\n\n<p$,</p><p,;
>       $paragraf =~ s,\n,<br>\n,g;
>       $paragraf =~ s!\s{2,}!!g;
>       $paragraf =~ s!><!>\n<!g;
>       s/$orig/$paragraf/;
>}
>

Aha.  Now I understand why you need the loop.

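Two things worry me.  In the while loop you use the g flag, so Perl has to 
remember where it is in the $_ string that is the document (that position is 
what pos() reports).  When you do the substitution you are modifying the very 
string the match is walking through, and modifying a string resets that 
position, so the next pass starts searching from the top of the document 
again.  Perhaps matching against a copy of what came in as $_, and doing the 
substitution on $_ itself, would make a difference because of that.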
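
Here is a small sketch of that bookkeeping, not your filter itself.  The 
strings and variable names are made up for illustration.  It shows pos() 
being thrown away by a substitution, and then a loop that matches against an 
untouched copy while editing a second string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "aXbXcX";

# A scalar-context m//g match records where it stopped in pos($text).
$text =~ /X/g;
my $pos_after_match = pos($text);    # offset just past the first X

# Modifying the string discards that position.
$text =~ s/b/B/;
my $pos_after_edit = pos($text);     # undef: the next m//g starts at offset 0

print "after match: $pos_after_match\n";
print "after edit:  ", (defined $pos_after_edit ? $pos_after_edit : "undef"), "\n";

# Matching against a copy that is never modified keeps the position intact.
my $source = "aXbXcX";
my $result = $source;
my $passes = 0;
while ($source =~ /X/g) {            # position lives in $source
    $result =~ s/X/Y/;               # edits go to $result only
    $passes++;
}
print "$passes passes, result: $result\n";
```

In your filter the same idea would mean running the while match on a copy of 
$_ and keeping the final s/// on $_.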

The other was whether the matches need the s option to get across the line 
ends.  On reflection I don't think they do: s only changes whether . matches 
a newline, and your pattern uses \s and [^<], both of which match newlines 
anyway.  Undefining $/ has nothing to do with it either; that just makes <> 
read the whole file in one go.
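
A quick way to settle the s question is to try it on a small sample.  The 
sample text below is invented; the first pattern is the one from your script, 
the other two use . only to show what the s option actually changes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Invented sample: naked text spanning several lines between </p> and <p>.
my $text = "</p>\n\nnaked text\nmore text\n\n<p>";

# \s and [^<] match newlines whether or not /s is given.
my $class_no_s = $text =~ m{</p>\s+[^<]+?<p}  ? 1 : 0;

# The dot is what /s affects: without it, . refuses to match a newline.
my $dot_no_s   = $text =~ m{</p>.+<p}         ? 1 : 0;
my $dot_with_s = $text =~ m{</p>.+<p}s        ? 1 : 0;

print "character classes, no /s: $class_no_s\n";
print "dot, no /s:               $dot_no_s\n";
print "dot, with /s:             $dot_with_s\n";
```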

The substitutions inside the loop also overwrite $1, but you copy it into 
$paragraf and $orig first, so I don't think that's germane.

It's also quite possible that the text your filter gets from BBEdit differs 
slightly from what the script reads in a shell.  I'm pretty sure BBEdit 
recreates it from the 16-bit characters in its memory image of the file, so 
by the time it is delivered to the filter there may well be subtle 
differences, line ends for instance.
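
If line ends do turn out to be the difference (that is a guess, I have not 
checked what BBEdit hands over), one way to rule them out is to normalize 
them before matching.  normalize_line_endings is a name I made up for this 
sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn CRLF and lone CR line ends into LF so that
# patterns written with \n see the same text wherever it came from.
sub normalize_line_endings {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;
    return $text;
}

my $from_cr   = normalize_line_endings("one\rtwo\r");
my $from_crlf = normalize_line_endings("one\r\ntwo\r\n");
my $from_lf   = normalize_line_endings("one\ntwo\n");

print "all three agree\n"
    if $from_cr eq $from_crlf and $from_crlf eq $from_lf;
```

In the filter this would be called on $_ right after the slurp, before the 
while loop.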

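foreach requires a list as its argument, and that list is built once, before 
the first pass through the loop.  while takes anything that returns a true or 
false value and evaluates it again before every pass.  With a g match that 
matters: handed to foreach, the match runs once in list context and returns 
every match up front, while as a while condition it runs in scalar context 
and finds one match per pass.  There is also the "for (start; stop; 
increment)" form of loop.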
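
A small sketch of the difference, using an invented string rather than your 
document.  The last loop is only a guess at what a foreach version might have 
looked like, since you did not post it: if the body still reads $1, it sees 
the capture from the final match on every pass.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "a1 b2 c3";

# foreach: the match runs once, in list context, and every hit is
# collected before the body runs for the first time.
my @from_foreach;
foreach my $hit ($text =~ /(\w\d)/g) {
    push @from_foreach, $hit;
}

# while: the match runs in scalar context and is evaluated again before
# each pass, so $1 steps through the hits one at a time.
my @from_while;
while ($text =~ /(\w\d)/g) {
    push @from_while, $1;
}

# foreach again, but reading $1 in the body: all matching is already
# finished, so $1 stays at the last successful match.
my @dollar_one;
foreach my $hit ($text =~ /(\w\d)/g) {
    push @dollar_one, $1;
}

print "foreach:       @from_foreach\n";
print "while:         @from_while\n";
print "\$1 in foreach: @dollar_one\n";
```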

perl == magic

-- 
--> If  it's not  on  fire  it's  a  software  problem. <--
