Re: Classic text problem: matching consecutive newlines

Jeff 'japhy' Pinyan Mon, 29 Jul 2002 08:33:15 -0700

On Jul 29, KEVIN ZEMBOWER said:

>I'm facing what I believe to be one of the classic text manipulation
>problems, transforming a document which was typed with a hard return at
>the end of every physical line, and two consecutive newlines to mark the
>end of a paragraph.
>
>Would anyone help me write a program which would transform these
>documents? I'm trying to find all instances of a single newline, and
>remove it, either inserting or removing space characters around where it
>was to leave just one space between what were the two lines. I also need
>to substitute a single newline for two or more consecutive newlines,
>whether or not they're separated by whitespace characters.


It's not too difficult.  I'm using [^\S\n] to match whitespace EXCEPT
newline.

  $text =~ s{
    (             # capture to $1
      [^\S\n]*    # any non-newline whitespace
      (?:         # this chunk...
        \n          # a newline
        [^\S\n]*    # any non-newline whitespace
      )+          # one or more times
    )             # end $1
  }{
    my $ws = $1;
    if (($ws =~ tr/\n//) == 1) { " " }
    else { "\n" }
  }gex;

The code to replace the whitespace says "if there's only ONE newline,
replace the whitespace with a space; otherwise, replace it with a
newline."
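
For anyone who wants to try this on a small sample, here is a minimal sketch that wraps the same substitution in a sub and runs it on a short string. The name unwrap_paragraphs and the sample text are made up for illustration and are not part of the original reply.

```perl
use strict;
use warnings;

# unwrap_paragraphs is a hypothetical helper name used only for this sketch.
sub unwrap_paragraphs {
    my ($text) = @_;
    $text =~ s{
        (             # capture the whole run of whitespace
          [^\S\n]*    # any non-newline whitespace
          (?:         # this chunk...
            \n        # a newline
            [^\S\n]*  # any non-newline whitespace
          )+          # one or more times
        )             # end of capture
    }{
        my $ws = $1;                         # copy the run so tr can count in it
        if (($ws =~ tr/\n//) == 1) { " " }   # one newline: a wrapped line
        else { "\n" }                        # several newlines: a paragraph break
    }gex;
    return $text;
}

# Made-up sample: a wrapped line with trailing spaces, a paragraph break
# padded with a space and a tab, then another wrapped line.
my $input  = "first line  \nsecond line\n \t\n\nnext paragraph\nwraps here";
my $output = unwrap_paragraphs($input);
print $output, "\n";
```

One thing to watch: a single newline at the very end of the text counts as a "single newline" too, so it comes out as a trailing space. If that matters, chomp the text first or trim the result.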

-- 
Jeff "japhy" Pinyan      [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
** Look for "Regular Expressions in Perl" published by Manning, in 2002 **
<stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.
[  I'm looking for programming work.  If you like my work, let me know.  ]

