Sorry, my fault, that's what I get for responding that early in the morning.
I was thinking of s/// the entire time.

-----Original Message-----
From: Jeff 'japhy' Pinyan [mailto:[EMAIL PROTECTED]]
Sent: Thursday, September 26, 2002 7:19 AM
To: Timothy Johnson
Cc: 'Tim Booher'; [EMAIL PROTECTED]
Subject: RE: use perl to trim out non text characters from a file


On Sep 26, Timothy Johnson said:

>while(<INFILE>){
>  $_ = tr/[^characterclass]//g;
>  print OUTFILE $_;
>}
>
>putting a ^ at the beginning of a character class matches if the
>character is NOT one of those in the brackets.

That's not at all how tr/// works.  tr/// ALREADY is a character class
operator, and unlike a regex character class, you can't invert it with a
leading ^.  Instead, you must use the /c modifier.

tr/// is also automatically global, and it has no /g modifier.

tr/// does not automatically delete characters; you must use the /d
modifier for that.

Finally, tr/// works on $_ by default, and returns the number of
characters it matched (translated or deleted), not the new string.

  while (<INFILE>) {
    tr/a-zA-Z0-9//cd;      # remove all non-letters and non-numbers
    print OUTFILE "$_\n";  # remember that newline!
  }
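
To make the points above concrete, here is a minimal sketch of what tr/// returns with and without the /c and /d modifiers. The sample string and the variable names are made up for illustration; they are not from the original question.

```perl
use strict;
use warnings;

my $line = "Hello, World! 123\n";   # made-up sample input

# Empty replacement list and no /d: nothing is changed, tr/// only
# counts the characters that are in the search list.
my $alnum = ($line =~ tr/a-zA-Z0-9//);

# /c complements the search list; without /d this still only counts.
my $other = ($line =~ tr/a-zA-Z0-9//c);

# /c plus /d deletes everything outside the list. $line is modified
# in place and the return value is the number of characters deleted.
my $deleted = ($line =~ tr/a-zA-Z0-9//cd);

print "alnum=$alnum other=$other deleted=$deleted line=$line\n";
```

Assigning the result back, as in $_ = tr/...//, would therefore overwrite the line with a count rather than the cleaned-up text.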

>Otherwise I think there is a predefined [:printable] or something along
>those lines if you just want to get rid of non-printable characters.

Yes, it's spelled [:print:], but POSIX classes like that are only
supported inside regex character classes, not in tr///.

  s/[[:^print:]]+//g;

or

  s/[^[:print:]]+//g;
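
As a rough check that the two spellings behave the same, here is a small sketch run against a made-up string containing control characters. The third variant is an assumption on my part: it spells out the printable range by hand for tr///, which should only be relied on for plain ASCII input.

```perl
use strict;
use warnings;

my $raw = "abc\x00def\x07\tghi\n";   # made-up sample with control characters

# Negated POSIX class, written inside the class name.
(my $first = $raw) =~ s/[[:^print:]]+//g;

# Same idea, negating the whole bracketed class instead.
(my $second = $raw) =~ s/[^[:print:]]+//g;

# Assumption: ASCII-only data, where printable means space through tilde.
(my $third = $raw) =~ tr/\x20-\x7E//cd;

print "$first\n$second\n$third\n";
```

Note that tab and newline are not printable characters either, so all three strip the trailing newline; put it back when printing, as in the loop above.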

-- 
Jeff "japhy" Pinyan      [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
** Look for "Regular Expressions in Perl" published by Manning, in 2002 **
<stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.
[  I'm looking for programming work.  If you like my work, let me know.  ]
