Re: t character in regular expression

Andy Bach Thu, 25 Oct 2012 14:24:44 -0700

On Thu, Oct 25, 2012 at 3:57 PM, Weidner, Ron <rweid...@idexcorp.com> wrote:
> In the following regex what is the "t" character doing?
> $linebuf =~ tr/\n/:/;


tr/// isn't a regex - it's the "translate", er, transliteration
operator, so there's no lone "t" in there.  Same as "y///"; the name
comes from the unix util ("tr").  It takes each character on the left
hand side ("\n" here) and turns it into the matching character on the
right hand side (":"), and returns the count of characters
transliterated.  It looks much like s///, but it takes character
lists rather than a pattern and doesn't interpolate variables (though
it does allow backslash escapes like "\n") - see perldoc perlop for
all the details:

               Transliterates all occurrences of the characters found in the
               search list with the corresponding character in the replacement
               list.  It returns the number of characters replaced or deleted.
               If no string is specified via the =~ or !~ operator, the $_
               string is transliterated.  (The string specified with =~ must
               be a scalar variable, an array element, a hash element, or an
               assignment to one of those, i.e., an lvalue.)
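
To tie that back to the original question, here is a small sketch of
the same statement run on some made-up data (the contents of $linebuf
and the names $count and $dashes are my own, just for illustration):

```perl
use strict;
use warnings;

# Made-up sample data; the original post does not show what $linebuf holds.
my $linebuf = "one\ntwo\nthree\n";

# Replace every newline with a colon; tr returns how many it changed.
my $count = ($linebuf =~ tr/\n/:/);

print "$linebuf\n";    # one:two:three:
print "$count\n";      # 3

# With no =~ operator, tr works on $_.
$_ = "a-b-c";
my $dashes = tr/-/_/;
print "$_ $dashes\n";  # a_b_c 2
```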

               A character range may be specified with a hyphen, so
               "tr/A-J/0-9/" does the same replacement as
               "tr/ACEGIBDFHJ/0246813579/".  For sed devotees, "y" is provided
               as a synonym for "tr".  If the SEARCHLIST is delimited by
               bracketing quotes, the REPLACEMENTLIST has its own pair of
               quotes, which may or may not be bracketing quotes, e.g.,
               "tr[A-Z][a-z]" or "tr(+\-*/)/ABCD/".

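A quick sketch of the range and delimiter forms from that paragraph,
if you want to try them.  The variable names and sample strings are
mine, not from perlop:

```perl
use strict;
use warnings;

my $letters = "ABCDEFGHIJ";

# A range and the equivalent spelled-out lists give the same result.
(my $by_range = $letters) =~ tr/A-J/0-9/;
(my $by_list  = $letters) =~ tr/ACEGIBDFHJ/0246813579/;
print "$by_range $by_list\n";   # 0123456789 0123456789

# y is a synonym for tr.
(my $by_y = $letters) =~ y/A-J/0-9/;
print "$by_y\n";                # 0123456789

# Bracketing delimiters: each list gets its own pair.
(my $lower = "HELLO") =~ tr[A-Z][a-z];
print "$lower\n";               # hello

# Mixed delimiters: parentheses for the search list, slashes for the rest.
(my $ops = "1+2-3*4/5") =~ tr(+\-*/)/ABCD/;
print "$ops\n";                 # 1A2B3C4D5
```
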
               Note that "tr" does not do regular expression character classes
               such as "\d" or "[:lower:]".  The "tr" operator is not
               equivalent to the tr(1) utility.  If you want to map strings
               between lower/upper cases, see "lc" in perlfunc and "uc" in
               perlfunc, and in general consider using the "s" operator if you
               need regular expressions.
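
In practice that advice might look something like the sketch below:
lc/uc for case mapping, s///g when you want a class like \d, and a
spelled-out range when you stay with tr.  The sample strings and
variable names are invented for the example:

```perl
use strict;
use warnings;

my $name = "Perl Beginners";

# Case mapping: use lc and uc rather than tr.
my $lower = lc $name;
my $upper = uc $name;
print "$lower / $upper\n";   # perl beginners / PERL BEGINNERS

# A regex character class needs s///, not tr.
(my $masked_s = "room 101") =~ s/\d/#/g;
print "$masked_s\n";         # room ###

# With tr you spell the digits out as a range instead of \d.
(my $masked_tr = "room 101") =~ tr/0-9/#/;
print "$masked_tr\n";        # room ###
```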

               Note also that the whole range idea is rather unportable
               between character sets--and even within character sets they may
               cause results you probably didn’t expect.  A sound principle is
               to use only ranges that begin from and end at either alphabets
               of equal case (a-e, A-E), or digits (0-4).  Anything else is
               unsafe.  If in doubt, spell out the character sets in full.

               Options:

                   c   Complement the SEARCHLIST.
                   d   Delete found but unreplaced characters.
                   s   Squash duplicate replaced characters.

               If the "/c" modifier is specified, the SEARCHLIST character set
               is complemented.  If the "/d" modifier is specified, any
               characters specified by SEARCHLIST not found in REPLACEMENTLIST
               are deleted.  (Note that this is slightly more flexible than
               the behavior of some tr programs, which delete anything they
               find in the SEARCHLIST, period.) If the "/s" modifier is
               specified, sequences of characters that were transliterated to
               the same character are squashed down to a single instance of
               the character.
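
One small example per modifier, as a sketch.  The input strings are
made up, apart from "bookkeeper", which is borrowed from perlop's own
examples:

```perl
use strict;
use warnings;

# /c: complement the search list, so everything that is NOT a-z is replaced.
(my $complement = "hello, world!") =~ tr/a-z/_/c;
print "$complement\n";   # hello__world_

# /d: search-list characters with no partner in the replacement are deleted.
# Here a maps to x, while b and c have no partner and disappear.
(my $deleted = "aabbcc") =~ tr/abc/x/d;
print "$deleted\n";      # xx

# /s: runs of characters transliterated to the same character collapse to one.
(my $squashed = "bookkeeper") =~ tr/a-zA-Z//s;
print "$squashed\n";     # bokeper
```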

               If the "/d" modifier is used, the REPLACEMENTLIST is always
               interpreted exactly as specified.  Otherwise, if the
               REPLACEMENTLIST is shorter than the SEARCHLIST, the final
               character is replicated till it is long enough.  If the
               REPLACEMENTLIST is empty, the SEARCHLIST is replicated.  This
               latter is useful for counting characters in a class or for
               squashing character sequences in a class.
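
A short sketch of the two padding rules just described, with sample
strings and variable names of my own choosing:

```perl
use strict;
use warnings;

# Replacement list shorter than the search list: the final character (y)
# is replicated, so this behaves like tr/abc/xyy/.
(my $padded = "aabbcc") =~ tr/abc/xy/;
print "$padded\n";          # xxyyyy

# Empty replacement list: the search list is replicated, so nothing changes
# in the string, but the return value still counts the matching characters.
my $str    = "a1b22c333";
my $digits = ($str =~ tr/0-9//);
print "$str has $digits digits\n";   # a1b22c333 has 6 digits
```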

               Examples:

                   $ARGV[1] =~ tr/A-Z/a-z/;    # canonicalize to lower case

                   $cnt = tr/*/*/;             # count the stars in $_

                   $cnt = $sky =~ tr/*/*/;     # count the stars in $sky

                   $cnt = tr/0-9//;            # count the digits in $_

                   tr/a-zA-Z//s;               # bookkeeper -> bokeper

                   ($HOST = $host) =~ tr/a-z/A-Z/;

                   tr/a-zA-Z/ /cs;             # change non-alphas to single space

                   tr [\200-\377]
                      [\000-\177];             # delete 8th bit

               If multiple transliterations are given for a character, only
               the first one is used:

                   tr/AAA/XYZ/

               will transliterate any A to X.
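
Easy enough to check for yourself; a tiny sketch with a made-up input
string:

```perl
use strict;
use warnings;

# A appears three times in the search list; only the first mapping (A to X)
# takes effect, so Y and Z are never used.
(my $fruit = "BANANA") =~ tr/AAA/XYZ/;
print "$fruit\n";   # BXNXNX
```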

               Because the transliteration table is built at compile time,
               neither the SEARCHLIST nor the REPLACEMENTLIST are subjected to
               double quote interpolation.  That means that if you want to use
               variables, you must use an eval():

                   eval "tr/$oldlist/$newlist/";
                   die $@ if $@;

                   eval "tr/$oldlist/$newlist/, 1" or die $@;
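
A runnable version of that eval idea, as a sketch only.  The variable
names and values ($text, "a-c", "A-C") are my own.  Note that the
lists are pasted straight into Perl source, so this assumes they come
from a trusted place and contain no "/" characters:

```perl
use strict;
use warnings;

# Hypothetical lists held in variables (assumed trusted, no "/" inside).
my ($oldlist, $newlist) = ("a-c", "A-C");
my $text = "abcabc xyz";

# The double-quoted string interpolates the lists first; \$text is escaped
# so the variable itself, not its contents, appears in the evaluated code.
# The code that actually runs is:  $text =~ tr/a-c/A-C/
my $count = eval "\$text =~ tr/$oldlist/$newlist/";
die $@ if $@;

print "$text\n";    # ABCABC xyz
print "$count\n";   # 6
```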


-- 

a

Andy Bach,
afb...@gmail.com
608 658-1890 cell
608 261-5738 wk

