>>>>> "jp" == j proctor <[EMAIL PROTECTED]> writes:

  jp> At the risk of prolonging this, I believe the idea is that we
  jp> commonly think of the regex metacharacters as sets, and we think
  jp> of the arguments to tr as ordered sets.  And as long as we're not
  jp> operating under the rules of quantum computing, the set of
  jp> characters represented by a regex metacharacter *must* be ordered
  jp> at some point, because we can't guarantee that we test whether
  jp> something is a space and a tab simultaneously.  So given an
  jp> extremely useful collection of (ordered) sets with easy-to-type
  jp> names (i.e. the regex metacharacters), and an extremely rapid way
  jp> of transliterating members of one ordered set to members of
  jp> another, why shouldn't Perl allow us to use those things together?
  jp> If I want to change an input string by making all of its non-word
  jp> characters into underscores, I can currently use
  jp> "s/[^a-zA-Z0-9_]/_/g;" or "tr/a-zA-Z0-9_/_/c;".  Personally, I'd
  jp> be more likely to write "s/\W/_/g;", but only because I can't use
  jp> "tr/\W/_/;" or even "tr/\w/_/c;".

first off, regex char classes do work in parallel: the regex engine
builds a simple 256-entry table and marks the entry for each char in the
set. checking whether a char is in that set is then a fixed-time lookup,
no matter how many chars the set holds. the char it looks up can be
different each time due to quantifiers, backtracking, etc. the thing you
replace the match with (if any) has no bearing on which chars are in
that char class.
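
A minimal sketch of that lookup idea, assuming a plain 256-entry array
with one slot per byte value. The helper names make_class_table and
in_class are made up for illustration; this is not how the regex engine
is actually coded, only the shape of the technique.

```perl
use strict;
use warnings;

# hypothetical helper: build a 256-entry membership table for a set of chars
sub make_class_table {
    my (@chars) = @_;
    my @table = (0) x 256;              # every byte value starts as "not in set"
    $table[ ord $_ ] = 1 for @chars;    # mark each member of the set
    return \@table;
}

# hypothetical helper: one array index per test, whatever the set size
sub in_class {
    my ($table, $char) = @_;
    return $table->[ ord $char ];
}

# the whitespace chars here are an assumed stand-in for \s
my $space_table = make_class_table(" ", "\t", "\n", "\r", "\f");
my @hits = map { in_class($space_table, $_) } ("\t", "a", " ");
print "@hits\n";    # prints "1 0 1"
```

Note that the order in which the chars are handed to make_class_table
makes no difference to the finished table.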

translate tables are different beasts. they are also indexed by the
incoming chars, but the value in each entry is the char to translate the
input char to. tr/// is a purely char-based sequential conversion of its
input string. using a char class makes little sense unless you are
converting (or deleting) all of those chars to the same one. the order
of the set matters, as that is how the input list is mapped to the
output list. char classes are not ordered in regexes, as they are just
used there as a boolean 'in the class' test.
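
The translate table described above can be sketched like this. The
helpers make_tr_table and apply_tr_table are hypothetical names, and the
sketch assumes a non-empty replacement list, single-byte chars, and no
ranges, duplicates, or tr/// flags. It only shows why position in the
list matters.

```perl
use strict;
use warnings;

# hypothetical helper: pair the Nth char of $from with the Nth char of $to
sub make_tr_table {
    my ($from, $to) = @_;
    my @table = map { chr($_) } 0 .. 255;   # identity: each char maps to itself
    my @from  = split //, $from;
    my @to    = split //, $to;
    for my $i (0 .. $#from) {
        # like tr///, reuse the last replacement char if $to is shorter
        my $j = $i < @to ? $i : $#to;
        $table[ ord $from[$i] ] = $to[$j];
    }
    return \@table;
}

# hypothetical helper: one table lookup per input char, left to right
sub apply_tr_table {
    my ($table, $string) = @_;
    return join '', map { $table->[ ord $_ ] } split //, $string;
}

my $table      = make_tr_table('abc', 'xyz');
my $translated = apply_tr_table($table, 'aabbcc cab');
print "$translated\n";    # prints "xxyyzz zxy"
```

The table entry for each input char depends on where that char sits in
the search list, which is exactly the information an unordered char
class does not carry.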

so tr/\s/ /s makes some sense, as all whitespace would be converted to
spaces and squeezed. similar cases like tr/\d//cd, which would delete
all non-digits, make sense as well. but most other uses of char classes
in tr/// are nonsensical.
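
For reference, both of those sensible cases can already be written in
today's tr/// by spelling the sets out. This sketch assumes \s means
space, tab, newline, carriage return and form feed, and \d means the
ASCII digits 0-9; the sample strings are made up.

```perl
use strict;
use warnings;

my $text = "one \t two\n\n three";
(my $squeezed = $text) =~ tr/ \t\n\r\f/ /s;   # what tr/\s/ /s would mean
print "$squeezed\n";                          # prints "one two three"

my $phone = "(555) 123-4567";
(my $digits = $phone) =~ tr/0-9//cd;          # what tr/\d//cd would mean
print "$digits\n";                            # prints "5551234567"
```

In both cases every char in the set gets the same treatment, so the
order of the set never comes into play.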

try explaining what this would do:

        tr/\s/ \n/ ;

what is the order of the whitespace chars in \s?

remember in tr/// the order of the chars matters while in a char class
it doesn't.
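
A small sketch of that difference, using a made-up input string:
swapping the order of the search list changes what tr/// produces, while
swapping the order inside a char class changes nothing about what it
matches.

```perl
use strict;
use warnings;

my $input = "abba";

(my $first  = $input) =~ tr/ab/xy/;   # a -> x, b -> y
(my $second = $input) =~ tr/ba/xy/;   # b -> x, a -> y
print "$first $second\n";             # prints "xyyx yxxy"

# count matches by assigning the match list to an empty list
my $count_ab = () = $input =~ /[ab]/g;
my $count_ba = () = $input =~ /[ba]/g;
print "$count_ab $count_ba\n";        # prints "4 4"
```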

uri

-- 
Uri Guttman  ---------  [EMAIL PROTECTED]  ----------  http://www.sysarch.com
SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting
The Perl Books Page  -----------  http://www.sysarch.com/cgi-bin/perl_books
The Best Search Engine on the Net  ----------  http://www.northernlight.com
