On Wed, 10 Aug 2005 23:56:31 -0700 (PDT), rajarshi das <[EMAIL PROTECTED]> wrote
> Hi,
> This is Rajarshi expressing Sastry's viewpoints since he's on vacation.
>
> SADAHIRO Tomoyuki <[EMAIL PROTECTED]> wrote:
>
>> According to the above statement in perlebcdic.pod,
>> s/[\x89-\x91]/X/g must substitute \x8e with X.
>> But it doesn't concern whether tr/\x89-\x91/X/ would substitute \x8e
>> with X or not, since tr/// does not use brackets, [ ].
>
>> Though I think ranges in [ ] and ranges in tr/// should coincide
>> and agree that tr/\x89-\x91/X/ should substitute \x8e with X,
>> that is just my opinion.
>> I don't know whether it is true and correct.

> Is there some way we can confirm if this is correct (and expected behaviour)
> since there isn't any explicit documentation for the tr operator ?

Since t/op/tr.t already has a test case (cf. Change 9038), which
Sastry previously pointed out is failing on EBCDIC platforms,
I assume that at least the then pumpking thought it to be correct.

>> By the way, when you say "If I specify [\x89-\x91]", does it
>> mean s/[\x89-\x91]/X/g or tr/\x89-\x91/X/ ? I'm confused.

> We mean tr/\x89-\x91/X/.
>
>
>> We were first informed by you that gapped characters are not
>> substituted with X by tr/\x89-\x91/X/.
>> And you said s/[\x89-\x91]/X/g substituted all the characters
>> including gapped characters with X, didn't you?
>
> Yes.

>> If so, I assume your [\x89-\x91] which doesn't match any of
>> the gapped characters to be tr/\x89-\x91/X/.

> That's correct. We mean tr/\x89-\x91/X/.
>
>
>> The following is a part of the current core tests from op/pat.t.
>> I believe they should be passed.

> Yes all the following tests pass. I think the following tests are in the
> context of the s/[]/X/ operator and hence pass.
>
> Thanks,
>
> Rajarshi.

OK. To me, it is confirmed that s/[]/X/ is fine and tr/// has a problem.

Since I don't have any EBCDIC machine, I can't be sure that
the following patch really makes sense.

Regards,
SADAHIRO Tomoyuki

!
t/op/tr.t, toke.c

diff -ur perl~/t/op/tr.t perl/t/op/tr.t
--- perl~/t/op/tr.t Mon Aug 01 17:17:24 2005
+++ perl/t/op/tr.t Thu Aug 11 23:41:22 2005
@@ -295,18 +295,15 @@
 # (i-j, r-s, I-J, R-S), [\x89-\x91] [\xc9-\xd1] has to match them,
 # from Karsten Sperling.
 
-# Not working in EBCDIC as of 12674.
 $c = ($a = "\x89\x8a\x8b\x8c\x8d\x8f\x90\x91") =~ tr/\x89-\x91/X/;
 is($c, 8);
 is($a, "XXXXXXXX");
-
-# Not working in EBCDIC as of 12674.
+
 $c = ($a = "\xc9\xca\xcb\xcc\xcd\xcf\xd0\xd1") =~ tr/\xc9-\xd1/X/;
 is($c, 8);
 is($a, "XXXXXXXX");
-
 
-SKIP: {
+SKIP: {
     skip "not EBCDIC", 4 unless $Is_EBCDIC;
 
     $c = ($a = "\x89\x8a\x8b\x8c\x8d\x8f\x90\x91") =~ tr/i-j/X/;
diff -ur perl~/toke.c perl/toke.c
--- perl~/toke.c Mon Jul 18 04:31:02 2005
+++ perl/toke.c Thu Aug 11 22:55:18 2005
@@ -1368,6 +1368,9 @@
     I32 has_utf8 = FALSE;       /* Output constant is UTF8 */
     I32 this_utf8 = UTF;        /* The source string is assumed to be UTF8 */
     UV uv;
+#ifdef EBCDIC
+    UV literal_endpoint = 0;
+#endif
 
     const char *leaveit =       /* set of acceptably-backslashed characters */
         PL_lex_inpat
@@ -1417,8 +1420,9 @@
                 }
 
 #ifdef EBCDIC
-                if ((isLOWER(min) && isLOWER(max)) ||
-                    (isUPPER(min) && isUPPER(max))) {
+                if (literal_endpoint == 2 &&
+                    ((isLOWER(min) && isLOWER(max)) ||
+                     (isUPPER(min) && isUPPER(max)))) {
                     if (isLOWER(min)) {
                         for (i = min; i <= max; i++)
                             if (isLOWER(i))
@@ -1437,6 +1441,9 @@
                 /* mark the range as done, and continue */
                 dorange = FALSE;
                 didrange = TRUE;
+#ifdef EBCDIC
+                literal_endpoint = 0;
+#endif
                 continue;
             }
 
@@ -1455,6 +1462,9 @@
             }
             else {
                 didrange = FALSE;
+#ifdef EBCDIC
+                literal_endpoint = 0;
+#endif
             }
         }
 
@@ -1788,6 +1798,10 @@
             s++;
             continue;
         } /* end if (backslash) */
+#ifdef EBCDIC
+        else
+            literal_endpoint++;
+#endif
 
     default_action:
         /* If we started with encoded form, or already know we want it
###END OF PATCH
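
For readers without an EBCDIC machine, the intent of the toke.c change can be sketched in standalone C. This is only an illustration of the idea as I read the patch, not code from perl: the names ebcdic_is_lower, ebcdic_is_upper and expand_range are hypothetical, the letter positions are hard-coded from the usual EBCDIC layout (code pages 037 and 1047) so that it also runs on an ASCII box, and the literal_endpoints argument stands in for the patch's "literal_endpoint == 2" test.

```c
#include <stddef.h>

/* Hypothetical stand-ins for isLOWER/isUPPER as they would behave in an
 * EBCDIC build. Letter positions are those of code pages 037/1047. */
static int ebcdic_is_lower(unsigned c)
{
    return (c >= 0x81 && c <= 0x89)     /* a-i */
        || (c >= 0x91 && c <= 0x99)     /* j-r */
        || (c >= 0xA2 && c <= 0xA9);    /* s-z */
}

static int ebcdic_is_upper(unsigned c)
{
    return (c >= 0xC1 && c <= 0xC9)     /* A-I */
        || (c >= 0xD1 && c <= 0xD9)     /* J-R */
        || (c >= 0xE2 && c <= 0xE9);    /* S-Z */
}

/* Expand the range min..max (both <= 0xFF) into out[] and return the
 * number of code points written; out needs room for max - min + 1 bytes.
 * literal_endpoints is nonzero when both ends were written as literal
 * characters (tr/i-j/), zero when written as escapes (tr/\x89-\x91/). */
static size_t expand_range(unsigned min, unsigned max,
                           int literal_endpoints, unsigned char *out)
{
    size_t n = 0;
    unsigned i;
    int lower = ebcdic_is_lower(min) && ebcdic_is_lower(max);
    int upper = ebcdic_is_upper(min) && ebcdic_is_upper(max);

    for (i = min; i <= max; i++) {
        /* Skip the gap characters only for literal letter ranges. */
        if (literal_endpoints && lower && !ebcdic_is_lower(i))
            continue;
        if (literal_endpoints && upper && !ebcdic_is_upper(i))
            continue;
        out[n++] = (unsigned char)i;
    }
    return n;
}
```

Under these assumptions, expand_range(0x89, 0x91, 1, buf) yields just the two letters i and j, while expand_range(0x89, 0x91, 0, buf) yields all nine code points including \x8e, which is the behaviour the tests in t/op/tr.t expect.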