Re: Regular Expression Quick Reference

2003-07-27 Thread Iain Truskett

Hmm. No comments from anyone here?

 http://dellah.org/perlreref.pod
 http://dellah.org/perlreref.html

Should I just send to p5p instead of here?


cheers,
-- 
Iain.


Re: Regular Expression Quick Reference

2003-07-27 Thread Sean M. Burke
At 06:53 PM 2003-07-27 +1000, you wrote:

> Hmm. No comments from anyone here?
>
>  http://dellah.org/perlreref.pod
>  http://dellah.org/perlreref.html

It all looks wonderful!
Except I'm a bit torn over this:
   \x7f Any hexadecimal ASCII value
   \x{263a} A wide hexadecimal value
Isn't it the case that there's no longer any difference between \xNN
and \x{...}, except that the first can express only the values 0-0xFF?
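
To make the question concrete, here is a minimal sketch (the variable names are only illustrative, and it assumes nothing beyond core Perl). It checks that the two-digit and braced forms agree for a value below 0x100, and that only the braced form can name a wider code point:

```perl
use strict;
use warnings;

# \xNN consumes exactly two hex digits, so it tops out at 0xFF.
# \x{...} accepts any number of hex digits, so it covers that range and more.
my $del = chr 0x7F;
print "two-digit form matches DEL\n" if $del =~ /^\x7f$/;
print "braced form matches DEL\n"    if $del =~ /^\x{7f}$/;

# Only the braced form can name a code point above 0xFF.
my $smiley = chr 0x263A;
print "braced form matches U+263A\n" if $smiley =~ /^\x{263a}$/;

# Without braces only the first two hex digits are consumed, so
# /\x263a/ means chr(0x26), which is "&", followed by the literal text "3a".
print "unbraced \\x263a matches '&3a'\n" if "&3a" =~ /^\x263a$/;
```

So the quick reference could arguably describe \xNN as the short spelling for values up to 0xFF and \x{...} as the general one.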

> Should I just send to p5p instead of here?
Hm, maybe p5p is good for something.  I don't know.

Tiny fixes:

% diff perlreref.pod~orig perlreref.pod
--- perlreref.pod~orig  Sun Jul 27 06:11:16 2003
+++ perlreref.pod   Sun Jul 27 06:10:36 2003
@@ -16,7 +16,7 @@
 =item =~
 determines to which variable the regex is applied.
-In its absence C<$_> is used.
+In its absence, C<$_> is used.
 $var =~ /foo/;

@@ -25,7 +25,7 @@
 searches a string for a pattern match,
 applying the given options.
-i  case Insensitive
+i  case-Insensitive
 g  Global - all occurrences
 m  Multiline mode - ^ and $ match internal lines
 s  match as a Single line - . matches \n
@@ -135,12 +135,12 @@
    cntrl   IsCntrl                 Control characters
    digit   IsDigit     \d          Digits
    graph   IsGraph                 Alphanumeric and punctuation
-   lower   IsLower                 Lower case chars (locale aware)
+   lower   IsLower                 Lowercase chars (locale aware)
    print   IsPrint                 Alphanumeric, punct, and space
    punct   IsPunct                 Punctuation
    space   IsSpace     [\s\ck]     Whitespace
            IsSpacePerl \s          Perl's whitespace definition
-   upper   IsUpper                 Upper case chars (locale aware)
+   upper   IsUpper                 Uppercase chars (locale aware)
    word    IsWord      \w          Alphanumeric plus _ (Perl)
    xdigit  IsXDigit    [\dA-Fa-f]  Hexadecimal digit
@@ -218,10 +218,10 @@

 =head1 FUNCTIONS

-   lc          Lower case a string
-   lcfirst     Lower case first char of a string
-   uc          Upper case a string
-   ucfirst     Upper case first char of a string
+   lc          Lowercase a string
+   lcfirst     Lowercase first char of a string
+   uc          Uppercase a string
+   ucfirst     Uppercase first char of a string
    pos         Return or set current match position
    quotemeta   Quote meta characters
    reset       Reset ?pattern? status
--
Sean M. Burke    http://search.cpan.org/~sburke/


RE: Regular Expression Quick Reference

2003-07-27 Thread Hugh S. Myers
Looks quite good to me, although I might remove the '^' character from the
headings (unless I misunderstand its usage?)

--hsm

> -----Original Message-----
> From: Iain Truskett [mailto:[EMAIL PROTECTED]
> Sent: Sunday, July 27, 2003 2:54 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Regular Expression Quick Reference
>
> Hmm. No comments from anyone here?
>
>  http://dellah.org/perlreref.pod
>  http://dellah.org/perlreref.html
>
> Should I just send to p5p instead of here?
>
>
> cheers,
> --
> Iain.



Re: Regular Expression Quick Reference

2003-07-27 Thread Tom Christiansen
> +   lc          Lowercase a string
> +   lcfirst     Lowercase first char of a string
> +   uc          Uppercase a string
> +   ucfirst     Uppercase first char of a string

Not quite; the last one (for ucfirst or \u) should be Titlecase,
not Uppercase. The two are, of course, not always the same.

Consider the dz character at U+01F3:

% perl -e 'printf "U+%04X\n", ord         chr 0x1F3'
U+01F3
% perl -e 'printf "U+%04X\n", ord uc      chr 0x1F3'
U+01F1
% perl -e 'printf "U+%04X\n", ord ucfirst chr 0x1F3'
U+01F2
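
The same comparison can be written as a small script. This is only a sketch: the helper `codepoints` and the hash `%cased` are names made up for illustration. It also covers the \u and \U string escapes, which use the same mappings as ucfirst and uc respectively. Only code points are printed, so no Unicode-capable terminal is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: render a string as a list of U+XXXX code points.
sub codepoints {
    my ($str) = @_;
    return join ' ', map { sprintf 'U+%04X', ord($_) } split //, $str;
}

my $dz = chr 0x1F3;    # LATIN SMALL LETTER DZ

my %cased = (
    original => $dz,
    uc       => uc($dz),         # uppercase mapping
    ucfirst  => ucfirst($dz),    # titlecase mapping
    U_escape => "\U$dz",         # \U uses the uc mapping
    u_escape => "\u$dz",         # \u uses the ucfirst mapping
);

printf "%-9s %s\n", $_, codepoints($cased{$_}) for sort keys %cased;
```

For this character, uc and ucfirst land on different code points, which is why "Uppercase first char" undersells what ucfirst does.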

If you're (usefully) running something like

% xterm -n unicode -u8 -fn \
    -misc-fixed-medium-r-normal--20-200-75-75-c-100-iso10646-1

then under perl v5.8.1, provided that you've used -C6 or setenv
PERL_UNICODE to 6 or some similarly useful value, you can look
at actual characters on your screen instead of their numeric code points.

% perl -le 'print chr for 0x1f1 .. 0x1f3'
Ǳ
ǲ
ǳ

If that's too exotic, consider the ß character at U+00DF (more 
common in Germany than in Hungary, unlike dz):

% perl -le 'print         pack "U", 0xDF'
ß
% perl -le 'print uc      pack "U", 0xDF'
SS
% perl -le 'print ucfirst pack "U", 0xDF'
Ss

The funny pack is there to force the UTF8 flag when 128 <= codepoint < 256.
That way we get the correct casing rules loaded for what would otherwise
presumably appear to be in an 8-bit encoding, since for this peculiar
character, the POSIX and/or ctype.h charclass macros are of no use,
but the Unicode casing rules are.
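
A rough sketch of how one might inspect that flag directly, using the core functions utf8::is_utf8 and utf8::upgrade. The variable names are illustrative, and the `no feature 'unicode_strings'` line is an assumption added so that a newer perl keeps the legacy 8-bit behaviour described here; it needs a perl recent enough to know that feature.

```perl
use strict;
use warnings;
no feature 'unicode_strings';    # assumption: keep the legacy 8-bit semantics

my $bytes    = chr 0xDF;           # UTF8 flag off: looks like 8-bit data
my $chars    = pack 'U', 0xDF;     # same code point, UTF8 flag on
my $upgraded = chr 0xDF;
utf8::upgrade($upgraded);          # another way to switch the flag on

for my $pair (['chr', $bytes], ['pack U', $chars], ['upgrade', $upgraded]) {
    my ($label, $str) = @$pair;
    # With the flag on, the Unicode casing rules expand the character;
    # with it off, the string is left as it was.
    printf "%-8s flag=%-3s uc=%d char(s) ucfirst=%d char(s)\n",
        $label,
        (utf8::is_utf8($str) ? 'on' : 'off'),
        length(uc $str),
        length(ucfirst $str);
}
```

Both flagged strings hold the same single character as the unflagged one; only the internal flag differs, and with it the casing rules that get applied.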

% perl -le 'print         chr 0xDF'
ß
% perl -le 'print uc      chr 0xDF'
ß
% perl -le 'print ucfirst chr 0xDF'
ß

Alas.

--tom