Re: Portable upper/lower case regexp matches

2000-08-11 Thread David L. Nicol

Peter Scott wrote:

> > Perl 5.6.0 has [[:lower:]] and [[:upper:]].
>
> Yes, but this one is worth a digraph.  Question is, which one?  Currently
> the free ones are:
>
> \F  \h \H  \i \I  \j \J  \k \K  \m \M  \o \O  \q  \R  \T  \v \V  \y \Y
>
> \v \V are being debated on p5p currently.
>
> I suggest \i \I, mnemonic with ?:i and /i.

I agree.

I also think that a web page with a big table of unresolvables
and simple (and uncontrolled) vote tallies would be a good thing.
Something like


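Unresolvable                  Agree   Disagree   Dontcare
-----------------------------------------------------------------
All remaining simple            300        521         10
digraph letters should be      [  ]       [  ]       [  ]   [opine]
conserved
-----------------------------------------------------------------
Quebec should secede from        20         71        320
the United States              [  ]       [  ]       [  ]   [opine]
-----------------------------------------------------------------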


with a blank at the bottom for suggesting new unresolvables.  I bet that
at least a twentieth of the people who will read this could set this system
up in about the same amount of time that I spent writing this "me too."


-- 
  David Nicol 816.235.1187 [EMAIL PROTECTED]
:wq



Re: Portable upper/lower case regexp matches

2000-08-10 Thread Bart Lateur

On Thu, 10 Aug 2000 17:21:44 +0300, Jason Elbaum wrote:

> \x  match lowercase alpha char
> \X  match uppercase alpha char
>
> Thus /\X\x*/ would match all capitalized words, while /\X+/ would match
> acronyms, and /(\X\x+)+/ would match Java class names.

You've got my vote, apart from one tiny detail: \x is already in use. It
allows you to specify character codes in hex, just like "\012" is in
octal.

$_ = "a\tb";
print join '#', split /\x09/;
-->
a#b


Getting a feature like this accepted should be rather trivial.

-- 
Bart.



Re: Portable upper/lower case regexp matches

2000-08-10 Thread Jarkko Hietaniemi

On Thu, Aug 10, 2000 at 05:21:44PM +0300, Jason Elbaum wrote:
> As far as I know, there is a basic bit of regexp functionality which
> Perl should support but doesn't.
>
> Perl regexps support the following features, though they're a bit
> obscure to my tastes...
>
> (from perlre:)
> \l  lowercase next char (think vi)
> \u  uppercase next char (think vi)
> \L  lowercase till \E (think vi)
> \U  uppercase till \E (think vi)
> \E  end case modification (think vi)
>
> ...but Perl doesn't offer a regexp pattern to match all alphabetical
> characters of a particular case. Something like:
>
> \x  match lowercase alpha char
> \X  match uppercase alpha char
>
> Thus /\X\x*/ would match all capitalized words, while /\X+/ would match
> acronyms, and /(\X\x+)+/ would match Java class names.

Perl 5.6.0 has [[:lower:]] and [[:upper:]].
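
For illustration, a minimal sketch of the three quoted patterns rewritten
with the POSIX classes. The sub name classify and the sample words are
made up for this example, and the ^ and $ anchors are an addition so that
each pattern has to cover the whole word.

```perl
use strict;
use warnings;

# Hypothetical helper: report which of the three quoted patterns a word
# matches, with \X and \x replaced by the POSIX classes.
sub classify {
    my ($word) = @_;
    my @kinds;
    push @kinds, 'capitalized' if $word =~ /^[[:upper:]][[:lower:]]*$/;
    push @kinds, 'acronym'     if $word =~ /^[[:upper:]]+$/;
    push @kinds, 'class name'  if $word =~ /^(?:[[:upper:]][[:lower:]]+)+$/;
    return join ', ', @kinds;
}

# Print one line per sample word with the categories it falls into.
for my $word (qw(Perl NATO StringBuffer lower)) {
    printf "%-12s %s\n", $word, classify($word) || 'no match';
}
```

Note that a single capitalized word such as "Perl" satisfies both the
first and the third pattern, so the categories overlap.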

-- 
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen



Re: Portable upper/lower case regexp matches

2000-08-10 Thread Johan Vromans

Jason Elbaum [EMAIL PROTECTED] writes:

> Perl regexps support the following features, though they're a bit
> obscure to my tastes...
>
> (from perlre:)
> \l  lowercase next char (think vi)

Actually, this has little to do with regexes; it's a string issue.
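
A small sketch of that point: the case escapes do their work while a
double-quoted string is being built, with no pattern match anywhere.
The variable names and the sample value are invented for the example.

```perl
use strict;
use warnings;

my $name = 'jASON';

# The escapes are processed when the double-quoted string is built,
# so no regex is involved at all.
my $first_up   = "\u$name";      # uppercase the next character only
my $all_lower  = "\L$name\E!";   # lowercase everything up to \E
my $capitalize = "\u\L$name\E";  # lowercase everything, then uppercase the first character

print "$first_up $all_lower $capitalize\n";   # prints "JASON jason! Jason"
```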

> ...but Perl doesn't offer a regexp pattern to match all alphabetical
> characters of a particular case. Something like:
>
> \x  match lowercase alpha char
> \X  match uppercase alpha char

Well, take your pick:

[:lower:]   and [:upper:]   (POSIX, e.g., /[[:lower:]]+/)
\p{IsLower} and \p{IsUpper} (UNICODE, e.g., /\p{IsLower}+/)

It's all in 5.6. See PP3, pp 167 and up.
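
As a rough sketch of why these are more portable than [A-Z] and [a-z],
here is a capitalized Greek word tested three ways. The sample word and
variable names are made up; the code points are written as escapes so
the source stays ASCII, and the behaviour shown assumes a reasonably
recent Perl (Unicode handling in 5.6 itself was still in flux).

```perl
use strict;
use warnings;

# Greek capital alpha followed by lowercase lambda, phi, alpha,
# written as code point escapes so the source file stays ASCII.
my $word = "\x{391}\x{3BB}\x{3C6}\x{3B1}";

# Each test asks: one uppercase letter followed by one or more lowercase ones?
my $posix   = $word =~ /^[[:upper:]][[:lower:]]+$/ ? 1 : 0;
my $unicode = $word =~ /^\p{IsUpper}\p{IsLower}+$/ ? 1 : 0;
my $ascii   = $word =~ /^[A-Z][a-z]+$/             ? 1 : 0;

print "POSIX classes: $posix, Unicode properties: $unicode, [A-Z][a-z]: $ascii\n";
```

Both the POSIX classes and the Unicode properties accept the word, while
the hard-coded ASCII ranges reject it.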

-- Johan



Re: Portable upper/lower case regexp matches

2000-08-10 Thread Peter Scott

At 10:28 AM 8/10/00 -0500, Jarkko Hietaniemi wrote:
> On Thu, Aug 10, 2000 at 05:21:44PM +0300, Jason Elbaum wrote:
> > As far as I know, there is a basic bit of regexp functionality which
> > Perl should support but doesn't.
> >
> > Perl regexps support the following features, though they're a bit
> > obscure to my tastes...
> >
> > (from perlre:)
> > \l  lowercase next char (think vi)
> > \u  uppercase next char (think vi)
> > \L  lowercase till \E (think vi)
> > \U  uppercase till \E (think vi)
> > \E  end case modification (think vi)
> >
> > ...but Perl doesn't offer a regexp pattern to match all alphabetical
> > characters of a particular case. Something like:
> >
> > \x  match lowercase alpha char
> > \X  match uppercase alpha char
> >
> > Thus /\X\x*/ would match all capitalized words, while /\X+/ would match
> > acronyms, and /(\X\x+)+/ would match Java class names.
>
> Perl 5.6.0 has [[:lower:]] and [[:upper:]].

Yes, but this one is worth a digraph.  Question is, which one?  Currently 
the free ones are:

\F  \h \H  \i \I  \j \J  \k \K  \m \M  \o \O  \q  \R  \T  \v \V  \y \Y

\v \V are being debated on p5p currently.

I suggest \i \I, mnemonic with ?:i and /i.  I know it's a strange 
association once you think about it, but it made sense at first thought.

--
Peter Scott
Pacific Systems Design Technologies




Re: Portable upper/lower case regexp matches

2000-08-10 Thread Jarkko Hietaniemi

On Thu, Aug 10, 2000 at 08:55:27AM -0700, Peter Scott wrote:
> At 10:28 AM 8/10/00 -0500, Jarkko Hietaniemi wrote:
> > On Thu, Aug 10, 2000 at 05:21:44PM +0300, Jason Elbaum wrote:
> > > As far as I know, there is a basic bit of regexp functionality which
> > > Perl should support but doesn't.
> > >
> > > Perl regexps support the following features, though they're a bit
> > > obscure to my tastes...
> > >
> > > (from perlre:)
> > > \l  lowercase next char (think vi)
> > > \u  uppercase next char (think vi)
> > > \L  lowercase till \E (think vi)
> > > \U  uppercase till \E (think vi)
> > > \E  end case modification (think vi)
> > >
> > > ...but Perl doesn't offer a regexp pattern to match all alphabetical
> > > characters of a particular case. Something like:
> > >
> > > \x  match lowercase alpha char
> > > \X  match uppercase alpha char
> > >
> > > Thus /\X\x*/ would match all capitalized words, while /\X+/ would match
> > > acronyms, and /(\X\x+)+/ would match Java class names.
> >
> > Perl 5.6.0 has [[:lower:]] and [[:upper:]].
>
> Yes, but this one is worth a digraph.  Question is, which one?  Currently
> the free ones are:

Hardly.  I beg to differ.  We have enough magical digraphs as it is.
Let's save what we have for metamarkers, such as \v\V, not for 
character classes.

-- 
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen