On Thu, 18 Jun 2015, Ze'ev Atlas wrote:

I have added \x41 to the list that is recognized by \h and committed the 
patch.

> An interesting point: the perlre document in perldoc (5.20) states: (The
> following all specify the same class of three characters: [-az], [az-], and
> [a\-z]. All are different from [a-z], which specifies a class containing
> twenty-six characters, even on EBCDIC-based character sets.)
> 
> Apparently, Perl somehow recognizes [a-z] and treats it as a special case in
> EBCDIC, ignoring the non-letter gaps.  This is news to me.  Did you know
> that?  I intend to ask in the perl-mvs forum what they do about it.

I did not know that. PCRE does not treat [a-z] as special.
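
To illustrate the gaps in question, here is a minimal Python sketch (not
PCRE or Perl code). It assumes code page 037 as a representative EBCDIC
variant, using Python's built-in cp037 codec; the variable names are made
up for this illustration. It compares the literal code point range from
'a' to 'z' with the set of actual lowercase letters.

```python
import string

# EBCDIC (code page 037, an assumed representative variant) byte value
# of each lowercase letter.
letters = {ch.encode("cp037")[0] for ch in string.ascii_lowercase}

lo = "a".encode("cp037")[0]   # 0x81
hi = "z".encode("cp037")[0]   # 0xa9

# Every code point between 'a' and 'z', taken as a plain numeric range.
literal_range = set(range(lo, hi + 1))

# Code points inside the range that are not lowercase letters.
gaps = sorted(literal_range - letters)

print(len(literal_range), len(letters), len(gaps))  # prints: 41 26 15
print([hex(b) for b in gaps])
```

A matcher that takes [a-z] as a plain code point range would therefore
also match those 15 non-letters.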

> Obviously, I know that \p and \P are useless, but the tests are odd, and I am
> trying to reduce the level of oddity as much as I can.

There was a bug: PCRE was not diagnosing an error for \p and \P within a
class when UCP support was disabled. I have fixed that.

> While 0x41 is indeed not in any class that I may have thought about,
> 0x25 is actually in some.

> /[\h]/BZ
------------------------------------------------------------------
        Bra
        [\x05\x0b-\x0d\x15\x25 ]
        Ket
        End
------------------------------------------------------------------

That is wrong! It should only be \x05, space, and (now) \x41. Those 
vertical spaces should not be there. Can you check again, please?
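
The discrepancy can be spelled out as a set difference. This is only a
sketch in Python, not pcretest output: the byte values are copied from the
listings in this message, the literal space in the class is taken to be
0x40 (its EBCDIC code), and the variable names are hypothetical.

```python
# Class contents reported for /[\h]/BZ (the literal space is 0x40 in EBCDIC).
reported_h = {0x05, 0x0B, 0x0C, 0x0D, 0x15, 0x25, 0x40}

# What \h should contain: \x05, space, and the newly added \x41.
expected_h = {0x05, 0x40, 0x41}

# Vertical whitespace, as listed for /[\v]/BZ.
vertical = {0x0B, 0x0C, 0x0D, 0x15, 0x25}

# Bytes that appear in the reported class but should not.
unexpected = reported_h - expected_h
# Bytes that should appear but do not.
missing = expected_h - reported_h

print([hex(b) for b in sorted(unexpected)])
print([hex(b) for b in sorted(missing)])
```

The unexpected bytes are exactly the vertical whitespace set, and the only
missing byte is \x41.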

> /[\v]/BZ
------------------------------------------------------------------
        Bra
        [\x0b-\x0d\x15\x25]
        Ket
        End
------------------------------------------------------------------
  
That one is correct.
 

> /\R/SI                                    
Starting chars: \x0b \x0c \x0d \x15 \x25  

That is correct.

Philip

-- 
Philip Hazel
-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev 
