Has anybody specifically looked at how Perl6 regexes might map to
the various requirements of UTS#18, Unicode Regular Expressions?

    http://unicode.org/reports/tr18/

I ask because to my inexperienced eye, quite a few perl6isms are
*much* better at this than what obtains in perl5, and so I wondered
whether this was by conscious intent and design.  Is/Was it?

I'm also curious whether there are active plans to address the
tr18 requirements in perl6 regexes.  It would be a wonderful
feather in perl6's cap to be able to legitimately claim Level 2
or even Level 3 compliance, since, besides perl5, only ICU right
now manages even Level 1, with everybody else *very* far behind.

TR18 specifies three levels of support (Basic, Extended, and Tailored),
with each having specific, reasonably well-defined requirements:

  =Level 1: Basic Unicode Support
   RL1.1    Hex Notation                        
   RL1.2    Properties                         
   RL1.2a   Compatibility Properties          
   RL1.3    Subtraction and Intersection     
   RL1.4    Simple Word Boundaries          
   RL1.5    Simple Loose Matches           
   RL1.6    Line Boundaries               
   RL1.7    Supplementary Code Points    

  =Level 2: Extended Unicode Support
   RL2.1    Canonical Equivalents       
   RL2.2    Default Grapheme Clusters  
   RL2.3    Default Word Boundaries   
   RL2.4    Default Loose Matches    
   RL2.5    Name Properties         
   RL2.6    Wildcard Properties    

  =Level 3: Tailored Unicode Support
   RL3.1    Tailored Punctuation            
   RL3.2    Tailored Grapheme Clusters     
   RL3.3    Tailored Word Boundaries      
   RL3.4    Tailored Loose Matches       
   RL3.5    Tailored Ranges             
   RL3.6    Context Matching           
   RL3.7    Incremental Matches       
 ( RL3.8    Unicode Set Sharing )
   RL3.9    Possible Match Sets      
   RL3.10   Folded Matching         
   RL3.11   Submatchers            

thanks,

--tom
