On 13.03.2013 17:17, Olemis Lang wrote:
> On 3/13/13, Branko Čibej <[email protected]> wrote:
>> On 13.03.2013 15:20, Branko Čibej wrote:
>>> On 13.03.2013 09:31, Olemis Lang wrote:
>>>> On 3/13/13, Jure Zitnik <[email protected]> wrote:
>>>>> On 3/13/13 8:58 AM, Olemis Lang wrote:
>>>>>
>>>>>> -0 ... I'd suggest (?:(?!\d)\w+) ... which is already available in
>>>>>> multiproduct.util.IDENTIFIER after recent patches .
>>>>> Sure, we can go with that one too ...
>>>>>
>>>> things like Hønsdrükḱenshàpełlmünçen (i.e. unicode chars ;) will be
>>>> supported , which is nice to have ; and will be matched by TracLinks
>>>> expressions
>>>> ;)
>>> I very much disagree with using something like that as the tag that is
>>> effectively a unique key in database columns. Display name is different,
>>> but for database debugging, I really don't want to have to tell the
>>> difference between i and í and ì and ı.
>> Not to mention that those 4 variants of i have 6 different Unicode
>> representations. You do *not* want to deal with Unicode normalization
>> issues in primary keys.
>>
> Like I just said in my previous message ... we already deal with
> unicode values in primary keys , so that belongs in the past ...
Just make triply sure that Trac core actually does normalize the keys
before writing them to the database. Otherwise I'd consider that a
serious bug, because it leaves room for having two identical-looking
primary keys with different bit values. Database collations typically
are not normalization-agnostic.

As far as I can see, Trac core only normalizes the names of attachments.
That's not enough.

-- Brane

-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com
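
To illustrate the normalization concern above, here is a minimal sketch in
Python 3 using only the standard unicodedata module. The normalize_key helper
is hypothetical and is not part of Trac; the choice of NFC as the target form
is also an assumption, and any single form applied consistently before writing
would serve the same purpose.

```python
import unicodedata


def normalize_key(name, form="NFC"):
    """Return `name` in one canonical Unicode form (hypothetical helper)."""
    return unicodedata.normalize(form, name)


composed = "\u00ed"     # 'í' as a single precomposed code point
decomposed = "i\u0301"  # 'i' followed by COMBINING ACUTE ACCENT

# The two strings render identically but compare unequal as raw values.
keys_differ = composed != decomposed

# After normalization both spellings collapse to the same key.
keys_match = normalize_key(composed) == normalize_key(decomposed)

# Dotless 'ı' (U+0131) is a separate letter; normalization does not
# fold it into plain 'i', so it remains a distinct key.
dotless_distinct = normalize_key("\u0131") != normalize_key("i")
```

Without a step like this before the write, both spellings of 'í' could end up
as separate rows whose keys look the same in any debugging output.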
