Curtis Bruneau wrote:
> Ruslan Zakirov wrote:
>
>> On Sat, Aug 9, 2008 at 12:20 AM, Curtis Bruneau <[EMAIL PROTECTED]> wrote:
>>
>>> I need some suggestions. I have come to the conclusion that none of
>>> the utf8 collations handle French properly, not like latin1 anyway.
>>> All accented variants of a letter are seen as the same; while they
>>> are distinct in binary, they cannot be unique indexed, and sorting
>>> will treat them as the same, as will queries using any variant
>>> character.
>>>
>>> So I'm in a bit of a bind. If I were to use RT with a case sensitive
>>> collation like utf8_bin, would the application behave as expected? I
>>> know search would be much more strict and possibly confusing to the
>>> end user.
>>>
>> utf8_bin is a good choice. You're free to use a binary collation.
>> Maybe the utf8_general_ci collation will be better for you. Any
>> collation is ok as long as you know how to deal with it in mysql.
>>
> Ok, just wondering, I'll give it a try.. I was more curious whether any
> string type clauses would still work internally, since binary
> collations are everything/case sensitive. I'm guessing that's all fine
> because I think postgres stores its stuff as binary_cs and relies on
> the OS to do collations (something like that, other postgres DBs
> around here seem to be case sensitive).
>
>>> My other option would be to continue to use latin1. Is there any way
>>> to accomplish this using the latest code base? It's probably not
>>> configurable and I don't want to have to manage diffs for the
>>> possible changes, unless it is fairly minimal to do..
>>>
>> No, we wouldn't return to that as it's totally wrong and has
>> consequences, as it actually violates the purpose of the setting. RT
>> was storing UTF8 encoded data in a latin1 column, so collations worked
>> absolutely incorrectly for everything, even latin1, and were close to
>> binary.
>>
>> At this point I can suggest you either move to a binary collation or
>> create a new one and send it to the mysql team for inclusion.
>>
> Understood, I wasn't liking that idea either. Oddly enough
> latin1_swedish_ci (the latin1 default) isn't supposed to be accent
> sensitive, latin1_general_ci is, but my old database (mysql 4.1) seems
> to be indexing them and seeing them as separate. The collation isn't
> specified so I'm assuming swedish, but it's behaving like general;
> perhaps the old version respected the differences. I'm basically trying
> to get it the same as before (perhaps if swedish had been enforced
> before I wouldn't be in this position). Regardless, this isn't really
> an issue with RT.
>
>>> The issue in question -> http://bugs.mysql.com/bug.php?id=34130
>>>
>>> They said it's on 'todo'. MSSQL handles this with ci_ai, ci_as, cs_ai
>>> and cs_as collations, where case and accents are each either
>>> sensitive or not. Hopefully they do come around to it..
>>>
>>> Character difference for mysql .. http://www.collation-charts.org/mysql60/
>>>
>>> Curtis
>>>
> Thanks again for your time. I'm really excited to launch 3.8.x; compared
> to 3.4.x our users are loving it, especially the reporting and all that.
> Curtis.

I have a question that's probably obvious.. If I go ahead with utf8_bin,
any variation of case on incoming emails will be regarded as distinct,
right? I can see this causing many issues. I may just get rid of my
accented emails and possibly merge the tickets, or just delete the users,
as they aren't valid emails anyway. I don't think I could pad the emails
enough to get the users to match; looking through my data, emails come in
in all kinds of different cases.
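
To make the trade-off above concrete, here is a rough Python sketch of
the three comparison behaviours being discussed: binary (like utf8_bin),
case insensitive but accent sensitive (in the spirit of a ci_as
collation), and case and accent insensitive (in the spirit of ci_ai). It
does not reproduce MySQL's actual collation tables, and the helper names
(binary_key, ci_as_key, ci_ai_key, normalize_email) are made up for
illustration. Lowercasing addresses before lookup is only one possible
workaround, not something RT is known to do.

```python
import unicodedata


def binary_key(s: str) -> bytes:
    """Approximates a binary collation: compare the raw UTF-8 bytes."""
    return s.encode("utf-8")


def ci_as_key(s: str) -> str:
    """Case insensitive, accent sensitive: fold case, keep accents."""
    return unicodedata.normalize("NFC", s).casefold()


def ci_ai_key(s: str) -> str:
    """Case and accent insensitive: fold case, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", s.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_email(addr: str) -> str:
    """Hypothetical clean-up applied before looking a user up by address."""
    return addr.strip().lower()


if __name__ == "__main__":
    a, b = "User@Example.com", "user@example.com"
    # Under a binary comparison these are two different users.
    print(binary_key(a) == binary_key(b))
    # After normalizing, they collapse to one address.
    print(normalize_email(a) == normalize_email(b))
    # Accented and unaccented spellings stay distinct unless accents are folded.
    print(ci_as_key("Élève") == ci_as_key("eleve"))
    print(ci_ai_key("Élève") == ci_ai_key("eleve"))
```

With a binary collation the first comparison is the one the database
would make, which is why addresses arriving in mixed case would end up
as separate users unless they are normalized somewhere before lookup.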
Curtis
_______________________________________________
http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users

Community help: http://wiki.bestpractical.com
Commercial support: [EMAIL PROTECTED]
