RE: Persian UTF-8 MySql collation

Behdad Esfahbod Sat, 03 Jul 2004 13:15:02 -0700

Actually there's a middle solution here, and the only price is
messing a bit with your database schema.  All you need is to
store the string returned by strxfrm(str) in your database as a
binary field, and sort on that column instead of on str.  Just
make sure LC_COLLATE is set to the Persian locale when you
compute the keys, since strxfrm() follows the current locale.
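
A rough sketch of the idea, in Python only because it is quick to
try (locale.strxfrm wraps the C library call).  The helper names
and the in-memory "table" are made up for illustration; in a real
setup the key would be computed by the application and written to
a binary column next to str.

```python
import locale


def make_sort_key(text):
    """Return the collation key for text under the current LC_COLLATE.

    Comparing two keys with a plain comparison gives the same order
    as locale.strcoll() on the original strings.
    """
    return locale.strxfrm(text)


def build_rows(strings):
    # Each row mimics a table with two columns: (str, sort_key).
    # sort_key stands in for the extra binary column (hypothetical name).
    return [(s, make_sort_key(s)) for s in strings]


def select_ordered(rows):
    # In spirit: SELECT str FROM t ORDER BY sort_key
    # Only the stored key is compared, never the original string.
    return [s for s, _key in sorted(rows, key=lambda row: row[1])]
```

To get Persian order, call locale.setlocale(locale.LC_COLLATE, ...)
with whatever Persian locale is installed on your system (for
example "fa_IR.UTF-8", which is an assumption about your machine)
before building the rows.  The keys have to be recomputed if the
locale definition changes.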


behdad


On Sat, 3 Jul 2004, Ehsan Akhgari wrote:

> > Ehsan - are you thinking about adding glibc collation to the
> > strings/ctype-MYSET.c file? Or something more fundamental?
>
> Well, to tell you the truth, I'm not really sure, since I've not checked the
> MySQL source tree yet.  But yes, I'm going to see if glibc support can be
> incorporated into MySQL's charset handling mechanism.
>
> > I think you and the team I'm working with are trying to do
> > the same thing - it would be great if we could work together
> > and come up with a solution that anyone else can use too.
>
> I looked around a bit, and it seems like MySQL 4.1.x will be supporting
> UTF-8.  MySQL 4.0.x doesn't have that support (the version I'm using on the
> production server is 4.0.18-standard.)  Because of that, incorporating that
> support into MySQL might require a lot more work than I currently imagine.
> Unfortunately in that case, I'll have to leave MySQL as it is, and sort the
> data at the client side (less efficient, but requiring less development
> time), and since the application I'm working on doesn't store very big
> chunks of data in the db, I may decide to sacrifice performance for
> development time.
>
> > What's involved in creating a collation file? These three pages:
> > http://dev.mysql.com/doc/mysql/en/Adding_character_set.html
> > http://dev.mysql.com/doc/mysql/en/Character_arrays.html
> > http://dev.mysql.com/doc/mysql/en/String_collating.html
> > seem to say that it's not too difficult, if you know what
> > you're doing?
> > (Which I don't. I'm just a humble PHP programmer)
>
> Well, that seems to be for single-byte code pages.  The Persian character
> coding system used in glibc is UTF-8, and that will require patching MySQL
> source code.  And like I said, because of MySQL's lack of UTF-8 support, it
> might require more work than I imagine.  I think I can handle it from a
> technical point of view (I'm good at C/C++) but I'm quite pressed for free
> time...
>
> > ... it seems it would be great to create a mySql Persian
> > collation file rather than changing the source, with all the
> > problems that would lead to of having to re-patch the code
> > every time there's a new MySql release? Or is that inevitable?
>
> Well, if we decide to change the MySQL source code, we can submit our
> patches to MySQL team, and hopefully they will incorporate it into their new
> releases.  Of course in that case we might have to look into adding that
> support to MySQL 4.1.x as well (if it doesn't already have it.)  So there's no
> need for re-patching.  There's just a need for time!  :-)
>
> In case I decide not to spend the time in the development of Persian
> collation support in MySQL, I'll be glad to help your team in case they need
> technical programming help.  In that case, I'll let you know off-list
> (remind me if you don't get any note from me within a week, please.)
>
>
> -------------
> Ehsan Akhgari
>
> Farda Technology (http://www.farda-tech.com/)
>
> [ Email: [EMAIL PROTECTED] ]
> [ WWW: http://www.beginthread.com/Ehsan ]
>
>
>
> _______________________________________________
> PersianComputing mailing list
> [EMAIL PROTECTED]
> http://lists.sharif.edu/mailman/listinfo/persiancomputing
>
>

--behdad
  behdad.org