Persian UTF-8 MySql collation

2004-06-24 Thread Peter Cruickshank
Hello

I'm a new subscriber to the list, so please forgive me if I'm asking an old
question. I did look at the archives for the last few months though and didn't
see any discussion of this issue:

The subject kind of explains it all - I'm part of a team adapting an open
source MySql-based content management system (Back-End -
www.back-end.org) to work with Persian content. A big stumbling block is
getting UTF-8 collation working. We don't want to be reinventing wheels
here - so it would be great to hear if someone has already built a UTF-8
collation file and is willing to share it?

Any help or pointers will be greatly appreciated!

Thanks

Peter
-- 
Peter Cruickshank
peterc at openconcept.ca



___
PersianComputing mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/persiancomputing


Re: Persian UTF-8 MySql collation

2004-07-03 Thread Peter Cruickshank
On Sat, 3 Jul 2004 02:37:55 -0400
Behdad Esfahbod [EMAIL PROTECTED] wrote:

> On Sat, 3 Jul 2004, Ehsan Akhgari wrote:
>
> > > For proper sorting using Glibc, it's not enough that the
> > > application use Glibc, but it should call the sorting
> > > function of Glibc too! (which apparently MySql does not).
> >
> > Right.
> >
> > I'd like to spend some time trying to patch MySQL sources to use glibc
> > collation functions before I give up and sort the data at the client
> > side. Would you mind letting me know which version of glibc I should be
> > using? Also, is there any resource/documentation/how-to available which
> > can guide me in this job?
>
> It's not at all easy to do what you are saying here, unless you make
> sure you ALWAYS run your mysql under the same (fa_IR) locale, and
> that the locale data does not change.  Any Glibc version >= 2.2
> should be OK.

Thanks everyone for the feedback so far. It's a kind of relief to hear that
we aren't the only people who've hit this issue.

Ehsan - are you thinking about adding glibc collation to the
strings/ctype-MYSET.c file? Or something more fundamental?

I think you and the team I'm working with are trying to do the same thing -
it would be great if we could work together and come up with a solution
that anyone else can use too.

What's involved in creating a collation file? These three pages:
http://dev.mysql.com/doc/mysql/en/Adding_character_set.html
http://dev.mysql.com/doc/mysql/en/Character_arrays.html
http://dev.mysql.com/doc/mysql/en/String_collating.html
seem to say that it's not too difficult, if you know what you're doing?
(Which I don't. I'm just a humble PHP programmer.)

... it seems it would be better to create a MySql Persian collation file
than to change the source, with the problem that would bring of having to
re-patch the code every time there's a new MySql release? Or is that
inevitable?

Thanks again!

Peter

-- 
Peter Cruickshank
peterc at openconcept.ca


Re: Persian UTF-8 MySql collation

2004-07-04 Thread Peter Cruickshank
On Sat, 3 Jul 2004 16:13:02 -0400
Behdad Esfahbod [EMAIL PROTECTED] wrote:

> Actually there's a middle solution here, whose only price is
> messing a bit with your database schema.  All you need is to
> store the string returned by strxfrm(str) in your database as a
> binary field, and just sort on that column instead of str.
>
> behdad

That might work for Ehsan, but it sadly wouldn't save much effort for us
since PHP doesn't do Persian UTF-8 collation (that I've been able to get
working anyway), or provide access to strxfrm().

:-(

- which is why MySql seemed the least bad option.
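
For anyone reading this in the archive: where strxfrm() can't be reached from the application language, the stored-sort-key idea can still be approximated with a hand-written weight table. The sketch below is only an illustration in Python, not glibc's fa_IR collation and not anything from MySql or Back-End. The names PERSIAN_ALPHABET, WEIGHT and sort_key are made up for this example, and it assumes the text contains only the 32 basic Persian letters (no Arabic variants of kaf/yeh, no diacritics, no spaces), which real content will not satisfy without extra normalisation.

```python
# Hypothetical stand-in for strxfrm(): map each Persian letter to a one-byte
# weight in alphabet order, so plain binary comparison gives Persian order.
PERSIAN_ALPHABET = (
    "\u0627\u0628\u067e\u062a\u062b\u062c\u0686\u062d"  # alef be pe te se jim che he
    "\u062e\u062f\u0630\u0631\u0632\u0698\u0633\u0634"  # khe dal zal re ze zhe sin shin
    "\u0635\u0636\u0637\u0638\u0639\u063a\u0641\u0642"  # sad zad ta za ain ghain fe qaf
    "\u06a9\u06af\u0644\u0645\u0646\u0648\u0647\u06cc"  # kaf gaf lam mim nun vav he ye
)
WEIGHT = {ch: i + 1 for i, ch in enumerate(PERSIAN_ALPHABET)}


def sort_key(text: str) -> bytes:
    """Return bytes whose binary order follows the table above."""
    # Assumption: anything outside the table gets weight 255 and sorts last.
    return bytes(WEIGHT.get(ch, 255) for ch in text)


# Three sample words: "tab", "pedar", "bad".
words = ["\u062a\u0627\u0628", "\u067e\u062f\u0631", "\u0628\u0627\u062f"]

by_codepoint = sorted(words)           # what a binary/code-point sort does
by_key = sorted(words, key=sort_key)   # what sorting on the stored key does
```

The code-point sort puts pe (U+067E) after te (U+062A), while the key-based sort puts it between be and te. In a database the output of sort_key would go into an extra binary column and the query would ORDER BY that column, as Behdad describes for strxfrm().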

Peter

-- 
Peter Cruickshank
peter cruickshank biz

