Re: Persian UTF-8 MySql collation

2004-07-05 Thread Peter Cruickshank
On Sat, 3 Jul 2004 22:10:24 +0430
"Ehsan Akhgari" <[EMAIL PROTECTED]> wrote:

[...]

> > I think you and the team I'm working with are trying to do
> > the same thing - it would be great if we could work together
> > and come up with a solution that anyone else can use too.
> 
> I looked around a bit, and it seems like MySQL 4.1.x will be supporting
> UTF-8.  MySQL 4.0.x doesn't have that support (the version I'm using on
> the production server is 4.0.18-standard.)  Because of that,
> incorporating that support into MySQL might require a lot more work than
> I currently imagine. Unfortunately in that case, I'll have to leave MySQL
> as it is, and sort the data at the client side (less efficient, but
> requiring less development time), and since the application I'm working
> on doesn't store very big chunks of data in the db, I may decide to
> sacrifice performance for development time.

Right. I was thinking about adding UTF-8 Persian collation to MySql 4.1.x
- our project will involve a fairly large amount of data, so we'd like to
have the option of sorting at the DB level.

> > What's involved in creating a collation file? These three pages:
> > http://dev.mysql.com/doc/mysql/en/Adding_character_set.html
> > http://dev.mysql.com/doc/mysql/en/Character_arrays.html
> > http://dev.mysql.com/doc/mysql/en/String_collating.html
> > seem to say that it's not too difficult, if you know what
> > you're doing?
> > (Which I don't. I'm just a humble PHP programmer)
> 
> Well, that seems to be for single-byte code pages.  The Persian character
> coding system used in glibc is UTF-8, and that will require patching
> MySQL source code.  And like I said, because of MySQL's lack of UTF-8
> support, it might require more work than I imagine.  I think I can handle
> it from a technical point of view (I'm good at C/C++) but I'm quite pressed
> for free time...

... which is why we're hoping to use MySql 4.1.x 

> > ... it seems it would be great to create a MySQL Persian
> > collation file rather than changing the source, with all the
> > problems that would lead to of having to re-patch the code
> > every time there's a new MySQL release? Or is that inevitable?
> 
> Well, if we decide to change the MySQL source code, we can submit our
> patches to MySQL team, and hopefully they will incorporate it into their
> new releases.  Of course in that case we might have to look into adding
> that support to MySQL 4.1.x as well (if it doesn't already have it.)  So
> there's no need for re-patching.  There's just a need for time!  :-)

Nope, no Persian collation file for MySql 4.1.x as far as I can see (which
is where we came in!)

> In case I decide not to spend the time in the development of Persian
> collation support in MySQL, I'll be glad to help your team in case they
> need technical programming help.  In that case, I'll let you know
> off-list (remind me if you don't get any note from me within a week,
> please.)

We may be in touch... :-)

Cheers

-- 
Peter Cruickshank
peter # cruickshank # biz

___
PersianComputing mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/persiancomputing


Re: Persian UTF-8 MySql collation

2004-07-05 Thread Roozbeh Pournader
On Tue, 2004-06-29 at 19:41, C Bobroff wrote:

> If you're talking about sorting, it was recently pointed out (see
> archives) that Windows Server 2003 can sort Persian properly.

I would appreciate it if someone could volunteer to run a test data set
that FarsiWeb has on it. I'm 100% sure they won't support Hamzas or Harakat
properly.

roozbeh




RE: Persian UTF-8 MySql collation

2004-07-05 Thread Ehsan Akhgari
> [Ehsan, you just replied to me.  Answering on list.]

My bad.  Sorry, I meant to reply to the list.

> Well, you may wish to read a couple documents.  Read Unicode Collation
> Algorithm for example.  Just read the intro or something like that.
> The point is that Persian Collation is only a small table fed to the
> Unicode Collation Algorithm.
> So yes, there is a free Persian collation implementation, Glibc +
> fa_IR locale.

Good point, thanks.  I'll investigate it.

> What you have seen is the binary encoded table.  The source is in the
> fa_IR locale source file.

Thanks, I'll try Googling for it.

> Guys, both of you, if you don't have Glib,

You mean glibc, right?

> and your system
> does not provide what you need, you:
>
> * Either forget about Persian Collation, or
> * Implement your own minimal collation, or

That's what I have in mind, currently.

> * Consider using something like Glibc or uClibc with Persian
>   locale as a library.  Not sure how uClibc deals with Persian
>   locale.

Thanks again,

-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]





RE: Persian UTF-8 MySql collation

2004-07-05 Thread Ehsan Akhgari
> Right. I was thinking about adding UTF-8 Persian collation to MySql
> 4.1.x - our project will involve a fairly large amount of data, so we'd
> like to have the option of sorting at the DB level.

I've never tested MySQL 4.1.x.  Have you tried it?  How is the UTF-8
support?  Have you tried Persian collation in MySQL 4.1.x to see how much
better it is compared to 4.0.x?

Unfortunately I'm not willing to look into 4.1.x at this time, since it's
Beta, and we don't use Beta products on our production servers, so doing so
would do my project no good.

> ... which is why we're hoping to use MySql 4.1.x

I'd give it a try if I were in your shoes.

> Nope, no Persian collation file for MySql 4.1.x as far as I can see
> (which is where we came in!)

How does 4.1.x handle Persian sorting?  Like 4.0.x?


-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]





RE: Persian UTF-8 MySql collation

2004-07-05 Thread Ehsan Akhgari
> That might work for Ehsan, but it sadly wouldn't save much effort for
> us since PHP doesn't do Persian UTF-8 collation (that I've been able
> to get working anyway), or provide access to strxfrm()
>
> :-(
>
> - which is why MySql seemed the least bad option.

Hmmm, if you've compiled PHP with glibc, I suppose you could simply do the
following (code not tested):



And yes, PHP doesn't provide access to strxfrm, but I think it's trivial to
write a PHP extension which provides that function.
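
The snippet referred to above did not survive in this archive.  As a rough
sketch of the same idea (switch the collation locale, then sort with the C
library's comparison function), here is a version in Python rather than PHP;
Python's locale module wraps the same libc setlocale() and strcoll() calls.
The helper name sort_with_locale is made up for illustration, and the locale
name fa_IR.UTF-8 is an assumption about how the Persian locale is installed
on a given glibc system.

```python
import locale
from functools import cmp_to_key


def sort_with_locale(words, locale_name="fa_IR.UTF-8"):
    """Sort words with libc strcoll() under locale_name.

    Returns (sorted_words, locale_was_available).  If the locale is not
    installed, the words are sorted under the current locale instead.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query, do not change
    try:
        try:
            locale.setlocale(locale.LC_COLLATE, locale_name)
            available = True
        except locale.Error:
            available = False
        ordered = sorted(words, key=cmp_to_key(locale.strcoll))
        return ordered, available
    finally:
        # Always restore the previous collation locale.
        locale.setlocale(locale.LC_COLLATE, previous)
```

The second return value matters: if it is False, the requested locale was not
found and the order should not be trusted as Persian order.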

-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]





Re: Persian UTF-8 MySql collation

2004-07-04 Thread Behdad Esfahbod
On Sun, 4 Jul 2004, Peter Cruickshank wrote:

> On Sat, 3 Jul 2004 16:13:02 -0400
> Behdad Esfahbod <[EMAIL PROTECTED]> wrote:
>
> > Actually there's a middle solution here, whose price is just
> > messing a bit with your database schema.  All you need is to
> > store the string returned by strxfrm(str) in your database as a
> > binary field, and just sort on that column instead of str.
> >
> > behdad
>
> That might work for Ehsan, but it sadly wouldn't save much effort for us
> since PHP doesn't do Persian UTF-8 collation (that I've been able to get
> working anyway), or provide access to strxfrm()

To do Persian collation you need to set the locale to Persian.
Wrapping setlocale and strxfrm is a ten-minute job (if they're
really not in PHP).  Or do you mean you are using PHP on a system
which does Persian collation but does not provide strxfrm?  Then
you'd better deal with it...  If you have Glibc, as I said, it's a
ten-minute job.

> :-(
>
> - which is why MySql seemed the least bad option.

By no means is it the least bad option, believe me.  It's the
hardest, without Glibc at least.

> Peter

[Ehsan, you just replied to me.  Answering on list.]

On Sun, 4 Jul 2004, Ehsan Akhgari wrote:

> > Actually there's a middle solution here, whose price is
> > just messing a bit with your database schema.  All you need
> > is to store the string returned by strxfrm(str) in your
> > database as a binary field, and just sort on that column
> > instead of str.
>
> Thanks for the suggestion.  I didn't think of this before.
>
> BTW, is there a free Persian collation implementation available?  I have

Well, you may wish to read a couple of documents.  Read the Unicode
Collation Algorithm for example.  Just read the intro or
something like that.  The point is that Persian Collation is only
a small table fed to the Unicode Collation Algorithm.  So yes,
there is a free Persian collation implementation: Glibc + fa_IR
locale.

> seen Roozbeh's fa_IR LC_COLLATE file, but I'm wondering whether it is
> implemented in straight C as well.  And no, using glibc is not an option here.

What you have seen is the binary encoded table.  The source is in
the fa_IR locale source file.

> Thanks!
>
> -
> Ehsan Akhgari

Guys, both of you, if you don't have Glib, and your system does
not provide what you need, you:

* Either forget about Persian Collation, or
* Implement your own minimal collation, or
* Consider using something like Glibc or uClibc with Persian
  locale as a library.  Not sure how uClibc deals with Persian
  locale.
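
For the second option, a hand-rolled collation can be as small as a lookup
table.  The following is a minimal sketch in Python, assuming the usual
32-letter Persian alphabet order; the names PERSIAN_LETTERS, WEIGHT and
persian_key are made up here.  It folds Arabic KAF and YEH onto the Persian
forms (an assumption about the data) and makes no attempt at hamza, harakat
or ALEF WITH MADDA, which are the hard cases mentioned elsewhere in this
thread.

```python
# Persian alphabet in dictionary order, written as escapes so the order
# is unambiguous however the file is displayed.
PERSIAN_LETTERS = [
    "\u0627",  # alef
    "\u0628",  # beh
    "\u067E",  # peh
    "\u062A",  # teh
    "\u062B",  # theh
    "\u062C",  # jeem
    "\u0686",  # cheh
    "\u062D",  # hah
    "\u062E",  # khah
    "\u062F",  # dal
    "\u0630",  # thal
    "\u0631",  # reh
    "\u0632",  # zain
    "\u0698",  # zheh
    "\u0633",  # seen
    "\u0634",  # sheen
    "\u0635",  # sad
    "\u0636",  # dad
    "\u0637",  # tah
    "\u0638",  # zah
    "\u0639",  # ain
    "\u063A",  # ghain
    "\u0641",  # feh
    "\u0642",  # qaf
    "\u06A9",  # keheh (Persian kaf)
    "\u06AF",  # gaf
    "\u0644",  # lam
    "\u0645",  # meem
    "\u0646",  # noon
    "\u0648",  # waw
    "\u0647",  # heh
    "\u06CC",  # Farsi yeh
]

WEIGHT = {ch: i for i, ch in enumerate(PERSIAN_LETTERS)}
WEIGHT["\u0643"] = WEIGHT["\u06A9"]  # assumption: Arabic kaf sorts like keheh
WEIGHT["\u064A"] = WEIGHT["\u06CC"]  # assumption: Arabic yeh sorts like Farsi yeh


def persian_key(text):
    """Sort key: table letters by weight; anything else first, by code point."""
    return [(1, WEIGHT[ch]) if ch in WEIGHT else (0, ord(ch)) for ch in text]
```

It would be used as sorted(words, key=persian_key).  Characters outside the
table (digits, Latin letters, spaces) simply sort ahead of the Persian
letters by code point, which is a simplification.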

--behdad
  behdad.org


Re: Persian UTF-8 MySql collation

2004-07-04 Thread Peter Cruickshank
On Sat, 3 Jul 2004 16:13:02 -0400
Behdad Esfahbod <[EMAIL PROTECTED]> wrote:

> Actually there's a middle solution here, whose price is just
> messing a bit with your database schema.  All you need is to
> store the string returned by strxfrm(str) in your database as a
> binary field, and just sort on that column instead of str.
> 
> behdad

That might work for Ehsan, but it sadly wouldn't save much effort for us
since PHP doesn't do Persian UTF-8 collation (that I've been able to get
working anyway), or provide access to strxfrm()

:-(

- which is why MySql seemed the least bad option.

Peter

-- 
Peter Cruickshank
peter cruickshank biz




RE: Persian UTF-8 MySql collation

2004-07-03 Thread Behdad Esfahbod

Actually there's a middle solution here, whose price is just
messing a bit with your database schema.  All you need is to
store the string returned by strxfrm(str) in your database as a
binary field, and just sort on that column instead of str.
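
As a sketch of this schema trick, the following Python example stores a sort
key next to each string and orders by the key column.  Assumptions: SQLite's
in-memory database stands in for MySQL so that the example runs anywhere, the
table and column names are invented, and the key comes from locale.strxfrm()
(Python's wrapper around the libc function) under whatever LC_COLLATE is
currently in effect, which on a glibc system would first be set to the
Persian locale.  The stored keys are only comparable with each other as long
as the locale and its data stay the same.

```python
import locale
import sqlite3


def sort_key(text):
    """Binary sort key from libc strxfrm() under the current LC_COLLATE."""
    # UTF-8 preserves code-point order, so comparing the encoded bytes
    # gives the same result as comparing the strxfrm() strings.
    return locale.strxfrm(text).encode("utf-8", "surrogatepass")


def build_table(conn, words):
    """Create a table holding each word plus its precomputed sort key."""
    conn.execute("CREATE TABLE entries (title TEXT, title_key BLOB)")
    conn.executemany(
        "INSERT INTO entries (title, title_key) VALUES (?, ?)",
        [(w, sort_key(w)) for w in words],
    )


def titles_in_order(conn):
    """Return titles ordered by the binary key column, not by the text."""
    rows = conn.execute("SELECT title FROM entries ORDER BY title_key")
    return [row[0] for row in rows]
```

The database only ever compares raw bytes here, so it needs no knowledge of
Persian at all; all of the collation work happens when the key is computed.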

behdad


On Sat, 3 Jul 2004, Ehsan Akhgari wrote:

> > Ehsan - are you thinking about adding glibc collation to the
> > strings/ctype-MYSET.c file? Or something more fundamental?
>
> Well, to tell you the truth, I'm not really sure, since I've not checked the
> MySQL source tree yet.  But yes, I'm going to see if glibc support can be
> incorporated into MySQL's charset handling mechanism.
>
> > I think you and the team I'm working with are trying to do
> > the same thing - it would be great if we could work together
> > and come up with a solution that anyone else can use too.
>
> I looked around a bit, and it seems like MySQL 4.1.x will be supporting
> UTF-8.  MySQL 4.0.x doesn't have that support (the version I'm using on the
> production server is 4.0.18-standard.)  Because of that, incorporating that
> support into MySQL might require a lot more work than I currently imagine.
> Unfortunately in that case, I'll have to leave MySQL as it is, and sort the
> data at the client side (less efficient, but requiring less development
> time), and since the application I'm working on doesn't store very big
> chunks of data in the db, I may decide to sacrifice performance for
> development time.
>
> > What's involved in creating a collation file? These three pages:
> > http://dev.mysql.com/doc/mysql/en/Adding_character_set.html
> > http://dev.mysql.com/doc/mysql/en/Character_arrays.html
> > http://dev.mysql.com/doc/mysql/en/String_collating.html
> > seem to say that it's not too difficult, if you know what
> > you're doing?
> > (Which I don't. I'm just a humble PHP programmer)
>
> Well, that seems to be for single-byte code pages.  The Persian character
> coding system used in glibc is UTF-8, and that will require patching MySQL
> source code.  And like I said, because of MySQL's lack of UTF-8 support, it
> might require more work than I imagine.  I think I can handle it from a
> technical point of view (I'm good at C/C++) but I'm quite pressed for free
> time...
>
> > ... it seems it would be great to create a MySQL Persian
> > collation file rather than changing the source, with all the
> > problems that would lead to of having to re-patch the code
> > every time there's a new MySQL release? Or is that inevitable?
>
> Well, if we decide to change the MySQL source code, we can submit our
> patches to MySQL team, and hopefully they will incorporate it into their new
> releases.  Of course in that case we might have to look into adding that
> support to MySQL 4.1.x as well (if it doesn't already have it.)  So there's no
> need for re-patching.  There's just a need for time!  :-)
>
> In case I decide not to spend the time in the development of Persian
> collation support in MySQL, I'll be glad to help your team in case they need
> technical programming help.  In that case, I'll let you know off-list
> (remind me if you don't get any note from me within a week, please.)
>
>
> -
> Ehsan Akhgari
>
> Farda Technology (http://www.farda-tech.com/)
>
> [ Email: [EMAIL PROTECTED] ]
> [ WWW: http://www.beginthread.com/Ehsan ]
>
>
>

--behdad
  behdad.org


RE: Persian UTF-8 MySql collation

2004-07-03 Thread Ehsan Akhgari
> Ehsan - are you thinking about adding glibc collation to the
> strings/ctype-MYSET.c file? Or something more fundamental?

Well, to tell you the truth, I'm not really sure, since I've not checked the
MySQL source tree yet.  But yes, I'm going to see if glibc support can be
incorporated into MySQL's charset handling mechanism.

> I think you and the team I'm working with are trying to do
> the same thing - it would be great if we could work together
> and come up with a solution that anyone else can use too.

I looked around a bit, and it seems like MySQL 4.1.x will be supporting
UTF-8.  MySQL 4.0.x doesn't have that support (the version I'm using on the
production server is 4.0.18-standard.)  Because of that, incorporating that
support into MySQL might require a lot more work than I currently imagine.
Unfortunately in that case, I'll have to leave MySQL as it is, and sort the
data at the client side (less efficient, but requiring less development
time), and since the application I'm working on doesn't store very big
chunks of data in the db, I may decide to sacrifice performance for
development time.

> What's involved in creating a collation file? These three pages:
> http://dev.mysql.com/doc/mysql/en/Adding_character_set.html
> http://dev.mysql.com/doc/mysql/en/Character_arrays.html
> http://dev.mysql.com/doc/mysql/en/String_collating.html
> seem to say that it's not too difficult, if you know what
> you're doing?
> (Which I don't. I'm just a humble PHP programmer)

Well, that seems to be for single-byte code pages.  The Persian character
coding system used in glibc is UTF-8, and that will require patching MySQL
source code.  And like I said, because of MySQL's lack of UTF-8 support, it
might require more work than I imagine.  I think I can handle it from a
technical point of view (I'm good at C/C++) but I'm quite pressed for free
time...

> ... it seems it would be great to create a MySQL Persian
> collation file rather than changing the source, with all the
> problems that would lead to of having to re-patch the code
> every time there's a new MySQL release? Or is that inevitable?

Well, if we decide to change the MySQL source code, we can submit our
patches to MySQL team, and hopefully they will incorporate it into their new
releases.  Of course in that case we might have to look into adding that
support to MySQL 4.1.x as well (if it doesn't already have it.)  So there's no
need for re-patching.  There's just a need for time!  :-)

In case I decide not to spend the time in the development of Persian
collation support in MySQL, I'll be glad to help your team in case they need
technical programming help.  In that case, I'll let you know off-list
(remind me if you don't get any note from me within a week, please.)


-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]





RE: Persian UTF-8 MySql collation

2004-07-03 Thread Ehsan Akhgari
> It's not at all easy to do what you are saying here, unless you
> make sure you ALWAYS run your mysql under the same (fa_IR)
> locale, and that the locale data does not change.  Any Glibc
> version >= 2.2 should be Ok.

I think I'll give it a try anyway; but I'm wondering how useful it is,
considering the fact that MySQL 4.1.x (currently Beta) will be UTF-8
enabled...


Anyway, thanks a lot for your comments.

-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]





Re: Persian UTF-8 MySql collation

2004-07-03 Thread Peter Cruickshank
On Sat, 3 Jul 2004 02:37:55 -0400
Behdad Esfahbod <[EMAIL PROTECTED]> wrote:

> On Sat, 3 Jul 2004, Ehsan Akhgari wrote:
> 
> > > For proper sorting using Glibc, it's not enough that the
> > > application use Glibc, but it should call the sorting
> > > function of Glibc too! (which apparently MySql does not).
> >
> > Right.
> >
> > I'd like to spend some time trying to patch MySQL sources to use glibc
> > collation functions before I give up and sort the data at the client
> > side. Would you mind letting me know which version of glibc I should be
> > using? Also, is there any resource/documentation/how-to available which
> > can guide me in this job?
> 
> It's not at all easy to do what you are saying here, unless you make
> sure you ALWAYS run your mysql under the same (fa_IR) locale, and
> that the locale data does not change.  Any Glibc version >= 2.2
> should be Ok.

Thanks everyone for the feedback so far. It's a kind of relief to hear that
we aren't the only people who've hit this issue.

Ehsan - are you thinking about adding glibc collation to the
strings/ctype-MYSET.c file? Or something more fundamental?

I think you and the team I'm working with are trying to do the same thing -
it would be great if we could work together and come up with a solution
that anyone else can use too.

What's involved in creating a collation file? These three pages:
http://dev.mysql.com/doc/mysql/en/Adding_character_set.html
http://dev.mysql.com/doc/mysql/en/Character_arrays.html
http://dev.mysql.com/doc/mysql/en/String_collating.html
seem to say that it's not too difficult, if you know what you're doing?
(Which I don't. I'm just a humble PHP programmer)

... it seems it would be great to create a MySQL Persian collation file
rather than changing the source, with all the problems that would lead to
of having to re-patch the code every time there's a new MySQL release? Or is
that inevitable?

Thanks again!

Peter

-- 
Peter Cruickshank
peterc at openconcept.ca


RE: Persian UTF-8 MySql collation

2004-07-02 Thread Behdad Esfahbod
On Sat, 3 Jul 2004, Ehsan Akhgari wrote:

> > For proper sorting using Glibc, it's not enough that the
> > application use Glibc, but it should call the sorting
> > function of Glibc too! (which apparently MySql does not).
>
> Right.
>
> I'd like to spend some time trying to patch MySQL sources to use glibc
> collation functions before I give up and sort the data at the client side.
> Would you mind letting me know which version of glibc I should be using?
> Also, is there any resource/documentation/how-to available which can guide
> me in this job?

It's not at all easy to do what you are saying here, unless you make
sure you ALWAYS run your mysql under the same (fa_IR) locale, and
that the locale data does not change.  Any Glibc version >= 2.2
should be Ok.

> Thanks!
>
> -
> Ehsan Akhgari

--behdad
  behdad.org


RE: Persian UTF-8 MySql collation

2004-07-02 Thread Ehsan Akhgari
> For proper sorting using Glibc, it's not enough that the
> application use Glibc, but it should call the sorting
> function of Glibc too! (which apparently MySql does not).

Right.

I'd like to spend some time trying to patch MySQL sources to use glibc
collation functions before I give up and sort the data at the client side.
Would you mind letting me know which version of glibc I should be using?
Also, is there any resource/documentation/how-to available which can guide
me in this job?

Thanks!

-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]





RE: Persian UTF-8 MySql collation

2004-07-02 Thread Behdad Esfahbod

For proper sorting using Glibc, it's not enough that the
application use Glibc, but it should call the sorting function of
Glibc too! (which apparently MySql does not).

behdad



On Fri, 2 Jul 2004, Ehsan Akhgari wrote:

> > You can do proper Persian sorting using either glibc
> > (available in all GNU/Linux distributions), or ICU (available
> > from http://oss.software.ibm.com/icu/).
>
> I have tested both MySQL 4.0.15 on WinXP and the default MySQL which comes
> with Fedora Core 1, and neither could handle Persian sorting correctly.
> They both seemed to start sorting from letter "FEH" to "YEH" and then
> picking up "CHEH", "ZHEH", "GEH" and "PEH", and then starting from "ALEF" to
> "GHEIN".
>
> It's possible that the Windows version has not been compiled with glibc, but
> the Linux version is most likely compiled with glibc, I think.
>
> Do I need to compile MySQL manually?  If so, is any particular version of
> glibc required, or do I need to specify any particular compilation options?
>
> Thanks in advance,
>
> -
> Ehsan Akhgari
>
> Farda Technology (http://www.farda-tech.com/)

--behdad
  behdad.org


RE: Persian UTF-8 MySql collation

2004-07-02 Thread Ehsan Akhgari
> You can do proper Persian sorting using either glibc
> (available in all GNU/Linux distributions), or ICU (available
> from http://oss.software.ibm.com/icu/).

I have tested both MySQL 4.0.15 on WinXP and the default MySQL which comes
with Fedora Core 1, and neither could handle Persian sorting correctly.
They both seemed to start sorting from letter "FEH" to "YEH" and then
picking up "CHEH", "ZHEH", "GEH" and "PEH", and then starting from "ALEF" to
"GHEIN".

It's possible that the Windows version has not been compiled with glibc, but
the Linux version is most likely compiled with glibc, I think.
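
For comparison, here is what a plain byte-wise sort of the UTF-8 encodings
does with some of the letters named above.  This is only a sketch in Python
and not a reconstruction of what MySQL did (the order reported above differs
from it); it just shows that the four Persian additions PEH, CHEH, ZHEH and
GAF (called GEH above) have code points after YEH, so any comparison that
ignores collation tables will misplace them.  The helper names are made up.

```python
import unicodedata

# A few of the letters mentioned, in a deliberately shuffled order.
LETTERS = ["\u067E", "\u0627", "\u06AF", "\u0641",
           "\u0698", "\u063A", "\u0686", "\u064A"]


def bytewise_order(chars):
    """Sort by raw UTF-8 bytes, as a comparison with no collation table would."""
    return sorted(chars, key=lambda ch: ch.encode("utf-8"))


def describe(ch):
    """Code point, Unicode character name, and UTF-8 bytes in hex."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} {ch.encode('utf-8').hex()}"


if __name__ == "__main__":
    for ch in bytewise_order(LETTERS):
        print(describe(ch))
```

Run as a script, this lists ALEF, GHAIN, FEH and YEH first and the four
Persian-specific letters last, rather than in their alphabetical positions.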

Do I need to compile MySQL manually?  If so, is any particular version of
glibc required, or do I need to specify any particular compilation options?

Thanks in advance,

-
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]





Re: Persian UTF-8 MySql collation

2004-06-29 Thread C Bobroff

On Tue, 29 Jun 2004, Roozbeh Pournader wrote:

> There is no other software known to
> the community that does Persian Unicode software properly without using
> either of those.

If you're talking about sorting, it was recently pointed out (see
archives) that Windows Server 2003 can sort Persian properly.

-Connie


Re: Persian UTF-8 MySql collation

2004-06-29 Thread Roozbeh Pournader
You can do proper Persian sorting using either glibc (available in all
GNU/Linux distributions), or ICU (available from
http://oss.software.ibm.com/icu/). There is no other software known to
the community that does Persian Unicode software properly without using
either of those.

roozbeh

On Thu, 2004-06-24 at 21:15, Peter Cruickshank wrote:
> Hello
> 
> I'm a new subscriber to the list, so please forgive me if I'm asking an old
> question. I did look at the archives for the last few months though and didn't
> see any discussion of this issue:
> 
> The subject kind of explains it all - I'm part of a team adapting an open
> source MySQL-based content management system (Back-End -
> www.back-end.org) to work with Persian content. A big stumbling block is
> getting UTF-8 collation working. We don't want to be reinventing wheels
> here - so it would be great to hear if someone has already built a UTF-8
> collation file and is willing to share it?
> 
> Any help or pointers will be greatly appreciated!
> 
> Thanks
> 
> Peter
