Bug#426366: postgresql-8.2: upper(), lower(), and ilike are broken on UNICODE databases

2007-07-08 Thread Emil Nowak
On 2007-07-06 at 21:34:52, Erwin Brandstetter wrote:

> Emil, are you sure you were connected to the right database when running
> your tests?
> I have been having major problems with this bug in pg 8.1.*, but
> I think it has been fixed in 8.2.

Yes, it was the right database.

> See the results of my tests:

Because your results were different, I started to investigate the problem on
my PostgreSQL installation. Something was probably wrong with the main
cluster. I made a backup of my databases, then removed the PostgreSQL server
with purge. Then I installed it again and restored the databases from the
backup. Now it works.

This server was upgraded several times, starting from a 7.x version, so I guess
something went wrong in pg_upgradecluster during one of the previous upgrades.

I will close this bug right now.
Thanks for the help.

> test=# select 'ı' ilike 'I';
>  ?column?
> ----------
>  f

I think a better test for this is:

testy=# select 'ı' ilike upper('ı');
 ?column?
----------
 f
(1 row)

I don't know what this i without a dot is, but it looks like a bug to me.
Maybe this 'ı' letter exists only in a lowercase version?



2007-07-08 Thread Erwin Brandstetter

Hi Emil!


[EMAIL PROTECTED] wrote:

> Because your results were different, I started to investigate the problem on
> my PostgreSQL installation. Something was probably wrong with the main
> cluster. I made a backup of my databases, then removed the PostgreSQL server
> with purge. Then I installed it again and restored the databases from the
> backup. Now it works.
>
> This server was upgraded several times, starting from a 7.x version, so I guess
> something went wrong in pg_upgradecluster during one of the previous upgrades.
>
> I will close this bug right now.
> Thanks for the help.
  



The results of these commands might have been of interest (but it's too 
late for that as you have reinstalled):

   select version();
   show client_encoding;
   show server_encoding;


(...)
> I don't know what this i without a dot is, but it looks like a bug to me.
> Maybe this 'ı' letter exists only in a lowercase version?
  


The dotless 'ı' is a letter in Turkish and some related languages. I
have found a comprehensive article on Wikipedia:

   http://en.wikipedia.org/wiki/Turkish_dotted_and_dotless_I

Obviously this glyph has a history of causing confusion.
Devrim GUNDUZ is active in the PostgreSQL community and lives in Turkey,
AFAIK. So, if this is considered a bug (I am not sure), he might be able to
help. I have cc'ed him for that reason; I hope this is OK.




Regards
Erwin Brandstetter





2007-07-06 Thread Erwin Brandstetter

Hi Emil! Hi everyone!

Emil, are you sure you were connected to the right database when running
your tests?
I have been having major problems with this bug in version pg 8.1.*, but 
I think it has been fixed in 8.2.


See the results of my tests:

test=# select version();
version
------------------------------------------------------------------------
PostgreSQL 8.2.4 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2
20061115 (prerelease) (Debian 4.1.1-21)



event=# show client_encoding;
client_encoding
-----------------
UTF8


test=# select upper ('ÖöÜüÄäßĆČćč€ĞğīŃńŇňŐőŘřŠšşŽžŻżÓóŁłĆć');
upper
------------------------------------
ÖÖÜÜÄÄßĆČĆČ€ĞĞĪŃŃŇŇŐŐŘŘŠŠŞŽŽŻŻÓÓŁŁĆĆ


test=# select lower('ÖöÜüÄäßĆČćč€ĞğīŃńŇňŐőŘřŠšşŽžŻżÓóŁłĆć');
lower
------------------------------------
ööüüääßćčćč€ğğīńńňňőőřřššşžžżżóółłćć


test=# select 'ööüüääßćčćč€ğğńńňňőőřřššşžžżżóółłćć' ilike 
'ÖÖÜÜÄÄßĆČĆČ€ĞĞŃŃŇŇŐŐŘŘŠŠŞŽŽŻŻÓÓŁŁĆĆ';

?column?
----------
t



None of the above works properly in pg 8.1.
HOWEVER there IS a problematic case with characters like 'ı' - notice 
the missing dot, it is not 'i'.
Upper case representation of 'ı' is 'I', but lower case representation 
of 'I' is 'i' - at least according to lower() and upper().


test=# select upper('ı');
upper
-------
I

test=# select lower('I');
lower
-------
i

test=# select 'ı' ilike 'I';
?column?
----------
f

test=# select 'I' ilike 'ı';
?column?
----------
f

Everything is converted to lower() before comparing, so 'ı' and 'I' will 
never match.

But shouldn't they?
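
The asymmetry can be reproduced outside PostgreSQL. The following is a minimal Python sketch, not PostgreSQL's actual implementation. It assumes Python's built-in str.upper() and str.lower(), which apply the locale-independent Unicode default case mappings. The helper name ilike_by_lowering is hypothetical and only mimics the "lower both sides, then compare" behaviour described above.

```python
# Sketch only: mimics "lower both sides, then compare"; not PostgreSQL code.
DOTLESS_I = "\u0131"  # 'ı', LATIN SMALL LETTER DOTLESS I


def ilike_by_lowering(left: str, right: str) -> bool:
    """Hypothetical helper: case-insensitive equality via lower()."""
    return left.lower() == right.lower()


# The default Unicode mappings are not a round trip for the dotless i.
upper_of_dotless = DOTLESS_I.upper()   # maps to plain 'I' (U+0049)
lower_of_capital = "I".lower()         # maps to dotted 'i' (U+0069)
matches = ilike_by_lowering(DOTLESS_I, "I")

print(upper_of_dotless, lower_of_capital, matches)
```

Under these default mappings the two strings lower to 'ı' and 'i' respectively, so a comparison built this way cannot match them. A Turkish-aware mapping would pair 'ı' with 'I' and 'i' with 'İ' instead.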


Is this the right place for these comments?


Regards
Erwin Brandstetter





2007-05-28 Thread Emil Nowak
Package: postgresql-8.2
Version: 8.2.4-1
Severity: normal

When I try to use upper(), lower(), or ILIKE on UNICODE databases, the
results are incorrect. For example:

testy-utf=# select upper('Żółć'), lower('Żółć');
 upper | lower 
-------+-------
 ūãłć  | 
(1 row)


And now the same thing on a database with LATIN2 encoding (here we get
correct results):
testy=# select upper('Żółć'), lower('Żółć');
 upper | lower 
-------+-------
 ŻÓŁĆ  | żółć
(1 row)

I have the same problem when using the psql command-line utility, pgadmin3,
and the JDBC driver.
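
One possible explanation, offered here only as an assumption and not something established in this report, is that the case conversion was applied byte by byte under a locale that did not match the database encoding. The short Python sketch below shows why that can work for LATIN2 but not for UTF-8: the same test string takes one byte per character in ISO 8859-2 and two bytes per character in UTF-8.

```python
# Sketch: compare how the test string is stored in the two encodings.
text = "Żółć"

latin2_bytes = text.encode("iso8859_2")  # one byte per character
utf8_bytes = text.encode("utf-8")        # two bytes per character here

print(len(latin2_bytes), len(utf8_bytes))

# Character-aware conversion gives the results seen on the LATIN2 database.
print(text.upper(), text.lower())
```

A routine that changes case one byte at a time would therefore split every character of the UTF-8 string, which is consistent with garbled output but does not prove that this is what happened here.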

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)

Kernel: Linux 2.6.18-4-686 (SMP w/1 CPU core)
Locale: LANG=pl_PL.UTF-8, LC_CTYPE=pl_PL.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages postgresql-8.2 depends on:
ii  libc6    2.5-5                           GNU C Library: Shared libraries
ii  libcomer 1.39+1.40-WIP-2007.04.07+dfsg-2 common error description library
ii  libkrb53 1.6.dfsg.1-2                    MIT Kerberos runtime libraries
ii  libpam0g 0.79-3.2                        Pluggable Authentication Modules l
ii  libpq5   8.2.4-1                         PostgreSQL C client library
ii  libssl0. 0.9.8e-4                        SSL shared libraries
ii  postgres 8.2.4-1                         front-end programs for PostgreSQL
ii  postgres 75                              manager for PostgreSQL database cl
ii  tzdata   2007f-1                         time zone and daylight-saving time

postgresql-8.2 recommends no packages.

-- no debconf information