Where are we on this? Given the original report:
online=# select * from common_logins where username = 'potyty';
uid | username | password | lastlogin | status | usertype | loginnum
-----+----------+----------+-----------+--------+----------+----------
(0 rows)
online=# select * from common_logins where username like 'potyty';
  uid   | username | password |         lastlogin          | status | usertype | loginnum
--------+----------+----------+----------------------------+--------+----------+----------
 155505 | potyty   | board    | 2004-08-16 17:45:55.723829 | A      | S        |        1
  60067 | potyty   | board    | 2004-07-07 20:22:17.68699  | A      | S        |        3
 174041 | potyty   | board    | 2005-02-17 00:00:13.706144 | A      | S        |        3
(3 rows)
online=# select username, username = 'potyty' from common_logins where username like 'potyty';
username | ?column?
----------+----------
potyty | t
potyty | t
potyty | t
(3 rows)
I don't think we can state that our current behavior is correct. I
realize we are being hit by the length comparison optimization, but
ultimately the issue is that the Hungarian-specific locale considers
"tyty" and "tty" to be the same string, which confuses our index
comparisons.
Is our fix going to be a Hungarian-specific one?
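
To make the failure mode concrete, here is a minimal Python sketch (not
PostgreSQL code) of how I understand it. toy_collation_key() is a
hypothetical stand-in for strxfrm() under the Hungarian locale and models
only the one rule from the report, that "tty" collates like "tyty". All
function names here are made up for illustration.

```python
def toy_collation_key(s: str) -> str:
    """Hypothetical stand-in for strxfrm() in a Hungarian locale.

    Models only the single rule from the report: the doubled digraph
    "tty" collates identically to "tyty".
    """
    return s.replace("tty", "tyty")


def locale_equal(a: str, b: str) -> bool:
    """Equality as the locale sees it (strcoll() == 0 in the real system)."""
    return toy_collation_key(a) == toy_collation_key(b)


def shortcut_equal(a: str, b: str) -> bool:
    """Equality with a length shortcut: unequal lengths are never equal."""
    if len(a) != len(b):
        return False
    return locale_equal(a, b)
```

For the pair ("potty", "potyty") locale_equal() says yes and
shortcut_equal() says no. Any code path that trusts the first answer (an
ordering comparison) will disagree with a path that trusts the second (the
equality operator), which is the kind of inconsistency I think we are
seeing.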
---------------------------------------------------------------------------
Tom Lane wrote:
> Martijn van Oosterhout <[email protected]> writes:
> > On Fri, Dec 16, 2005 at 01:06:58PM -0500, Tom Lane wrote:
> >> Ah. So we could redefine hashtext() to return the hash of the strxfrm
> >> value. Slow, but a lot better than giving up hash join and hash
> >> aggregation altogether...
>
> > Not to put too fine a point on it, but either you want locale-sensitive
> > sorting or you don't.
>
> Nobody's said anything about giving up locale-sensitive sorting. The
> question is about locale-sensitive equality: does it really make sense
> that 'tty' = 'tyty'? Would your answer change in the context
> '/dev/tty' = '/dev/tyty'? Are you willing to *not have access* to a
> text comparison operator that will make the distinction?
>
> I'm inclined to think that this is more like the occasional need for
> accent-insensitive comparisons. It seems generally agreed that you want
> something like smash('ab') = smash('??b') rather than making the
> strings equal in all contexts.
>
> Of course, not being a native speaker of any of the affected languages,
> my opinion shouldn't be taken too seriously ...
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>
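
For reference, the smash() idea quoted above could look roughly like the
following Python sketch. smash() is a hypothetical name taken from Tom's
example, not an existing function, and the accented character used in the
usage note is my assumption. The sketch folds accents away explicitly, so
plain equality can stay strict.

```python
import unicodedata


def smash(s: str) -> str:
    """Hypothetical accent-folding helper.

    Decomposes each character (NFD) and drops the combining marks, so an
    accented letter folds to its base letter.
    """
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

With this, "\u00e1b" (a-acute followed by b) and "ab" remain different
strings under ordinary equality, while smash("\u00e1b") == smash("ab")
holds. The caller opts in to the looser comparison instead of having it
forced on every text comparison.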
--
Bruce Momjian | http://candle.pha.pa.us
[email protected] | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073