Accent-insensitive comparison: letters with a DIAGONAL crossing stroke
match their non-accented forms only in EQUALITY tests
---------------------------------------------------------------------------------------------------------------------------------------
Key: CORE-4739
URL: http://tracker.firebirdsql.org/browse/CORE-4739
Project: Firebird Core
Issue Type: Bug
Components: Charsets/Collation
Reporter: Pavel Zotov
Priority: Minor
Attachments:
diacritical-comparison-of-letters-with-diagonal-stokes.png.zip
The following letters:
Ø = U+00D8 // LATIN CAPITAL LETTER O WITH STROKE, used in the Danish and
Norwegian alphabets;
Ð = U+00D0 // LATIN CAPITAL LETTER ETH, Icelandic;
Ŀ = U+013F // LATIN CAPITAL LETTER L WITH MIDDLE DOT, Catalan (Valencian);
Ł = U+0141 // LATIN CAPITAL LETTER L WITH STROKE, Polish
-- compare as TRUE with their non-accented forms only when '=' or 'IS NOT
DISTINCT FROM' is used.
Other kinds of comparison (STARTING WITH, LIKE, SIMILAR TO, and checking the
result of POSITION()) fail.
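A side note for triage, not part of the original report: the four letters above differ from the other letters in the test string in that Unicode defines no canonical decomposition for them (Ŀ has only a compatibility decomposition), so stripping combining marks after NFD normalization leaves them unchanged. The sketch below checks this with Python's standard unicodedata module. The helper names (canonical_base, describe) are illustrative only, and whether this property is related to the behaviour reported here is an assumption that the report does not establish.

```python
import unicodedata

# The four letters from this report: Ø, Ð, Ŀ, Ł (written as escapes so the
# source file encoding cannot change them).
REPORTED = "\u00D8\u00D0\u013F\u0141"
# A small sample of other letters from the test string: É, Å, Ş.
ORDINARY = "\u00C9\u00C5\u015E"


def canonical_base(ch: str) -> str:
    """Apply canonical decomposition (NFD) and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def describe(ch: str) -> str:
    """One-line summary of a character's Unicode decomposition data."""
    decomp = unicodedata.decomposition(ch) or "(none)"
    return (
        f"{ch} U+{ord(ch):04X} {unicodedata.name(ch)}: "
        f"decomposition={decomp}, base after NFD={canonical_base(ch)}"
    )


if __name__ == "__main__":
    for ch in REPORTED + ORDINARY:
        print(describe(ch))
```

For the sampled ordinary letters the base after NFD is the plain Latin letter, while each of the four reported letters comes back unchanged.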
Test query:
========
-- Note: co_utf8_ci_ai is a user-defined collation; its definition is not
-- included in this report.
with recursive
d as (
    -- s: letters under test; t: expected non-accented forms, position by position
    select
        cast( 'ØÐ' || 'Ł' || 'Ŀ' ||
              'ĘĄĂÂÎŢŐŰĖÅĽĢÁÉÍÓÚÝÀÈÌÒÙÂÊÎÔÛÃÑÕÄËÏÖÜŸÇŠĄĘŹŻĂŞŢ'
              as varchar(80) character set utf8 ) s
       ,cast( 'OD' || 'L' || 'L' ||
              'EAAAITOUEALGAEIOUYAEIOUAEIOUANOAEIOUYCSAEZZAST'
              as varchar(80) character set utf8 ) t
    from rdb$database
)
,r as (
    -- row generator: i = 1..100
    select 1 i from rdb$database
    union all
    select r.i + 1 from r where r.i < 100
)
,e as (
    -- pair the i-th character of s with the i-th character of t
    select
        substring(d.s from r.i for 1) c
       ,substring(d.t from r.i for 1) t
    from d join r on r.i <= char_length(d.s)
)
,f as (
    -- each test yields 1 if the comparison succeeded, 0 otherwise
    select
        e.c as utf_char
       ,e.t as latin_char
       ,iif( e.c collate co_utf8_ci_ai = e.t, 1, 0 ) equal_test
       ,iif( position(e.t, e.c collate co_utf8_ci_ai) > 0, 1, 0 ) pos_test
       ,iif( e.c collate co_utf8_ci_ai starting with e.t, 1, 0 ) start_with_test
       ,iif( e.c collate co_utf8_ci_ai like e.t, 1, 0 ) like_test
       ,iif( e.c collate co_utf8_ci_ai similar to e.t, 1, 0 ) similar_to_letter_test
       ,iif( e.c collate co_utf8_ci_ai similar to '[[:ALPHA:]]', 1, 0 ) similar_to_alpha_test
    from e
)
-- letters that pass the fewest tests are listed first
select *
from f
order by
    equal_test + pos_test + start_with_test + like_test
    + similar_to_letter_test + similar_to_alpha_test
   ,utf_char
;
The result I got on both Windows and Linux is shown in the attached screenshot.