> -----Original Message-----
> From: Adriano dos Santos Fernandes [mailto:[email protected]]
> Sent: Tuesday, September 12, 2017 1:21 PM
> To: [email protected]
> Subject: Re: [Firebird-devel] FB 3.0 Replacing ICU libraries on Windows
>
> On 12/09/2017 08:10, Magnus Johansson wrote:
> >
> > But when testing, the result is not as expected.
> >
> >
> It will be difficult to help you with this problem description.
>
>
> Adriano
>
Hi Adriano,
I was hoping you would be so kind to answer. :-)
I perfectly understand that, but since I didn't want to do another lengthy post
I thought that I should break it up.
Anyhow, here is an excerpt from my original post.
I might as well add that it involves collation with Numeric-Sort and index.
After I replaced the ICU libraries as described before, I created a database,
some collations and a table with some data:
SET NAMES UTF8;
SET SQL DIALECT 3;
CREATE DATABASE 'C:\Data\Test\Collation_CI.fdb'
USER 'SYSDBA' PASSWORD 'masterkey'
PAGE_SIZE 16384
DEFAULT CHARACTER SET UTF8;
SHOW VERSION;
ISQL Version: WI-V3.0.3.32802 Firebird 3.0
Server version:
Firebird/Windows/AMD/Intel/x64 (access method), version "WI-V3.0.3.32802
Firebird 3.0"
on disk structure version 12.0
CREATE COLLATION UTF_SV_CI
FOR UTF8
FROM UNICODE
CASE INSENSITIVE
'LOCALE=sv_SE';
CREATE COLLATION UTF_SV_CI_NUM
FOR UTF8
FROM UNICODE
CASE INSENSITIVE
'LOCALE=sv_SE;NUMERIC-SORT=1';
CREATE COLLATION ISO_SV_CI
FOR ISO8859_1
FROM SV_SV
CASE INSENSITIVE;
COMMIT;
CREATE TABLE TABLE_T (
FIELD_UTF_SV_CI VARCHAR(5) CHARACTER SET UTF8 COLLATE UTF_SV_CI,
FIELD_UTF_SV_CI_NUM VARCHAR(5) CHARACTER SET UTF8 COLLATE
UTF_SV_CI_NUM,
FIELD_ISO_SV_CI VARCHAR(5) CHARACTER SET ISO8859_1 COLLATE ISO_SV_CI
);
COMMIT;
INSERT INTO TABLE_T VALUES ('1', '1', '1' );
INSERT INTO TABLE_T VALUES ('2', '2', '2' );
INSERT INTO TABLE_T VALUES ('10', '10', '10');
INSERT INTO TABLE_T VALUES ('20', '20', '20');
INSERT INTO TABLE_T VALUES ('a', 'a', 'a' );
INSERT INTO TABLE_T VALUES ('AA', 'AA', 'AA');
INSERT INTO TABLE_T VALUES ('aa', 'aa', 'aa');
INSERT INTO TABLE_T VALUES ('Aa', 'Aa', 'Aa');
INSERT INTO TABLE_T VALUES ('A', 'A', 'A' );
INSERT INTO TABLE_T VALUES ('b', 'b', 'b' );
INSERT INTO TABLE_T VALUES ('B', 'B', 'B' );
INSERT INTO TABLE_T VALUES ('o', 'o', 'o' );
INSERT INTO TABLE_T VALUES ('O', 'O', 'O' );
INSERT INTO TABLE_T VALUES ('x', 'x', 'x' );
INSERT INTO TABLE_T VALUES ('X', 'X', 'X' );
INSERT INTO TABLE_T VALUES ('å', 'å', 'å' );
INSERT INTO TABLE_T VALUES ('Å', 'Å', 'Å' );
INSERT INTO TABLE_T VALUES ('ä', 'ä', 'ä' );
INSERT INTO TABLE_T VALUES ('Ä', 'Ä', 'Ä' );
INSERT INTO TABLE_T VALUES ('ö', 'ö', 'ö' );
INSERT INTO TABLE_T VALUES ('Ö', 'Ö', 'Ö' );
COMMIT;
Now some tests.
SELECT FIELD_UTF_SV_CI
FROM TABLE_T
WHERE FIELD_UTF_SV_CI BETWEEN '' AND 'b'
ORDER BY 1;
FIELD_UTF_SV_CI
===============
1
10
2
20
a
A
aa
Aa
AA
b
B
===================
Result as expected
SELECT FIELD_UTF_SV_CI_NUM
FROM TABLE_T
WHERE FIELD_UTF_SV_CI_NUM BETWEEN '' AND 'b'
ORDER BY 1;
FIELD_UTF_SV_CI_NUM
===================
1
2
10
20
a
A
aa
Aa
AA
b
B
===================
Result as expected
SELECT FIELD_ISO_SV_CI
FROM TABLE_T
WHERE FIELD_ISO_SV_CI BETWEEN '' AND 'b'
ORDER BY 1;
FIELD_ISO_SV_CI
===============
1
10
2
20
A
a
Ä
Å
Ö
ä
å
ö
AA
Aa
aa
B
b
===================
Not the expected result.
Both wrong order and records that should not be there.
Let's add some index:
COMMIT;
CREATE INDEX IDX_UTF_SV_CI ON TABLE_T (FIELD_UTF_SV_CI);
CREATE INDEX IDX_UTF_SV_CI_NUM ON TABLE_T (FIELD_UTF_SV_CI_NUM);
CREATE INDEX IDX_ISO_SV_CI ON TABLE_T (FIELD_ISO_SV_CI);
COMMIT;
And repeat the queries from before:
SELECT FIELD_UTF_SV_CI
FROM TABLE_T
WHERE FIELD_UTF_SV_CI BETWEEN '' AND 'b'
ORDER BY 1;
FIELD_UTF_SV_CI
===============
1
10
2
20
a
A
aa
Aa
AA
b
B
===================
Result as expected
SELECT FIELD_UTF_SV_CI_NUM
FROM TABLE_T
WHERE FIELD_UTF_SV_CI_NUM BETWEEN '' AND 'b'
ORDER BY 1;
FIELD_UTF_SV_CI_NUM
===================
1
2
10
20
a
A
aa
Aa
AA
b
===================
Not the expected result.
Where did the 'B' go?
SELECT FIELD_ISO_SV_CI
FROM TABLE_T
WHERE FIELD_ISO_SV_CI BETWEEN '' AND 'b'
ORDER BY 1;
FIELD_ISO_SV_CI
===============
1
10
2
20
a
A
å
Å
ä
Ä
ö
Ö
AA
aa
Aa
b
B
===================
Not the expected result.
Both wrong order and records that should not be there.
Although the order differs slightly from the same query without index.
I have done the same tests with the release version of FB 3.0.2
(Firebird-3.0.2.32703-0_x64) as well as tested with icu 53 and 57 for them both
with the same result, no differences there.
Using the original icu files that comes with Firebird-3.0.2.32703-0_x64 and
Firebird-3.0.3.32802-0_x64 and doing the same tests for FIELD_ISO_SV_CI gives
the same result, so no difference there if the icu library is replaced or not.
I have also done some test with ACCENT INSENSITIVE but I will leave that beast
for now.
Thanks for reading,
Magnus Johansson
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at
https://lists.sourceforge.net/lists/listinfo/firebird-devel