Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-15 Thread Roger Binns
Peter Jacobi wrote:
> (1) Just as an OS abstraction layer is in place for I/O, wouldn't it
> be possible to use an OS abstraction layer for L10N?

SQLite allows multiple registrations of the same function name as long as
they take a different number of arguments.  Consequently the SQLite core
implements upper/lower taking one argument, and the ICU extension
implements upper/lower taking two arguments, the second being the locale.
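
To illustrate the point about argument counts, here is a minimal sketch
using Python's standard sqlite3 module rather than the C API. The
two-argument upper() below is a hypothetical stand-in for the ICU version:
it accepts a locale argument but ignores it and simply relies on Python's
str.upper().

```python
import sqlite3

def upper_with_locale(text, locale_name):
    # Hypothetical stand-in for a locale-aware upper(); the locale
    # argument is accepted but ignored in this sketch.
    return None if text is None else str(text).upper()

conn = sqlite3.connect(":memory:")

# Registering upper() with two arguments leaves the built-in
# one-argument upper() in place.
conn.create_function("upper", 2, upper_with_locale)

one_arg = conn.execute("SELECT upper('abc')").fetchone()[0]
two_arg = conn.execute("SELECT upper('köck', 'de_DE')").fetchone()[0]
print(one_arg, two_arg)
```

The one-argument call still goes to the SQLite core, while the
two-argument call is routed to the Python function.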

Why don't you write an extension that maps these functions onto the Win32
API and then contribute it (http://sqlite.org/contrib)? If it is small
and works well, it could become part of the core for Windows.

> (2) I'm under the impression, that the problematic cases 

ICU in SQLite does a lot more than just locale-specific upper/lower
casing.  It also does locale-specific sorting (which can't be done with
a trivial lookup table) and LIKE/regular expressions.
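
As a rough illustration of why sorting needs more than a byte-wise
comparison, here is a sketch of a custom collation registered through
Python's sqlite3 module. The collation name "folded" and the helper
functions are made up for this example, and stripping accents is only a
crude approximation; it is not what ICU does.

```python
import sqlite3
import unicodedata

def fold_key(text):
    # Decompose, drop combining marks, then case-fold: 'Ö' -> 'o'.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

def compare_folded(a, b):
    # SQLite expects a negative, zero, or positive integer.
    ka, kb = fold_key(a), fold_key(b)
    return (ka > kb) - (ka < kb)

conn = sqlite3.connect(":memory:")
conn.create_collation("folded", compare_folded)
conn.execute("CREATE TABLE names (name TEXT)")
conn.executemany(
    "INSERT INTO names VALUES (?)",
    [("Zebra",), ("Östlund",), ("apple",), ("Otto",)],
)

binary_order = [r[0] for r in conn.execute(
    "SELECT name FROM names ORDER BY name")]
folded_order = [r[0] for r in conn.execute(
    "SELECT name FROM names ORDER BY name COLLATE folded")]
print(binary_order)
print(folded_order)
```

With the default binary collation, 'Östlund' sorts after every ASCII name;
with the custom collation it lands next to 'Otto'.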

Roger
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-15 Thread Peter Jacobi
I'm aware that ICU is able to provide a very general solution, but I'm
wondering about two other options:

(1) Just as an OS abstraction layer is in place for I/O, wouldn't it
be possible to use an OS abstraction layer for L10N? For example,
uppercasing could be forwarded to LCMapString(LCMAP_UPPERCASE) on
Win32. That would bring SQLite's behaviour in line with the handling
in the application program itself (provided that it uses OS APIs and
not ICU).

(2) I'm under the impression that the problematic cases (German
sharp s, Turkic i) are few compared with all the cases where a simple
lookup would make things work. If I'm not mistaken, a lookup table of
2048 entries handling all 2-byte UTF-8 characters would already cover
the joint character repertoire of all ISO-8859-* encodings (and their
Microsoft counterparts). Thai (in ISO 8859-11) uses three-byte UTF-8 but
doesn't have upper/lower case.
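
A rough sketch of what option (2) could look like, written in Python for
brevity. The table covers only the code points that UTF-8 encodes in two
bytes (U+0080 through U+07FF), and it is filled from Python's own Unicode
data, which is an assumption of this example. The names UPPER_TABLE and
simple_upper are made up here.

```python
# Code points that UTF-8 encodes in exactly two bytes.
FIRST, LAST = 0x80, 0x7FF

# Map each code point to its uppercase form, but only when that form is
# a single character. Sharp s (uppercase 'SS') is therefore left out.
UPPER_TABLE = {}
for cp in range(FIRST, LAST + 1):
    ch = chr(cp)
    up = ch.upper()
    if len(up) == 1 and up != ch:
        UPPER_TABLE[cp] = ord(up)

def simple_upper(text):
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch.upper())                    # plain ASCII
        else:
            out.append(chr(UPPER_TABLE.get(cp, cp)))  # table or unchanged
    return "".join(out)

print(simple_upper("köck"))
print(simple_upper("straße"))
```

The umlaut is handled by the table, while the sharp s passes through
unchanged, which is exactly the kind of case a plain lookup cannot fix.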

Regards,
Peter


Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread Thomas Mittelstaedt
Thanks for the pointer to the ICU project. I did not know about it!

thomas

On Friday, 14 Nov 2008, at 15:27 +0200, Elefterios Stamatogiannakis wrote:
> Has anybody successfully compiled SQLite with ICU for Win32?
> 
> I haven't managed to find a libicu for MinGW. Any tips welcome.
> 
> lefteris
> 
> D. Richard Hipp wrote:
> > On Nov 14, 2008, at 8:08 AM, Martin Engelschalk wrote:
> > 
> >> Hi all,
> >>
> >> the ICU project is a very powerful tool to handle codepages, and also
> >> supports regular expressions (using a class named "RegexMatcher", see
> >> http://icu-project.org/apiref/icu4c/classRegexMatcher.html).
> >> So, it should be relatively easy to replace the like() - function in
> >> sqlite (see http://www.sqlite.org/lang_corefunc.html#like and
> >> http://www.sqlite.org/c3ref/create_function.html)
> >>
> > 
> > http://www.sqlite.org/cvstrac/fileview?f=sqlite/ext/icu/README.txt&v=1.2
> > 
> > D. Richard Hipp
> > [EMAIL PROTECTED]


Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread Elefterios Stamatogiannakis
Has anybody successfully compiled SQLite with ICU for Win32?

I haven't managed to find a libicu for MinGW. Any tips welcome.

lefteris

D. Richard Hipp wrote:
> On Nov 14, 2008, at 8:08 AM, Martin Engelschalk wrote:
> 
>> Hi all,
>>
>> the ICU project is a very powerful tool to handle codepages, and also
>> supports regular expressions (using a class named "RegexMatcher", see
>> http://icu-project.org/apiref/icu4c/classRegexMatcher.html).
>> So, it should be relatively easy to replace the like() - function in
>> sqlite (see http://www.sqlite.org/lang_corefunc.html#like and
>> http://www.sqlite.org/c3ref/create_function.html)
>>
> 
> http://www.sqlite.org/cvstrac/fileview?f=sqlite/ext/icu/README.txt&v=1.2
> 
> D. Richard Hipp
> [EMAIL PROTECTED]


Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread D. Richard Hipp

On Nov 14, 2008, at 8:08 AM, Martin Engelschalk wrote:

> Hi all,
>
> the ICU project is a very powerful tool to handle codepages, and also
> supports regular expressions (using a class named "RegexMatcher", see
> http://icu-project.org/apiref/icu4c/classRegexMatcher.html).
> So, it should be relatively easy to replace the like() - function in
> sqlite (see http://www.sqlite.org/lang_corefunc.html#like and
> http://www.sqlite.org/c3ref/create_function.html)
>

http://www.sqlite.org/cvstrac/fileview?f=sqlite/ext/icu/README.txt&v=1.2

D. Richard Hipp
[EMAIL PROTECTED]





Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread Martin Engelschalk
Hi all,

The ICU project is a very powerful tool for handling codepages, and it also
supports regular expressions (using a class named "RegexMatcher", see
http://icu-project.org/apiref/icu4c/classRegexMatcher.html).
So it should be relatively easy to replace the like() function in
SQLite (see http://www.sqlite.org/lang_corefunc.html#like and
http://www.sqlite.org/c3ref/create_function.html).
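
A minimal sketch of that idea, using Python's sqlite3 module and the re
module in place of ICU and the C API. The function unicode_like is made up
for this example; it handles only the two-argument form of like() (no
ESCAPE clause), and the table contents are invented test data.

```python
import re
import sqlite3

def unicode_like(pattern, value):
    # SQLite calls like(pattern, value) for "value LIKE pattern".
    if pattern is None or value is None:
        return None
    parts = []
    for ch in str(pattern).casefold():
        if ch == "%":
            parts.append(".*")      # any run of characters
        elif ch == "_":
            parts.append(".")       # exactly one character
        else:
            parts.append(re.escape(ch))
    regex = "".join(parts)
    matched = re.fullmatch(regex, str(value).casefold(), re.DOTALL)
    return int(matched is not None)

conn = sqlite3.connect(":memory:")
# Overriding the two-argument like() changes what the LIKE operator does.
conn.create_function("like", 2, unicode_like)

conn.execute('CREATE TABLE ku2008 ("Empfaenger 1" TEXT)')
conn.execute("INSERT INTO ku2008 VALUES ('Herr KÖCK')")
rows = conn.execute(
    "SELECT * FROM ku2008 WHERE \"Empfaenger 1\" LIKE '%köck%'"
).fetchall()
print(rows)
```

Note that casefold() can change the length of a string (for example the
sharp s becomes "ss"), so the "_" wildcard is only approximate here.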

Martin

Igor Tandetnik wrote:
> "Thomas Mittelstaedt"
> <[EMAIL PROTECTED]> wrote in
> message news:[EMAIL PROTECTED]
>   
>> Just did a search on my database using
>> SELECT * FROM ku2008 where "Empfaenger 1" like '%köck%';
>>
>> and nothing was found. Doing a SELECT * FROM ku2008 where "Empfaenger
>> 1" like '%kÖck%'; with the capital umlaut did find the record.
>> 
>
> http://sqlite.org/lang_expr.html
>
> "SQLite only understands upper/lower case for 7-bit Latin characters. 
> Hence the LIKE operator is case sensitive for 8-bit iso8859 characters 
> or UTF-8 characters. For example, the expression 'a' LIKE 'A' is TRUE 
> but 'æ' LIKE 'Æ' is FALSE."
>
> Apparently, it's possible to integrate SQLite with ICU 
> (http://icu-project.org/) to support properly localized collation and 
> case folding. I don't know the details, hopefully someone more 
> knowledgeable will chime in.
>
> Igor Tandetnik 

-- 

* Codeswift GmbH *
Traunstr. 30
A-5026 Salzburg-Aigen
Tel: +49 (0) 8662 / 494330
Mob: +49 (0) 171 / 4487687
Fax: +49 (0) 12120 / 204645
[EMAIL PROTECTED]
www.codeswift.com / www.swiftcash.at

Codeswift Professional IT Services GmbH
Company register no. FN 202820s
VAT ID no. ATU 50576309



Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread Igor Tandetnik
"Thomas Mittelstaedt"
<[EMAIL PROTECTED]> wrote in
message news:[EMAIL PROTECTED]
> Just did a search on my database using
> SELECT * FROM ku2008 where "Empfaenger 1" like '%köck%';
>
> and nothing was found. Doing a SELECT * FROM ku2008 where "Empfaenger
> 1" like '%kÖck%'; with the capital umlaut did find the record.

http://sqlite.org/lang_expr.html

"SQLite only understands upper/lower case for 7-bit Latin characters. 
Hence the LIKE operator is case sensitive for 8-bit iso8859 characters 
or UTF-8 characters. For example, the expression 'a' LIKE 'A' is TRUE 
but 'æ' LIKE 'Æ' is FALSE."
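
The documented behaviour is easy to check from Python's sqlite3 module.
This is only a quick probe: on a build without the ICU extension the
second comparison is expected to report 0, but the result depends on how
the underlying SQLite library was compiled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# ASCII letters are compared case-insensitively by the built-in LIKE.
ascii_match = conn.execute("SELECT 'a' LIKE 'A'").fetchone()[0]

# Non-ASCII letters are compared as-is unless ICU (or an override of
# like()) is in place.
latin_match = conn.execute("SELECT 'æ' LIKE 'Æ'").fetchone()[0]

print("'a' LIKE 'A' ->", ascii_match)
print("'æ' LIKE 'Æ' ->", latin_match)
```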

Apparently, it's possible to integrate SQLite with ICU 
(http://icu-project.org/) to support properly localized collation and 
case folding. I don't know the details; hopefully someone more 
knowledgeable will chime in.

Igor Tandetnik 





Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread Michael Schlenker
Thomas Mittelstaedt wrote:
> Hello,
> 
> Just did a search on my database using 
> SELECT * FROM ku2008 where "Empfaenger 1" like '%köck%';
> 
> and nothing was found. Doing a SELECT * FROM ku2008 where "Empfaenger 1"
> like '%kÖck%'; with the capital umlaut did find the record. 
> The data is utf-8! my sqlite version is 3.5.9 on ubuntu hardy.
>
This is a documented bug; see the SQLite expressions documentation page, which states:
http://www.sqlite.org/lang_expr.html

(A bug: SQLite only understands upper/lower case for 7-bit Latin characters.
Hence the LIKE operator is case sensitive for 8-bit iso8859 characters or
UTF-8 characters. For example, the expression 'a' LIKE 'A' is TRUE but 'æ'
LIKE 'Æ' is FALSE.).

But it's hard to fix, as you would need language information for the data to
get the upper/lower mapping always correct (just think about the ß -> SS
anomaly in German).

Michael

-- 
Michael Schlenker
Software Engineer

CONTACT Software GmbH   Tel.:   +49 (421) 20153-80
Wiener Straße 1-3   Fax:+49 (421) 20153-41
28359 Bremen
http://www.contact.de/  E-Mail: [EMAIL PROTECTED]

Registered office: Bremen
Managing directors: Karl Heinz Zachries, Ralf Holtgrefe
Registered in the commercial register of the Bremen district court under HRB 13215


Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread Martin Engelschalk
Hello Thomas,

I have the same problem. There is no readily available function for 
converting UTF-8 characters outside 7-bit ASCII from lower to upper case, so 
SQLite does not use one.
To achieve this, you have to write your own function and/or incorporate 
something like ICU into your project. I still have that work ahead of me.
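
For what it's worth, here is a sketch of the "write your own function"
route from Python's sqlite3 module. The SQL function name ulower and the
inserted row are made up for this example; the idea is to lower-case both
sides before the built-in LIKE compares them.

```python
import sqlite3

def unicode_lower(text):
    # Python's str.lower() knows about non-ASCII letters such as 'Ö'.
    return None if text is None else str(text).lower()

conn = sqlite3.connect(":memory:")
conn.create_function("ulower", 1, unicode_lower)

conn.execute('CREATE TABLE ku2008 ("Empfaenger 1" TEXT)')
conn.execute("INSERT INTO ku2008 VALUES ('Herr KÖCK')")

rows = conn.execute(
    'SELECT * FROM ku2008 WHERE ulower("Empfaenger 1") LIKE ulower(?)',
    ("%köck%",),
).fetchall()
print(rows)
```

Wrapping the column in a function means SQLite has to evaluate it for
every row, so this is a workaround rather than a fast path.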

Martin

Thomas Mittelstaedt wrote:
> Hello,
>
> Just did a search on my database using 
> SELECT * FROM ku2008 where "Empfaenger 1" like '%köck%';
>
> and nothing was found. Doing a SELECT * FROM ku2008 where "Empfaenger 1"
> like '%kÖck%'; with the capital umlaut did find the record. 
> The data is utf-8! my sqlite version is 3.5.9 on ubuntu hardy.
>
> thomas



[sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread Thomas Mittelstaedt
Hello,

I just ran a search on my database using
SELECT * FROM ku2008 where "Empfaenger 1" like '%köck%';

and nothing was found. Running SELECT * FROM ku2008 where "Empfaenger 1"
like '%kÖck%'; with the capital umlaut did find the record.
The data is UTF-8! My SQLite version is 3.5.9 on Ubuntu Hardy.

thomas

