[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Scott Robison
On Fri, Feb 12, 2016 at 8:05 PM, Warren Young  wrote:

> On Feb 12, 2016, at 4:42 PM, Scott Robison 
> wrote:
> >
> > I find it kind of interesting that Microsoft takes a lot
> > of (deserved) flack for not adhering to standards, yet UTF-8 came about
> > specifically because some didn't want to use UCS-2
>
> ?for good reason.  UCS-2/UTF-16 isn?t compatible with C strings.  I know
> you know this, but it?s a huge consideration.  Outside of Mac OS Classic
> and a few even smaller enclaves, C and its calling standards were the
> lingua franca of the computing world when Unicode first came on the scene,
> and those enclaves are now all but gone.
>
> We?ll be living with the legacy of C for quite a long time yet.  Until C
> is completely stamped out, we?ll have to accommodate 0-terminated strings
> somehow.
>

UCS (which was by definition a 2 byte encoding; UCS-2 is a retronym) was
not a "standard" until late 1991. C89/C90 provided for definition of a wide
character type. People didn't want to (perhaps couldn't) use it (and I can
understand why). My point was just that Microsoft was the first to really
embrace the standard as written, not tweak it into something else.

Windows bought into the idea of Unicode and/or UCS before they were unified
and standardized to their current form. That locked Windows into what we
call the UCS-2 format, when Unicode was "guaranteed" to never need more
than 2^16 code points. Later unification of the two standards expanded the
potential code point space to U+7FFF, and later still restricted it to
U+10 to ensure that UTF-16 could address all of the potential standard
code points.


> > Had Microsoft come up with it first, I'm sure they'd be crucified by
> some of
> > the same people who today are critical of them for using wide characters
> > instead of UTF-8!
>
> I think if we were to send a copy of the Unicode 8.0 standard back to the
> early 1960s as a model for those designing ASCII, Unicode would look very
> different today.
>

I think you're probably correct. Though who knows. The industry still
hadn't really agreed to 8 bit bytes. Memory was expensive, and you did what
you had to to minimize its use. 6 bit bytes/characters, 2 digit year
encodings. A lot of people today just can't imagine caring that much about
RAM (given how much of it is used to share pictures of kittens), but it was
a significant savings that translated to real money.


> UCS-2 feels like the 90?s version of ?640 kB is enough for everything!? to
> me, and UTF-16 like bank switching/segmentation.  We?re going to be stuck
> with those half-measure decisions for decades now.  Thanks, Microsoft.
>

Thanks Unicode / ISO-10646. They set the standard. Microsoft adopted it.


> The POSIX platforms did the right thing here: UTF-32 when speed matters
> more than space, and UTF-8 when space or compatibility matters more.
>

They had the luxury of waiting until UTF-8 and UCS-4 (now UTF-32) existed
before making those decisions. 20/20 hindsight.

Note: I like UTF-8. I try to use it everywhere and only convert as needed
to suit the API. I certainly think Microsoft has had plenty of time to more
thoroughly integrate UTF-8 into the APIs so that you don't have to convert
back and forth. I just find it funny that Microsoft is condemned by so many
for adhering to the standards / draft standards while POSIX systems were
able to embrace and extend. :)

-- 
Scott Robison


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Igor Tandetnik
On 2/12/2016 10:44 PM, J Decker wrote:
> On Fri, Feb 12, 2016 at 7:37 PM, Igor Tandetnik  wrote:
>> It performs the conversion it is documented to perform. It indeed doesn't
>> perform the conversion that you, for reasons unclear, expect it to perform.
>> In other words, you engage in wishful thinking, and then blame the messenger
>> for failure of your wishes to materialize.
>
> I expect it to take any string

What is the basis of this expectation, other than wishful thinking?

Again, if you need to convert specifically between UTF-16 and UTF-8, 
there are API functions that are documented to do that, and they do 
work. They are WideCharToMultiByte and MultiByteToWideChar. wcstombs and 
mbstowcs are not documented to do that, and, quite unsurprisingly, they 
don't work for that.
-- 
Igor Tandetnik



[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Igor Tandetnik
On 2/12/2016 10:14 PM, J Decker wrote:
> mbstowcs( out, utf8, 5 );

mbstowcs expects the string in the codepage of the current locale - 
which is never UTF-8.

> for( n = 0; n < 5; n++ )
> printf( "%04x ", out[n] );  // output is 00f0 0090 0080 0081; expect d800 dc01

Why do you expect that? It appears your system uses Western European 
codepage (aka Latin-1). You pass a character "\xf0" which, when taken to 
be encoded in that codepage, is quite properly converted to U+00F0.

> for( n = 0; n < 5; n++ )
>printf( "%02x ", chout[n] );  // output is 00 00 00 00

U+10001 is (unsurprisingly) not representable in your current ANSI 
codepage, so wcstombs call fails (I can't help but notice that you 
aren't checking any calls for failure) and leaves the output buffer 
unchanged.

> so it does no useful conversion either way :)

It performs the conversion it is documented to perform. It indeed 
doesn't perform the conversion that you, for reasons unclear, expect it 
to perform. In other words, you engage in wishful thinking, and then 
blame the messenger for failure of your wishes to materialize.
-- 
Igor Tandetnik



[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Clemens Ladisch
Olivier Mascia wrote:
> Are there Windows platforms, supported by SQLite source code of course, where 
> the 'W' version of the APIs are not available?

Once upon a time, SQLite supported Windows 95/98/Me.

Nowadays, the code is still there, but untested.


Regards,
Clemens


[sqlite] Bug: LEFT JOIN on view confusing the query optimiser

2016-02-12 Thread Clemens Ladisch
Dave Blake wrote:
>> It chooses a _correct_ plan.
>
> Really? With query 1) to select a song it first scans the song_artist table
> that could contain many thousands of records. That does not seem right.

The result is correct.  This would not necessarily be the case if it
executed an outer join as if it were an inner join.

A left join forces SQLite to scan the outer table first.  This is
necessary for the join to work correctly.

>> Your queries are quite different.
>
> They produce the same result sets, but yes one uses a view and the other
> the constitent tables.

No, the difference is that the view uses an inner join, while your
second query use only outer joins.


Regards,
Clemens


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
On Fri, Feb 12, 2016 at 8:00 PM, Igor Tandetnik  wrote:
> On 2/12/2016 10:44 PM, J Decker wrote:
>>
>> On Fri, Feb 12, 2016 at 7:37 PM, Igor Tandetnik 
>> wrote:
>>>
>>> It performs the conversion it is documented to perform. It indeed doesn't
>>> perform the conversion that you, for reasons unclear, expect it to
>>> perform.
>>> In other words, you engage in wishful thinking, and then blame the
>>> messenger
>>> for failure of your wishes to materialize.
>>
>>
>> I expect it to take any string
>
>
> What is the basis of this expectation, other than wishful thinking?
I don't think expectation and wishful thinknig have anything to do
with each other.

Though I expect standards would look at what the world really needs
and implement core functionality?  That's hardly wishful thinking.
Well I guess it is, because I repeatedly have found myself
disappointed in the lack of considerations in standards.  Yes there
are even unicode libraries for posix; but it's a huge expense for a
couple hundred lines of code.  and if it's something I needed for
interop, why doesn't everyone?

>
> Again, if you need to convert specifically between UTF-16 and UTF-8, there
> are API functions that are documented to do that, and they do work. They are
> WideCharToMultiByte and MultiByteToWideChar. wcstombs and mbstowcs are not
> documented to do that, and, quite unsurprisingly, they don't work for that.
>
> --
> Igor Tandetnik
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
On Fri, Feb 12, 2016 at 8:00 PM, Igor Tandetnik  wrote:
> On 2/12/2016 10:44 PM, J Decker wrote:
>>
>> On Fri, Feb 12, 2016 at 7:37 PM, Igor Tandetnik 
>> wrote:
>>>
>>> It performs the conversion it is documented to perform. It indeed doesn't
>>> perform the conversion that you, for reasons unclear, expect it to
>>> perform.
>>> In other words, you engage in wishful thinking, and then blame the
>>> messenger
>>> for failure of your wishes to materialize.
>>
>>
>> I expect it to take any string
>
>
> What is the basis of this expectation, other than wishful thinking?
>
> Again, if you need to convert specifically between UTF-16 and UTF-8, there
> are API functions that are documented to do that, and they do work. They are
> WideCharToMultiByte and MultiByteToWideChar. wcstombs and mbstowcs are not
> documented to do that, and, quite unsurprisingly, they don't work for that.
>
and what exists for platforms other than windows?
doesn't matter though.  I have a solution that works on all platforms
that's the same name and doesn't require some #ifdef to work.
> --
> Igor Tandetnik
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Igor Tandetnik
On 2/12/2016 7:24 PM, J Decker wrote:
> well mbstowc and vice versa only understand 16 bit encodings, and do
> not generate codpairsand do generate d800-dfff characters which
> are entirely illegal in utf-16 (without corresponding pair)

What character in what ANSI codepage ends up converted by mbstowcs to an 
unpaired surrogate?

What character in what ANSI codepage requires a surrogate pair to 
represent (that is, corresponds to a Unicode character outside of BMP), 
and triggers failure when passed to mbstowcs?

With all due respect, I find your claims difficult to believe.

In any case, MultiByteToWideChar and WideCharToMultiByte are perfectly 
capable of converting between UTF-8 and UTF-16.
-- 
Igor Tandetnik



[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Warren Young
On Feb 12, 2016, at 4:42 PM, Scott Robison  wrote:
> 
> I find it kind of interesting that Microsoft takes a lot
> of (deserved) flack for not adhering to standards, yet UTF-8 came about
> specifically because some didn't want to use UCS-2

?for good reason.  UCS-2/UTF-16 isn?t compatible with C strings.  I know you 
know this, but it?s a huge consideration.  Outside of Mac OS Classic and a few 
even smaller enclaves, C and its calling standards were the lingua franca of 
the computing world when Unicode first came on the scene, and those enclaves 
are now all but gone.

We?ll be living with the legacy of C for quite a long time yet.  Until C is 
completely stamped out, we?ll have to accommodate 0-terminated strings somehow.

> Had Microsoft come up with it first, I'm sure they'd be crucified by some of
> the same people who today are critical of them for using wide characters
> instead of UTF-8!

I think if we were to send a copy of the Unicode 8.0 standard back to the early 
1960s as a model for those designing ASCII, Unicode would look very different 
today.

I think the basic idea of UTF-8 would remain.  Instead of being sold as a 
C-compatible encoding, we?d still have a need for it as a packed encoding.  A 
kind of Huffman encoding for language, if you will.

But, I think we?d probably reorder the Unicode character points so that it 
packed even more densely on typical texts.  Several of the ASCII punctuation 
characters don?t deserve a place in the low 7 bits, and we could relocate the 
control characters, too.  We could probably get all of Western Europe?s 
characters into the lower 7 that way.

The next priority would be to pack the rest of the Western world?s characters 
into the lower 11 bits.  Cyrillic, Greek, Eastern European accented Latin 
characters, etc.

That should still leave space for several other non-Asian, non-Latin character 
sets.  Devanagari, Hebrew, Arabic?pack as many of them in as we can.  We should 
be able to cover about half the world?s population in the same space as UCS-2, 
while allowing most Western texts to be smaller, thoroughly outcompeting it.

UCS-2 feels like the 90?s version of ?640 kB is enough for everything!? to me, 
and UTF-16 like bank switching/segmentation.  We?re going to be stuck with 
those half-measure decisions for decades now.  Thanks, Microsoft.

The POSIX platforms did the right thing here: UTF-32 when speed matters more 
than space, and UTF-8 when space or compatibility matters more.

> Note: I still wish [Microsoft] supported UTF-8 directly from the API.

If wishes were changes, I?d rather that all languages and platforms supported 
tagged UTF-8 and UTF-32 strings, with automatic conversion as necessary.  Pack 
your strings down as UTF-8 when space matters, and unpack them as UTF-32 when 
speed matters.  Unicode could define a sensible conversion rule set, similar to 
the way sign extension works when mixing integer sizes.

Since the Unicode Consortium has stated that Unicode won?t grow beyond 2^21-1 
code points to prevent UTF-8 from going beyond 4 bytes per character, that tag 
could be an all-1s upper byte.  The rule could be that if you pass at least 4 
bytes to a function expecting a string, the buffer length is evenly divisible 
by 4, and the first 32-bit word has 0xFF on either end, it?s a tagged UTF-32 
value.  Otherwise, it?s UTF-8.

Simple and straightforward.

Too bad it will never happen.


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
On Fri, Feb 12, 2016 at 7:37 PM, Igor Tandetnik  wrote:
> On 2/12/2016 10:14 PM, J Decker wrote:
>>
>> mbstowcs( out, utf8, 5 );
>
>
> mbstowcs expects the string in the codepage of the current locale - which is
> never UTF-8.
>
>> for( n = 0; n < 5; n++ )
>> printf( "%04x ", out[n] );  // output is 00f0 0090 0080 0081; expect d800
>> dc01
>
>
> Why do you expect that? It appears your system uses Western European
> codepage (aka Latin-1). You pass a character "\xf0" which, when taken to be
> encoded in that codepage, is quite properly converted to U+00F0.
>
>> for( n = 0; n < 5; n++ )
>>printf( "%02x ", chout[n] );  // output is 00 00 00 00
>
>
> U+10001 is (unsurprisingly) not representable in your current ANSI codepage,
> so wcstombs call fails (I can't help but notice that you aren't checking any
> calls for failure) and leaves the output buffer unchanged.
>
>> so it does no useful conversion either way :)
>
>
> It performs the conversion it is documented to perform. It indeed doesn't
> perform the conversion that you, for reasons unclear, expect it to perform.
> In other words, you engage in wishful thinking, and then blame the messenger
> for failure of your wishes to materialize.

I expect it to take any string such as

???
or
?  ??? ? 

and give me a char * representation of it that's useful... or
conversely take the char* version of said strings and give me wchar_t
* that makes can be used.


> --
> Igor Tandetnik
>
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
On Fri, Feb 12, 2016 at 7:05 PM, Warren Young  wrote:
> On Feb 12, 2016, at 4:42 PM, Scott Robison  wrote:
>>
>> I find it kind of interesting that Microsoft takes a lot
>> of (deserved) flack for not adhering to standards, yet UTF-8 came about
>> specifically because some didn't want to use UCS-2
>
> ?for good reason.  UCS-2/UTF-16 isn?t compatible with C strings.  I know you 
> know this, but it?s a huge consideration.  Outside of Mac OS Classic and a 
> few even smaller enclaves, C and its calling standards were the lingua franca 
> of the computing world when Unicode first came on the scene, and those 
> enclaves are now all but gone.
>
> We?ll be living with the legacy of C for quite a long time yet.  Until C is 
> completely stamped out, we?ll have to accommodate 0-terminated strings 
> somehow.
>

and Go.  Which is purely UTF8.

>
> Simple and straightforward.
>
> Too bad it will never happen.
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
On Fri, Feb 12, 2016 at 5:32 PM, Igor Tandetnik  wrote:
> On 2/12/2016 7:24 PM, J Decker wrote:
>>

> What character in what ANSI codepage ends up converted by mbstowcs to an
> unpaired surrogate?
>
> What character in what ANSI codepage requires a surrogate pair to represent
> (that is, corresponds to a Unicode character outside of BMP), and triggers
> failure when passed to mbstowcs?
>
> With all due respect, I find your claims difficult to believe.
>
> In any case, MultiByteToWideChar and WideCharToMultiByte are perfectly
> capable of converting between UTF-8 and UTF-16.
> --
> Igor Tandetnik
>

Okay; I'd forgotten.  It does worse than I expected...

//--

int main( void )
{
char utf8[5] = "\xf0\x90\x80\x81";
char utf82[5] = "\xed\xa0\x81";
   char utf8tmp[5];
wchar_t out[5];
wchar_t out2[5];
wchar_t utf16[5] = L"\xd800\xdc01";
wchar_t real_out[25];
   char chout[5];
int n;

memset( out, 0, sizeof( out ) );
   memset( out2, 0, sizeof( out2 ) );
   memset( chout, 0, sizeof( chout ) );

mbstowcs( out, utf8, 5 );
mbstowcs( out2, utf82, 5 );
wcstombs( chout, utf16, 5 );

for( n = 0; n < 5; n++ )
printf( "%04x ", out[n] );  // output is 00f0 0090 0080 0081; expect d800 dc01
   printf( "\n" );
for( n = 0; n < 5; n++ )
  printf( "%04x ", out2[n] ); // output is 00ed 00a0 0081; expect d801
   printf( "\n" );
for( n = 0; n < 5; n++ )
  printf( "%02x ", chout[n] );  // output is 00 00 00 00
}

//--

so it does no useful conversion either way :)  (but at least I ended
up fixing a boundary issue while testing)

>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] query Benchmark

2016-02-12 Thread Dan Kennedy

On 02/12/2016 03:47 PM, Michele Pradella wrote:
> Hi all, is there a way to make a benchmark of queries to check which 
> version is faster? I'm using sqlite shell, the question is about on 
> how to make repetitive tests in the same conditions (for example I 
> need to totally disable cache to avoid different results the second 
> time query is executed).
> I try to explain better: the same query re-arranged in two different 
> queries to check which is the faster, executed on the same DB, how can 
> I do speed check of those queries avoiding interaction between two 
> queries(example cache)?

One way to look at performance of an SQLite query is to say that it is 
influenced by three factors: the amount of IO, the number and size of 
malloc() calls made and the CPU consumed by the library to run the query.

Adding the -stats option to the shell tool command line causes it to 
output some helpful information after running each query. Including page 
cache hits and misses. And some stuff about malloc() usage.

We usually use the valgrind tool callgrind or cachegrind to repeatably 
measure CPU load here.

Dan.



[sqlite] query Benchmark

2016-02-12 Thread Quan Yong Zhai
http://sqlite.org/rtree.html

To improve your query performance, you need a r-tree index on DateTime

???: Michele Pradella
: ?2016/?2/?12 17:28
???: sqlite-users at mailinglists.sqlite.org
??: Re: [sqlite] query Benchmark

I'm already using my software. The scope of test my query with sqlite
shell is to have a test environment without other interaction, just to
test witch is the fast version of my query to use in my software.
Another question is, if I have a query like this:
SELECT *field* FROM car_plates WHERE (*field* LIKE 'EX011%') AND
(DateTime>=14550588) AND (DateTime<=14552315) ORDER BY
DateTime;

If I try to force the use of an index on *field *but I think sqlite
can't use it, is it right?

Selea s.r.l.


Michele Pradella R


SELEA s.r.l.

Via Aldo Moro 69
Italy - 46019 Cicognara (MN)
Tel +39 0375 889091
Fax +39 0375 889080
*michele.pradella at selea.com* 
*http://www.selea.com*
Il 12/02/2016 10.23, Simon Slavin ha scritto:
> On 12 Feb 2016, at 8:47am, Michele Pradella  
> wrote:
>
>> Hi all, is there a way to make a benchmark of queries to check which version 
>> is faster? I'm using sqlite shell, the question is about on how to make 
>> repetitive tests in the same conditions (for example I need to totally 
>> disable cache to avoid different results the second time query is executed).
> Using the ".timer ON" command in the command-line shell, as you write, is the 
> best way to do timing.  But there are many levels of cache in your computer 
> and there's no way to disable them all.  You will definitely get timings 
> influenced by interactions.
>
> It looks like you are trying to make commands run as fast as possible.  
> Unless you are writing an academic paper on theory, this is probably the 
> wrong thing to do.  Write your software and see if it runs fast enough.  Only 
> if it runs too slowly do you need to start worrying about speed.
>
> We can advise on good SQL commands and good indexes for you if you do need to 
> increase your speed.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] query Benchmark

2016-02-12 Thread Michele Pradella
Already solved with UNION of SELECT

Selea s.r.l.


Michele Pradella R


SELEA s.r.l.

Via Aldo Moro 69
Italy - 46019 Cicognara (MN)
Tel +39 0375 889091
Fax +39 0375 889080
*michele.pradella at selea.com* 
*http://www.selea.com*
Il 12/02/2016 17.03, Richard Hipp ha scritto:
> On 2/12/16, Michele Pradella  wrote:
>> Why this query
>> PRAGMA case_sensitive_like=ON;EXPLAIN QUERY PLAN SELECT * FROM
>> car_plates INDEXED BY car_plates_plate
>> WHERE ((CarPlateType!=-1)AND((Plate LIKE 'AA00O%')OR(Plate LIKE
>> 'AA0O0%')))OR((CarPlateType==-1)AND((Plate LIKE '~A00O%')))
>>
>> give me "Error: no query solution"
>>
>> if I remove the second expression ((CarPlateType==-1)AND((Plate LIKE
>> '~A00O%'))) it works
>>
> All of your questions are answered in the documentation, in the links
> provided in previous replies.  The fact that you are still asking
> questions tells me that I need to work on improving the documentation
> so that it is easier to understand.  Perhaps a table that shows the
> various combinations of COLLATE and PRAGMA case_sensitive_like work
> together.  I dunno.  I'll take the action item to revisit the
> documentation and try to improve it.
>
> Meanwhile, please go reread the documentation yourself, slowly and
> carefully, and see if this doesn't all suddenly make sense.



[sqlite] query Benchmark

2016-02-12 Thread Michele Pradella
Splitting query in 2 SELECT using UNION let me use car_plates_plate 
index without problemvery strange...but I found a walkaround

Selea s.r.l.


Michele Pradella R


SELEA s.r.l.

Via Aldo Moro 69
Italy - 46019 Cicognara (MN)
Tel +39 0375 889091
Fax +39 0375 889080
*michele.pradella at selea.com* 
*http://www.selea.com*
Il 12/02/2016 16.43, Michele Pradella ha scritto:
> Why this query
> PRAGMA case_sensitive_like=ON;EXPLAIN QUERY PLAN SELECT * FROM 
> car_plates INDEXED BY car_plates_plate
> WHERE ((CarPlateType!=-1)AND((Plate LIKE 'AA00O%')OR(Plate LIKE 
> 'AA0O0%')))OR((CarPlateType==-1)AND((Plate LIKE '~A00O%')))
>
> give me "Error: no query solution"
>
> if I remove the second expression ((CarPlateType==-1)AND((Plate LIKE 
> '~A00O%'))) it works
>
> Selea s.r.l.
>
>
>Michele Pradella R
>
>
>SELEA s.r.l.
>
> Via Aldo Moro 69
> Italy - 46019 Cicognara (MN)
> Tel +39 0375 889091
> Fax +39 0375 889080
> *michele.pradella at selea.com* 
> *http://www.selea.com*
> Il 12/02/2016 15.02, Michele Pradella ha scritto:
>> the strange thing is that if you create
>>
>> CREATE TABLE car_plates (Id INTEGER PRIMARY KEY 
>> AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate 
>> VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP 
>> VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country 
>> VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType 
>> BIGINT,VehicleType BIGINT,GPS VARCHAR(255));
>> CREATE INDEX car_plates_datetime ON car_plates(DateTime);
>> CREATE INDEX car_plates_plate ON car_plates(Plate);
>>
>> and you do
>> PRAGMA case_sensitive_like=ON;
>> EXPLAIN QUERY PLAN SELECT * FROM car_plates WHERE ((Plate LIKE 
>> 'EX011%')) AND 
>> (DateTime>=14550588)AND(DateTime<=14552315);
>>
>> you will see the use of car_plates_datetime not car_plates_plate, but 
>> if you force the use of the index(car_plates_plate) it will use the 
>> correct index
>> PRAGMA case_sensitive_like=ON;
>> EXPLAIN QUERY PLAN SELECT * FROM car_plates indexed by 
>> car_plates_plateWHERE ((Plate LIKE 'EX011%')) AND 
>> (DateTime>=14550588)AND(DateTime<=14552315)
>>
>> with PRAGMA case_sensitive_like=OFF; you obviously obtain error
>>
>> Selea s.r.l.
>>
>>
>>Michele Pradella R
>>
>>
>>SELEA s.r.l.
>>
>> Via Aldo Moro 69
>> Italy - 46019 Cicognara (MN)
>> Tel +39 0375 889091
>> Fax +39 0375 889080
>> *michele.pradella at selea.com* 
>> *http://www.selea.com*
>> Il 12/02/2016 13.20, Richard Hipp ha scritto:
>>> On 2/12/16, Michele Pradella  wrote:
 table:
 CREATE TABLE IF NOT EXISTS car_plates (Id INTEGER PRIMARY KEY
 AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate
 VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP
 VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country
 VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType
 BIGINT,VehicleType BIGINT,GPS VARCHAR(255))

 index:
 CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate)

>>> Test case:
>>>
>>> /* Your original schema */
>>> CREATE TABLE IF NOT EXISTS car_plates (Id INTEGER PRIMARY KEY
>>> AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate
>>> VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP
>>> VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country
>>> VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType
>>> BIGINT,VehicleType BIGINT,GPS VARCHAR(255));
>>> CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate);
>>> /* Sample query with PRAGMA case_sensitive_like=OFF (the default) */
>>> EXPLAIN QUERY PLAN
>>> SELECT 
>>> DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
>>> FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN
>>> (1,2,3,6,7,8)) AND (DateTime>=14550588) AND
>>> (DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;
>>> .print -
>>> /* Sample Query with PRAGMA case_sensitive_like=ON */
>>> PRAGMA case_sensitive_like=ON;
>>> EXPLAIN QUERY PLAN
>>> SELECT 
>>> DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
>>> FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN
>>> (1,2,3,6,7,8)) AND (DateTime>=14550588) AND
>>> (DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;
>>>
>>> Run the above in the shell.  See that the second query uses the index.
>>>
>>> Or, leave case_sensitive_like turned off (the default) but change the
>>> table definition to include:
>>>
>>> ... Plate VARCHAR(255) COLLATE nocase, ...
>>>
>>> If you add the "COLLATE nocase" to the example above, you will see
>>> that the index is used in the default setting, but not when PRAGMA
>>> case_sensitive_like=ON.
>>>
>>>
>>
>> 

[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Olivier Mascia
Dear,

I see the source code for SQLite3 takes great care to support either the A 
(MBCS, but not UTF8) or the W (Windows 'UTF16') versions of key APIs it depends 
on that platform.

Are there Windows platforms, supported by SQLite source code of course, where 
the 'W' version of the APIs are not available? I know about Windows 3.0, but 
what else? Some CE editions?

--
Meilleures salutations, Met vriendelijke groeten, Best Regards,
Olivier Mascia, integral.be/om

-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 842 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: 
<http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/attachments/20160212/c1a4f0c0/attachment.pgp>


[sqlite] query Benchmark

2016-02-12 Thread Michele Pradella
Why this query
PRAGMA case_sensitive_like=ON;EXPLAIN QUERY PLAN SELECT * FROM 
car_plates INDEXED BY car_plates_plate
WHERE ((CarPlateType!=-1)AND((Plate LIKE 'AA00O%')OR(Plate LIKE 
'AA0O0%')))OR((CarPlateType==-1)AND((Plate LIKE '~A00O%')))

give me "Error: no query solution"

if I remove the second expression ((CarPlateType==-1)AND((Plate LIKE 
'~A00O%'))) it works

Selea s.r.l.


Michele Pradella R


SELEA s.r.l.

Via Aldo Moro 69
Italy - 46019 Cicognara (MN)
Tel +39 0375 889091
Fax +39 0375 889080
*michele.pradella at selea.com* 
*http://www.selea.com*
Il 12/02/2016 15.02, Michele Pradella ha scritto:
> the strange thing is that if you create
>
> CREATE TABLE car_plates (Id INTEGER PRIMARY KEY AUTOINCREMENT,DateTime 
> BIGINT,StringKey VARCHAR(255) UNIQUE,Plate VARCHAR(255),Type 
> BIGINT,CameraName VARCHAR(255),CameraIP VARCHAR(255),CameraMAC 
> VARCHAR(255),FileName VARCHAR(255),Country VARCHAR(255),Confidence 
> VARCHAR(255),Reason VARCHAR(255),CarPlateType BIGINT,VehicleType 
> BIGINT,GPS VARCHAR(255));
> CREATE INDEX car_plates_datetime ON car_plates(DateTime);
> CREATE INDEX car_plates_plate ON car_plates(Plate);
>
> and you do
> PRAGMA case_sensitive_like=ON;
> EXPLAIN QUERY PLAN SELECT * FROM car_plates WHERE ((Plate LIKE 
> 'EX011%')) AND 
> (DateTime>=14550588)AND(DateTime<=14552315);
>
> you will see the use of car_plates_datetime not car_plates_plate, but 
> if you force the use of the index(car_plates_plate) it will use the 
> correct index
> PRAGMA case_sensitive_like=ON;
> EXPLAIN QUERY PLAN SELECT * FROM car_plates indexed by 
> car_plates_plateWHERE ((Plate LIKE 'EX011%')) AND 
> (DateTime>=14550588)AND(DateTime<=14552315)
>
> with PRAGMA case_sensitive_like=OFF; you obviously obtain error
>
> Selea s.r.l.
>
>
>Michele Pradella R
>
>
>SELEA s.r.l.
>
> Via Aldo Moro 69
> Italy - 46019 Cicognara (MN)
> Tel +39 0375 889091
> Fax +39 0375 889080
> *michele.pradella at selea.com* 
> *http://www.selea.com*
> Il 12/02/2016 13.20, Richard Hipp ha scritto:
>> On 2/12/16, Michele Pradella  wrote:
>>> table:
>>> CREATE TABLE IF NOT EXISTS car_plates (Id INTEGER PRIMARY KEY
>>> AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate
>>> VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP
>>> VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country
>>> VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType
>>> BIGINT,VehicleType BIGINT,GPS VARCHAR(255))
>>>
>>> index:
>>> CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate)
>>>
>> Test case:
>>
>> /* Your original schema */
>> CREATE TABLE IF NOT EXISTS car_plates (Id INTEGER PRIMARY KEY
>> AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate
>> VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP
>> VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country
>> VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType
>> BIGINT,VehicleType BIGINT,GPS VARCHAR(255));
>> CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate);
>> /* Sample query with PRAGMA case_sensitive_like=OFF (the default) */
>> EXPLAIN QUERY PLAN
>> SELECT 
>> DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
>> FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN
>> (1,2,3,6,7,8)) AND (DateTime>=14550588) AND
>> (DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;
>> .print -
>> /* Sample Query with PRAGMA case_sensitive_like=ON */
>> PRAGMA case_sensitive_like=ON;
>> EXPLAIN QUERY PLAN
>> SELECT 
>> DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
>> FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN
>> (1,2,3,6,7,8)) AND (DateTime>=14550588) AND
>> (DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;
>>
>> Run the above in the shell.  See that the second query uses the index.
>>
>> Or, leave case_sensitive_like turned off (the default) but change the
>> table definition to include:
>>
>> ... Plate VARCHAR(255) COLLATE nocase, ...
>>
>> If you add the "COLLATE nocase" to the example above, you will see
>> that the index is used in the default setting, but not when PRAGMA
>> case_sensitive_like=ON.
>>
>>
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Scott Robison
On Fri, Feb 12, 2016 at 4:05 PM, J Decker  wrote:

> windows W is wide-char not utf-16.
> as much as A is ansi and isn't utf-8
>

Has Windows ever supported a wide character set that was not UCS-2 or
UTF-16? I've always understood Microsoft embraced UCS-2 specifically so
that it would not have to deal with future encoding changes. Obviously it
failed to an extent when UCS-2 was deprecated in favor of UTF-16, but since
UTF-16 is backward compatible as long as you don't need surrogate pairs, it
wasn't too painful of a transition. Especially when compared to the
plethora of 8 bit multibyte encodings.

Note: I know Windows has supported DBCS for various encodings / code pages,
but those are never passed to wide functions.

I find it kind of interesting that Microsoft takes a lot
of (deserved) flack for not adhering to standards, yet UTF-8 came about
specifically because some didn't want to use UCS-2 (then simply known as
UCS, the one and only true flavor of the Universal Character Set). Had
Microsoft come up with it first, I'm sure they'd be crucified by some of
the same people who today are critical of them for using wide characters
instead of UTF-8!

Note: I still wish they supported UTF-8 directly from the API.

-- 
Scott Robison


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
well mbstowc and vice versa only understand 16 bit encodings, and do
not generate codpairsand do generate d800-dfff characters which
are entirely illegal in utf-16 (without corresponding pair)

But; fortunately, they do end up supporting utf-8 since it's just a
stream of bytes with a nul terminator in most cases.  But for display
I defiantly had to do my own 'getCodepoint' and then index the font..
I'd imagine that applications like IE handle it internally too... but
definitely the console has some issues.

On Fri, Feb 12, 2016 at 3:42 PM, Scott Robison  
wrote:
> On Fri, Feb 12, 2016 at 4:05 PM, J Decker  wrote:
>
>> windows W is wide-char not utf-16.
>> as much as A is ansi and isn't utf-8
>>
>
> Has Windows ever supported a wide character set that was not UCS-2 or
> UTF-16? I've always understood Microsoft embraced UCS-2 specifically so
> that it would not have to deal with future encoding changes. Obviously it
> failed to an extent when UCS-2 was deprecated in favor of UTF-16, but since
> UTF-16 is backward compatible as long as you don't need surrogate pairs, it
> wasn't too painful of a transition. Especially when compared to the
> plethora of 8 bit multibyte encodings.
>
> Note: I know Windows has supported DBCS for various encodings / code pages,
> but those are never passed to wide functions.
>
> I find it kind of interesting that Microsoft takes a lot
> of (deserved) flack for not adhering to standards, yet UTF-8 came about
> specifically because some didn't want to use UCS-2 (then simply known as
> UCS, the one and only true flavor of the Universal Character Set). Had
> Microsoft come up with it first, I'm sure they'd be crucified by some of
> the same people who today are critical of them for using wide characters
> instead of UTF-8!
>
> Note: I still wish they supported UTF-8 directly from the API.
>
> --
> Scott Robison
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
windows W is wide-char not utf-16.
as much as A is ansi and isn't utf-8

On Fri, Feb 12, 2016 at 1:07 PM, Warren Young  wrote:
> On Feb 12, 2016, at 1:49 PM, Clemens Ladisch  wrote:
>>
>> Olivier Mascia wrote:
>>> Are there Windows platforms, supported by SQLite source code of course, 
>>> where the 'W' version of the APIs are not available?
>>
>> Once upon a time, SQLite supported Windows 95/98/Me.
>
> The DOS-based versions of Windows still have the ?W? functions for binary 
> compatibility with the NT-based versions, but for the most part they treat 
> their arguments according to the 8-bit code page or MBCS rules, which means 
> you generally get garbage output when you feed in UCS-2.
>
> There are a few exceptions: https://support.microsoft.com/en-us/kb/210341
>
> Note that Windows didn?t move from UCS-2 to UTF-16 until Windows 2000, which 
> is effectively after the development time of the DOS-based versions of 
> Windows.  (There?s a tiny overlap there with Windows ME, but that?s last-gasp 
> stuff.)
>
> I assume if you pass strings using characters beyond the BMP to the ?16? APIs 
> in SQLite, they would do the wrong thing on Windows NT 3.x and 4.x systems, 
> too.
>
> I doubt there would be much crying if SQLite dropped the ?A? support.  I 
> suspect the only reason SQLite still has it is that it?s more work to remove 
> it than to leave it alone.
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] query Benchmark

2016-02-12 Thread Michele Pradella
the strange thing is that if you create

CREATE TABLE car_plates (Id INTEGER PRIMARY KEY AUTOINCREMENT,DateTime 
BIGINT,StringKey VARCHAR(255) UNIQUE,Plate VARCHAR(255),Type 
BIGINT,CameraName VARCHAR(255),CameraIP VARCHAR(255),CameraMAC 
VARCHAR(255),FileName VARCHAR(255),Country VARCHAR(255),Confidence 
VARCHAR(255),Reason VARCHAR(255),CarPlateType BIGINT,VehicleType 
BIGINT,GPS VARCHAR(255));
CREATE INDEX car_plates_datetime ON car_plates(DateTime);
CREATE INDEX car_plates_plate ON car_plates(Plate);

and you do
PRAGMA case_sensitive_like=ON;
EXPLAIN QUERY PLAN SELECT * FROM car_plates WHERE ((Plate LIKE 
'EX011%')) AND (DateTime>=14550588)AND(DateTime<=14552315);

you will see the use of car_plates_datetime not car_plates_plate, but if 
you force the use of the index(car_plates_plate) it will use the correct 
index
PRAGMA case_sensitive_like=ON;
EXPLAIN QUERY PLAN SELECT * FROM car_plates indexed by 
car_plates_plateWHERE ((Plate LIKE 'EX011%')) AND 
(DateTime>=14550588)AND(DateTime<=14552315)

with PRAGMA case_sensitive_like=OFF; you obviously obtain error

Selea s.r.l.


Michele Pradella R


SELEA s.r.l.

Via Aldo Moro 69
Italy - 46019 Cicognara (MN)
Tel +39 0375 889091
Fax +39 0375 889080
*michele.pradella at selea.com* 
*http://www.selea.com*
Il 12/02/2016 13.20, Richard Hipp ha scritto:
> On 2/12/16, Michele Pradella  wrote:
>> table:
>> CREATE TABLE IF NOT EXISTS car_plates (Id INTEGER PRIMARY KEY
>> AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate
>> VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP
>> VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country
>> VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType
>> BIGINT,VehicleType BIGINT,GPS VARCHAR(255))
>>
>> index:
>> CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate)
>>
> Test case:
>
> /* Your original schema */
> CREATE TABLE IF NOT EXISTS car_plates (Id INTEGER PRIMARY KEY
> AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate
> VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP
> VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country
> VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType
> BIGINT,VehicleType BIGINT,GPS VARCHAR(255));
> CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate);
> /* Sample query with PRAGMA case_sensitive_like=OFF (the default) */
> EXPLAIN QUERY PLAN
> SELECT 
> DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
> FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN
> (1,2,3,6,7,8)) AND (DateTime>=14550588) AND
> (DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;
> .print -
> /* Sample Query with PRAGMA case_sensitive_like=ON */
> PRAGMA case_sensitive_like=ON;
> EXPLAIN QUERY PLAN
> SELECT 
> DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
> FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN
> (1,2,3,6,7,8)) AND (DateTime>=14550588) AND
> (DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;
>
> Run the above in the shell.  See that the second query uses the index.
>
> Or, leave case_sensitive_like turned off (the default) but change the
> table definition to include:
>
> ... Plate VARCHAR(255) COLLATE nocase, ...
>
> If you add the "COLLATE nocase" to the example above, you will see
> that the index is used in the default setting, but not when PRAGMA
> case_sensitive_like=ON.
>
>



[sqlite] Bug: LEFT JOIN on view confusing the query optimiser

2016-02-12 Thread Clemens Ladisch
Dave Blake wrote:
> I noticed my queries going very slowly after changing a join to a left
> join, examination of the query plan showed that the optimiser was choosing
> a poor plan.

It chooses a _correct_ plan.

> It only occurs when the left join is on a views, if explicit
> tables are used instead then the problem does not occur.

Your queries are quite different.  (One less LEFT.)

This is the equivalent query with tables:

SELECT song.*, song_artist.*, artist.*
FROM song
LEFT JOIN (song_artist JOIN artist
   ON song_artist.idArtist = artist.idArtist
  ) AS songartistview
 ON song.idSong = songartistview.idSong
WHERE song.idSong =1;

It has exactly the same query plan as the query with the view.

> In trying to get the artist(s) for a song ...

A query to get the artist(s) for a song would look like this:

SELECT *
FROM artist
WHERE idArtist IN (SELECT idArtist
   FROM song_artist
   WHERE idSong = 1);

What is your query actually supposed to do?


Regards,
Clemens


[sqlite] Unicode issue on windows consoles. Was: Version 3.11.0 beta

2016-02-12 Thread Clemens Ladisch
Olivier Mascia wrote:
> it looks evident that the command-line tool of sqlite3 is meant to be
> compiled as a narrow I/O program. It uses the standard main() and not
> wmain(), ... and does not attempt to use the wide chars in or out
> channels.

That's just because UTF-8 is the most portable encoding; it works by
default on every OS except Windows.

> WriteConsoleW ... introduces the will to use widechar output within
> a program designed for narrow output at first.

A conversion to the console code page would have to go through wide
characters anyway.  I do not understand what technical objection you
have against UTF-16, but on Windows, it cannot be avoided.

> Most people using sqlite3.exe interactively (on windows) won't expect
> to have correct console output for international texts

Oh yes they do.  Nowadays, knowledge of codepages is very rare.


Regards,
Clemens


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Warren Young
On Feb 12, 2016, at 1:49 PM, Clemens Ladisch  wrote:
> 
> Olivier Mascia wrote:
>> Are there Windows platforms, supported by SQLite source code of course, 
>> where the 'W' version of the APIs are not available?
> 
> Once upon a time, SQLite supported Windows 95/98/Me.

The DOS-based versions of Windows still have the ?W? functions for binary 
compatibility with the NT-based versions, but for the most part they treat 
their arguments according to the 8-bit code page or MBCS rules, which means you 
generally get garbage output when you feed in UCS-2.

There are a few exceptions: https://support.microsoft.com/en-us/kb/210341

Note that Windows didn?t move from UCS-2 to UTF-16 until Windows 2000, which is 
effectively after the development time of the DOS-based versions of Windows.  
(There?s a tiny overlap there with Windows ME, but that?s last-gasp stuff.)

I assume if you pass strings using characters beyond the BMP to the ?16? APIs 
in SQLite, they would do the wrong thing on Windows NT 3.x and 4.x systems, too.

I doubt there would be much crying if SQLite dropped the ?A? support.  I 
suspect the only reason SQLite still has it is that it?s more work to remove it 
than to leave it alone.


[sqlite] query Benchmark

2016-02-12 Thread Michele Pradella
No...I do not understand, try dump file and re-import but the result 
it's always the same
even if I use PRAGMA case_sensitive_like=ON;
the query
PRAGMA case_sensitive_like=ON;
EXPLAIN QUERY PLAN
SELECT 
DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN
(1,2,3,6,7,8)) AND (DateTime>=14550588) AND
(DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;

the index selected is the one on DateTime
0|0|0|SEARCH TABLE car_plates USING INDEX car_plates_datetime 
(DateTime>? AND DateTimemailto:michele.pradella at selea.com>
*http://www.selea.com*
Il 12/02/2016 13.44, Michele Pradella ha scritto:
> ok, assume casr_sensitive_like=OFF (default), according the point 6 of 
> LIKE optimization:
> http://www.sqlite.org/optoverview.html
> should be the same to have my table definition and
>
> CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate 
> COLLATE NOCASE);
>
> Correct? because that way LIKE do not use index.
>
>
> Anyway about the test case you propose, the first EXPLAIN do not use 
> index, the second (after PRAGMA case_sensitive_like=ON;) correctly use 
> the index.
> I tried the same in an DB I created month ago with the same 
> structure...but it doesn't workprobably the DB was created with a 
> earlier sqlite version and I do not know if this can cause the use of 
> wrong index.
>
> So speaking about performance, which is better PRAGMA 
> case_sensitive_like=ON; or PRAGMA case_sensitive_like=OFF;?
>
>
>
> Selea s.r.l.
>
>
>Michele Pradella R
>
>
>SELEA s.r.l.
>
> Via Aldo Moro 69
> Italy - 46019 Cicognara (MN)
> Tel +39 0375 889091
> Fax +39 0375 889080
> *michele.pradella at selea.com* 
> *http://www.selea.com*
> Il 12/02/2016 13.20, Richard Hipp ha scritto:
>> On 2/12/16, Michele Pradella  wrote:
>>> table:
>>> CREATE TABLE IF NOT EXISTS car_plates (Id INTEGER PRIMARY KEY
>>> AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate
>>> VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP
>>> VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country
>>> VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType
>>> BIGINT,VehicleType BIGINT,GPS VARCHAR(255))
>>>
>>> index:
>>> CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate)
>>>
>> Test case:
>>
>> /* Your original schema */
>> CREATE TABLE IF NOT EXISTS car_plates (Id INTEGER PRIMARY KEY
>> AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate
>> VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP
>> VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country
>> VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType
>> BIGINT,VehicleType BIGINT,GPS VARCHAR(255));
>> CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate);
>> /* Sample query with PRAGMA case_sensitive_like=OFF (the default) */
>> EXPLAIN QUERY PLAN
>> SELECT 
>> DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
>> FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN
>> (1,2,3,6,7,8)) AND (DateTime>=14550588) AND
>> (DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;
>> .print -
>> /* Sample Query with PRAGMA case_sensitive_like=ON */
>> PRAGMA case_sensitive_like=ON;
>> EXPLAIN QUERY PLAN
>> SELECT 
>> DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
>> FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN
>> (1,2,3,6,7,8)) AND (DateTime>=14550588) AND
>> (DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;
>>
>> Run the above in the shell.  See that the second query uses the index.
>>
>> Or, leave case_sensitive_like turned off (the default) but change the
>> table definition to include:
>>
>> ... Plate VARCHAR(255) COLLATE nocase, ...
>>
>> If you add the "COLLATE nocase" to the example above, you will see
>> that the index is used in the default setting, but not when PRAGMA
>> case_sensitive_like=ON.
>>
>>
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



[sqlite] Bug: LEFT JOIN on view confusing the query optimiser

2016-02-12 Thread Dave Blake
Thank you for looking at this.

>It chooses a _correct_ plan.

Really? With query 1) to select a song it first scans the song_artist table
that could contain many thousands of records. That does not seem right.

>Your queries are quite different.

They produce the same result sets, but yes one uses a view and the other
the constitent tables.

>What is your query actually supposed to do?

I have simplified things to make the issue clear, obviously unsuccesfuly.
In my application the query is more complex.

In this example there is a many to many relationship between songs and
artists resolved using a song_artist link table. Both 1) and 2) return the
song fields and the artist fields for all the artists for a chosen song.
More realistically the song and artist tables would have many more columns,
and the where clause be more involved and result in more than one song. But
none of those things are relevent to the issue.

Importantly you say

>It has exactly the same query plan as the query with the view.

That is not what I am getting. I have listed the query plans I get with
3.8.10.1, what version are you using and what query plans do you get?

On 12 February 2016 at 13:34, Clemens Ladisch  wrote:

> Dave Blake wrote:
> > I noticed my queries going very slowly after changing a join to a left
> > join, examination of the query plan showed that the optimiser was
> choosing
> > a poor plan.
>
> It chooses a _correct_ plan.
>
> > It only occurs when the left join is on a views, if explicit
> > tables are used instead then the problem does not occur.
>
> Your queries are quite different.  (One less LEFT.)
>
> This is the equivalent query with tables:
>
> SELECT song.*, song_artist.*, artist.*
> FROM song
> LEFT JOIN (song_artist JOIN artist
>ON song_artist.idArtist = artist.idArtist
>   ) AS songartistview
>  ON song.idSong = songartistview.idSong
> WHERE song.idSong =1;
>
> It has exactly the same query plan as the query with the view.
>
> > In trying to get the artist(s) for a song ...
>
> A query to get the artist(s) for a song would look like this:
>
> SELECT *
> FROM artist
> WHERE idArtist IN (SELECT idArtist
>FROM song_artist
>WHERE idSong = 1);
>
> What is your query actually supposed to do?
>
>
> Regards,
> Clemens
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] Unicode issue on windows consoles. Was: Version 3.11.0 beta

2016-02-12 Thread Dominique Devienne
On Fri, Feb 12, 2016 at 11:36 AM, Olivier Mascia  wrote:

> > So it would be a bad idea to change sqlite3's output depending on the
> > current code page or font.
>
> I share your view on keeping things as unicode as possible, but the
> frontier is thin before trying to impose our views onto Windows usages and
> Windows command-line users expectations.  That's why I tend to favor the
> path of least resistance: just fit whatever the default
> behavior/expectation is for the a (narrow) command-line tool writing some
> text to the console.
>

The solution IMHO is to introduce a clean abstraction in shell.c for
input/output.

i.e. a Virtual Console Layer (VCL), similar to the existing Virtual File
Layer (VFL) of sqlite3.c (amalgamation).
The VCL would assume Unicode (e.g. UTF-8) on the "shell side" of the APIs,
and the implementation of the VCL
takes care of the platform-specific and console-specific "details" to get
or render those UTF-8 encoded chars
as best of the current "console" can or allows.

And this would also fill my need to be able to embed the SQLite3 shell in a
GUI app for example,
simply by writing my own VCL, talking to Qt for example.

The above wouldn't preclude Dr Hipp from evolving shell.c as he sees fit,
even in BC-breaking ways,
since the contract would be clear that the only public API would be to get
the text in and out of the shell
to the "console". The only thing necessary is then an alternative entry
point than main() via conditional
compilation basically, and a way to inject one's own VCL.


> Most people using sqlite3.exe interactively (on windows) won't expect to
> have correct console output for international texts (stored in the db)
> using scripts which are unusual for their system. But most users will
> expect to be able to type and see characters which are usual for them and
> their system.  A user on a Windows set for typical Western Europe support
> (CP850 for OEM and CP1252 for ANSI), will expect to be able to type and see
> correctly things like '?', '?', '?', '?'. And won't be surprised to have a
> display issue with characters more exotic to them (those are characters
> which they would be at pain trying to type with their keyboard). Yet those
> concerned can use the standard Windows command CHCP to check or change
> their console code page to their liking. Provided that the font configured
> in their cmd session is able to, they will then be able to see those other
> characters (and copy them properly).
>

Seeing those characters properly in the Windows console is not enough!

The DB file itself must really contain properly encoded UTF-8 text for them,
such that the same DB on Linux or another SQLite-based app which really
supports Unicode
also can see them correctly.

The former is already true in places, but I don't think the latter is
(although I have tested it yet). --DD


[sqlite] query Benchmark

2016-02-12 Thread Michele Pradella
ok, assume casr_sensitive_like=OFF (default), according the point 6 of 
LIKE optimization:
http://www.sqlite.org/optoverview.html
should be the same to have my table definition and

CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate COLLATE NOCASE);

Correct? because that way LIKE do not use index.


Anyway about the test case you propose, the first EXPLAIN do not use index, the 
second (after PRAGMA case_sensitive_like=ON;) correctly use the index.
I tried the same in an DB I created month ago with the same structure...but it 
doesn't workprobably the DB was created with a earlier sqlite version and I 
do not know if this can cause the use of wrong index.

So speaking about performance, which is better PRAGMA case_sensitive_like=ON; 
or PRAGMA case_sensitive_like=OFF;?



Selea s.r.l.


Michele Pradella R


SELEA s.r.l.

Via Aldo Moro 69
Italy - 46019 Cicognara (MN)
Tel +39 0375 889091
Fax +39 0375 889080
*michele.pradella at selea.com* 
*http://www.selea.com*
Il 12/02/2016 13.20, Richard Hipp ha scritto:
> On 2/12/16, Michele Pradella  wrote:
>> table:
>> CREATE TABLE IF NOT EXISTS car_plates (Id INTEGER PRIMARY KEY
>> AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate
>> VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP
>> VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country
>> VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType
>> BIGINT,VehicleType BIGINT,GPS VARCHAR(255))
>>
>> index:
>> CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate)
>>
> Test case:
>
> /* Your original schema */
> CREATE TABLE IF NOT EXISTS car_plates (Id INTEGER PRIMARY KEY
> AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate
> VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP
> VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country
> VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType
> BIGINT,VehicleType BIGINT,GPS VARCHAR(255));
> CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate);
> /* Sample query with PRAGMA case_sensitive_like=OFF (the default) */
> EXPLAIN QUERY PLAN
> SELECT 
> DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
> FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN
> (1,2,3,6,7,8)) AND (DateTime>=14550588) AND
> (DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;
> .print -
> /* Sample Query with PRAGMA case_sensitive_like=ON */
> PRAGMA case_sensitive_like=ON;
> EXPLAIN QUERY PLAN
> SELECT 
> DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
> FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN
> (1,2,3,6,7,8)) AND (DateTime>=14550588) AND
> (DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;
>
> Run the above in the shell.  See that the second query uses the index.
>
> Or, leave case_sensitive_like turned off (the default) but change the
> table definition to include:
>
> ... Plate VARCHAR(255) COLLATE nocase, ...
>
> If you add the "COLLATE nocase" to the example above, you will see
> that the index is used in the default setting, but not when PRAGMA
> case_sensitive_like=ON.
>
>



[sqlite] query Benchmark

2016-02-12 Thread Michele Pradella
table:
CREATE TABLE IF NOT EXISTS car_plates (Id INTEGER PRIMARY KEY 
AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate 
VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP 
VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country 
VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType 
BIGINT,VehicleType BIGINT,GPS VARCHAR(255))

index:
CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate)


query sample:
SELECT 
DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
 
FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN 
(1,2,3,6,7,8)) AND (DateTime>=14550588) AND 
(DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;

Anyway I tried even with :
SELECT 
DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
 
FROM car_plates WHERE ((Plate LIKE 'EX011%'))

SELECT 
DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
 
FROM car_plates INDEXED BY  car_plates_plate WHERE ((Plate LIKE 'EX011%'))
give me error

Selea s.r.l.


Michele Pradella R


SELEA s.r.l.

Via Aldo Moro 69
Italy - 46019 Cicognara (MN)
Tel +39 0375 889091
Fax +39 0375 889080
*michele.pradella at selea.com* 
*http://www.selea.com*
Il 12/02/2016 13.06, Richard Hipp ha scritto:
> On 2/12/16, Michele Pradella  wrote:
>> I try using COLLATE NOCASE index with PRAGMA case_sensitive_like=OFF;
>> and COLLATE BINARY index with PRAGMA case_sensitive_like=ON;
>> but I have always the query with field LIKE 'AA%' that can't use index
>> on field "No Query solution" is reported by shell if you try to force
>> index. And with explain query plan the index on field is not used.
>> Do you think I'm doing something wrong?
>>
> Yes I do.
>
> Please post your schema and your query and we will have a look.



[sqlite] Andl: choices for generic types: bool, binary, number/decimal, date/datetime, text/varchar

2016-02-12 Thread Rowan Worth
On 10 February 2016 at 11:14,  wrote:

> >>>Every SQLite database file has a text encoding that applies to the
> entire
> file:  one of utf8, utf16be, or utf16le.  The database text encoding is
> stored in the header.  You can see the encoding for a particular database
> using:
>
>  sqlite3 DATABASE.db .dbinfo
>
> >>>(NB: The ".dbinfo" command is relatively recent, so you'll want the
> latest version of sqlite3.exe for this to work.)
>
> Mine is 3.8.6 2014-08-15 11:46:33 9491ba7d738528f168657adb43a198238abde19e.
> I guess I need something newer. That does explain why I couldn't find out.
>

"PRAGMA encoding;" seems to do the trick in earlier versions (3.8.1 here).
 -Rowan


[sqlite] query Benchmark

2016-02-12 Thread Michele Pradella
I try using COLLATE NOCASE index with PRAGMA case_sensitive_like=OFF;
and COLLATE BINARY index with PRAGMA case_sensitive_like=ON;
but I have always the query with field LIKE 'AA%' that can't use index 
on field "No Query solution" is reported by shell if you try to force 
index. And with explain query plan the index on field is not used.
Do you think I'm doing something wrong?

Selea s.r.l.


Michele Pradella R


SELEA s.r.l.

Via Aldo Moro 69
Italy - 46019 Cicognara (MN)
Tel +39 0375 889091
Fax +39 0375 889080
*michele.pradella at selea.com* 
*http://www.selea.com*
Il 12/02/2016 11.23, Simon Slavin ha scritto:
> On 12 Feb 2016, at 10:14am, Michele Pradella  
> wrote:
>
>> I can make the query:
>> SELECT field FROM car_plates WHERE (field>='EX011A')AND(field<='EX011Z');
>> now I can use the index the query is faster.
> SQLite makes this optimization for you.  See section 4.0 of
>
> 
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



[sqlite] Bug: LEFT JOIN on view confusing the query optimiser

2016-02-12 Thread Dave Blake
I noticed my queries going very slowly after changing a join to a left
join, examination of the query plan showed that the optimiser was choosing
a poor plan. It only occurs when the left join is on a views, if explicit
tables are used instead then the problem does not occur.

To repeat the issue try this example.
CREATE TABLE song ( idSong integer primary key, strTitle varchar(512))
CREATE TABLE song_artist ( idArtist integer, idSong integer)
CREATE TABLE artist (idArtist integer primary key, strArtist
varchar(256), strBio  text)

CREATE INDEX idxSongArtist_1 ON song_artist ( idSong);
CREATE INDEX idxSongArtist_2 ON song_artist ( idArtist);

CREATE VIEW songartistview AS SELECT song_artist.idSong AS idSong,
song_artist.idArtist AS idArtist, artist.strArtist AS strArtist FROM
song_artist JOIN artist ON song_artist.idArtist = artist.idArtist

In the data there can be songs with no artist, and artists with no song
(hence the use of left join)

Compare the query plan of
1) SELECT song.*, songartistview.* FROM song LEFT JOIN songartistview ON
song.idSong = songartistview.idSong WHERE song.idSong =1

with
2) SELECT song.*, song_artist.*, artist.* FROM song LEFT JOIN song_artist
ON song.idSong = song_artist.idSong LEFT JOIN artist ON
song_artist.idArtist = artist.idArtist WHERE song.idSong =1


Query 2) sensibly does
"SEARCH TABLE song USING INTEGER PRIMARY KEY (rowid=?)"
"SEARCH TABLE song_artist USING INDEX idxSongArtist_1 (idSong=?)"
"SEARCH TABLE artist USING INTEGER PRIMARY KEY (rowid=?)"

But the view equivalent 1) does
"SCAN TABLE song_artist"
"SEARCH TABLE artist USING INTEGER PRIMARY KEY (rowid=?)"
"SEARCH TABLE song USING INTEGER PRIMARY KEY (rowid=?)"
"SCAN SUBQUERY 1"

In trying to get the artist(s) for a song, scanning the full song_artist
table is not an efficient place to start!
Note this is a greatly simplified example to show the issue, in real
application the impact of a suboptimal plan is significant.

My testing was done mostly in v3.8.10.2, but this also happens in v3.8.6,
yet does not seem to occur in v3.8.3.1

Running ANALYZE on my real world data made the issue even worse (plan
started with scan of artist table instead). My current work around is to
convert my views into tables, but it would be nice to be able to use views
as an alias for some table joins. I did consider using "CROSS JOIN" because
SQLite effectively treats that as a LEFT JOIN but does not left the
optimiser change the table order, however that makes my SQL non-generic and
my queries need to work on MySQL too.


[sqlite] query Benchmark

2016-02-12 Thread Michele Pradella
Moreover if I make field LIKE 'AA%' and I use INDEXED BY index_on_field
shell return me "Error: no query solution"
so it's seams sqlite with LIKE operator can't use index on field

Selea s.r.l.


Michele Pradella R


SELEA s.r.l.

Via Aldo Moro 69
Italy - 46019 Cicognara (MN)
Tel +39 0375 889091
Fax +39 0375 889080
*michele.pradella at selea.com* 
*http://www.selea.com*
Il 12/02/2016 11.39, Michele Pradella ha scritto:
> Ok understood...anyway trying with sqlite shell it's seams that (filed 
> LIKE 'AA%') is slower than (field>='AAA' AND field<='AAZ')
> do you think there's a way I can check if the optimization is working?
>
> Selea s.r.l.
>
>
>Michele Pradella R
>
>
>SELEA s.r.l.
>
> Via Aldo Moro 69
> Italy - 46019 Cicognara (MN)
> Tel +39 0375 889091
> Fax +39 0375 889080
> *michele.pradella at selea.com* 
> *http://www.selea.com*
> Il 12/02/2016 11.23, Simon Slavin ha scritto:
>> On 12 Feb 2016, at 10:14am, Michele Pradella 
>>  wrote:
>>
>>> I can make the query:
>>> SELECT field FROM car_plates WHERE 
>>> (field>='EX011A')AND(field<='EX011Z');
>>> now I can use the index the query is faster.
>> SQLite makes this optimization for you.  See section 4.0 of
>>
>> 
>>
>> Simon.
>> ___
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



[sqlite] query Benchmark

2016-02-12 Thread Michele Pradella
Ok understood...anyway trying with sqlite shell it's seams that (filed 
LIKE 'AA%') is slower than (field>='AAA' AND field<='AAZ')
do you think there's a way I can check if the optimization is working?

Selea s.r.l.


Michele Pradella R


SELEA s.r.l.

Via Aldo Moro 69
Italy - 46019 Cicognara (MN)
Tel +39 0375 889091
Fax +39 0375 889080
*michele.pradella at selea.com* 
*http://www.selea.com*
Il 12/02/2016 11.23, Simon Slavin ha scritto:
> On 12 Feb 2016, at 10:14am, Michele Pradella  
> wrote:
>
>> I can make the query:
>> SELECT field FROM car_plates WHERE (field>='EX011A')AND(field<='EX011Z');
>> now I can use the index the query is faster.
> SQLite makes this optimization for you.  See section 4.0 of
>
> 
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



[sqlite] Unicode issue on windows consoles. Was: Version 3.11.0 beta

2016-02-12 Thread Olivier Mascia
Clemens,

> Le 12 f?vr. 2016 ? 06:43, Clemens Ladisch  a ?crit :
> 
>> Convert from UTF8 to the codepage returned by GetConsoleOutputCP().
> 
> This would corrupt any characters not in that code page.
> 
> When we need to handle Windows console I/O specially, we could just as
> well use _O_U8TEXT, or ReadConsoleW/WriteConsoleW, without lossy
> conversions through a code page.


I very well understand your concern.

( Please understand that all comments that follow are based on a fact: it looks 
evident that the command-line tool of sqlite3 is meant to be compiled as a 
narrow I/O program. It uses the standard main() and not wmain(), gets argv[] 
which are the narrow ones and not the wide chars version and does not attempt 
to use the wide chars in or out channels. )

_O_U8TEXT is something specific introduced by some visual studio version (I 
think VS.2005 - to be verified, but probably not something we can really expect 
to find in all other compilers).  I tend to prefer a platform specific feature 
over a compiler (or compiler's runtime library) specific feature.

WriteConsoleW is interesting, but will require more conditional code changes to 
shell.c than may be desirable. It also introduces the will to use widechar 
output within a program designed for narrow output at first.

SetConsoleOutputCP(65001) and leaving fprintf alone (not using 
shell.c::utf8_printf) more or less work, but I really fear issues with older 
Windows versions (than the Windows 10 I'm using right now). It also has a 
side-effect: that setting persists for command-line session after sqlite3.exe 
has quit, so great care should be taken to restore it on exits. It has half the 
effect of using the standard command CHCP.

> 
>> [...] unless the display font actually supports unicode and UTF8
>> encoding, display issues can appear.
> 
> If the font does not have the needed glyphs, the console will display
> boxes, but it is still possible to copy the correct character values to
> the clipboard.
> 
> So it would be a bad idea to change sqlite3's output depending on the
> current code page or font.

I share your view on keeping things as unicode as possible, but the frontier is 
thin before trying to impose our views onto Windows usages and Windows 
command-line users expectations.  That's why I tend to favor the path of least 
resistance: just fit whatever the default behavior/expectation is for the a 
(narrow) command-line tool writing some text to the console.

Most people using sqlite3.exe interactively (on windows) won't expect to have 
correct console output for international texts (stored in the db) using scripts 
which are unusual for their system. But most users will expect to be able to 
type and see characters which are usual for them and their system.  A user on a 
Windows set for typical Western Europe support (CP850 for OEM and CP1252 for 
ANSI), will expect to be able to type and see correctly things like '?', '?', 
'?', '?'. And won't be surprised to have a display issue with characters more 
exotic to them (those are characters which they would be at pain trying to type 
with their keyboard). Yet those concerned can use the standard Windows command 
CHCP to check or change their console code page to their liking. Provided that 
the font configured in their cmd session is able to, they will then be able to 
see those other characters (and copy them properly).

What I'm proposing is that sqlite3.exe merely follows (for text sent to the 
interactive console) the current output setting of that console. Leaving it to 
the user to change the console setting if wanted/needed and permitted by the 
Windows version at hand. Certainly not a perfect solution, but it is much 
better than the current situation which is completely wrong, and it does not 
try to correct Windows inherently wrong unicode support.

--
Meilleures salutations, Met vriendelijke groeten, Best Regards,
Olivier Mascia, integral.be/om


-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 842 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: 
<http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/attachments/20160212/c8497d08/attachment.pgp>


[sqlite] query Benchmark

2016-02-12 Thread Michele Pradella
Just to understand, if I have the query with LIKE simplified:
SELECT field FROM car_plates WHERE (field LIKE 'EX011%');
I can't use index by field
now suppose you know text value you can have in field, I can make the query:
SELECT field FROM car_plates WHERE (field>='EX011A')AND(field<='EX011Z');
now I can use the index the query is faster. Do you think the results 
should be the same (consider field that can have only 2 letters after 
EX011)?

generalizing the behavior is (field LIKE 'EX011%') tha same as 
(field>='EX011*lowestASCII*')AND(field<='EX011*greatestASCII*')?

Selea s.r.l.


Michele Pradella R


SELEA s.r.l.

Via Aldo Moro 69
Italy - 46019 Cicognara (MN)
Tel +39 0375 889091
Fax +39 0375 889080
*michele.pradella at selea.com* 
*http://www.selea.com*
Il 12/02/2016 10.28, Michele Pradella ha scritto:
> I'm already using my software. The scope of test my query with sqlite 
> shell is to have a test environment without other interaction, just to 
> test witch is the fast version of my query to use in my software.
> Another question is, if I have a query like this:
> SELECT *field* FROM car_plates WHERE (*field* LIKE 'EX011%') AND 
> (DateTime>=14550588) AND (DateTime<=14552315) ORDER BY 
> DateTime;
>
> If I try to force the use of an index on *field *but I think sqlite 
> can't use it, is it right?
>
> Selea s.r.l.
>
>
>Michele Pradella R
>
>
>SELEA s.r.l.
>
> Via Aldo Moro 69
> Italy - 46019 Cicognara (MN)
> Tel +39 0375 889091
> Fax +39 0375 889080
> *michele.pradella at selea.com* 
> *http://www.selea.com*
> Il 12/02/2016 10.23, Simon Slavin ha scritto:
>> On 12 Feb 2016, at 8:47am, Michele Pradella 
>>  wrote:
>>
>>> Hi all, is there a way to make a benchmark of queries to check which 
>>> version is faster? I'm using sqlite shell, the question is about on 
>>> how to make repetitive tests in the same conditions (for example I 
>>> need to totally disable cache to avoid different results the second 
>>> time query is executed).
>> Using the ".timer ON" command in the command-line shell, as you 
>> write, is the best way to do timing.  But there are many levels of 
>> cache in your computer and there's no way to disable them all.  You 
>> will definitely get timings influenced by interactions.
>>
>> It looks like you are trying to make commands run as fast as 
>> possible.  Unless you are writing an academic paper on theory, this 
>> is probably the wrong thing to do.  Write your software and see if it 
>> runs fast enough.  Only if it runs too slowly do you need to start 
>> worrying about speed.
>>
>> We can advise on good SQL commands and good indexes for you if you do 
>> need to increase your speed.
>>
>> Simon.
>> ___
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



[sqlite] query Benchmark

2016-02-12 Thread Richard Hipp
On 2/12/16, Michele Pradella  wrote:
> Why this query
> PRAGMA case_sensitive_like=ON;EXPLAIN QUERY PLAN SELECT * FROM
> car_plates INDEXED BY car_plates_plate
> WHERE ((CarPlateType!=-1)AND((Plate LIKE 'AA00O%')OR(Plate LIKE
> 'AA0O0%')))OR((CarPlateType==-1)AND((Plate LIKE '~A00O%')))
>
> give me "Error: no query solution"
>
> if I remove the second expression ((CarPlateType==-1)AND((Plate LIKE
> '~A00O%'))) it works
>

All of your questions are answered in the documentation, in the links
provided in previous replies.  The fact that you are still asking
questions tells me that I need to work on improving the documentation
so that it is easier to understand.  Perhaps a table that shows the
various combinations of COLLATE and PRAGMA case_sensitive_like work
together.  I dunno.  I'll take the action item to revisit the
documentation and try to improve it.

Meanwhile, please go reread the documentation yourself, slowly and
carefully, and see if this doesn't all suddenly make sense.
-- 
D. Richard Hipp
drh at sqlite.org


[sqlite] query Benchmark

2016-02-12 Thread Michele Pradella
I'm already using my software. The scope of test my query with sqlite 
shell is to have a test environment without other interaction, just to 
test witch is the fast version of my query to use in my software.
Another question is, if I have a query like this:
SELECT *field* FROM car_plates WHERE (*field* LIKE 'EX011%') AND 
(DateTime>=14550588) AND (DateTime<=14552315) ORDER BY 
DateTime;

If I try to force the use of an index on *field *but I think sqlite 
can't use it, is it right?

Selea s.r.l.


Michele Pradella R


SELEA s.r.l.

Via Aldo Moro 69
Italy - 46019 Cicognara (MN)
Tel +39 0375 889091
Fax +39 0375 889080
*michele.pradella at selea.com* 
*http://www.selea.com*
Il 12/02/2016 10.23, Simon Slavin ha scritto:
> On 12 Feb 2016, at 8:47am, Michele Pradella  
> wrote:
>
>> Hi all, is there a way to make a benchmark of queries to check which version 
>> is faster? I'm using sqlite shell, the question is about on how to make 
>> repetitive tests in the same conditions (for example I need to totally 
>> disable cache to avoid different results the second time query is executed).
> Using the ".timer ON" command in the command-line shell, as you write, is the 
> best way to do timing.  But there are many levels of cache in your computer 
> and there's no way to disable them all.  You will definitely get timings 
> influenced by interactions.
>
> It looks like you are trying to make commands run as fast as possible.  
> Unless you are writing an academic paper on theory, this is probably the 
> wrong thing to do.  Write your software and see if it runs fast enough.  Only 
> if it runs too slowly do you need to start worrying about speed.
>
> We can advise on good SQL commands and good indexes for you if you do need to 
> increase your speed.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



[sqlite] query Benchmark

2016-02-12 Thread Simon Slavin

On 12 Feb 2016, at 10:14am, Michele Pradella  
wrote:

> I can make the query:
> SELECT field FROM car_plates WHERE (field>='EX011A')AND(field<='EX011Z');
> now I can use the index the query is faster.

SQLite makes this optimization for you.  See section 4.0 of



Simon.


[sqlite] query Benchmark

2016-02-12 Thread Simon Slavin

On 12 Feb 2016, at 9:28am, Michele Pradella  
wrote:

> I'm already using my software. The scope of test my query with sqlite shell 
> is to have a test environment without other interaction, just to test witch 
> is the fast version of my query to use in my software.

Okay.  I understand why you want to do this but unfortunately it will not work. 
 You will always have cache interactions and it is extremely difficult to get 
'pretend' figures which ignore them.

> Another question is, if I have a query like this:
> SELECT *field* FROM car_plates WHERE (*field* LIKE 'EX011%') AND 
> (DateTime>=14550588) AND (DateTime<=14552315) ORDER BY 
> DateTime;
> 
> If I try to force the use of an index on *field *but I think sqlite can't use 
> it, is it right?

It might be bad to use an index based on *field* for this.  It might be better 
to use one which started with DateTime since that would provide both the 
selection of DateTime within a range, and also the ORDER BY clause.  But it 
would be best of all to let SQLite choose, since it knows more about SQLite 
than you do.  So don't force any index.

If you want to test out something like this, do the following in the shell tool:

CREATE INDEX cp_fd ON car_plates (field,DateTime);
CREATE INDEX cp_df ON car_plates (DateTime,field);
ANALYZE;
EXPLAIN QUERY PLAN SELECT *field* FROM car_plates WHERE (*field* LIKE 'EX011%') 
AND (DateTime>=14550588) AND (DateTime<=14552315) ORDER BY 
DateTime;

The output from EXPLAIN QUERY PLAN will tell you which index it has chosen to 
use.  Then you can delete the other one.

For ANALYZE to give useful results you need to have typical data in your 
tables.  SQLite will look at how 'chunky' this data is and use that to make its 
decisions.  You can run the commands with empty tables and they may not lead to 
the best results.

Simon.


[sqlite] query Benchmark

2016-02-12 Thread Michele Pradella
Hi all, is there a way to make a benchmark of queries to check which 
version is faster? I'm using sqlite shell, the question is about on how 
to make repetitive tests in the same conditions (for example I need to 
totally disable cache to avoid different results the second time query 
is executed).
I try to explain better: the same query re-arranged in two different 
queries to check which is the faster, executed on the same DB, how can I 
do speed check of those queries avoiding interaction between two 
queries(example cache)?



[sqlite] query Benchmark

2016-02-12 Thread Simon Slavin

On 12 Feb 2016, at 8:47am, Michele Pradella  
wrote:

> Hi all, is there a way to make a benchmark of queries to check which version 
> is faster? I'm using sqlite shell, the question is about on how to make 
> repetitive tests in the same conditions (for example I need to totally 
> disable cache to avoid different results the second time query is executed).

Using the ".timer ON" command in the command-line shell, as you write, is the 
best way to do timing.  But there are many levels of cache in your computer and 
there's no way to disable them all.  You will definitely get timings influenced 
by interactions.  

It looks like you are trying to make commands run as fast as possible.  Unless 
you are writing an academic paper on theory, this is probably the wrong thing 
to do.  Write your software and see if it runs fast enough.  Only if it runs 
too slowly do you need to start worrying about speed.

We can advise on good SQL commands and good indexes for you if you do need to 
increase your speed.

Simon.


[sqlite] query Benchmark

2016-02-12 Thread Richard Hipp
On 2/12/16, Michele Pradella  wrote:
> table:
> CREATE TABLE IF NOT EXISTS car_plates (Id INTEGER PRIMARY KEY
> AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate
> VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP
> VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country
> VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType
> BIGINT,VehicleType BIGINT,GPS VARCHAR(255))
>
> index:
> CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate)
>

Test case:

/* Your original schema */
CREATE TABLE IF NOT EXISTS car_plates (Id INTEGER PRIMARY KEY
AUTOINCREMENT,DateTime BIGINT,StringKey VARCHAR(255) UNIQUE,Plate
VARCHAR(255),Type BIGINT,CameraName VARCHAR(255),CameraIP
VARCHAR(255),CameraMAC VARCHAR(255),FileName VARCHAR(255),Country
VARCHAR(255),Confidence VARCHAR(255),Reason VARCHAR(255),CarPlateType
BIGINT,VehicleType BIGINT,GPS VARCHAR(255));
CREATE INDEX IF NOT EXISTS car_plates_plate ON car_plates(Plate);
/* Sample query with PRAGMA case_sensitive_like=OFF (the default) */
EXPLAIN QUERY PLAN
SELECT 
DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN
(1,2,3,6,7,8)) AND (DateTime>=14550588) AND
(DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;
.print -
/* Sample Query with PRAGMA case_sensitive_like=ON */
PRAGMA case_sensitive_like=ON;
EXPLAIN QUERY PLAN
SELECT 
DateTime,FileName,Plate,Type,CameraName,Id,Country,Reason,CarPlateType,VehicleType,GPS
FROM car_plates WHERE ((Plate LIKE 'EX011%')) AND (Type IN
(1,2,3,6,7,8)) AND (DateTime>=14550588) AND
(DateTime<=14552315) ORDER BY DateTime DESC LIMIT 2;

Run the above in the shell.  See that the second query uses the index.

Or, leave case_sensitive_like turned off (the default) but change the
table definition to include:

   ... Plate VARCHAR(255) COLLATE nocase, ...

If you add the "COLLATE nocase" to the example above, you will see
that the index is used in the default setting, but not when PRAGMA
case_sensitive_like=ON.


-- 
D. Richard Hipp
drh at sqlite.org


[sqlite] query Benchmark

2016-02-12 Thread Keith Medcalf

If and only If you follow rule 6 and declare the column with a collation 
sequence that matches the collation in effect for the LIKE operator.

So if the column/index is declared with CASE SENSITIVE collation (BINARY, the 
default), and you use like on that column, and the LIKE is case insensitive 
(the default) then the index will not be used to optimize this constraint.  You 
must change the collations somewhere so that they match.

> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
> bounces at mailinglists.sqlite.org] On Behalf Of Simon Slavin
> Sent: Friday, 12 February, 2016 03:24
> To: SQLite mailing list
> Subject: Re: [sqlite] query Benchmark
> 
> 
> On 12 Feb 2016, at 10:14am, Michele Pradella 
> wrote:
> 
> > I can make the query:
> > SELECT field FROM car_plates WHERE
> (field>='EX011A')AND(field<='EX011Z');
> > now I can use the index the query is faster.
> 
> SQLite makes this optimization for you.  See section 4.0 of
> 
> 
> 
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users





[sqlite] query Benchmark

2016-02-12 Thread Richard Hipp
On 2/12/16, Michele Pradella  wrote:
> I try using COLLATE NOCASE index with PRAGMA case_sensitive_like=OFF;
> and COLLATE BINARY index with PRAGMA case_sensitive_like=ON;
> but I have always the query with field LIKE 'AA%' that can't use index
> on field "No Query solution" is reported by shell if you try to force
> index. And with explain query plan the index on field is not used.
> Do you think I'm doing something wrong?
>

Yes I do.

Please post your schema and your query and we will have a look.
-- 
D. Richard Hipp
drh at sqlite.org


[sqlite] Unicode issue on windows consoles. Was: Version 3.11.0 beta

2016-02-12 Thread Clemens Ladisch
Olivier Mascia wrote:
>> Le 11 f?vr. 2016 ? 23:13, Clemens Ladisch  a ?crit :
>> - stdin from the console
>
> Convert from the codepage returned by GetConsoleCP() to UTF8.

>> - stdout to the console
>
> Convert from UTF8 to the codepage returned by GetConsoleOutputCP().

This would corrupt any characters not in that code page.

When we need to handle Windows console I/O specially, we could just as
well use _O_U8TEXT, or ReadConsoleW/WriteConsoleW, without lossy
conversions through a code page.

> [...] unless the display font actually supports unicode and UTF8
> encoding, display issues can appear.

If the font does not have the needed glyphs, the console will display
boxes, but it is still possible to copy the correct character values to
the clipboard.

So it would be a bad idea to change sqlite3's output depending on the
current code page or font.

>> - stdin redirected from a file
>
> Personal opinion: I'd like it to treat input as implicitly in UTF8 as today.

>> - stdout redirected to a file
>
> Personal opinion: I'd like it to output UTF8.

Of course; but we need to detect it.

(And there's the question of "sqlite3 < somefile" vs.
"type somefile | sqlite3", which might also behave differently in
"cmd /a" vs. "cmd /u"; but I do not think this is even detectable
reliably.)


Regards,
Clemens


[sqlite] Andl: choices for generic types: bool, binary, number/decimal, date/datetime, text/varchar

2016-02-12 Thread David Bennett
Just a final point -- the Sqlite text16() routines are a perfect match for
C# Marshall.PtrToStringUni(). Definitely the nicest solution for my problem,
regardless of database encoding. Thanks all.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org


-Original Message-
From: David Bennett [mailto:da...@yorkage.com] 
Sent: Wednesday, 10 February 2016 2:15 PM
To: 'SQLite mailing list' 
Subject: RE: [sqlite] Andl: choices for generic types: bool, binary,
number/decimal, date/datetime, text/varchar

>>>Every SQLite database file has a text encoding that applies to the entire
file:  one of utf8, utf16be, or utf16le.  The database text encoding is
stored in the header.  You can see the encoding for a particular database
using:

 sqlite3 DATABASE.db .dbinfo

>>>(NB: The ".dbinfo" command is relatively recent, so you'll want the
latest version of sqlite3.exe for this to work.)

Mine is 3.8.6 2014-08-15 11:46:33 9491ba7d738528f168657adb43a198238abde19e.
I guess I need something newer. That does explain why I couldn't find out.

>>>All text store in the database file uses the database text encoding.
*All Text*.  So if your database encoding is UTF16 and you do
"sqlite3_bind_text()" then SQLite automatically converts your UTF8 input
into UTF16 before storing it.  Or if you do "sqlite3_column_text()", then
SQLite will convert from UTF16 to UTF8 before returning the answer.

>>>Obviously, in order to avoid unnecessary conversions, it works best if
the database encoding matches the encoding most commonly used by your
application.

OK, I get it. I don't want to enforce an encoding, I just want it to work
with any database.

I use the C API from C# using P/Invoke, and there doesn't seem to be any
native way to do utf-8 marshalling. It would seem the best result might be
to use the text16() family [such as sqlite3_value_text16()]. There is
marshalling from 'Unicode', which is Windows for utf-16, and that should be
reasonably efficient (compared to trying to do it in managed space).I'll ask
that one over on Stack Overflow. Thanks for the tips.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org







[sqlite] Unicode issue on windows consoles. Was: Version 3.11.0 beta

2016-02-12 Thread Olivier Mascia

> Le 11 f?vr. 2016 ? 23:13, Clemens Ladisch  a ?crit :
> 
> As far as I can see, there are five problems:
> - stdin from the console

Convert from the codepage returned by GetConsoleCP() to UTF8.

> - stdin redirected from a file

Personal opinion: I'd like it to treat input as implicitly in UTF8 as today.

> - stdout to the console

Convert from UTF8 to the codepage returned by GetConsoleOutputCP().

> - stdout redirected to a file

Personal opinion: I'd like it to output UTF8.

> - command-line arguments

They're presented to the application code, through the argv[] pointers, as 
system default ANSI code page. Converting from CP_ACP to UTF8 is appropriate.

I'm adding a 6th point:
- make sure that if sqlite3 needs to present a filename to any ...A Windows 
API, that conversion continues to choose between CP_ACP or CP_OEMCP depending 
on the AreFileApisANSI() function. This is the case right now, and nothing 
related to shell.c should change that.

And a 7th point:
- check that when sqlite gets a text string from Windows from a ...A API (an 
error message string for instance), it is considered to be in CP_ACP and 
converted to whatever needed, from CP_ACP (AreFileApisANSI() should not be 
used).

> This would be too much for 3.11.0.

Of course.

About UINT GetConsoleCP() and UINT GetConsoleOutputCP() functions... They're 
present since Windows 2000. I don't know about various WinCE editions. I wasn't 
so sure of since when they're available, so I coded the quick and dirty change 
for tests purpose using hardcoded CP_OEMCP, but it is better to use 
GetConsole(Output)CP() APIs. Indeed, among the codepages to which the console 
can be switched (or defaulted to on various localized editions of Windows), 
some codepages are considered 'OEM', others 'ANSI'. Using CP_OEMCP when the 
console has been set for an ANSI codepage, gives wrong result. And reciprocal 
too.

I'm advocating for using GetConsoleCP() and GetConsoleOutputCP() in order to 
convert the input or the output as needed, instead of being tempted to use 
their Set counterparts (SetConsoleCP(65001) and SetConsoleOutputCP(65001)).  
That would look simpler to use them to turn the console IO to UTF8, but it's a 
bumpy road. Because unless the display font actually supports unicode and UTF8 
encoding, display issues can appear. And using 65001 does not goes back in time 
as far on the Windows timeline. Using the Get... path, the user can change its 
codepage himself through command chcp ..., knowingly.

--
Meilleures salutations, Met vriendelijke groeten, Best Regards,
Olivier Mascia, integral.be/om

-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 842 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: 
<http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/attachments/20160212/1f0df2ea/attachment.pgp>