Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Deon Brewis
If you look at the original underlying NT I/O architecture that Cutler 
implemented - it is a thing of beauty that's based in async patterns, and not 
threads.

It was the Win32 wrappers over the NT subsystem that tried to make things 
"easier" for developers to deal with, which forced synchronous blocking code on 
top of the async Zw/Nt layers. This made the only practical way to deal with 
"blocking" I/O to be multi-threaded.

Today even junior-ish developers can deal with async code in node.js, and not 
bat an eyelid about it - the language makes async interaction simple enough - 
even in a single threaded environment. It wasn't the underlying technology was 
wrong, it was the simplifying abstraction on top of it that was.

If instead NT initially only exposed the Nt API's and not the Win32 layers, we 
would have had languages that simplified async a long time ago - and 
multi-threaded would be the domain of a few applications that actually need 
compute, and not just non-blocking IO. This wasn't due to Cutler's architecture 
though - more market driven decisions trying to maintain API compatibility with 
Win95, which was in turn driven by API compatibility with 16 bit API's, which 
long predated Cutler.

- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of James K. Lowden
Sent: Thursday, February 16, 2017 6:26 PM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Thread safety of serialized mode

On Thu, 16 Feb 2017 21:49 +
Tim Streater  wrote:

> > What's inherently wrong with threads in principle is that there is 
> > no logic that describes them, and consequently no compiler to 
> > control that logic.
> 
> [snip remainder of long whinge about threads]
> 
> Sounds, then, like I'd better eliminate threads from my app. In which 
> case when the user initiates some action that may take some minutes to 
> complete, he can just lump it when the GUI becomes unresponsive.

[snip chest thumping]

You didn't refute my assertion, and facts refute yours.  

There has been a GUI in use for some 30 years, dating back to your VMS days, 
that is single-threaded.  I'm sure you've heard of it, the X Window System?  

If your particular GUI system is based on threads, like, say, Microsoft 
Windows, then, yes, you're pretty much cornered into using threads.  But that 
doesn't change the fact that you have no compiler support to verify the 
correctness of memory access over the time domain.  It doesn't change the fact 
that the OS has subverted the guarantees your language would otherwise provide, 
such as the atomicity of ++i noted elsewhere in this thread.  

WR Stevens describes 4 models for managing concurrency:

1.  Mutilplexing: select(2)
2.  Multiprocessing
3.  Asynchronous callbacks
4.  Signal-driven

None of those subvert the semantics of the programming language.  In each case, 
at any one moment there is only one thread of control over any given section of 
logic.  

Hoare had already published "Communicating Sequential Processes"  
(http://www.usingcsp.com/cspbook.pdf) when it hired David Cutler to design 
Windows NT.  It's too bad they adopted threads as their concurrency-management 
medium.  If they'd chosen CSP instead, maybe they wouldn't have set computing 
back two decades.  

--jkl


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why is this so much more efficient?

2017-02-16 Thread Barry Smith
Aliases in SQL are not the same as variables in most procedural languages. So 
every time you mention 'totaltime' SQLite is probably recalculating that value 
by adding all the columns together.  See the various discussions regarding no 
deterministic (random) functions last year. Less references to that alias => 
less calculations. Reference it twice and expect it to be slower.

> On 17 Feb 2017, at 12:27 AM, Darko Volaric  wrote:
> 
> You can actually index functions or expression:
> https://www.sqlite.org/expridx.html
> 
> On Thu, Feb 16, 2017 at 9:32 PM, Cecil Westerhof 
> wrote:
> 
>> 2017-02-16 21:10 GMT+01:00 Dominique Pellé :
>> 
>>> Cecil Westerhof  wrote:
>>> 
 I have a table vmstat that I use to store vmstat info. ;-)
 At the moment it has more as 661 thousand records.
 
 In principle the values of usertime, systemtime, idletime, waittime and
 stolentime should add up to 100. I just wanted to check it. Of-course
>>> there
 could be a rounding error, so I wrote the following query:
 SELECT date
 ,  time
 ,  usertime
 ,  systemtime
 ,  idletime
 ,  waittime
 ,  stolentime
 ,  (usertime + systemtime + idletime + waittime + stolentime) AS
 totaltime
 FROM   vmstat
 WHERE  totaltime  < 99 OR totaltime > 101
 
 I did not like that, so I rewrote the WHERE to:
 WHERE  ABS(100 - totaltime) > 1
 
 The funny thing the second WHERE is more efficient as the first, where
>> I
 would have expected it to be the other way around.
 The first takes around 1.050 milliseconds.
 The second takes around  950 milliseconds.
 So the second is around 10% more efficient. Why is this?
 
 In case it is important: I did this in sqlitebrowser 3.7.0, which uses
 SQLite 3.8.10.2.
>>> 
>>> 
>>> I suspect  that when you use "WHERE ABS(100 - totaltime) > 1"
>>> SQLite cannot use an index since it's an expression, so it does a
>>> full table scan, whereas the other solution which does
>>> "WHERE  totaltime  < 99 OR totaltime > 101"
>>> may use an index on totaltime (assuming that there is an index).
>>> 
>>> In general using an index is good.  But if most of the records
>>> satisfy the condition "ABS(100 - totaltime) > 1" then an index
>>> can be more harmful than useful.   And that could explain
>>> why using "WHERE ABS(100 - totaltime) > 1" is a bit faster.
>>> You could try "EXPLAIN QUERY PLAN" on your queries to
>>> see if they use an index or if they do a full table scan.
>> 
>> ​Totaltime is calculated, so it cannot have an index. ;-)
>> Besides from the almost 700.000 records only two satisfy the condition.
>> 
>> I should look into EXPLAIN QUERY PLAN.
>> 
>> --
>> Cecil Westerhof
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Jens Alfke

> On Feb 16, 2017, at 6:26 PM, James K. Lowden  wrote:
> 
> It doesn't change the fact that the OS has subverted the
> guarantees your language would otherwise provide, such as the atomicity
> of ++i noted elsewhere in this thread.  

It’s not the OS, it’s the architecture of multiprocessor systems. A 
read-modify-write cycle on an address in main memory is intrinsically _not_ 
atomic on a multiprocessor system, not unless the CPU goes through some 
expensive efforts to make it so (cache invalidation, bus locking, etc.) You can 
get that behavior if you like by using libraries like stdatomic or the C++ 
atomic types, but it makes the operation much, much slower.

> In
> each case, at any one moment there is only one thread of control over
> any given section of logic.  

That’s nice, but that’s just not the way the memory architectures of current 
computers work. Threaded programming happens to expose that behavior because it 
allows simultaneous read/write access to memory, and the semantics of that are 
subtle and weird. (It’s a distributed system after all. Distributed systems are 
rather Einsteinian: they don’t have strong causality.)

> It's too bad they adopted threads as
> their concurrency-management medium.  If they'd chosen CSP instead,
> maybe they wouldn't have set computing back two decades.  

No, they’d never have shipped a useable product. The attractive thing about 
threads is that they’re cheap and efficient. Higher level constructs like CSP 
are great, but they have a lot of overhead. For example, look at Mach: as 
originally implemented, it had a tiny kernel that only did message-passing, and 
everything else was implemented as separate processes that communicated by 
messaging. Unfortunately it was too slow to be useful. Every OS ever shipped by 
NeXT and Apple has a monolithic kernel based on BSD, with Mach messaging only 
used for higher-level tasks.

It’s really easy to point to hard work done by other engineers and insult it 
without any knowledge of the actual design constraints and development issues. 
I hear it all the time, and personally I am sick of that attitude.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why is this so much more efficient?

2017-02-16 Thread Darko Volaric
You can actually index functions or expression:
https://www.sqlite.org/expridx.html

On Thu, Feb 16, 2017 at 9:32 PM, Cecil Westerhof 
wrote:

> 2017-02-16 21:10 GMT+01:00 Dominique Pellé :
>
> > Cecil Westerhof  wrote:
> >
> > > I have a table vmstat that I use to store vmstat info. ;-)
> > > At the moment it has more as 661 thousand records.
> > >
> > > In principle the values of usertime, systemtime, idletime, waittime and
> > > stolentime should add up to 100. I just wanted to check it. Of-course
> > there
> > > could be a rounding error, so I wrote the following query:
> > > SELECT date
> > > ,  time
> > > ,  usertime
> > > ,  systemtime
> > > ,  idletime
> > > ,  waittime
> > > ,  stolentime
> > > ,  (usertime + systemtime + idletime + waittime + stolentime) AS
> > > totaltime
> > > FROM   vmstat
> > > WHERE  totaltime  < 99 OR totaltime > 101
> > >
> > > I did not like that, so I rewrote the WHERE to:
> > > WHERE  ABS(100 - totaltime) > 1
> > >
> > > The funny thing the second WHERE is more efficient as the first, where
> I
> > > would have expected it to be the other way around.
> > > The first takes around 1.050 milliseconds.
> > > The second takes around  950 milliseconds.
> > > So the second is around 10% more efficient. Why is this?
> > >
> > > In case it is important: I did this in sqlitebrowser 3.7.0, which uses
> > > SQLite 3.8.10.2.
> >
> >
> > I suspect  that when you use "WHERE ABS(100 - totaltime) > 1"
> > SQLite cannot use an index since it's an expression, so it does a
> > full table scan, whereas the other solution which does
> > "WHERE  totaltime  < 99 OR totaltime > 101"
> > may use an index on totaltime (assuming that there is an index).
> >
> > In general using an index is good.  But if most of the records
> > satisfy the condition "ABS(100 - totaltime) > 1" then an index
> > can be more harmful than useful.   And that could explain
> > why using "WHERE ABS(100 - totaltime) > 1" is a bit faster.
> > You could try "EXPLAIN QUERY PLAN" on your queries to
> > see if they use an index or if they do a full table scan.
> >
>
> ​Totaltime is calculated, so it cannot have an index. ;-)
> Besides from the almost 700.000 records only two satisfy the condition.
>
> I should look into EXPLAIN QUERY PLAN.
>
> --
> Cecil Westerhof
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] bug: fields from external (being updated) table cannot be used in "order by" clause of a subselect

2017-02-16 Thread James K. Lowden
On Wed, 15 Feb 2017 12:06:58 +0100
Marek Wieckowski  wrote:

> The thing is that in principle there is nothing wrong with using
> test.xxx fields in the subselect: there really should be no
> difference whether you use them in "where" or "order by"... The fact
> that sqlite does not allow them to be used in ORDER BY (while
> allowing in SELECT and WHERE) imho is simply a bug. 

The SQL standard does not allow ORDER BY anywhere except as the last
clause of the main query.  ORDER BY does not enter into the logic of
the query; it's a convenience to the host program receiving the rows.  

The idiom

select ...
order by ...
limit 1

can always be replaced with a logical 

select min(...) -- or max(...)

The idiom 

select ...
limit N

is illogical because nondeterministic.  Since we're trading
opinions ;-)  mine is that SQL should allow only logical constructs and
should refuse every illogical construct as a syntax error.  

There was kind of hole in standard SQL in that there was no convenient
way to express the idea of TOP N rows or the Nth row with a particular
ranking.  That's addressed these days with window functions, although
it's debatable how "convenient" they are, and in any event SQLite
doesn't support them.  

In SQLite the hole is filled with ORDER BY ... LIMIT.  The problem is 
that construct is frequently misused, as in  your example, and it is 
unnecessarily complex.  

How so?  Observe that *order* is implicit in magnitude functions: max()
implies order without requiring the user to say so.  

Much more powerful would be functions that return N values instead of
just one.  The most convenient form would require support in the SQL
interpreter.  For example:

select date, maxn(3, score) from scores group by by date

would produce (up to) three scores for each date.  

Not only is the logic for that query awkward to express in SQL (any
version) but, because of its roundabout expression, it presents a
challenge to the query planner.  With the notion of "top N" buried in
a function, the interpreter would be free to keep track of the top N
values without necessarily sorting them.  

--jkl



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread James K. Lowden
On Thu, 16 Feb 2017 21:49 +
Tim Streater  wrote:

> > What's inherently wrong with threads in principle is that there is
> > no logic that describes them, and consequently no compiler to
> > control that logic.  
> 
> [snip remainder of long whinge about threads]
> 
> Sounds, then, like I'd better eliminate threads from my app. In which
> case when the user initiates some action that may take some minutes
> to complete, he can just lump it when the GUI becomes unresponsive. 

[snip chest thumping]

You didn't refute my assertion, and facts refute yours.  

There has been a GUI in use for some 30 years, dating back to your VMS
days, that is single-threaded.  I'm sure you've heard of it, the X
Window System?  

If your particular GUI system is based on threads, like, say,
Microsoft Windows, then, yes, you're pretty much cornered into using
threads.  But that doesn't change the fact that you have no compiler
support to verify the correctness of memory access over the time
domain.  It doesn't change the fact that the OS has subverted the
guarantees your language would otherwise provide, such as the atomicity
of ++i noted elsewhere in this thread.  

WR Stevens describes 4 models for managing concurrency:

1.  Mutilplexing: select(2)
2.  Multiprocessing
3.  Asynchronous callbacks
4.  Signal-driven

None of those subvert the semantics of the programming language.  In
each case, at any one moment there is only one thread of control over
any given section of logic.  

Hoare had already published "Communicating Sequential
Processes"  (http://www.usingcsp.com/cspbook.pdf) when it hired David
Cutler to design Windows NT.  It's too bad they adopted threads as
their concurrency-management medium.  If they'd chosen CSP instead,
maybe they wouldn't have set computing back two decades.  

--jkl


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread James K. Lowden
On Thu, 16 Feb 2017 16:21:06 -0700
Warren Young  wrote:

> https://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf
> 
> Threads aren?t just distasteful from an implementation standpoint,
> they?re *mathematically unsound*.

Thank you for that.  I think I encounted that paper late one night and
never got back to it.  This was poignant: 

"I conjecture that most multi-threaded general-purpose
applications are, in fact, so full of concurrency bugs that as
multi-core architectures become commonplace, these bugs will begin to
show up as system failures."

I guess it was about the year 2005 when we upgraded our SQL Server to
IIRC a machine with 4 processors.  For the first time, we had hardware
that could execute multiple threads literally simultaneously.  And
for the first time some queries failed to execute unless we set the
maximum degree of parallelism to 1.  The server itself was also
unstable.  As I recall, we used certain settings in the registry to
restrict the parallelism in the server until later releases made that
unnecessary.  

Concurrency bugs exposed by multi-core architectures?  Ya think?  

--jkl





___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Bob Friesenhahn

On Thu, 16 Feb 2017, Warren Young wrote:


Taking it off-list, since there is zero remaining connection to SQLite now:


Thank you for taking it off list.

How can we expect people to write threaded programs when even a 
simple integer increment is prone to race conditions and 
read-modify-write errors?


I have not encountered much issue with threads in my own programs. 
Using threads requires attention to detail, such as if all libraries 
used (e.g. sqlite3) are thread safe and the terms by which they are 
thread safe.


However, there is still the specific issue I posted about which no one 
has posted a follow-up on.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Warren Young
Taking it off-list, since there is zero remaining connection to SQLite now:

> On Feb 16, 2017, at 2:49 PM, Tim Streater  wrote:
> 
> On 16 Feb 2017 at 18:30, James K. Lowden  wrote: 
> 
>> On Tue, 14 Feb 2017 17:05:30 -0800
>> Darren Duncan  wrote:
>> 
>>> There is nothing inherently wrong with threads in principle
>> 
>> What's inherently wrong with threads in principle is that there is no
>> logic that describes them, and consequently no compiler to control that
>> logic.  
> 
> [snip remainder of long whinge about threads]
> 
> Sounds, then, like I'd better eliminate threads from my app. In which case 
> when the user initiates some action that may take some minutes to complete, 
> he can just lump it when the GUI becomes unresponsive.

CSPs, the actor model, message passing architectures, etc. all give you ways to 
have concurrent processing without your program explicitly dealing with 
OS-level threads.

> That OK with you? Can I point the user your way when he gives me grief about 
> it? Or should I just say that no, he can't have a responsive GUI under those 
> conditions because some guy on the Internet says so?

“Fear is the path to the dark side. Fear leads to anger. Anger leads to hate. 
Hate leads to suffering.”

Let go your anger. :)

(And lest you think I fear threads and that this sword cuts both ways, no: I 
avoid using threads whenever possible because I *understand* threads.)

> I'll just bring my 50 years experience of writing software to the table, 
> including threaded apps for PDP-11s and VAXes. It's called debugging.

Some kinds of debugging are easier than others.  Why set yourself up for a much 
harder problem than necessary by using inherently problematic mechanisms?  

(Plural, meaning threads and all the synchronization primitives that go along 
with them, which drag in new problems like deadlocking that you didn’t have 
before you added mutexes to try and solve the problems you bought by adding 
that one little ol' thread.)

Have you read the Lee paper referenced in the mailing list thread?

https://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf

Threads aren’t just distasteful from an implementation standpoint, they’re 
*mathematically unsound*.

Did you study the ARM assembly language comparison I linked to?  How can we 
expect people to write threaded programs when even a simple integer increment 
is prone to race conditions and read-modify-write errors?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Jens Alfke

> On Feb 16, 2017, at 11:49 AM, Warren Young  wrote:
> 
> A software developer who refuses to learn about his processor’s assembly 
> language is like trying to become an electrical engineer without learning 
> anything about physics. 

In this case what you need to read is the specification of the memory model 
assumed by C and C++. The language specs are very explicit about what you can 
and can’t expect from concurrent memory accesses. (They’re also very hard to 
read! But fortunately there are good books about concurrent C/C++ programming 
that explain the rules clearly, like “C++ Concurrency In Action.)

Of course that might not exactly match your CPU, because the specs are 
cross-platform and generally have to be very conservative … but in most cases 
it’s not a good idea to bake detailed knowledge of the CPU into your code, if 
you ever plan on porting that code to any other CPU. :) 

(Longtime programmers for Apple platforms know this especially well … I’ve been 
through transitions from the 68000 to the 68030, then to PowerPC, then to dual 
CPUs, then to 64-bit, then to X86, then to ARM7, then to ARM-64bit. Whew! 
Especially in the early stages, there were super clever things you could do 
that took advantages of details of the CPU architecture — like the way the 
68000 ignored the high 8 bits of pointers — that then stopped working in the 
next generation and forced you to redesign your code.)

—Jens

PS: I’m not aware of _any_ current CPUs that can increment main memory in one 
instruction, atomically or not. The PDP-11 did famously have such an 
instruction, which is exactly why the ++ and -- operators were added to C’s 
parent BCPL, since it was intended as a “structured assembly language”.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Deprecated OSX functioon

2017-02-16 Thread Jens Alfke

> On Feb 16, 2017, at 1:02 PM, James Walker  wrote:
> 
> For what it's worth, the OS header recommends replacing it with the function 
> atomic_compare_exchange_strong, which I assume refers to the C++11 template 
> function.

It’s also available in C as a macro defined in .
(At least, it’s in Clang’s C library.)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Tim Streater
On 16 Feb 2017 at 18:30, James K. Lowden  wrote: 

> On Tue, 14 Feb 2017 17:05:30 -0800
> Darren Duncan  wrote:
>
>> There is nothing inherently wrong with threads in principle
>
> What's inherently wrong with threads in principle is that there is no
> logic that describes them, and consequently no compiler to control that
> logic.  

[snip remainder of long whinge about threads]

Sounds, then, like I'd better eliminate threads from my app. In which case when 
the user initiates some action that may take some minutes to complete, he can 
just lump it when the GUI becomes unresponsive. That OK with you? Can I point 
the user your way when he gives me grief about it? Or should I just say that 
no, he can't have a responsive GUI under those conditions because some guy on 
the Internet says so?

Well that ain't gonna happen. I'll just bring my 50 years experience of writing 
software to the table, including threaded apps for PDP-11s and VAXes. It's 
called debugging.

--
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Bob Friesenhahn
It seems like the discussion has turned into a general programming 
discussion unrelated to SQLite3.  Does anyone have an idea about this 
specific problem that we encountered (see quoted message below)?


It is not clear to me if this is a threading issue, or memory 
corruption issue, or if it is a SQLite3 implementation logic issue 
(something to do with a deferred moveto).  Why should destroying a 
prepared statement care about a cursor's deferred moveto?


Bob

On Wed, 15 Feb 2017, Bob Friesenhahn wrote:

It turns out that I have more data on the problem.  The error message 
reported reads something like:


SQLITE_CORRUPT: database disk image is malformed database corruption at line 
70273 of [17efb4209f]


We are using version 3.10.2.

Looking at amalgamation code I see that the error is returned from 
handleDeferredMoveto() and is base on a value returned from 
sqlite3BtreeMovetoUnpacked():


 70259 ** The cursor "p" has a pending seek operation that has not yet been
 70260 ** carried out.  Seek the cursor now.  If an error occurs, return
 70261 ** the appropriate error code.
 70262 */
 70263 static int SQLITE_NOINLINE handleDeferredMoveto(VdbeCursor *p){
 70264   int res, rc;
 70265 #ifdef SQLITE_TEST
 70266   extern int sqlite3_search_count;
 70267 #endif
 70268   assert( p->deferredMoveto );
 70269   assert( p->isTable );
 70270   assert( p->eCurType==CURTYPE_BTREE );
 70271   rc = sqlite3BtreeMovetoUnpacked(p->uc.pCursor, 0, p->movetoTarget, 
0, );

 70272   if( rc ) return rc;
 70273   if( res!=0 ) return SQLITE_CORRUPT_BKPT;
 70274 #ifdef SQLITE_TEST
 70275   sqlite3_search_count++;
 70276 #endif
 70277   p->deferredMoveto = 0;
 70278   p->cacheStatus = CACHE_STALE;
 70279   return SQLITE_OK;
 70280 }

Ideas?

Bob



--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Deprecated OSX functioon

2017-02-16 Thread James Walker

On 2/16/2017 7:40 AM, Ward WIllats wrote:

This has been coming across my console when building Sqlite since upgrading to 
Mac OSX Sierra a few months ago. FWIW.


Building sqlite3 shell...
sqlite3.c:20839:17: warning: 'OSAtomicCompareAndSwapPtrBarrier' is deprecated: first 
deprecated in macOS 10.12 - Use atomic_compare_exchange_strong() from 
 instead [-Wdeprecated-declarations]
  success = OSAtomicCompareAndSwapPtrBarrier(NULL, newzone,
^
/usr/include/libkern/OSAtomicDeprecated.h:547:6: note: 
'OSAtomicCompareAndSwapPtrBarrier' has been explicitly marked deprecated here
boolOSAtomicCompareAndSwapPtrBarrier( void *__oldValue, void *__newValue, 
void * volatile *__theValue );
^
1 warning generated.


For what it's worth, the OS header recommends replacing it with the 
function atomic_compare_exchange_strong, which I assume refers to the 
C++11 template function.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite Options

2017-02-16 Thread Simon Slavin

On 16 Feb 2017, at 8:40pm, Clyde Eisenbeis  wrote:

> Is there an SQLite version that is comprised of fewer dlls, etc.? ...
> Perhaps SQLite3?

If you’re writing C or C++ code all you need is the amalgamation source code 
files from the SQLite3 site.  You compile them into your program.  No DLLs 
needed at all.

The need for DLLs comes when you need a 'shim' to allow your favourite 
programming language to be able to call the SQLite API functions.  Even then 
you can do without them if your favourite programming language has the ability 
to call C functions directly.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Keith Medcalf

www.microsoft.com ...

The only time the OS is not "mapped" into the process address space is if you 
are running 32-bit code on a 64-bit OS.  In that case it has to use an 
imitation syscall trampoline stored at the top of the 4GB 32-bit address space 
to jump into 64-bit mode to access the OS code.  In all cases where you are 
running either 64-bit processes on a 64-bit OS or a 32-bit process on a 32-bit 
OS, the entire OS is mapped into the process address space.

The most pure example of a DCSS based OS is CMS, although the actual real OS 
(the CP part of CP/CMS) lives in a separate supervisor process.

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of James K. Lowden
> Sent: Thursday, 16 February, 2017 11:30
> To: sqlite-users@mailinglists.sqlite.org
> Subject: Re: [sqlite] Thread safety of serialized mode
> 
> On Wed, 15 Feb 2017 07:55:16 -0700
> "Keith Medcalf"  wrote:
> 
> > Note that for several modern OSes, the OS is nothing more than a
> > discontiguous saved segment (DCSS) which is mapped into *every*
> > process space and that process isolation is more of a myth than a
> > reality.
> 
> Are you referring to one in particular we could read about?
> 
> --jkl
> 
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SQLite Options

2017-02-16 Thread Clyde Eisenbeis
I started writing SQLite code about two years ago (Visual Studio 2013,
C#, WPF) ... with a significant delay, since then, because of a
physical move.

The code is written for a specific use on my computer ... no other users.

SQLite was chosen so my sons could eventually install this program on
their computer ... no database needs to be installed ... no other
installation required.

I don't recall the actions taken then, but I do see quite a few
additional files (EntityFramework.dll, EntityFramework.SqlServer.dll,
etc.) as references ... see attachment.

Is there an SQLite version that is comprised of fewer dlls, etc.? ...
Perhaps SQLite3?

Clyde
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why is this so much more efficient?

2017-02-16 Thread Cecil Westerhof
2017-02-16 21:10 GMT+01:00 Dominique Pellé :

> Cecil Westerhof  wrote:
>
> > I have a table vmstat that I use to store vmstat info. ;-)
> > At the moment it has more as 661 thousand records.
> >
> > In principle the values of usertime, systemtime, idletime, waittime and
> > stolentime should add up to 100. I just wanted to check it. Of-course
> there
> > could be a rounding error, so I wrote the following query:
> > SELECT date
> > ,  time
> > ,  usertime
> > ,  systemtime
> > ,  idletime
> > ,  waittime
> > ,  stolentime
> > ,  (usertime + systemtime + idletime + waittime + stolentime) AS
> > totaltime
> > FROM   vmstat
> > WHERE  totaltime  < 99 OR totaltime > 101
> >
> > I did not like that, so I rewrote the WHERE to:
> > WHERE  ABS(100 - totaltime) > 1
> >
> > The funny thing the second WHERE is more efficient as the first, where I
> > would have expected it to be the other way around.
> > The first takes around 1.050 milliseconds.
> > The second takes around  950 milliseconds.
> > So the second is around 10% more efficient. Why is this?
> >
> > In case it is important: I did this in sqlitebrowser 3.7.0, which uses
> > SQLite 3.8.10.2.
>
>
> I suspect  that when you use "WHERE ABS(100 - totaltime) > 1"
> SQLite cannot use an index since it's an expression, so it does a
> full table scan, whereas the other solution which does
> "WHERE  totaltime  < 99 OR totaltime > 101"
> may use an index on totaltime (assuming that there is an index).
>
> In general using an index is good.  But if most of the records
> satisfy the condition "ABS(100 - totaltime) > 1" then an index
> can be more harmful than useful.   And that could explain
> why using "WHERE ABS(100 - totaltime) > 1" is a bit faster.
> You could try "EXPLAIN QUERY PLAN" on your queries to
> see if they use an index or if they do a full table scan.
>

​Totaltime is calculated, so it cannot have an index. ;-)
Besides from the almost 700.000 records only two satisfy the condition.

I should look into EXPLAIN QUERY PLAN.

-- 
Cecil Westerhof
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why is this so much more efficient?

2017-02-16 Thread Dominique Pellé
Cecil Westerhof  wrote:

> I have a table vmstat that I use to store vmstat info. ;-)
> At the moment it has more as 661 thousand records.
>
> In principle the values of usertime, systemtime, idletime, waittime and
> stolentime should add up to 100. I just wanted to check it. Of-course there
> could be a rounding error, so I wrote the following query:
> SELECT date
> ,  time
> ,  usertime
> ,  systemtime
> ,  idletime
> ,  waittime
> ,  stolentime
> ,  (usertime + systemtime + idletime + waittime + stolentime) AS
> totaltime
> FROM   vmstat
> WHERE  totaltime  < 99 OR totaltime > 101
>
> I did not like that, so I rewrote the WHERE to:
> WHERE  ABS(100 - totaltime) > 1
>
> The funny thing the second WHERE is more efficient as the first, where I
> would have expected it to be the other way around.
> The first takes around 1.050 milliseconds.
> The second takes around  950 milliseconds.
> So the second is around 10% more efficient. Why is this?
>
> In case it is important: I did this in sqlitebrowser 3.7.0, which uses
> SQLite 3.8.10.2.


I suspect  that when you use "WHERE ABS(100 - totaltime) > 1"
SQLite cannot use an index since it's an expression, so it does a
full table scan, whereas the other solution which does
"WHERE  totaltime  < 99 OR totaltime > 101"
may use an index on totaltime (assuming that there is an index).

In general using an index is good.  But if most of the records
satisfy the condition "ABS(100 - totaltime) > 1" then an index
can be more harmful than useful.   And that could explain
why using "WHERE ABS(100 - totaltime) > 1" is a bit faster.
You could try "EXPLAIN QUERY PLAN" on your queries to
see if they use an index or if they do a full table scan.

Dominique
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Warren Young
On Feb 15, 2017, at 5:03 AM, a...@zator.com wrote:
> 
> I suppose someday, programming languages can do an analogous translation in 
> our limited but safe, sequential programs.

Not as long as we require side effects to achieve anything of practical value.  
Any form of I/O is a “side effect” by my definition, whether that’s disk, 
network, GUI, or what have you.

Avoiding threads is good because the well known problems with global variables 
magnify a combinatorially when multiple threads can access them simultaneously 
— literally *simultaneously* on a modern multi-core processor! — in any pattern 
you can conceive, and more you probably haven’t even thought of.  The problem 
is combinatoric on the number of instructions in the program and the number of 
threads, which gives you a really big number really fast.  Humans aren’t good 
at thinking about all N billion execution paths through a given program.

Synchronization — whether that’s mutexes or transactions or message passing or 
something else — helps, but it always eats into the speed advantage of raw 
threads, so there will continue to be a continuous pressure to reduce 
synchronization rather than add more automatically.  Go look up “lock free data 
structures” if you want to see the kind of thing being done in this area.

Computers can help with the combinatoric explosion, but I’m not seeing a whole 
lot of progress on this with real world programs.  I want a tool like lint(1) 
that will detect synchronization errors statically, but for now, I think you 
have to rely on dynamic tools like Helgrind:

   http://valgrind.org/docs/manual/hg-manual.html

The problem with dynamic error detection is that it can only catch errors in 
code paths that you can trigger while the tool watches.  This is about more 
than just simple code coverage, it’s about *combinatoric* code path coverage.  
If you don’t test all possible interleavings of instructions among the threads 
and cores, you can still miss an error, even if the tool knows how to detect it.

I have to believe static thread correctness analysis is at least possible in 
principle, because humans do manage to see threading problems just by staring 
at the code long enough.  It might require strong AI to do it, but it’s got to 
be possible to at least do as well as an expert human.  But, that just means 
this becomes yet another, “Won’t it be great when we have strong AI?” wish.

A tool like Helgrind isn’t enough by itself.  It’ll blithely ignore your 
lock-free data structure code, for example.  It’ll also fail to flag logical 
errors in your SQLite code where you’re missing transactions, for another.

It’s worth repeating: Concurrency is hard.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Difference between min and max time, especially for sys

2017-02-16 Thread Cecil Westerhof
2017-02-16 19:31 GMT+01:00 Jens Alfke :

>
> > On Feb 15, 2017, at 11:47 AM, Cecil Westerhof 
> wrote:
> >
> > But the difference between sys can be almost a factor twenty. What seems
> very big to me.
>
> Yup. There is a huge speed difference between the kernel going to the
> disk/SSD to read from a file, vs. the kernel reading the data out of its
> buffer cache. That’s why the kernel has buffer caches :)
>
> (And those disk reads can be hugely variable. A hard disk can spend 10ms
> or more seeking, and any medium has bandwidth limits, so other processes’
> I/O can greatly slow down your own. I’ve seen filesystem slowdowns of 20x
> or more shortly after a user logs in when dozens of processes are launching
> at once.)
>

​Well user/sys is mostly around 10, so it does not have a to big influence
on the measurements. ;-)

-- 
Cecil Westerhof
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Smallest reasonable cache size

2017-02-16 Thread Dominique Pellé
Kim Gräsman  wrote:

> Hi all,
>
> In my battles with lots of connections competing over precious cache
> memory, I've considered giving some non-critical connections zero
> cache using `PRAGMA cache_size=0`.
>
> Is this a reasonable thing to do? If zero is too extreme, what might a
> more moderate small cache size be? 32? 64? 100?
>
> Some of these connections are only opened to do `PRAGMA quick_check;`,
> I'm guessing that operation is not very cache-intensive, assuming it
> reads pages sequentially and checks them?
>
> Thanks for any advice on this,
> - Kim


I think it's probably best to give a large cache_size to
each connection, and limit the total amount of memory
used by SQLite with sqlite3_soft_heap_limit64().
This will effectively limit the global amount of cache
used if cache pages comes from the heap.

For example, if you give 20MB to each connection
(PRAGMA cache_size=) and limit the global SQLite
heap usage to 30 MB with sqlite3_soft_limit() then even
if you have say 100 connections, SQLite will not use
more than 30 MB of heap. If a connection needs to
cache pages, unpinned cache pages from other
connections will be discarded in LRU order. Strictly
speaking, SQLite could use more than the 30 MB
soft limit if it has no other choice to allocate memory
(hence a soft limit), but in general that does not happen.

That way, inactive connections (connections that
are opened, but no queries have been done in a
long time) do hog pages in memory. Their pages
get discarded after a while. Memory will be best
used automatically to cache pages of the most active
connections. At least that's my understanding.

Dominique
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Warren Young
On Feb 15, 2017, at 4:40 AM, Darren Duncan  wrote:
> 
> On 2017-02-15 2:40 AM, Clemens Ladisch wrote:
>> Cecil Westerhof wrote:
>> 
>> And just like with assembly code, you also have to count the time spent
>> writing it, and debugging the result.
> 
> Also, its a long time since hand-writing assembly code was any good for 
> performance, unless you're a 1% top expert with a good reason.

While true insofar as it goes, that attitude leads to people being ignorant of 
what the compiler produces from the code you give it.  This almost caused a 
threading bug in a program I was modifying recently, and we only caught it 
ahead of time because someone questioned some basic assumptions, causing me to 
go look at the generated assembly.

Consider this humble lone line of code:

++i;

The threading bug is right there, staring at you.

What, you don’t see it?  How about now:

https://godbolt.org/g/xfJ9SQ

Yeah, that’s right, friends, integer increment isn’t a single instruction on 
ARM, even with gcc -O2, hence it is not atomic!  It takes at least three 
instructions (load, modify, store) and for some reason GCC chose to use 6 in 
this particular case.  (Probably some remnant of the function calling 
convention.)

That means that if you’re depending on that increment to be atomic across 
threads, you’re going to be in for a shock of the old bank balance transaction 
problem form.  (You know, the one every SQL newbie gets taught, where the 
account gets double-debited or double-credited if you don’t use transactions.)

The solution is to use GCC’s atomic increment primitive — also shown via the 
above link for comparison — which adds a couple of “dmb” ARM instructions to 
lock the code to a single CPU core through that critical section.

A software developer who refuses to learn about his processor’s assembly 
language is like trying to become an electrical engineer without learning 
anything about physics.  A typical practicing EE won’t need to break out 
Maxwell’s equations every day, but understanding the implications of those 
equations is what separates engineering from tinkering.

> If you want speed, write in C or something else that isn't assembly.  The 
> odds are like 99% that the modern C compiler will generate faster code than 
> you could ever write yourself in assembly, and it will be much less buggy.

Just to be sure people understand my position here, I will agree with this 
again.

If you don’t like my example above as presented, consider also that it supports 
the “threads are evil” hypothesis.  If you can’t count on a simple preincrement 
to be atomic, what else are you misunderstanding about what’s going on at the 
low levels of the system when it runs your multithreaded program?

(And no, it wasn’t my idea to use threads in the program I was modifying in the 
first place!  One of the planned upcoming changes is to redesign it from a 
2-thread system to two cooperating single-threaded programs communicating over 
an IPC channel.)

> Similarly with threads, for the vast majority of people, using other 
> concurrency models with supported languages are better; they will still get 
> the performance benefit of using multiple CPU cores but do it much more 
> safely than if you are explicitly using "threads" in code.

Also agreed.  I recommend starting with message-passing, and move on to other 
methods only when you can prove that won’t give the required benefit.

I also recommend that you go learn you some Erlang (for great good):

   http://learnyousomeerlang.com/content

If you can’t get past the syntax, you can paper over it with the Ruby-like 
Elixir front-end:

   http://elixir-lang.org/

But to bring all of this back around on topic, beware that a SQL DB is 
basically a global variable store, albeit with arbitrated access.  You can 
create cross-process problems by misuse of the data store just as you can with 
global variables in a traditional threaded program.

Message-passing concurrency is just a tool that increases your chances of 
effortless success, it is not a guarantee of it.  No system that allows side 
effects can guarantee proper ordering of operations without some thought given 
to it, whether the mechanisms involved are mutexes, transactions, or something 
else.

Concurrency is hard.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Smallest reasonable cache size

2017-02-16 Thread Richard Hipp
On 2/16/17, Kim Gräsman  wrote:
> I've considered giving some non-critical connections zero
> cache using `PRAGMA cache_size=0`.
>
> Is this a reasonable thing to do?

Yes.
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Smallest reasonable cache size

2017-02-16 Thread Kim Gräsman
Hi all,

In my battles with lots of connections competing over precious cache
memory, I've considered giving some non-critical connections zero
cache using `PRAGMA cache_size=0`.

Is this a reasonable thing to do? If zero is too extreme, what might a
more moderate small cache size be? 32? 64? 100?

Some of these connections are only opened to do `PRAGMA quick_check;`,
I'm guessing that operation is not very cache-intensive, assuming it
reads pages sequentially and checks them?

Thanks for any advice on this,
- Kim
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] 3.17.0 does not read updated DB

2017-02-16 Thread Jens Alfke

> On Feb 15, 2017, at 9:51 PM, Roman Fleysher  
> wrote:
> 
> My mistake: I do not update DB. I rename (unix mv) the DB.

I wouldn’t recommend doing that to an open database!

(a) I don’t think it’s formally specified when SQLite opens or closes the 
database file. It’s possible it might close and reopen it at some point, in 
which case it would probably open the old path and suddenly be seeing a 
different file.
(b) Any side files like the -wal and -shm files used by WAL, or the .journal 
file, will be left behind. SQLite will probably keep using them at the old 
location. But if you close the handle and then reopen it on the moved file, 
those side files will now be missing, which could result in database corruption.
(c) As noted, the semantics of mv differ depending on whether the paths are on 
the same filesystem, and the effects on an open sqlite database would be 
completely different.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Difference between min and max time, especially for sys

2017-02-16 Thread Jens Alfke

> On Feb 15, 2017, at 11:47 AM, Cecil Westerhof  wrote:
> 
> But the difference between sys can be almost a factor twenty. What seems very 
> big to me.

Yup. There is a huge speed difference between the kernel going to the disk/SSD 
to read from a file, vs. the kernel reading the data out of its buffer cache. 
That’s why the kernel has buffer caches :)

(And those disk reads can be hugely variable. A hard disk can spend 10ms or 
more seeking, and any medium has bandwidth limits, so other processes’ I/O can 
greatly slow down your own. I’ve seen filesystem slowdowns of 20x or more 
shortly after a user logs in when dozens of processes are launching at once.)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread James K. Lowden
On Tue, 14 Feb 2017 17:05:30 -0800
Darren Duncan  wrote:

> There is nothing inherently wrong with threads in principle

What's inherently wrong with threads in principle is that there is no
logic that describes them, and consequently no compiler to control that
logic.  

By analogy, the Forth language has no datatypes.  The programmer is
free to treat any area of memory as encoded as any type; the only
structure is the stack.  Languages that do define datatypes allow the
programmer to guarantee consistent treatment of variables by enforcing
type constraints.  

Threads were and are a theoretical regression.  They returned the
programmer to the time when the programming environment provided no
memory protection.  They introduced a flow of control with no governance
provided by the OS or the compiler.  

> Being multi-threaded is necessary to properly utilize the hardware,
> or else we're just running on a single core and letting the others go
> idle.  The real problem is about properly managing memory.  

It's not necessary in general to give up protected memory to fully
utilze the hardware.  

Rob Pike made some excellent presentations explaining the difference
between parallel and concurrent operations, and how Go uses CSP to
support concurrency.  

Go is a step in the right direction.  By bringing threads under the
control of the compiler, GoThreads give the programmer the efficiency
threads afford without relinquishing control over memory.  

> Also giving sufficient hints to the programming language so that it
> can implicitly parallelize operations.  

Afaik that's an unsolved problem.  Take qsort(3) for example.  I wrote
a recursive version that runs in parallel.  (Mine uses shared memory
and fork(2)).  What kind of a "hint" might the programmer provide to
determine how many processes to use?  

Clearly, no static choice is right.  Should qsort interrogate the
machine and decide to use, say, 1/2 the processors?  Why not all?  But
what if the machine is heavily loaded at the time qsort runs?  Should
it take the time to examine the machine state, and limit itself?  Or
should it just go ahead and let the OS deal with it?  

I decided on per-process regulation via an environment variable
representing the approximate number of processes my qsort would spawn.
If the variable is not set, the default is 2x the number of processors,
which seems to the be upper limit for performance in my limited
testing.  

So, yeah, threads are the 1990s version of 1960s computing.  We seem to
be on the cusp of recognizing the value of CSP to manage concurrency,
and of functional programming to manage parallelism.  That's far from
the majority view, though, afaict.  There's a lot more to be invented,
and done.  

--jkl






___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread James K. Lowden
On Wed, 15 Feb 2017 09:40:13 -0800
Jens Alfke  wrote:

> https://en.wikipedia.org/wiki/Communicating_sequential_processes
> 

Also search YouTube for Rob Pike's presentations on CSP in Go.  It will
help clarify your thinking about the different computing models that
"multithreading" is used for.  

--jkl
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread James K. Lowden
On Wed, 15 Feb 2017 12:34:51 +
Simon Slavin  wrote:

> Two disadvantages are that threads are indistinguishable to anything
> but the owner and don?t know how to keep out of each-other?s way.  By
> the time you?ve devised some sort of mutex/locking/blocking mechanism
> you?re usually better-off using processes.

Yup.  Jim Gettys observed that no multithreaded X server has replaced or
out-performed the original single-threaded one.  

--jkl
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread James K. Lowden
On Wed, 15 Feb 2017 07:55:16 -0700
"Keith Medcalf"  wrote:

> Note that for several modern OSes, the OS is nothing more than a
> discontiguous saved segment (DCSS) which is mapped into *every*
> process space and that process isolation is more of a myth than a
> reality.

Are you referring to one in particular we could read about?  

--jkl

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] 'misuse at line...' sqlite3_config

2017-02-16 Thread J Decker
I have this bit of code I've had around forever, I recently enabled
SQLITE_CONFIG_LOG and have been getting a misuse because I'm calling
sqlite3_config after initialization.

The one I'm calling is SQLITE_CONFIG_GETMALLOC which should read the
internal routines already in use; I then override some of those in the
sqlite3_mem_methods structure and set sqlite3_db_config to set those on a
specific connection.

The first thing in sqlite3_config is 'if( sqlite3GlobalConfig.isInit )
return SQLITE_MISUE-BKPT'

shouldn't I be able to use config to read things?  There is no check to see
if the op is a write config or a read config
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why is this so much more efficient?

2017-02-16 Thread R Smith
Basically, the DB size and compactness state influences the speed at 
which values are read. (Every update implies a read).


Add to that the fact that these functions verge on the bottom edge of 
time consumers... It's like testing diffusion speed of a perfume in a 
hurricane.


Glad you found the close competition between most-optimal solutions.


On 2017/02/15 8:43 PM, Cecil Westerhof wrote:

2017-02-15 14:18 GMT+01:00 Cecil Westerhof :


The OR version is the least efficient and it look likes the BETWEEN
version 2 is the most efficient. It looks like it uses less user and more
sys.


​Which is the most efficient is also dependent on the state of the database
itself. I compacted the database and now sometimes NOT BETWEEN is more
efficient and sometimes NOT BETWEEN version 2 is more efficient.
The OR version is always the least efficient.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Deprecated OSX functioon

2017-02-16 Thread Ward WIllats
This has been coming across my console when building Sqlite since upgrading to 
Mac OSX Sierra a few months ago. FWIW.


Building sqlite3 shell...
sqlite3.c:20839:17: warning: 'OSAtomicCompareAndSwapPtrBarrier' is deprecated: 
first deprecated in macOS 10.12 - Use atomic_compare_exchange_strong() from 
 instead [-Wdeprecated-declarations]
  success = OSAtomicCompareAndSwapPtrBarrier(NULL, newzone, 
^
/usr/include/libkern/OSAtomicDeprecated.h:547:6: note: 
'OSAtomicCompareAndSwapPtrBarrier' has been explicitly marked deprecated here
boolOSAtomicCompareAndSwapPtrBarrier( void *__oldValue, void *__newValue, 
void * volatile *__theValue );
^
1 warning generated.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Group contiguous rows (islands)

2017-02-16 Thread Rossel, Jonathan
@ Pasma and Hainaut,

Thanks again, that looks promising !

Jonathan

Message: 42
Date: Wed, 15 Feb 2017 21:10:10 +0100
From: "E.Pasma" 
To: SQLite mailing list 
Subject: Re: [sqlite] Group contiguous rows (islands)
Message-ID: <55fa9699-22fe-4dd9-9b86-36a190485...@concepts.nl>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes


Jean-Luc Hainaut:

> On 15/02/2017 18:34, E.Pasma wrote:
>>
>> Hello,  the query below is simpler. May be slower. But looks pretty  
>> relational. Thanks, E Pasma.
>>
>> create table T(date integer,test char(12));
>> insert into T
>> values (1,'clim'),(3,'clim'),(7,'amb'),(10,'amb'),(12,'xxx'),
>> (13,'clim'),(15,'clim'),(20,'clim'),(22,'amb'),(25,'amb');
>>
>> select min(date) as fromdate, max(date) as enddate, test
>> from(--get closest preceeding different key
>>select t.*, max(t2.date) as key2
>>from t
>>left join t t2
>>on t2.datet.test
>>group by t.date
>>)
>> group by key2
>
> Quite nice solution indeed!
> For those who may feel uncomfortable with outer joins, the from  
> clause could be written as a subquery:
>
> from (select date, test, (select  max(date)
>  fromt t2
>  where  t2.date < t.date
>  and  t2.test <> t.test)  
> as key2)
>
> Thanks
>
> J-L
>
this way you may also try to optimise speed by using ORDER BY & LIMIT  
1 instead of MAX

from (select date, test, (select t2.date
   from  t t2
   where t2.date < t.date
   and t2.test <>  t.test
   order by t2.date desc limit 1)  
as key2


***
This e-mail message is intended only for the addressee(s) and contains
information which may be confidential. If you are not the intended
recipient please do not read, save, forward, disclose or copy the contents
of this e-mail. If this e-mail has been sent to you in error, please delete 
this 
e-mail and any copies or links to this e-mail completely and immediately
from your system. We also like to inform you that communication via e-mail
over the Internet is insecure because third parties may have the possibility
to access and manipulate e-mails.

Any views expressed in this message are those of the individual sender,
except where the sender specifically states them to be the views of
The Swatch Group Ltd.
***
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] About the performance of recursive WITH

2017-02-16 Thread Keith Medcalf

H.  With the current head of trunk ...

I run this and the recursive version takes 93 milliseconds vs the iterative 
version which takes 203 milliseconds.
Creating all possible covering indexes takes 110 milliseconds vs 141 
milliseconds.
Creating only used/required covering indexes takes 93 milliseconds vs 124 
milliseconds

Can you add .timer on and .eqp on and see if you are getting reasonable query 
plans?  Also, what version of SQLite are you using?

Is the version you are using creating the necessary automatic covering indexes?

If I manually create indexes I get better and worse results, as expected.  
Those follow after these runs.


>timethis "sqlite64 graph.db < GRAPH-performance-Load.sql"

TimeThis :  Command Line :  sqlite64 graph.db < GRAPH-performance-Load.sql
TimeThis :Start Time :  Thu Feb 16 06:47:14 2017


TimeThis :  Command Line :  sqlite64 graph.db < GRAPH-performance-Load.sql
TimeThis :Start Time :  Thu Feb 16 06:47:14 2017
TimeThis :  End Time :  Thu Feb 16 06:47:14 2017
TimeThis :  Elapsed Time :  00:00:00.078


>dir graph.db
2017-02-16  06:4745,056 graph.db

>timethis "sqlite64 graph.db < GRAPH-performance-recursive.sql"

TimeThis :  Command Line :  sqlite64 graph.db < GRAPH-performance-recursive.sql
TimeThis :Start Time :  Thu Feb 16 06:47:58 2017

SQLite 3.18.0 2017-02-15 22:36:15 58797e9bafa95709e0f706a15f42f93b409e2db5
.eqp on
.timer on
update GRAPH
   set Level = null;
--EQP-- 0,0,0,SCAN TABLE GRAPH
Run Time: real 0.016 user 0.00 sys 0.00

with
HIERARCHY (FromID,ToID,Level) as
(
select Parent,
   Child,
   '0'
  from GRAPH
 where Parent is null
 union all
select H.ToID,
   G.Child,
   H.Level + 1
  from HIERARCHY H,
   GRAPH G
 where H.ToID = G.Parent
)
update GRAPH
   set Level = (select Level
  from HIERARCHY H
 where ToID = GRAPH.Child);
--EQP-- 0,0,0,SCAN TABLE GRAPH
--EQP-- 0,0,0,EXECUTE CORRELATED SCALAR SUBQUERY 0
--EQP-- 2,0,0,SCAN TABLE GRAPH
--EQP-- 3,0,0,SCAN TABLE HIERARCHY AS H
--EQP-- 3,1,1,SEARCH TABLE GRAPH AS G USING AUTOMATIC COVERING INDEX (Parent=?)
--EQP-- 1,0,0,COMPOUND SUBQUERIES 0 AND 0 (UNION ALL)
--EQP-- 0,0,0,SEARCH SUBQUERY 1 AS H USING AUTOMATIC COVERING INDEX (ToID=?)
Run Time: real 0.046 user 0.00 sys 0.015625

select Level,
 count(*) as Number
from GRAPH
group by Level;
--EQP-- 0,0,0,SCAN TABLE GRAPH
--EQP-- 0,0,0,USE TEMP B-TREE FOR GROUP BY
0|1
1|47
2|215
3|638
4|1010
5|729
6|50
Run Time: real 0.000 user 0.00 sys 0.00


TimeThis :  Command Line :  sqlite64 graph.db < GRAPH-performance-recursive.sql
TimeThis :Start Time :  Thu Feb 16 06:47:58 2017
TimeThis :  End Time :  Thu Feb 16 06:47:58 2017
TimeThis :  Elapsed Time :  00:00:00.093


>timethis "sqlite64 graph.db < GRAPH-performance-iterative.sql"

TimeThis :  Command Line :  sqlite64 graph.db < GRAPH-performance-iterative.sql
TimeThis :Start Time :  Thu Feb 16 06:48:10 2017

SQLite 3.18.0 2017-02-15 22:36:15 58797e9bafa95709e0f706a15f42f93b409e2db5
.eqp on
.timer on

update GRAPH
   set Level = null;
--EQP-- 0,0,0,SCAN TABLE GRAPH
Run Time: real 0.031 user 0.00 sys 0.015625

update GRAPH
   set Level = 0
 where Parent is null;
--EQP-- 0,0,0,SCAN TABLE GRAPH
Run Time: real 0.015 user 0.00 sys 0.00

update GRAPH
   set Level = 1
 where Parent in (select Child
from GRAPH
   where Level = 0);
--EQP-- 0,0,0,SCAN TABLE GRAPH
--EQP-- 0,0,0,EXECUTE LIST SUBQUERY 0
--EQP-- 0,0,0,SCAN TABLE GRAPH
Run Time: real 0.016 user 0.00 sys 0.00

update GRAPH
   set Level = 2
 where Parent in (select Child
from GRAPH
   where Level = 1);
--EQP-- 0,0,0,SCAN TABLE GRAPH
--EQP-- 0,0,0,EXECUTE LIST SUBQUERY 0
--EQP-- 0,0,0,SCAN TABLE GRAPH
Run Time: real 0.015 user 0.015625 sys 0.00

update GRAPH
   set Level = 3
 where Parent in (select Child
from GRAPH
   where Level = 2);
--EQP-- 0,0,0,SCAN TABLE GRAPH
--EQP-- 0,0,0,EXECUTE LIST SUBQUERY 0
--EQP-- 0,0,0,SCAN TABLE GRAPH
Run Time: real 0.032 user 0.015625 sys 0.00

update GRAPH
   set Level = 4
 where Parent in (select Child
from GRAPH
   where Level = 3);
--EQP-- 0,0,0,SCAN TABLE GRAPH
--EQP-- 0,0,0,EXECUTE LIST SUBQUERY 0
--EQP-- 0,0,0,SCAN TABLE GRAPH
Run Time: real 0.015 user 0.00 sys 0.00

update GRAPH
   set Level = 5
 where Parent in (select Child
from GRAPH
   where Level = 4);
--EQP-- 0,0,0,SCAN TABLE GRAPH
--EQP-- 0,0,0,EXECUTE LIST SUBQUERY 0
--EQP-- 0,0,0,SCAN TABLE GRAPH
Run Time: real 0.016 user 0.00 sys 0.00

update GRAPH
   set Level = 6
 where Parent in (select Child
from GRAPH
   where Level = 5);
--EQP-- 0,0,0,SCAN TABLE GRAPH
--EQP-- 0,0,0,EXECUTE LIST SUBQUERY 0
--EQP-- 0,0,0,SCAN TABLE GRAPH
Run Time: real 0.016 

Re: [sqlite] sqlite3 hangs on query

2017-02-16 Thread Jens-Heiner Rechtien

On 15/02/2017 23:36, Richard Hipp wrote:

On 2/15/17, Richard Hipp  wrote:

It is an optimization opportunity, not a bug.


That optimization is now on trunk.

Very cool!

Thanks,
- Heiner


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multiple connections, page sizes and cache

2017-02-16 Thread Kim Gräsman
I've now built an inventory of all our connections and their
lifetimes. They come in three primary classes:

1) Perpetual -- opened on app startup, closed on shutdown
2) Periodical, transient, serial -- only one at a time, happens quite
rarely (online backup)
3) Stochastic, transient, concurrent -- can happen whenever, and
potentially in parallel

Since 1 & 2 are unavoidable, it makes sense to reserve cache memory
for them up-front with SQLITE_CONFIG_PAGECACHE.

But the class-3 connections for the most part never happen. When they
do happen, they will definitely overlap with (1) and might overlap
with (2).

So I was thinking it would be useful for us to have these connections
by-pass the global preallocated page cache buffer and just go straight
to malloc(). That way we don't risk that they steal pages from the
perpetual connections, and force *them* to go to malloc() for memory,
increasing overall memory usage.

Would something like "PRAGMA cache_allocator=malloc" be a crazy idea?

Thanks,
- Kim

On Sat, Feb 4, 2017 at 3:21 PM, Kim Gräsman  wrote:
> Hi Richard,
>
> I got side-tracked with other work, but am coming back to this now.
>
> On Sun, Jan 15, 2017 at 5:50 PM, Richard Hipp  wrote:
>> On 1/15/17, Kim Gräsman  wrote:
>>>
>>> 1) If I configure a global SQLite heap with SQLITE_CONFIG_HEAP, won't
>>> I just trade malloc heap fragmentation for SQLite private heap
>>> fragmentation? Or does SQLite's fragmentation-prevention strategy work
>>> well even in the face of multiple page sizes and connections?
>>
>> As described in the document that you read
>> (https://www.sqlite.org/malloc.html), if you keep your memory usage
>> below the Robson limit, then fragmentation will never be a problem.
>> Guaranteed.
>>
>> If you exceed the Robson limit, it is theoretically possible that
>> fragmentation could cause problems.  But in practice, we don't
>> normally have issues.
>>
>>>
>>> 2) What happens if the configured heap is depleted?
>>
>> You get SQLITE_NOMEM errors.  Use sqlite3_memory_used() to monitor
>> your memory utilization and raise alarms if it gets too high.  And/or
>> call sqlite3_db_release_memory() to try to free up memory from caches
>> if you start to run low.
>
> Thanks, makes sense!
>
> We are having a hard time estimating needed memory up-front, and we
> can't afford SQLITE_NOMEM (we'd rather take slower
> allocation/fragmentation). Especially since our app has a mix of
> long-running and transient connections, it's hard for us to tell
> exactly how many, with what cache needs, will be running concurrently
> at any one time.
>
> One thing that we've been discussing internally, and can't find a
> definitive answer to:
>
> Why are these arenas/buffers global, and not associated with each
> connection? It seems to me making a big chunk malloc() to match cache
> size when opening a connection would make it easier for an application
> such as ours, where some connections come and go. I guess it could be
> more fragmentation-/OOM-prone but with a reasonable malloc
> implementation it feels like it would be more flexible without
> sacrificing much of the benefits.
>
> Thank you,
> - Kim
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] About the performance of recursive WITH

2017-02-16 Thread Jean-Luc Hainaut

Hi,

This post concerns a strange (imho) behaviour of recursive WITH update 
query as far as execution time is concerned.


I have created (in an "In memory" DB) a table that stores a set of nodes 
arranged in a tree structure (actually the skeleton of the contents of a 
Windows folder):


   create table GRAPH(Parent int, Child int, Level int);

Column "Level" indicates the depth of the node in the tree. The root 
node has Level 0, its children Level 1, and so on.


2690 nodes have been inserted, with "Level" initialized to null:

   insert into GRAPH(Parent,Child) values (null,1); -- root node (no 
parent)
   insert into GRAPH(Parent,Child) values (1,9);-- child of the 
root node
   insert into GRAPH(Parent,Child) values (9,10);   -- grand-child of 
the root node

   insert into GRAPH(Parent,Child) values (9,11);
   insert into GRAPH(Parent,Child) values (9,12);
   insert into GRAPH(Parent,Child) values (9,13);
   etc.

Considering the size of the table, no indexes have been created.
The distribution of nodes among the levels is fairly "normal":

   +---++
   | Level | Number |
   +---++
   | 0 | 1  |
   | 1 | 47 |
   | 2 | 215|
   | 3 | 638|
   | 4 | 1010   |
   | 5 | 729|
   | 6 | 50 |
   +---++

Now, I would like to compute the "Level" value of all nodes from their 
position in the tree. This task immediately suggests a recursive WITH 
update:


   with recursive
   HIERARCHY(FromID,ToID,Level) as
  (
  select Parent, Child, '0'
  from   GRAPH where Parent is null
 union all
  select H.ToID, G.Child, H.Level + 1
  from   HIERARCHY H, GRAPH G
  where  H.ToID = G.Parent
  )
   update GRAPH
   setLevel = (select Level from HIERARCHYwhere ToID = GRAPH.Child);

When this query is executed by SQLite 3.16.2, the timer reports an 
execution time of 5.522 s.  Adding an index on "Parent" and one on 
"Child" just makes things worse (6.381 s.)


I find this figures quite high, so that I try an iterative technique, 
which is likely to be close to the execution strategy of the WITH 
statement (https://www.sqlite.org/lang_with.html). I translate it for 
the CLI shell as follows:


   update GRAPH set Level = 0 where Parent is null;

   update GRAPH set Level = 1
   where  Parent in (select Child from GRAPH where Level = 0);

   update GRAPH set Level = 2
   where  Parent in (select Child from GRAPH where Level = 1);

   update GRAPH set Level = 3
   where  Parent in (select Child from GRAPH where Level = 2);

   update GRAPH set Level = 4
   where  Parent in (select Child from GRAPH where Level = 3);

   update GRAPH set Level = 5
   where  Parent in (select Child from GRAPH where Level = 4);

   update GRAPH set Level = 6
   where  Parent in (select Child from GRAPH where Level = 5);

For this script, I get an execution time of 0.015 s., i.e., nearly 370 
times less!


Is there something wrong in my queries? Or is there an optimization 
trick for WITH queries by which one could approach the performance of 
the iterative version?


The scripts are available here: 
https://www.dropbox.com/s/23t4ycftlk0doy1/GRAPH-performance.zip?dl=0


Thanks for any advice

Jean-Luc Hainaut

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS3 tokenize unicode61 does not remove diacritics correctly?

2017-02-16 Thread Cezary H. Noweta

Hello,

On 2017-02-16 10:53, artur.krol.elea...@sqlite.org wrote:

[...]
The result is:
eoałzzcneoałzzcnlł



It seems diacritics from letter „ł” and „Ł” was not removed. Is it a sqlite bug?


In general, overlays (slash, crossbars, etc.) are considered as 
diacritics, however, Unicode does not provide a decomposition mapping 
for ``ł'', or ``Ł''. Even if it is a bug, then it will concern the 
Unicode standard rather then SQLite FTS3 itself, as the latter is using 
the character database provided by the Unicode standard.


-- best regards

Cezary H. Noweta
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] FTS3 tokenize unicode61 does not remove diacritics correctly?

2017-02-16 Thread artur.krol.eleader
Hi all,

I have an issue with FTS3 (http://www.sqlite.org/fts3.html).

I am creating virtual table using fts3 to query tokens:
CREATE VIRTUAL TABLE tok1 USING fts3tokenize(unicode61);
Documentation says:
„By default, "unicode61" also removes all diacritics from Latin script 
characters.”;

When I use query to select tokens:
SELECT token FROM tok1 WHERE input='ęóąłżźćńĘÓĄŁŻŹĆŃlŁ*';

The result is:
eoałzzcneoałzzcnlł

It seems diacritics from letter „ł” and „Ł” was not removed. Is it a sqlite bug?

Regards,
Artur Król


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users