Re: [sqlite] Thread safety of serialized mode

2017-02-19 Thread Rowan Worth
On 18 February 2017 at 01:16, James K. Lowden 
wrote:

> It's why I like Go: it's the first language in 30 years to incorporate
> concurrency in its design, and finally support a theoretically sound
> model.
>

I like Go too, but this is giving it a bit too much credit. What of Alef
and Limbo, which bore a similar concurrency model, and of which Go is
almost the spiritual successor? What of Clojure, which incorporates a
different concurrency model but is still theoretically sound (STM)? What of
Erlan- oh, Erlang *is* 30 years old :P

-Rowan
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Jens Alfke

> On Feb 17, 2017, at 11:11 AM, Dominique Devienne  wrote:
> 
> No they are not. They may be scheduled on threads, but they are not threads.

We're disagreeing on terminology, but I believe I’m correct. Threads don’t have 
to be implemented at the OS level. Threads implemented in user code are called 
“user threads” or “green threads” or sometimes “fibers”.

https://en.wikipedia.org/wiki/Thread_(computing)#Processes.2C_kernel_threads.2C_user_threads.2C_and_fibers
 


> You do have enough rope to hang yourself with, but Go unlike C has channels
> and the select statement,

We are in violent agreement here. Go has safe concurrency primitives; they’re 
one of the best features of the language. It also has unsafe ones like mutexes, 
and it allows sharing data between threads. You can inadvertently hurt yourself 
with shared data even if you only use the safe features — like by creating a 
channel of an interface type and not realizing that the channel is sending only 
pointers, not the objects themselves.

> That's the second time you've mischaracterized Go IMHO. —DD

After how many do I get a prize? ;-) 

I’ve written tens of thousands of lines of production Go code, so I think I’m 
entitled to some right to characterize the language, even if it differs from 
what you think.

—Jens

PS: I realize this thread [sic] has been way off-topic for a long time. This is 
my last reply to it on this list. Anyone may email me directly if they have 
something constructive to say :)


Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Dominique Devienne
On Fri, Feb 17, 2017 at 6:58 PM, Jens Alfke  wrote:

> [from another email]
> > It's why I like Go: it's the first language in 30 years to incorporate
> > concurrency in its design, and finally support a theoretically sound
> > model.
>
> Goroutines are still threads, and Go programs can and do share memory
> between goroutines,


No they are not. They may be scheduled on threads, but they are not threads.

And the fact you can have several goroutines communicating on a single
thread is indicative of that.

The same exact code using goroutines and channels can run on 1 thread, or N
threads.
No explicit yields, no locking, just communication via channels, which are
pipe-like in nature,
except they are typed and not just bytes.

And you don't need threads to do async-I/O in Go either. When Go makes an
IO call,
it's implicitly async, and the goroutine will often "yield" (be
unscheduled) to give another
one the opportunity to run while the first is sitting there waiting for its
IO to complete.
No callback hell, and completely transparent to the programmer.

> which has all the same concurrency issues you find in C.


You do have enough rope to hang yourself with, but Go unlike C has channels
and the select statement,
and that is what allows Go concurrent programs to be so much easier to
write than C, albeit not trivial to write.


> Go has a Memory Model specification** that lays out the undefined results
> of unsynchronized access to memory by multiple goroutines, and it’s very
> much like what the C spec describes.
>

Of course it does. That's why one of Go's mottos is:

Do not communicate by sharing memory; instead, share memory by
communicating. [1]

but sharing memory and locks is still often faster than the implicit model
of channels and CSP,
so Go does not preclude it. I.e. in that one instance, Go doesn't dumb down
the language too much.
But it's definitely not the model that's promoted; it is reserved for
high-performance code when warranted
and proven by profiling, and even then, one probably shouldn't go there.

That's the second time you've mischaracterized Go IMHO. --DD

[1] https://blog.golang.org/share-memory-by-communicating


Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Jens Alfke

> On Feb 17, 2017, at 9:18 AM, James K. Lowden  wrote:
> 
> It's the OS.  A thread is an OS abstraction, not a machine feature.  

You don’t need an OS to create threads. All you need is setjmp/longjmp or the 
equivalent. “Green” threads are more awkward to use than kernel-supported 
threads, but they’re frequently used, and they can reproduce many of the nasty 
concurrency problems you get with “real” threads.

> The OS *chooses* not to
> arbitrate access to memory, and the compiler *chooses* to expose the
> programmer to those pathological interactions.  That's by design.

OS arbitration of memory access at a fine-grained level is too expensive. 
Microkernel systems have made headway, but they still don’t match the 
performance of monolithic OS’s*, and I think that for the finer-grained 
concurrency that in-process threads are often used for, they’d be way too 
expensive.

[from another email]
> It's why I like Go: it's the first language in 30 years to incorporate
> concurrency in its design, and finally support a theoretically sound
> model.  

Goroutines are still threads, and Go programs can and do share memory between 
goroutines, which has all the same concurrency issues you find in C. Go has a 
Memory Model specification** that lays out the undefined results of 
unsynchronized access to memory by multiple goroutines, and it’s very much like 
what the C spec describes.

I’ve used Go quite a bit in the past, but today I’m more interested in Rust and 
Pony which actually do prevent shared memory access in (supposedly) efficient 
ways.

—Jens

* https://en.wikipedia.org/wiki/Microkernel#Performance 

** https://golang.org/ref/mem


Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Jens Alfke

> On Feb 17, 2017, at 3:48 AM, Simon Slavin  wrote:
> 
> It is insane that a CPU would allow two threads to interfere with each-other 
> in such a way as to 'break' an INC instruction.

It may be insane to you, but it’s simply how multiprocessor computer 
architectures work, and railing against it won’t accomplish anything. Any 
multiprocessor/multicore computer is a distributed system — you can think of it 
as client/server where the clients are the CPUs and the RAM is the server.

An INC instruction is a read / modify / write sequence. If you want this to be 
atomic, you have to use a transaction. Transactions are expensive. If every 
read/modify/write used a transaction, it would slow the system to a crawl, and 
99.999% of the time it’s unnecessary. So the CPU makes it optional. In the case 
of the x86 INC instruction (which I didn’t know about), there is a LOCK flag 
that can be used to do this. A C compiler isn’t normally going to generate code 
that uses this flag, _unless_ you call an atomic-increment primitive.

This has nothing to do with threads or programming languages. It’s simply the 
way today’s computer architectures work. There have been different 
multiprocessor architectures that didn’t share memory, but they never became 
practical enough to make it into general-purpose computers. (Actually one of 
them did: the GPU. But we don’t write database engines that run on GPUs, to my 
knowledge.)

—Jens


Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Deon Brewis
Actually C# await on at least the file async IO operations will complete the 
await on the I/O completion port callback.
 
It's absolutely sad that there is no C++ support for that as of yet, and the 
standards committee is nowhere near it. It got kicked out of C++ 17... again. 
MAYBE we'll have language support in 2020, and then it's still going to be 1 or 
2 cycles after that before there will be standard library support for common 
async IO operations.

PS: The work that Microsoft did in Midori & M# on concurrency was also very 
promising. This is worth a read:
http://joeduffyblog.com/2016/11/30/15-years-of-concurrency/

- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of James K. Lowden
Sent: Friday, February 17, 2017 9:17 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Thread safety of serialized mode

On Fri, 17 Feb 2017 04:10:09 +
Deon Brewis <de...@outlook.com> wrote:

> If you look at the original underlying NT I/O architecture that Cutler 
> implemented - it is a thing of beauty that's based in async patterns, 
> and not threads.
...
> If instead NT initially only exposed the Nt API's and not the Win32 
> layers, we would have had languages that simplified async a long time 
> ago - and multi-threaded would be the domain of a few applications 
> that actually need compute, and not just non-blocking IO.

That's very interesting, Deon, thanks.  I'm happy to cross out Cutler's name 
from my list of those who Ruined Computing During My Lifetime.  

I'm not as optimistic as you about what would have happened languagewise.  
Language support for OS features is pretty rare.  Threads happened, and no 
language corralled them. Windows has completion ports, which are very useful 
for asynchronous control, but afaik no Microsoft language supports them.  

It's why I like Go: it's the first language in 30 years to incorporate 
concurrency in its design, and finally support a theoretically sound model.  

--jkl


Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Dominique Devienne
On Fri, Feb 17, 2017 at 5:10 AM, Deon Brewis  wrote:

> If you look at the original underlying NT I/O architecture that Cutler
> implemented - it is a thing of beauty that's based in async patterns, and
> not threads.
>

Thanks. That Cutler reference led me to
http://static.slated.org/nt/the-real-nt.html which is interesting. --DD


Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread James K. Lowden
On Thu, 16 Feb 2017 19:35:51 -0800
Jens Alfke  wrote:

> > On Feb 16, 2017, at 6:26 PM, James K. Lowden
> >  wrote:
> > 
> > It doesn't change the fact that the OS has subverted the
> > guarantees your language would otherwise provide, such as the
> > atomicity of ++i noted elsewhere in this thread.  
> 
> It's not the OS, it's the architecture of multiprocessor systems. 

It's the OS.  A thread is an OS abstraction, not a machine feature.  

You accept that exec(2) creates a single-threaded process that allows a
language like C to express sequential logic.  You accept that Posix
threads provide a different flow of control.  You know that to execute
threads requires the OS to do some degree of context switching, and the OS
defines what a "context" is by what it chooses to preserve.  You know
you can have many threads on a single processor. You know the OS
determines things like thread affinity.  

A process and a thread, as defined by the operating system, are both OS
abstractions, a representation of a machine.  The OS *chooses* not to
arbitrate access to memory, and the compiler *chooses* to expose the
programmer to those pathological interactions.  That's by design.

Without the OS supporting threads, the machine's intrinsic
multiprocessing implementation would not be exposed to the programmer,
any more than the intricacies of SATA or Ethernet chip design are.  

--jkl




Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread James K. Lowden
On Fri, 17 Feb 2017 04:10:09 +
Deon Brewis  wrote:

> If you look at the original underlying NT I/O architecture that
> Cutler implemented - it is a thing of beauty that's based in async
> patterns, and not threads.
...
> If instead NT initially only exposed the Nt API's and not the Win32
> layers, we would have had languages that simplified async a long time
> ago - and multi-threaded would be the domain of a few applications
> that actually need compute, and not just non-blocking IO. 

That's very interesting, Deon, thanks.  I'm happy to cross out Cutler's
name from my list of those who Ruined Computing During My Lifetime.  

I'm not as optimistic as you about what would have happened
languagewise.  Language support for OS features is pretty rare.  Threads
happened, and no language corralled them. Windows has completion ports,
which are very useful for asynchronous control, but afaik no Microsoft
language supports them.  

It's why I like Go: it's the first language in 30 years to incorporate
concurrency in its design, and finally support a theoretically sound
model.  

--jkl


Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Petite Abeille

> On Feb 17, 2017, at 12:21 AM, Warren Young  wrote:
> 
> How can we expect people to write threaded programs when even a simple 
> integer increment is prone to race conditions and read-modify-write errors?

"… we did not (and still do not) believe in the standard multithreading model, 
which is preemptive concurrency with shared memory: we still think that no one 
can write correct programs in a language where ‘a=a+1’ is not deterministic.”

— Roberto Ierusalimschy & Co., The Evolution of Lua
https://www.lua.org/doc/hopl.pdf



Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Richard Damon
The issue here (with the INC instruction) isn't 'Thread' but 
'Multi-Master/(Core)'. On a single processor, the INC is atomic. When you have 
a possibility of multiple masters (or cores) accessing the same location, you need 
to be more careful, and that LOCK prefix is expensive because of what it needs to 
do to caches. The more general application is that when you share information 
between concurrent operations you must be careful and use the right tools to 
make sure it 'works', and you don't want to use these all the time for things 
that aren't shared because they can get too expensive.

When dealing with an API, some items are designed to be used across threads, 
and some only within a single thread. You can sometimes get a single threaded 
API to work multithreaded by adding external synchronization (which can be 
clumsy), and you can use the multithreaded parts of the API in just a single 
thread, but they may be slower than a single threaded version of the API. The 
key is you need to read the documentation and follow the rules. 

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Simon Slavin
Sent: Friday, February 17, 2017 6:48 AM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Thread safety of serialized mode


On 17 Feb 2017, at 9:27am, Clemens Ladisch <clem...@ladisch.de> wrote:

> X86(-64) has always had "INC [mem]" and "LOCK INC [mem]".

It is insane that a CPU would allow two threads to interfere with each-other in 
such a way as to 'break' an INC instruction.

But yes, we have drifted a long way from SQLite.  Shuttup Simon.

Simon.


Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Simon Slavin

On 17 Feb 2017, at 9:27am, Clemens Ladisch  wrote:

> X86(-64) has always had "INC [mem]" and "LOCK INC [mem]".

It is insane that a CPU would allow two threads to interfere with each-other in 
such a way as to 'break' an INC instruction.

But yes, we have drifted a long way from SQLite.  Shuttup Simon.

Simon.


Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Clemens Ladisch
Jens Alfke wrote:
> A read-modify-write cycle on an address in main memory is intrinsically
> _not_ atomic on a multiprocessor system, not unless the CPU goes through
> some expensive efforts to make it so (cache invalidation, bus locking,
> etc.)

Most modern CPUs have caches, and any _normal_ memory write already
requires that the CPU takes ownership of a cache line.  And once it has
ownership, the read-modify-write cycle is trivial.

Atomic operations are somewhat slower than normal ones because they
often imply a memory barrier, but they are very slow only if you do them
in uncached memory.


Regards,
Clemens


Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Darren Duncan

On 2017-02-14 5:05 PM, Darren Duncan wrote:
> On 2017-02-14 4:46 PM, Richard Hipp wrote:
>> This is yet another reason why I say "threads are evil".  For
>> whatever reason, programmers today think that "goto" and pointers and
>> assert() are the causes of all errors, but threads are cool and
>> healthful.  Entire programming languages are invented (I'm thinking of
>> Java) to make goto and pointers impossible or to make assert()
>> impossible (Go) and yet at the same time encourage people to use
>> threads.  It boggles the mind
>
> There is nothing inherently wrong with threads in principle, just in how some
> people implement them.  Multi-core and multi-CPU hardware is normal these days
> and is even more the future.  Being multi-threaded is necessary to properly
> utilize the hardware, or else we're just running on a single core and letting
> the others go idle.  The real problem is about properly managing memory.  Also
> giving sufficient hints to the programming language so that it can implicitly
> parallelize operations.  For example, want to filter or map or reduce a relation
> and have 2 cores, have one core evaluate half the tuples and another evaluate
> the other half, and this can be implicit simply by declaring the operation
> associative and commutative and lacking of side-effects or whatever. -- Darren
> Duncan


Based on the responses I have seen, I think a lot of people have misunderstood 
what I was trying to say here.


When I said "threads", I was meaning that term in the most generic sense 
possible, in the same way that "concurrency" is generic.  My saying 
"multi-threaded" is saying to use appropriate tools, which come in a variety of 
names, so that you permit your workload to be spread over multiple CPUs or CPU 
cores at once, rather than constraining it to be run serially in a single core.


I was never advocating using a specific mechanism like a C language "thread".

-- Darren Duncan



Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Clemens Ladisch
Jens Alfke wrote:
> PS: I’m not aware of _any_ current CPUs that can increment main memory
> in one instruction, atomically or not.

X86(-64) has always had "INC [mem]" and "LOCK INC [mem]".

And MSP430 calls itself RISC, but is so orthogonal that any operand
can be in memory.


Regards,
Clemens


Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Simon Slavin

On 17 Feb 2017, at 8:54am, Clemens Ladisch  wrote:

> Bob Friesenhahn wrote:
> 
>> Does anyone have an idea about this specific problem that we encountered?
>> 
>> It is not clear to me if this is a threading issue, or memory corruption 
>> issue
> 
> It's probably memory corruption caused by a threading issue.
> 
> The original mail said:
> 
>> Any thread may acquire and use this one database connection at any time.
> 
> Add a mutex. Lock it before beginning _any_ transaction (read or read/write),
> and unlock it only after the transaction has finished.

You may or may not want to use SQLite’s own mutex routines:



> Alternatively, give each thread its own connection.

Often the route which leads to a program which is simpler to read and only a 
tiny bit slower.

Simon.


Re: [sqlite] Thread safety of serialized mode

2017-02-17 Thread Clemens Ladisch
Bob Friesenhahn wrote:
> Does anyone have an idea about this specific problem that we encountered?
>
> It is not clear to me if this is a threading issue, or memory corruption issue

It's probably memory corruption caused by a threading issue.

The original mail said:
> Any thread may acquire and use this one database connection at any time.

Add a mutex. Lock it before beginning _any_ transaction (read or read/write),
and unlock it only after the transaction has finished.

Alternatively, give each thread its own connection.


Regards,
Clemens


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Deon Brewis
If you look at the original underlying NT I/O architecture that Cutler 
implemented - it is a thing of beauty that's based in async patterns, and not 
threads.

It was the Win32 wrappers over the NT subsystem that tried to make things 
"easier" for developers to deal with, which forced synchronous blocking code on 
top of the async Zw/Nt layers. This made the only practical way to deal with 
"blocking" I/O to be multi-threaded.

Today even junior-ish developers can deal with async code in node.js, and not 
bat an eyelid about it - the language makes async interaction simple enough - 
even in a single threaded environment. It wasn't that the underlying technology 
was wrong, it was the simplifying abstraction on top of it that was.

If instead NT initially only exposed the Nt API's and not the Win32 layers, we 
would have had languages that simplified async a long time ago - and 
multi-threaded would be the domain of a few applications that actually need 
compute, and not just non-blocking IO. This wasn't due to Cutler's architecture 
though - more market driven decisions trying to maintain API compatibility with 
Win95, which was in turn driven by API compatibility with 16 bit API's, which 
long predated Cutler.

- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of James K. Lowden
Sent: Thursday, February 16, 2017 6:26 PM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Thread safety of serialized mode

On Thu, 16 Feb 2017 21:49 +
Tim Streater <t...@clothears.org.uk> wrote:

> > What's inherently wrong with threads in principle is that there is 
> > no logic that describes them, and consequently no compiler to 
> > control that logic.
> 
> [snip remainder of long whinge about threads]
> 
> Sounds, then, like I'd better eliminate threads from my app. In which 
> case when the user initiates some action that may take some minutes to 
> complete, he can just lump it when the GUI becomes unresponsive.

[snip chest thumping]

You didn't refute my assertion, and facts refute yours.  

There has been a GUI in use for some 30 years, dating back to your VMS days, 
that is single-threaded.  I'm sure you've heard of it, the X Window System?  

If your particular GUI system is based on threads, like, say, Microsoft 
Windows, then, yes, you're pretty much cornered into using threads.  But that 
doesn't change the fact that you have no compiler support to verify the 
correctness of memory access over the time domain.  It doesn't change the fact 
that the OS has subverted the guarantees your language would otherwise provide, 
such as the atomicity of ++i noted elsewhere in this thread.  

WR Stevens describes 4 models for managing concurrency:

1.  Multiplexing: select(2)
2.  Multiprocessing
3.  Asynchronous callbacks
4.  Signal-driven

None of those subvert the semantics of the programming language.  In each case, 
at any one moment there is only one thread of control over any given section of 
logic.  

Hoare had already published "Communicating Sequential Processes"  
(http://www.usingcsp.com/cspbook.pdf) when Microsoft hired David Cutler to design 
Windows NT.  It's too bad they adopted threads as their concurrency-management 
medium.  If they'd chosen CSP instead, maybe they wouldn't have set computing 
back two decades.  

--jkl




Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Jens Alfke

> On Feb 16, 2017, at 6:26 PM, James K. Lowden  wrote:
> 
> It doesn't change the fact that the OS has subverted the
> guarantees your language would otherwise provide, such as the atomicity
> of ++i noted elsewhere in this thread.  

It’s not the OS, it’s the architecture of multiprocessor systems. A 
read-modify-write cycle on an address in main memory is intrinsically _not_ 
atomic on a multiprocessor system, not unless the CPU goes through some 
expensive efforts to make it so (cache invalidation, bus locking, etc.) You can 
get that behavior if you like by using libraries like stdatomic or the C++ 
atomic types, but it makes the operation much, much slower.

> In
> each case, at any one moment there is only one thread of control over
> any given section of logic.  

That’s nice, but that’s just not the way the memory architectures of current 
computers work. Threaded programming happens to expose that behavior because it 
allows simultaneous read/write access to memory, and the semantics of that are 
subtle and weird. (It’s a distributed system after all. Distributed systems are 
rather Einsteinian: they don’t have strong causality.)

> It's too bad they adopted threads as
> their concurrency-management medium.  If they'd chosen CSP instead,
> maybe they wouldn't have set computing back two decades.  

No, they’d never have shipped a useable product. The attractive thing about 
threads is that they’re cheap and efficient. Higher level constructs like CSP 
are great, but they have a lot of overhead. For example, look at Mach: as 
originally implemented, it had a tiny kernel that only did message-passing, and 
everything else was implemented as separate processes that communicated by 
messaging. Unfortunately it was too slow to be useful. Every OS ever shipped by 
NeXT and Apple has a monolithic kernel based on BSD, with Mach messaging only 
used for higher-level tasks.

It’s really easy to point to hard work done by other engineers and insult it 
without any knowledge of the actual design constraints and development issues. 
I hear it all the time, and personally I am sick of that attitude.

—Jens


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread James K. Lowden
On Thu, 16 Feb 2017 21:49 +
Tim Streater  wrote:

> > What's inherently wrong with threads in principle is that there is
> > no logic that describes them, and consequently no compiler to
> > control that logic.  
> 
> [snip remainder of long whinge about threads]
> 
> Sounds, then, like I'd better eliminate threads from my app. In which
> case when the user initiates some action that may take some minutes
> to complete, he can just lump it when the GUI becomes unresponsive. 

[snip chest thumping]

You didn't refute my assertion, and facts refute yours.  

There has been a GUI in use for some 30 years, dating back to your VMS
days, that is single-threaded.  I'm sure you've heard of it, the X
Window System?  

If your particular GUI system is based on threads, like, say,
Microsoft Windows, then, yes, you're pretty much cornered into using
threads.  But that doesn't change the fact that you have no compiler
support to verify the correctness of memory access over the time
domain.  It doesn't change the fact that the OS has subverted the
guarantees your language would otherwise provide, such as the atomicity
of ++i noted elsewhere in this thread.  

WR Stevens describes 4 models for managing concurrency:

1.  Multiplexing: select(2)
2.  Multiprocessing
3.  Asynchronous callbacks
4.  Signal-driven

None of those subvert the semantics of the programming language.  In
each case, at any one moment there is only one thread of control over
any given section of logic.  

Hoare had already published "Communicating Sequential
Processes"  (http://www.usingcsp.com/cspbook.pdf) when Microsoft hired David
Cutler to design Windows NT.  It's too bad they adopted threads as
their concurrency-management medium.  If they'd chosen CSP instead,
maybe they wouldn't have set computing back two decades.  

--jkl




Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread James K. Lowden
On Thu, 16 Feb 2017 16:21:06 -0700
Warren Young  wrote:

> https://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf
> 
> Threads aren't just distasteful from an implementation standpoint,
> they're *mathematically unsound*.

Thank you for that.  I think I encountered that paper late one night and
never got back to it.  This was poignant: 

"I conjecture that most multi-threaded general-purpose
applications are, in fact, so full of concurrency bugs that as
multi-core architectures become commonplace, these bugs will begin to
show up as system failures."

I guess it was about the year 2005 when we upgraded our SQL Server to
IIRC a machine with 4 processors.  For the first time, we had hardware
that could execute multiple threads literally simultaneously.  And
for the first time some queries failed to execute unless we set the
maximum degree of parallelism to 1.  The server itself was also
unstable.  As I recall, we used certain settings in the registry to
restrict the parallelism in the server until later releases made that
unnecessary.  

Concurrency bugs exposed by multi-core architectures?  Ya think?  

--jkl







Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Bob Friesenhahn

On Thu, 16 Feb 2017, Warren Young wrote:

> Taking it off-list, since there is zero remaining connection to SQLite now:

Thank you for taking it off list.

> How can we expect people to write threaded programs when even a
> simple integer increment is prone to race conditions and
> read-modify-write errors?


I have not encountered much issue with threads in my own programs. 
Using threads requires attention to detail, such as if all libraries 
used (e.g. sqlite3) are thread safe and the terms by which they are 
thread safe.


However, there is still the specific issue I posted about which no one 
has posted a follow-up on.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Warren Young
Taking it off-list, since there is zero remaining connection to SQLite now:

> On Feb 16, 2017, at 2:49 PM, Tim Streater  wrote:
> 
> On 16 Feb 2017 at 18:30, James K. Lowden  wrote: 
> 
>> On Tue, 14 Feb 2017 17:05:30 -0800
>> Darren Duncan  wrote:
>> 
>>> There is nothing inherently wrong with threads in principle
>> 
>> What's inherently wrong with threads in principle is that there is no
>> logic that describes them, and consequently no compiler to control that
>> logic.  
> 
> [snip remainder of long whinge about threads]
> 
> Sounds, then, like I'd better eliminate threads from my app. In which case 
> when the user initiates some action that may take some minutes to complete, 
> he can just lump it when the GUI becomes unresponsive.

CSPs, the actor model, message passing architectures, etc. all give you ways to 
have concurrent processing without your program explicitly dealing with 
OS-level threads.

> That OK with you? Can I point the user your way when he gives me grief about 
> it? Or should I just say that no, he can't have a responsive GUI under those 
> conditions because some guy on the Internet says so?

“Fear is the path to the dark side. Fear leads to anger. Anger leads to hate. 
Hate leads to suffering.”

Let go your anger. :)

(And lest you think I fear threads and that this sword cuts both ways, no: I 
avoid using threads whenever possible because I *understand* threads.)

> I'll just bring my 50 years experience of writing software to the table, 
> including threaded apps for PDP-11s and VAXes. It's called debugging.

Some kinds of debugging are easier than others.  Why set yourself up for a much 
harder problem than necessary by using inherently problematic mechanisms?  

(Plural, meaning threads and all the synchronization primitives that go along 
with them, which drag in new problems like deadlocking that you didn’t have 
before you added mutexes to try and solve the problems you bought by adding 
that one little ol' thread.)

Have you read the Lee paper referenced in the mailing list thread?

https://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf

Threads aren’t just distasteful from an implementation standpoint, they’re 
*mathematically unsound*.

Did you study the ARM assembly language comparison I linked to?  How can we 
expect people to write threaded programs when even a simple integer increment 
is prone to race conditions and read-modify-write errors?


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Jens Alfke

> On Feb 16, 2017, at 11:49 AM, Warren Young  wrote:
> 
> A software developer who refuses to learn about his processor’s assembly 
> language is like trying to become an electrical engineer without learning 
> anything about physics. 

In this case what you need to read is the specification of the memory model 
assumed by C and C++. The language specs are very explicit about what you can 
and can’t expect from concurrent memory accesses. (They’re also very hard to 
read! But fortunately there are good books about concurrent C/C++ programming 
that explain the rules clearly, like “C++ Concurrency in Action”.)

Of course that might not exactly match your CPU, because the specs are 
cross-platform and generally have to be very conservative … but in most cases 
it’s not a good idea to bake detailed knowledge of the CPU into your code, if 
you ever plan on porting that code to any other CPU. :) 

(Longtime programmers for Apple platforms know this especially well … I’ve been 
through transitions from the 68000 to the 68030, then to PowerPC, then to dual 
CPUs, then to 64-bit, then to X86, then to ARM7, then to ARM-64bit. Whew! 
Especially in the early stages, there were super clever things you could do 
that took advantages of details of the CPU architecture — like the way the 
68000 ignored the high 8 bits of pointers — that then stopped working in the 
next generation and forced you to redesign your code.)

—Jens

PS: I’m not aware of _any_ current CPUs that can increment main memory in one 
instruction, atomically or not. The PDP-11 did famously have such an 
instruction, which is exactly why the ++ and -- operators were added to C’s 
parent BCPL, since it was intended as a “structured assembly language”.


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Tim Streater
On 16 Feb 2017 at 18:30, James K. Lowden  wrote: 

> On Tue, 14 Feb 2017 17:05:30 -0800
> Darren Duncan  wrote:
>
>> There is nothing inherently wrong with threads in principle
>
> What's inherently wrong with threads in principle is that there is no
> logic that describes them, and consequently no compiler to control that
> logic.  

[snip remainder of long whinge about threads]

Sounds, then, like I'd better eliminate threads from my app. In which case when 
the user initiates some action that may take some minutes to complete, he can 
just lump it when the GUI becomes unresponsive. That OK with you? Can I point 
the user your way when he gives me grief about it? Or should I just say that 
no, he can't have a responsive GUI under those conditions because some guy on 
the Internet says so?

Well that ain't gonna happen. I'll just bring my 50 years experience of writing 
software to the table, including threaded apps for PDP-11s and VAXes. It's 
called debugging.

--
Cheers  --  Tim


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Bob Friesenhahn
It seems like the discussion has turned into a general programming 
discussion unrelated to SQLite3.  Does anyone have an idea about this 
specific problem that we encountered (see quoted message below)?


It is not clear to me if this is a threading issue, or memory 
corruption issue, or if it is a SQLite3 implementation logic issue 
(something to do with a deferred moveto).  Why should destroying a 
prepared statement care about a cursor's deferred moveto?


Bob

On Wed, 15 Feb 2017, Bob Friesenhahn wrote:

It turns out that I have more data on the problem.  The error message 
reported reads something like:


SQLITE_CORRUPT: database disk image is malformed database corruption at line 
70273 of [17efb4209f]


We are using version 3.10.2.

Looking at amalgamation code I see that the error is returned from 
handleDeferredMoveto() and is based on a value returned from 
sqlite3BtreeMovetoUnpacked():


 70259 ** The cursor "p" has a pending seek operation that has not yet been
 70260 ** carried out.  Seek the cursor now.  If an error occurs, return
 70261 ** the appropriate error code.
 70262 */
 70263 static int SQLITE_NOINLINE handleDeferredMoveto(VdbeCursor *p){
 70264   int res, rc;
 70265 #ifdef SQLITE_TEST
 70266   extern int sqlite3_search_count;
 70267 #endif
 70268   assert( p->deferredMoveto );
 70269   assert( p->isTable );
 70270   assert( p->eCurType==CURTYPE_BTREE );
 70271   rc = sqlite3BtreeMovetoUnpacked(p->uc.pCursor, 0, p->movetoTarget, 0, &res);
 70272   if( rc ) return rc;
 70273   if( res!=0 ) return SQLITE_CORRUPT_BKPT;
 70274 #ifdef SQLITE_TEST
 70275   sqlite3_search_count++;
 70276 #endif
 70277   p->deferredMoveto = 0;
 70278   p->cacheStatus = CACHE_STALE;
 70279   return SQLITE_OK;
 70280 }

Ideas?

Bob



--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Keith Medcalf

www.microsoft.com ...

The only time the OS is not "mapped" into the process address space is if you 
are running 32-bit code on a 64-bit OS.  In that case it has to use an 
imitation syscall trampoline stored at the top of the 4GB 32-bit address space 
to jump into 64-bit mode to access the OS code.  In all cases where you are 
running either 64-bit processes on a 64-bit OS or a 32-bit process on a 32-bit 
OS, the entire OS is mapped into the process address space.

The most pure example of a DCSS based OS is CMS, although the actual real OS 
(the CP part of CP/CMS) lives in a separate supervisor process.

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of James K. Lowden
> Sent: Thursday, 16 February, 2017 11:30
> To: sqlite-users@mailinglists.sqlite.org
> Subject: Re: [sqlite] Thread safety of serialized mode
> 
> On Wed, 15 Feb 2017 07:55:16 -0700
> "Keith Medcalf" <kmedc...@dessus.com> wrote:
> 
> > Note that for several modern OSes, the OS is nothing more than a
> > discontiguous saved segment (DCSS) which is mapped into *every*
> > process space and that process isolation is more of a myth than a
> > reality.
> 
> Are you referring to one in particular we could read about?
> 
> --jkl
> 





Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Warren Young
On Feb 15, 2017, at 5:03 AM, a...@zator.com wrote:
> 
> I suppose someday, programming languages can do an analogous translation in 
> our limited but safe, sequential programs.

Not as long as we require side effects to achieve anything of practical value.  
Any form of I/O is a “side effect” by my definition, whether that’s disk, 
network, GUI, or what have you.

Avoiding threads is good because the well known problems with global variables 
magnify combinatorially when multiple threads can access them simultaneously 
— literally *simultaneously* on a modern multi-core processor! — in any pattern 
you can conceive, and more you probably haven’t even thought of.  The problem 
is combinatoric on the number of instructions in the program and the number of 
threads, which gives you a really big number really fast.  Humans aren’t good 
at thinking about all N billion execution paths through a given program.

Synchronization — whether that’s mutexes or transactions or message passing or 
something else — helps, but it always eats into the speed advantage of raw 
threads, so there will continue to be a continuous pressure to reduce 
synchronization rather than add more automatically.  Go look up “lock free data 
structures” if you want to see the kind of thing being done in this area.

Computers can help with the combinatoric explosion, but I’m not seeing a whole 
lot of progress on this with real world programs.  I want a tool like lint(1) 
that will detect synchronization errors statically, but for now, I think you 
have to rely on dynamic tools like Helgrind:

   http://valgrind.org/docs/manual/hg-manual.html

The problem with dynamic error detection is that it can only catch errors in 
code paths that you can trigger while the tool watches.  This is about more 
than just simple code coverage, it’s about *combinatoric* code path coverage.  
If you don’t test all possible interleavings of instructions among the threads 
and cores, you can still miss an error, even if the tool knows how to detect it.

I have to believe static thread correctness analysis is at least possible in 
principle, because humans do manage to see threading problems just by staring 
at the code long enough.  It might require strong AI to do it, but it’s got to 
be possible to at least do as well as an expert human.  But, that just means 
this becomes yet another, “Won’t it be great when we have strong AI?” wish.

A tool like Helgrind isn’t enough by itself.  It’ll blithely ignore your 
lock-free data structure code, for example.  It’ll also fail to flag logical 
errors in your SQLite code where you’re missing transactions, for another.

It’s worth repeating: Concurrency is hard.


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread Warren Young
On Feb 15, 2017, at 4:40 AM, Darren Duncan  wrote:
> 
> On 2017-02-15 2:40 AM, Clemens Ladisch wrote:
>> Cecil Westerhof wrote:
>> 
>> And just like with assembly code, you also have to count the time spent
>> writing it, and debugging the result.
> 
> Also, it's a long time since hand-writing assembly code was any good for 
> performance, unless you're a 1% top expert with a good reason.

While true insofar as it goes, that attitude leads to people being ignorant of 
what the compiler produces from the code you give it.  This almost caused a 
threading bug in a program I was modifying recently, and we only caught it 
ahead of time because someone questioned some basic assumptions, causing me to 
go look at the generated assembly.

Consider this humble lone line of code:

++i;

The threading bug is right there, staring at you.

What, you don’t see it?  How about now:

https://godbolt.org/g/xfJ9SQ

Yeah, that’s right, friends, integer increment isn’t a single instruction on 
ARM, even with gcc -O2, hence it is not atomic!  It takes at least three 
instructions (load, modify, store) and for some reason GCC chose to use 6 in 
this particular case.  (Probably some remnant of the function calling 
convention.)

That means that if you’re depending on that increment to be atomic across 
threads, you’re going to be in for a shock of the old bank balance transaction 
problem form.  (You know, the one every SQL newbie gets taught, where the 
account gets double-debited or double-credited if you don’t use transactions.)

The solution is to use GCC’s atomic increment primitive — also shown via the 
above link for comparison — which adds a couple of “dmb” ARM instructions to 
lock the code to a single CPU core through that critical section.
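
For readers who cannot open the link, here is a minimal sketch of the same idea. It uses the standard C11 atomic_fetch_add from <stdatomic.h> as a stand-in, which may not be the exact GCC primitive the link shows; the counter and function names are made up for the example:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define INCREMENTS_PER_THREAD 100000

/* Hypothetical counter shared by both threads. */
static atomic_int shared_counter;

static void *bump_counter(void *unused)
{
    (void)unused;
    for (int n = 0; n < INCREMENTS_PER_THREAD; n++)
        atomic_fetch_add(&shared_counter, 1);  /* indivisible read-modify-write */
    return NULL;
}

/* Runs two incrementing threads and returns the final count,
 * or -1 if a thread could not be started. */
int count_with_two_threads(void)
{
    pthread_t first, second;

    atomic_store(&shared_counter, 0);
    if (pthread_create(&first, NULL, bump_counter, NULL) != 0)
        return -1;
    if (pthread_create(&second, NULL, bump_counter, NULL) != 0) {
        pthread_join(first, NULL);
        return -1;
    }
    pthread_join(first, NULL);
    pthread_join(second, NULL);
    return atomic_load(&shared_counter);
}
```

With the atomic add, the result is always 2 * INCREMENTS_PER_THREAD. Replace the counter with a plain int and ++ and the total can come up short, because two threads may load the same old value before either stores its update.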

A software developer who refuses to learn about his processor’s assembly 
language is like trying to become an electrical engineer without learning 
anything about physics.  A typical practicing EE won’t need to break out 
Maxwell’s equations every day, but understanding the implications of those 
equations is what separates engineering from tinkering.

> If you want speed, write in C or something else that isn't assembly.  The 
> odds are like 99% that the modern C compiler will generate faster code than 
> you could ever write yourself in assembly, and it will be much less buggy.

Just to be sure people understand my position here, I will agree with this 
again.

If you don’t like my example above as presented, consider also that it supports 
the “threads are evil” hypothesis.  If you can’t count on a simple preincrement 
to be atomic, what else are you misunderstanding about what’s going on at the 
low levels of the system when it runs your multithreaded program?

(And no, it wasn’t my idea to use threads in the program I was modifying in the 
first place!  One of the planned upcoming changes is to redesign it from a 
2-thread system to two cooperating single-threaded programs communicating over 
an IPC channel.)

> Similarly with threads, for the vast majority of people, using other 
> concurrency models with supported languages are better; they will still get 
> the performance benefit of using multiple CPU cores but do it much more 
> safely than if you are explicitly using "threads" in code.

Also agreed.  I recommend starting with message-passing, and move on to other 
methods only when you can prove that won’t give the required benefit.
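
A minimal sketch of what message passing between two single-threaded processes can look like on a POSIX system, using nothing but pipe(2) and fork(2). The function name and the "sum these integers" job are made up for illustration, and the sketch assumes non-negative totals so that -1 can signal failure:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical example: send integers to a child process over one pipe
 * and read their sum back over another.  Returns -1 on failure. */
int sum_in_child(const int *values, int count)
{
    int to_child[2], from_child[2];
    int total = -1, sent_all = 1;
    pid_t child;

    assert(values != NULL || count == 0);
    if (pipe(to_child) != 0)
        return -1;
    if (pipe(from_child) != 0) {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }

    child = fork();
    if (child < 0) {
        close(to_child[0]);
        close(to_child[1]);
        close(from_child[0]);
        close(from_child[1]);
        return -1;
    }
    if (child == 0) {
        /* Child: has its own memory and sees only what arrives on the pipe. */
        int value, sum = 0;
        close(to_child[1]);
        close(from_child[0]);
        while (read(to_child[0], &value, sizeof value) == (ssize_t)sizeof value)
            sum += value;
        _exit(write(from_child[1], &sum, sizeof sum) == (ssize_t)sizeof sum ? 0 : 1);
    }

    /* Parent: send each message, then close the pipe to signal "no more". */
    close(to_child[0]);
    close(from_child[1]);
    for (int i = 0; i < count && sent_all; i++)
        sent_all = write(to_child[1], &values[i], sizeof values[i])
                   == (ssize_t)sizeof values[i];
    close(to_child[1]);
    if (!sent_all
        || read(from_child[0], &total, sizeof total) != (ssize_t)sizeof total)
        total = -1;
    close(from_child[0]);
    waitpid(child, NULL, 0);
    return total;
}
```

Neither process can touch the other's variables, so there is nothing to lock; the only ordering that matters is the order of messages on the pipe.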

I also recommend that you go learn you some Erlang (for great good):

   http://learnyousomeerlang.com/content

If you can’t get past the syntax, you can paper over it with the Ruby-like 
Elixir front-end:

   http://elixir-lang.org/

But to bring all of this back around on topic, beware that a SQL DB is 
basically a global variable store, albeit with arbitrated access.  You can 
create cross-process problems by misuse of the data store just as you can with 
global variables in a traditional threaded program.

Message-passing concurrency is just a tool that increases your chances of 
effortless success, it is not a guarantee of it.  No system that allows side 
effects can guarantee proper ordering of operations without some thought given 
to it, whether the mechanisms involved are mutexes, transactions, or something 
else.

Concurrency is hard.


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread James K. Lowden
On Tue, 14 Feb 2017 17:05:30 -0800
Darren Duncan  wrote:

> There is nothing inherently wrong with threads in principle

What's inherently wrong with threads in principle is that there is no
logic that describes them, and consequently no compiler to control that
logic.  

By analogy, the Forth language has no datatypes.  The programmer is
free to treat any area of memory as encoded as any type; the only
structure is the stack.  Languages that do define datatypes allow the
programmer to guarantee consistent treatment of variables by enforcing
type constraints.  

Threads were and are a theoretical regression.  They returned the
programmer to the time when the programming environment provided no
memory protection.  They introduced a flow of control with no governance
provided by the OS or the compiler.  

> Being multi-threaded is necessary to properly utilize the hardware,
> or else we're just running on a single core and letting the others go
> idle.  The real problem is about properly managing memory.  

It's not necessary in general to give up protected memory to fully
utilize the hardware.  

Rob Pike made some excellent presentations explaining the difference
between parallel and concurrent operations, and how Go uses CSP to
support concurrency.  

Go is a step in the right direction.  By bringing threads under the
control of the compiler, GoThreads give the programmer the efficiency
threads afford without relinquishing control over memory.  

> Also giving sufficient hints to the programming language so that it
> can implicitly parallelize operations.  

Afaik that's an unsolved problem.  Take qsort(3) for example.  I wrote
a recursive version that runs in parallel.  (Mine uses shared memory
and fork(2)).  What kind of a "hint" might the programmer provide to
determine how many processes to use?  

Clearly, no static choice is right.  Should qsort interrogate the
machine and decide to use, say, 1/2 the processors?  Why not all?  But
what if the machine is heavily loaded at the time qsort runs?  Should
it take the time to examine the machine state, and limit itself?  Or
should it just go ahead and let the OS deal with it?  

I decided on per-process regulation via an environment variable
representing the approximate number of processes my qsort would spawn.
If the variable is not set, the default is 2x the number of processors,
which seems to be the upper limit for performance in my limited
testing.  
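
The regulation described above might be sketched roughly as follows. This is not the actual code: the variable name PSORT_PROCS and the function name are placeholders, and _SC_NPROCESSORS_ONLN is a common extension (Linux, BSD, macOS) rather than strict POSIX:

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical: how many processes the parallel sort may spawn.
 * PSORT_PROCS is a made-up name; the real variable is not given here. */
long sort_process_limit(void)
{
    const char *setting = getenv("PSORT_PROCS");
    long cpus;

    if (setting != NULL) {
        char *end;
        long requested = strtol(setting, &end, 10);
        /* Accept only a complete, positive decimal number. */
        if (end != setting && *end == '\0' && requested > 0)
            return requested;
    }

    /* Default: twice the number of processors currently online. */
    cpus = sysconf(_SC_NPROCESSORS_ONLN);
    if (cpus < 1)
        cpus = 1;
    return 2 * cpus;
}
```

An unset or unparsable value falls back to the 2x default, so the user only has to think about the setting when the default is wrong for a loaded machine.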

So, yeah, threads are the 1990s version of 1960s computing.  We seem to
be on the cusp of recognizing the value of CSP to manage concurrency,
and of functional programming to manage parallelism.  That's far from
the majority view, though, afaict.  There's a lot more to be invented,
and done.  

--jkl








Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread James K. Lowden
On Wed, 15 Feb 2017 09:40:13 -0800
Jens Alfke  wrote:

> https://en.wikipedia.org/wiki/Communicating_sequential_processes
> 

Also search YouTube for Rob Pike's presentations on CSP in Go.  It will
help clarify your thinking about the different computing models that
"multithreading" is used for.  

--jkl


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread James K. Lowden
On Wed, 15 Feb 2017 12:34:51 +
Simon Slavin  wrote:

> Two disadvantages are that threads are indistinguishable to anything
> but the owner and don?t know how to keep out of each-other?s way.  By
> the time you?ve devised some sort of mutex/locking/blocking mechanism
> you?re usually better-off using processes.

Yup.  Jim Gettys observed that no multithreaded X server has replaced or
out-performed the original single-threaded one.  

--jkl


Re: [sqlite] Thread safety of serialized mode

2017-02-16 Thread James K. Lowden
On Wed, 15 Feb 2017 07:55:16 -0700
"Keith Medcalf"  wrote:

> Note that for several modern OSes, the OS is nothing more than a
> discontiguous saved segment (DCSS) which is mapped into *every*
> process space and that process isolation is more of a myth than a
> reality.

Are you referring to one in particular we could read about?  

--jkl



Re: [sqlite] Thread safety of serialized mode

2017-02-15 Thread Bob Friesenhahn
It turns out that I have more data on the problem.  The error message 
reported reads something like:


SQLITE_CORRUPT: database disk image is malformed database corruption 
at line 70273 of [17efb4209f]


We are using version 3.10.2.

Looking at amalgamation code I see that the error is returned from 
handleDeferredMoveto() and is based on a value returned from 
sqlite3BtreeMovetoUnpacked():


  70259 ** The cursor "p" has a pending seek operation that has not yet been
  70260 ** carried out.  Seek the cursor now.  If an error occurs, return
  70261 ** the appropriate error code.
  70262 */
  70263 static int SQLITE_NOINLINE handleDeferredMoveto(VdbeCursor *p){
  70264   int res, rc;
  70265 #ifdef SQLITE_TEST
  70266   extern int sqlite3_search_count;
  70267 #endif
  70268   assert( p->deferredMoveto );
  70269   assert( p->isTable );
  70270   assert( p->eCurType==CURTYPE_BTREE );
  70271   rc = sqlite3BtreeMovetoUnpacked(p->uc.pCursor, 0, p->movetoTarget, 0, &res);
  70272   if( rc ) return rc;
  70273   if( res!=0 ) return SQLITE_CORRUPT_BKPT;
  70274 #ifdef SQLITE_TEST
  70275   sqlite3_search_count++;
  70276 #endif
  70277   p->deferredMoveto = 0;
  70278   p->cacheStatus = CACHE_STALE;
  70279   return SQLITE_OK;
  70280 }

Ideas?

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [sqlite] Thread safety of serialized mode

2017-02-15 Thread Jens Alfke

> On Feb 15, 2017, at 3:44 AM, Cecil Westerhof  wrote:
> 
> As I said before: I did not work much with threads. Mostly for GUI
> performance. Do you (or anyone else) have any resources about those
> concurrency models?


Theory:
https://en.wikipedia.org/wiki/Actor_model 

https://en.wikipedia.org/wiki/Communicating_sequential_processes 


also, the paper The Problem With Threads is definitely required reading!
https://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf

Languages:
https://golang.org  — specifically ‘channels’, which are 
like generalized in-process streams or sockets.
https://www.rust-lang.org/  — Rust tracks memory 
ownership to enforce thread-safety at compile time.
http://www.ponylang.org  — Similar memory-safety to 
Rust, but adds garbage-collection and actors.
Other languages that support actors are Scala and Io.

You can build constructs like channels and actors on top of threads in other 
languages. I’m using actors in a C++ project right now; the C++ actor libraries 
I found were too heavyweight so I wrote my own. You do have to be careful 
(since C++ is basically one big double-edged razor blade) but it’s much easier 
than trying to work with mutexes and locks.

—Jens


Re: [sqlite] Thread safety of serialized mode

2017-02-15 Thread Jens Alfke

> On Feb 14, 2017, at 11:58 PM, Clemens Ladisch  wrote:
> 
> But "go parallel" does not necessarily imply threads.  There are many
> ways to allow code running on different CPUs(/cores) to communicate
> with each other (e.g., files, sockets, message queues, pipes, shared
> memory, etc.), and almost all of them are safer than threading because
> they do not require that _all_ of the address space and the process
> context are shared.

Yes, but they’re also _much_ more expensive, for pretty much the same reason. A 
process context switch requires updating the MMU and a bunch of kernel state. A 
thread switch just requires swapping CPU registers. (Depending on the OS there 
may be a system call involved, but that can be avoided by using ‘green’ 
threads, which are basically just a wrapper around setjmp/longjmp.) There are 
several orders of magnitude of difference in performance (though the details 
depend on the CPU and the OS.)

>  When using threads, all memory accesses are unsafe
> by default, and it is then the job of the programmer to manually add
> some form of locking to make it safe again.

This depends on the language and/or the concurrency library. In a managed 
language it’s perfectly feasible to make it impossible for two threads to 
access the same memory (Rust and Pony do this.) In unmanaged code you can’t 
make strong guarantees, but you can get the same effect as long as the 
programmer doesn’t use unsafe techniques like global variables.

I think we have a problem with terminology. The issue here isn’t threads 
themselves, but with how threads communicate. In most languages, the default is 
that you use shared memory and some locking primitives. _That’s_ the nasty evil 
part. Alternative concurrency mechanisms like channels and actors still use 
threads under the hood; they just give the programmer safer and more 
deterministic ways to communicate.
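
To make that concrete, here is a rough sketch of a channel built on top of ordinary pthreads in C. It is a toy, not a library: every name in it is made up for the example, it carries only ints, and the mutex is confined to the channel functions so the communicating threads never share anything else:

```c
#include <pthread.h>
#include <stddef.h>

#define CHANNEL_CAPACITY 8

/* Hypothetical bounded channel of ints. */
struct channel {
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
    int items[CHANNEL_CAPACITY];
    int head, count, closed;
};

void channel_init(struct channel *ch)
{
    pthread_mutex_init(&ch->lock, NULL);
    pthread_cond_init(&ch->not_empty, NULL);
    pthread_cond_init(&ch->not_full, NULL);
    ch->head = ch->count = ch->closed = 0;
}

/* Blocks while the channel is full. */
void channel_send(struct channel *ch, int value)
{
    pthread_mutex_lock(&ch->lock);
    while (ch->count == CHANNEL_CAPACITY)
        pthread_cond_wait(&ch->not_full, &ch->lock);
    ch->items[(ch->head + ch->count) % CHANNEL_CAPACITY] = value;
    ch->count++;
    pthread_cond_signal(&ch->not_empty);
    pthread_mutex_unlock(&ch->lock);
}

/* Marks the end of the stream and wakes any waiting receiver. */
void channel_close(struct channel *ch)
{
    pthread_mutex_lock(&ch->lock);
    ch->closed = 1;
    pthread_cond_broadcast(&ch->not_empty);
    pthread_mutex_unlock(&ch->lock);
}

/* Returns 1 and stores a value, or 0 once the channel is closed and empty. */
int channel_receive(struct channel *ch, int *value)
{
    int got = 0;
    pthread_mutex_lock(&ch->lock);
    while (ch->count == 0 && !ch->closed)
        pthread_cond_wait(&ch->not_empty, &ch->lock);
    if (ch->count > 0) {
        *value = ch->items[ch->head];
        ch->head = (ch->head + 1) % CHANNEL_CAPACITY;
        ch->count--;
        pthread_cond_signal(&ch->not_full);
        got = 1;
    }
    pthread_mutex_unlock(&ch->lock);
    return got;
}

static void *produce_one_to_hundred(void *arg)
{
    struct channel *ch = arg;
    for (int n = 1; n <= 100; n++)
        channel_send(ch, n);
    channel_close(ch);
    return NULL;
}

/* Producer thread sends 1..100; the calling thread receives and sums them. */
int sum_through_channel(void)
{
    struct channel ch;
    pthread_t producer;
    int value, total = 0;

    channel_init(&ch);
    if (pthread_create(&producer, NULL, produce_one_to_hundred, &ch) != 0)
        return -1;
    while (channel_receive(&ch, &value))
        total += value;
    pthread_join(producer, NULL);
    return total;
}
```

The producer and consumer are still threads, but the only thing they have in common is the channel, and all the locking lives in four small functions that can be reviewed once.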

—Jens


Re: [sqlite] Thread safety of serialized mode

2017-02-15 Thread Keith Medcalf


On Wednesday, 15 February, 2017 03:16, Cecil Westerhof  
said:

> 2017-02-15 8:58 GMT+01:00 Clemens Ladisch :
 
> > Jens Alfke wrote:
> > Threading is the most extreme method of achieving parallelism, and
> > therefore should be used only as the last resort.  (I'd compare it to
> > assembly code in this regard.)

> At the moment I am not using it much and I am certainly not an expert, but
> as I understood it one of the reasons to use threading is that it costs a
> lot less resources.

Which was very important a few years ago when Dynamic RAM cost more than 
$1000 per megabyte, and having memory that could be measured in units bigger 
than Kilobytes meant you had a whopping expensive huge computer that probably 
cost more than your car, house, and yacht all added together.  Yet it was 
computationally as advanced as my wristwatch.

There is nothing wrong with multithreading.  Using "processes" is just 
multithreading -- with training wheels, belt, suspenders, diapers, and knee 
pads -- to prevent the foolish from, well, being foolish.  If you design your 
"multithreaded" program as if each thread were a separate process but without 
all the safety gear to prevent you from hurting yourself (and defeating Natural 
Selection in the process), you will have much less issue.

Note that for several modern OSes, the OS is nothing more than a discontiguous 
saved segment (DCSS) which is mapped into *every* process space and that 
process isolation is more of a myth than a reality.






Re: [sqlite] Thread safety of serialized mode

2017-02-15 Thread Bob Friesenhahn

On Tue, 14 Feb 2017, Jens Alfke wrote:

>> If we have two threads executing sqlite3_step() on the same connection and
>> using their own prepared statement, is there any magic in sqlite3 which would
>> keep sqlite3_step() and sqlite3_column_foo() from consuming (or disrupting) the
>> results from the other thread?
>
> Not if they’re using the same statement. A statement is a stateful
> object, so using it on multiple threads is probably going to cause
> problems.

To be clear, each thread is using its own prepared statement.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [sqlite] Thread safety of serialized mode

2017-02-15 Thread Bob Friesenhahn

On Tue, 14 Feb 2017, Richard Hipp wrote:

> On 2/14/17, Bob Friesenhahn  wrote:
>> Due to memory constraints
>> (at least 1MB is consumed per connection!), only one database
>> connection is used.  Any thread may acquire and use this one database
>> connection at any time.
>
> This is yet another reason why I say "threads are evil".  For
> whatever reason, programmers today think that "goto" and pointers and
> assert() are the causes of all errors, but threads are cool and
> healthful.  Entire programming languages are invented (I'm thinking of

Threads are a powerful tool but (like guns) they must be used very
carefully.

In this particular case I think that the developer is making an
assumption that more (partial) threading helps but with serialized
access the database will still block and so perhaps it does not really
help at all.

>> If we have two threads executing sqlite3_step() on the same connection
>> and using their own prepared statement, is there any magic in sqlite3
>> which would keep sqlite3_step() and sqlite3_column_foo() from
>> consuming (or disrupting) the results from the other thread?
>
> Yes, that is supposed to work.  If you find a (reproducible) case where
> it does not, we will look into it.

Thanks for this clarification.  It is quite possible that the bug
is outside of sqlite.  The bug feels like a thread safety issue to me.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [sqlite] Thread safety of serialized mode

2017-02-15 Thread Simon Slavin

On 15 Feb 2017, at 10:16am, Cecil Westerhof  wrote:

> 2017-02-15 8:58 GMT+01:00 Clemens Ladisch :
> 
>> Threading is the most extreme method of achieving parallelism, and
>> therefore should be used only as the last resort.  (I'd compare it to
>> assembly code in this regard.)
> 
> At the moment I am not using it much and I am certainly not an expert, but
> as I understood it one of the reasons to use threading is that it costs a
> lot less resources.

Compared with processes, yes.  Threads share stuff.  Processes have their own
stuff.  Therefore threads are faster to start up and end (no resources to
allocate or release) and don’t take any kind of resource space.

Two disadvantages are that threads are indistinguishable to anything but the
owner and don’t know how to keep out of each other’s way.  By the time you’ve
devised some sort of mutex/locking/blocking mechanism you’re usually better off
using processes.

Graphics programs where you can assign one thread per pixel because each pixel
has its own colour?  Threads.
Database programs where everything has to access the same database file?
Processes.

Simon.


Re: [sqlite] Thread safety of serialized mode

2017-02-15 Thread ajm
>  Original message
> From: Richard Hipp <d...@sqlite.org>
> To:  SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
> Date:  Tue, 14 Feb 2017 20:15:49 -0500
> Subject:  Re: [sqlite] Thread safety of serialized mode
>>
> > On 2/14/17, Darren Duncan <dar...@darrenduncan.net> wrote:
>>
> > There is nothing inherently wrong with threads in principle,

> Nor is there anything wrong with goto, pointers, and assert(), in
> principle.  And yet they are despised while threads are adored, in
> spite of the fact that goto/pointer/assert() errors are orders of
> magnitude easier to understand, find, and fix.
>
> -- 
> D. Richard Hipp

It seems that the difficulty of writing good multi-threaded programs lies in the
limited ability of the human mind to think in parallel. That reminds me of the
time when, after we had learned to use the unfortunate goto, we were told that
if we used it an evil demon would come and burn our toenails, even though the
compiler translates all those elegant programs into an enormous number of jumps
in the assembled code.

I suppose that someday programming languages will perform an analogous
translation for our limited but safe sequential programs.

--
Adolfo J.M.



Re: [sqlite] Thread safety of serialized mode

2017-02-15 Thread Cecil Westerhof
2017-02-15 12:40 GMT+01:00 Darren Duncan :

> Similarly with threads, for the vast majority of people, using other
> concurrency models with supported languages is better; they will still get
> the performance benefit of using multiple CPU cores but do it much more
> safely than if you are explicitly using "threads" in code.


As I said before: I have not worked much with threads, mostly for GUI
performance. Do you (or anyone else) have any resources about those
concurrency models?

-- 
Cecil Westerhof


Re: [sqlite] Thread safety of serialized mode

2017-02-15 Thread Darren Duncan

On 2017-02-15 2:40 AM, Clemens Ladisch wrote:

> Cecil Westerhof wrote:
>
>> 2017-02-15 8:58 GMT+01:00 Clemens Ladisch :
>>
>>> Threading is the most extreme method of achieving parallelism, and
>>> therefore should be used only as the last resort.  (I'd compare it to
>>> assembly code in this regard.)
>>
>> At the moment I am not using it much and I am certainly not an expert, but
>> as I understood it one of the reasons to use threading is that it costs a
>> lot less resources.
>
> And just like with assembly code, you also have to count the time spent
> writing it, and debugging the result.


Also, it's been a long time since hand-writing assembly code was any good for
performance, unless you're a top-1% expert with a good reason.


If you want speed, write in C or something else that isn't assembly.  The odds
are like 99% that a modern C compiler will generate faster code than you could
ever write yourself in assembly, and it will be much less buggy.


Similarly with threads, for the vast majority of people, using other concurrency
models with supported languages is better; they will still get the performance
benefit of using multiple CPU cores but do it much more safely than if they were
explicitly using "threads" in code.


-- Darren Duncan



Re: [sqlite] Thread safety of serialized mode

2017-02-15 Thread Clemens Ladisch
Cecil Westerhof wrote:
> 2017-02-15 8:58 GMT+01:00 Clemens Ladisch :
>> Threading is the most extreme method of achieving parallelism, and
>> therefore should be used only as the last resort.  (I'd compare it to
>> assembly code in this regard.)
>
> At the moment I am not using it much and I am certainly not an expert, but
> as I understood it one of the reasons to use threading is that it costs a
> lot less resources.

And just like with assembly code, you also have to count the time spent
writing it, and debugging the result.


Regards,
Clemens


Re: [sqlite] Thread safety of serialized mode

2017-02-15 Thread Cecil Westerhof
2017-02-15 8:58 GMT+01:00 Clemens Ladisch :

> Threading is the most extreme method of achieving parallelism, and
> therefore should be used only as the last resort.  (I'd compare it to
> assembly code in this regard.)
>

At the moment I am not using it much and I am certainly not an expert, but
as I understood it one of the reasons to use threading is that it costs a
lot less resources.

-- 
Cecil Westerhof


Re: [sqlite] Thread safety of serialized mode

2017-02-14 Thread Dominique Devienne
On Wed, Feb 15, 2017 at 2:07 AM, Jens Alfke  wrote:

> > Entire programming languages are invented (I'm thinking of
> > Java) to make goto and pointers impossible or to make assert()
> > impossible (Go) and yet at the same time encourage people to use
> > threads.
>
> [...]. And Go has had assertions for a while now. :)
>

Really? Where? --DD

Why does Go not have assertions? https://golang.org/doc/faq#assertions


Re: [sqlite] Thread safety of serialized mode

2017-02-14 Thread Clemens Ladisch
Jens Alfke wrote:
> With clock speed having stalled, the only way to take advantage of
> modern CPUs (and GPUs!) is to go parallel.

But "go parallel" does not necessarily imply threads.  There are many
ways to allow code running on different CPUs(/cores) to communicate
with each other (e.g., files, sockets, message queues, pipes, shared
memory, etc.), and almost all of them are safer than threading because
they do not require that _all_ of the address space and the process
context are shared.  When using threads, all memory accesses are unsafe
by default, and it is then the job of the programmer to manually add
some form of locking to make it safe again.
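
As a rough illustration of the point above, here is a minimal Python sketch
of two processes cooperating through a pipe instead of shared memory.  The
helper name sum_in_child and the one-line child program are hypothetical,
chosen only for this example; the child has its own address space, so no
locking is involved.

```python
import subprocess
import sys

# Source of the child program: read integers from stdin, print their sum.
CHILD = "import sys; print(sum(int(x) for x in sys.stdin.read().split()))"

def sum_in_child(numbers):
    """Send numbers to a separate process over a pipe and read back the sum."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=" ".join(str(n) for n in numbers),  # written to the child's stdin
        capture_output=True,                      # child's stdout comes back via a pipe
        text=True,
        check=True,                               # raise if the child exits non-zero
    )
    return int(proc.stdout)
```

Nothing is shared by default here; the only data the child ever sees is what
the parent explicitly writes into the pipe.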

Threading is the most extreme method of achieving parallelism, and
therefore should be used only as the last resort.  (I'd compare it to
assembly code in this regard.)


Regards,
Clemens


Re: [sqlite] Thread safety of serialized mode

2017-02-14 Thread Jens Alfke

> On Feb 14, 2017, at 5:15 PM, Richard Hipp  wrote:
> 
> Nor is there anything wrong with goto, pointers, and assert(), in
> principle.  And yet they are despised while threads are adored, in
> spite of the fact that goto/pointer/assert() errors are orders of
> magnitude easier to understand, find, and fix.

Goto and pointers don’t enable huge speed increases the way concurrency does. 
With clock speed having stalled, the only way to take advantage of modern CPUs 
(and GPUs!) is to go parallel. Threading is also important to keep the UI 
responsive in GUI apps.

Goto is pretty much unnecessary except occasionally for error handling in C 
since it doesn’t have any proper cleanup mechanisms. I haven’t used it in years 
(and yeah, my programming history goes back to the late ‘70s, starting with 
Tiny BASIC on IMSAI 8080s, so don’t tell me to get off your lawn ;-)
Pointer arithmetic likewise except in certain really low-level grungy 
libraries. I have one such library, but the rest of my code is in C++ and 
mostly uses smart pointers.
I’ve actually never heard anyone speak negatively about assertions. What’s 
wrong with them?

—Jens


Re: [sqlite] Thread safety of serialized mode

2017-02-14 Thread Richard Hipp
On 2/14/17, Darren Duncan  wrote:
>
> There is nothing inherently wrong with threads in principle,

Nor is there anything wrong with goto, pointers, and assert(), in
principle.  And yet they are despised while threads are adored, in
spite of the fact that goto/pointer/assert() errors are orders of
magnitude easier to understand, find, and fix.

-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Thread safety of serialized mode

2017-02-14 Thread Scott Hess
On Tue, Feb 14, 2017 at 5:05 PM, Darren Duncan 
wrote:

> On 2017-02-14 4:46 PM, Richard Hipp wrote:
>
>>  This is yet another reason why I say "threads are evil".  For
>> whatever reason, programmers today think that "goto" and pointers and
>> assert() are the causes of all errors, but threads are cool and
>> healthful.  Entire programming languages are invented (I'm thinking of
>> Java) to make goto and pointers impossible or to make assert()
>> impossible (Go) and yet at the same time encourage people to use
>> threads.  It boggles the mind 
>>
>
> There is nothing inherently wrong with threads in principle, just in how
> some people implement them.  Multi-core and multi-CPU hardware is normal
> these days and is even more the future.  Being multi-threaded is necessary
> to properly utilize the hardware, or else we're just running on a single
> core and letting the others go idle.  The real problem is about properly
> managing memory.  Also giving sufficient hints to the programming language
> so that it can implicitly parallelize operations.  For example, want to
> filter or map or reduce a relation and have 2 cores, have one core evaluate
> half the tuples and another evaluate the other half, and this can be
> implicit simply by declaring the operation associative and commutative and
> free of side effects or whatever.


I'm with Dr Hipp - threads are evil.  It's not so much how they work when
everything goes well, it's that it's so challenging to align everything so
that it goes well.  My experience is that even very talented programmers
write bugs into their multi-threaded code without realizing it.  I think
what happens is that multi-threaded code often makes things much more
complicated than they look, so if you write to the limits of the complexity
you can understand, you're already over your head.

IMHO, if you're using a message-passing system which does implicit
parallelization, well, _you're_ not using threads, the implementation is
using threads on your behalf.  That I can get behind.  Unfortunately,
decent systems along those lines are like nuclear fusion, they've been just
around the corner for decades, now :-).

-scott


Re: [sqlite] Thread safety of serialized mode

2017-02-14 Thread Jens Alfke

> On Feb 14, 2017, at 4:46 PM, Richard Hipp  wrote:
> 
>  This is yet another reason why I say "threads are evil”.  

I agree, and it’s a pretty widely held opinion these days, going back at least 
to Edward Lee’s 2006 paper “The Problem With Threads”.[1] Actually the problem 
isn’t threads per se, but sharing mutable data between threads. There is a lot 
of work these days going into alternative concurrency mechanisms that are safer 
to use.

> Entire programming languages are invented (I'm thinking of
> Java) to make goto and pointers impossible or to make assert()
> impossible (Go) and yet at the same time encourage people to use
> threads.

Well, both Java and Go have pointers, including the dreaded null, just not 
pointer arithmetic or uninitialized pointers. And Go has had assertions for a 
while now. :)

One of the bad things about Java is that it made threads so easy to use that 
they became an “attractive nuisance” in the hands of inexperienced or careless 
programmers.
Go, to its credit, reintroduced the “channel” mechanism from CSP, a much safer 
way for threads to communicate. But channels are limited and somewhat slow, and 
Go still lets you mess with shared data and mutexes.
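
To make the channel idea concrete, here is a small sketch in Python rather
than Go, using queue.Queue as a stand-in for a channel.  The names
run_pipeline, DONE and channel are hypothetical.  Note that, much like a Go
channel of pointers, the queue hands over references and not copies, so the
sender must stop touching an object once it has been sent.

```python
import queue
import threading

def run_pipeline(items):
    """Producer hands items to a consumer through a bounded queue."""
    channel = queue.Queue(maxsize=4)   # bounded, so the producer can block
    DONE = object()                    # sentinel marking the end of the stream
    received = []                      # touched only by the consumer thread

    def producer():
        for item in items:
            channel.put(item)
        channel.put(DONE)

    def consumer():
        while True:
            item = channel.get()
            if item is DONE:
                break
            received.append(item * 2)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

The two threads never lock anything themselves; all synchronization lives
inside the queue.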

The really interesting work is going on in new languages like Rust and Pony 
that make it impossible to share mutable state between threads. They let you 
express the idea of moving data, so one thread can hand off state to another 
without copying.

—Jens

[1]: https://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf


Re: [sqlite] Thread safety of serialized mode

2017-02-14 Thread Darren Duncan

On 2017-02-14 4:46 PM, Richard Hipp wrote:

> This is yet another reason why I say "threads are evil".  For
> whatever reason, programmers today think that "goto" and pointers and
> assert() are the causes of all errors, but threads are cool and
> healthful.  Entire programming languages are invented (I'm thinking of
> Java) to make goto and pointers impossible or to make assert()
> impossible (Go) and yet at the same time encourage people to use
> threads.  It boggles the mind


There is nothing inherently wrong with threads in principle, just in how some
people implement them.  Multi-core and multi-CPU hardware is normal these days
and is even more the future.  Being multi-threaded is necessary to properly
utilize the hardware, or else we're just running on a single core and letting
the others go idle.  The real problem is about properly managing memory, and
also about giving sufficient hints to the programming language so that it can
implicitly parallelize operations.  For example, if you want to filter or map or
reduce a relation and have 2 cores, have one core evaluate half the tuples and
another evaluate the other half; this can be implicit simply by declaring the
operation associative and commutative and free of side effects or whatever.

-- Darren Duncan
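
The split-in-half reduction described above can be sketched as follows.  This
is only an illustration of the decomposition, written in Python with a
hypothetical helper named parallel_reduce; it assumes the operation is
associative and side-effect free, and in CPython the global interpreter lock
means the two worker threads will not actually run Python code in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator

def parallel_reduce(op, values, identity):
    """Reduce each half of values on its own worker, then combine the two results.

    Correct only if op is associative and identity is its neutral element.
    """
    mid = len(values) // 2
    halves = [values[:mid], values[mid:]]
    with ThreadPoolExecutor(max_workers=2) as pool:
        # pool.map preserves order, so partials[0] belongs to the first half.
        partials = list(pool.map(lambda half: reduce(op, half, identity), halves))
    return reduce(op, partials, identity)
```

For example, parallel_reduce(operator.add, list(range(10)), 0) adds 0..4 on
one worker and 5..9 on the other before combining the two partial sums.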




Re: [sqlite] Thread safety of serialized mode

2017-02-14 Thread Jens Alfke

> On Feb 14, 2017, at 3:51 PM, Bob Friesenhahn  
> wrote:
> 
> Due to timing constraints, it performs all read queries in one thread and 
> creates a temporary POSIX thread for each update query (this is the 
> developer's reasoning).

To me that seems kind of backwards, since SQLite supports multiple readers but 
only one writer. In other words, reads can be parallelized [if you use multiple 
connections], but it’s not possible to perform more than one write at a time. 
For example, I’m told the .NET SQLite library keeps a pool of connections for 
reads, but uses a single connection for writes.
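
A minimal sketch of that reader-pool-plus-single-writer arrangement, written
in Python purely for illustration.  The class PooledStore, the table kv and
the demo() helper are all hypothetical and are not taken from the .NET
library mentioned above.

```python
import os
import queue
import sqlite3
import tempfile
import threading

class PooledStore:
    """Several read connections, one write connection guarded by a lock."""

    def __init__(self, path, readers=2):
        self._write_lock = threading.Lock()
        self._writer = sqlite3.connect(path, check_same_thread=False)
        self._writer.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
        self._writer.commit()
        self._readers = queue.Queue()
        for _ in range(readers):
            self._readers.put(sqlite3.connect(path, check_same_thread=False))

    def put(self, key, value):
        with self._write_lock:          # only one write at a time
            self._writer.execute(
                "INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))
            self._writer.commit()

    def get(self, key):
        conn = self._readers.get()      # blocks until a reader is free
        try:
            # fetchall() runs the statement to completion before the
            # connection goes back into the pool.
            rows = conn.execute(
                "SELECT v FROM kv WHERE k = ?", (key,)).fetchall()
            return rows[0][0] if rows else None
        finally:
            self._readers.put(conn)

    def close(self):
        self._writer.close()
        while not self._readers.empty():
            self._readers.get().close()

def demo():
    """Write one row and read it back through a pooled reader."""
    with tempfile.TemporaryDirectory() as tmp:
        store = PooledStore(os.path.join(tmp, "demo.db"))
        store.put("answer", "42")
        value = store.get("answer")
        store.close()
        return value
```

Each extra reader is a full connection with its own memory cost, which is the
trade-off against the per-connection memory constraint quoted below.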

> Due to memory constraints (at least 1MB is consumed per connection!), only 
> one database connection is used.  Any thread may acquire and use this one 
> database connection at any time.

With only one connection I don’t think you get any real parallelism. The docs 
aren’t explicit, but my understanding is that in serialized mode every SQLite 
API call goes through a mutex belonging to the connection, so only one thread 
can be acting on that connection at a time.

> If we have two threads executing sqlite3_step() on the same connection and 
> using their own prepared statement, is there any magic in sqlite3 which would 
> keep sqlite3_step() and sqlite3_column_foo() from consuming (or disrupting) 
> the results from the other thread?

Not if they’re using the same statement. A statement is a stateful object, so 
using it on multiple threads is probably going to cause problems.

Other ways to get in to trouble include
* Iterating over a statement’s result set in one thread while mutating the 
database on another thread, which results in “undefined results” from the 
iteration (I just got burned by this a few weeks ago)
* Beginning and ending transactions on multiple threads, since the transaction 
is shared state of the connection.
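
The second point can be shown without any threads at all.  In this Python
sketch (the function name and the table log are hypothetical), two cursors
on one connection stand in for two threads: a ROLLBACK issued through one of
them silently discards what the other wrote, because the transaction belongs
to the connection.

```python
import sqlite3

def rollback_discards_other_statement():
    """Return the row count after cursor A rolls back cursor B's insert."""
    # isolation_level=None: Python issues no implicit BEGIN/COMMIT of its own.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE log (msg TEXT)")
    a = conn.cursor()   # stands in for "thread A"
    b = conn.cursor()   # stands in for "thread B"
    a.execute("BEGIN")
    b.execute("INSERT INTO log VALUES ('written by B')")  # joins A's transaction
    a.execute("ROLLBACK")                                 # B's row is gone too
    count = conn.execute("SELECT count(*) FROM log").fetchone()[0]
    conn.close()
    return count
```

No call here is unsafe in the memory-corruption sense; the surprise is purely
in the semantics, which is the distinction drawn in the next paragraph.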

Again, my understanding is that SQLite’s thread-safety just says that it won’t 
crash or corrupt memory or databases, if called on multiple threads. The 
semantics of the API still mean that you’re not going to get the results you 
want if you’re not careful about which thread calls what when.

(FYI, this is why I think making APIs thread-safe is a waste of time. Even if 
you do so, the higher level calling patterns still need to be synchronized 
correctly by the client code, in which case the lower level thread safety is 
largely unnecessary. And the overhead of all those mutex locks can be pretty 
high.)

—Jens


Re: [sqlite] Thread safety of serialized mode

2017-02-14 Thread Richard Hipp
On 2/14/17, Bob Friesenhahn  wrote:
> Due to memory constraints
> (at least 1MB is consumed per connection!), only one database
> connection is used.  Any thread may acquire and use this one database
> connection at any time.

 This is yet another reason why I say "threads are evil".  For
whatever reason, programmers today think that "goto" and pointers and
assert() are the causes of all errors, but threads are cool and
healthful.  Entire programming languages are invented (I'm thinking of
Java) to make goto and pointers impossible or to make assert()
impossible (Go) and yet at the same time encourage people to use
threads.  It boggles the mind 

>
> If we have two threads executing sqlite3_step() on the same connection
> and using their own prepared statement, is there any magic in sqlite3
> which would keep sqlite3_step() and sqlite3_column_foo() from
> consuming (or disrupting) the results from the other thread?

Yes, that is supposed to work.  If you find a (reproducible) case where
it does not, we will look into it.
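
For anyone who wants a starting point for such a test case, here is a small
sketch using Python's sqlite3 module as a stand-in for the C API.  It assumes
the underlying SQLite library was built in serialized mode; the function name
read_in_threads and the table t are hypothetical.  Each thread runs its own
statement (cursor) on the one shared connection and should see only its own
rows.

```python
import sqlite3
import threading

def read_in_threads():
    """Run two SELECTs concurrently on one connection, one cursor per thread."""
    # check_same_thread=False only disables Python's own guard; the actual
    # locking is done by SQLite's connection mutex in serialized mode.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE t (n INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
    conn.commit()

    results = {}

    def worker(name, sql):
        cur = conn.cursor()        # each thread owns its statement
        cur.execute(sql)
        rows = []
        for (n,) in cur:           # one step per row
            rows.append(n)
        results[name] = rows

    threads = [
        threading.Thread(target=worker, args=(
            "even", "SELECT n FROM t WHERE n % 2 = 0 ORDER BY n")),
        threading.Thread(target=worker, args=(
            "odd", "SELECT n FROM t WHERE n % 2 = 1 ORDER BY n")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    conn.close()
    return results
```

If the "even" list ever contained an odd number (or the reverse), that would
be the kind of cross-statement disruption asked about.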

>
> In this use case is sqlite3 usage "thread safe" or is behavior
> unstable due to sqlite3_step(), sqlite3_reset(), and result column
> accessors accessing/disrupting data from the result set of the other
> thread?
>


-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Thread safety of serialized mode

2017-02-14 Thread Bob Friesenhahn

On Wed, 15 Feb 2017, Simon Slavin wrote:

> On 14 Feb 2017, at 11:51pm, Bob Friesenhahn wrote:
>
>> One of our Linux programs (not written by me) is reporting errors of the form
>> "SQLITE_CORRUPT: database disk image is malformed database corruption".
>
> Is the database actually corrupt ?  Even if your other threads are
> not reporting this corruption, it may be real until you’ve checked.
> Can you use the shell tool to execute

I don't know if it is corrupt.  I added query code to the program
which reports the problem, causes a core dump, and then the whole
device reboots.  Queries written by someone else just print a message
and carry on.  We use a design in which the working database is in a
RAM disk and so after the device reboots, the problem database is
gone.

Sometimes sqlite3_step() reports the problem and sometimes
sqlite3_finalize() reports the problem.

> PRAGMA integrity_check
>
> on it and find out ?

I may be able to add code which automatically does this.

It is noteworthy that none of the other programs are encountering this
problem, and all of those programs perform SQL queries from just one
thread.  Some developers did try to do queries from multiple threads
and encountered severe problems and so they changed their design to
use just one thread.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [sqlite] Thread safety of serialized mode

2017-02-14 Thread Simon Slavin

On 14 Feb 2017, at 11:51pm, Bob Friesenhahn  
wrote:

> One of our Linux programs (not written by me) is reporting errors of the form 
> "SQLITE_CORRUPT: database disk image is malformed database corruption".

Is the database actually corrupt ?  Even if your other threads are not 
reporting this corruption, it may be real until you’ve checked.  Can you use 
the shell tool to execute

PRAGMA integrity_check

on it and find out ?
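
If the shell tool is not convenient (for example because the database lives
on a RAM disk and vanishes on reboot), the same check can be run from code.
A minimal Python sketch, with a hypothetical helper named integrity_report:

```python
import sqlite3

def integrity_report(path):
    """Return the rows produced by PRAGMA integrity_check for the database at path."""
    conn = sqlite3.connect(path)
    try:
        # A healthy database yields the single row "ok"; otherwise each row
        # describes one problem that was found.
        return [row[0] for row in conn.execute("PRAGMA integrity_check")]
    finally:
        conn.close()
```

Logging this list at the moment the SQLITE_CORRUPT error appears would show
whether the file itself is damaged.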

Simon.