Re: [sqlite] Multithreading

2004-12-25 Thread Roger Binns
Well, currently SQLite seems to meet my performance needs. Currently I 
don't think that brewing my own file format would pay off. I will 
consider this only as a last solution.
I was being somewhat facetious mainly to prove that you don't
actually want best possible performance.  You are far better off
writing whatever code is simplest, especially to test, and then
profiling the results.  If performance is not acceptable then,
you can change how things work and will have prior code to test
against.  

 http://c2.com/cgi/wiki?PrematureOptimization
The higher-level tasks will be performed by various callers in various
threads. But in the end, each tasks requires several db commands to be
executed, so the db will be the bottleneck; 
I did say that the work items should be higher level than a single
SQL command.
adding more threads and CPUs
won't help if I have to serialize all requests into a queue that is
processed by a worker thread. 
Worker thread*s*.  You will generally want one more worker thread
than you have CPUs.  However until you actually profile you won't
know where your bottle neck is.  It could be disk, memory, CPU
or other factors.
The thread switching at some point will limit scalability. 
And you know this how?  Given that operating systems and CPUs have
been tuning for thread switching for decades, actual contradictory
evidence would be useful first.
The other option - opening the db in each thread -
will lead to more memory overhead, also limiting the scalability.
The default settings use 2MB.  With a pragma you can change this
up or down.
As the page says:
 Make it work. 
 Make it right. 
 Make it fast.

Roger


[sqlite] SQLITE_IOERR and strange rollback when db is busy

2004-12-25 Thread Rolf Schaeuble
Hello,
I stumbled over some quite strange case.
Here's the easiest test case that triggers the behaviour:
One process performs very long reads from a db (multiple joins, so the 
cartesian product is *very* large, and the reader needs a while to 
complete).

Another process performs a "BEGIN TRANSACTION", then executes lots of 
"INSERT INTO ... VALUES".
At some point, this process will end up in sqlite3pager_get, when it 
tries to read some page from the database file (the main file, not a 
temp file or a journal). It detects that the page is not in the page 
cache (it ends up in the 'else' branch of if( pPg==0 )'). It runs down 
to the block of code covered by the following comment:

 /* Write the page to the database file if it is dirty.
 */
In this block, pager_write_pagelist( pPg ) returns with SQLITE_BUSY. As 
a consequence, the changes are rolled back and SQLITE_IOERR is returned.

And here seems to be the problem:
First, the database file is locked, so I don't understand why the 
SQLITE_BUSY value isn't propagated back to the caller. If SQLITE_BUSY 
would be returned, then the application could restart the command.
Seconds, sqlite3VdbeHalt decides to perform a sqlite3BtreeRollbackStmt, 
so only the last command should be rolled back. However, this is not 
what happens! In fact, all commands back to the beginning of the 
transaction are rolled back; the transaction, however is not closed. 
Doesn't this violate the default rollback behaviour (roll back last 
command, keep transaction open)? As a consequence, even if the 
application would get SQLITE_BUSY, it couldn't properly react on it.

There are other places in sqlite3pager_get where SQLITE_IOERR are 
returned; I've not checked whether these can also be triggered by the db 
being locked or if they indicate serious problem.
  
If someone is interested, I can send a test case for this problem.

This all was done with SQLite version 3.0.8.
Best regards
Rolf Schäuble


Re: [sqlite] Multithreading

2004-12-25 Thread Rolf Schaeuble
Hello Roger,
thanks for your answer.
Answers inline.
Roger Binns wrote:
itself. However, I'm afraid that this will not lead to the best 
possible performance. 

If you want the best possible performance then don't use SQLite at all
and instead make your own file format that exactly meets your needs
and tradeoffs (memory, byte ordering, CPU, concurrency etc).

Well, currently SQLite seems to meet my performance needs. Currently I 
don't think that brewing my own file format would pay off. I will 
consider this only as a last solution.


result to the issuer of the command. This approach saves memory, but 
I'm afraid the the thread switching involved in this design causes 
too much overhead.

In general the best way to structure applications such as you suggest is
to use a queue and have some threads put work on the queues and a pool
of threads that execute that work.  (The actual work items should be
higher level than a single SQL statement).
That makes it easier to split your program such as if you decide to
make the backend be a service or over the network, makes threading
issues less likely since there will be little shared data, and makes
it easy to scale (just add more worker threads on machines with lots
more memory and/or CPU).

The higher-level tasks will be performed by various callers in various
threads. But in the end, each tasks requires several db commands to be
executed, so the db will be the bottleneck; adding more threads and CPUs
won't help if I have to serialize all requests into a queue that is
processed by a worker thread. The thread switching at some point will
limit scalability. The other option - opening the db in each thread -
will lead to more memory overhead, also limiting the scalability.
Roger
Best regards
Rolf Schäuble



Re: [sqlite] php4/sqlite - sqlite_escape_string doesn't function

2004-12-25 Thread Eric Bohlman
Peter Jay Salzman wrote:
Hi all,
This piece of code kept giving error messages that looked like some of my
VALUES were getting parsed by php:
sqlite_query( $handle, "
INSERT INTO course VALUES (
sqlite_escape_string($semester),
sqlite_escape_string($course),
sqlite_escape_string($course_desc),
sqlite_escape_string($college),
sqlite_escape_string($reference)
)
");
You're expecting PHP to interpolate function calls within quoted 
strings.  It doesn't.  See the "Strings" section of Chapter 6 ("Types") 
of the PHP manual.


Re: [sqlite] Multithreading

2004-12-25 Thread Roger Binns
itself. However, I'm afraid that this will not lead to the best possible 
performance. 
If you want the best possible performance then don't use SQLite at all
and instead make your own file format that exactly meets your needs
and tradeoffs (memory, byte ordering, CPU, concurrency etc).
result to the issuer of the command. This approach saves memory, but I'm 
afraid the the thread switching involved in this design causes too much 
overhead.
In general the best way to structure applications such as you suggest is
to use a queue and have some threads put work on the queues and a pool
of threads that execute that work.  (The actual work items should be
higher level than a single SQL statement).
That makes it easier to split your program such as if you decide to
make the backend be a service or over the network, makes threading
issues less likely since there will be little shared data, and makes
it easy to scale (just add more worker threads on machines with lots
more memory and/or CPU).
Roger


[sqlite] Multithreading

2004-12-25 Thread Rolf Schaeuble
Hello All,
I'm sure this questions has already been asked thousands of times, but 
after searching the archive I still don't know the definite answer.

Is it safe to access the same instance of an sqlite3 database (the same 
pointer returned by sqlite3_open) in several threads, provided that the 
threads perform proper locking so that only one thread accesses the data 
structures at one time?
I have read that at least on Linux it's not safe, since the thread 
identifier is used in the implementation of file locking, and using the 
sqlite3* in several threads will break locking. But what about Windows? 
Windows is the only platform I care about, and if it works on Windows 
(and there are no plans for changes that make it break), it would be 
okay for me.

I have read the suggestions of letting each thread open the database by 
itself. However, I'm afraid that this will not lead to the best possible 
performance. Each sqlite3* instance will have its own page chache, 
wasting precious memory (which can have quite a bad effect, because too 
much CPU cache is wasted). So letting all threads share the same 
sqlite3* seems like a good thing performance-wise.
Another alternative would be to have on dedicated thread for interaction 
with the database. Other threads interact only with this thread by 
putting commands into a queue. The db thread one after the other 
retrieves the commands from the queue, executes them, and returns the 
result to the issuer of the command. This approach saves memory, but I'm 
afraid the the thread switching involved in this design causes too much 
overhead.

Does anyone have some real-life numbers that show how these two 
approaches scale? The usage pattern in my application is lots of 
requests with a rather small result (either inserts, or queries with 
<1000 rows in the result set).

Thanks for your help and best regards
Rolf Schäuble



Re: [sqlite] sqlite3_open() exclusive?

2004-12-25 Thread Andrew Piskorski
On Fri, Dec 24, 2004 at 07:32:07PM -0500, John Richard Moser wrote:

> I thought sqlite databases weren't supposed to be opened with two sqlite
> processes at once.  There are unimplemented locking commands due to this
> right?
> 
> I'm bouncing back and forth in my head trying to decide if I should use
> mysql or sqlite to write a package manager.  I'd like to use SQLite

This seems like a rather strange design question to be "bouncing back
and forth" on.  Do you want to use a client server or an embedded
database for your application?  If you want client server, then you
get to choose from MySQL, PostgreSQL, Oracle, etc. etc.  If you want
an embedded database, then you'd be looking at things like SQLite,
Metakit, etc.

The particular or peculiar attributes of each such piece of software
might even lead you to re-consider whether you want client server or
embedded, but simply jumping ahead to "MySQL vs. SQLite" seems decidly
innappropriate.  Perhaps you have other unmentioned constraints, but I
personally can't think of ANY case where I would conclude, "Oh, I
can't use SQLite for that, so I have to use MySQL." - nor vice versa,
either.

> because it's a single library reliant on pthreads and libc, and thus
> lightweight and potentially ok for embedded systems; worst case, I write
> a RDBMS shell around SQLite especially for the package manager.

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/


Re: [sqlite] php4/sqlite - sqlite_escape_string doesn't function

2004-12-25 Thread D. Richard Hipp
Roger Binns wrote:
sqlite_query($handle, "INSERT INTO course VALUES (?,?,?,?,?)",
array($semester, $course, $course_desc, $college, $reference))

Jolan Luff wrote:
sqlite_query($db, "INSERT INTO whints (whid, whregex, "
. "whcatid) VALUES (NULL, '" . sqlite_escape_string($val) . "', '"
. sqlite_escape_string($_REQUEST["whcatid_new"]) . "')");
Extension writers, it is worth noting how this same binding operation
is done in TCL:
  db eval {INSERT INTO course VALUES($semester, $course, $course_desc,
 $college, $reference)}
"db" is the database connection object created by the sqlite3 command,
of course.  The variables we want to bind are written directly in
the SQL and SQLite parses them as named parameters.  The eval method
of the database handle object scans the named parameters and binds
in the values of corresponding local variables.  Note how much
cleaner and eaiser to read this statement is and how it makes the
SQLite interface much closer to the underlying language.  This is,
I believe, a better way of doing binding than either the Perl or
the PHP examples given above.
This kind of binding is implemented in the "eval" method using
a simple loop like the following:
 int nVar = sqlite3_bind_parameter_count(pStmt);
 for(i=1; i<=nVar; i++){
   const char *zVar = sqlite3_bind_parameter_name(pStmt, i);
   /* Look up the TCL variable named by [1] and bind it
   ** as the i-th parameter.  sqlite3_bind_int() or
   ** sqlite3_bind_double() or sqlite3_bind_text() or
   ** sqlite3_bind_blob() might be used, depending on
   ** how the value is stored internally */
 }
Perhaps now you better understand the purpose of the "$alpha"
named parameters in the parser!  Note that you do not have to
use the $alpha named parameter syntax.  :alpha works just as
well:
  db eval {INSERT INTO course VALUES(:semester, :course, :course_desc,
 :college, :reference)}
The use of $alpha would seem to work best for Perl and PHP but
perhaps :alpha would work better in Python.  I will not venture
to choose.  But I do want to encourage extension writers to make
this kind of binding mechanism available to their users.  In
my experience, it makes programming with SQLite much easier,
faster, and less error-prone.
--
D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565