Re: [sqlite] Suppressing column (field) headers?

2004-11-21 Thread ben . carlyle

21/11/2004 04:56 PM
Please respond to sqlite-users

Subject:[sqlite] Suppressing column (field) headers?

> 1. How can I get the dll to return JUST the desired data? 
>When I send the following SQL, for example,
>   select colEntry from tblEntry where colNum = 3
>   the textbox shows 
>  colEntry
>  "The text in the column."

This is being done by your delphi wrapper. Please consult its 
documentation. The "C" interface of sqlite (for both 2.8 series and 3.0 
series) returns the column names and column values separately, and doesn't 
return any extra "quote" characters.

Clay Dowling wrote:
> You can get just the data that you want by using the prepared statements
> feature of the 3.0.x dll.  It sounds like you're using the table output
> method for getting your results.  This is fine, but it means that you 
> it ignore the first row of data, which in the table interface is the
> column labels.

This statement makes you sound like you're taking crazy pills, Clay 
(prepared statements and non-prepared statements aren't functionally 
different in sqlite)... but I'm interested to know if you have a grain of 
truth hidden in there. Are you familiar with the delphi wrapper Mswarm is 
using? Which delphi wrapper are you using, Mswarm? What does the code that 
issues your query look like?


Re: [sqlite] Client/Server Environment

2004-11-09 Thread ben . carlyle

"Clay Dowling" <[EMAIL PROTECTED]>
09/11/2004 02:39 AM
Please respond to sqlite-users

Subject:Re: [sqlite] Client/Server Environment

> Richard Boehme said:
> > Hi there. How well does SQLite do in a client/server environment where
> > the SQLite database is handling a batch of 100 inserts and the same
> > number of queries a every 0.5 seconds? The inserts would be batched 
> > a transaction. The queries would most likely not.
> I'm not trying to run down SQLite here, but don't you think that this 
> of load justifies a pretty heavy duty database?  SQLite is a disk based
> database, not a client server database.  For that kind of throughput, 
> traditional choice has been a large scale client server database, and
> there's been a pretty good reason why that's been so.

If you are talking about scaling the application to a large number of 
machines to try and balance load, the big end databases may suit you. If 
you are talking about putting a big iron database to work on a single 
server and single disk array to get higher performance than sqlite... 
forget it. My figures are a little out of date now, but I looked for a 
high performance database for my own application a couple of years back 
and found that sqlite blew the popular alternatives[1] out of the water in 
data throughput. Since that time SQLite has become faster for many 
functions while other databases continue to get fatter.

As with any performance-sensitive application you must profile the 
alternatives you are considering on your own target hardware with your own 
projected target load to get a reliable indication as to which will 
perform. I would certainly not discount sqlite based on a mailing list 
post. Try it out and see. I was pleasantly surprised, and suspect you will 
be also. Please report back, also. Your experience may be a useful data 
point for someone else in your position in future.

>100 separate
> clients trying to access the same file on disk is bound to cause a fiew
> problems.  The capacity of the database driver is probably the least of
> your worries.

100 readers does not equal 100 separate clients trying to access the same 
file on disk. It may be equal to 100 clients trying to access the same 
piece of cache memory. That's certainly something you can do more than 100 
times in half a second. The quality of your operating system will have 
some impact here, but I believe that even Microsoft can manage to do this 
properly these days. The main difference between the way sqlite handles 
performance and the way other databases do is that SQLite relies on the 
well-worn file operations of your operating system more than other 
databases. They try to implement these operations themselves. In bygone 
days this was a good thing because OS file handling was often poor. These 
days, It's a good thing. It keeps you lean and allows the operating system 
to do the heavy lifting in making you perform.

If sqlite is going to be a problem under the proposed loading, it will 
most likely be related to locking and possible delays introduced by 
clients having to sleep to get their locks in. As presented, though, I 
suspect that won't be a huge problem. It would largely depend on the size 
and time-cost of the queries being performed.

[1] Mysql, postgres, sapdb, oracle.

Re: [sqlite] strange rounding problem

2004-10-13 Thread ben . carlyle

Will Leshner <[EMAIL PROTECTED]>
14/10/2004 04:20 AM
Please respond to sqlite-users

Subject:Re: [sqlite] strange rounding problem

>The point is that the number is in the database as '358', but by the time 
we get it back from a query it has become '357.999'. We aren't  doing any 
floating point with the number. It is (apparently) happening for us in the 
SQLite engine itself. But we depend on the number being '358' when we 
query for it and because it isn't, on some machines, our app fails.

Can you come up with a minimal test case that exhibits this behaviour? 
Ideally, it would be a series of SQL statements written into the sqlite 
shell program that results in something visibly wrong. The second best 
case would be a short (10-20 line) C program that uses the sqlite API only 
to demonstrate the problem. Perhaps other users of this list could test on 
their own machines and come up with more information.


[sqlite] A quick code review, analysis of locking model, "fine-tuning" suggestions

2004-06-24 Thread ben . carlyle

I thought I'd so somewhat of a code review on the lower-level pieces of 
sqlite 3.0.1, so here goes:

v- code review v

You use a loop here to try and write all data 
synchronously to disk. I had to read it a couple of times before I 
realised it was actually performing the check on write's return correctly, 
but I am happy with it
Is there a reason why similar logic doesn't appear in 
Background: Under unix, multiple reads/writes may be 
required for "slow" devices. One read/write should alway suffice for a 
"fast" device. A disk is considered a fast device for this purpose. An nfs 
mount would be considered a slow device.

I've always been uneasy about the interaction in sqlite 
between the pager an os layer with regards to locking. It seems like 
excessive coupling to me. The os layer is essentially written to know the 
lock state of the pager, and then participate in a dance whereby the pager 
and os always approximately agree on the lock status. The sqlite3OsLock 
function is an example of this, where an attempt to grab a lock of a less 
than or equal status to the current lock is a no-op. To me this means that 
the pager is making uncessary calls and doesn't seem particularly aware of 
its own state. The os layer knows too much about what's happening ("I know 
that you didn't really mean to call this function"), and the pager doesn't 
know enough ("I'll make this call, just in case"). I think that all such 
no-op forms of os layer functions should be replaced with assertions and 
the code above made self-contained.
The concept of upgrading and downgrading locks 
transparently has also consistently bugged me, especially when I've wanted 
to use blocking locks. It makes things much harder when the aren't 
clearly-defined individual lifecycles for reader and writer locks, and 
when writers are forced to share the early lifecycle of readers.

Extraneous assert. The if condition tests this assert 

Use the #define value rather than numeric value of 

F_RDLCK should be replaced with F_WRLCK. Locking only with 
F_RDLCK has no effect.

Use the #define value rather than numeric value of 

lock.l_len should be set to 1 when it needs to be that 
value. Doing it at the top of function reduces code clarity and introduces 
uncecessary asymmetry in conditional branches.

I'm nervy about the different locks that might be held by 
an process in EXCLUSIVE_LOCK state depending on how it reached that state. 
If it went through SHARED_LOCK it has SHARED_FIRST through SHARED_FIRST + 
SHARED_SIZE write locked + PENDING_BYTE through PENDING_BYTE + 1 read 
locked. If it came through the reserved state it also has a write lock on 
RESERVED_BYTE through RESERVED_BYTE + 1. This appears to be dealt with in 
the unlock code, but it grates a little. I actually don't like the SHARED 
-> PENDING path at all, and think it should be removed for simplicity. It 
effectively creates two versions of the pending and exclusive states, 

I don't like the setting of PENDING_LOCK here. Surely the 
code would be clearer if it were set back when the pending lock was 
obtained. I understand that you still want access to the "old value" 
during the rest of the function, but couldn't you copy it?

Ok, I haven't reviewed much past this point. I was hoping to get 
in some comments on the pager itself which I haven't read, yet... but I've 
been at this for a little too long now.

v analysis of locking model v

Just to get this straight in my head, this is the current unix locking 
model (does this appear somewhere in comments?):


NO_LOCK = Nothing locked
locks of SHARED_LOCK state
PENDING_LOCK = PENDING_BYTE through PENDING_BYTE + 1 read-locked + either 
locks of SHARED_LOCK state, or locks of RESERVED_LOCK state
write-locked + locks of PENDING_LOCK state


1. Pick up locks for pending state
2. Pick up locks for shared state
3. Drop locks for pending state

1. Pick up locks for reserved state. Reserved lock is exclusive, so only 
one process can be in reserved state at any one time but concurrency with 
readers is ok.

1. Pick up locks for pending state
2. Pick up locks for exclusive state

1. Pick up locks for pending state
2. Pick up locks for exclusive state


Re: [sqlite] database corruption

2004-06-24 Thread ben . carlyle

"D. Richard Hipp" <[EMAIL PROTECTED]>
24/06/2004 06:04 AM

Subject:Re: [sqlite] database corruption

> Michael Robinette wrote:
> > ...

> You present a new and novel approach to corrupting the database, which
> is to combine a database file with a journal from a different database
> into the same directory.  We'll be thinking about what to prevent this
> attack in the 6 days that remain before we freeze the 3.0.0 database
> format.

This is actually a variant of the method of corrupting the database that 
fsync()ing the directory containing your journal each commit is designed 
to solve. An unsynched directory entry may lead to the existence after a 
power failure of an old journal file, instead of the one that relates to 
the current database state. Obviously, this variant is a solved problem 
while others are not.

The variant I'm most concerned about is actually a copy operation. User A 
says to himself "they're just files, I'll copy them onto my backup media". 
This will often appear to work, so he won't be concerned. One day he 
restores the files and "weird things" start happening.

I'm not sure there's a solution to that, other than user education or an 
operating-system-level implementation of the journalling itself that 
treats a copy operation the same as other kinds of database reads. 
Ultimately the ideal world would have sqlite journalling built into the 
kernel vfs layer. Hrrmm... I've heard that windows longhorn might 
incorporate this kind of function. Perhaps we should be pushing for its 
introduction into other operating systems. It's really very compatible 
with other file operations where you might want to do operations that 
ensure readers always see a consistent state of the data. It might also 
make sqlite just a little touch lighter and more focused.

However, although file data may appear to be consistent from the kernel 
point of view, since there is no API exported to the userspace to control 
transactions, we may end-up in a situation where the application makes 2 
write requests (as part of one logical transaction) but only one of these 
gets journaled before the system crashes. From the application point of 
view, we may then end up with inconsistent data in the file. 
Such issues should be addressed with the upcoming ReiserFS v.4 release. 
Such an API will be exported to userspace and all programs that need 
transactions will be able to use it.;>
[...] one other thing that I want to do is to actually export the nested 
transaction API into userspace. You have to be very, very careful about 
that because it's not possible to guarantee proper database semantics. You 
can't have unbounded, large transactions. You have to have some way in 
which the user application can get in advance some idea of how many disk 
blocks it's going to need to modify for the operation, because it's going 
to call various things like that which are not entirely straight forward; 
it's not quite as simple as people would hope. But it's sufficiently 
useful that that will be exported to userspace at some point.;>
A user-visible transaction API is something entirely different.  No way
does it belong in the kernel.;>

There will be a new API to support userspace-controlled 
multifile transactions. 
At first stab, multifile transactions will be used internally to 
implement extended attributes. 
Now, another question is.. will the transaction API support commit() and 
rollback()? *grin*;>

>From time to time, people ask for a version of the transaction API 
exported to user space. The ReiserFS journal layer was designed to support 
finite operations that usually complete very quickly, and it would not be 
a good fit for a general transaction subsystem. It might be a good idea to 
provide atomic writes to user space, however, and give them more control 
over grouping operations together. That way an application could request 
for a 64K file to be created in a certain directory and treat it like an 
atomic operation. Very little planning has happened in this area thus far. 

 A full transaction API will probably never be exported by 
the kernel itself, however some basic hooks may eventually be provided if 
enough people can agree on what those hooks should be. Most of the work 
would be performed in user-space.

* The breif period sqlite now has an inconsistent state in the main 
database made the copy scenareo less likely to be a problem, but the 
problem may occur occasionally.
* In an embedded scenareo you 

Re: [sqlite] RE: SQLite performance with mid-size databases

2004-06-17 Thread ben . carlyle
"Ismail Kizir" <[EMAIL PROTECTED]>
17/06/2004 06:48 PM

Subject:Re: [sqlite] RE: SQLite performance with mid-size databases

> I had the same problem as Richard's.
> Unfortunately, i have decided to use postgres and finally back to mysql.

If you're happy with your current solution, feel free to ignore the rest 
of this email. All one can hope for is to be happy with one's present 
solution :)

> I have a table of ~300,000 rows of newspaper articles, and its fulltext
> index tables.

Sqlite doesn't support fulltext indexes. I suggest explaining what you 
were trying to achieve, and how.

> Table opening time was very high.

There's no such thing as table opening time in sqlite :) What was being 
done to sqlite during "table open time"?

> Concurrency -even only for reading- was very very poor.

What was the nature of your concurrent access? Were any writers active? 
Did readers use transactions?

> Especially, i observed the linear write time just as Richard did.

I'm not sure what you're saying here. A single insert took O(n) time where 
n was the size of the database, or O(n) time where n is the size of the 
row data? Was it really O(n), or was it O(log(n)) as modern databases such 
as sqlite will normally guarantee? How much data was in the row you were 
inserting? How did you measure the linear growth (is this in the archives 

> I think SQLite is still a good solution for embedded system, but not 
> for mid-size databases.

I'm using sqlite in heavily-loaded system consisting of database files 
that are created over a two-day period. The average database file is about 
800 meg. After extensive testing early in the piece I concluded that only 
sqlite was suitable. All alternative technoligies I tested were far too 
slow for my (now aging) sun hardware. I tested sqlite, postgres, mysql, 
and sapdb. Anecdotally, I'd have to say that SQLite is ideal for targeted 
mid-size databases.


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] SQLite performance with mid-size databases

2004-06-17 Thread ben . carlyle

So far on this list I have only seen a reports of scalability problems 
with sqlite when the real problems were specific queries being given to 
sqlite that were constructed in a way that sqlite does not process well. 
In every such case I can recall an alternative form of the query was able 
to be produced (usually by DRH personally) that did not exhibit the 
problem. Knowing the code, there is no reason for sqlite to be scaling 
linearly in queries unless it is being forced to do table scans.

It is true that sqlite doesn't optimise queries as well as major 
databases. It's not designed to. Users must take some care in constructing 
their queries and ensuring the queries are suited to sqlite's design if 
performance might be a problem. At the same time, such queries usually 
perform much better than those of major databases because of the vastly 
reduced optimisation and connection overhead that sqlite affords.

As with every database technology. If you care about performance you have 
to understand some things about the design of your underlying technology. 
For sqlite the design is simple and the experts are extremely responsive. 
If you're having problems and can provide a clear, specifc description of 
your problem you will get help. While queries such as "I think sqlite 
scales linearly, but I can't tell you want queries I'm issuing to make 
that happen" and "My scroll list seems slow, but I don't know what is 
happening between the GUI and the database or what queries are going on" 
are unlikely to solicit helpful response, "I am issuing this query on that 
database schema with about 100 thousand rows each carrying 2k of data" is 
likely to be something members of this list can help you solve.

I'd like to take this soapbox opportunity to again thank drh and his 
associates for the wonderful work they put into sqlite and their genine 
personal commitment to this project. Sqlite is a great product, and a 
well-targeted one.


Raymond Irving <[EMAIL PROTECTED]>
18/06/2004 12:09 AM

Subject:Re: [sqlite] SQLite performance with mid-size databases

Well this does not sound good at all. I would think
that SQLite would at least do a better job at queries.

The issues with performance and scalability is of
great concern. I was planning on create some new apps
that use SQLite, but now I'm wondering is this little
database is capable of handle over 2 Gigabytes (GB) of
data even though the docs says it can handle up to 2
Terabytes (TB).

Does it really make sense to cut back on performance
in order to keep the library size small?

Will SQLite 3.0 fix these problems?

Raymond Irving

--- "D. Richard Hipp" <[EMAIL PROTECTED]> wrote:
> Richard Kuo wrote:
> > 
> > I suspect some unnecessary disk access has to be
> the problem...despite
> > the small amount of new guide information being
> queried out, disk bytes
> > read is several times higher than with MS access
> and scrolling back over
> > previously accessed areas of data is visibly
> faster...indicating that
> > the disk cache is very favorably impacting the
> speed of the queries.
> > 
> If each of your rows contains 2K of data, that means
> each database entry
> is using about 2 overflow pages.  You can change
> this by increasing the
> page size.  Try recompiling SQLite after changing
> macro to 16384.
> You might also trying switching to SQLite version
> 3.0.0 which will be
> released tomorrow.
> Also tomorrow, I will be making available a database
> analysis tool
> for version 2.8 databases that will help us to
> better understand
> how information is stored on disk for your database,
> and possibly
> provide some clues about why you are having
> problems.

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] Locking and concurrency in SQLite version 3.0

2004-06-16 Thread ben . carlyle

"D. Richard Hipp" <[EMAIL PROTECTED]>
12/06/2004 08:16 AM

Subject:[sqlite] Locking and concurrency in SQLite version 3.0


My thoughts are listed as they come to me.

Thought 1:

Section 5.0, entitled "Writing to a database file":
After the in-memory cache initially spills to disk the exclusive lock must 
be maintained because the database file is changed. One way to avoid this 
happening might be to change the database file and log structure as 
1) Add a numeric entry to each page in the database file that refers to a 
specific page (or file offset) into the log file.
2) Add an entry to each log file entry indicating a 0 or 1.

If a page is read from the database and has a non-zero file offset, that 
page refers to the roll-forward log that superceeds it. A zero in the log 
file entry indicates it is a rollback entry, while a 1 indicates it is a 
roll-forward entry.

The algorithms described would change in the following ways:
1) Instead of writing the dirty page to the main file when memory 
spilliage occurs, write it to the journal. If the main file entry already 
has a file offset encoded into it, write the page to that offset. If the 
main file entry has no offset, write it at the end of the journal file and 
overwrite only the offset of the main page.
2) Readers with shared locks should always overlook any such offsets it 
finds in main files. Readers with any of the writer locks should refer to 
the journal for the updated version of such pages.
3) When rolling back a journal file, only rollback pages with a 0 entry in 
the rollback/roll-forward field.
4) When committing a transaction write all pages from memory, but also 
commit any pages in the journal with a 1 in the rollback/roll-forward 
5) You might have to rethink any vacuum operation and some other small 
aspects of life. By using the main file as an index into the roll-forward 
log you make truncating the database file more difficult.

One extra alternative to throw in is to keep the roll-back and 
roll-forward journals in separate files. That would avoid the need to 
identify the individual log entries as roll-back or roll-forward and may 
improve performance of large changes. The roll-forward file would never 
have to be committed.

This approach differs slightly from previous suggestions of the shadow 
pager or of creating tree structures in the journal file. It does not 
completely virtualise the pager level, although the concept is similar. It 
requires only trivial extra structure in the journal file since it uses 
the real main file as an index into the roll-forward section of the 
journal. If this kind of scheme were to be implimented in the future the 
groundwork in file format changes could be laid now in a 
forward-compatable way by allocating the necessary spaces and always 
ensuring they had a zero value.

Thought 2:

I'm a little concerned about when SQLITE_BUSY can be returned. In section 
7.0, entitled "Transaction Control At The SQL Level" a mention is made of 
locks not be acquired with the BEGIN statement. Personally I don't like to 
see SQLITE_BUSY at all. I currently modify my sqlite version to use 
blocking locks in restricted ways to avoid getting the message and ensure 
optimum fairness. If they do occur, I would prefer they happen at 
well-defined and designated places. Hmmm... I guess I can't think of any 
cases where this is really an issue, though.

I would like to see blocking locks supported by SQLITE. If that's not 
possible it's ok, but my preference is that the capability should exist. 
Currently sqlite provides an API to execute a function when SQLITE_BUSY 
would be returned. That's ok, but doesn't suit blocking locks well for two 
reasons: 1) The locking semantics of sqlite use operating system locks in 
specific ways that would be unwise to mess with in a callback function. 2) 
I don't belive there is an API to register a corresponding unlock function 
to the sqlite_busy_callback, so whatever locks might be put in place can't 
be unmade at appropriate times. Perhaps the API should be changed to 
support replacement of the various os.c lock functions for each of the 
specific lock types in the new sqlite locking model.

As a matter of interest, the current sqlite isn't far off being able to 
work with blocking lock in place of its existing non-blocking locks. The 
main prohibition that needs to be imposed is that shared locks cannot be 
upgraded to exclusive locks. The current sqlite can be "tuned" to ensure 
exclusive locks are obtained early to prevent blocking locks from 
deadlocking. I haven't seen the new sqlite3 code and haven't seen detail 
of how the various locking mode transitions will be implimented in a posix 
environment to know whether extra problems will be introduced in this 

By my reading the allowable transitions will be these:
Unlocked -> Shared 

[sqlite] create trigger before commit...

2004-06-15 Thread ben . carlyle

I seem to recall (but cannot find) an email on this mailing list sent by 
drh quite sometime ago regarding triggers before commits. As I recall, he 
was battling with some kind of problem for which a trigger that executed 
immediately before the commit was executed seemed to be the best solution.

I'm currently working on/planning/thinking about an accounting system. The 
system essentially consists of a transaction table, an account table, and 
a transaction entry table as follows:
CREATE TABLE AccountTable (AccountId INTEGER PRIMARY KEY, name);
CREATE TABLE TransactionTable (TransactionId INTEGER PRIMARY KEY, date, 
CREATE TABLE TransactionEntryTable (TransactionId, AccountId, amount);

I'm wanting to put some foreign key constraints in, and I think I can do 
that with triggers, but the tricky constraint of the data model is that 
sum(TransactionEntryTable.amount) must always equal zero for each 
TransactionId (this is double-entry bookkeeping).

My thinking is that I can check this with something approximating the 
following pseudo-triggers and table:
CREATE TABLE DirtyTransactions (TransactionId UNIQUE); 
INSERT OR REPLACE INTO DirtyTransactions VALUES(old.TranactionId)
INSERT OR REPLACE INTO DirtyTransactions VALUES(new.TranactionId)
DirtyTransactions LEFT INNER JOIN TransactionEntryTable WHERE s<>0 GROUP 
BY TransactionId);
ERASE FROM DirtyTransactions;

So, did the BEFORE COMMIT get implimented? If so, is it stable/usable? I 
don't see it on the sqlite language page. If not, can anyone think of an 
alternative way to do this that doesn't require a table-scan of 
TransactionEntryTable each time a new transaction is added?


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] getpid() and linux threads in sqlite

2004-06-03 Thread ben . carlyle

"Mrs. Brisby" <[EMAIL PROTECTED]>
03/06/2004 01:37 PM

Subject:Re: [sqlite] getpid() and linux threads in sqlite

> On Wed, 2004-06-02 at 01:53, Daniel K wrote:
> > The getpid() call on my Redhat9 linux box is causing
> > some trouble for SQLite. It's reasonably complicated,
> > but SQLite currently assumes that on linux the
> > getpid() function returns a different value in 
> > different threads.
> Anyway, it shouldn't cause any harm. SQLite only uses the pid to help
> work around problems with fcntl-locking (esp. when using NFS) and to
> seed the random number generator.

I believe the software Mr K is working on is sqlite version 3, and that it 
is file locking issues he's looking into.


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] Sqlite Secure backup

2004-06-03 Thread ben . carlyle
03/06/2004 09:26 AM

Subject:[sqlite] Sqlite Secure backup

> I am in the process of writing a PEAR package to handle Sqlite backups 
> (both local backups and remote backups)  and I am looking for some 
> suggestions.
> Is the right way to proceed the db dump ? I assume that copying the db 
> will result in a corrupted file if the db is used at the time of 
> backup.

Using .dump in the sqlite command-line utility is always going to be the 
simplest, lowest risk solution. I recommend it, especially for small 
databases. Using .dump, you don't have to worry about other readers and 
writers. You don't have to worry about stray -journal files. All of this 
is taken care of by the engine itself. You end up with a sequence of sql 
commands that are suitable for rebuilding the sqlite database. This list 
has a number of advantages over the basic file format:
1) Although sqlite database changes have in the past been forward 
compatible, it is more likely that the .dump format of an ancient version 
will be restorable on modern sqlite versions than that the old database 
files is restorable.
2) .dump output is more likely to be backwards compatible, in case you 
decide you have to downgrade your sqlite version
3) The dump file is human readible and is more likely to be able to be 
applied to other database software if the need arises
4) The .dump output is more compressable than the original file format. In 
gzip form it will usually take up less space than the gzip of the original 
database file.

The only significant downside to the .dump format in my experience is that 
it takes longer to restore from your media than a straight ungzip would 
take. this is particularly true if you have large tables with indices. On 
some (old) hardware I use a 400meg file takes only a few seconds to 
restore from gzip on backup media, but the equivalent .dump format takes 
in the order of half an hour to rebuild its tables and indices.

If you have large tables and restoration time is critial, the original 
sqlite files are probably the best thing to backup. If you're more 
interested in the security of your data, I recommend backing up the .dump 
output instead.


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] Re: sqlite-users Digest 22 May 2004 05:23:11 -0000 Issue 115

2004-05-23 Thread ben . carlyle

Ulrik Petersen <[EMAIL PROTECTED]>
23/05/2004 07:31 AM

Subject:[sqlite] Re: sqlite-users Digest 22 May 2004 05:23:11 - 
Issue 115

> Read especially section 3.2.  The EULA is here:
> IANAL, but the way I understand it, you can't link against their 
> libraries and still distribute your code under an Open Source license, 
> or distribute your binaries under a license that requires that the 
> software be offered at no charge.  My understanding may be flawed, so 
> read the EULA yourself before deciding whether the toolchain is for you.

IANALE, but my reading is a little different. I think that all its saying 
is that you can't release your software under a license that forces the 
-microsoft- libraries to be open source. According to, that 
shouldn't be a problem:

Anyway, better solutions have already been offered so I'll say no more on 
this thread.


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] command history

2004-05-03 Thread ben . carlyle

Puneet Kishor <[EMAIL PROTECTED]>
03/05/2004 05:01 AM

Subject:[sqlite] command history

> Is there a way to enable some kind of command history in the SQLite 
> shell? You know, press the up arrow to get the previous command... I 
> think it is called READLINE support, no?

If you're still having trouble, perhaps you could try starting sqlite in a 
different way. Instead of this:

$ sqlite my.db
> SELECT ...;
> CREATE ...;


$ sqlite my.db 'SELECT ...;'
$ sqlite my.db 'CREATE ...;'

This method doesn't allow you to run transactions across multiple lines, 
but does give the advantage of immediately conforming to the way your 
shell does its command history :) I like to do things this way with ksh 
and vi editing keys, myself.


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

[sqlite] Effectiveness of PRAGMA integrity_check;

2004-04-13 Thread ben . carlyle

I'm trying to write some defensive code that is able to recover from 
database corruption. The idea is that if a disk fails and a database 
becomes corrupt it can be detected and synchronised from a backup copy.

To this end, I've just been trying to write a function that returns true 
only when it is sure the database is ok. I use PRAGMA integrity check; and 
compare the returned string with "ok". When I tried this with a few random 
database changes, though, I had a hard time trying to get the corruption 
to trigger. I did the following:

CREATE TABLE foo(bar);

I then went in with a text editor and started modifying the bar record. I 
changed "bar" to "car", but the change was not detected. I started 
modifying characters to the left and right of the "car" string, but still 
no corruption. I was able to get corruption to be detected when I 
truncated the file.

Can I take it from this behaviour that there isn't any checksum checking 
going on apart from headers and the BTrees themselves? Will the 
integrity_check at least guarantee me that I won't at some later stage get 


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

[sqlite] Clustered indicies Was: [sqlite] A proposal for SQLite version 3.0

2004-04-07 Thread ben . carlyle

Jeff Pleimling <[EMAIL PROTECTED]>
08/04/2004 12:42 PM

Subject:Re: [sqlite] A proposal for SQLite version 3.0

At 12:08 PM 4/8/2004 +1000, [EMAIL PROTECTED] wrote:
> I believe you're thinking of a 'clustered index'. This puts the data
> into the order of the index. There can be, of course, only one clustered
> index per table.
> Since at least some of the data in the table is moved around on every 
> regular indexs need to take this into account (usually by indirection, 
> then modifying all of the indexes with each insert).

Ahh, I didn't think of that. I don't have any other indices on my table, 
so this wouldn't be a problem for me... but I can see now how it would 
harm the general case. I guess the simplest implementation would have a 
proabition on having -any- other indicies on the table.

> >If a table could be ordered according to an index, rather than having 
> >external index, I think it would significantly improve the time and 
> >performance of my databases. I don't know whether my experience would 
> >shared by other users. It it were something that could go into 3.0 it
> >would at least do me some good.
> Clustered indexes can really slow the performance for OLTP (On-Line
> Transaction Processing) and other systems where data is added/deleted in 
> mixed fashion. Every time a record is inserted, data is possibly moved 
> the disk (with page splits causing even more slowdowns).

Yes, that's what's happening already in the index whenever I do an insert. 
My thinking was that maintaining an index-ordered table would be less work 
overall than maintaining a table with an ordered index. I could be wrong 
on that, but I'm not sure I see the flaw in my logic.

> If your system is entirely historic data, that would be great - but if 
> system is inserting meter readings in (near) real-time, you'd probably
> get a big performance hit.

It's certainly real-time, with the occasional query.

> There are many pros and cons. A google search turns up articles (usually 
> MS SQL Server) on both side - some people saying 'always' and some 

I can certainly see how the usefulness of this feature could be limited.

I guess the problem is primarily related to how the rowid is chosen. If I 
could choose a rowid that would put the table in the correct order, and 
renumber as required I might make some ground. I could order by ROWID, 
then... although there may have to be some kind of code tweakage to allow 
the where clauses to operate correctly.

Hmm... maybe something like this:
-- if the insertion point is after current data:
-- else if insertion point is between two values
INSERT INTO mytable VALUES (( + )/2, )
-- else rearrange contiguous values

Perhaps the changes to sqlite could be as minimal as providing a facility 
to say:
"I promise to keep these rows in an order consistent with this index"
so sqlite will use the index in queries.


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] A proposal for SQLite version 3.0

2004-04-07 Thread ben . carlyle
Peoples :)

"D. Richard Hipp" <[EMAIL PROTECTED]>
07/04/2004 11:22 PM

Subject:[sqlite] A proposal for SQLite version 3.0

> A design proposal for SQLite version 3.0 can be found at:
> Feedback from the user community is strongly encouraged.

Since this is a fairly rare opportunity to make incompatible suggestions 
that could make it into code, I thought I'd put this left-field one out 

My main use of sqlite is in a database that stores vast quantities of 
historical data (individual changes on meter readings, that sort of 
thing). The main table contains data for multiple instruments that each 
provide their data in time order with their own updates, but out of order 
with the updates of others:

Instrument1,2am,value is 0
Instrument2,1:59am,value is 3
Instrument1,3am,value is 1
Instrument2,3:01am,value is 4

This table is constructed in the order that data comes in, but the queries 
I want to do are quite different:

SELECT * FROM thetable WHERE instrument="Instrument1" AND time >= 2am and 
time <= 3am ORDER BY TIME;

Well, I have an index to make this work efficiently (the exact query and 
index details probably aren't that important here). The problem is that 
the index contributes significantly to the size of the database and the 
fact that I keep the original ordering around but unused seems like a 
waste of effort. Queries are also slower than they would be if they were 
following an integer primary key along a real table, with the extra 
O(log(n)) lookup for each result entry.

A little while ago a list reader suggested a kind of index (from ms 
access, if I recall... I don't recall the term they used) that is not 
external. Instead the index changes the order in which the table itself is 
organised. You suggested at the time that if explicit ordering were 
required the user could use a CREATE TABLE AS SELECT statement to get 
things in that order, but that sqlite would have to table-scan as it would 
have no saved knowledge of the table ordering. Moreover the table would 
not remain ordered as new elements were ordered.

If a table could be ordered according to an index, rather than having an 
external index, I think it would significantly improve the time and space 
performance of my databases. I don't know whether my experience would be 
shared by other users. It it were something that could go into 3.0 it 
would at least do me some good.

The other thing that I think will help me most is the native storage of 
numeric formats. It's great to see this happening :)

On the wchar_t subject, I've just looked up, which says wchar_t 
is a long under 32-bit solaris and int under 64-bit solaris. Both numbers 
are 32-bits long. According to, a 
32-bit representation is the only unicode that is actually fixed-width. 
Both UTF-8 and UTF-16 are multi-byte, rather than wide characters. This 
page also lists various known unicode encodings, so may be of some value.


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] Concurrency Proposal

2004-04-01 Thread ben . carlyle

"D. Richard Hipp" <[EMAIL PROTECTED]>
31/03/2004 01:21 PM

Subject:Re: [sqlite] Concurrency Proposal

> I think the gist of Ben's proposal is as follows
> (please correct me if I am wrong):
> Writes do not modify the main database file until
> they are ready to commit - meaning that reader can
> continue to read from the file during lengthy
> transactions.  The journal contains modified pages,
> not the original pages, and is thus a roll-forward
> journal rather than a roll-back journal.

I think it's worth my posting a suggestion from a co-worker of mine who 
may be known to some if only by his surname :) He posed the obvious 
question as to why the transaction is so long in the first place. My 
personal answer was that I have a transaction I keep open for a second at 
a time, in case more change come through. That way I get maximum 
throughput while retaining the consistency guarantee that journaling 

His alternative proposal for my situation is simple: Buffer the changes, 
instead of holding a transaction open. This is something my code could do 
fairly easily and I'm a bit disappointed I didn't think of it :) If I ever 
get around to changing the code, I'll have it keep a fixed-size buffer of 
changes. Whenever the buffer fills, or one second passes since the first 
buffer entry was inserted, I'll flush the buffer with a transaction.

Maybe long write transactions concurrent with readers is a requirement for 
sqlite, but I'm not so sure it's my requirement anymore. Perhaps this 
simple suggestion will make it a requirement for fewer current sqlite 
mailing list users, too ;)


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] Concurrency Proposal

2004-03-30 Thread ben . carlyle
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 31/03/2004 02:38 PM -

Ben Carlyle
31/03/2004 02:34 PM

To: "D. Richard Hipp" <[EMAIL PROTECTED]>@CORP
Subject:Re: [sqlite] Concurrency Proposal


"D. Richard Hipp" <[EMAIL PROTECTED]>
31/03/2004 01:21 PM

Subject:Re: [sqlite] Concurrency Proposal

> I think the gist of Ben's proposal is as follows
> (please correct me if I am wrong):
> Writes do not modify the main database file until
> they are ready to commit - meaning that reader can
> continue to read from the file during lengthy
> transactions.  The journal contains modified pages,
> not the original pages, and is thus a roll-forward
> journal rather than a roll-back journal.

That's pretty accurate. If the modified pages could all be kept in memory 
the pager could contain either the new or the old pages. It wouldn't 
really matter, but if pages need to be flushed from memory they would have 
to be put into the journal.

> The main difficulty in this approach is locating
> a particular page in the journal file when it needs
> to be reloaded.  For example, suppose page X is read
> from the main database, modified, then written into
> the journal.  Due to cache pressure, that page is then
> flushed from the in-memory cache.  Later on, we need
> to read page X again.  How do we locate the position
> in the journal where the current data for page X was
> written so that we can reread it?

I'm not surprised this has come up as a problem. I wasn't sure whether 
sqlite flushed changed pages back to the original database file. Since the 
whole algorithm hinges off the main file being changed any overflow would 
have to be written to the journal.

Hmm... an on-disk hash or maybe ordered pages?

If this were to proceed, ordered pages might be the simplest approach but 
wouldn't perform well. The algorithm would simply be to use binary 
searches over the journal file's known-sized blocks to find the pages. 
This would also have to be done to find an insert point for new pages and 
then everything afterwards would have to be shifted to accommodate. Hashes 
would be able to place things explicitly, but you'd have empty pages (of 
undefined content) throughout the journal that you'd have to explicitly 
write "this is not a page" headers into.


Obviously the poor performance of ordered pages would only become a factor 
if your in-memory cache were exceeded so it could be optimised... but 
there comes a point where you have a few gig of database and a 
corresponding couple-o gig of index that all need to be updated every time 
you do an insert. Large databases lead to large changes, at least every 
once in a while.

Doug's shadow pager idea looks like it would effectively be an optimised 
form of the ordered pages concept[1]. If the journal had an on-disk linked 
list of ordered page numbers and their in-journal indicies searching and 
data modification would occur mostly in-memory, and would could be done 
with constant memory-usage:

Pager file :-

A BTree with the corresponding navel-gazing might be better, but consider 
the following search algorithm:
Read first index block
While the greatest page in this block is less than the target page, skip 
to the next index block.
Do binary search in memory

Anyway :) It's just some ideas that fell out of my head. It may or may not 
be practical.

[1] I haven't read the proposal in detail, so I apologise if I'm making 

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

[sqlite] Concurrency Proposal

2004-03-30 Thread ben . carlyle

I've been mulling over this for a little while, but only got around to 
writing anything down last night. I may have made some errors, perhaps 
serious ones. Either way, I thought I'd just put it out there. This email 
contains an algorithm for permitting concurrent reader and writer access 
to an sqlite database. The algorithm has the following characteristics:
- File-based locking (rather than page or row-level locking) is retained
- Semantics of current operations remains the same, as far as I can tell
- A change to the sqlite file format would be required
- The solution is (probably) not backwards-compatible
- Multiple readers can access the database at the same time
- A single writer can prepare changes to the database while readers are 
still reading
- The writer locks out all readers and other writers while it performs a 
commit operation
- Blocking locks would be supported (and may be required in some places)
- Lock periods are short, even when transactions are held open for a long 
time. This may be blocking locks are more appropriate than they were 
previously. A process that holds a database open for a second at a time to 
improve throughput will only block readers on the second boundary, 
although it will still block other writers for the full second duration.
- Inconsistency only occurs if a writer fails to apply its transaction to 
the main at commit time, not if it fails while the transaction is still 
- More synching may be proposed than in the current algorithm, but I'm 
pretty picky about synchs
- The journal file is used for writer locking contention, so can't be 
erased at the end of each transaction. It can be erased safely only when 
there are no writers currently waiting.
- The algorithm doesn't allow modification of main file (apart from 
journal-based recovery) until the commit phase of the writer algorithm.

Reader algorithm:
open main file
read-lock main file
Check for inconsistency marker in main file
If inconsistent
release main file read-lock[1a]
open journal file
write-lock journal file
write-lock main file[1b]
perform journal playback routine[2]
sync main file
clear inconsistency marker in main file
sync main file
truncate journal file
release journal file write-lock
downgrade main file write-lock to read-lock
perform queries
flush block cache
release main file read-lock

[1] Don't upgrade read lock to write lock, otherwise deadlock may occur 
for blocking lock. Never attempt to lock the journal file while you have a 
lock on the main file.
[2] May be a no-op as a process who got the write-lock first may have 
already performed the playback
[3] Consistency is now guaranteed, and a valid read-lock exists on the 
main file

Writer algorithm[1]:
open journal file
write-lock journal file
write-lock main file[2a]
Check for inconsistency marker in main file
If inconsistent
perform journal playback routine
sync main file
clear inconsistency marker in main file
sync main file
downgrade main file write-lock to read-lock[2b]
truncate journal
perform updates in-memory and in-journal only. Do not modify main file.
sync journal file
upgrade main file read-lock to write-lock
write inconsistency marker into main file[4]
sync main file
write changes into main file
sync main file
clear inconsistency marker for main file
sync main file[5]
truncate journal file[6]

[1] To avoid deadlock, a reader should never attempt to "become" a writer. 
The reader must first drop its read lock (with associated cache flush), 
and then begin the Writer algorithm steps.
[2] Write-lock main instead of read-lock as a precaution. If we read-lock 
it, readers could open the database concurrently, see the same 
inconsistency as we do, and get caught behind our transaction as writers 
trying to resolve it. With a write-lock up-front until we're sure no 
inconsistency exists we get to avoid this degenerate case. This ends up 
meaning a writer briefly locks the main file twice: once at the beginning 
of the transaction, and once at the end.
[3] At this point the writer has a write lock on the journal, a read lock 
on the database, and the journal is empty. No inconsistency exists in the 
[4] The use of an inconsistency marker would be the main incompatability 
between proposed and existing sqlite implementations. I see this most 
probably as an inconspicuous boolean that could sneak into an sqlite 
database header.
[5] Why so many syncs? Well, to be safe. We want to know all the data we 
might roll back in to the main is written to disk before we record the 
fact that that roll-back has to occur. We want to know that the 
inconsistency is noted before we start modifying the database, otherwise 
the modifications could be written but not the inconsistency record. We 
want to know the the data has been fully updated before we clear the 

Re: [sqlite] Search results

2004-03-25 Thread ben . carlyle

"Greg Obleshchuk" <[EMAIL PROTECTED]>
26/03/2004 03:29 PM

Subject:[sqlite] Search results

> In the database I will have three columns where I will be searching for 
matches , my question is what type of method do people/systems use to rate 
a match in the database?

It's difficult to rate exact matches, because all search results returned 
exactly what was requested. As such, the only way to find a match that is 
better than other matches is to have the user refine their search. Rated 
search results are more useful when matches are inexact. A word that is 
spelled nearly the same as the one you searched for. A result that only 
contained four out of your five queried words. They can be rated with 
respect to each other because they match the query to a better or worse 
extent. For exact matches you really have to come up with another method 
of deciding how relevant the results were. Google uses information on how 
popular a web-page is to determine which pages are most likely to be 
relevant when users search. If you could get feedback on how useful 
particular search results were, you could make sure the best results were 
returned for later searches. If you could predict how useful results are 
likely to be when you enter the records, you could provide a baseline 
ranking to start from.

Anyway, tricky subject :) No quick answers ;)


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] Text(3)

2004-03-24 Thread ben . carlyle
"Anabell Chan" <[EMAIL PROTECTED]>
24/03/2004 05:23 PM
Please respond to anabell

Subject:[sqlite] Text(3)

> Firstly, how do I define a string table field with constraints on its 
> For example Text(3), string of three characters.  Secondly, how is it
> enforced during updates?  Many thanks!

Sqlite will make you work a little to enforce contraints like this. If you 
really want to, here's how:

CREATE TABLE foo (a); -- we want a to be at most three characters long
CREATE TRIGGER foo_a_insert_constraint BEFORE INSERT ON foo
SELECT RAISE(ABORT, "a is too long") WHERE length(NEW.a) > 3;
CREATE TRIGGER foo_a_update_constraint BEFORE UPDATE OF a ON foo
SELECT RAISE(ABORT, "a is too long") WHERE length(NEW.a) > 3;

sqlite> INSERT INTO foo VALUES("abcd");
SQL error: a is too long
sqlite> INSERT INTO foo VALUES("abc");

Adjust table and column names, and string length to suit. See for reference material. This 
approach can be taken to check a wide variety of constraints. If you can 
define a WHERE clause that can detect a problem, you can abort the 
offending operation using a couple of triggers.

Does anyone have a more succinct version of the above?


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] sqlite not sharing page cache across processes?

2004-03-21 Thread ben . carlyle

"Mark D. Anderson" <[EMAIL PROTECTED]>
22/03/2004 05:15 AM

Subject:[sqlite] sqlite not sharing page cache across processes?

> It appears from looking at pager.c that every process working
> with a database acquires its own page cache by use of read().
> They independently keep their individual cache in sync by
> playing back the (single) journal.
> But is my understanding of the tradeoff correct?

My reading of the code has the readers flushing their cache often, and not 
using the journal at all to maintain consistency. My reading has the 
reader start with nothing in the cache, lock the database with a read 
lock, perform as many SQL queries as are in the single text string it's 
currently parsing, and flush its cache around the time it unlocks the 
database. The cache remains valid only as long as the lock is in-place, 
and the operating-system is responsible for any caching between locks.

Replay of the journal only occurs if a writer has terminated (lost its 
lock without committing or rolling-back its transaction). The next reader 
will roll-back the transaction by restoring original pages from the 


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

RE: [sqlite] Resetting primary key auto-increment after table re-create

2004-03-08 Thread ben . carlyle
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 09/03/2004 10:56 AM -

Ben Carlyle
09/03/2004 10:54 AM

To: "Roger Hawkins" <[EMAIL PROTECTED]>@CORP
Subject:RE: [sqlite] Resetting primary key auto-increment after table 


"Roger Hawkins" <[EMAIL PROTECTED]>
08/03/2004 03:18 PM

Subject:RE: [sqlite] Resetting primary key auto-increment after table 

> Thanks for your reply...
> I havent found any reference to this in the doco so Im not sure how any
> of this might work..
Pay particular attention to the "is logically equivalent" part.

> So what you are saying is that if I recreate the table I can insert all
> the old values back into the newly created table (including the primary
> key values) and the primary key column wont complain? That's a bit
> scary!

So long as you don't insert the same number into the INTEGER PRIMARY KEY 
twice, sqlite will not complain. It's a primarily a primary key. It only 
doubles as an auto-increment field when you insert NULLs into it.

> Even if this is true what happens when I next insert a value (normally
> without specifying the primary key field) - does it just pick up the
> latest?

> Any one tried this?

Yes. You can, too:

INSERT INTO foo VALUES(3); -- SQL Error: PRIMARY KEY must be unique



To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] Delete a database file

2004-02-26 Thread ben . carlyle
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 27/02/2004 09:54 AM -

Ben Carlyle
27/02/2004 09:34 AM

To: "Benoit Gantaume" <[EMAIL PROTECTED]>@CORP
Subject:Re: [sqlite] Delete a database file


"Benoit Gantaume" <[EMAIL PROTECTED]>
26/02/2004 08:45 PM

Subject:[sqlite] Delete a database file

> I just want to suppress a database file.
> But it's impossible to do it from the procees that created or opened the 
database even if i closed it!
> I guess that I do not close properly the database.
> The only way to be able to remove the file is to close the process that 
used the database...
> Do you have some tip?

Under unix, just unlink it. The file will actually be deleted when your 
process closes it. Under Windows, you can't delete any open file of any 
kind, ever. You have to close the file, first. If you issue sqlite_close 
calls for each sqlite* pointer you recieved from sqlite_open the file will 
be closed.

But the question is, why do you want to close the file? Just to free up 
some space? Why are you using an on-disk database when you could use an 
in-memory database?


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] multithreading

2004-02-26 Thread ben . carlyle
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 27/02/2004 09:53 AM -

Ben Carlyle
27/02/2004 09:49 AM

Subject:Re: [sqlite] multithreading


26/02/2004 04:56 PM

Subject:[sqlite] multithreading

>   Since I'm 
> debugging, I'm crashing a lot before I have a chance to sqlite_close() 
> the database.  Is that the problem?  If it is, is there any way to 
> unwedge a database if the app crashes before closing it?

sqlite uses operating-system locks. When your process terminates, the 
operating system will remove all locks that process owned. I presume from 
the fact you can delete the file while it's still locked (and therefore 
open) that you're running a unix variant? I would be checking the output 
of lsof to see exactly which process still had the file open.

> I'm trying to access the database from two threads: the UI thread 
> mostly reads and occasionally updates rows, while the network thread 
> loads new messages and inserts them.  Will that work, or do I just 
> misunderstand how SQLite does multithreading?

It is important that each thread has its own sqlite*, each coming from a 
separate sqlite_open call. If that condition is met, nothing should 
actually misbehave. Be warned, though, that each thread will lock out all 
other threads while it is querying or modifying the database. If any 
thread locks the database for too long, the other threads attempting to 
access the database will have SQLITE_BUSY returned. It is possible to hack 
blocking locks into sqlite to make the waiting thread sleep, instead, but 
I'll leave that as an exercise for the reader (or drh).


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

RE: [sqlite] multithreading

2004-02-26 Thread ben . carlyle
"rumcais" <[EMAIL PROTECTED]>
26/02/2004 05:16 PM

Subject:RE: [sqlite] multithreading

> I've make some experiments around accessing the SQLite database fom 
concurent processes at the same time
> and I find it work good when I'm using transactions every time I touch 
the database.
> When I don't, it corrupt the database file sometimes.

In sqlite,


is equivalent to


If you are getting database corruption then you are running into a very 
strange and very serious bug, or your code (not sqlite) is corrupting the 
database (in which case it is still a very serious bug, just not in sqlite 
:). I suggest you reinvestigate and report. If its reproducable it should 
be possible to track down.


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] Primary key and index

2004-02-08 Thread ben . carlyle
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 09/02/2004 12:04 PM -

Ben Carlyle
09/02/2004 12:04 PM

To: Bertrand Mansion <[EMAIL PROTECTED]>@CORP
Subject:Re: [sqlite] Primary key and index


Bertrand Mansion <[EMAIL PROTECTED]>
09/02/2004 04:29 AM

Subject:[sqlite] Primary key and index

> Does the declaration of an INTEGER PRIMARY KEY implies the creation of 
> index ? I am asking because I have noticed a performance boost when I 
> an index on an INTEGER PRIMARY KEY when sorting rows with the primary 
> column.

Yes, but not in the way you're thinking.

> More generally, does declaring any column primary key implies that this
> column will be indexed ? I am not sure about that because when I do a 
> index_list(), there is no index shown for the primary key.

Yes, although an INTEGER PRIMARY KEY is a special case of this.

> Thanks for any hints,


"Specifying a PRIMARY KEY normally just creates a UNIQUE index on the 
primary key. However, if primary key is on a single column that has 
datatype INTEGER, then that column is used internally as the actual key of 
the B-Tree for the table. This means that the column may only hold unique 
integer values. (Except for this one case, SQLite ignores the datatype 
specification of columns and allows any kind of data to be put in a column 
regardless of its declared datatype.) If a table does not have an INTEGER 
PRIMARY KEY column, then the B-Tree key will be a automatically generated 
integer. The B-Tree key for a row can always be accessed using one of the 
special names "ROWID", "OID", or "_ROWID_". This is true regardless of whether or not 
there is an INTEGER PRIMARY 

1 Table = 1 BTree, the BTree holds the data and is ordered by ROWID
1 Table with 1 Index = 2 BTrees, the second referring to rows in the first
1 Table with PRIMARY KEY = 1 Table with 1 (unique) Index
1 Table with INTEGER PRIMARY KEY = 1 Table, with its own BTree forming its 
unique index


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] Converting Sqlite to Myqsl

2004-01-21 Thread ben . carlyle
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 22/01/2004 11:13 AM -

Ben Carlyle
22/01/2004 11:06 AM

To: Martin Brinkmann <[EMAIL PROTECTED]>@CORP
Subject:Re: [sqlite] Converting Sqlite to Myqsl


Martin Brinkmann <[EMAIL PROTECTED]>
22/01/2004 10:02 AM

Subject:[sqlite] Converting Sqlite to Myqsl

> I tried to find a easy way of converting the file to mysql format, i 
> to use
> the data for a website of mine. Until now, i did not find a way to do it 

> and
> now i would like to ask you if you could provide me with an easy to 
> understand
> solution to this problem i face.

Have a look at the dump command on this web page:

The simple answer is
sqlite filename.idb .dump > sqlcommands.txt

sqlcommands.txt will contain the SQL commands required to re-create 
filename.idb using sqlite. You may need to edit the sqlcommands.txt to 
make sure the table columns have the correct types for mysql import.


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] Clarification on file locking in web-served apps

2004-01-14 Thread ben . carlyle

Vania Smrkovski <[EMAIL PROTECTED]>
15/01/2004 12:14 PM
Please respond to vania

To: "SQLite-Users (E-mail)" <[EMAIL PROTECTED]>
Subject:[sqlite] Clarification on file locking in web-served apps

>  Why the heck would anyone even need Transactions in the first place? If 
the DB file is locked, no changes can be made, right?

The database file is only locked while a transaction is active. Since a 
transaction is implicitly started and ended every time you do an update if 
one's not already running this means that whenever you're modifying the 
data the whole file is locked. When your transaction ends the file is 
unlocked again.
Likewise, when you query the database the file is read-locked for the 
duration of the query. In this instance multiple programs (or threads) can 
query the database at the same time, but if any query is active updates 
have to wait.

sqlite_open does not lock the database. You can keep the database open as 
long as you like, and whenever you're not actually executing SQL the file 
will be unlocked.

Note that the other thing transactions will give you is a guarantee of 
atomicity. Either the whole transaction gets written or none of it does. 
If your program crashes (or your machine loses power) before the end of 
the transaction, the partial updates will be reversed next time you open 
the database.

>  So I'm wondering if I am missing a few pieces  Is the database file 
locked more than once during a process?  That is, does it get  locked as 
the Update statement is getting a collection of rows with which to apply 
the intended actions?  Does it then release the lock as it prepares the 
update on this subset of data, and then re-lock when it prepares to write? 
 If so, that would explain the need for a Transacction, as it leaves gaps 
of access during the transaction.

If you do these as separate SQL statements without a transaction its 
possible that someone else can get a write in, in-between your own query 
and update.

>  Ignoring Transactions for a second, if I have such a Select, will every 
user ben locked behind a wall until the Select for user 1 is complete? 
>  And if this is not the case for Select, will it be so for Update/Insert 
of this lenth?

Other selects can operate concurrently, but updates will have to wait 
until all selects have finished.


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] locked - what am I doing wrong?

2004-01-08 Thread ben . carlyle
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 09/01/2004 10:44 AM -

Ben Carlyle
09/01/2004 09:35 AM

To: "David Swigger" <[EMAIL PROTECTED]>@CORP
Subject:Re: [sqlite] locked - what am I doing wrong?


"David Swigger" <[EMAIL PROTECTED]>
09/01/2004 12:04 AM

Subject:[sqlite] locked - what am I doing wrong?

> I am having a bit of a hard time understanding something:

> When I am in the middle of stepping (sqlite_step) through the
> results of a table query, the database is locked.  Is this normal?  Have
> I missed something?

This is discussed in Question 7 of the FAQ, but I think the first 
paragraph is a little misleading. I think that:
"Multiple processes can have the same database open at the same time. 
Multiple processes can be doing a select at the same time. But only one 
process can be making changes to the database at once."
should read:
"Multiple processes can have the same database open at the same time. 
Multiple processes can be doing a select at the same time. But a process 
can't make changes to the database while any readers or other writers are 

The important thing to understand here, is that you're not stepping 
through a set of results that you have in memory and that were extracted 
from the database. You are (at least for simple queries) stepping through 
data that is still in the database. Because you're still actively reading 
the database the writer thread can't get access until your query is 
finished. If, instead, you want to load the whole results set into memory 
before stepping through them consider the sqlite_get_table_* functions. 
Alternatively you can collect the results into memory any way you like 
before stepping through the structures you've created yourself.


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

RE: [sqlite] Documentation

2004-01-05 Thread ben . carlyle
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 06/01/2004 08:34 AM -

Ben Carlyle
06/01/2004 08:31 AM

To: "Allan Edwards" <[EMAIL PROTECTED]>@CORP
Subject:RE: [sqlite] Documentation


"Allan Edwards" <[EMAIL PROTECTED]>
06/01/2004 12:18 AM

To: "'Roger Reghin'" <[EMAIL PROTECTED]>, "'Ralph Wetzel'" 
Subject:RE: [sqlite] Documentation

> CREATE TABLE Allan (DateTimeField DateTime);
> INSERT INTO Allan (DateTimeField) VALUES ('2004-01-04 12:00:00');
> SELECT DateTimeField FROM Allan WHERE datetime(DateTimeField) =
> datetime('2004-01-04 12:00:00');
> SELECT DateTimeField FROM Allan WHERE datetime(DateTimeField) >
> datetime('2004-01-03 12:00:00');
> SELECT DateTimeField FROM Allan WHERE datetime(DateTimeField) <
> datetime('2004-01-03 12:00:00');

Part of the appeal of this particular time and date format is that for 
normal equality and comparison tests like the ones you've done you don't 
need to call any datetime funcations. String comparison is fine (at least 
until the year 1 needs to be represented) because the digits are 
always in exactly the same places in the string (which is why it's 
important to still include the leading 0 chars):

SELECT DateTimeField FROM Allan WHERE DateTimeField = '2004-01-04 
SELECT DateTimeField FROM Allan WHERE DateTimeField > '2004-01-03 
SELECT DateTimeField FROM Allan WHERE DateTimeField < '2004-01-03 

By dropping the transformation you permit the current version of sqlite to 
use any relevant index to do these selections.

I don't know the ANSI standard to which you referred, but XSD defines the 
datetime with a few extra tweaks: A T between the date and time, optional 
sub-second resolution, and an optional time-zone: 


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] Concurrency in SQLite

2003-11-24 Thread ben . carlyle

"D. Richard Hipp" <[EMAIL PROTECTED]>
24/11/2003 03:22 AM

Subject:[sqlite] Concurrency in SQLite

> Please, give me some examples of the kinds of things you are
> doing which could benefit from improved concurrency.
>   *  Are you holding transactions open for an extended period
>  of time?  Why?

This is my situation. I have a large amount of data flowing into a 
database which shows historical records (maybe a couple of thousand 
inserts per second). Queries are much rarer. To keep inserts efficient I 
hold transactions open for one second at a time. The most important change 
for me is one that I introduced into my copy: Blocking locks. These are 
important because there is only an instant between the last transaction 
closing and the next beginning. In this scenareo the poll-based locking 
mechnism currently used by sqlite is just not lucky enough to try at that 
instant. Only blocking locks with their operating-system support are 
sufficient to ensure that the readers get in at all. I also have a 
situation where I have multiple writers on the database that can run into 
the same problem.

If you could ensure that readers could still read the untouched version of 
database blocks while a writer is working on "dirty" version of the same 
blocks I wouldn't have any problems as far as reading is going. Writing 
would still be problem, though. It's not the amount of concurrency that's 
a problem for me. One at a time is fine. It's just the ability to schedule 
the accesses that do happen very tightly together that I care about.

>*  How many processes do you have trying to access the database
>   at once?

Usually at most two or three.

>   *  How do you currently handle SQLITE_BUSY replies?  Do you use
>  the sqlite_busy_handler() or sqlite_busy_timeout() APIs?

The problem with both of these apis is that they use timers beetween 
attempts. If I could put a blocking lock on the database in the busy 
handler, allow the database access to occur, then get called back to 
unlock the database, it would be almost as good as the current blocking 
lock situation.

>   *  How large are your databases?

Usually less than a gig :)

>   *  Do you ever put database files on a shared filesystem?



To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] database table is locked

2003-11-06 Thread ben . carlyle
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 07/11/2003 10:03 AM -

Ben Carlyle
07/11/2003 10:03 AM

Subject:Re: [sqlite] database table is locked

Thiago Mello <[EMAIL PROTECTED]>
08/11/2003 09:00 AM

Subject:[sqlite] database table is locked

> Im doing a SELECT sql query, and in the callback function of this sql
> query I do a UPDATE, so when I do this update I get 
> database table is locked.

> How I cant do the UPDATE in the second sqlite_exec() function?!

Either separate the two operations or bring them closer together. To 
separate them do something like:

SELECT ...; -- Collate results
-- Action results:

To bring them closer together do something like this:

UPDATE foo = bar WHERE (SELECT ...)

It depends on what your query is and how you want to update the table in 
response to your table data.


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [sqlite] Performance problem

2003-11-06 Thread ben . carlyle
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 07/11/2003 10:00 AM -

Ben Carlyle
07/11/2003 10:00 AM

To: "Mrs. Brisby" <[EMAIL PROTECTED]>@CORP
Subject:Re: [sqlite] Performance problem

"Mrs. Brisby" <[EMAIL PROTECTED]>
07/11/2003 12:08 AM

To: "Jonas Forsman / Axier.SE" <[EMAIL PROTECTED]>
cc: "Clark, Chris" <[EMAIL PROTECTED]>, "D. Richard Hipp" <[EMAIL 
Subject:Re: [sqlite] Performance problem

> On Wed, 2003-11-05 at 23:59, Jonas Forsman / Axier.SE wrote:
> > Note: Testing has shown PostgreSQL's hash indexes to be similar or 
> > than B-tree indexes, and the index size and build time for hash 
indexes is
> > much worse. Hash indexes also suffer poor performance under high
> > concurrency. For these reasons, hash index use is discouraged.
> Please note I'm note I'm not talking about a hash of the entire key- I'm
> talking about n distinct b-trees that are selected by an 8->n bit
> function. This transformation can be made very fast: We get a speed
> improvement here on searches if our 8->n bit function takes less time
> than n-1 random memcmp()'s.

How would you handle the lack of ordering associate with hash tables? 
Sqlite can currently use indicies for three main tests: equals, less than, 
and greater than. While hash-tables are good at finding equal-to in 
constant time it usually means linear time (a table-scan) to test for less 
than or greater than. Do you have a solution to this problem?


To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Re[2]: [sqlite] Performance problem

2003-11-03 Thread ben . carlyle
- Forwarded by Ben Carlyle/AU/IRSA/Rail on 04/11/2003 02:26 PM -

Ben Carlyle
04/11/2003 02:01 PM

To: Doug Currie <[EMAIL PROTECTED]>
Subject:Re: Re[2]: [sqlite] Performance problem

03/11/2003 05:39 AM
Please respond to Doug Currie

Subject:Re[2]: [sqlite] Performance problem

> > Can any readers suggest ways that I have not thought of for making
> > large numbers of non-localized INSERTs go faster?

> Instead of journaling the old ("undo") pages of idx2, simply record
> the fact that idx2 is modified (i.e., make a single record in the
> journal). Upon recovery, noticing that idx2 is modified will cause it
> to be recreated from the rolled-back data.

I've been thinking about his for a while and I think the solution, as with 
many other potential sqlite performance issues is to solve it in the 
operating system, not sqlite it's self. If you don't have enough bandwidth 
to your file, raid it. If you want to be able to write a large number of 
transactions per second using journalling that lazily writes to the 
database enable data-journalling on your filesystem.

This is a couter-intuitve approach. When you apply data journalling on a 
filesystem such as linux's ext3 the operating system writes all data to a 
journal before writing it to the appropriate locations on disk. You may 
think that because its being written twice (four times in the case of 
sqlite, because the journal will be written twice and the file will be 
written twice) it will take longer to work. When you use fsync() alot the 
opposite is actually true. You get the behaviour that other list members 
have been discussing with the combined undo/redo log. The data updates are 
written sequentially by the operating system to disk so seek-time doesn't 
come into it, and the fsync() simply ensures that your file has been 
written to the sequential log. The operating system it's self deals with 
the lazy writing to the actual disk, and it's transparent to sqlite 
because the operating system cache makes it appear that its already been 

This only works when you use all-data journalling, not just meta-data like 
directory entries, and regular writes to the same filesystem might be a 
little slower... but try it. You should get better performance with 
data-journalling and sqlite.

As this is a problem that can be solved very nicely indeed in the 
operating system I think that sqlite should make no special provisions for 
it. I do suggest that the relevant FAQs should contain the instruction to 
use a data-journalled fs if a large number of transactions must be pushed 
through sqlite.


Re: [sqlite] Journalling

2003-10-27 Thread ben . carlyle

27/10/2003 05:16 PM

To: "Mrs. Brisby" <[EMAIL PROTECTED]>
cc: sqlite <[EMAIL PROTECTED]>
Subject:Re: [sqlite] Journalling

> I am trying to use sqlite in a context where I will be using it to store 
some configuration about a system. I want to try to minimize the disk 
access. Since journalling uses a file on the disk, I wanted to turn it 
off. I am not worried about rolling back the database to a known state in 
case of some failure.

You're not worried about your database becoming corrupted and all your 
data being destroyed? It doesn't sound like you like your data very 
much... ;)

This is a question that pops up on the list every so often, and there have 
been some good reasons for it. Well. One comes to mind, and that's the use 
of flash memory in embedded devices. When you don't want to write to your 
media too many times you might find that it's better to turn off 
journalling and risk the consequences... perhaps make regular backups... 
rather than write to the media too often.

The problem is that most people don't know what they're talking about when 
they ask how to turn journalling off. They don't understand when the 
journal gets written in the first place and they don't understand which 
operations they're performing that aren't affected by journalling. They 
haven't read the list archives, and they patently haven't read the manual, 
because it's listed under the pragma section of

This is why when you ask the question on this list you get the response
"Well I know you've asked how to turn off journalling, but what do you 
actually want to achieve by this and what alternatives have you 

You haven't yet given an explination that makes sense to me, so in the 
spirit of RTFM I'll leave you to find the exact manual reference yourself. 
I think it's worth you understanding, though, that journalling doesn't 
occur when you're only querying the database. It only happens when you 
modify the database. Using transactions while modifying the database is 
not only a good idea for data integrity, it also makes the overhead 
associated with synching the file to disk almost disappear so there's 
usually no need at all to turn off journalling. Given all of this, if you 
still can't find the exact spot in the manuals to turn this off yourself 
perhaps you could offer a more complete discussion about the nature of 
your database and your access to it. You'd be well advised to discuss the 
alternatives you have considered so that the gentle list members will feel 
more compelled to answer your question directly.

--Premature optimisation is the root of all evil

Re: [sqlite] Problem with LIMIT and transactions? Or is it just me

2003-10-22 Thread ben . carlyle

"Jay Macaulay" <[EMAIL PROTECTED]>
23/10/2003 05:55 AM

Subject:[sqlite] Problem with LIMIT and transactions?  Or is it just me

> I am running into a real odd problem that I hope someone can explain.  I
> have a simple query:

> SELECT * FROM users WHERE uid >= 1 LIMIT 10;

> Simple enough.  No reason to use a transaction, but I use them for 
> reason, actually.  The problem I run into is:  If I execute the above
> statement 3 times, it works fine.  It takes 70 ms to execute the select
> statement.  On the fourth time of execution the select statement takes 
> 5 seconds to execute!!  I can execute the statement again after the 5
> seconds, and it's back to a 70 ms execution.  Then after other time I
> execute this statement it takes 5 seconds.

This does seem difficult to explain. Is it possible that another process 
or thread is accessing the database occasionally (thus locking this 
"writer" out for a period?).  Apart from that, you probably have to work 
through a process of elimination. Does it still happen without the LIMIT 
clause? Does it still happen if you take the transaction away? Is there 
anything else going on on the machine at the time? How are you running 
your test? How are you obtaining your timings? What's the schema of the 
database? Is there an index?

Hopefully the answers to some of these questions will bring you closer to 
a conclusion.