----- Forwarded by Ben Carlyle/AU/IRSA/Rail on 05/11/2003 12:37 PM -----
Ben Carlyle
05/11/2003 12:37 PM
To: Doug Currie <[EMAIL PROTECTED]>
cc:
Subject: Re: Re[4]: [sqlite] Performance problem
Doug Currie <[EMAIL PROTECTED]>
05/11/2003 05:36 AM
Please respond to Doug Currie
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
cc:
Subject: Re[4]: [sqlite] Performance problem
Monday, November 3, 2003, 11:22:55 PM, Ben Carlyle wrote:
> >> > Can any readers suggest ways that I have not thought of for making
> >> > large numbers of non-localized INSERTs go faster?
> >> Instead of journaling the old ("undo") pages of idx2, simply record
> >> the fact that idx2 is modified (i.e., make a single record in the
> >> journal). Upon recovery, noticing that idx2 is modified will cause it
> >> to be recreated from the rolled-back data.
> > I've been thinking about this for a while and I think the solution,
> > as with many other potential sqlite performance issues, is to solve
> > it in the operating system, not sqlite itself. If you don't have
> > enough bandwidth to your file, raid it. If you want to be able to
> > write a large number of transactions per second using journalling
> > that lazily writes to the database, enable data-journalling on your
> > filesystem.
> Using the OS or RAID may solve many problems. In this case, though, I
> proposed a solution that eliminated all of the index journaling
> normally applied by sqlite. In D. Richard Hipp's example there were
> 30000 writes to (mostly) different pages for idx2. With sqlite
> presently this means 60000 pages written, and a lot of disk head
> motion. With my approach, it means 30001 pages written (lazily) -- and
> it works without added hardware or a journaling file system.
This sounds complicated, but it would probably improve performance for a
single transaction that touches a large number of index disk blocks. I
think OS-level data journalling would provide still better performance in
this case, and also where a large number of transactions are required in a
short time.
The biggest problem with committing a transaction is not bandwidth to the
disk, it's moving the head around the disk. If you have to write data to a
lot of non-sequential blocks scattered around a database file, the disk
bandwidth will be underutilised as the head moves from one place to
another. The best way to write data to disk in bulk is to make sure your
blocks are sequential whenever you're performing a time-critical write,
such as the write that occurs during a commit.
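
As a rough illustration of the effect (this is not sqlite code; the file
names, page size, page count and the scattering pattern are all made up),
the sketch below writes the same number of pages once sequentially and
once scattered around the file, with a single fsync() standing in for the
commit:

    #define _XOPEN_SOURCE 500        /* for pwrite() */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/time.h>
    #include <unistd.h>

    #define PAGE_SIZE 1024
    #define NPAGES    30000

    static double now(void)
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec + tv.tv_usec / 1e6;
    }

    /* Write NPAGES pages, either in order or jumping around the file,
     * then force them to disk with one fsync(). */
    static double write_pages(const char *path, int scattered)
    {
        char page[PAGE_SIZE];
        memset(page, 0xAB, sizeof(page));

        int fd = open(path, O_RDWR | O_CREAT, 0644);
        if (fd < 0) { perror("open"); exit(1); }

        double start = now();
        for (long i = 0; i < NPAGES; i++) {
            /* 7919 is prime, so this visits every page exactly once. */
            long pageno = scattered ? (i * 7919) % NPAGES : i;
            if (pwrite(fd, page, PAGE_SIZE, (off_t)pageno * PAGE_SIZE)
                    != PAGE_SIZE) {
                perror("pwrite"); exit(1);
            }
        }
        fsync(fd);               /* the time-critical moment */
        double secs = now() - start;
        close(fd);
        return secs;
    }

    int main(void)
    {
        printf("sequential: %.2fs\n", write_pages("seq.dat", 0));
        printf("scattered:  %.2fs\n", write_pages("scat.dat", 1));
        return 0;
    }

Exactly the same number of bytes is written in both cases, but the
scattered case gives the disk far more head movement to do.
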
While writing sequentially to a file does not guarantee that the blocks
used are sequential, writing your data through the filesystem journal does
guarantee a sequential disk area. Because the head never has to move, it's
possible to write and commit a large number of blocks to the journal
exactly when you need to: at commit time. After the commit, the OS works
out the details of getting the data into the right spots on disk when its
own commit triggers.
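
On Linux, ext3 mounted with data=journal behaves this way: file data as
well as metadata goes through the journal. As a minimal sketch of the
pattern itself (the struct, field names and page size below are
assumptions of mine, not the ext3 or sqlite implementation), everything
dirty is appended to one sequential log and fsync()ed in a single pass,
and the scattered writes into the real file happen later, outside the
time-critical window:

    #define _XOPEN_SOURCE 500        /* for pwrite() */
    #include <unistd.h>

    #define PAGE_SIZE 1024

    struct dirty_page {
        long pageno;             /* where the page eventually belongs */
        char data[PAGE_SIZE];    /* the new contents of that page     */
    };

    /* Time-critical part: one sequential append plus one fsync(). */
    int commit_to_log(int log_fd, const struct dirty_page *pages, int n)
    {
        for (int i = 0; i < n; i++) {
            if (write(log_fd, &pages[i], sizeof(pages[i]))
                    != (ssize_t)sizeof(pages[i]))
                return -1;
        }
        return fsync(log_fd);    /* durable once this returns */
    }

    /* Lazy part: scattered writes into the real file, done later. */
    int apply_log(int db_fd, const struct dirty_page *pages, int n)
    {
        for (int i = 0; i < n; i++) {
            off_t off = (off_t)pages[i].pageno * PAGE_SIZE;
            if (pwrite(db_fd, pages[i].data, PAGE_SIZE, off) != PAGE_SIZE)
                return -1;
        }
        return fsync(db_fd);     /* log entries can be dropped after this */
    }
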
In this way, it doesn't really matter how many blocks you change in the
database file or in the sqlite journal file. The operating system makes
sure that the data is written efficiently while the time-critical events
are occurring.
The reason that I think data journalling would work better than the lazy
journalling of indices is that with lazy journalling you still need to
update the indices themselves. When there are a large number of indices
the head still has to move all over the disk to write the relevant bits.
The process of committing data is 1) commit the journal, 2) update and
commit the database file, 3) delete the journal. You see, the problem is
not that too much data is being written to the journal. The journal is
probably sequential anyway, so slow movement of the disk head is not an
issue there. The problem is updating the database file itself, where
updates are less likely to be to sequential blocks. Data-journalling in
the OS solves the problem of changes to the database file being scattered
all over the place by writing them to its own, guaranteed-sequential, log.
When the commit occurs in user-space the filesystem doesn't force the
blocks to be written to their final disk areas at that time-critical
moment; the commit in user-space only forces the data to the filesystem
log.
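
Put as a sketch (the helper functions and journal file name below are
placeholders of mine, not sqlite's internal routines), the three steps
look like this, and it's step 2, not step 1, that costs the head movement:

    #include <unistd.h>

    /* Placeholder: append the journal records sequentially. */
    static int write_journal_records(int journal_fd)
    { (void)journal_fd; return 0; }

    /* Placeholder: write the changed pages into the database file.
     * These pages are mostly non-sequential, so this is where the
     * head does most of its travelling. */
    static int write_changed_pages(int db_fd)
    { (void)db_fd; return 0; }

    int commit_transaction(int journal_fd, int db_fd)
    {
        /* 1) Commit the journal: sequential writes plus one fsync(). */
        if (write_journal_records(journal_fd) != 0) return -1;
        if (fsync(journal_fd) != 0) return -1;

        /* 2) Update and commit the database file: scattered writes. */
        if (write_changed_pages(db_fd) != 0) return -1;
        if (fsync(db_fd) != 0) return -1;

        /* 3) Delete the journal: transaction fully applied. */
        return unlink("example.db-journal");
    }
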
On another note:
DRH: Will the changes to indices allow us to define arbitrary collation
functions? If so, will those indices be used when a query is done that
could use the arbitrary collation function?
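
Purely as a sketch of the kind of interface I have in mind (all of the
names below, sqlite3_create_collation(), SQLITE_UTF8, the collation name
and the comparison function, are illustrative guesses rather than calls
the current sqlite API provides), registering a collation might look
something like:

    #include <ctype.h>
    #include <sqlite3.h>

    /* A toy case-insensitive comparison (ASCII only). */
    static int nocase_cmp(void *unused, int n1, const void *p1,
                          int n2, const void *p2)
    {
        const unsigned char *a = p1, *b = p2;
        int n = n1 < n2 ? n1 : n2;
        (void)unused;
        for (int i = 0; i < n; i++) {
            int d = tolower(a[i]) - tolower(b[i]);
            if (d != 0) return d;
        }
        return n1 - n2;
    }

    int main(void)
    {
        sqlite3 *db;
        if (sqlite3_open("test.db", &db) != SQLITE_OK) return 1;

        /* Register the collation under a name SQL can refer to. */
        sqlite3_create_collation(db, "mycase", SQLITE_UTF8, NULL,
                                 nocase_cmp);

        /* The second half of the question: an index built with the
         * same collation, e.g.
         *     CREATE INDEX t_name_idx ON t(name COLLATE mycase);
         * could then, in principle, satisfy queries that compare
         * with it. */
        sqlite3_close(db);
        return 0;
    }
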
Benjamin.