
> Background: Will 64 bit versions of SQLite buy us any significant
> difference in performance?

You need to have some benchmark code first, ideally in the form of a
representative script to feed to the sqlite shell.  That way we can run
the code for you and give feedback.
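As a starting point, here is a minimal sketch of how such a script could be generated from Python.  Everything in it is an assumption for illustration: the function names make_benchmark_sql and rows_loaded, the table name "log", and the made-up row contents.  Replace the inserts with statements that resemble your real workload.

```python
import sqlite3

def make_benchmark_sql(n_rows, rows_per_txn):
    """Return SQL text that inserts n_rows rows, committing every rows_per_txn rows."""
    # "log" is a hypothetical table standing in for the real schema.
    lines = ["CREATE TABLE log(id INTEGER PRIMARY KEY, line TEXT);"]
    for start in range(0, n_rows, rows_per_txn):
        lines.append("BEGIN;")
        for i in range(start, min(start + rows_per_txn, n_rows)):
            lines.append("INSERT INTO log(line) VALUES('entry %d');" % i)
        lines.append("COMMIT;")
    lines.append("SELECT count(*) FROM log;")
    return "\n".join(lines) + "\n"

def rows_loaded(sql):
    """Sanity check: run the script in a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(sql)
        return conn.execute("SELECT count(*) FROM log").fetchone()[0]
    finally:
        conn.close()
```

Write the returned text to a file and feed that file to the sqlite shell on standard input.  Varying rows_per_txn shows how much of the run time is spent waiting on commits rather than on the inserts themselves.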

In the specific case of SQLite the biggest determinant of performance
is the hard drives and the speed they spin at.  For transactions, data
has to be on the oxide and confirmed as such by the drive.  A 7200rpm
drive turns 120 times per second, so each spot is under the heads 120
times per second.  Each transaction requires two syncs so the maximum
possible transaction rate is 60 per second.  This applies even if the
data is spread across RAID stripes since every drive in the array still
has to confirm the data is committed.  For random access the latency is
also governed by the drive speed, limiting you to a maximum of about 120
random accesses per second.  Sequential access will be significantly
faster but again the maximum rate is a function of packing density and
rpm.  You can increase the amount of data that can be sent to/read from
the drives by using a RAID array, or by using drives that support NCQ.
For example my bog standard Seagate 750GB drive will accept 100MB/s for
writes and does 70MB/s for mostly sequential reads.  A two drive array
would roughly double that.
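The arithmetic above, plus a rough way to check it on your own hardware, can be sketched in Python.  The function names are made up for illustration, and the measurement assumes SQLite's default settings, where each commit waits for the drive.  Treat the measured figure as indicative only: write caches that acknowledge a sync early will report rates well above the rotational ceiling.

```python
import sqlite3
import time

def max_transactions_per_second(rpm, syncs_per_transaction=2):
    """Upper bound from rotation speed alone: at most one sync per rotation."""
    rotations_per_second = rpm / 60.0
    return rotations_per_second / syncs_per_transaction

def measure_commit_rate(db_path, n_transactions=50):
    """Time n single-row transactions against an on-disk database file."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS t(x INTEGER)")
    conn.commit()
    start = time.time()
    for i in range(n_transactions):
        conn.execute("INSERT INTO t VALUES(?)", (i,))
        conn.commit()  # one transaction per row, so one commit per row
    elapsed = time.time() - start
    conn.close()
    return n_transactions / elapsed if elapsed > 0 else float("inf")

print(max_transactions_per_second(7200))   # 60.0
```

Calling measure_commit_rate with a path on the drive you care about gives a number to set against the theoretical 60 per second.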

> I may have a chance to get our department 64 bit AMD workstations

It is a very good idea to get 64 bit machines and all current
workstation processors are 64 bit anyway.  64 bit Windows and Linux will
also run and compile 32 bit binaries so you can test compatibility.

> with 8G, 

There is little point in having more than about 3G with a 32 bit
operating system, as it needs roughly the last 1G of the 4G physical
address space for peripherals, video memory etc.

> These workstations will
> be processing very large text based log files (> 16G each).

Generally using 64 bit operating systems makes life considerably easier
for developers.  As an example you can just memory map the files and
then treat them as a char*.  You can assemble multiple files in memory,
do sorting etc without having to worry about hitting limits.
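The same approach works from Python through the standard mmap module.  The following is a minimal sketch (count_lines_mmap is a made-up name) that maps a whole file read-only and scans it for newlines.  In a 32 bit process, mapping a 16G file this way would fail because it does not fit in the address space.

```python
import mmap

def count_lines_mmap(path):
    """Count newline characters in a file by memory-mapping it."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file.  An empty file cannot be mapped
        # and raises ValueError, which this sketch does not handle.
        m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            count = 0
            pos = m.find(b"\n")
            while pos != -1:
                count += 1
                pos = m.find(b"\n", pos + 1)
            return count
        finally:
            m.close()
```

The operating system pages the file in on demand, so the script never has to hold the whole log in its own buffers.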

Note that SQLite itself has limits which are 32 bit based even in a 64
bit process.  For example the largest a single record can be is 2GB
(signed 32 bit maxint).

In terms of general 32 bit vs 64 bit performance, the AMD64 instruction
set doubles the number of registers from 8 to 16.  But pointers take up
twice the space (8 rather than 4 bytes).  So basically the code can have
more balls in the air, but needs more memory traffic to keep them
there.  If you have access to 64 bit Linux you can compile 64 and 32 bit
versions of the SQLite shell for easy comparison.

  $ ./configure --disable-shared CC="gcc -m32" && make && mv sqlite3 \

Repeat, replacing 32 with 64.  The resulting binaries are 371kb (32 bit)
and 428kb (64 bit).
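When the comparison is driven from Python rather than the shell, it is worth confirming which kind of interpreter is actually running, since a 32 bit Python on a 64 bit OS behaves as a 32 bit process.  A small sketch (pointer_bits is a made-up helper name):

```python
import struct
import sys

def pointer_bits():
    """Size of a C pointer in this Python process, in bits (32 or 64)."""
    return struct.calcsize("P") * 8

print("pointer size: %d bits" % pointer_bits())
print("sys.maxsize > 2**32: %s" % (sys.maxsize > 2**32))
```

Recording this alongside each timing keeps the 32 and 64 bit results from being mixed up.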

> We will be
> using Python 2.52 as our SQLite scripting tool.

Good tool choice.  That is far more productive for developers than C
development :-)  You may also want to try the APSW bindings as well as
pysqlite.  APSW is reported to be faster by others doing benchmarking.
(Disclaimer: I am the author of APSW).


sqlite-users mailing list
