Re: [sqlite] Syntax Error For 3.0.8 --> 3.2.1 Upgrade

2005-04-21 Thread William Hachfeld
On Thu, Apr 21, 2005 at 11:57:19AM -0700, Darren Duncan wrote:
> One of your problems is that you are using single quotes for a delimited
> identifier, "Open|SpeedShop", when you should be using double quotes.
> 
> Single quotes mean a string literal in the SQL standard, whereas double quotes
> mean a delimited identifier; these are very different things.
> 
> On the other hand, I think the SQL standard lets you define your own
> delimiter, for delimited identifiers, but even in that case it seems wrong to
> pick the same kind as that which always means string literal.

Thanks for the advice, Darren. I used ' instead of " for no other reason than
that it let me put SQL in my C++ source without \" escapes all over. As
something of a SQL newbie, I wasn't aware of the distinction. Using ' worked,
had the desired result, and so I went with it... I'll look into this more and
consider changing it in our code.
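To make the distinction concrete, here is a quick sketch using Python's built-in
sqlite3 module (illustrative only; the code in this thread is C++). Double
quotes delimit an identifier, while single quotes always produce a string
literal. SQLite also accepts [brackets] and `backticks` as identifier
delimiters, which sidesteps the \"-escaping annoyance inside C/C++ string
literals:

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Double quotes delimit an identifier (the SQL standard's rule), so a name
# containing "|" works fine:
db.execute('CREATE TABLE "Open|SpeedShop" (version INTEGER)')
db.execute('INSERT INTO "Open|SpeedShop" (version) VALUES (1)')

# SQLite's [bracket] form (also `backticks`) avoids \" escapes when the SQL
# lives inside C/C++ string literals:
version = db.execute('SELECT version FROM [Open|SpeedShop]').fetchone()[0]

# Single quotes produce a string literal -- a value, not a table reference:
literal = db.execute("SELECT 'Open|SpeedShop'").fetchone()[0]
```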

-- William Hachfeld


Re: [sqlite] Syntax Error For 3.0.8 --> 3.2.1 Upgrade

2005-04-21 Thread William Hachfeld

Quite right, Thomas and Jolan! I knew it was going to be something stupid on my
part. If I rename the field "column" to "clumn", it works just fine.
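For the archives: the likely trigger is that SQLite 3.2.0 added ALTER TABLE
... ADD COLUMN support, which made COLUMN a keyword. Quoting the identifier
(instead of renaming it) also works, as this sketch with Python's built-in
sqlite3 module shows:

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Double-quoting "column" lets the keyword be used as a field name:
db.execute('CREATE TABLE Statements (id INTEGER PRIMARY KEY,'
           ' linked_object INTEGER, file INTEGER, line INTEGER,'
           ' "column" INTEGER)')
db.execute('INSERT INTO Statements ("column") VALUES (42)')
value = db.execute('SELECT "column" FROM Statements').fetchone()[0]
```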

Thanks!

-- William Hachfeld


[sqlite] Syntax Error For 3.0.8 --> 3.2.1 Upgrade

2005-04-21 Thread William Hachfeld

Hey everyone!

I've recently tried to upgrade from SQLite 3.0.8 to 3.2.1 and have run into a
snag. This is probably something REAL simple that I'm just not seeing. I'm
hoping someone can point out my stupid mistake...

When I rebuild our project with 3.2.1 instead of 3.0.8, I'm getting a syntax
error, where none previously existed, while constructing a database. I've found
that I can reproduce this same behavior by executing the identical SQL
statements using the "sqlite3" command interface:



% sqlite3 tmp.db
SQLite version 3.2.1
Enter ".help" for instructions

sqlite> CREATE TABLE 'Open|SpeedShop' (version INTEGER);

sqlite> INSERT INTO 'Open|SpeedShop' (version) VALUES (1);

sqlite> CREATE TABLE Threads (id INTEGER PRIMARY KEY, host TEXT, pid INTEGER 
DEFAULT NULL, posix_tid INTEGER DEFAULT NULL, openmp_tid INTEGER DEFAULT NULL, 
mpi_rank INTEGER DEFAULT NULL);

sqlite> CREATE TABLE AddressSpaces (id INTEGER PRIMARY KEY, thread INTEGER, 
time_begin INTEGER, time_end INTEGER, addr_begin INTEGER, addr_end INTEGER, 
linked_object INTEGER);

sqlite> CREATE TABLE LinkedObjects (id INTEGER PRIMARY KEY, addr_begin INTEGER, 
addr_end INTEGER, file INTEGER, is_executable INTEGER);

sqlite> CREATE TABLE Functions (id INTEGER PRIMARY KEY, linked_object INTEGER, 
addr_begin INTEGER, addr_end INTEGER, name TEXT);

sqlite> CREATE TABLE Statements (id INTEGER PRIMARY KEY, linked_object INTEGER, 
file INTEGER, line INTEGER, column INTEGER);
SQL error: near "column": syntax error

sqlite>



In the above, "tmp.db" did not exist prior to the execution of "sqlite3". Where
is the syntax error near "column" in the last line? 

-- William Hachfeld


Re: [sqlite] NFS Query Performance

2005-04-21 Thread William Hachfeld
On Thu, Apr 21, 2005 at 11:49:21AM +0100, Christian Smith wrote:
> I know the snippet above is from a test program, but don't ever put things
> like sqlite3_exec() inside an assert()! I hope you're (OP) not doing this in
> your application.

I realize that assert() is compiled out entirely when NDEBUG is defined during
the compile (i.e., typically when optimizations are used). Using assert() in a
test program like this is a very quick, very effective way to ensure that each
of your OS calls is completing successfully. Much better (IMHO) than naively
assuming that everything succeeds and then being dumbfounded later when things
break in mysterious ways.
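For readers following along, the pitfall Christian is warning about is that the
entire argument expression disappears when assertions are disabled, side
effects and all. The same trap exists in Python under `python -O`; a sketch of
the risky versus safe patterns:

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Risky: if assertions are disabled (NDEBUG in C, `python -O` here), this
# whole line -- including the side-effecting execute() -- never runs.
assert db.execute("CREATE TABLE t (x INTEGER)") is not None

# Safe: always perform the call, then check the saved result.
cur = db.execute("INSERT INTO t (x) VALUES (1)")
assert cur.rowcount == 1

count = db.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```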

-- William Hachfeld


Re: [sqlite] NFS Query Performance

2005-04-20 Thread William Hachfeld

Good suggestions, guys!

First, I modified the previously-posted test program to time the insertion rate
outside of the transaction (i.e. Kilnt's suggestion). I left the creation of
the table, however, outside of the timed insertion loop. In other words:

Create Table
Begin Timing
Begin Transaction
Loop Performing 10,000 Inserts
Commit Transaction
End Timing

Second, I took Mike & Dan's suggestion and put a transaction around the queries
as well. In other words:

Begin Timing
Begin Transaction
Loop Performing 10,000 Queries
Commit Transaction
End Timing

When I run this modified version on the same four file systems as before, I
get:

 /tmp    27198.7, 29177.2, 25465.3  Insertions/Second
 /tmp    301.234, 300.551, 304.311  Queries/Second

 ~   26596.9, 27185.5, 26012Insertions/Second
 ~   306, 303.746, 302.499  Queries/Second

 NFS(1)  33789.5, 33675.7, 30738.6  Insertions/Second
 NFS(1)  304.678, 305.742, 303.51   Queries/Second

 NFS(2)  30446.6, 30528.9, 29344.3  Insertions/Second
 NFS(2)  300.903, 304.742, 303.782  Queries/Second

These results are pretty much directly in line with what one would expect given
the hypothesis that poor NFS locking performance was causing the slowdown. Key
points:

1) Timing outside the transaction for inserts did, as expected, lower the
   amortized insertion rate in each case.

2) Adding a transaction around the queries did, as expected, increase the
   amortized query rate in every case.

3) Adding a transaction around the queries, thus locking only once, as
   expected, brought the query rate on NFS up to the query rate on local disk
   (assuming the single, poor, NFS lock is amortized across a sufficient number
   of queries).

So my conclusion from all of this is that I need to do one of two things: I
either need to group my queries together and bracket them with a transaction,
or I need to try yesterday's suggestion of disabling locking.
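The grouping strategy is easy to sketch with Python's built-in sqlite3 module
(illustrative only; the test program in this thread is C++). With the module in
autocommit mode, explicit BEGIN/COMMIT statements control the transaction, and
therefore lock, boundaries, so the database file is locked once per batch
rather than once per statement:

```python
import sqlite3

# isolation_level=None puts the module in autocommit mode, so the explicit
# BEGIN/COMMIT below determines the transaction (and lock) boundaries.
db = sqlite3.connect(":memory:", isolation_level=None)
db.execute("CREATE TABLE Test (key INTEGER PRIMARY KEY, value INTEGER)")

db.execute("BEGIN")
for i in range(10000):
    db.execute("INSERT INTO Test (value) VALUES (?)", (i,))
db.execute("COMMIT")

count = db.execute("SELECT COUNT(*) FROM Test").fetchone()[0]
```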


-- William Hachfeld


Re: [sqlite] NFS Query Performance

2005-04-19 Thread William Hachfeld
On Tue, Apr 19, 2005 at 11:58:11AM -0700, Ted Unangst wrote:
> If you don't need locking, and it sounds like you don't, just neuter the
> fcntl calls in os_unix.c.  In fact, there's a #define and comment for DJGPP
> that does exactly what you want.

Hmmm. Thanks for the suggestion, Ted. I think I'll try this out. If nothing
else, this will help confirm whether the majority of the slowdown is due to NFS
locking performance.

-- William Hachfeld


Re: [sqlite] NFS Query Performance

2005-04-19 Thread William Hachfeld
On Tue, Apr 19, 2005 at 02:23:44PM -0400, Griggs, Donald wrote:
> I'm not sure how your network is set up, but if you're using 100mbps
> ethernet, the full- table-scan queries are only 4-5 times slower over the
> net.   Wouldn't that be the same order of magnitude as the ratio of link
> speed to ATA local disk bus speed?

Possibly. Although as I noted, our "real" application is showing a speed
differential of about 50x. Much more than can be attributed to simple transfer
rates. But it sounds like the answer is the crappy NFS locking speed.

> As for the inserts, if you're achieving over 36,000 of these per second, are
> you perhaps in a CPU-bound state rather than a disk-bound one?

I think, as I indicated to Jay earlier, that the reason the insert speed
doesn't degrade is that I'm performing a transaction around all the inserts.
Thus I expect that I only pay the NFS locking price once rather than multiple
times when I'm doing the query.

> Given that there's no sqlite server software on the network host, scanning
> full tables 50 or 60 times a second sounds awfully fast to me --- though I'm
> an admitted newcomer to databases and may have some faulty logic here.
> 
> I wonder if placing sqlite on the host, and adding something like SQRELAY
> would be worth the time to setup?   If your queries are much faster, you
> might consider keeping it, or using a more conventional database like
> postgres or mysql.

Nope, that isn't going to be an option for me. Preserving the "zero
configuration" and "single file" environment is of utmost importance in our
application.

Sounds like I'm pretty much stuck.

-- William Hachfeld


Re: [sqlite] NFS Query Performance

2005-04-19 Thread William Hachfeld
On Tue, Apr 19, 2005 at 01:20:35PM -0500, Kurt Welgehausen wrote:
> SQLite is not a client-server database.

Yup. I realize that.
 
> In the worst case, when you query a database file over NFS, all the tables in
> your query are sent over the network to be processed on your cpu; so if you
> want to select one row from a table of a million rows, the million rows are
> retrieved over the network (in the worst case).

Which is where presumably SQLite's page caching would come in... At least for
subsequent queries on the same table.

> You may want to consider a client-server database for this reason and also
> because there have been reports that locking does not work reliably on some
> implementations of NFS -- so you could be risking a corrupted db file if you
> have multiple users.

We definitely do NOT want a client-server database. As I noted, SQLite is being
used to store, in effect, an application's "save files". The fact that SQLite
stores its databases as a single, user movable, file is one of the key features
that makes it of interest to us.

Multiple users will also be a rarity (if it EVER occurs). What SQLite is giving
us is the ability to manipulate a large (gigabyte-sized) file containing
complex structures, without having to do all the "dirty work" (indexing,
endian-ness translation, caching, etc.) ourselves.

> I think this has been discussed on the list. You could try a search of the
> archive:
> 
> <http://www.mail-archive.com/sqlite-users@sqlite.org/>

I did. Almost all of the posts seemed to relate to the correctness of the NFS
locking and how it could be improved. I really don't care so much about correct
locking for our application. I'm more worried about the performance.

-- William Hachfeld


Re: [sqlite] NFS Query Performance

2005-04-19 Thread William Hachfeld
On Tue, Apr 19, 2005 at 01:07:03PM -0500, Jay Sprenkle wrote:
> On a system we developed several years ago (non sqlite) we noted that
> obtaining locks on an NFS mounted file was VERY slow. I *think* it's trying
> to establish a lock then waiting at least the propagation delay of your
> network to ensure there was no collision with another process.
> 
> Doing anything over a network is not good for performance.

And because I used a transaction on the insert, the lock only has to be
obtained once, and thus does not significantly affect performance?

-- 
William Hachfeld  ([EMAIL PROTECTED], 651-683-3103)
SGI Compilers & Tools


[sqlite] NFS Query Performance

2005-04-19 Thread William Hachfeld

Hey everyone!

Can anyone explain to me why SQLite shows such poor query performance when the
database is located on an NFS-mounted file system? When running with a single
process accessing the database, I'm finding that my insertion rates are similar
to local disk, but my query performance is much, much slower.

To illustrate, I built my test code (source at the end of this post) as:

 g++ -o sqlite-test sqlite-test.cxx -lsqlite3 -lrt

using GCC 3.3.3. My Dell test system had a 3.2GHz P4, 1Gb of RAM, and an 80
ATA HD (2Mb cache). I'm running SuSE 9.1 on this system. I ran each test three
times:


 /tmp    36749.3, 36662.5, 36239.8  Insertions/Second
 /tmp    264.195, 268.678, 266.233  Queries/Second

 ~   36515.5, 36538.9, 36673.6  Insertions/Second
 ~   274.449, 273.875, 273.236  Queries/Second

 NFS(1)  37128.2, 37479,   37347.9  Insertions/Second
 NFS(1)  44.2276, 45.408,  52.3626  Queries/Second

 NFS(2)  36812.2, 37272.3, 36595.2  Insertions/Second
 NFS(2)  68.3878, 68.3142, 68.0701  Queries/Second


   /tmp: Local "tmpfs" file system
      ~: Local "reiserfs" file system
 NFS(1): Served from an Origin 2000 16p system, RAID-5 SCSI disk array, fairly
         heavy user load
 NFS(2): Served from a lowly PIII Linux system, local ATA disk, no user load


Note that insertion performance is basically the same across all
configurations, but the query performance is seriously degraded when running on
NFS. In practice, on our real application, I'm seeing as much as a 50x speedup
when moving the database from NFS onto the local disk. Colleagues working on
the same project have all seen similar results.

Does anyone have an idea why this is the case? Is there anything I can do to
improve the query performance on NFS-mounted databases? Since we are using
SQLite for storing our application's "save files", our users are going to want
to be able to locate the database in a directory of their choosing. 

Also note that I'm not complaining about SQLite's query performance in general.
The absolute numbers in this test case are artificially low because I'm forcing
a full table scan by querying on a non-indexed field. My concern is with the
relative performance between local versus NFS file systems.

Thanks guys!

-- William Hachfeld





#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <iostream>
#include <string>
#include <sqlite3.h>

uint64_t Now()
{
    // Current time in nanoseconds since the epoch
    struct timespec now;
    assert(clock_gettime(CLOCK_REALTIME, &now) == 0);
    return (static_cast<uint64_t>(now.tv_sec) * 1000000000) +
        static_cast<uint64_t>(now.tv_nsec);
}

int main(int argc, char* argv[])
{
    const int N = 10000;

    // Form and display the name of the database
    std::string dbname = std::string(get_current_dir_name()) + "/tmp.db";
    std::cout << "Database \"" << dbname << "\"" << std::endl;

    // Open the database
    sqlite3* handle = NULL;
    assert(sqlite3_open(dbname.c_str(), &handle) == SQLITE_OK);
    assert(handle != NULL);

    // Create and populate a table with 'N' entries

    assert(sqlite3_exec(handle, "BEGIN TRANSACTION;",
                        NULL, NULL, NULL) == SQLITE_OK);

    assert(sqlite3_exec(handle,
                        "CREATE TABLE Test ("
                        "key INTEGER PRIMARY KEY,"
                        "value INTEGER"
                        ");",
                        NULL, NULL, NULL) == SQLITE_OK);

    uint64_t t_start = Now();
    for(int i = 0; i < N; ++i) {
        char* statement =
            sqlite3_mprintf("INSERT INTO Test (value) VALUES (%d);", i);
        assert(statement != NULL);
        assert(sqlite3_exec(handle, statement, NULL, NULL, NULL) == SQLITE_OK);
        sqlite3_free(statement);
    }
    uint64_t t_stop = Now();

    assert(sqlite3_exec(handle, "COMMIT TRANSACTION;",
                        NULL, NULL, NULL) == SQLITE_OK);

    std::cout << (static_cast<double>(N) /
                  (static_cast<double>(t_stop - t_start) / 1000000000.0))
              << " Insertions/Second" << std::endl;

    // Perform 'N' queries on the table

    t_start = Now();
    for(int i = 0; i < N; ++i) {
        char* statement =
            sqlite3_mprintf("SELECT * FROM Test WHERE value = %d;", i);
        assert(statement != NULL);
        assert(sqlite3_exec(handle, statement, NULL, NULL, NULL) == SQLITE_OK);
        sqlite3_free(statement);
    }
    t_stop = Now();

    std::cout << (static_cast<double>(N) /
                  (static_cast<double>(t_stop - t_start) / 1000000000.0))
              << " Queries/Second" << std::endl << std::endl;

    // Close the database
    assert(sqlite3_close(handle) == SQLITE_OK);

    // Remove the database
    assert(remove(dbname.c_str()) == 0);
}


Re: [sqlite] Storing 64-Bit Addresses

2005-03-31 Thread William Hachfeld
On Thu, Mar 31, 2005 at 01:52:53PM -0800, Ted Unangst wrote:
> Store the length of the region, instead of the end.  Or is that impossible?
> I'm not sure what physical property (begin + end) could refer to.

Sorry Ted, the example I gave was misleading. It was meant purely as an
illustration. The actual test I'm doing is something like:

WHERE ?1 >= (base + begin) AND ?1 < (base + end)

where ?1, base, begin, and end are all 64-bit addresses. In some places I could
store the length of the region rather than "end", but I'd still have to do the
computation on "begin" using 64-bit quantities.

-- 
William Hachfeld  ([EMAIL PROTECTED], 651-683-3103)
SGI Compilers & Tools


[sqlite] Storing 64-Bit Addresses

2005-03-31 Thread William Hachfeld

Hey everyone,

Have a question regarding storage of 64-bit unsigned values in SQLite... I'm
working on a project where I want to store 64-bit addresses (unsigned values)
as a column in an SQLite database. Currently I create the table using signed
integers. For example:

CREATE TABLE Example (
begin INTEGER,
end INTEGER
);

Before binding an address to a statement using sqlite3_bind_int64() I apply an
offset to the address to translate it to a signed value. And when reading out
an address using sqlite3_column_int64() I reverse the process. I.e.

dbase_value = addr_value - offset
addr_value = dbase_value + offset

where offset = ((uint64_t)~0 >> 1) + 1

This works fine for simple uses where I'm just storing and retrieving the
value. But if I want to do more complex queries involving arithmetic, for
example:

SELECT * FROM Example WHERE ? < (begin + end);

My transformation breaks down. E.g., with offset = 8, ? = 4, begin = 2, and
end = 3:

4 < (2 + 3) --> true

(4 - 8) < ((2 - 8) + (3 - 8))
= -4 < ((-6) + (-5))
= -4 < -11
--> false

Darn mathematics anyway! So after that long-winded explanation, my question is
simply this. Does anyone have an idea how I can store a 64-bit unsigned integer
in an SQLite column and still perform useful arithmetic and conditional
operators on them? Thanks in advance for any ideas!
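One common workaround, sketched here with Python's built-in sqlite3 module and
offered as an illustration rather than a full answer: store the
two's-complement reinterpretation of the unsigned value, which round-trips
losslessly through a signed INTEGER column. Comparisons and sums across the
2^63 boundary still misbehave, which is exactly the breakage described above,
so in-SQL arithmetic remains limited:

```python
import sqlite3

def to_i64(u):
    # Reinterpret a uint64 bit pattern as a signed two's-complement int64.
    return u - (1 << 64) if u >= (1 << 63) else u

def to_u64(i):
    # Inverse mapping, applied when reading the column back out.
    return i + (1 << 64) if i < 0 else i

db = sqlite3.connect(":memory:")
db.execute('CREATE TABLE Example ("begin" INTEGER, "end" INTEGER)')

addr = 0xFFFFFFFF00000010          # an address with the high bit set
db.execute("INSERT INTO Example VALUES (?, ?)",
           (to_i64(addr), to_i64(addr + 0x80)))

stored = db.execute('SELECT "begin" FROM Example').fetchone()[0]
roundtrip = to_u64(stored)         # storage itself is lossless
```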

-- William Hachfeld


Re: [sqlite] Index Usage

2004-10-28 Thread William Hachfeld

Thanks for the information, Richard. Your explanation, along with the "Virtual
Database Engine" document that I was reading when you wrote this, makes things
a lot clearer. After reading Ulrik's suggestions, I decided to poke around a
little bit using "EXPLAIN" to see if I could discover what SQLite would do for
my two proposed queries...

I was able to see in the VDBE opcodes for my first query exactly what you are
telling me - that SQLite will use the "grp" and "begin" terms only. So am I
also correct in understanding that if I did:

CREATE INDEX MultiColumnIndex ON Example (begin, end, grp);
SELECT id FROM Example WHERE x < end AND y >= begin AND grp=g;

that I would only make use of 1 of the 3 terms in the index?

The VDBE opcodes for my sub-select query looked almost identical to the non-
sub-select version. So I'm assuming that internally SQLite folds these together
and treats them, in effect, like a single query rather than a two-part query.

At the risk of trying everyone's patience, I have one more question... Can any
generalizations be made about the relative performance of the following queries
(again using the same example table):

CREATE INDEX IndexA ON Example (grp, begin);
SELECT id FROM Example WHERE grp=g AND x < end AND y >= begin;

versus:

CREATE INDEX IndexA ON Example (grp);
CREATE INDEX IndexB ON Example (begin);

SELECT * FROM Example WHERE grp=g
INTERSECT SELECT * FROM Example WHERE x < end AND y >= begin;

or maybe even:

CREATE INDEX IndexA ON Example (grp, begin);
CREATE INDEX IndexB ON Example (end);

SELECT * FROM Example WHERE grp=g AND y >= begin
INTERSECT SELECT * FROM Example WHERE x < end;

given a large (~1,000,000 rows) table? Is the cost of creating the temporary
table for the compound SELECT usually going to outweigh the benefit of using a
second index? Is there any way to force the temporary table to be placed in
main memory rather than on disk?
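For what it's worth, modern SQLite's EXPLAIN QUERY PLAN makes this kind of
comparison easy to inspect. A sketch with Python's built-in sqlite3 module
(identifiers quoted since begin/end are keywords in some contexts) showing the
single-index form driving off the (grp, begin) index:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute('CREATE TABLE Example (id INTEGER PRIMARY KEY, grp INTEGER,'
           ' "begin" INTEGER, "end" INTEGER)')
db.execute('CREATE INDEX IndexA ON Example (grp, "begin")')

# The grp equality plus the "begin" inequality can both be satisfied from
# the index; the "end" term is then checked against each candidate row.
plan = db.execute(
    'EXPLAIN QUERY PLAN SELECT id FROM Example'
    ' WHERE grp = 1 AND 3 < "end" AND 5 >= "begin"').fetchall()[-1][-1]
```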

-- 
William Hachfeld  ([EMAIL PROTECTED], 651-683-3103)
SGI Debugger, Object, and Performance Tools


Re: [sqlite] Index Usage

2004-10-28 Thread William Hachfeld

Thanks for the advice, Ulrik!

I don't believe, however, that the alternate query you proposed using BETWEEN
is quite equivalent to what I was going to do. I am storing intervals [begin,
end) in the database and then looking for those intervals from the database
that intersect [x, y) - not those intervals contained by [x, y). A subtle, but
important, distinction in my application.

In any case, your information about SQLite's use (or lack thereof) of a
(begin, end) index is certainly helpful to me. Clearly I need to use the
EXPLAIN command and see if I can decipher what SQLite does under various
permutations of the query.
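The intersection-versus-containment distinction in concrete terms: two
half-open intervals [begin, end) and [x, y) intersect iff begin < y AND
end > x, which is a different predicate from begin/end lying between the
bounds. A sketch with Python's built-in sqlite3 module (identifiers quoted
because of keyword clashes):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute('CREATE TABLE Example (id INTEGER PRIMARY KEY,'
           ' "begin" INTEGER, "end" INTEGER)')
db.executemany("INSERT INTO Example VALUES (?, ?, ?)",
               [(1, 0, 10), (2, 10, 20), (3, 40, 50)])

x, y = 5, 15

# Intersection: [begin, end) overlaps [x, y)
overlapping = [r[0] for r in db.execute(
    'SELECT id FROM Example WHERE "begin" < ? AND "end" > ?'
    ' ORDER BY id', (y, x))]

# Containment: [begin, end) lies entirely inside [x, y)
contained = [r[0] for r in db.execute(
    'SELECT id FROM Example WHERE "begin" >= ? AND "end" <= ?'
    ' ORDER BY id', (x, y))]
```

Intervals 1 and 2 overlap [5, 15), but neither fits entirely inside it.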

-- 
William Hachfeld  ([EMAIL PROTECTED], 651-683-3103)
SGI Debugger, Object, and Performance Tools