Re: [sqlite] proposal for write-lock on "commit" rather than "begin transaction"

2019-10-25 Thread Brannon King
>
> Two users – members of staff – enter data.  Each user enters a new
> invoice.  One of these entries gets rolled back.  What should their
> software do ?  Or should it just return an error message to the user ?
>

Multi-user data entry is not a part of my intended use case. I think other
database engines are better suited for this need. It could be done, though,
if you had an easy way to check for conflicts on commit. This change would
get us much closer to that than we are now.

My particular use case is for data verification. I have one part of the
system that does some computation. I then have another part of the system
that verifies that computation. I don't want the "verifier" to write to the
database; that data will be written by the true data owner. However, the
verifier needs to redo some of the inserts/deletes in order for the computation
to come out correctly.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] proposal for write-lock on "commit" rather than "begin transaction"

2019-10-25 Thread Brannon King
This is a request for a small change to the handling of multiple
connections. I think it would significantly enhance their usefulness by
allowing multiple "views" of the data.

Consider that I have two simultaneous connections to one file, named Con1
and Con2. They could be in one process or one thread -- that's irrelevant.
Either one may write to the DB; we don't know yet. For starters, assume
that their journal mode is MEMORY.

Both connections begin with "begin transaction". Already I'm dead in the
water; one of those will fail presently with "database is locked". But it
doesn't need to be that way! Each connection can have its own journal file,
especially if it's in memory. Once one connection commits, the other
connection will no longer be allowed to commit. It will be forced to
rollback (or perhaps rebase if there are no conflicts).

Multiple WAL files could be supported in a similar fashion; they just need
some kind of unique naming scheme. For recovery, the user would be prompted
to select one or none. It doesn't seem that far from SQLite's current
behavior. Thoughts?
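A minimal sketch of today's behavior that the proposal wants to relax (Python's sqlite3, default rollback-journal mode, timeout set to zero so the failure is immediate; the file name is arbitrary). The second writer fails as soon as it tries to write inside its open transaction, not at COMMIT time:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
con1 = sqlite3.connect(path, timeout=0, isolation_level=None)
con2 = sqlite3.connect(path, timeout=0, isolation_level=None)
con1.execute("CREATE TABLE t (v INTEGER)")

con1.execute("BEGIN")
con1.execute("INSERT INTO t VALUES (1)")   # con1 now holds the write lock

con2.execute("BEGIN")                      # deferred: no lock taken yet
try:
    con2.execute("INSERT INTO t VALUES (2)")  # blocked by con1's lock
    locked = False
except sqlite3.OperationalError:              # "database is locked"
    locked = True

con2.execute("ROLLBACK")                   # release con2's locks
con1.execute("COMMIT")                     # con1 can now commit
```

So con2 never even gets to queue its write, which is the point of the proposal.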

~Brannon


Re: [sqlite] rationale for private archives of the SQLite mailing list and suppressed reply-to

2019-10-11 Thread Brannon King
I agree that Mailman is archaic. I worry about the security on it. I don't
enjoy using 3rd-party mirrors for searching it. I'd like to propose that we
upgrade to something more modern and secure like Sympa or mlmmj, or even a
more drastic system upgrade to something like Redmine -- a project
tracker + forum.

On Fri, Oct 11, 2019 at 9:41 AM Mike Bayer  wrote:

> ...I would ask them to consider that making the mailing list opaque leads
> to user questions that are entirely related to SQLite and nothing else
> being posted in downstream project communities instead, which pushes the
> community response burden downwards.
>


[sqlite] SELECT uses index with SUBSTR but UPDATE doesn't

2019-10-08 Thread Brannon King
I have this query:
UPDATE nodes SET parent = ? WHERE SUBSTR(name, 0, ?) = ?
EXPLAIN QUERY PLAN tells me that it is going to do a table scan. At the
same time, the query plan for this:
SELECT * FROM nodes WHERE SUBSTR(name, 0, ?) = ?
tells me that it can and will use the (primary key) index on the name
column.

With that info, I thought that this query would be faster:
UPDATE nodes SET parent = ? WHERE name IN (SELECT name FROM nodes WHERE
SUBSTR(name, 0, ?) = ?)
Alas, it's not. I don't know why.

UPDATE will use the index if I use the LIKE operator. However, it won't use
the index if I attempt LIKE (? || '%'). Whatever handles the string
concatenation breaks the use of the index. I don't want to have to sanitize
my own data. I have very arbitrary, user-entered, malicious data going in.
It's also not clear to me what the sanitizer does for the LIKE operator.
What does it do to existing percent signs in the data? I don't want to use
those as wildcards. Hence, I much prefer the SUBSTR approach; it seems much
safer all around.

I run v3.29.0. I hope this can prompt somebody to make the SUBSTR operator
work with the indexes on an UPDATE statement.
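One way to sidestep the wildcard worry with LIKE is to escape the user-supplied prefix before binding it and use an ESCAPE clause. A sketch (Python's sqlite3; the nodes table is borrowed from above, the prefix_like helper is hypothetical):

```python
import sqlite3

def prefix_like(prefix):
    # Hypothetical helper: escape LIKE wildcards inside the user data,
    # then append our own trailing wildcard.
    escaped = (prefix.replace("\\", "\\\\")
                     .replace("%", r"\%")
                     .replace("_", r"\_"))
    return escaped + "%"

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, parent TEXT)")
con.executemany("INSERT INTO nodes VALUES (?, NULL)",
                [("ab%c",), ("abXc",), ("ab%cd",)])

# Only names that literally start with "ab%" should match.
rows = con.execute(r"SELECT name FROM nodes WHERE name LIKE ? ESCAPE '\'",
                   (prefix_like("ab%"),)).fetchall()
names = sorted(r[0] for r in rows)
```

Whether the index is still used once an ESCAPE clause is present depends on the LIKE optimization's conditions, so check EXPLAIN QUERY PLAN after a change like this.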


Re: [sqlite] performance difficulty across multiple queries

2012-05-31 Thread Brannon King
> I recommend you execute the SQL command ANALYZE then try various of those
> SELECTs again.  This can cause the SQLite optimizer to make different
> choices about which indexes to use.

Thanks for the suggestion. I am familiar with the Analyze command and have
been using it. However, it doesn't seem to help in this scenario where the
choice is whether or not to use the index for the ORDER BY or the JOIN. I
find it a little frustrating that a single query can't use two indexes from
the same table, since other database engines support that ability.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] performance difficulty across multiple queries

2012-05-31 Thread Brannon King
I've got this (simplified) table schema:

 

  CREATE TABLE [Services] ([Id] INTEGER PRIMARY KEY, [AssetId] INTEGER NULL,
[Name] TEXT NOT NULL);

  CREATE TABLE [Telemetry] ([Id] INTEGER PRIMARY KEY, [ServiceId] INTEGER
NULL, [Name] TEXT NOT NULL)

  CREATE TABLE [Events] ([Id] INTEGER PRIMARY KEY, [TelemetryId] INTEGER NOT
NULL, [TimestampTicks] INTEGER NOT NULL, [Value] TEXT NOT NULL)

 

The Services and Telemetry tables have hundreds of rows, but the Events
table has a million. I can't seem to figure out the right combination of
indexes to get the following queries to all execute quickly. Some indexes
help some queries but hurt others. When I have indexes on TelemetryId and
TimestampTicks the former seems to be favored even when the TimestampTicks
index provides quicker results. When I use the '+' to disable the
TelemetryId index I get some fast and some slow results. Help?

 

SELECT MIN/MAX(e.TimestampTicks) FROM Events e INNER JOIN Telemetry t ON
t.ID = e.TelemetryID INNER JOIN Services s ON s.ID = t.ServiceID WHERE
s.AssetID = @AssetId;

 

SELECT e1.* FROM Events e1 INNER JOIN Telemetry t1 ON t1.Id = e1.TelemetryId
INNER JOIN Services s1 ON s1.Id = t1.ServiceId WHERE t1.Name =
@TelemetryName AND s1.Name = @ServiceName;

 

SELECT * FROM Events e INNER JOIN Telemetry t ON t.Id = e.TelemetryId INNER
JOIN Services s ON s.Id = t.ServiceId WHERE s.AssetId = @AssetId AND
e.TimestampTicks >= @StartTimeTicks ORDER BY e.TimestampTicks LIMIT 1000;

 

SELECT e.Id, e.TelemetryId, e.TimestampTicks, e.Value FROM (
    SELECT e2.Id AS [Id], MAX(e2.TimestampTicks) AS [TimestampTicks]
    FROM Events e2 INNER JOIN Telemetry t ON t.Id = e2.TelemetryId
    INNER JOIN Services s ON s.Id = t.ServiceId
    WHERE s.AssetId = @AssetId AND e2.TimestampTicks <= @StartTimeTicks
    GROUP BY e2.TelemetryId) AS grp
INNER JOIN Events e ON grp.Id = e.Id;
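One commonly suggested starting point for queries shaped like these is a composite index with the equality column first and the range column second. A sketch of checking whether the planner picks it up (Python's sqlite3; the index name is made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Events (Id INTEGER PRIMARY KEY, TelemetryId INTEGER NOT NULL,
                     TimestampTicks INTEGER NOT NULL, Value TEXT NOT NULL);
-- Composite index: equality column first, range column second.
CREATE INDEX idx_events_tt ON Events(TelemetryId, TimestampTicks);
""")

# EXPLAIN QUERY PLAN shows which index (if any) the planner chose.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Events "
    "WHERE TelemetryId = ? AND TimestampTicks >= ?", (1, 0)).fetchall()
plan_text = " ".join(str(r) for r in plan)
```

With both an equality and a range constraint in play, the composite index serves both at once, which a pair of single-column indexes cannot.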



RE: [sqlite] opening a database in command prompt in Windows

2006-08-21 Thread Brannon King
Dragging a database onto the sqlite.exe in explorer works. 

> I downloaded the command line utility, and it is an exe file 
> that automatically opens the sqlite prompt. Given this, how 
> am I supposed to open or create databases? The only way I'm 
> aware so far is that you have to specify the db after the 
> sqlite3 command which opens the prompt, but in this case I'm 
> already in the prompt.
> 
> Or is this exe file not meant to be used this way? Am I 
> supposed to put it in my PATH and then run it from the command line?
> 
> Thanks,
> John


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



RE: [sqlite] Return value of sqlite3_last_insert_rowid()?

2006-08-02 Thread Brannon King
A "normal C long" on 64-bit Linux would be the same data type as
sqlite_int64. On a 32-bit system, just cast it (assuming you know you'll
never have more than 2 billion rows).

> What I'm not clear about is that the documentation says that 
> rowid is an alias for the primary key column of type INTEGER 
> PRIMARY KEY. How is the
> sqlite_int64 converted to a 'normal' C long?
> 
> I'm using the sqlite3_last_insert_rowid() call to retrieve 
> the key of the row I just inserted (so I can refer to it when 
> I do UPDATE and DELETE).
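To illustrate the 64-bit point with Python's sqlite3 (whose lastrowid wraps sqlite3_last_insert_rowid): rowids can exceed what a 32-bit long holds, which is why the cast is only safe under that row-count assumption:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")

# Force a rowid far beyond 32-bit range; sqlite3_last_insert_rowid()
# must return a 64-bit value to report it correctly.
cur = con.execute("INSERT IN" "TO t (id, v) VALUES (?, ?)", (2**40, "big"))
rowid = cur.lastrowid
```

Truncating that value to 32 bits would silently corrupt the key, so the cast really is only a convenience for small tables.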



RE: [sqlite] memory leak

2006-08-02 Thread Brannon King
One quick question, is that with memory management enabled in the sqlite
compile or not? I've done some extensive memory checking with memory
management disabled and never found a leak. However, with memory management
enabled, all bets are off.

> I believe the memory leak exists. The following is excerpt 
> from sqlite-3.3.6 in valgrind (In fact, the
> sqlite-2.8.17 in valgrind is similar):



RE: RE: AW: AW: [sqlite] New JDBC driver for SQLite

2006-07-31 Thread Brannon King

> Thanks for the info. Points 2, 4 and 5 are covered by the 
> Makefile (DB.h is generated with javah), but I'll patch the 
> project for the variable declarations and the cast. Though I 
> have a feeling the cast is unnecessary, did VC throw an 
> error or warning for that?

It was just a warning.



RE: AW: AW: [sqlite] New JDBC driver for SQLite

2006-07-31 Thread Brannon King
To compile the binary with VC71, I had to

1. move a dozen variable declarations to the top of the function
2. download the DB.h file separately from the build tree
3. change the jstrlen to end with "return (int)(s - str) - suppChars"
4. change my sqlite3 lib build to #define SQLITE_ENABLE_COLUMN_METADATA
5. reference the sqlite3 lib in a dll project containing the DB.c/h

> -Original Message-
> From: David Crawshaw [mailto:[EMAIL PROTECTED] 
> Sent: Monday, July 31, 2006 2:03 AM
> To: Martin Pfeifle; sqlite-users@sqlite.org
> Subject: Re: AW: AW: [sqlite] New JDBC driver for SQLite
> 
> Martin Pfeifle wrote:
> > could you please shortly outline the differences between your jdbc 
> > driver and the one developed by Christian Werner?
> 
> I haven't looked too closely at the other driver, but from 
> what I have seen, it is designed to work with the old 
> callback exec() functions, so it supports SQLite 2, but is 
> probably slower.
> 
> I've gone for simplicity with this one. This driver has a lot 
> less code in it. There is a single .c file:
> 
> http://java.zentus.com/src/sqlitejdbc/src/org/sqlite/DB.java
> http://java.zentus.com/src/sqlitejdbc/src/org/sqlite/DB.c
> 
> which basically maps the C functions into Java. The JDBC 
> implementation just calls these functions as appropriate.
> 
> I also put a lot of effort into making binaries, so it is easy to use.
> Compiling JNI libraries for different platforms is a bit of a pain.
> 
> d
> 



RE: [sqlite] Enabling SQLITE_ENABLE_MEMORY_MANAGEMENT on Visual Studio.Net

2006-07-27 Thread Brannon King
You'll have to change that in the makefile before building the
"target_source" code, which means you'll need Linux or a similar build
environment on Windows. I don't think you want to do that anyway. MS's
memory handlers don't like that at all. For proof (once you get it enabled)
put this into your code and the sqlite code for debug compiles:

#define _CRTDBG_MAP_ALLOC
#include <crtdbg.h>  // make sure this is the last header

// and then at the top of your main function:
_CrtSetDbgFlag ( _CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF |
_CRTDBG_CHECK_ALWAYS_DF);


> Hello There :
>Does any one know how to enable 
> SQLITE_ENABLE_MEMORY_MANAGEMENT  on visual studio.net. I 
> tried by adding this macro to the preprocessor with no luck
> 
> Thanks
> Hilal
> 
> 



RE: [sqlite] SQLiteSpy vs. SQLite.dll speed comparison

2006-07-20 Thread Brannon King
The VC6 compiler is from 1998. The VC 7.1 or 8.0 compilers produce better
code. I'm certain any Borland or GNU compiler of the past 3 years
would also produce better assembly than VC6. And if somebody has their hands
on a PathScale or Intel compiler, please post some benchmarks!

> I contacted Ralf who informed me that he knew of this speed 
> advantage and mentioned that he compiled his dll separately 
> using Borland.
> An additional mystery is why using the 'optimize for speed' 
> compiler option actually slows down the result set processing.

So was the "optimize for speed" issue with VC6 or with the Borland
compiler? If it's for VC6, well, upgrade already. If it's for Borland's
compiler, it is probably due to bugs in the memory management that were
evident in BCB5/6. In that case use FastMM (or some equivalent) for the
memory manager and see if that makes a difference.

 
> I am using VC 6 C++ and SQLite 3.34.
> My timing test uses the time reported by SQLite on the status bar, and
> for my code, I time only the sql_exec 
> stmt completion callbacks:
> 'rc = sqlite3_exec(db, pSQL, callback1, 0, );'.
> My callback merely stuffs 5 columns into a char array with no 
> processing.
> 
> Thanks for your help.
> Michael
> 



RE: [sqlite] count(*)

2006-07-19 Thread Brannon King
>> select rowid from table limit 1 offset -1;
> Two ways to do this:
> 
>SELECT rowid FROM table ORDER BY rowid DESC LIMIT 1;
>SELECT max(rowid) FROM table;

Yes, but neither one of those would be as fast as this query, true?

SELECT rowid FROM table LIMIT 1

I guess I was thinking to avoid the sort overhead.
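A quick sanity check on the rowid-as-count idea (Python's sqlite3): max(rowid) only matches the row count while no deletes have happened, exactly as the thread says:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (v TEXT)")
con.executemany("INSERT IN" "TO t VALUES (?)", [("a",), ("b",), ("c",)])

# After a delete, the largest rowid no longer equals the row count.
con.execute("DELETE FROM t WHERE v = 'b'")
max_rowid = con.execute("SELECT max(rowid) FROM t").fetchone()[0]
count = con.execute("SELECT count(*) FROM t").fetchone()[0]
```

Here max(rowid) stays at 3 while count(*) drops to 2, so the shortcut only holds for insert-only tables.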



Re: [sqlite] count(*)

2006-07-18 Thread Brannon King
I know that this list has been over the issues of using ROWID to get the 
count. Nevertheless, I was thinking, if you never do any deletes the 
last ROWID should contain the count. It's too bad you can't do an offset 
of negative one so that it would start at the back. That should be darn 
fast. Something like:


select rowid from table limit 1 offset -1;

Nemanja Corlija wrote:

On 7/18/06, Sripathi Raj <[EMAIL PROTECTED]> wrote:

Hi Jay,

 I have a database whose size is around 250 MB. I have a table which has
around 140,000 records. Doing a count(*) on that takes me 473 seconds. I
believe it's not supposed to take that much time.


SQLite does not keep count of rows in a table. So when you execute
SELECT COUNT(*) FROM table;
what SQLite does is step through table row by row and count them.
Only way to get the total number of rows in a table faster is to keep
track of it yourself. This can easily be done using insert/delete
triggers that will increment/decrement row count in a separate table.
Complete examples of such code were posted to list before. Look for it
in archives.
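The trigger approach described above might look like this sketch (Python's sqlite3; the table and trigger names are made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t (v TEXT);
CREATE TABLE row_count (n INTEGER NOT NULL);
INSERT INTO row_count VALUES (0);

-- Keep the count current on every insert and delete.
CREATE TRIGGER t_ins AFTER INSERT ON t
BEGIN UPDATE row_count SET n = n + 1; END;
CREATE TRIGGER t_del AFTER DELETE ON t
BEGIN UPDATE row_count SET n = n - 1; END;
""")

con.executemany("INSERT IN" "TO t VALUES (?)", [("a",), ("b",), ("c",)])
con.execute("DELETE FROM t WHERE v = 'b'")

# O(1) lookup instead of a full table scan.
count = con.execute("SELECT n FROM row_count").fetchone()[0]
```

Each trigger fires per affected row, so the counter stays exact even for multi-row deletes.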





RE: [sqlite] Re: I need help making this code run faster

2006-07-18 Thread Brannon King
Thanks, Igor, you've inspired and saved me yet again. The subqueries you had
used for the x/yEnd did not work, but the rest did, and I have that maximum
information beforehand anyway. Here's how it shook down:

select 
  cast(cast((xStart+xEnd) as double)/2/15518.5 as integer) cellX,
  cast(cast((yStart+yEnd) as double)/2/15603.171875 as integer) cellY,
  max(score) 
from results_1 
group by cellX, cellY;

where the two real numbers are the preknown maxims divided by 64. That query
takes 8 seconds on a table with 560k rows (on my AthlonXp 2200). That's
certainly not bad speed. I wasn't sure what speed I should expect. The query
plan shows that neither the index on xEnd nor the index on yEnd are used. Is
there an index I could create that this query could use?


> select
> cast( (xStart+xEnd)/2/(select max(xEnd) from db)*64 as 
> integer) cellX,
> cast( (yStart+yEnd)/2/(select max(yEnd) from db)*64 as 
> integer) cellY,
> max(score)
> from db
> group by cellX, cellY
> 
> Igor Tandetnik 
> 


> > query = prepare("
> > SELECT score FROM db WHERE
> > (xStart+xEnd)/2 >= :left AND
> > (xStart+xEnd)/2 < :right AND
> > (yStart+yEnd)/2 >= :top AND
> > (yStart+yEnd)/2 < :bottom
> > ORDER BY score DESC LIMIT 1
> > ")
> >
> > xJump = max(xEnd in db) / 64.0;
> > yJump = max(yEnd in db) / 64.0;
> > for(int x = 0; x < 64; x++){
> >   for(int y = 0; y < 64; y++){
> > left = x*xJump, right = (x+1)*xJump
> > top = y*yJump, bottom = (y+1)*yJump
> > bind(query, left, right, top, bottom)
> > ret = step(query)
> > output[x][y] = score or 0 if no ret
> >   }
> > }



RE: [sqlite] I need help making this code run faster

2006-07-18 Thread Brannon King
I can see no difference in my time measurements when changing the "order by
desc limit 1" to "max".

> >I'm wondering if the following query can be done as a single query 
> >rather than running it in a (nested) loop.
> > 
> >Suppose a database with five columns; xStart, yStart, xEnd, yEnd, 
> >score. I need the maximum score at the midpoint of the line 
> quantized 
> >to a 64x64 grid. Psuedo code:
> > 
> >query = prepare("
> >SELECT score FROM db WHERE
> >(xStart+xEnd)/2 >= :left AND
> >(xStart+xEnd)/2 < :right AND
> >(yStart+yEnd)/2 >= :top AND
> >(yStart+yEnd)/2 < :bottom
> >ORDER BY score DESC LIMIT 1
> >")
> > 
> >  
> >
> 
> Try:
> 
> SELECT MIN(score) FROM db WHERE
> (xStart+xEnd)/2 >= :left AND
> (xStart+xEnd)/2 < :right AND
> (yStart+yEnd)/2 >= :top AND
> (yStart+yEnd)/2 < :bottom
> 
> 
> 
> 



[sqlite] I need help making this code run faster

2006-07-18 Thread Brannon King
I'm wondering if the following query can be done as a single query rather
than running it in a (nested) loop.
 
Suppose a database with five columns: xStart, yStart, xEnd, yEnd, score. I
need the maximum score at the midpoint of the line quantized to a 64x64
grid. Pseudocode:
 
query = prepare("
SELECT score FROM db WHERE
(xStart+xEnd)/2 >= :left AND
(xStart+xEnd)/2 < :right AND
(yStart+yEnd)/2 >= :top AND
(yStart+yEnd)/2 < :bottom
ORDER BY score DESC LIMIT 1
")
 
xJump = max(xEnd in db) / 64.0;
yJump = max(yEnd in db) / 64.0;
for(int x = 0; x < 64; x++){
  for(int y = 0; y < 64; y++){
left = x*xJump, right = (x+1)*xJump
top = y*yJump, bottom = (y+1)*yJump
bind(query, left, right, top, bottom)
ret = step(query)
output[x][y] = score or 0 if no ret
  }
}
 
Thanks for your time. If you see anything else I could do to speed this up,
please mention it.
__________
Brannon King
¯¯




[sqlite] timer in shell.c part II

2006-07-18 Thread Brannon King
Thanks to Carl Clemens, I now have a working shell with a timer in it. I've
posted the shell.c on the contrib page along with a Windows exe compiled
using VC71. It is built on the latest shell.c version and should compile on
Linux as well, though I have not tested that. (My Linux box is down today.)
Use the .noprint option for doing time measurements. (The console slows
things down.) I assume it will require the NT kernel for the high
performance timer.

We've had some discussion here that VC71 actually compiles faster code. See
for yourself. It has all the compile optimizations on excluding SSE2 and
omit frame pointers.
 





[sqlite] fast Java solution?

2006-07-17 Thread Brannon King
So who here has the _fast_ Java solution for accessing SQLite3?

Here's what I've done so far. I took the wrapper from wiki page labeled
"Java wrapper incl. JDBC driver for SQLite"
(http://www.ch-werner.de/javasqlite). I then fixed the calloc calls so that
the params were in the right order, fixed the finalize calls that should
have been reset calls, added some bind functions, built some step functions
without the column name/type overhead, and compiled it with my 3.3.6 code
from last week. So the step function(s) for that library take a Java
callback function. Alas, that seems to be too slow. The Java-to-C and then
C-to-Java call in each step function is just too much JNI overhead. Anyone
else seen that issue?

Is there some standard tool that will generate a JNI dll from the
sqlite.dll? And if so, will that do step functions without the callback
overhead?

I suppose I'll dig in and make a custom JNI interface so that most of my
code is done in C, but just thought I'd ask around first... Thanks for your
time. The other Java wrappers posted seem to wrap too much or too little or
not be compatible with version 3. 




[sqlite] icons and custom builds

2006-07-01 Thread Brannon King
Is there a place I can post some icon files made from the logo on the 
main sqlite.org page?


And where would I post unofficial builds of sqlite3.exe, like the one I 
did with current CVS today using VC71 and embedding the icon?


Re: [sqlite] SQLite 3.3.6 - possibly memory leak

2006-06-12 Thread Brannon King
I've seen a few issues myself, but they all appeared to be related to 
the memory management thing. If you're using VC, try putting this into 
your code:

#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>

// and then in your main function:

_CrtSetDbgFlag ( _CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF | 
_CRTDBG_CHECK_ALWAYS_DF);

Put a copy of the above line in the sqlite3_open function as well. I think it
would be good to run the standard tests with that enabled. It provides a lot of
useful information and checks for out-of-bounds access or memory leaks on every
allocation. It is similar to Borland's CodeGuard. I have no idea how to enable
malloc_dbg automatically in GCC, though.

Here's a few links:
http://www.amanjit-gill.de/CPP_VCtips.html
http://www.cprogramming.com/debugging/valgrind.html
http://msdn2.microsoft.com/en-us/library/sc65sadd(VS.80).aspx

Sasa Zeman wrote:

I'm working with my own SQLite wrapper for Delphi, with the statically linked
SQLite 3.3.6.

File variant works fine:
  SQLite3_Open('Test.sqb',db);
  SQLite3_Exec(db,'DROP TABLE TEST',NIL,NIL,ErrMsg);
  ...

However, the memory variant raises a memory leak report:
  SQLite3_Open(':memory:',db);
  SQLite3_Exec(db,'DROP TABLE TEST',NIL,NIL,ErrMsg);
  ...

I'm aware that the above is a nonsense script. However, scripts that create a
table which is later deleted twice by mistake can also produce the mentioned
problem. Please confirm the memory leak's existence.

Sasa
--
www.szutils.net


  




RE: [sqlite] Problems with multiple threads?

2006-06-07 Thread Brannon King
For me, I have a bunch of threads writing to the database. That is the only
part I do multithreaded. (All my read queries are handled after all the data
is written.) I just use the scoped_lock operator from the Boost library at
the top of my function that does the bind and step calls. I pass a pointer
to the class containing that function and the mutex object to all my writer
threads. It seems to work great; I don't even have SQLite compiled with
threading enabled. My write function starts a new transaction every few
thousand writes.
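The same pattern transposed to Python's sqlite3 for a self-contained sketch (one shared connection, a lock serializing all writers; this omits the transaction batching mentioned above, and all names are made up):

```python
import sqlite3
import threading

lock = threading.Lock()
con = sqlite3.connect(":memory:", check_same_thread=False,
                      isolation_level=None)
con.execute("CREATE TABLE samples (v INTEGER)")

def writer(start):
    # Many producer threads, but the mutex guarantees only one
    # thread touches the connection at a time.
    for i in range(start, start + 100):
        with lock:
            con.execute("INSERT IN" "TO samples VALUES (?)", (i,))

threads = [threading.Thread(target=writer, args=(k * 100,)) for k in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = con.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
```

Since the lock already serializes every call into SQLite, the library's own thread-safety support never comes into play, which matches the post's observation.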

> Hi Bill,
> 
> When you say "handle read/write locking [your]self" do you 
> mean outside of SQLite in your code or by altering SQLite's 
> source code?
> 
> What algorithm do you employ?



[sqlite] threading vs no threading compilation performance

2006-05-30 Thread Brannon King
For a single-threaded application, is there some performance benefit to
compiling SQLite3 with the THREADSAFE=0 option?  The only thing I do in a
multi-threaded situation is inserts, and I'm wondering if I'm not better off
just multiplexing them myself so that I only have to worry about one
transaction running at any given time.




RE: [sqlite] Re: seeking answers for a few questions about indexes

2006-05-26 Thread Brannon King
> > EXPLAIN QUERY PLAN SELECT * FROM vals2d WHERE x = 1 OR y = 1;
> 
> select * from vals2d where x=1
> union all
> select * from vals2d where y=1


Super! You've greatly helped me yet again. My query really looked like this:


SELECT x,y,val FROM vals2d WHERE x = 1 OR y = 1 ORDER BY val DESC LIMIT 1;

I don't care if there are duplicates in the data; I'm only returning one row
anyway. Modifying it to use the UNION ALL has significantly increased the
speed. 

I would think this would be a valuable optimization for SQLite to perform
automatically: when it finds an OR, it could check whether splitting the query
into a UNION ALL would allow additional indexes to be used, and if so, rewrite
it that way.
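The UNION ALL rewrite in runnable form (Python's sqlite3; the sample data is made up). Note that the ORDER BY ... LIMIT applies to the whole compound, so only one row comes back:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE vals2d (x INTEGER, y INTEGER, val INTEGER);
CREATE INDEX x_idx ON vals2d (x);
CREATE INDEX y_idx ON vals2d (y);
INSERT INTO vals2d VALUES (1, 9, 10), (2, 1, 30), (5, 5, 99), (1, 1, 20);
""")

# Each branch can use its own single-column index; duplicates are
# harmless because only the top-scoring row is returned.
row = con.execute("""
    SELECT x, y, val FROM vals2d WHERE x = 1
    UNION ALL
    SELECT x, y, val FROM vals2d WHERE y = 1
    ORDER BY val DESC LIMIT 1
""").fetchone()
```

Here the y = 1 branch supplies the winner (2, 1, 30), even though (1, 1, 20) appears in both branches.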



[sqlite] seeking answers for a few questions about indexes

2006-05-26 Thread Brannon King
I have a table:
CREATE TABLE vals2d (x INTEGER, y INTEGER, val INTEGER);
CREATE UNIQUE INDEX xy ON vals2d (x,y);

EXPLAIN QUERY PLAN SELECT * FROM vals2d WHERE x = 1 OR y = 1;

produces

0|0|TABLE vals2d

The index is unused.

x,y are unique together so I built these indexes:
CREATE INDEX x ON vals2d (x);
CREATE INDEX y ON vals2d (y);

I run the explain again and get the same results: no indexes are used. Is
there any index I could build that would get used in that situation?

EXPLAIN QUERY PLAN SELECT * FROM vals2d WHERE (x > 1 AND x < 10) AND (y > 1
AND y < 10);

produces

0|0|TABLE vals2d WITH INDEX y

I drop the y index and then it uses the x index. I drop the x index and then
it uses the xy index. Can it not use two indexes simultaneously? And if it
can only use one, wouldn't the xy index be the best choice as it has
information about both x and y?

Thanks for your time. 




RE: [sqlite] can you speed this query up?

2006-05-23 Thread Brannon King
Thank you for taking the time to sort out my query! (The meat is at the
bottom.)

> SELECT
> r.qi,
> r.ri,
> r.drl,
> r.score
> FROM
> results_1 r
> INNER JOIN bounds b ON
> r.qis = b.bqis AND r.ris = b.bris
> WHERE
> (r.qi = 5604 OR r.ri = 5468)
> AND (r.qi >= b.bqis AND r.qi <= 5604)
> AND (r.ri >= b.bris AND r.ri <= 5468)
> AND b.bi = 1
> GROUP BY
> r.score DESC
> LIMIT 1

This one above was as slow as the original. 

> SELECT
> r.qi,
> r.ri,
> r.drl,
> r.score
> FROM
> results_1 r
> WHERE
> (r.qi = 5604 AND r.ri <= 5468) OR (r.ri = 5468 AND 
> r.qi <= 5604)
> AND EXISTS (
> SELECT
> NULL
> FROM
> bounds b
> WHERE
> b.bi = 1
> AND b.bqis = r.qis
> AND b.bris = r.ris
> AND b.bqis <= r.qi
> AND b.bris <= r.ri
> )
> ORDER BY
> r.score DESC
> LIMIT 1

But this one was twice as fast! I was able to use Mr. Cote's suggestion of
EXPLAIN QUERY PLAN to look at the indexes being used and make it run even
faster. My results were interesting, though. I had an index on results_1
that was both (qi,ri), call it coord. That was being unused in the above
query. I added a qi index and that made the above query use coord.
Apparently it doesn't matter if I add an index for qi or ri or both, any of
those options makes it use the coord index. I don't understand exactly why
on that. I tried adding an index on score. That seemed to speed up queries
where there were lots of scores, but it slowed down queries where there were
few scores.



Re: [sqlite] can you speed this query up?

2006-05-22 Thread Brannon King



select qi, ri, drl, max(score), min(score) from ...

What values of qi, ri, and drl would you want
this query to return?

What you have to do is:

   SELECT qi, ri, drl, score
 FROM ...
WHERE score=(SELECT max(score) FROM ...)

  
Thank you for the instruction, although the other query seemed to return 
the right results. (Not that I did any extensive testing)


So do I have to repeat all my constraints for both the main query and 
the subquery? Or is it even legal to specify a where clause when doing 
the max?


Also, please help me understand how the indexes are used on it in that 
situation as well.


So let's say I change my query. Coming from these tables:

CREATE TABLE bounds (bi INTEGER PRIMARY KEY AUTOINCREMENT DEFAULT NULL, 
bqi INTEGER, bri INTEGER, bqis INTEGER, bris INTEGER);
CREATE TABLE results (qi INTEGER, ri INTEGER, drl INTEGER, score 
INTEGER, qis INTEGER, ris INTEGER);


Assume that qi and ri are only unique together. Same for bqi, bri. All 
the others are never guaranteed to be unique.


What indexes will maximize the speed of this query?

"select count(*) from results where qi = 5604 OR ri = 5468;"

Assuming I do need to repeat the constraints, what indexes will maximize the 
speed of this query?

"select qi, ri, drl, score from results, bounds where 
score=(SELECT max(score) from results, bounds 
where (qi = 5604 OR ri = 5468)

AND qi >= bqis AND qi <= 5604 AND ri >= bris AND ri <= 5468
AND bi = 1 and qis = bqis AND ris = bris)
AND (qi = 5604 OR ri = 5468)
AND qi >= bqis AND qi <= 5604 AND ri >= bris AND ri <= 5468
AND bi = 1 and qis = bqis AND ris = bris
;"

That's kind of ugly.



[sqlite] can you speed this query up?

2006-05-22 Thread Brannon King
It seems that I yet need help with another query. This one is just too slow.
I've included the "explain" and the table schema. I've been using the
prepare/step model directly. What should I change on my indexing to make it
faster?

The schema:
CREATE TEMPORARY TABLE IF NOT EXISTS bounds (bi INTEGER PRIMARY KEY
AUTOINCREMENT DEFAULT NULL, bqi INTEGER, bri INTEGER, bqis INTEGER, bris
INTEGER);
CREATE UNIQUE INDEX IF NOT EXISTS qisris ON bounds (bqis, bris);
CREATE UNIQUE INDEX IF NOT EXISTS qiri ON bounds (bqi, bri);
insert into bounds values(NULL,1,1,5880,5880);
CREATE TABLE results_1 (qi INTEGER, ri INTEGER, drl INTEGER, score INTEGER,
qis INTEGER, ris INTEGER);
CREATE UNIQUE INDEX loc_1 ON results_1 (qi,ri); 

The queries (both of these run slow but I care about the second):
"select count(*) from results_1 where qi = 5604 OR ri = 5468;"
returns 102

So you can see the following query should only be doing a max over 102
rows; that's not very many.

"explain 
select qi,ri,drl,max(score) as scr from results_1, bounds where (qi = 5604
OR ri = 5468) AND (qi >= bqis
 AND qi <= 5604) AND (ri >= bris AND ri <= 5468) AND bi = 1 and qis = bqis
AND ris = bris;"


0|MemNull|0|0|
1|MemNull|1|0|
2|MemNull|2|0|
3|MemNull|4|0|
4|MemNull|3|0|
5|Goto|0|73|
6|Integer|1|0|
7|OpenRead|1|2|
8|SetNumColumns|1|5|
9|Integer|0|0|
10|OpenRead|0|6|
11|SetNumColumns|0|6|
12|Integer|0|0|
13|OpenRead|2|8226|keyinfo(2,BINARY,BINARY)
14|Integer|1|0|
15|MustBeInt|1|63|
16|NotExists|1|63|
17|Integer|5604|0|
18|NotNull|-1|21|
19|Pop|1|0|
20|Goto|0|63|
21|MakeRecord|1|0|dd
22|MemStore|6|1|
23|Column|1|3|
24|NotNull|-1|27|
25|Pop|1|0|
26|Goto|0|63|
27|MakeRecord|1|0|dd
28|MoveGe|2|63|
29|MemLoad|6|0|
30|IdxGE|2|63|+
31|RowKey|2|0|
32|IdxIsNull|1|62|
33|IdxRowid|2|0|
34|MoveGe|0|0|
35|Column|0|0|
36|Integer|5604|0|
37|Eq|100|41|collseq(BINARY)
38|Column|0|1|
39|Integer|5468|0|
40|Ne|356|62|collseq(BINARY)
41|Column|0|1|
42|Column|1|4|
43|Lt|355|62|collseq(BINARY)
44|Column|0|1|
45|Integer|5468|0|
46|Gt|356|62|collseq(BINARY)
47|Column|0|4|
48|Column|1|3|
49|Ne|355|62|collseq(BINARY)
50|Column|0|5|
51|Column|1|4|
52|Ne|355|62|collseq(BINARY)
53|Column|0|3|
54|CollSeq|0|0|collseq(BINARY)
55|AggStep|3|1|max(1)
56|Column|0|0|
57|MemStore|0|1|
58|Column|0|1|
59|MemStore|1|1|
60|Column|0|2|
61|MemStore|2|1|
62|Next|2|29|
63|Close|1|0|
64|Close|0|0|
65|Close|2|0|
66|AggFinal|3|1|max(1)
67|MemLoad|0|0|
68|MemLoad|1|0|
69|MemLoad|2|0|
70|MemLoad|3|0|
71|Callback|4|0|
72|Halt|0|0|
73|Transaction|0|0|
74|VerifyCookie|0|4|
75|Transaction|1|0|
76|VerifyCookie|1|3|
77|Goto|0|6|
78|Noop|0|0|

Thanks again for any help.




[sqlite] reset function speed

2006-05-22 Thread Brannon King
I noticed some interesting things about the step function. I have a query
with a MAX in it; hence, it always returns one row. (Well, it returns one
row every time I use the step function and when I use the sqlite3.exe
browser, but the sqlite3_exec function does not run the callback in the case
where all the return data was empty.) Now if I use the return data and then
call the reset function, it takes the same amount of time as if I call the
step function again and then the reset function. Apparently it cannot tell
it is on the last row. It just seems strange to me that the reset
functionality or the functionality to acquire the SQLITE_DONE signal is the
equivalent of another step call time-wise. From the profiler, it appears
that most of the time is in memory deallocation. That is strange because
there is very little time spent in memory allocation in general. Maybe the
memory is allocated in blocks, but deallocated one piece at a time.
 
__
Brannon King
¯¯
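[A note on the one-row observation above: an aggregate query really does produce exactly one row even over empty data, with the aggregate NULL; only sqlite3_exec's callback skips it. This can be seen from Python's standard-library sqlite3 module, which wraps the same engine; the table name here is made up.]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE results (score INTEGER)")

# An aggregate query yields exactly one row even over an empty table;
# the MAX is simply NULL.
rows_empty = con.execute("SELECT MAX(score) FROM results").fetchall()

con.execute("INSERT INTO results VALUES (7)")
rows_one = con.execute("SELECT MAX(score) FROM results").fetchall()
```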




RE: [sqlite] create unique index quickly

2006-05-22 Thread Brannon King
Thank you for answering the question! Like I said in my other post, the
index creation is about 20% faster than the unique index creation in my
latest tests of an index of two INTEGERs. 20% out of several hours is
significant.

So index names are unique for the whole database? I was thinking they just
had to be unique for the table. I'll double check my names...

> > Is there some way I can create that unique index without 
> the overhead 
> > of uniqueness checking? I know all my triples are unique when used =
> > together: I
> > generated them that way. Or is there some other way I can 
> create that 
> > = index faster?
> > 
> 
> The following technique is unsupported.  It might change or 
> go away in a subsequent release.  It might not work.
> It might corrupt your database.  Use with extreme caution.
> 
>   (1)  CREATE INDEX primkey ON data(x,y,z)
>   (2)  PRAGMA writable_schema=ON
>   (3)  UPDATE sqlite_master 
>   SET sql='CREATE UNIQUE INDEX primkey ON data(x,y,z)'
> WHERE name='primkey'
>   (4)  close and reopen the database connection
> 
> The above technique is unsupported.  It might not work.
> It might corrupt your database.  The overhead of the 
> uniqueness check is not that much so it is unlikely to help 
> much anyway.  You have been warned.
> 
> --
> D. Richard Hipp
 



RE: [sqlite] create unique index quickly

2006-05-22 Thread Brannon King
> I am simply curious.  This sounds like an amazing 
> engineering challenge.  If it is not a secret, can you 
> describe what this data represents and how it will be used? 
Genomics. Look up "Smith-Waterman" or "Needleman-Wunsch-Sellers" on the web.

> What is the ultimate source of this data? 
Custom hardware
 
> How many days/weeks/eons of it do you plan to accumulate? 
Less than a week.

>   How much
> raw disk space is that?
It varies by thresholding. 1GB to several TB.
 
> If backups and journaling are not important, then is it 
> safe to assume that you can always regenerate that data on 
> demand?  Is each "set" of data identical, or only 
> statistically similar to prior sets?
I can always regenerate if I want to wait a few days. It's not exactly an
exact science anyway.

> Your project sounds like fun though, from what little 
> I've read of this thread.  Sure beats writing boring 
> financial software ;)
I admit I'm quite enjoying it as well.



RE: [sqlite] I need help understanding fake columns

2006-05-22 Thread Brannon King

> If I understand it right you'll definitely need a correlated subquery.
> You need one query to calculate the max() of the 257(0-256?) values.
> There must be some way to define which values should be used 
> in that calculation for a specific piece (the distance). Is 
> the distance geometric and calculated from x,y?
> 
> In pseudo code something like this:
> 
> select piece, ( sub query to calculate max where distance 
> from desired_piece < 256 )  from blah  where piece = desired_piece

Explain what you mean by "correlated subquery."

Like I said in the original post, I need help to modify this exact query:

SELECT qi,ri,(drl- (qi - :qi) ) as drl,MAX(score - ((ri - :ri) * :match)) AS
scr FROM results_1
WHERE (qi - ri) = (:qi - :ri) AND (qi >= :qi) AND (ri >= :ri) AND (qi < :qi
+ 256) 
AND (ri < :ri + 256) AND (drl > qi - :qi)

to make it such that I get the max of the above scr over :qi to
bounds.upperQi and :ri to bounds.upperRi
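[For reference, a "correlated subquery" is an inner query that refers to columns of the row currently being processed by the outer query, so it is re-evaluated per row. A minimal sketch with a made-up table, using Python's standard-library sqlite3 module:]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE pieces (id INTEGER, x INTEGER, score INTEGER);
INSERT INTO pieces VALUES (1, 10, 5), (2, 12, 9), (3, 300, 4);
""")

# The inner query is correlated: it references p.x from the outer row,
# so "best score within 256 of this piece" is computed per piece.
rows = con.execute("""
    SELECT p.id,
           (SELECT MAX(q.score) FROM pieces q
             WHERE q.x BETWEEN p.x AND p.x + 255) AS best_nearby
      FROM pieces p
     ORDER BY p.id
""").fetchall()
```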



RE: [sqlite] create unique index quickly

2006-05-22 Thread Brannon King
> _Personally_ I think, this sounds like a task not quite 
> fitting in sqlites (probably any 'standard' databases) realm. 
> This is a bit off-topic in this group, but because you 
> mention sub-boxes - did you ever look into more specialized 
> file-formats like HDF5:
> 
> http://hdf.ncsa.uiuc.edu/HDF5/
> 
> with support for certain (of course limited) queries ?
> 
> Micha
> -- 
> 


What a fantastic link and suggestion! I just may end up using it.



Re: [sqlite] I need help understanding fake columns

2006-05-21 Thread Brannon King

Jay Sprenkle wrote:

score(x,y) = max(score(x+n,y+n)-n*penalty) where n < drc(x+n,y+n) < 256


So at some 'sample' you have the score and drc, but not the x,y value?
The x,y values can be interpolated using some algebra and the surrounding
samples?
Uh, no. Sorry my other emails were unclear. At all values that I do 
have, I have an x,y,score, and drc. I can calculate any value that is 
missing, though. The n in the above formula refers to all numbers 
between 0 and 256. I'm taking the max over all the data I have in that 
range: max for all data pieces that fall within n distance of the piece 
I'm calculating.




Re: [sqlite] create unique index quickly

2006-05-20 Thread Brannon King

John Stanton wrote:
You don't seem to need a data manipulation system like Sqlite, more a 
form of high volume storage.  Do you really need elaborate SQL, 
journalling, ROLLBACK and assured disk storage?


Did you consider some form of hashed storage, perhaps linear hashing, 
to build a compact and high performance associative array for your 
sparsely keyed data.


Do you really need the overhead of B-trees if you are just storing a 
sparse array?

JS
I don't need journaling or rollback. I'd love a way to shut them off. 
But elaborate SQL, that sure is handy. I'm not just storing, I'm 
viewing stored, compressed data. I definitely need some way of querying 
sparse matrix data that is larger than my DRAM. Sqlite sure seems like 
the quickest route to a workable product for that to happen. It has all 
the streaming/caching built in. Because of that, I assume it is faster 
than random file access. It supports complex data queries and indexes, 
both things I would need anyway. In the world of programming, I think 
many will agree you should get a working product, then make it faster. 
I'm just trying to get the most speed out of the easiest tool. If I need 
to rewrite the file storage for the next version, we can consider the 
cost to benefit for that separately.


Re: [sqlite] create unique index quickly

2006-05-20 Thread Brannon King


For your mail i think you have everything planned, but just for 
curiosity, how do you plan to pass data from hardware (800 MB/s + 
Database data) to a raid?? A GigEthernet has 1000 Mb/s (so, 128 MB/s), 
a PCI-X based solution has 1.3 to 2.6 Gb/s (so 150-300 MB/s), and 
similar for SATA and PCIe(x16). Even a direct Hypertransport to the 
raid has 3.2 Gb/s. Note that I'm not asking for the raid, just for 
the SQLite hardware to Raid connection.


O ye of little faith. First, the hardware device. PCI64 based solutions 
run at 64bit x 66MHz with 50-70% efficiency. You can increase the clock 
to 133MHz using PCI-X. At 50% efficiency that is still 500MB/s. (My 
current solution runs the former and uses 50MB/s input.) Each channel in 
PCIe is 2.5Gb, apx 200MB/s. I assume I can get 60% efficiency on that. 
16 channels of that is more than enough for 800MB/s. Hypertransport is 
way faster than 3.2Gb/s these days. We've looked at HTX (Hypertransport) 
connections, which are near 1000MB/s 
(http://www.pathscale.com/infinipath.php). In addition, I actually have 
an SGI Altix 350 box in the office that uses NUMALink connections 
running at 6.5GB/s: 200MHz x 128bit at near 100% efficiency. You can buy 
one of those for a scant $13k ;-)


For the hard disk, you can buy SGI 
(http://www.sgi.com/products/storage/tech/6700.html) drive arrays that 
run at 3GB/s. How do they do that? Lots of hard drives connected to 
their NUMA controllers. At the Super Computer show I've seen 1U 8-disk 
arrays running at a steady 400MB/s using QLogic controllers and optical 
links for a scant $3k. Two of those and you've got it.


Re: [sqlite] Index syntax for attached database

2006-05-20 Thread Brannon King
The documentation says to put the database name on the front of the 
index name, not the table name when using the create index command. I 
thought it was weird myself.


Micha Bieber wrote:

I'm receiving sql errors like the following

"no such table: main.phs_matrices_1"

when trying to create an index using the syntax:

"CREATE INDEX IF NOT EXISTS phsm_1_idx ON phs_matrices_1 (a, b, c)"

from my C++ program.

The error statement is true - phs_matrices_1 doesn't belong to main,
but some attached database.
Trying to change the statement by full qualifying the attached table
( ... abc.phs_matrices_1 ...) causes a "syntax error near '.'".
The strange thing is, the table exists definitively yet and inserting
data etc. works at this moment (also with shortened table names, as it
has been stated in the documentation). So, the database should be
properly attached.

This is sqlite 3.3.5

Has anyone an idea, what went wrong here ?

Micha
  




Re: [sqlite] Fwd: C++ callback with class fuctions

2006-05-20 Thread Brannon King

Declare it static in the class declaration, not at the function definition.

Esteban Zeller wrote:

On Saturday, 20 May 2006 at 21:40, Brannon King wrote:
  

Is it declared as a static function? Something like "static int
bd::analisis_ultimo(void* arg)" ? I assume it would need to be static
just like all the other thread function parameters or callback
parameters in various APIs.

Esteban Zeller wrote:


I've got the following problem:

string cola = "SELECT * FROM ultimo;";
 char *zErrMsg;
 int ret = sqlite3_exec( base_datos, cola.c_str(), (bd::analisis_ultimo),
0, &zErrMsg );
 

the compiler tells me this:

/home/Programas/preguntas/src/bd.cpp:151: error: no matches converting
function `analisis_ultimo' to type `int (*)(void*, int, char**, char**)'

the function uses the same prototype as the example, but it's inside a
class
  


If I get it declared static the compiler say the next:

/home/Programas/preguntas/src/bd.cpp: At global scope:
/home/Programas/preguntas/src/bd.cpp:168: error: cannot declare member 
function `static int bd::analisis_ultimo(void*, int, char**, char**)' to have 
static linkage


here is part of the code:


ultima_entrada *  bd::cargar_ultimo_uso()
{
 string cola = "SELECT * FROM ultimo;";
 char *zErrMsg;
 int ret = sqlite3_exec( base_datos, cola.c_str(), (bd::analisis_ultimo), 0, 
 &zErrMsg );

 if( ret != SQLITE_OK )
 {
  cout<<"Error al ejecutar la consulta SQL:"<<cola<<endl;
  cout<<"Numero de error:"<<ret<<endl;
  cout<<"Descripcion del error:"<<zErrMsg<<endl;
 }
 else
 {
  return( ultima );
 }
}

static int bd::analisis_ultimo( void *NotUsed, int argc, char **argv, char 
**azColName )

{
  int i;
  for(i=0; i<argc; i++){
printf("%s = %s\n", azColName[i], argv[i] ? argv[i] : "NULL");
  }
  printf("\n");
  return 0;
}




___ 
1GB gratis, Antivirus y Antispam 
Correo Yahoo!, el mejor correo web del mundo 
http://correo.yahoo.com.ar 



  




Re: [sqlite] Fwd: C++ callback with class fuctions

2006-05-20 Thread Brannon King
Is it declared as a static function? Something like "static int 
bd::analisis_ultimo(void* arg)" ? I assume it would need to be static 
just like all the other thread function parameters or callback 
parameters in various APIs.


Esteban Zeller wrote:

I've got the following problem:

string cola = "SELECT * FROM ultimo;";
 char *zErrMsg;
 int ret = sqlite3_exec( base_datos, cola.c_str(), (bd::analisis_ultimo), 0,
 &zErrMsg );
 

the compiler tells me this:

/home/Programas/preguntas/src/bd.cpp:151: error: no matches converting
function `analisis_ultimo' to type `int (*)(void*, int, char**, char**)'

the function uses the same prototype as the example, but it's inside a
class


Re: [sqlite] I need help understanding fake columns

2006-05-20 Thread Brannon King

Dennis Cote wrote:

Where did you get the idea there are "fake" columns?


"SELECT 200" returns 200. I'd call that a fake column. What is the 
proper name for it?


What I was picturing was something like this: "SELECT (BETWEEN 200 AND 
500) as a, a+2" and then get 300 outputs. That would be cool, eh?


So here's the real problem. Sorry about the long explanation, but if 
anyone wants to help with the query, that would be great. I have a 2d 
sparse matrix in a table where each point has two values, a score and a 
diagonal run count. In other words, my table has four columns: 
x,y,score,drc. Even though a certain x,y does not show up in the table, 
we can still calculate its score value by finding a piece farther along 
on the same diagonal using this formula:


score(x,y) = max(score(x+n,y+n)-n*penalty) where n < drc(x+n,y+n) < 256

If that were all we had to calculate, that would be great. Here's how (I 
think) that much looks:


SELECT MAX(score - (x - :x) * :penalty ) as score, x, y, (drc - (x - 
:x)) as drc FROM resultsGrid WHERE (x-y)=(:x-:y) AND (x >= :x) AND (y >= 
:y) AND (x < :x + 256) AND ( y < :y + 256) AND ( drc > x - :x)


So in other words, the x-y ratio is kept the same as the input 
point(:x,:y) so that it stays on the same diagonal. That also allows us 
to use x-:x as the distance to the point, though y-:y would work just as 
well. But now, the tricky part. I have another table: bounds. The bounds 
table has an lower bound :x and lower bound :y. I need to get a MAX of 
the MAX(scr) on the above query for all (:xCurrent, :yCurrent TO 
bounds.yLow OR :xCurrent TO bounds.xLow, :yCurrent). In other words, 
it's a backward L shape query where each point in the L is
processed in the query above and the maximum score value of that is 
returned.


The part I got confused on was trying to replace the :x,:y in the above 
query with (BETWEEN :x AND bounds.xLow OR BETWEEN :y and bounds.yLow).


Thanks again for any help,
Brannon
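[The single-diagonal query in this message can be sanity-checked on a few made-up sample points; named parameters bind as a dict in Python's standard-library sqlite3 module. The point (4,2) is excluded because its run count is too small to reach back to (2,0).]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE resultsGrid (x INTEGER, y INTEGER, score INTEGER, drc INTEGER);
-- Three points on the diagonal through (2,0); (4,2) has too small a run.
INSERT INTO resultsGrid VALUES (2, 0, 50, 5);
INSERT INTO resultsGrid VALUES (4, 2, 90, 1);
INSERT INTO resultsGrid VALUES (5, 3, 100, 10);
""")

params = {"x": 2, "y": 0, "penalty": 2}
# MAX over pieces on the same diagonal within distance 256,
# discounting each score by distance * penalty.
row = con.execute("""
    SELECT MAX(score - (x - :x) * :penalty) AS score
      FROM resultsGrid
     WHERE (x - y) = (:x - :y) AND x >= :x AND y >= :y
       AND x < :x + 256 AND y < :y + 256 AND drc > x - :x
""", params).fetchone()
```

Here (5,3) wins: 100 - 3*2 = 94, versus 50 for (2,0) itself.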


[sqlite] I need help understanding fake columns

2006-05-19 Thread Brannon King
So I first had these commands:

create table blah (b INTEGER, c INTEGER);
insert into blah values(400,300);
insert into blah values(360,250);
select a,max(b-a) from blah where c=a and a > 200 and a < 500;

That didn't work: 'a' doesn't exist. So I looked in the help and changed the
last line to include '(select 300) as a'. That still didn't let me do what I
really wanted to do; use 'a' in the max function. What's the right way to do
that? And how do I simply return a range of numbers with a select statement?

Thanks.
__
Brannon King
¯¯
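[On the "return a range of numbers with a select" part: SQLite later grew recursive common table expressions (version 3.8.3, long after this thread), which make a generated range straightforward. A sketch via Python's standard-library sqlite3 module:]

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Generate the integers strictly between 200 and 500 as rows.
rows = con.execute("""
    WITH RECURSIVE a(n) AS (
        SELECT 201
        UNION ALL
        SELECT n + 1 FROM a WHERE n + 1 < 500
    )
    SELECT n FROM a
""").fetchall()
```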




RE: [sqlite] Re: help with simple query, please

2006-05-19 Thread Brannon King
It works like a champ. Thanks. 

> > Suppose I have two tables, A and B. A has two columns: score, 
> > location. B has two columns leftLocation, rightLocation.
> >
> > I need the maximum score located outside all entries in 
> table B. B is 
> > a small table, say less than 100 entries.
> 
> select max(score) from A where not exists (select * from B 
> where A.location between B.leftLocation and
> B.rightLocation)
> 
> Igor Tandetnik 
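[Igor's NOT EXISTS query can be checked against a tiny made-up dataset with Python's standard-library sqlite3 module: only scores whose location falls outside every [leftLocation, rightLocation] interval survive.]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE A (score INTEGER, location INTEGER);
CREATE TABLE B (leftLocation INTEGER, rightLocation INTEGER);
INSERT INTO A VALUES (10, 5);   -- inside [1, 6]: excluded
INSERT INTO A VALUES (8, 20);   -- outside every interval: kept
INSERT INTO A VALUES (9, 31);   -- inside [30, 40]: excluded
INSERT INTO B VALUES (1, 6);
INSERT INTO B VALUES (30, 40);
""")

# The correlated NOT EXISTS keeps rows of A whose location is
# outside all intervals in B.
best = con.execute("""
    SELECT MAX(score) FROM A
     WHERE NOT EXISTS (SELECT * FROM B
                        WHERE A.location BETWEEN B.leftLocation
                                             AND B.rightLocation)
""").fetchone()[0]
```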



[sqlite] help with simple query, please

2006-05-18 Thread Brannon King
Suppose I have two tables, A and B. A has two columns: score, location. 
B has two columns leftLocation, rightLocation.


I need the maximum score located outside all entries in table B. B is a 
small table, say less than 100 entries.


I tried this:

SELECT max(score),location FROM A, B GROUP BY score HAVING location < 
min(leftLocation) && location > max(rightLocation)


It doesn't work the way I think it should, not to mention it's the wrong 
algorithm because left/rightLocation are made to be used in pairs.


So, how is it done?

Thanks,
Brannon


[sqlite] create unique index quickly

2006-05-18 Thread Brannon King
The statement
CREATE UNIQUE INDEX primkey ON 4dData (x,y,z)
is too slow for me. It takes 13 sec on 700k records on my Athlon XP 2GHz.
That's about four seconds longer than it took me to put the data in there.
The two times together are still faster than inserting into a table with a
primary key already declared.

Is there some way I can create that unique index without the overhead of
uniqueness checking? I know all my triples are unique when used together: I
generated them that way. Or is there some other way I can create that index
faster?

Thanks.
__
Brannon King
¯¯




[sqlite] opcode count for sqlite3_progress_handler

2006-05-18 Thread Brannon King
Suppose I have a table with 6 INTEGER columns, and 600k entries. I need to
create a unique index on three of those columns, ie., CREATE UNIQUE INDEX
loc ON data (x,y,z). That takes a while so I thought I'd show some progress.
From that information, is it possible to calculate the number of opcodes
such an operation would require for completion so that I could get an
accurate progress bar? Is there a query that would tell me how many opcodes
it would take? Is it possible to create the index once and count the number
of opcodes and do they grow linearly in the creation of indexes?

Thanks again,
__
Brannon King
¯¯
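[There is no supported way to get the total opcode count for an operation in advance, so an exact progress bar isn't really possible, but hooking the callback itself is easy. A sketch with Python's standard-library sqlite3 module, which exposes sqlite3_progress_handler; it just counts how many times the handler fires during the inserts and index build.]

```python
import sqlite3

calls = 0
def progress():
    global calls
    calls += 1
    return 0          # returning nonzero would abort the current operation

con = sqlite3.connect(":memory:")
con.set_progress_handler(progress, 100)   # invoke every ~100 VM opcodes

con.execute("CREATE TABLE data (x INTEGER, y INTEGER, z INTEGER)")
con.executemany("INSERT INTO data VALUES (?, ?, ?)",
                ((i, i * 2, i * 3) for i in range(5000)))
con.execute("CREATE UNIQUE INDEX loc ON data (x, y, z)")
```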




RE: [sqlite] add primary key after inserts?

2006-05-18 Thread Brannon King
 
> I'm not sure what you are getting at here. Copy memory from 
> where to where? The database is much more than an array of 
> record structures that you can copy data into.
> 
> Dennis Cote

I've been studying the profile and can see my thinking was wrong. I was
thinking we would just log an insert request with the step function in
memory somewhere. Now that I look at the profile more closely, I see that it
actually does do a full insert on each call to the step function. I've
attached a copy of the profile. You can see that near 79% of my program's
execution time is in sqlite3_step or one of its children. "writeResult" is
my function that calls sqlite3_step. From that you can see that the mutexing
and six bind calls are less than 6% of the runtime. And it appears from the
source code that the memory allcoations are all happening around or in
BtreeInsert. Again, that makes more sense now that I know it does a full
transaction on each step call. I wonder if I could cache my data myself for
some benefit? I'll try it.

"Method Name","% in","% in w/Children","Called","Average"
"sqlite3VdbeExec","18.0","74.9","26,198","30.6"
"OSF_SQLite::writeResult","5.7","98.7","26,165","9.6"
"sqlite3WinThreadSpecificData","5.3","5.8","78,593","3.0"
"RtlAllocateHeap","4.4","4.4","81,436","2.4"
"sqlite3BtreeInsert","4.1","21.9","26,175","7.0"
"sqlite3pager_get","3.5","3.8","196,934","0.8"
"sqlite3VdbeSerialType","2.9","2.9","549,609","0.2"
"sqlite3BtreeMoveto","2.8","5.6","26,183","4.8"
"malloc","2.6","6.9","80,030","1.4"
"sqlite3_step","2.5","79.1","26,198","4.3"
"sqlite3PutVarint","2.5","2.5","262,301","0.4"
"sqlite3VdbeMemRelease","2.3","2.3","628,410","0.2"
"sqlite3pager_unref","2.3","3.0","291,313","0.3"
"sqlite3VdbeHalt","2.3","3.4","52,388","1.9"
"free","2.2","4.1","79,746","1.2"
"sqlite3_bind_int64","2.2","8.1","156,996","0.6"
"sqlite3VdbeSerialPut","1.9","3.1","183,203","0.5"
"RtlFreeHeap","1.9","1.9","81,039","1.0"
"sqlite3Error","1.7","5.4","209,417","0.4"
"sqlite3MallocFailed","1.6","1.6","657,798","0.1"
"sqlite3VdbeReset","1.6","4.0","26,194","2.8"
"sqlite3VdbeMemSetStr","1.5","2.3","209,480","0.3"
"sqlite3BtreeCursor","1.4","3.9","26,183","2.3"
"sqlite3ValueSetStr","1.4","3.7","209,419","0.3"
"sqlite3BtreeLast","1.3","2.5","26,171","2.2"
"sqlite3BtreeGetMeta","1.2","5.5","26,176","2.0"
"sqlite3VdbeMemSetInt64","1.2","1.7","156,996","0.3"
"sqlite3VarintLen","1.2","1.2","235,547","0.2"
"sqlite3VdbeFreeCursor","1.1","4.7","235,769","0.2"
"sqlite3MallocRaw","1.1","9.2","80,030","0.6"
"sqlite3GenericMalloc","1.0","8.0","80,030","0.6"
"sqlite3VdbeSerialTypeLen","0.9","0.9","317,427","0.1"
"sqlite3GenericFree","0.8","4.8","79,746","0.4"
"sqlite3_reset","0.7","5.2","26,173","1.2"
"sqlite3BtreeKeySize","0.7","0.8","26,167","1.1"
"sqlite3GetVarint","0.7","0.7","134,238","0.2"
"sqlite3FreeX","0.7","5.5","158,661","0.2"
"sqlite3VdbeMemShallowCopy","0.6","0.6","157,040","0.2"
"sqlite3_bind_int","0.6","7.4","130,830","0.2"
"sqlite3ThreadDataReadOnly","0.6","6.4","78,569","0.3"
"(unknown)","0.5","1.2","120,422","0.2"
"sqlite3VdbeMakeReady","0.5","0.5","26,197","0.9"
"TlsGetValue","0.5","0.5","79,641","0.3"
"sqlite3pager_write","0.5","0.5","27,451","0.8"
"sqlite3BtreeCloseCursor","0.4","2.3","26,182","0.8"
"sqlite3Malloc","0.4","5.3","52,827","0.4"
"sqlite3VdbeFifoClear","0.4","0.6","26,215","0.7"
"sqlite3pager_ref","0.4","0.4","92,363","0.2"
"Fasta::Load","0.3","0.4","2","7,532.3"
"RtlLeaveCriticalSection","0.3","0.3","40,641","0.4"
"sqlite3VdbeMemIntegerify","0.3","0.5","26,196","0.5"
"WriteFile","0.3","0.3","1,350","9.0"
"sqlite3BtreeBeginTrans","0.3","0.4","26,174","0.4"
"sqlite3VdbeFifoInit","0.2","0.2","26,215","0.3"
"sqlite3SafetyOff","0.2","0.2","26,228","0.3"
"sqlite3ApiExit","0.2","0.2","26,261","0.3"
"RtlEnterCriticalSection","0.1","0.1","40,642","0.2"
"sqlite3BtreeFlags","0.1","0.1","52,353","0.1"
"sqlite3BtreeCommitStmt","0.1","0.1","26,179","0.2"
"sqlite3SafetyOn","0.1","0.1","26,229","0.2"
"sqlite3VdbeSetChanges","0.1","0.1","26,166","0.2"
"sqlite3VdbeIntValue","0.1","0.1","26,196","0.2"
"CreateFileW","0.1","0.1","8","408.2"
"sqlite3RunParser","0.1","0.3","29","79.2"
"sqlite3pager_lookup","0.0","0.0","8,238","0.2"
"DeleteFileW","0.0","0.0","6","273.7"
"sqlite3Parser","0.0","0.2","689","2.1"
"GetFileAttributesW","0.0","0.0","8","140.4"
"SearchBases::LoadParams","0.0","0.5","1","1,122.4"
"DeleteFileA","0.0","0.0","1","1,004.2"
"CloseHandle","0.0","0.0","8","118.8"
"FlushFileBuffers","0.0","0.0","3","264.8"
"boost::lexical_cast,bool>","0.0","0.0","32","24.3"
"SetFilePointer","0.0","0.0","617","1.0"
"(unknown)","0.0","0.0","611","0.9"
"(unknown)","0.0","0.2","608","0.8"
"boost::program_options::store","0.0","0.0","2","221.5"
"InitializeCriticalSection","0.0","0.0","62","7.0"
"RtlDeleteCriticalSection","0.0","0.0","55","7.0"
"sqlite3pager_pagecount","0.0","0.0","1,238","0.3"
"GetFileAttributesA","0.0","0.0","7","53.7"
"ReadFile","0.0","0.0","31","11.7"

RE: [sqlite] add primary key after inserts?

2006-05-18 Thread Brannon King
So I dropped the "not null" and it gave a ~2% increase in overall speed.
It's not much, but may count for something. It's a little strange because
that is just the opposite of MySQL. In MySQL, it shrinks the database size
and speeds up the access by declaring a column not null. 

I am using the prepare/bind/step combination. The bind calls are negligible
time-wise. It's the step function that I think could run faster. As I
understand it, the step function is not actually doing an insert. That
doesn't happen until the "end transaction" statement. Therefore, the not
null thing should not effect the speed of the step function. Is that not
true? 

It seems that the step function should only occasionally allocate memory; it
should allocate enough for a number of expected queries per transaction.
That doesn't make sense, though, if you're not in the middle of a
transaction. Hence, I wonder if we really need two different functions. The
step function is overkill for just doing inserts that never return any data
and that are in the middle of a transaction. What we need in that situation
is a quick memcpy and nothing else. 


> Having a primary key is the same thing as adding a UNIQUE 
> index on the desired columns except for the case where the 
> key is on a single column with integer type (i.e. integer 
> primary key). The later case is optimized in sqlite, but must 
> be done before the table is populated.
> 
> > I have a table that looks like this:
> >
> > CREATE TABLE IF NOT EXISTS results (qi INTEGER NOT NULL, ri INTEGER 
> > NOT NULL, run INTEGER NOT NULL, drl INTEGER NOT NULL, score INTEGER 
> > NOT NULL, qis INTEGER NOT NULL, ris INTEGER NOT NULL );
> >
> > I originally had this line as well: , PRIMARY KEY (qi, ri, run)
> >
> > I need to do billions (no joke) of consecutive inserts into this 
> > table. Writing binary out with std::ofstream was using 26% 
> of my code 
> > execution time according to the profiler. I changed to sqlite3, and 
> > now the sqlite3_step function uses 50% of my execution time. After 
> > reading the news groups, I removed the primary key. That dropped it 
> > down to about 41%. That was significant. However, I was still 
> > significantly slower than binary writes with ofstream. 
> Then, I tried 
> > the PRAGMA temp_store = 2. That made absolutely no difference. I'll 
> > assume that's what it was to begin with, though it reports 0. Also, 
> > from the profiler, it seems that the step function does a lot of 
> > memory allocations and deallocations, yet I cannot find 
> them in the code.
> >
> > If it helps, I was testing 600k inserts in transactions of 0xFFF 
> > inserts and my current settings:
> >PRAGMA auto_vacuum = 0; \
> >PRAGMA case_sensitive_like = 1; \
> >PRAGMA cache_size = 32768; \
> >PRAGMA default_cache_size = 32768; \
> >PRAGMA count_changes = 0; \
> >PRAGMA synchronous = 0; \
> >PRAGMA page_size = 4096; \
> >PRAGMA temp_store = 2;
> >
> > What else can I do to speed up my inserts?
> >
> >
> There are a couple of things to look at. First, you don't say 
> exactly how you are executing the inserts. You should use 
> sqlite3_prepare() to compile an insert statement with a your 
> seven columns as parameters.
> 
> insert into results values (?, ?, ?, ?, ?, ?, ?)
> 
> Then inside your insert loop you should bind values to each 
> of these parameters. If you have known patterns in your input 
> data, and some of the parameter values repeat,  you can 
> eliminate re-binding the same value to a parameter for each 
> loop since sqlite will retain the last bound value. Finally, 
> step the statement and then reset it for the next loop iteration.
> 
> You should add your index after you have inserted all the data.
> 
> You can also speed the execution time of the insert statement 
> somewhat by dropping the NOT NULL constraints on your data 
> columns. Each one of those is checked for every insert. For 
> comparison here is a  dump of the code generated for a simple 
> two column table with and without the not null constraints.
> 
> SQLite version 3.3.5
> Enter ".help" for instructions
> sqlite> create table t1 (a integer, b integer);
> sqlite> create table t2 (a integer not null, b integer not null);
> sqlite> .explain on
> sqlite> explain insert into t1 values (?, ?);
addr  opcode         p1  p2  p3
----  -------------  --  --  --
0     Goto           0   11
1     Integer        0   0
2     OpenWrite      0   2
3     SetNumColumns  0   2
4     NewRowid       0   0
5     Variable       1   0
6     Variable       2

RE: [sqlite] add primary key after inserts?

2006-05-18 Thread Brannon King
>Also, do a large number of inserts within a transaction. (1000-100,000 or
so, not a billion ;) )

I don't see how increasing the number of inserts per transaction changes the
speed at which the sqlite3_step function executes when it is called on an
insert in the middle of the transaction. I'll put some more commentary on
that in my next response to Dennis.



[sqlite] add primary key after inserts?

2006-05-17 Thread Brannon King
As I understand SQL, "alter table blah add primary key (blah1, blah2)" 
should be how you do it. The sqlite documentation seems to say 
otherwise. Actually, I cannot figure out from the documentation how to 
add a primary key after the table is created and data is entered. How is 
it done? Or is an Index sufficient?


Now that I've asked the question, here's some FYI from my experimentation.

I have a table that looks like this:

CREATE TABLE IF NOT EXISTS results (qi INTEGER NOT NULL, ri INTEGER NOT 
NULL, run INTEGER NOT NULL,
drl INTEGER NOT NULL, score INTEGER NOT NULL, qis INTEGER NOT NULL, ris 
INTEGER NOT NULL );


I originally had this line as well: , PRIMARY KEY (qi, ri, run)

I need to do billions (no joke) of consecutive inserts into this table. 
Writing binary out with std::ofstream was using 26% of my code execution 
time according to the profiler. I changed to sqlite3, and now the 
sqlite3_step function uses 50% of my execution time. After reading the 
news groups, I removed the primary key. That dropped it down to about 
41%. That was significant. However, I was still significantly slower 
than binary writes with ofstream. Then, I tried the PRAGMA temp_store = 
2. That made absolutely no difference. I'll assume that's what it was to 
begin with, though it reports 0. Also, from the profiler, it seems that 
the step function does a lot of memory allocations and deallocations, 
yet I cannot find them in the code.


If it helps, I was testing 600k inserts in transactions of 0xFFF inserts 
and my current settings:

   PRAGMA auto_vacuum = 0; \
   PRAGMA case_sensitive_like = 1; \
   PRAGMA cache_size = 32768; \
   PRAGMA default_cache_size = 32768; \
   PRAGMA count_changes = 0; \
   PRAGMA synchronous = 0; \
   PRAGMA page_size = 4096; \
   PRAGMA temp_store = 2;

What else can I do to speed up my inserts?

Thanks,
Brannon
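[For later readers: the pattern that became the standard answer to this question — one prepared INSERT statement, parameters re-bound per row, all inside a single explicit transaction — sketched with Python's standard-library sqlite3 module (the seven-column table mirrors the one above; autocommit is enabled so the transaction is managed by hand):]

```python
import sqlite3

# isolation_level=None means autocommit: we issue BEGIN/COMMIT ourselves.
con = sqlite3.connect(":memory:", isolation_level=None)
con.execute("CREATE TABLE results (qi, ri, run, drl, score, qis, ris)")

rows = [(i, i, 0, 1, 2, 3, 4) for i in range(100_000)]

# One prepared statement, bound repeatedly, inside a single transaction.
con.execute("BEGIN")
con.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
con.execute("COMMIT")

count = con.execute("SELECT COUNT(*) FROM results").fetchone()[0]
```

Indexes are then best created once, after the bulk load, as discussed elsewhere in this thread.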


RE: [sqlite] Make for windows

2006-05-17 Thread Brannon King
Thanks. That command created the files; however,

grep sqlite3_sleep ./tsrc/*

reports:

sqlite3.h:int sqlite3_sleep(int);

and that's all. In other words, it's declared but never defined.

> So I downloaded the latest CVS files on my Linux box. What do I run to 
> generate the c files I need for compilation on my Windows box? In 
> other words, what make command is used to generate the files for the 
> zip = source download? Where do the files end up? Do I need any 
> special parameters to configure or make for this to work right?
> 
> I searched the newsgroup for this information, but was unsuccessful. 
> Is there a link that explains it already available?
> 

Edit Makefile.linux-gcc to suit your setup.  Then type

   make target_source

Windows sources will appear in the "tsrc" subdirectory.
--
D. Richard Hipp   <[EMAIL PROTECTED]>



[sqlite] Make for windows

2006-05-17 Thread Brannon King
So I downloaded the latest CVS files on my Linux box. What do I run to
generate the c files I need for compilation on my Windows box? In other
words, what make command is used to generate the files for the zip source
download? Where do the files end up? Do I need any special parameters to
configure or make for this to work right? 

I searched the newsgroup for this information, but was unsuccessful. Is
there a link that explains it already available?

Thanks for your time. 
__
Brannon King
¯¯




[sqlite] sqlite3_sleep

2006-05-17 Thread Brannon King
I downloaded the "sqlite-source-3_3_5.zip" file, dropped it into a VC7.1
static lib project and compiled it. That worked fine (aside from numerous
warnings about sizeof(size_t) possibly != sizeof(int)) , however it is
missing the sqlite3_sleep function, which appears to reside in
"experimental.c", not included in that zip package. Can I get the current
"experimental.c" file out of CVS and add that to my project?

Also, when can we expect an official release that includes the speedup
discussed on the newsgroup previously (
http://www.mail-archive.com/sqlite-users%40sqlite.org/msg14954.html ) ?

Thanks for your time.
______
Brannon King
¯¯