RE: [sqlite] Difference in these indices?

2007-04-03 Thread McDermott, Andrew
Hi,
 
> > It might make sense to create a separate standalone utility program 
> > (like sqlite3_analyzer) that reuses some the sqlite  source 
> to do bulk 
> > inserts into a table in a database file as fast a possible with out 
> > having to worry about locking or journaling etc.
> 
> That would solve my problem too (thread: "CREATE INDEX 
> performance" on indexing a 5.8-million record table). I'd 
> love something like that!

Having such a utility would solve a lot of my problems regarding getting
data into the DB.  My current application does not allow access to the
data until the DB is populated so (in this case) I don't derive any
benefit from locking, journalling, etc.

Once the DB is populated I find SQLite to more than fast enough.

-- 
andy



-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-29 Thread Stephen Toney
John,

Thanks for the useful info. Unfortunately it sounds as if this is more
than I have time for right now. 

Stephen


On Wed, 2007-03-28 at 16:29 -0600, John Stanton wrote:
> Perl would not do a good job.  You need to use the Sqlite page 
> structures and they are defined in C terms.
> 
> If you want to make such a program I can give you a template in simple 
> ANSI C.  It builds to a different data structure from Sqlite but the 
> algorithms are there.  It uses a quicksort as I recall and draws heavily 
> on Knuth's Sorting and Searching volume.
> 
> If you build a B-Tree by insertions it is continually splitting nodes. 
> A B-Tree grows by extending the root which makes it somewhat self 
> balancing but keeps it busy.  Enhanced B-Trees will merge nodes to 
> minimize splitting and checkerboarding and enhance balance.  The 
> splitting is expensive and even a simple insertion can be fairly 
> expensive because it may change the interior nodes.
> 
> To build one bottom-up you calculate how much space you need based on 
> the size and count of keys and how many levels you will have.  Then you 
> sort the keys and start filling leaf nodes.  As you fill a node you 
> insert an entry into its parent and as you fill a parent you insert into 
> ints parent and so on.  Eventually you will have a fully populated tree 
> with the root less than full.  You never read a node and only write one 
> when it is full so I/O activity is limited.  As a buffer you have a 
> stack of nodes with a depth equal to the depth of the tree.
> 
> You can add some optimization to the tree by making interior nodes 
> contiguous etc.
> 
> By using more modern OS capabilities (POSIX) you could build it faster 
> by extending the Sqlite file by the size of the index, memory mapping 
> that area and using it as the buffer.  When you are finished you unmap 
> the area and the index is complete.  Using that method you perform no 
> writes and get a 20-50% speed improvement compared to using the write 
> API call.
> 
> Stephen Toney wrote:
> > I may work on such a program, if time permits. If successful I will
> > share it. It would be in Perl using DBI::ODBC, so may not be amazingly
> > fast.
> > 
> > I am pretty good at C++ but have phased it out for most work, so I am
> > still using the antique Sybase compiler, and I doubt the SQLite C++
> > library would work with that. Otherwise it would, of course, be a better
> > choice for such a utility. Anyone ever tried that combination?
> > 
> > John, could you clarify what you mean by "building it bottom-up"? I'm
> > not sure how to build a b-tree any way but by insertions. 
> > 
> > Best regards,
> > Stephen
> > 
> > 
> > On Wed, 2007-03-28 at 11:46 -0600, John Stanton wrote:
> > 
> >>I proposed such a program earlier in this discussion.  I would envisage 
> >>a seperate program which strips out a list of keys from the database, 
> >>sorts it then allocates space in the DB file for the resulting index and 
> >>builds it bottom up.  It would be an off-line process but fast and would 
> >>make raising indices on large databases time efficient.
> >>
> >>Based on our experience of building a B-Tree with such a program 
> >>compared to successive insertions a speed improvement in raising an 
> >>index of at least an order of magnitude could be expected.
> >>
> >>By making it an independent program it can be lean, mean and fast and 
> >>not touch the regular Sqlite library.
> >>
> >>Stephen Toney wrote:
> >>
> >>>On Wed, 2007-03-28 at 08:23 -0600, Dennis Cote wrote:
> >>>
> >>>
> >>>
> It might make sense to create a separate standalone utility program 
> (like sqlite3_analyzer) that reuses some the sqlite  source to do bulk 
> inserts into a table in a database file as fast a possible with out 
> having to worry about locking or journaling etc.
> >>>
> >>>
> >>>That would solve my problem too (thread: "CREATE INDEX performance" on
> >>>indexing a 5.8-million record table). I'd love something like that!
> >>>
> >>>
> >>
> >>
> >>-
> >>To unsubscribe, send email to [EMAIL PROTECTED]
> >>-
> 
> 
> -
> To unsubscribe, send email to [EMAIL PROTECTED]
> -
-- 

Stephen Toney
Systems Planning
[EMAIL PROTECTED]
http://www.systemsplanning.com


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-29 Thread Martin Jenkins

John Stanton wrote:
Perl would not do a good job.  You need to use the Sqlite page 
structures and they are defined in C terms.


But Tcl might. The test suite pokes about with SQLite internals.

Martin

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-28 Thread John Stanton
Perl would not do a good job.  You need to use the Sqlite page 
structures and they are defined in C terms.


If you want to make such a program I can give you a template in simple 
ANSI C.  It builds to a different data structure from Sqlite but the 
algorithms are there.  It uses a quicksort as I recall and draws heavily 
on Knuth's Sorting and Searching volume.


If you build a B-Tree by insertions it is continually splitting nodes. 
A B-Tree grows by extending the root which makes it somewhat self 
balancing but keeps it busy.  Enhanced B-Trees will merge nodes to 
minimize splitting and checkerboarding and enhance balance.  The 
splitting is expensive and even a simple insertion can be fairly 
expensive because it may change the interior nodes.


To build one bottom-up you calculate how much space you need based on 
the size and count of keys and how many levels you will have.  Then you 
sort the keys and start filling leaf nodes.  As you fill a node you 
insert an entry into its parent and as you fill a parent you insert into 
ints parent and so on.  Eventually you will have a fully populated tree 
with the root less than full.  You never read a node and only write one 
when it is full so I/O activity is limited.  As a buffer you have a 
stack of nodes with a depth equal to the depth of the tree.


You can add some optimization to the tree by making interior nodes 
contiguous etc.


By using more modern OS capabilities (POSIX) you could build it faster 
by extending the Sqlite file by the size of the index, memory mapping 
that area and using it as the buffer.  When you are finished you unmap 
the area and the index is complete.  Using that method you perform no 
writes and get a 20-50% speed improvement compared to using the write 
API call.


Stephen Toney wrote:

I may work on such a program, if time permits. If successful I will
share it. It would be in Perl using DBI::ODBC, so may not be amazingly
fast.

I am pretty good at C++ but have phased it out for most work, so I am
still using the antique Sybase compiler, and I doubt the SQLite C++
library would work with that. Otherwise it would, of course, be a better
choice for such a utility. Anyone ever tried that combination?

John, could you clarify what you mean by "building it bottom-up"? I'm
not sure how to build a b-tree any way but by insertions. 


Best regards,
Stephen


On Wed, 2007-03-28 at 11:46 -0600, John Stanton wrote:

I proposed such a program earlier in this discussion.  I would envisage 
a seperate program which strips out a list of keys from the database, 
sorts it then allocates space in the DB file for the resulting index and 
builds it bottom up.  It would be an off-line process but fast and would 
make raising indices on large databases time efficient.


Based on our experience of building a B-Tree with such a program 
compared to successive insertions a speed improvement in raising an 
index of at least an order of magnitude could be expected.


By making it an independent program it can be lean, mean and fast and 
not touch the regular Sqlite library.


Stephen Toney wrote:


On Wed, 2007-03-28 at 08:23 -0600, Dennis Cote wrote:



It might make sense to create a separate standalone utility program 
(like sqlite3_analyzer) that reuses some the sqlite  source to do bulk 
inserts into a table in a database file as fast a possible with out 
having to worry about locking or journaling etc.



That would solve my problem too (thread: "CREATE INDEX performance" on
indexing a 5.8-million record table). I'd love something like that!





-
To unsubscribe, send email to [EMAIL PROTECTED]
-



-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-28 Thread Stephen Toney
I may work on such a program, if time permits. If successful I will
share it. It would be in Perl using DBI::ODBC, so may not be amazingly
fast.

I am pretty good at C++ but have phased it out for most work, so I am
still using the antique Sybase compiler, and I doubt the SQLite C++
library would work with that. Otherwise it would, of course, be a better
choice for such a utility. Anyone ever tried that combination?

John, could you clarify what you mean by "building it bottom-up"? I'm
not sure how to build a b-tree any way but by insertions. 

Best regards,
Stephen


On Wed, 2007-03-28 at 11:46 -0600, John Stanton wrote:
> I proposed such a program earlier in this discussion.  I would envisage 
> a seperate program which strips out a list of keys from the database, 
> sorts it then allocates space in the DB file for the resulting index and 
> builds it bottom up.  It would be an off-line process but fast and would 
> make raising indices on large databases time efficient.
> 
> Based on our experience of building a B-Tree with such a program 
> compared to successive insertions a speed improvement in raising an 
> index of at least an order of magnitude could be expected.
> 
> By making it an independent program it can be lean, mean and fast and 
> not touch the regular Sqlite library.
> 
> Stephen Toney wrote:
> > On Wed, 2007-03-28 at 08:23 -0600, Dennis Cote wrote:
> > 
> > 
> >>It might make sense to create a separate standalone utility program 
> >>(like sqlite3_analyzer) that reuses some the sqlite  source to do bulk 
> >>inserts into a table in a database file as fast a possible with out 
> >>having to worry about locking or journaling etc.
> > 
> > 
> > That would solve my problem too (thread: "CREATE INDEX performance" on
> > indexing a 5.8-million record table). I'd love something like that!
> > 
> > 
> 
> 
> -
> To unsubscribe, send email to [EMAIL PROTECTED]
> -
-- 

Stephen Toney
Systems Planning
[EMAIL PROTECTED]
http://www.systemsplanning.com


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-28 Thread Dennis Cote

RB Smissaert wrote:

Just one question; as I log nearly all my SQL statements to
a SQLite table, will this be OK with the double quotes added?

  
If I understand your question correctly the answer is of course, you are 
simply logging the text of the SQL as a literal string.


The literal string can be specified using the sqlite3_bind_text API or 
by generating a custom insert statement. In the later case the literal 
string must be surrounded by single quote characters and you will need 
to escape any single quote characters in the string. Avoiding this kind 
of quoting is a real benefit of using the bind APIs. If you can't do 
that you may want to use an squote function that will do the quoting and 
escaping for the SQL string and return a safely quoted literal string.


   Function SQuote(sql As String) As String
   SQuote = "'" & Replace(sql, "'", "''") & "'"
   End Function

This will take an SQL statement like

   insert into "t"("a", "b") values ('test', 'it''s a string with 
"quotes" in it.');


and produce a string like

   'insert into "t"("a", "b") values (''test'', ''its a string with 
"quotes" in it.'');'


which sqlite will treat as a single literal string which can be inserted 
into your log like this


   logSQL = "insert into log values (" & SQuote(strSQL) & ")";
   execute(logSQL);

HTH
Dennis Cote


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



RE: [sqlite] Difference in these indices?

2007-03-28 Thread RB Smissaert
Ok, thanks.
A bit more work then to deal with all the indices.

Just one question; as I log nearly all my SQL statements to
a SQLite table, will this be OK with the double quotes added?

RBS


-Original Message-
From: Dennis Cote [mailto:[EMAIL PROTECTED] 
Sent: 28 March 2007 16:10
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Difference in these indices?

RB Smissaert wrote:
> Does this only apply to table and column names?
> I will never use double quote characters in my identifier 
> names, so there should be no problem there.
>
>   
It applies to all the identifiers: table, column, index, trigger, 
database (using attach as id), transactions (if named which they seldom 
are), collation, and view names.

HTH
Dennis Cote


-
To unsubscribe, send email to [EMAIL PROTECTED]

-




-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-28 Thread Dennis Cote

RB Smissaert wrote:

Does this only apply to table and column names?
I will never use double quote characters in my identifier 
names, so there should be no problem there.


  
It applies to all the identifiers: table, column, index, trigger, 
database (using attach as id), transactions (if named which they seldom 
are), collation, and view names.


HTH
Dennis Cote

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-28 Thread Ken
Joe, 
   
  In general the storage concept would be nice to extend this to a more general 
case.
   
  Extend the sql language to allow storage to a specified file. This would 
allow the user to easily seperate the data vs indices. 
  ie Create table  ... ( column ) storage-clause
   
  storage-clause := store as (fileName, pageSize, initialAlocation)
   
   
  Ken
   
  

Joe Wilson <[EMAIL PROTECTED]> wrote:
  --- [EMAIL PROTECTED] wrote:
> If you insert records in order of ascending integer primary
> key, then the inserts will be very fast. If you insert records
> where the integer primary key is randomized, inserts will be
> reasonably fast until the size of your table exceeds the size
> of your disk cache. Then each insert will need to do multiple
> reads and writes to disk as it tries to figure out where in
> your massive table is the right place to put the new record,
> and then make space for that new record. All this disk I/O
> will slow things down dramatically.

SQLite is extremely fast in most area of SQL functionality except for 
bulk inserts of non-ordered data into multi-indexes tables, where it 
is very slow. This pre-sorting-before-insert trick only works for single 
indexed tables, and is not useful in the general case. It also defeats
the purpose of using a database - why should the user have to pre-sort
data to achieve acceptable performance?

In the current SQLite insert algorithm there's way too much shuffling 
around of disk pages in order to minimize database footprint at 
all costs. There ought to be a PRAGMA to prefer maximizing insert 
speed when inserting, and if it doubles or quadruples the database 
size - so be it. Most embedded devices may opt for minimum space; but 
most PC-based apps would likely opt for maximum insert speed.

I believe 2 measures when applied together would significantly 
improve insert speed of indexed rows:

1. Allot each index/btree a contiguous region of the database file in 
which to grow without conflicting with the other indexes' pages and pages 
of the underlying table itself. (Putting indexes in separate files if 
only temporarily for bulk insert would be better, but hey, it's SQLite).

2. Adjust the btree algorithm to create larger gaps in the data when you
make space for new keys in the btree. Such page-sized gaps could minimize 
much of the page shuffling currently done.

When the inserts are done, the user need only issue a VACUUM to bring 
the database file back to its normal "packed" sized without such gaps 
if they should so choose.




TV dinner still cooling? 
Check out "Tonight's Picks" on Yahoo! TV.
http://tv.yahoo.com/

-
To unsubscribe, send email to [EMAIL PROTECTED]
-




RE: [sqlite] Difference in these indices?

2007-03-28 Thread RB Smissaert
Dennis,

OK, you convinced me and I think I will alter this.
Does this only apply to table and column names?
I will never use double quote characters in my identifier 
names, so there should be no problem there.

RBS


-Original Message-
From: Dennis Cote [mailto:[EMAIL PROTECTED] 
Sent: 28 March 2007 15:41
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Difference in these indices?

RB Smissaert wrote:
> As to quotes etc.
> As my code works fine as it is I probably will leave this as the double
> quotes look ugly and it will be a reasonably big job to alter all this.
> Did I get you right that the only benefit of doing create "table1" etc.
> is compatibility with running sqlite with SQLite.exe?
>
>   
The benefit to using standard quoting for identifiers is portability. 
Your table definitions will almost certainly be rejected by almost any 
other database engine. Most don't support the same extended quoting 
rules that sqlite has added for compatibility with files coming from 
other sources.

If adding the escaped quotes to the SQL generation statements doesn't 
work for you, then you could create a simple function that adds the 
escaped quotes to your identifier variables. If you simplify the problem 
and assume you will never use double quote characters in your identifier 
names themselves this function is very simple;

Function Quote(id As String) As String
Quote = """" & id & """"
End Function

and your code becomes something like this.

strTable = "table1"
strColumn = "ID"
strSQL = "create " & Quote(strTable) & "(" & Quote(strColumn) & " 
INTEGER PRIMARY KEY)"

HTH
Dennis Cote


-
To unsubscribe, send email to [EMAIL PROTECTED]

-




-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-28 Thread Dennis Cote

RB Smissaert wrote:

As to quotes etc.
As my code works fine as it is I probably will leave this as the double
quotes look ugly and it will be a reasonably big job to alter all this.
Did I get you right that the only benefit of doing create "table1" etc.
is compatibility with running sqlite with SQLite.exe?

  
The benefit to using standard quoting for identifiers is portability. 
Your table definitions will almost certainly be rejected by almost any 
other database engine. Most don't support the same extended quoting 
rules that sqlite has added for compatibility with files coming from 
other sources.


If adding the escaped quotes to the SQL generation statements doesn't 
work for you, then you could create a simple function that adds the 
escaped quotes to your identifier variables. If you simplify the problem 
and assume you will never use double quote characters in your identifier 
names themselves this function is very simple;


   Function Quote(id As String) As String
   Quote =  & id & 
   End Function

and your code becomes something like this.

   strTable = "table1"
   strColumn = "ID"
   strSQL = "create " & Quote(strTable) & "(" & Quote(strColumn) & " 
INTEGER PRIMARY KEY)"


HTH
Dennis Cote

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-28 Thread Dennis Cote

Joe Wilson wrote:


SQLite is extremely fast in most area of SQL functionality except for 
bulk inserts of non-ordered data into multi-indexes tables, where it 
is very slow. 
Are you sure that it's really "very slow" compared to other database 
engines? How does its insert speed compare to mySQL, PostgreSQL, Orcale, 
DB2, Firebird, etc. for large bulk inserts?



why should the user have to pre-sort
data to achieve acceptable performance?
  
Most users don't have to do this, and are happy with the insert speed. 
It is only a few users who are pushing SQLite, and probably other 
database engines, to their limits that need to resort to such tricks to 
get it to run as fast as possible.


In the current SQLite insert algorithm there's way too much shuffling 
around of disk pages in order to minimize database footprint at 
all costs. There ought to be a PRAGMA to prefer maximizing insert 
speed when inserting, and if it doubles or quadruples the database 
size - so be it. Most embedded devices may opt for minimum space; but 
most PC-based apps would likely opt for maximum insert speed.


  
I'm not sure the number of users for whom this is an issue warrants 
adding a second code path to sqlite.


It might make sense to create a separate standalone utility program 
(like sqlite3_analyzer) that reuses some the sqlite  source to do bulk 
inserts into a table in a database file as fast a possible with out 
having to worry about locking or journaling etc.


Dennis Cote

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-27 Thread Nuno Lucas

On 3/28/07, Joe Wilson <[EMAIL PROTECTED]> wrote:

I believe 2 measures when applied together would significantly
improve insert speed of indexed rows:

1. Allot each index/btree a contiguous region of the database file in
which to grow without conflicting with the other indexes' pages and pages
of the underlying table itself. (Putting indexes in separate files if
only temporarily for bulk insert would be better, but hey, it's SQLite).

2. Adjust the btree algorithm to create larger gaps in the data when you
make space for new keys in the btree. Such page-sized gaps could minimize
much of the page shuffling currently done.


First I have to say I have no experience on database optimization
techniques (and it's my last mail before going to bed), but maybe the
possibility of having different index data structures (instead of only
the b-tree one) could also help?

My idea is that if we could create an index that uses an hash table,
possibly with a b-tree to handle collisions, maybe it would make
random inserts faster? It would spread the insertions on smaller
b-tree's.

Off course that leaves the problem of finding the right hash function,
but I suppose there are already pretty good generic (and fast) ones.
My guess is that it should be an hash function that spreads the
calculated hash in an ordered way, so one could walk the hash table in
a similar way than an ordered directory.


Regards,
~Nuno Lucas

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-27 Thread Joe Wilson
--- [EMAIL PROTECTED] wrote:
> If you insert records in order of ascending integer primary
> key, then the inserts will be very fast.  If you insert records
> where the integer primary key is randomized, inserts will be
> reasonably fast until the size of your table exceeds the size
> of your disk cache.  Then each insert will need to do multiple
> reads and writes to disk as it tries to figure out where in
> your massive table is the right place to put the new record,
> and then make space for that new record.  All this disk I/O
> will slow things down dramatically.

SQLite is extremely fast in most area of SQL functionality except for 
bulk inserts of non-ordered data into multi-indexes tables, where it 
is very slow. This pre-sorting-before-insert trick only works for single 
indexed tables, and is not useful in the general case. It also defeats
the purpose of using a database - why should the user have to pre-sort
data to achieve acceptable performance?

In the current SQLite insert algorithm there's way too much shuffling 
around of disk pages in order to minimize database footprint at 
all costs. There ought to be a PRAGMA to prefer maximizing insert 
speed when inserting, and if it doubles or quadruples the database 
size - so be it. Most embedded devices may opt for minimum space; but 
most PC-based apps would likely opt for maximum insert speed.

I believe 2 measures when applied together would significantly 
improve insert speed of indexed rows:

1. Allot each index/btree a contiguous region of the database file in 
which to grow without conflicting with the other indexes' pages and pages 
of the underlying table itself. (Putting indexes in separate files if 
only temporarily for bulk insert would be better, but hey, it's SQLite).

2. Adjust the btree algorithm to create larger gaps in the data when you
make space for new keys in the btree. Such page-sized gaps could minimize 
much of the page shuffling currently done.

When the inserts are done, the user need only issue a VACUUM to bring 
the database file back to its normal "packed" sized without such gaps 
if they should so choose.


 

TV dinner still cooling? 
Check out "Tonight's Picks" on Yahoo! TV.
http://tv.yahoo.com/

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



RE: [sqlite] Difference in these indices?

2007-03-27 Thread RB Smissaert
Looks then that doing the table creation with INTEGER PRIMARY KEY
Is the way to go, but as always it will come down to a lot of testing.

As to quotes etc.
As my code works fine as it is I probably will leave this as the double
quotes look ugly and it will be a reasonably big job to alter all this.
Did I get you right that the only benefit of doing create "table1" etc.
is compatibility with running sqlite with SQLite.exe?

RBS

-Original Message-
From: Dennis Cote [mailto:[EMAIL PROTECTED] 
Sent: 27 March 2007 23:41
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Difference in these indices?

RB Smissaert wrote:
> Is it right that this won't affect the speed of any subsequent inserts or
> deletes?
>   
Well inserts will be done in id order. If you have predefined ids 
assigned by some outside source and specify them when you insert into 
sqlite, it will have to insert at random location in the btree. This 
will take longer than always appending at the end of the btree. If you 
let sqlite assign the ids, or the ids are in order, then this is not an 
issue. If you are always going to create the external index afterwards 
anyway, it will also probably not make much difference (you would have 
to test it each way).
> About the single quotes etc:
> This is VB code, so I can't do:
> Create table "table1"("ID" INTEGER PRIMARY KEY)
>
> I can do:
> Create table table1(ID INTEGER PRIMARY KEY)
>
> As the table and the columns are often variables it will be something
like:
>
> strTable = "table1"
> strColumn = "ID"
>
> strSQL = "create " & strTable & "(" & strColumn & " INTEGER PRIMARY KEY)"
>
>
>   
VB and SQL both use the same technique of escaping quotes embedded in 
strings using a pair of quotes back to back.

In VB

print "Test ""quoted"" strings." 

will output

Test "quoted" strings.

You can do the same with the strings you are building to send to SQLite. 
Using the following VB statement

strSQL = "create """ & strTable & """(""" & strColumn & """ INTEGER 
PRIMARY KEY)"

will produce a strSQL that contains the string

create "table1"("ID" INTEGER PRIMARY KEY)

HTH
Dennis Cote




-
To unsubscribe, send email to [EMAIL PROTECTED]

-




-
To unsubscribe, send email to [EMAIL PROTECTED]
-



RE: [sqlite] Difference in these indices?

2007-03-27 Thread RB Smissaert
Thanks for clarifying that.
I think all the data to be inserted in tables with an INTEGER PRIMARY KEY
will be sorted on that key, but I will have to check as sometimes these
tables can be big, say a few million rows.

RBS

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: 27 March 2007 23:34
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Difference in these indices?

"RB Smissaert" <[EMAIL PROTECTED]> wrote:
> Thanks for that.
> So if I can then I should create the table with INTEGER PRIMARY KEY.
> Is it right that this won't affect the speed of any subsequent inserts or
> deletes?
> 

That depends on the data.

If you insert records in order of ascending integer primary
key, then the inserts will be very fast.  If you insert records
where the integer primary key is randomized, inserts will be
reasonably fast until the size of your table exceeds the size
of your disk cache.  Then each insert will need to do multiple
reads and writes to disk as it tries to figure out where in
your massive table is the right place to put the new record,
and then make space for that new record.  All this disk I/O
will slow things down dramatically.

Every table has an integer primary key whether you declare one
or not.  If you do not specify an integer primary key then one
is created for you automatically named "ROWID" or "OID".  If 
you do not specify a value for the integer primary key when
inserting, a value is selected automatically.  The value
selected is one more than the largest existing integer primary
key in that table.  That means that if you do not specify
integer primary keys, the keys choosen are in ascending order
and inserts are very fast.

--
D. Richard Hipp  <[EMAIL PROTECTED]>



-
To unsubscribe, send email to [EMAIL PROTECTED]

-




-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-27 Thread Dennis Cote

RB Smissaert wrote:

Is it right that this won't affect the speed of any subsequent inserts or
deletes?
  
Well inserts will be done in id order. If you have predefined ids 
assigned by some outside source and specify them when you insert into 
sqlite, it will have to insert at random location in the btree. This 
will take longer than always appending at the end of the btree. If you 
let sqlite assign the ids, or the ids are in order, then this is not an 
issue. If you are always going to create the external index afterwards 
anyway, it will also probably not make much difference (you would have 
to test it each way).

About the single quotes etc:
This is VB code, so I can't do:
Create table "table1"("ID" INTEGER PRIMARY KEY)

I can do:
Create table table1(ID INTEGER PRIMARY KEY)

As the table and the columns are often variables it will be something like:

strTable = "table1"
strColumn = "ID"

strSQL = "create " & strTable & "(" & strColumn & " INTEGER PRIMARY KEY)"


  
VB and SQL both use the same technique of escaping quotes embedded in 
strings using a pair of quotes back to back.


In VB

   print "Test ""quoted"" strings." 


will output

   Test "quoted" strings.

You can do the same with the strings you are building to send to SQLite. 
Using the following VB statement


   strSQL = "create """ & strTable & """(""" & strColumn & """ INTEGER 
PRIMARY KEY)"


will produce a strSQL that contains the string

   create "table1"("ID" INTEGER PRIMARY KEY)

HTH
Dennis Cote



-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-27 Thread drh
"RB Smissaert" <[EMAIL PROTECTED]> wrote:
> Thanks for that.
> So if I can then I should create the table with INTEGER PRIMARY KEY.
> Is it right that this won't affect the speed of any subsequent inserts or
> deletes?
> 

That depends on the data.

If you insert records in order of ascending integer primary
key, then the inserts will be very fast.  If you insert records
where the integer primary key is randomized, inserts will be
reasonably fast until the size of your table exceeds the size
of your disk cache.  Then each insert will need to do multiple
reads and writes to disk as it tries to figure out where in
your massive table is the right place to put the new record,
and then make space for that new record.  All this disk I/O
will slow things down dramatically.

Every table has an integer primary key whether you declare one
or not.  If you do not specify an integer primary key then one
is created for you automatically named "ROWID" or "OID".  If 
you do not specify a value for the integer primary key when
inserting, a value is selected automatically.  The value
selected is one more than the largest existing integer primary
key in that table.  That means that if you do not specify
integer primary keys, the keys choosen are in ascending order
and inserts are very fast.

--
D. Richard Hipp  <[EMAIL PROTECTED]>


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



RE: [sqlite] Difference in these indices?

2007-03-27 Thread RB Smissaert
Thanks for that.
So if I can then I should create the table with INTEGER PRIMARY KEY.
Is it right that this won't affect the speed of any subsequent inserts or
deletes?

About the single quotes etc:
This is VB code, so I can't do:
Create table "table1"("ID" INTEGER PRIMARY KEY)

I can do:
Create table table1(ID INTEGER PRIMARY KEY)

As the table and the columns are often variables it will be something like:

strTable = "table1"
strColumn = "ID"

strSQL = "create " & strTable & "(" & strColumn & " INTEGER PRIMARY KEY)"


RBS

-Original Message-
From: Dennis Cote [mailto:[EMAIL PROTECTED] 
Sent: 27 March 2007 22:51
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Difference in these indices?

RB Smissaert wrote:
> Is there any difference in an index created like this:
>
> Create table 'table1'([ID] INTEGER PRIMARY KEY)
>
> with this:
>
> Create table 'table1'([ID] INTEGER)
>
> Create unique index idx_table1_ID on table1(ID)
>
> I tended to use the first form, but as that can make subsequent table
> inserts or deletes slower I am now moving to the second form.
>
>   
Yes there is. The first uses the key for the btree that stores the table 
to hold the id. The second uses a second comlpetely independent btree to 
store an index that holds records that contain the id number and the 
rowid of the corresponding record in the table. The table itself 
contains a rowid as the key of the table btree and the user id field.

SQLite version 3.3.13
Enter ".help" for instructions
sqlite> Create table 'table1'([ID] INTEGER PRIMARY KEY);
sqlite> select * from sqlite_master;
table|table1|table1|2|CREATE TABLE 'table1'([ID] INTEGER PRIMARY KEY)

SQLite version 3.3.13
Enter ".help" for instructions
sqlite> Create table 'table1'([ID] INTEGER);
sqlite> Create unique index idx_table1_ID on table1(ID);
sqlite> select * from sqlite_master;
table|table1|table1|2|CREATE TABLE 'table1'([ID] INTEGER)
index|idx_table1_ID|table1|3|CREATE UNIQUE INDEX idx_table1_ID on table1(ID)

The first version stores on integer for each record, and stores it in 
the btree key. The second stores four integers for each record, two in 
the table record and two in the index record. You are making your 
database much larger for no reason.

Also, you are using literal strings (delimited with a single quote) for 
your table names. This is not standard SQL and will not be portable. You 
are also using square brackets to quote your column names. This is also 
an SQLite and MS extension to standard SQL. You should quote identifiers 
such as table and column names with double quotes.

Create table 'table1'([ID] INTEGER PRIMARY KEY)

should be:

Create table "table1"("ID" INTEGER PRIMARY KEY)
  
HTH
Dennis Cote


-
To unsubscribe, send email to [EMAIL PROTECTED]

-




-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Difference in these indices?

2007-03-27 Thread Dennis Cote

RB Smissaert wrote:

Is there any difference in an index created like this:

Create table 'table1'([ID] INTEGER PRIMARY KEY)

with this:

Create table 'table1'([ID] INTEGER)

Create unique index idx_table1_ID on table1(ID)

I tended to use the first form, but as that can make subsequent table
inserts or deletes slower I am now moving to the second form.

  
Yes there is. The first uses the key for the btree that stores the table 
to hold the id. The second uses a second comlpetely independent btree to 
store an index that holds records that contain the id number and the 
rowid of the corresponding record in the table. The table itself 
contains a rowid as the key of the table btree and the user id field.


SQLite version 3.3.13
Enter ".help" for instructions
sqlite> Create table 'table1'([ID] INTEGER PRIMARY KEY);
sqlite> select * from sqlite_master;
table|table1|table1|2|CREATE TABLE 'table1'([ID] INTEGER PRIMARY KEY)

SQLite version 3.3.13
Enter ".help" for instructions
sqlite> Create table 'table1'([ID] INTEGER);
sqlite> Create unique index idx_table1_ID on table1(ID);
sqlite> select * from sqlite_master;
table|table1|table1|2|CREATE TABLE 'table1'([ID] INTEGER)
index|idx_table1_ID|table1|3|CREATE UNIQUE INDEX idx_table1_ID on table1(ID)

The first version stores on integer for each record, and stores it in 
the btree key. The second stores four integers for each record, two in 
the table record and two in the index record. You are making your 
database much larger for no reason.


Also, you are using literal strings (delimited with a single quote) for 
your table names. This is not standard SQL and will not be portable. You 
are also using square brackets to quote your column names. This is also 
an SQLite and MS extension to standard SQL. You should quote identifiers 
such as table and column names with double quotes.


   Create table 'table1'([ID] INTEGER PRIMARY KEY)

should be:

   Create table "table1"("ID" INTEGER PRIMARY KEY)
 
HTH

Dennis Cote

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



[sqlite] Difference in these indices?

2007-03-27 Thread RB Smissaert
Is there any difference in an index created like this:

Create table 'table1'([ID] INTEGER PRIMARY KEY)

with this:

Create table 'table1'([ID] INTEGER)

Create unique index idx_table1_ID on table1(ID)

I tended to use the first form, but as that can make subsequent table
inserts or deletes slower I am now moving to the second form.

RBS





-
To unsubscribe, send email to [EMAIL PROTECTED]
-