RE: [sqlite] FW: 2.8.14 performance, Indexed Table INSERTS

2004-06-21 Thread Edward Bacon
 
>From: D. Richard Hipp [mailto:[EMAIL PROTECTED] 
>
-- snip --
>simplifies to O(logN) which is clearly less than O(N).
>In that case, it pays to use the index.
 
Which is my case I believe, thanks.  It's been years (OMG, 16!) since I
had an algorithms class.  Is that log base 2, or does it matter?
 
Note the index in the example had two fields:
 
CREATE INDEX Articles_Index ON Articles (MessageID, DomainID);
 
The messageid is almost unique, very rarely (every few years?) would it
be duplicated.  The domainid by itself will have many duplicate records,
it is the name of the server that originated the article.  Together they
are unique.
 
By the way, as a brute force experiment I tried an index with only one
field of:
 
CREATE INDEX Articles_Index ON Articles (MessageID);
 
And, keeping N large, got a 30% speed improvement.  I wonder why?  Does
this indicate one could trade some file size (DomainID is a key into
another small table holding the domain string) for a speed improvement
by concatenating the two strings and having only one field?
 
Also back to one of my original questions: Can the table be kept in one
file and the index be kept in another?  The idea would be to localize
the index and suspects it currently is spread throughout the database.
 
TIA again.


RE: [sqlite] FW: 2.8.14 performance, Indexed Table INSERTS

2004-06-21 Thread Darren Duncan
At 5:05 PM -0700 6/21/04, Keith Herold wrote:
 > down the result set would make things faster..? Wouldn't the select
 here:
CREATE TABLE tmp ( flag boolean, name text );
SELECT name FROM tmp WHERE flag = 1 AND name LIKE '%foo%';
 run faster with an index on the flag column since it can scan
 just the
 flag = 1 rows instead of the full table?
I think this is one of those big-O things, right?  It might be faster, but
only a bit faster, and  not enough to justify the hassle of creating and
maintaining the index.
--Keith
A lot of optimizing is context specific.  Certain kinds of 
optimizations work well on some sets of data, but can make things 
worse with others.  The index being a type of optimization.  On some 
data sets, indexes can make things a little faster, and on others, 
orders of magnitude faster; the latter in particular is where you 
want to target them.

The main time indexes are useful is when you only want to return a 
tiny fraction of the records in a table, in a typical select.  For 
example, use an index when you want to fetch, say, 1 row out of a 
table with 1000.  But if you want to fetch 200 rows out of the same 
table, then not having an index will be faster for fetches.  And you 
save time by not creating the index too.

When you use an index, the overhead of storing rows is definately 
greater, and the overhead of fetching is also greater.  However, that 
overhead is only on the rows actually returned for a fetch, so you 
still have a huge savings with an index if you return just a few 
rows.  But if you are selecting a lot of rows, such as the 200, then 
the overhead of finding them first in the index leads to more work 
being done than if the index was ignored and the same 200 were found 
with a simple table scan.

One of the best places to have indexes is usually on columns where 
every value is unique, regardless of how many records you have in the 
table.  But then, if you have a unique or primary key constraint on 
the column, then an index is created implicitely anyway, as it is 
used when enforcing the constraint.

-- Darren Duncan
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


RE: [sqlite] FW: 2.8.14 performance, Indexed Table INSERTS

2004-06-21 Thread Keith Herold
> On Jun 20, 2004, at 9:07 PM, Darren Duncan wrote:



> down the result set would make things faster..? Wouldn't the select 
> here:
> 
>CREATE TABLE tmp ( flag boolean, name text );
> 
>SELECT name FROM tmp WHERE flag = 1 AND name LIKE '%foo%';
> 
> run faster with an index on the flag column since it can scan 
> just the 
> flag = 1 rows instead of the full table?

I think this is one of those big-O things, right?  It might be faster, but
only a bit faster, and  not enough to justify the hassle of creating and
maintaining the index.

--Keith


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] FW: 2.8.14 performance, Indexed Table INSERTS

2004-06-21 Thread Dave Hayden
On Jun 20, 2004, at 9:07 PM, Darren Duncan wrote:
Generally speaking, you should only use indexes on table columns that 
have a lot of distinct values, and each one only appears a few times. 
You should not use indexes on columns that have few distinct values 
and each appears many times; in the latter case, a full table scan 
would be faster.
That's weird. I would have thought that having any index at all to pare 
down the result set would make things faster..? Wouldn't the select 
here:

  CREATE TABLE tmp ( flag boolean, name text );
  SELECT name FROM tmp WHERE flag = 1 AND name LIKE '%foo%';
run faster with an index on the flag column since it can scan just the 
flag = 1 rows instead of the full table?

-Dave
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: [sqlite] SQLite 3.0 - Why only one text encoding per database?

2004-06-21 Thread Darren Duncan
At 5:23 PM -0500 6/21/04, Tim wrote:
It sounds like SQLite 3.0 only supports one text representation per 
database ("Internally and in the disk file, the same text 
representation is used everywhere.").  Is there a particular reason 
for this limitation?  What if I want to store one column as UTF8 
text and another column as UTF16?  Other databases offer a separate 
16-bit "ntext" column type for this purpose.
There is no reason to support both UTF-8 and UTF-16 in the same file 
at the same time.  They both have identical character repertoires, 
and identical code points for each character; they are completely 
interchangeable.  They both represent every character that exists in 
any language.

The only reasons for picking one over the other relates to 
memory/disk usage and portability.  UTF-8 takes up much less disk 
space when you are dealing mainly in Roman characters, while UTF-16 
tends to use less disk space when dealing mainly with non-Roman 
characters.  UTF-8 also has an identical byte order on any platform, 
making it "network safe", and trivially portable.  UTF-16 has 
separate versions for big-endian and little-endian architectures, 
plus some UTF-16 files have a byte-order-mark ("BOM") to say what we 
have, while others don't, making for a potential compatability 
problem in the latter case; UTF-16's main advantage is that it is 
sometimes faster to use.  In any event, both UTF-8 and UTF-16 are 
variable-width encodings, though UTF-8 tends to vary more; their 
names say the *minimum* number of bits used by each character.

Speaking practically, unless you can justify using UTF-16 for 
anything, then always use UTF-8 by default as doing so gives a lot 
more advantages.  (But whichever of those you choose, you can convert 
later fairly easily.)

The reason for making the entire file one encoding or the other is 
that it simplifies the core SQLite code, as well as other code which 
uses it, since the code doesn't have to constantly test for one type 
or the other.

The situation you mention about other databases is a different 
matter.  Their normal text columns are fixed-width 8-bit characters, 
which are not Unicode, and can only represent 255 unique characters. 
The "ntext" (short for "national varchar") column type is for when 
you need more than that many characters.  And even a fixed-width 
16-bit column won't fit all the characters that Unicode can 
represent.  Unicode has a 21-bit code point range, plus a fixed-width 
UTF-32 encoding (the remaining 11 bits aren't used).  The 8-bit and 
16-bit non-Unicode encodings used by other databases are not fully 
interchangeable with each other.

The fact that un-necessary options aren't provided by SQLite is part 
of what makes it "lite".  I say don't change a thing.

I could store anything with BLOBs, but then
I'd lose the ability to use SORT.
Simple.  Don't use BLOBs for character data; that is what Unicode text is for.
Storing only one type of text data also seems to go against the idea 
of SQLite's "manifest typing" where "datatype is associated with the 
data itself, not with its container."
That is still true.  But the "manifest typing" is kept relatively 
simple by the single-encoding description.  The choices are simply: 
character, integer, decimal, binary, null.  No reason to break down 
'character' further.

Here the datatype of particular piece of text is associated with not 
just a column type, but a fixed datatype defined for the entire 
database. SQLite should store knowledge about the text encoding of 
each value.  I guess there would also need to be a way to 
differentiate between 8-bit and 16-bit string literals since SQLite 
doesn't have column types, and efficiently (without conversion) 
insert or query for 8-bit and 16-bit values in a single statement.
What probably would work best is to pick the single UTF format that 
matches what the rest of your program uses internally; if your 
program uses multiple types, then either change it to standardize on 
one, or put conversion routines in the few spots that need it next to 
the database access code.  If it were up to me (and for my stuff, it 
is), I would pick a single encoding, UTF-8, and use just it through 
my entire program.

-- Darren Duncan
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: [sqlite] SQLite 3.0 - Why only one text encoding per database?

2004-06-21 Thread D. Richard Hipp
Tim wrote:
> It sounds like SQLite 3.0 only supports one text representation per database
> ("Internally and in the disk file, the same text representation is used
> everywhere.").  Is there a particular reason for this limitation?
The reason is to keep the API simple.  If the programmer selects a single
text representation for all data (which they are advised to do) then they
only have to provide a single version for each user-defined function and
user-defined collating sequence.  Even if they do not stick to a single
text representation, the most they'll have to code up is three versions
of each user-defined function and collating sequence.  If we were to
allow mixed text representations in the database, 9 different versions
of each collating function would be required, and up to 3**N versions
of each user-defined function, where N is the number of parameters in
the function.
Of course, the programmer could be lazy and only provide one version of
each function and collating sequence and SQLite would automatically
convert text representations as needed.  But constantly flipping text
representations around is needlessly time consuming.
> What if I want to store one column as UTF8 text and another column as UTF16?
> Other databases offer a separate 16-bit "ntext" column type for this purpose.
You can still store one column as UTF8 and the other as UTF16 using
sqlite3_bind_text() and sqlite3_bind_text16().  It's just that one or
the other or both will be converted into the native representation of
the database before being written to disk.
> I could store anything with BLOBs, but then I'd lose the ability to use SORT.
>
BLOBs sort in SQLite.  They sort in memcmp() order.  And BLOBs sort after
text.
>
> Storing only one type of text data also seems to go against the idea of
> SQLite's "manifest typing" where "datatype is associated with the data
> itself, not with its container."
>
SQLite does not consider text representation part of the datatype.
Apart from performance and database file size, the programmer has
no way of knowing what text representation is being used internally.
The fact that one of three different representations might be used
internally shows through into the user-defined function and collating
sequence interfaces in order to allow programmers to optimize for
performance.  But apart from that, you don't really know what encodings
SQLite is using internally.  SQLite might convert all strings to
EBCDIC internally for all you know.  The main interface would be
the same.
Other database engines claim to store UTF-16 in some fields and UTF-8
in others.  But is that really what they are doing?  Are you certain
they are not converting the UTF-16 data into UTF-8 for storage then
converting it back when you request the data?  How can you tell?
Another way to look at it:  SQLite 2.8 made you choose between UTF-8 and
ISO8859 at compile-time.  SQLite 3.0 lets you choose between UTF-8,
UTF-16BE, and UTF-16LE and it lets you defer the choice to runtime.
But you still have to make the choice.
--
D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


[sqlite] SQLite 3.0 - Why only one text encoding per database?

2004-06-21 Thread Tim
It sounds like SQLite 3.0 only supports one text representation per database 
("Internally and in the disk
file, the same text representation is used everywhere.").  Is there a particular 
reason for this
limitation?  What if I want to store one column as UTF8 text and another column as 
UTF16?  Other databases
offer a separate 16-bit "ntext" column type for this purpose.  I could store anything 
with BLOBs, but then
I'd lose the ability to use SORT.

Storing only one type of text data also seems to go against the idea of SQLite's 
"manifest typing" where
"datatype is associated with the data itself, not with its container."  Here the 
datatype of particular
piece of text is associated with not just a column type, but a fixed datatype defined 
for the entire
database. SQLite should store knowledge about the text encoding of each value.  I 
guess there would also
need to be a way to differentiate between 8-bit and 16-bit string literals since 
SQLite doesn't have
column types, and efficiently (without conversion) insert or query for 8-bit and 
16-bit values in a single
statement.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



[sqlite] tclsqlite dll's

2004-06-21 Thread Lawrence Chitty
Hi

having recently started playing with sqlite and I am very impressed. As my
primary interest is to use it with Tcl, I'm wondering if a pre-compiled
tclsqlite.dll can be made available for release 3.0 .

I also notice that the tclsqlite.dll for 2.8.14 appears to have dissapeared
from the download page (the tclsqlite-2.8.14.zip file doesn't contain a
tclsqlite dll )

Thanks

Lawrence


---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.708 / Virus Database: 464 - Release Date: 18/06/04


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] sqlite 3.0.0 build issues

2004-06-21 Thread D. Richard Hipp
Doug Currie wrote:
The TEMP_STORE features should have been fully implemented by
check-in [1302] on 2004-Apr-23.  Have you observed otherwise?

1. Searching for TEMP_STORE in the sources, I see it described in the
comments in main.c but it does not appear to be used anywhere in the
code. So, this compiler macro does nothing as far as I can tell.
2. There is support for pragma temp_store -- but even this doesn't
seem to have any affect on operation... searching the sources I can't
find any use of db->temp_store (perhaps I am not following the logic).
Yep.  Looks like it got unimplemented again with check-in [1325]
on 2004-May-08.
--
D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: [sqlite] sqlite 3.0.0 build issues

2004-06-21 Thread D. Richard Hipp
Doug Currie wrote:
> I have not modified the sqlite sources to implement the TEMP_STORE features.
>
The TEMP_STORE features should have been fully implemented by
check-in [1302] on 2004-Apr-23.  Have you observed otherwise?
--
D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: [sqlite] sqlite 3.0.0 build issues

2004-06-21 Thread Doug Currie
Replying to my own message...

I bit the bullet and installed autoconf under msys/mingw. Using this
tool I updated configure.ac and Makefile.in to address some of the
build issues identified below -- everything but the OMIT macros. Of
course, I have not modified the sqlite sources to implement the
TEMP_STORE features.

The configure file ends up with a few unrelated changes (hey, it's
autoconf, what would you expect?) but they look minor. If this causes
anyone problems, I apologize in advance. Please let me know if you
think I have broken something you depend upon.

The new options are called
  --enable-threadsafe  Support threadsafe operation
  --enable-tempstore   Use an in-ram database for temporary tables 
(never,no,yes,always)

E.g.,

../sqlite/configure --enable-tempstore=no --enable-threadsafe

e

Sunday, June 20, 2004, 12:36:56 AM, I wrote:

> There are several build options in the configure/make of sqlite3 that
> are obsolete, and others that are missing.

> ENCODING is obsolete since it is specified in the open calls now.
> There are still several references to it, though, in the make and
> autoconf files, and these should all be removed.

> INMEMORYDB and SQLITE_OMIT_INMEMORYDB are obsolete since the red-black
> tree implementation is now gone. Again, references to these should be
> removed from the make and autoconf files.

> INCOREFLAGS and in particular TEMP_STORE are in limbo. The values are
> set but never used. E.g., INCOREFLAGS is set in Makefile but never
> used there. TEMP_STORE is initialized in sqliteInt.h and documented in
> several comments, but is never used in the code.

> I suspect that the intention is that sqlite3BtreeFactory use
> TEMP_STORE in certain circumstances, but presently it uses neither
> TEMP_STORE nor the db->temp_store set by pragma. In fact, even the
> flag BTREE_MEMORY which is set by sqlite3BtreeFactory when the
> filename is empty is ignored by sqlite3BtreeOpen.

> The THREADSAFE macro is used in all three os_xxx.c files, but is not
> supported by the autoconf files (configure.ac and Makefile.in). It
> would be really nice to add this when configure.ac is updated next.

> It would also be nice to change configure.ac so that only one of
> -DOS_WIN=1 or -DOS_UNIX=1 is specified in TARGET_CFLAGS. Only one is
> necessary (the other will be set in os.h) and it would make the
> console dribble from make much easier to read.

> There are several SQLITE_OMIT_ macros that should also be supported by
> the autoconf files.

> Disclaimer: I don't use autoconf, but I'd be willing to help with the
> source updates if somebody familiar with autoconf was willing to do
> the configure (+ aclocal.m4 etc.) file generation and linux testing.

> e



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



[sqlite] When to set 'PRAGMA full_column_names = OFF'?

2004-06-21 Thread Tito Ciuro
Hello everybody,
Should I set 'PRAGMA full_column_names = OFF' every time I execute a 
SELECT statement, or just once, right after I open the database? What 
is the lifetime of this setting?

Thanks,
-- Tito

Re: [sqlite] api 3, bind help

2004-06-21 Thread Stephen C. Gilardi
The argument is a "pointer to function returning void and taking void* 
as its argument" (like "free");

Here's a sample program:
#include 
void func(void (*)(void*));
void freemystuff(void* in);
int main(int argc, char* argv[])
{
  func(freemystuff);
}
void func(void (*callback)(void*) )
{
  callback(0);
}
void freemystuff(void* in)
{
  std::cout << "freemystuff called " << in << std::endl;
}
--Steve
On Jun 21, 2004, at 10:55 AM, Andy Colson wrote:
So, I'm guessing the fifth arg is the eCopy param?  My C isnt very 
strong, so void(*)(void*) is a pointer to a function like this:
void* someFunc(void*), right?

My guess is the params are:
int sqlite3_bind_blob(sqlite3_stmt* statement, int paramNumber, const 
void* fieldValue, int fieldSize, callback); /*where callback can be 
one of the const's SQLITE_STATIC, SQLITE_TRANSIENT, or a pointer to a 
function that returns a void* and takes one void* as an arg. */


Re: [sqlite] ignore index

2004-06-21 Thread Raymond Irving

It would be even better if SQLite could do a better
job at optimizing the index searches.

__
Raymond Irving

--- rene <[EMAIL PROTECTED]> wrote:
> Hi,
> 
> i ran into a simulair case as this message
>
http://www.mail-archive.com/[EMAIL PROTECTED]/msg01646.html
> 
> where an index actually slows things down, because
> there are many rows matching the
> condition (where the other (indexed) condition only
> matches a few rows).
> 
> In above mentioned letter you suggest to just drop
> the index, or concat the condition with an
> (empty) string (like: instead of  "checked=0"  use 
> "checked||''='0' ").
> 
> in my case, i need the index for another query..
> well, i could consider to drop it but the fix is
> good enough for me.. i just like to mention that it
> would be a really cool feature to tell sqlite with
> index(es) not to use with something like 'IGNORE
> INDEX fieldname' or something..
> 
> also, maybe it is good idea to put this in the
> documentation (including above trick), but since it
> is a wiki i can guess the answer on that ;)
> 
> regards,
> 
> rene
> 
>
-
> To unsubscribe, e-mail:
> [EMAIL PROTECTED]
> For additional commands, e-mail:
> [EMAIL PROTECTED]
> 
> 


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]