RE: [sqlite] FW: 2.8.14 performance, Indexed Table INSERTS
>From: D. Richard Hipp [mailto:[EMAIL PROTECTED] > -- snip -- >simplifies to O(logN) which is clearly less than O(N). >In that case, it pays to use the index. Which is my case I believe, thanks. It's been years (OMG, 16!) since I had an algorithms class. Is that log base 2, or does it matter? Note the index in the example had two fields: CREATE INDEX Articles_Index ON Articles (MessageID, DomainID); The messageid is almost unique, very rarely (every few years?) would it be duplicated. The domainid by itself will have many duplicate records, it is the name of the server that originated the article. Together they are unique. By the way, as a brute force experiment I tried an index with only one field of: CREATE INDEX Articles_Index ON Articles (MessageID); And, keeping N large, got a 30% speed improvement. I wonder why? Does this indicate one could trade some file size (DomainID is a key into another small table holding the domain string) for a speed improvement by concatenating the two strings and having only one field? Also back to one of my original questions: Can the table be kept in one file and the index be kept in another? The idea would be to localize the index and suspects it currently is spread throughout the database. TIA again.
RE: [sqlite] FW: 2.8.14 performance, Indexed Table INSERTS
At 5:05 PM -0700 6/21/04, Keith Herold wrote: > down the result set would make things faster..? Wouldn't the select here: CREATE TABLE tmp ( flag boolean, name text ); SELECT name FROM tmp WHERE flag = 1 AND name LIKE '%foo%'; run faster with an index on the flag column since it can scan just the flag = 1 rows instead of the full table? I think this is one of those big-O things, right? It might be faster, but only a bit faster, and not enough to justify the hassle of creating and maintaining the index. --Keith A lot of optimizing is context specific. Certain kinds of optimizations work well on some sets of data, but can make things worse with others. The index being a type of optimization. On some data sets, indexes can make things a little faster, and on others, orders of magnitude faster; the latter in particular is where you want to target them. The main time indexes are useful is when you only want to return a tiny fraction of the records in a table, in a typical select. For example, use an index when you want to fetch, say, 1 row out of a table with 1000. But if you want to fetch 200 rows out of the same table, then not having an index will be faster for fetches. And you save time by not creating the index too. When you use an index, the overhead of storing rows is definately greater, and the overhead of fetching is also greater. However, that overhead is only on the rows actually returned for a fetch, so you still have a huge savings with an index if you return just a few rows. But if you are selecting a lot of rows, such as the 200, then the overhead of finding them first in the index leads to more work being done than if the index was ignored and the same 200 were found with a simple table scan. One of the best places to have indexes is usually on columns where every value is unique, regardless of how many records you have in the table. But then, if you have a unique or primary key constraint on the column, then an index is created implicitely anyway, as it is used when enforcing the constraint. -- Darren Duncan - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: [sqlite] FW: 2.8.14 performance, Indexed Table INSERTS
> On Jun 20, 2004, at 9:07 PM, Darren Duncan wrote: > down the result set would make things faster..? Wouldn't the select > here: > >CREATE TABLE tmp ( flag boolean, name text ); > >SELECT name FROM tmp WHERE flag = 1 AND name LIKE '%foo%'; > > run faster with an index on the flag column since it can scan > just the > flag = 1 rows instead of the full table? I think this is one of those big-O things, right? It might be faster, but only a bit faster, and not enough to justify the hassle of creating and maintaining the index. --Keith - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [sqlite] FW: 2.8.14 performance, Indexed Table INSERTS
On Jun 20, 2004, at 9:07 PM, Darren Duncan wrote: Generally speaking, you should only use indexes on table columns that have a lot of distinct values, and each one only appears a few times. You should not use indexes on columns that have few distinct values and each appears many times; in the latter case, a full table scan would be faster. That's weird. I would have thought that having any index at all to pare down the result set would make things faster..? Wouldn't the select here: CREATE TABLE tmp ( flag boolean, name text ); SELECT name FROM tmp WHERE flag = 1 AND name LIKE '%foo%'; run faster with an index on the flag column since it can scan just the flag = 1 rows instead of the full table? -Dave - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [sqlite] SQLite 3.0 - Why only one text encoding per database?
At 5:23 PM -0500 6/21/04, Tim wrote: It sounds like SQLite 3.0 only supports one text representation per database ("Internally and in the disk file, the same text representation is used everywhere."). Is there a particular reason for this limitation? What if I want to store one column as UTF8 text and another column as UTF16? Other databases offer a separate 16-bit "ntext" column type for this purpose. There is no reason to support both UTF-8 and UTF-16 in the same file at the same time. They both have identical character repertoires, and identical code points for each character; they are completely interchangeable. They both represent every character that exists in any language. The only reasons for picking one over the other relates to memory/disk usage and portability. UTF-8 takes up much less disk space when you are dealing mainly in Roman characters, while UTF-16 tends to use less disk space when dealing mainly with non-Roman characters. UTF-8 also has an identical byte order on any platform, making it "network safe", and trivially portable. UTF-16 has separate versions for big-endian and little-endian architectures, plus some UTF-16 files have a byte-order-mark ("BOM") to say what we have, while others don't, making for a potential compatability problem in the latter case; UTF-16's main advantage is that it is sometimes faster to use. In any event, both UTF-8 and UTF-16 are variable-width encodings, though UTF-8 tends to vary more; their names say the *minimum* number of bits used by each character. Speaking practically, unless you can justify using UTF-16 for anything, then always use UTF-8 by default as doing so gives a lot more advantages. (But whichever of those you choose, you can convert later fairly easily.) The reason for making the entire file one encoding or the other is that it simplifies the core SQLite code, as well as other code which uses it, since the code doesn't have to constantly test for one type or the other. The situation you mention about other databases is a different matter. Their normal text columns are fixed-width 8-bit characters, which are not Unicode, and can only represent 255 unique characters. The "ntext" (short for "national varchar") column type is for when you need more than that many characters. And even a fixed-width 16-bit column won't fit all the characters that Unicode can represent. Unicode has a 21-bit code point range, plus a fixed-width UTF-32 encoding (the remaining 11 bits aren't used). The 8-bit and 16-bit non-Unicode encodings used by other databases are not fully interchangeable with each other. The fact that un-necessary options aren't provided by SQLite is part of what makes it "lite". I say don't change a thing. I could store anything with BLOBs, but then I'd lose the ability to use SORT. Simple. Don't use BLOBs for character data; that is what Unicode text is for. Storing only one type of text data also seems to go against the idea of SQLite's "manifest typing" where "datatype is associated with the data itself, not with its container." That is still true. But the "manifest typing" is kept relatively simple by the single-encoding description. The choices are simply: character, integer, decimal, binary, null. No reason to break down 'character' further. Here the datatype of particular piece of text is associated with not just a column type, but a fixed datatype defined for the entire database. SQLite should store knowledge about the text encoding of each value. I guess there would also need to be a way to differentiate between 8-bit and 16-bit string literals since SQLite doesn't have column types, and efficiently (without conversion) insert or query for 8-bit and 16-bit values in a single statement. What probably would work best is to pick the single UTF format that matches what the rest of your program uses internally; if your program uses multiple types, then either change it to standardize on one, or put conversion routines in the few spots that need it next to the database access code. If it were up to me (and for my stuff, it is), I would pick a single encoding, UTF-8, and use just it through my entire program. -- Darren Duncan - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [sqlite] SQLite 3.0 - Why only one text encoding per database?
Tim wrote: > It sounds like SQLite 3.0 only supports one text representation per database > ("Internally and in the disk file, the same text representation is used > everywhere."). Is there a particular reason for this limitation? The reason is to keep the API simple. If the programmer selects a single text representation for all data (which they are advised to do) then they only have to provide a single version for each user-defined function and user-defined collating sequence. Even if they do not stick to a single text representation, the most they'll have to code up is three versions of each user-defined function and collating sequence. If we were to allow mixed text representations in the database, 9 different versions of each collating function would be required, and up to 3**N versions of each user-defined function, where N is the number of parameters in the function. Of course, the programmer could be lazy and only provide one version of each function and collating sequence and SQLite would automatically convert text representations as needed. But constantly flipping text representations around is needlessly time consuming. > What if I want to store one column as UTF8 text and another column as UTF16? > Other databases offer a separate 16-bit "ntext" column type for this purpose. You can still store one column as UTF8 and the other as UTF16 using sqlite3_bind_text() and sqlite3_bind_text16(). It's just that one or the other or both will be converted into the native representation of the database before being written to disk. > I could store anything with BLOBs, but then I'd lose the ability to use SORT. > BLOBs sort in SQLite. They sort in memcmp() order. And BLOBs sort after text. > > Storing only one type of text data also seems to go against the idea of > SQLite's "manifest typing" where "datatype is associated with the data > itself, not with its container." > SQLite does not consider text representation part of the datatype. Apart from performance and database file size, the programmer has no way of knowing what text representation is being used internally. The fact that one of three different representations might be used internally shows through into the user-defined function and collating sequence interfaces in order to allow programmers to optimize for performance. But apart from that, you don't really know what encodings SQLite is using internally. SQLite might convert all strings to EBCDIC internally for all you know. The main interface would be the same. Other database engines claim to store UTF-16 in some fields and UTF-8 in others. But is that really what they are doing? Are you certain they are not converting the UTF-16 data into UTF-8 for storage then converting it back when you request the data? How can you tell? Another way to look at it: SQLite 2.8 made you choose between UTF-8 and ISO8859 at compile-time. SQLite 3.0 lets you choose between UTF-8, UTF-16BE, and UTF-16LE and it lets you defer the choice to runtime. But you still have to make the choice. -- D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565 - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
[sqlite] SQLite 3.0 - Why only one text encoding per database?
It sounds like SQLite 3.0 only supports one text representation per database ("Internally and in the disk file, the same text representation is used everywhere."). Is there a particular reason for this limitation? What if I want to store one column as UTF8 text and another column as UTF16? Other databases offer a separate 16-bit "ntext" column type for this purpose. I could store anything with BLOBs, but then I'd lose the ability to use SORT. Storing only one type of text data also seems to go against the idea of SQLite's "manifest typing" where "datatype is associated with the data itself, not with its container." Here the datatype of particular piece of text is associated with not just a column type, but a fixed datatype defined for the entire database. SQLite should store knowledge about the text encoding of each value. I guess there would also need to be a way to differentiate between 8-bit and 16-bit string literals since SQLite doesn't have column types, and efficiently (without conversion) insert or query for 8-bit and 16-bit values in a single statement. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
[sqlite] tclsqlite dll's
Hi having recently started playing with sqlite and I am very impressed. As my primary interest is to use it with Tcl, I'm wondering if a pre-compiled tclsqlite.dll can be made available for release 3.0 . I also notice that the tclsqlite.dll for 2.8.14 appears to have dissapeared from the download page (the tclsqlite-2.8.14.zip file doesn't contain a tclsqlite dll ) Thanks Lawrence --- Outgoing mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com). Version: 6.0.708 / Virus Database: 464 - Release Date: 18/06/04 - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [sqlite] sqlite 3.0.0 build issues
Doug Currie wrote: The TEMP_STORE features should have been fully implemented by check-in [1302] on 2004-Apr-23. Have you observed otherwise? 1. Searching for TEMP_STORE in the sources, I see it described in the comments in main.c but it does not appear to be used anywhere in the code. So, this compiler macro does nothing as far as I can tell. 2. There is support for pragma temp_store -- but even this doesn't seem to have any affect on operation... searching the sources I can't find any use of db->temp_store (perhaps I am not following the logic). Yep. Looks like it got unimplemented again with check-in [1325] on 2004-May-08. -- D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565 - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [sqlite] sqlite 3.0.0 build issues
Doug Currie wrote: > I have not modified the sqlite sources to implement the TEMP_STORE features. > The TEMP_STORE features should have been fully implemented by check-in [1302] on 2004-Apr-23. Have you observed otherwise? -- D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565 - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [sqlite] sqlite 3.0.0 build issues
Replying to my own message... I bit the bullet and installed autoconf under msys/mingw. Using this tool I updated configure.ac and Makefile.in to address some of the build issues identified below -- everything but the OMIT macros. Of course, I have not modified the sqlite sources to implement the TEMP_STORE features. The configure file ends up with a few unrelated changes (hey, it's autoconf, what would you expect?) but they look minor. If this causes anyone problems, I apologize in advance. Please let me know if you think I have broken something you depend upon. The new options are called --enable-threadsafe Support threadsafe operation --enable-tempstore Use an in-ram database for temporary tables (never,no,yes,always) E.g., ../sqlite/configure --enable-tempstore=no --enable-threadsafe e Sunday, June 20, 2004, 12:36:56 AM, I wrote: > There are several build options in the configure/make of sqlite3 that > are obsolete, and others that are missing. > ENCODING is obsolete since it is specified in the open calls now. > There are still several references to it, though, in the make and > autoconf files, and these should all be removed. > INMEMORYDB and SQLITE_OMIT_INMEMORYDB are obsolete since the red-black > tree implementation is now gone. Again, references to these should be > removed from the make and autoconf files. > INCOREFLAGS and in particular TEMP_STORE are in limbo. The values are > set but never used. E.g., INCOREFLAGS is set in Makefile but never > used there. TEMP_STORE is initialized in sqliteInt.h and documented in > several comments, but is never used in the code. > I suspect that the intention is that sqlite3BtreeFactory use > TEMP_STORE in certain circumstances, but presently it uses neither > TEMP_STORE nor the db->temp_store set by pragma. In fact, even the > flag BTREE_MEMORY which is set by sqlite3BtreeFactory when the > filename is empty is ignored by sqlite3BtreeOpen. > The THREADSAFE macro is used in all three os_xxx.c files, but is not > supported by the autoconf files (configure.ac and Makefile.in). It > would be really nice to add this when configure.ac is updated next. > It would also be nice to change configure.ac so that only one of > -DOS_WIN=1 or -DOS_UNIX=1 is specified in TARGET_CFLAGS. Only one is > necessary (the other will be set in os.h) and it would make the > console dribble from make much easier to read. > There are several SQLITE_OMIT_ macros that should also be supported by > the autoconf files. > Disclaimer: I don't use autoconf, but I'd be willing to help with the > source updates if somebody familiar with autoconf was willing to do > the configure (+ aclocal.m4 etc.) file generation and linux testing. > e - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
[sqlite] When to set 'PRAGMA full_column_names = OFF'?
Hello everybody, Should I set 'PRAGMA full_column_names = OFF' every time I execute a SELECT statement, or just once, right after I open the database? What is the lifetime of this setting? Thanks, -- Tito
Re: [sqlite] api 3, bind help
The argument is a "pointer to function returning void and taking void* as its argument" (like "free"); Here's a sample program: #include void func(void (*)(void*)); void freemystuff(void* in); int main(int argc, char* argv[]) { func(freemystuff); } void func(void (*callback)(void*) ) { callback(0); } void freemystuff(void* in) { std::cout << "freemystuff called " << in << std::endl; } --Steve On Jun 21, 2004, at 10:55 AM, Andy Colson wrote: So, I'm guessing the fifth arg is the eCopy param? My C isnt very strong, so void(*)(void*) is a pointer to a function like this: void* someFunc(void*), right? My guess is the params are: int sqlite3_bind_blob(sqlite3_stmt* statement, int paramNumber, const void* fieldValue, int fieldSize, callback); /*where callback can be one of the const's SQLITE_STATIC, SQLITE_TRANSIENT, or a pointer to a function that returns a void* and takes one void* as an arg. */
Re: [sqlite] ignore index
It would be even better if SQLite could do a better job at optimizing the index searches. __ Raymond Irving --- rene <[EMAIL PROTECTED]> wrote: > Hi, > > i ran into a simulair case as this message > http://www.mail-archive.com/[EMAIL PROTECTED]/msg01646.html > > where an index actually slows things down, because > there are many rows matching the > condition (where the other (indexed) condition only > matches a few rows). > > In above mentioned letter you suggest to just drop > the index, or concat the condition with an > (empty) string (like: instead of "checked=0" use > "checked||''='0' "). > > in my case, i need the index for another query.. > well, i could consider to drop it but the fix is > good enough for me.. i just like to mention that it > would be a really cool feature to tell sqlite with > index(es) not to use with something like 'IGNORE > INDEX fieldname' or something.. > > also, maybe it is good idea to put this in the > documentation (including above trick), but since it > is a wiki i can guess the answer on that ;) > > regards, > > rene > > - > To unsubscribe, e-mail: > [EMAIL PROTECTED] > For additional commands, e-mail: > [EMAIL PROTECTED] > > - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]