[sqlite] Why FTS3 has the limitations it does
The severe limitations on FTS3 seemed odd to me, but I figured I could live with them. Then I started finding that various queries were giving strange "out of context" errors with the MATCH operator, even though I was following all the documented rules. As a result I started looking deeply into what is going on with FTS3, and I found something that bothers me. These limitations are really completely arbitrary. They should be removable.

You can only use a single index to query a table; after that, everything else has to be done with a scan of the results. Fair enough. But with FTS3, the MATCH operator works ONLY when the match expression is selected for the index. This means that if a query could allow a row to be selected by either rowid or a MATCH expression, you can have a problem. If the rowid is selected for use as the index, the MATCH won't be used as the index, and you get errors. Similarly, a query with two MATCH expressions will only be able to use one as the index, so you get errors from the second.

Now, the reason this is arbitrary is that the MATCH expressions not used for the index are STILL given a chance to work at the function level. If a MATCH function were implemented, these limitations would disappear. Oddly enough, FTS3 doesn't expose a function for MATCH. Unfortunately, there is a good reason. It turns out that, as currently designed, testing an expression against a single known row requires a full table scan for every test. Inside my match function I would know the rowid (docid) for a record. While this lets me quickly look up the content, it is impossible to look up segments by document id, so checking the match on that row requires a lookup of all possible docids for the match expression, and a full scan of those results. Clearly this makes a function-level match utter nonsense.

My first question is: why was FTS designed like this in the first place?
Surely this was clear during the design stage, when the design could have been easily changed to accommodate the lookups required for a MATCH function. Is there some compelling performance benefit? Something I missed?

My second question is: can we expect this to change at some point? Just adding the MATCH function would eliminate virtually every remaining FTS limitation. All that is needed is the ability to look up by a combination of docid and term. Isn't a hash already built while creating a list of terms for storage? What if that hash were stored, indexed by docid?

For now I've modified my code to always index on the MATCH expression, if there is any. This at least eliminates the random errors, but does nothing wonderful for performance.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
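[Editor's sketch] The single-index restriction and the workaround described above can be seen from Python's sqlite3 module (the table name and terms are made up, and this assumes the linked SQLite was compiled with FTS3 enabled):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE docs USING fts3(body)")
con.executemany("INSERT INTO docs(body) VALUES (?)",
                [("sqlite full text search",),
                 ("full table scan",),
                 ("text search index",)])

# Two MATCH operators in one WHERE clause: only one can drive the index,
# and on most builds the other fails with an "unable to use function
# MATCH in the requested context" error.

# Workaround: fold everything into the single MATCH expression that will
# drive the index (terms separated by spaces are an implicit AND).
rows = con.execute(
    "SELECT body FROM docs WHERE docs MATCH 'full text'").fetchall()
```

This is the same shape as the fix described above: make sure the one available index probe answers the whole question, rather than hoping a second MATCH can run at the function level.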
Re: [sqlite] regular expression search
If you need more advanced matching (i.e. full regex, beyond what GLOB can do) you could implement a custom function. A regex search is always going to have to resort to a full table scan anyway, so it won't hurt performance.

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Simon Slavin
Sent: Thursday, October 15, 2009 6:00 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] regular expression search

On 15 Oct 2009, at 10:24pm, Farkas, Illes wrote:

> I have strings in a database and I would like to find all of them
> matching a pattern that is 5-10 characters long. In each position of
> the pattern up to three different characters may be allowed. This
> would be a typical regular expression that I'd like to find:
>
> A (B | C | D ) D ( A | D ) B B

First guess would be to use GLOB: http://www.sqlite.org/lang_corefunc.html#glob Either as an infix operator, or as a function: http://www.sqlite.org/lang_corefunc.html#glob I cannot find a page which gives SQLite examples using GLOB, but this page http://en.wikipedia.org/wiki/Glob_(programming) gives examples showing the use of square brackets, which appears to be what you want.

Simon.
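[Editor's sketch] John's custom-function suggestion, using Python's sqlite3 and the exact pattern from the thread. SQLite rewrites `x REGEXP y` as a call to a user-defined function `regexp(y, x)` (pattern first), which is undefined until the application supplies it:

```python
import re
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE strings (s TEXT)")
con.executemany("INSERT INTO strings VALUES (?)",
                [("ACDDBB",), ("ABDABB",), ("AXDDBB",)])

# Supply the regexp(pattern, value) function the REGEXP operator needs.
con.create_function(
    "regexp", 2,
    lambda pattern, value: re.fullmatch(pattern, value) is not None)

# The pattern from the thread: A (B|C|D) D (A|D) B B
rows = con.execute(
    "SELECT s FROM strings WHERE s REGEXP 'A[BCD]D[AD]BB'").fetchall()
```

For this particular pattern Simon's GLOB suggestion works natively as well (`WHERE s GLOB 'A[BCD]D[AD]BB'`), since GLOB supports square-bracket character classes; the custom function is only needed once the pattern outgrows GLOB.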
Re: [sqlite] Exception writing to database from multiple processes
I can empathize with this problem, having just worked through this recently. The bottom line is if you need concurrency, you're going to have to structure your code appropriately. Here are some things I found helpful:

1. ENCAPSULATE! You'll want to encapsulate your handling of queries so that you only have to wrap things for LOCKED and BUSY handling in one place.
2. Shared cache mode (sqlite3_enable_shared_cache()) has a better locking style for concurrency within a single process (it uses table-level locking). This is almost a necessity if you need concurrent access between threads.
3. If you can possibly get away with it, use "PRAGMA read_uncommitted = true". THIS WILL MAKE YOUR READS NON-ACID, but it greatly reduces contention. Generally speaking, a little care in your code should keep the non-ACID reads from being a problem.
4. An open VDBE (sqlite3_stmt*) in the middle of returning rows will hold a read lock on its table. While that read lock is open, other threads will be unable to write to that table. Beware of long, time-consuming loops that hold a read lock on a table another thread might want to write to.
5. Keep as much writing as possible in one thread (all of it, if you can manage).
6. If you have to break rule 5, try to make sure that the different threads use different tables.
7. If any thread has a long-running operation, make sure that it won't block any important tables for the whole time.
8. If you have to break rule 7, make it possible to detect when another thread has been blocked, and yield to that thread by committing the transaction and releasing any open cursors.
9. Every query needs to happen in a loop. This loop needs to check for LOCKED (and perhaps BUSY) conditions. Handle LOCKED using sqlite3_unlock_notify(). This loop is half the reason for encapsulating.

This isn't everything, but the list is long already, and that should get you past most of the hard stuff. Best of luck.
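[Editor's sketch] Rules 1 and 9 can be illustrated in Python. sqlite3_unlock_notify() isn't exposed here, so this simplified stand-in just sleeps and retries on lock errors; the function name and timings are made up:

```python
import os
import sqlite3
import tempfile
import threading
import time

def execute_with_retry(con, sql, params=(), attempts=200, delay=0.01):
    """Encapsulated query entry point: retry while another connection
    holds a conflicting lock (rule 1 + rule 9)."""
    for _ in range(attempts):
        try:
            return con.execute(sql, params)
        except sqlite3.OperationalError as e:
            if "locked" not in str(e):
                raise
            time.sleep(delay)
    raise sqlite3.OperationalError("gave up waiting for lock")

path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path, timeout=0)
other = sqlite3.connect(path, timeout=0, check_same_thread=False)
writer.execute("CREATE TABLE log (msg TEXT)")
writer.commit()

# Another connection takes the write lock, releasing it shortly after.
other.execute("BEGIN IMMEDIATE")
other.execute("INSERT INTO log VALUES ('first')")
threading.Timer(0.2, other.commit).start()

# The retry loop rides out the busy period instead of failing outright.
execute_with_retry(writer, "INSERT INTO log VALUES ('second')")
writer.commit()
```

The point is the shape, not the polling: with all queries funneled through one wrapper, swapping in sqlite3_unlock_notify()-style blocking later only touches one place.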
-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Pavel Ivanov
Sent: Thursday, October 15, 2009 10:57 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Exception writing to database from multiple processes

You're definitely talking about some bug in your application or some misunderstanding about how SQLite should work. SQLite by itself never causes any deadlocks. So I guess in order to be able to help you we need to know more about what you're doing. Maybe for example you're forgetting to commit/rollback a transaction somewhere, maybe you're dead-locking on your own mutexes. Are you able to look at the stack traces where your workers hang?

Pavel

On Thu, Oct 15, 2009 at 10:40 PM, David Carter wrote:
> Yes, that's correct. I also tried using BEGIN EXCLUSIVE instead of BEGIN
> IMMEDIATE. This results in only one worker process being able to write to
> the database, while the other worker processes continually get SQLITE_BUSY
> when trying to write.
>
> David
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Pavel Ivanov
> Sent: Thursday, 15 October 2009 9:53 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Exception writing to database from multiple processes
>
>> However, when it is run
>> inside an Application Pool with multiple worker processes, the database
>> soon becomes locked and cannot be written to by any of the worker
>> processes.
>
> You mean your application hangs? None of workers can write to database
> and nothing else happens in application? Nobody's reading database at
> the same time, nobody connecting to database via command line tool,
> nothing happens at all?
> > Pavel
> > On Thu, Oct 15, 2009 at 2:40 AM, David Carter wrote:
>> Hello,
>>
>> I am currently using the SQLite Amalgamation v3.6.19 from
>> http://www.sqlite.org/download.html in an ISAPI Extension to write out
>> usage statistics to an SQLite database.
>>
>> When the ISAPI extension is running inside an Application Pool with a
>> single worker process, everything works fine. However, when it is run
>> inside an Application Pool with multiple worker processes, the database
>> soon becomes locked and cannot be written to by any of the worker
>> processes.
>>
>> Each worker process has a separate background thread which writes to the
>> database every 5 seconds. Each write is performed as a single
>> transaction starting with "BEGIN IMMEDIATE".
>>
>> Any help you can provide would be greatly appreciated.
>>
>> Thanks,
>>
>> David
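[Editor's sketch] The SQLITE_BUSY behavior David reports can be reproduced in miniature with two connections to the same file (Python; `timeout=0` makes the busy condition surface immediately instead of being retried internally):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "stats.db")
a = sqlite3.connect(path, timeout=0)
b = sqlite3.connect(path, timeout=0)
a.execute("CREATE TABLE stats (hits INTEGER)")
a.commit()

a.execute("BEGIN IMMEDIATE")           # worker A takes the write lock
a.execute("INSERT INTO stats VALUES (1)")

busy = False
try:
    b.execute("BEGIN IMMEDIATE")       # worker B: "database is locked"
except sqlite3.OperationalError:
    busy = True                        # i.e. SQLITE_BUSY

a.commit()                             # A releases the lock ...
b.execute("BEGIN IMMEDIATE")           # ... and now B can write
b.execute("INSERT INTO stats VALUES (2)")
b.commit()
```

This is the expected SQLite behavior, not a hang: as long as every writer eventually commits and the blocked writer retries (or uses a busy timeout), both get through. A permanent lock-up points at a transaction that is never committed, as Pavel suggests.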
Re: [sqlite] Creating a spatial index for a large number of points- sorry for the text formating
I doubt SQLite is the right tool for this job, for a number of reasons.

First, if the data is as simple as you say, you are probably better off writing your logic as straight C, rather than SQL. SQLite is VERY fast, but there is still an incredible amount of overhead in executing a query, in any database. It will take many times more operations to do this with SQL than it would to do it without. The strength of SQL is in abstracting complex access to complex data. Simple lists of datapoints that only need a few simple transformations should probably be handled differently.

Second, SQLite has in-memory overhead based on the size of the database. 10TB worth of database would require a LOT of memory. I tried to find the numbers on this, but I couldn't. I think it was something like 1 byte per page, with each page 1KB by default. That would be 10GB of RAM required for your data, and that's assuming that SQLite stores it as efficiently as it is stored now (unlikely).

Third, SQLite has a 2TB limit, using the default page size. By increasing the page size you can raise that limit, but you are entering wild territory.

I'm a huge fan of SQLite, but I wouldn't use it for the job you described.

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Igor Conom
Sent: Saturday, October 17, 2009 10:10 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Creating a spatial index for a large number of points- sorry for the text formating

Sorry for the formatting - it looked better when I sent it from Yahoo's web interface.

- Original Message
From: Igor Conom
To: sqlite-users@sqlite.org
Sent: Sat, October 17, 2009 9:03:54 AM
Subject: [sqlite] Creating a spatial index for a large number of points

Hello everybody! I need to create a spatial index for a large number of points and I would really like to hear your comments. Thank you in advance for your time.
There are some very large files containing scientific data, stored in a very simple format (by very large I mean things in the range of 10 TB). One of these files is made of equal-length records, each record containing a fixed-length header and a vector of floating point numbers. For each record there are two spatial points associated: x1,y1 and x2,y2. There are a few hundred million such records.

What I need is to create an index such that it allows me to group all records with the same x1,y1 and do various operations (by same x1,y1 I mean points within some predefined range, i.e. two points can be considered the same if the distance between the two is less than some specified constant). The operations are:
1. for a given (x,y), quickly find the record group with the closest (x1,y1).
2. given a rectangle, find the list of (x1,y1) groups within that rectangle.

The index needs to be built in a reasonable amount of time (close to the time it takes to read through the original data). Preferably a single pass through the data should be required – but this is second to building an efficient index. Once created, the index can be used read-only, so one can take some extra time building it. I expect that the number of distinct (x1,y1) groups will be in the range of a few million, with maybe a few hundred records associated with each group. A record can be uniquely identified by a 32-bit integer (its position in the file(s)), so a group will be defined by (x1,y1) and a list of integers.

My initial plan is to create two tables: one storing big chunks of groups and one r-tree. The "big chunks" will be a list of many groups contained in a larger rectangular area, that I can further process in memory. I start with a big rectangle representing a boundary huge enough to contain every point. As I add a point, I use the r-tree to find a suitable rectangle for it, then add the point to its list.
If the list grows over a given number of points (or size in bytes), I split the rectangle in 4 parts, from the median point, so each part will contain a balanced number of points. Another thing: I can use integer coordinates: I know that relative to a good choice of origin and cell size, the areal extent will not contain more than 2^32 cells on each axis (from INT_MIN to INT_MAX). I presume the operations are faster with integers.

CREATE TABLE groups ( data BLOB);
CREATE VIRTUAL TABLE groups_rt USING rtree_i32 (Id INTEGER, minX INTEGER, maxX INTEGER, minY INTEGER, maxY INTEGER);

The groups_rt.Id will be groups.ROWID. The algorithm to create the index is described next:

for each input point:
  if it is the first point:
    insert the first entry to the groups table and insert a big rectangle in the r-tree (i.e. INT_MIN, INT_MAX, INT_MIN, INT_MAX)
  else:
    transform x1,y1 to integers: ix1, iy1
    SELECT groups.ROWID, data, minX,maxX,minY,maxY FROM groups, groups_rt WHERE
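[Editor's sketch] The split-on-overflow scheme described above, as a small in-memory stand-in for the groups/groups_rt tables (pure Python; the class, threshold, and names are made up):

```python
MAX_POINTS = 4  # stand-in for "a given number of points"

class Bucket:
    """A rectangle holding raw points; splits into 4 quadrants at the
    median point when it overflows, as described in the post."""
    def __init__(self, min_x, max_x, min_y, max_y):
        self.bounds = (min_x, max_x, min_y, max_y)
        self.points = []      # leaf payload
        self.children = None  # 4 sub-buckets after a split

    def insert(self, x, y):
        if self.children is not None:
            return self._child_for(x, y).insert(x, y)
        self.points.append((x, y))
        if len(self.points) > MAX_POINTS:
            self._split()

    def _split(self):
        if len(set(self.points)) == 1:
            return  # degenerate: identical points, keep one bucket
        xs = sorted(p[0] for p in self.points)
        ys = sorted(p[1] for p in self.points)
        self._mx, self._my = xs[len(xs) // 2], ys[len(ys) // 2]
        min_x, max_x, min_y, max_y = self.bounds
        mx, my = self._mx, self._my
        self.children = [Bucket(min_x, mx, min_y, my),
                         Bucket(mx, max_x, min_y, my),
                         Bucket(min_x, mx, my, max_y),
                         Bucket(mx, max_x, my, max_y)]
        for x, y in self.points:
            self._child_for(x, y).insert(x, y)
        self.points = []

    def _child_for(self, x, y):
        return self.children[(2 if y >= self._my else 0) +
                             (1 if x >= self._mx else 0)]

    def query(self, min_x, max_x, min_y, max_y):
        """All stored points inside the given rectangle."""
        if self.children is None:
            return [(x, y) for x, y in self.points
                    if min_x <= x <= max_x and min_y <= y <= max_y]
        out = []
        for c in self.children:
            out.extend(c.query(min_x, max_x, min_y, max_y))
        return out
```

In the real design each leaf would be a row in `groups` (a blob of group data) with its rectangle in `groups_rt`; the r-tree then plays the role of `_child_for` and `query`.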
Re: [sqlite] Why FTS3 has the limitations it does
Agreed, HUGE thanks for FTS. Hopefully my original post didn't come off as ungrateful. I was just confused by limitations that looked like they could have been removed during the initial design (at least more easily than they can now.) Scott's reply helps me understand this better, and perhaps gives some starting points for finding a solution.

The idea of using the tokenizer output and doing a direct match is intriguing. A full content scan is expensive (that is the point of indexing), but I guess this is usually less expensive than a full index scan for single rows (especially for large indexes), and would eliminate the current limitations.

As far as continued development, there is a "tracker FTS" branch available that appears to be active. See http://git.gnome.org/cgit/tracker/tree/src/libtracker-fts. It looks like there is also continued active development on it: http://git.gnome.org/cgit/tracker/log/?qt=grep=FTS. The tracker-fts code adds ranking and some other important functionality, but it is hard to separate from the rest of tracker. The tracker-fts files are public domain (SQLite license) but they have some dependencies on other parts of tracker that are not. Also, at least as of a few months ago, I think they were based on an earlier version of FTS3.

Supposing someone wanted to update FTS3, how would they get write access to the main code repository?

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of P Kishor
Sent: Friday, October 16, 2009 4:23 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Why FTS3 has the limitations it does

On Fri, Oct 16, 2009 at 3:12 PM, Scott Hess <sh...@google.com> wrote:
> On Wed, Oct 14, 2009 at 11:35 PM, John Crenshaw
> <johncrens...@priacta.com> wrote:
>> The severe limitations on FTS3 seemed odd to me, but I figured I could
>> live with them.
Then I starting finding that various queries were giving >> strange "out of context" errors with the MATCH operator, even though I >> was following all the documented rules. As a result I started looking >> deeply into what is going on with FTS3 and I found something that >> bothers me. >> >> These limitations are really completely arbitrary. They should be >> removable. > > fts is mostly the way it is because that was the amount that got done > before I lost the motivation to carry it further. The set of possible > improvements is vast, but they need a motivated party to carry them > forward. Some of the integration with SQLite is the way it is mostly > because it was decided to keep fts outside of SQLite core. Feel free > to dive in and improve it. > >> You can only use a single index to query a table, after that everything >> else has to be done with a scan of the results, fair enough. But with >> FTS3, the match operator works ONLY when the match expression is >> selected for the index. This means that if a query could allow a row to >> be selected by either rowid, or a MATCH expression, you can have a >> problem. If the rowid is selected for use as the index, the MATCH won't >> be used as the index, and you get errors. Similarly, a query with two >> MATCH expressions will only be able to use one as the index, so you get >> errors from the second. > > The MATCH code probes term->doclist, there is no facility for probing > by docid. At minimum the document will need to be tokenized. > Worst-case, you could tokenize it to an in-memory segment and probe > that, which would make good re-use of existing code. Most efficient > would be to somehow match directly against the tokenizer output (you > could look at the snippeting code for hints there). > >> My first question is, why was FTS designed like this in the first place? > > Because running MATCH against a subset of the table was not considered > an important use case when designing it? 
> >> Surely this was clear during the design stage, when the design could >> have been easily changed to accommodate the lookups required for a MATCH >> function. Is there some compelling performance benefit? Something I >> missed? > > "Easily" is all relative. There were plenty of hard problems to be > solved without looking around for a bunch of easy ones to tack on. > >> My second question is, can we expect this to change at some point? > > Probably not unless someone out there decides to. I got kind of > burned out on fts about a year back. With immense gratitude expressed here to Scott, I feel a bit disappointed that FTS has fallen out of the core, and out of "continued development and improvement." It is really a brilliant piece of work that makes sqlite eminently more usable for a number
Re: [sqlite] Infinity
SQLite stores the data however you give it. I'm not aware of any documentation requiring that 9e999 be considered infinity, nor any requiring that the command line treat invalid numbers as null. Most likely, treating NaN as null is simply a way for the command line to behave reasonably in an otherwise undefined situation.

Practically, 9e999 is beyond the "infinity" limit for doubles on whatever compiler was used to build the command line. I think this limit is technically arbitrary, so on some compilers, either now, or in the future, 9e999 could very possibly NOT be infinity. std::numeric_limits<double>::max() should be a standard (read "safe") way of getting the "infinity" value in C++.

In the Visual C++ 2005 compiler, the max double is 1.7976931348623158e+308. I'm not sure that this is constant however, so don't count on it.

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Michael Chen
Sent: Sunday, October 18, 2009 4:19 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Infinity

I am looking for the answer too. anybody know it?

On Sat, Oct 17, 2009 at 12:23 AM, Dan Bishop wrote:

> I've noticed that I can use IEEE Infinity values in SQLite by writing
> any literal too big for a double.
>
> sqlite> CREATE TABLE foo (x REAL);
> sqlite> INSERT INTO foo VALUES (9e999); -- +Inf
> sqlite> INSERT INTO foo VALUES (-9e999); -- -Inf
> sqlite> INSERT INTO foo VALUES (9e999 / 9e999); -- NaN: gets converted
> to NULL
> sqlite> .null NULL
> sqlite> select * FROM foo;
> Inf
> -Inf
> NULL
> sqlite> SELECT * FROM foo WHERE ABS(x) = 9e999;
> Inf
> -Inf
>
> Is it true on all platforms that 9e999 = Infinity and CAST(9e999 AS
> TEXT) = 'Inf'? What's the preferred SQL syntax for infinity?
--
Best regards,
Michael Chen
Google Voice Phone.: 847-448-0647
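[Editor's sketch] Dan's command-line session can be reproduced programmatically with Python's sqlite3: the literal 9e999 overflows to IEEE +Infinity when SQLite parses it, a NaN result comes back as NULL, and bound infinities round-trip as REAL values:

```python
import math
import sqlite3

con = sqlite3.connect(":memory:")

# A literal too large for a double parses as IEEE +Infinity.
(pos,) = con.execute("SELECT 9e999").fetchone()

# Inf/Inf is NaN, which SQLite surfaces as NULL.
(nan,) = con.execute("SELECT 9e999 / 9e999").fetchone()

# Bound parameters round-trip infinities as REAL values.
con.execute("CREATE TABLE foo (x REAL)")
con.execute("INSERT INTO foo VALUES (?)", (float("inf"),))
(stored,) = con.execute("SELECT x FROM foo").fetchone()
```

Since SQL has no infinity literal, binding the host language's infinity value (as above) avoids relying on the 9e999 overflow trick altogether.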
Re: [sqlite] Why FTS3 has the limitations it does
> Doing an fts index which can handle subset scans efficiently is going
> to be hard.

I noticed. After some thought, here's what I've come up with:

We'll call nT the number of terms and nD the number of docids in a given term. nTD is the number of rows in a natural join of terms and docids. The current runtime to look up a term for a match is O(log nT) (right?) The current best possible runtime to look up a docid within a term is O(log nT) + O(nD). This is fine if nD is small, but rapidly becomes a major problem as nD gets larger.

If we doubled the size of the index, adding an extra tree to index terms and docids together, we could get O(log nTD) for a lookup. This is the ideal runtime (assuming we don't redesign the world and use a hash), but requires extra code, and a lot of extra space.

On the other hand, we could add a tree inside each segment to index the doclist. The term would be looked up as normal at a cost of O(log nT). After that though, if the docid is known, it could be looked up at an additional cost of only O(log nD). The total cost O(log nT) + O(log nD) is only marginally worse than O(log nTD) (and only because nTD is the count of a natural join, rather than a true product of the counts.) The result is still pretty expensive for individual rows, but it is a whole lot better than it is now, and it avoids full scans.

This still doesn't offer direct access to the doc lists by docid (you still need terms) but that problem should be easy to solve once the term + docid case is handled separately, because only the docid needs to be indexed at that point. I think the right way to do this is to have the doclist point back to the term it belongs to. Then a list of doclists could be stored with the regular data for each row (it is known at that point, so requires no extra calculation.)

These changes still require a data format change, but worst case that means incrementing the version. Does anyone see a reason why this wouldn't work?
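[Editor's sketch] The cost model above, with sorted lists and binary search standing in for the proposed per-term docid tree (Python; the data and names are made up):

```python
import bisect

# term -> sorted doclist; the dict stands in for the on-disk term
# structure (a hash here, an O(log nT) b-tree probe in the real thing).
index = {
    "sqlite": [3, 17, 42, 99],
    "match":  [17, 256],
}

def doc_matches_term(term, docid):
    """Probe by term, then binary-search the doclist for the docid:
    O(log nD) per test instead of the O(nD) doclist scan that makes a
    function-level MATCH impractical today."""
    doclist = index.get(term)
    if doclist is None:
        return False
    i = bisect.bisect_left(doclist, docid)
    return i < len(doclist) and doclist[i] == docid
```

This is exactly the operation a MATCH *function* needs: given one known docid, decide cheaply whether it satisfies a term, without scanning every docid the term touches.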
John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Scott Hess Sent: Monday, October 19, 2009 12:51 PM To: General Discussion of SQLite Database Cc: punk...@eidesis.org Subject: Re: [sqlite] Why FTS3 has the limitations it does On Sat, Oct 17, 2009 at 1:25 PM, John Crenshaw <johncrens...@priacta.com> wrote: > Agreed, HUGE thanks for FTS. Hopefully my original post didn't > come off ungrateful. I was just confused by limitations that > looked like they could have been removed during the initial > design (at least more easily than they can now.) Scott's reply > helps me understand this better, and perhaps gives some > starting points for finding a solution. One of the things I found challenging about fts development was that being embedded w/in SQLite made some problems harder. You can't just spin up a book-keeping thread to do stuff in the background, and you can't easily expose a grungy API to let the client do it, either. Plus you have the issues of shipping a framework (such as not being able to arbitrarily change the file format on a whim, even if it's WRONG). This meant that in many cases I was a bit aggressive in pruning features up front, to scope things appropriately, and once committed to a file format some things just couldn't be added. > The idea of using the tokenizer output and doing a direct match > is intriguing. A full content scan is expensive (that is the > point of indexing,) but guess this is usually less expensive > than a full index scan for single rows (especially for large > indexes), and would eliminate the current limitations. Doing an fts index which can handle subset scans efficiently is going to be hard. Like a lot of systems fts3 uses segments to keep index updates manageable, but this means that you can't just do a single b-tree intersection, you have to look at multiple b-trees, so you'll end up hitting a greater fraction of the index footprint to do the query. 
You could get a CPU win by having the code at least not keep more of the doclist data than needed around. One thing I had been considering adding was some stats data so that you could easily determine the magnitude of the doclist for a term. In this case, if that info suggested that the index wasn't much bigger than the subset of interest, use the index, otherwise use a content scan.

> Supposing someone wanted to update FTS3, how would they get
> write access to the main code repository?

That's for the SQLite team (I've been pretty quiet on that front, lately, so will not speak for them).

-scott
Re: [sqlite] Infinity
You need #include <limits>. Other than that, I don't know. I code on Windows, not Mac, but the code looks right.

WARNING: min() != -infinity. For doubles, min is the smallest number greater than 0. -infinity == -max()

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Michael Chen
Sent: Monday, October 19, 2009 4:59 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Infinity

//Dear there, it seems that I cannot insert std::numeric_limits<double>::max(). I am on Mac Osx 10.5
//anybody can take a look at the code? I am new to sqlite3. comments on coding also welcome.

#include <stdio.h>
#include <stdlib.h>
#include <limits>
#include <sqlite3.h>
using namespace std;

int main(){
  double infinity = std::numeric_limits<double>::max();
  double ninfinity = std::numeric_limits<double>::min();
  sqlite3 *db;
  char *zErrMsg = 0;
  int rc;
  rc = sqlite3_open(":memory:", &db);
  if( rc ){
    fprintf(stderr, "Can't open database: %s\n", sqlite3_errmsg(db));
    sqlite3_close(db);
    exit(1);
  }
  sqlite3_stmt* create_tl;
  rc = sqlite3_prepare_v2(db, "create table tl (number real)", -1, &create_tl, NULL);
  printf("%i\n",rc);
  if ( rc != SQLITE_OK) exit(rc);
  rc = sqlite3_step(create_tl);
  if ( rc != SQLITE_DONE) exit(rc);
  sqlite3_stmt* insert_tl;
  rc = sqlite3_prepare_v2(db, "insert into tl values(:number)", -1, &insert_tl, NULL);
  if ( rc != SQLITE_OK) exit(rc);
  sqlite3_bind_double(insert_tl, 1, 1.1);
  rc = sqlite3_step(insert_tl);
  if (rc != SQLITE_DONE) exit(rc);
  sqlite3_bind_double(insert_tl, 1, infinity); //this line would fail, error code 21
  rc = sqlite3_step(insert_tl);
  if (rc != SQLITE_DONE) exit(rc);
  sqlite3_bind_double(insert_tl, 1, ninfinity);
  rc = sqlite3_step(insert_tl);
  if (rc != SQLITE_DONE) exit(rc);
  sqlite3_bind_double(insert_tl, 1, 3.3);
  rc = sqlite3_step(insert_tl);
  if (rc != SQLITE_DONE) exit(rc);
  sqlite3_stmt* select_tl;
  rc = sqlite3_prepare_v2(db, "select * from tl", -1, &select_tl, NULL);
  if (rc != SQLITE_OK) exit(rc);
  while ( sqlite3_step(select_tl) == SQLITE_ROW){
    printf("%e",sqlite3_column_double(select_tl,0));
    printf("\n");
  }
  if (rc != SQLITE_DONE) exit(rc);
  sqlite3_finalize(select_tl);
  sqlite3_close(db);
  printf("exit normally\n");
}

On Sun, Oct 18, 2009 at 5:58 PM, John Crenshaw <johncrens...@priacta.com> wrote:

> SQLite stores the data however you give it. I'm not aware of any
> documentation requiring that 9e999 be considered infinity, nor any
> requiring that the command line treat invalid numbers as null. Most
> likely, treating NaN as null is simply a way for the command line to
> behave reasonably in an otherwise undefined situation.
>
> Practically, 9e999 is beyond the "infinity" limit for doubles on
> whatever compiler was used to build the command line. I think this limit
> is technically arbitrary, so on some compilers, either now, or in the
> future, 9e999 could very possibly NOT be infinity.
> std::numeric_limits<double>::max() should be a standard (read "safe")
> way of getting the "infinity" value in C++.
>
> In the Visual C++ 2005 compiler, the max double is
> 1.7976931348623158e+308. I'm not sure that this is constant however, so
> don't count on it.
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Michael Chen
> Sent: Sunday, October 18, 2009 4:19 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Infinity
>
> I am looking for the answer too. anybody know it?
>
> On Sat, Oct 17, 2009 at 12:23 AM, Dan Bishop <danbisho...@gmail.com>
> wrote:
>
> > I've noticed that I can use IEEE Infinity values in SQLite by writing
> > any literal too big for a double.
> >
> > sqlite> CREATE TABLE foo (x REAL);
> > sqlite> INSERT INTO foo VALUES (9e999); -- +Inf
> > sqlite> INSERT INTO foo VALUES (-9e999); -- -Inf
> > sqlite> INSERT INTO foo VALUES (9e999 / 9e999); -- NaN: gets converted
> > to NULL
> > sqlite> .null NULL
> > sqlite> select * FROM foo;
> > Inf
> > -Inf
> > NULL
> > sqlite> SELECT * FROM foo WHERE ABS(x) = 9e999;
> > Inf
> > -Inf
> >
> > Is it true on all platforms that 9e999 = Infinity and CAST(9e999 AS
> > TEXT) = 'Inf'? What's the preferred SQL syntax for infinity?
>
> --
> Best regards,
> Michael Chen
> Google Voice Phone.: 847-448-0647
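[Editor's sketch] John's warning about min() can be checked directly; Python's sys.float_info mirrors C++'s numeric_limits<double>:

```python
import sys

# numeric_limits<double>::min() is the smallest POSITIVE normal double,
# not negative infinity; the most negative finite double is -max().
smallest_positive = sys.float_info.min
largest = sys.float_info.max

assert smallest_positive > 0                # min() is tiny but positive
assert -largest < -1e308                    # -max() is the true "bottom"
assert float("9e999") == float("inf")       # 9e999 overflows to +Inf
```

(An editor's guess, not from the thread: error code 21 is SQLITE_MISUSE, and the posted loop rebinds after sqlite3_step() without an intervening sqlite3_reset(), which would make the second bind fail regardless of the value being bound.)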
Re: [sqlite] Creating a spatial index for a large number of points-sorry for the text formating
8GB is workable. Make sure you use prepared statements to avoid recompiling your insert 500 million times. Also, with this much data, it would probably be a very good idea to compile SQLite with a much larger memory cache. Don't expect a miracle either. 500 million is a very large number, any way you look at it. SQLite is fast, but your queries will still take time.

With that in mind, I THINK this should work for you:

CREATE TABLE points ( x double, y double );
CREATE INDEX idx_points_x_by_y ON points (x, y);

Using this, you can do a fully indexed query for any bounding rect:

SELECT x, y FROM points WHERE (x BETWEEN ? AND ?) AND (y BETWEEN ? AND ?)

(I recommend using EXPLAIN to verify that this DOES use the index, but it should.) If you need more advanced spatial calculations, you can do something like this, to first take advantage of the index, and then do any additional calculations to filter the return:

SELECT x, y, MYSPATIALFUNC(x, y) AS foo FROM points WHERE (x BETWEEN ? AND ?) AND (y BETWEEN ? AND ?) AND foo < ?

Then just define a custom function for MYSPATIALFUNC.

If you need to select the nearest point, you can query a small region of space around that point, and compute the nearest point from the small set returned (use a custom function for maximum speed.) If the region is empty, enlarge it and try again. If the "nearest" point in the region is further than the nearest edge, enlarge the region enough to verify that it really IS the nearest point. Not a perfect process, but simple enough, and it works without writing a whole lot of extra code. You will need to evaluate the data to determine what the ideal "region" size is for determining nearest points. Too small and you'll have to re-query a lot; too large and you'll have to sift through a lot of data points.
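[Editor's sketch] The bounding-rect-plus-filter pattern above, in Python's sqlite3. MYSPATIALFUNC is John's placeholder; here a hypothetical squared-distance function stands in for it:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE points (x double, y double)")
con.execute("CREATE INDEX idx_points_x_by_y ON points (x, y)")
con.executemany("INSERT INTO points VALUES (?, ?)",
                [(1.0, 1.0), (2.0, 3.0), (10.0, 10.0)])

# Hypothetical spatial refinement: squared distance from (cx, cy).
con.create_function("dist2", 4,
                    lambda x, y, cx, cy: (x - cx) ** 2 + (y - cy) ** 2)

# Indexed range scan on the bounding rect, then the custom function
# filters the (small) candidate set.
rows = con.execute(
    """SELECT x, y FROM points
       WHERE (x BETWEEN ? AND ?) AND (y BETWEEN ? AND ?)
         AND dist2(x, y, ?, ?) < ?""",
    (0, 5, 0, 5, 0, 0, 9)).fetchall()
```

The index prunes on x (and then y) cheaply; the custom function only ever sees rows that survived the rectangle test, which is what keeps this workable at 500 million rows.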
Hope that helps,
John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Igor Conom
Sent: Saturday, October 17, 2009 4:56 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Creating a spatial index for a large number of points - sorry for the text formatting

John, thank you for the comments! Maybe I wasn't clear: the 10TB of data is separate. It contains a lot of other data that I don't dream of storing in a database. But this bulk data is structured in fixed-length records, each record containing a vector of floating point values and some associated meta-information. Part of this meta-info is a pair of points, (x1,y1) and (x2,y2). The number of records is in the range of a few hundred million (say 500 million), which is the number of coordinate pairs I need to handle. The coordinates are represented as 4-byte IEEE floating point values, so 500 million will take 500M*4*4 ~ 8GB. So I'll have to process about 8GB of points.

These points have common coordinates for (x1,y1). The number of groups (distinct (x1,y1)) is in the range of a few hundred thousand (say 1 million). Now, in this "index file" that I try to create, I'll have about 1 million entries, each entry somehow containing (x1,y1) and a list of integers, which are really record numbers into the original data set. So if nothing else is stored, the entire index file should be about 8GB in our case. From what I read, sqlite3 should handle fairly well a db a few GB large, containing a few million records. The reality is that my "groups" table will store big rectangles, each containing many groups (I'm thinking about 1000). So I expect that while the total size of the database stays the same (a few GB hopefully), the number of records in the r-tree and associated data table will be in the few-thousands range.

There are a few reasons sqlite is a candidate for this job:

1. This "index" that I try to build needs to be stored, since it will be used multiple times for further processing, so any in-memory structure that I may use needs to be stored on disk at some point, in some form.

2. The process of indexing may take some time (a few hours to one day, as you read through 10TB of data). If the process gets interrupted somehow, I need to be able to restart it gracefully. Here the journaling feature of sqlite comes in very handy: e.g. I process the input in chunks of a few tens of MB of points, and each chunk begins a transaction; if something happens, all the tables involved remain in a consistent state, and I can just restart with the last chunk processed.

3. The database is a single self-contained file, which makes it easy to integrate into a bigger project.

4. I used it before for something else :)

Points number 1 and 2 are important. So, unless I implement my own storage system (think journal, think page management, etc.), I need to find something else to build my app on top of. The other candidates I considered were HDF5, and... well, I heard Firebird can be used in serverless mode. Any other ideas are welcome :)
Re: [sqlite] Infinity
Sorry, I think I gave you slightly buggy instructions. I just realized that max() should be the largest true value capable of being stored, which should be less than the infinity value. std::numeric_limits provides another function named infinity() for getting positive infinity. I believe this value will be different than max(). Sorry for the mistake.

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Michael Chen
Sent: Monday, October 19, 2009 11:40 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Infinity

Thanks John. After incorporating a few changes, the code compiles and runs. The result seems reasonable: the input "infinity" std::numeric_limits<double>::max() is sent to and retrieved from a sqlite3 database correctly.

-- terminal output --

$ sqlite3 tempdb
sqlite> select * from tl;
1.1
1.79769313486232e+308
-1.79769313486232e+308
3.3

-- source code --

#include <cstdio>
#include <cstdlib>
#include <limits>
#include <sqlite3.h>
using namespace std;

int main(){
    double infinity = std::numeric_limits<double>::max();
    double ninfinity = -std::numeric_limits<double>::max();
    sqlite3 *db;
    int rc;

    rc = sqlite3_open("tempdb", &db);
    if( rc ){
        fprintf(stderr, "Can't open database: %s\n", sqlite3_errmsg(db));
        sqlite3_close(db);
        exit(1);
    }

    sqlite3_stmt* create_tl;
    rc = sqlite3_prepare_v2(db, "create table tl (number real)", -1, &create_tl, NULL);
    if (rc != SQLITE_OK) exit(rc);
    rc = sqlite3_step(create_tl);
    if (rc != SQLITE_DONE) exit(rc);

    sqlite3_stmt* insert_tl;
    rc = sqlite3_prepare_v2(db, "insert into tl values(:number)", -1, &insert_tl, NULL);
    if (rc != SQLITE_OK) exit(rc);

    sqlite3_bind_double(insert_tl, 1, 1.1);
    rc = sqlite3_step(insert_tl);
    if (rc != SQLITE_DONE) exit(rc);
    sqlite3_reset(insert_tl);

    sqlite3_bind_double(insert_tl, 1, infinity);
    rc = sqlite3_step(insert_tl);
    if (rc != SQLITE_DONE) exit(rc);
    sqlite3_reset(insert_tl);

    sqlite3_bind_double(insert_tl, 1, ninfinity);
    rc = sqlite3_step(insert_tl);
    if (rc != SQLITE_DONE) exit(rc);
    sqlite3_reset(insert_tl);

    sqlite3_bind_double(insert_tl, 1, 3.3);
    rc = sqlite3_step(insert_tl);
    if (rc != SQLITE_DONE) exit(rc);

    sqlite3_stmt* select_tl;
    rc = sqlite3_prepare_v2(db, "select * from tl", -1, &select_tl, NULL);
    if (rc != SQLITE_OK) exit(rc);
    while (sqlite3_step(select_tl) == SQLITE_ROW){
        printf("%e\n", sqlite3_column_double(select_tl, 0));
    }
    sqlite3_finalize(select_tl);
    sqlite3_close(db);
    printf("exit normally\n");
}

--
Best regards,
Michael Chen
Google Voice Phone.: 847-448-0647
Re: [sqlite] index for a group by
> Nothing in this process can be sped up.

Actually, that isn't entirely true. While it always requires a full data scan, some things can make the indexing part of the process faster. Since indexing is done basically using a comparative sort, anything that would speed up the sort will speed up the indexing.

1. It is faster to sort 1000 data points than to insert 1000 data points into a constantly sorted list. Creating the index after all inserts is faster than creating the index, then inserting.
2. If possible, avoid indexes on long data strings, since the compares can be time consuming.
3. If you have a field that stores one of several strings (as an "enum"), consider using integers instead. Integers have lower overhead, and can be compared (and sorted) more quickly than strings.
4. If you are feeling really gutsy, you could mod the code and implement a radix sort or something similar for integer values. I'm not really recommending this, just saying: inserts and lookups in a radix index are faster than a btree.
5. Make sure the memory cache is large enough for the sort. Writing data to disk is very costly compared to sorting in memory. The default is 2000 pages (2MB) worth of btree data. If you are about to build an index that will require more btree than that, increase the size, or split across several transactions.

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Pavel Ivanov
Sent: Tuesday, October 20, 2009 7:35 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] index for a group by

> please could you let me know which index could be better or faster?

For this particular query an index on (t,a,b,c) or even on (t,a,b,c,d) would be better and cause the query to execute faster (of course if by conditions t>x1 and t<x2 you select only part of the table).

> also do you know by chance how to speed up the index creation?

There's no way to do that. SQLite has to scan the whole table, read data from all rows and put the necessary information into the index. Nothing in this process can be sped up.

Pavel

On Mon, Oct 19, 2009 at 5:50 PM, Sylvain Pointeau <sylvain.point...@gmail.com> wrote:
> hello,
> I have a table T (a,b,c,d,t)
> where c is a value
> a,b,c some dimensions
> and t the time
>
> I need to make a subset with a "group by"
> like
>
> select a,b,c,sum(d)
> from T
> where t>x1 and t<x2
> group by a,b,c
>
> I created an index on a,b,c
> but this table is large and the index creation is time consuming (few hours)
>
> please could you let me know which index could be better or faster?
> also do you know by chance how to speed up the index creation?
>
> Best regards,
> Sylvain
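John's tip #1 (bulk-insert first, then build the index) and Pavel's suggested (t,a,b,c) index can be sketched together. A small illustration using Python's bundled sqlite3 module; the table contents and range bounds are made up:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE T (a INT, b INT, c INT, d INT, t INT)")

# Bulk-insert first, in a single transaction...
with con:
    con.executemany("INSERT INTO T VALUES (?, ?, ?, ?, ?)",
                    [(i % 3, i % 5, i % 7, i, i) for i in range(1000)])

# ...then build the index, leading with t so the range filter can use it.
con.execute("CREATE INDEX idx_t_abc ON T (t, a, b, c)")

plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT a, b, c, SUM(d) FROM T "
    "WHERE t > 10 AND t < 120 GROUP BY a, b, c").fetchall()
print(plan)  # the detail column should mention idx_t_abc for the range scan

groups = con.execute(
    "SELECT a, b, c, SUM(d) FROM T "
    "WHERE t > 10 AND t < 120 GROUP BY a, b, c").fetchall()
```

The 109 rows with 10 < t < 120 cover every (a, b, c) residue combination, so the query returns 3 * 5 * 7 = 105 groups.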
Re: [sqlite] PRAGMA scope
database = the physical file itself
database connection = a specific reference to the database, obtained using sqlite3_open

page_size has to be set before the database is created (though I'm NOT entirely sure how you would execute the pragma before creating the database, since executing the pragma should require a connection, and a connection should require a database. Hmm...)

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Doug
Sent: Tuesday, October 20, 2009 2:28 PM
To: 'General Discussion of SQLite Database'
Subject: [sqlite] PRAGMA scope

I'm reading about the different PRAGMA operations. Cache_size mentions that it is per-database-connection. Page_size says it is per database (must be used before the database is created), which sounds like per-connection (i.e. if I create two databases, I'm guessing I need to set the page_size after calling sqlite3_open each time).

Temp_store and synchronous don't make any mention of files or connections. Can/should it be assumed that they are global to the SQLite library?

Thanks
Doug
Re: [sqlite] PRAGMA scope
Makes sense. I figured the master pages would have still been created at this point, but I suppose that is simple enough to deal with.

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of P Kishor
Sent: Tuesday, October 20, 2009 2:37 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] PRAGMA scope

On Tue, Oct 20, 2009 at 1:33 PM, John Crenshaw <johncrens...@priacta.com> wrote:
> database = the physical file itself
> database connection = a specific reference to the database, obtained
> using sqlite3_open
>
> page_size has to be set before the database is created (though I'm NOT
> entirely sure how you would execute the pragma before creating the
> database, since executing the pragma should require a connection, and a
> connection should require a database. Hmm...)

before any table is created...

> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Doug
> Sent: Tuesday, October 20, 2009 2:28 PM
> To: 'General Discussion of SQLite Database'
> Subject: [sqlite] PRAGMA scope
>
> I'm reading about the different PRAGMA operations. Cache_size mentions
> that it is per-database connection. Page_size says it is per database
> (must be used before the database is created), which sounds like
> per-connection (ie if I create two databases, I'm guessing I need to set
> the page_size after calling sqlite3_open each time).
>
> Temp_store and synchronous don't make any mention of files or
> connections. Can/should it be assumed that they are global to the SQLite
> library?
>
> Thanks
> Doug

--
Puneet Kishor http://www.punkish.org
Carbon Model http://carbonmodel.org
Charter Member, Open Source Geospatial Foundation http://www.osgeo.org
Science Commons Fellow, http://sciencecommons.org/about/whoweare/kishor
Nelson Institute, UW-Madison http://www.nelson.wisc.edu
---
Assertions are politics; backing up assertions with evidence is science
===
Sent from Madison, WI, United States
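The "before any table is created" answer can be demonstrated directly: the connection exists, but until the first page is written the page size can still be chosen. A small sketch using Python's bundled sqlite3 module (the 8192 value is an arbitrary non-default choice):

```python
import os
import sqlite3
import tempfile

# A brand-new database file: the connection is open, but no page has
# been written yet, so the page size can still be changed.
path = os.path.join(tempfile.mkdtemp(), "fresh.db")
con = sqlite3.connect(path)
con.execute("PRAGMA page_size = 8192")

# Creating the first table writes the first pages and fixes the size.
con.execute("CREATE TABLE t (x)")
con.commit()

size = con.execute("PRAGMA page_size").fetchone()[0]
print(size)  # 8192
```

Run against an existing (non-empty) database, the same PRAGMA would be silently ignored, which is what "per database, must be used before the database is created" means.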
Re: [sqlite] commit time
Sounds like a great candidate for shared cache with PRAGMA read_uncommitted = true. If other threads may also need a write lock on that table, you should handle SQLITE_LOCKED by incrementing a waiter count and calling sqlite3_unlock_notify. The thread doing the inserting can check to see if anybody is waiting (blocked) and yield by committing the current transaction and waiting for the blocked thread to unblock. Be aware, you should also close any open cursors before yielding, because open cursors will prevent write locks and you'll waste time yielding for nothing.

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent
Sent: Tuesday, October 20, 2009 8:05 PM
To: sqlite-users@sqlite.org
Subject: [sqlite] commit time

i have a simple join table containing two ids from two other tables. i have an index on each of the ids in the join table.

CREATE TABLE ContentWordItem (word_id INT, item_id INT);
CREATE INDEX idx_ContentWordItem_word ON ContentWordItem(word_id); -- index to perform fast queries by word_id
CREATE INDEX idx_ContentWordItem_item ON ContentWordItem(item_id); -- index to perform fast deletes by item_id

i have a large amount of data to insert into this join table on a slow embedded device. i need to avoid locking this join table for more than a second or two at a time so that i can make queries on this table. so here's the question: how do i insert small chunks of data into this table w/o taking a hit each time i commit?

what i'm doing is:

* read a chunk of data from flat data file into vector of id pairs
* begin transaction
* loop thru vector of id pairs binding and inserting
* commit transaction
* repeat until data is exhausted

i'm seeing that the reading, binding, and inserting is very fast (300 ms) but the commit is taking upwards of 3 seconds. when i increase my chunk size by a factor of 10 the insert doesn't appear to take 10x longer but the commit still takes upwards of 3 seconds. the point is that the commit hit appears to be much greater than the insert hit but doesn't appear to scale directly.

it appears that the commit is updating the indexes and taking a long time. is this a correct evaluation? it also appears that the commit takes longer as the size of the table grows (i.e. the index is getting bigger). is this expected?

what i'm worried about is that by reducing the chunk size (to avoid locking the db for a long time) i add a significant amount of time to the insert process because the commits are costing several seconds. however, locking the db for a long time is not desirable. i'm also concerned about the commit time increasing over time as the amount of data in the table increases.

is there a better approach?

thanks
tom

__
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email
__
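Tom's chunked-insert loop can be sketched end to end. A scaled-down, hypothetical illustration using Python's bundled sqlite3 module (the chunk size and data are made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ContentWordItem (word_id INT, item_id INT)")
con.execute("CREATE INDEX idx_ContentWordItem_word ON ContentWordItem(word_id)")
con.execute("CREATE INDEX idx_ContentWordItem_item ON ContentWordItem(item_id)")

# Stand-in for the id pairs read from the flat data file.
pairs = [(i % 100, i) for i in range(10_000)]

CHUNK = 1000
for start in range(0, len(pairs), CHUNK):
    # One transaction (and therefore one commit) per chunk; the
    # context manager commits on exit.
    with con:
        con.executemany("INSERT INTO ContentWordItem VALUES (?, ?)",
                        pairs[start:start + CHUNK])

count = con.execute("SELECT COUNT(*) FROM ContentWordItem").fetchone()[0]
print(count)  # 10000
```

Committing per chunk bounds how long the table stays write-locked, at the price of paying the journal-sync cost of a commit once per chunk, which is the trade-off the thread is discussing.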
Re: [sqlite] index for a group by
Actually, I thought exactly what you said when I saw the question. When I saw your answer, though, I realized I'd been wrong; there are ways I could slow indexing down, and therefore, ways to speed it up.

Splitting across transactions is about the insertion of data, not the creation of the index. This is for the case where you can't insert first and create the index later (maybe you're inserting a lot of data into a table that already has data, for example.) The recommendation in this case is to wrap the whole batch of inserts in a transaction, but to commit the transaction at regular intervals, breaking the process into multiple pieces so that you don't spill over the memory cache. SUPPOSEDLY this positively impacts indexing performance, but I've not personally tested that claim. The more significant impact in this case is actually the individual transactions you avoid, which makes a huge difference.

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Pavel Ivanov
Sent: Tuesday, October 20, 2009 2:47 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] index for a group by

I want to notice, John, that my words were in the context of "I have a table with a lot of data, I want to create a particular index on it, how can I do it quickly?". In this context only your 5th bullet is applicable; I admit I'd forgotten about that. And I don't understand how one can split the creation of an index across several transactions.

Pavel

On Tue, Oct 20, 2009 at 2:15 PM, John Crenshaw <johncrens...@priacta.com> wrote:
>> Nothing in this process can be sped up.
>
> Actually, that isn't entirely true. While it always requires a full data
> scan, some things can make the indexing part of the process faster.
> Since indexing is done basically using a comparative sort, anything that
> would speed up the sort, will speed up the indexing.
> 1. It is faster to sort 1000 data points, than to insert 1000 data points
> into a constantly sorted list. Creating the index after all inserts is
> faster than creating the index, then inserting.
> 2. If possible, avoid indexes on long data strings, since the compares
> can be time consuming.
> 3. If you have a field that stores one of several strings (as an "enum")
> consider using integers instead. Integers have lower overhead, and can
> be compared (and sorted) more quickly than strings.
> 4. If you are feeling really gutsy, you could mod the code and implement
> a radix sort or something similar for integer values. I'm not really
> recommending this, just saying, inserts and lookups in a radix index are
> faster than a btree.
> 5. Make sure the memory cache is large enough for the sort. Writing data
> to disk is very costly, compared to sorting in memory. Default is 2000
> pages (2MB) worth of btree data. If you are about to build an index that
> will require more btree than that, increase the size, or split across
> several transactions.
>
> John
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Pavel Ivanov
> Sent: Tuesday, October 20, 2009 7:35 AM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] index for a group by
>
>> please could you let me know which index could be better or faster?
>
> For this particular query an index on (t,a,b,c) or even on (t,a,b,c,d)
> would be better and cause the query to execute faster (of course if by
> conditions t>x1 and t<x2 you select only part of the table).
>
>> also do you know by chance how to speed up the index creation?
>
> There's no way to do that. SQLite has to scan the whole table, read
> data from all rows and put the necessary information into the index.
> Nothing in this process can be sped up.
>
> Pavel
>
> On Mon, Oct 19, 2009 at 5:50 PM, Sylvain Pointeau
> <sylvain.point...@gmail.com> wrote:
>> hello,
>> I have a table T (a,b,c,d,t)
>> where c is a value
>> a,b,c some dimensions
>> and t the time
>>
>> I need to make a subset with a "group by"
>> like
>>
>> select a,b,c,sum(d)
>> from T
>> where t>x1 and t<x2
>> group by a,b,c
>>
>> I created an index on a,b,c
>> but this table is large and the index creation is time consuming (few hours)
>>
>> please could you let me know which index could be better or faster?
>> also do you know by chance how to speed up the index creation?
>>
>> Best regards,
>> Sylvain
Re: [sqlite] Problem about write data into the DB
It is likely that the file can't be opened for a very predictable reason. For example, perhaps the specified path doesn't exist. (A common variation of this would be a hard-coded string with single backslashes; most languages require you to escape backslashes in strings.) Perhaps the file is read-only, already locked by another process, or has insufficient permissions. Most likely, this is going to be one of the regular reasons for failing to open a file.

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Rich Shepard
Sent: Wednesday, October 21, 2009 11:52 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Problem about write data into the DB

On Wed, 21 Oct 2009, ?? wrote:

> I deployed a django app on my laptop, the whole environment is like this:
> the OS is UBUNTU904, the web server is Apache, and the database is
> sqlite3. The deployment is a success, but when I try to write some data into
> the database, I get an HTTP 500 error. And I check the error log; it
> shows "*OperationalError: unable to open database file*". What does this
> error mean? Are there some operation permissions that need configuring?

I'd look at the django code to see where it opens the database and what happens to inform the user if that attempt fails. I know nothing about django so I cannot suggest where you should look.

Rich
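The "predictable reason" diagnosis is easy to reproduce; a minimal sketch using Python's bundled sqlite3 module, with a deliberately nonexistent directory in the path:

```python
import sqlite3

# SQLite will not create missing parent directories, so this fails
# with the same message the Django error log showed.
try:
    sqlite3.connect("/no_such_directory_hopefully/app.db")
    msg = ""
except sqlite3.OperationalError as e:
    msg = str(e)
print(msg)  # "unable to open database file"
```

In the Apache/Django case above, a frequent culprit is that the web server's user lacks write permission on the database file or on its containing directory; SQLite needs to create its journal file beside the database, so write access to the directory matters too.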
Re: [sqlite] commit time
Yes, you have to call sqlite3_enable_shared_cache before opening any database connections, then execute "PRAGMA read_uncommitted = true;" on each connection. Blocking can still happen in some situations, but you can handle it as I described in my original reply.

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent
Sent: Wednesday, October 21, 2009 12:05 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] commit time

it sounds like this is the feature you recommend using:

"A database connection in read-uncommitted mode _does not attempt to obtain read-locks before reading_ from database tables as described above. This can lead to inconsistent query results if another database connection modifies a table while it is being read, but it also means that a read-transaction opened by a connection in read-uncommitted mode can neither block nor be blocked by any other connection."

this is precisely what i need. thanks very much.

____
From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] On Behalf Of John Crenshaw [johncrens...@priacta.com]
Sent: Tuesday, October 20, 2009 7:18 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] commit time

Sounds like a great candidate for shared cache with PRAGMA read_uncommitted = true. If other threads may also need a write lock on that table, you should handle SQLITE_LOCKED by incrementing a waiter count and calling sqlite3_unlock_notify. The thread doing the inserting can check to see if anybody is waiting (blocked) and yield by committing the current transaction and waiting for the blocked thread to unblock. Be aware, you should also close any open cursors before yielding, because open cursors will prevent write locks and you'll waste time yielding for nothing.

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent
Sent: Tuesday, October 20, 2009 8:05 PM
To: sqlite-users@sqlite.org
Subject: [sqlite] commit time

i have a simple join table containing two ids from two other tables. i have an index on each of the ids in the join table.

CREATE TABLE ContentWordItem (word_id INT, item_id INT);
CREATE INDEX idx_ContentWordItem_word ON ContentWordItem(word_id); -- index to perform fast queries by word_id
CREATE INDEX idx_ContentWordItem_item ON ContentWordItem(item_id); -- index to perform fast deletes by item_id

i have a large amount of data to insert into this join table on a slow embedded device. i need to avoid locking this join table for more than a second or two at a time so that i can make queries on this table. so here's the question: how do i insert small chunks of data into this table w/o taking a hit each time i commit?

what i'm doing is:

* read a chunk of data from flat data file into vector of id pairs
* begin transaction
* loop thru vector of id pairs binding and inserting
* commit transaction
* repeat until data is exhausted

i'm seeing that the reading, binding, and inserting is very fast (300 ms) but the commit is taking upwards of 3 seconds. when i increase my chunk size by a factor of 10 the insert doesn't appear to take 10x longer but the commit still takes upwards of 3 seconds. the point is that the commit hit appears to be much greater than the insert hit but doesn't appear to scale directly.

it appears that the commit is updating the indexes and taking a long time. is this a correct evaluation? it also appears that the commit takes longer as the size of the table grows (i.e. the index is getting bigger). is this expected?

what i'm worried about is that by reducing the chunk size (to avoid locking the db for a long time) i add a significant amount of time to the insert process because the commits are costing several seconds. however, locking the db for a long time is not desirable. i'm also concerned about the commit time increasing over time as the amount of data in the table increases.

is there a better approach?

thanks
tom
Re: [sqlite] commit time
My understanding is that the shared cache allows table-level locking for multiple threads in a single process, and can do so efficiently because the threads all share the same memory space, but if multiple processes attempt to access the database, they will each use the original (full database lock) methods for concurrency. Therefore, if my understanding is correct, the "elsewhere" is the location that describes the normal database-level locking.

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent
Sent: Wednesday, October 21, 2009 12:03 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] commit time

reading up on shared cache mode and found this:

"The locking protocol used to arbitrate between multiple shared-caches or regular database users is described _elsewhere_."

where is this described?

____
From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] On Behalf Of John Crenshaw [johncrens...@priacta.com]
Sent: Tuesday, October 20, 2009 7:18 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] commit time

Sounds like a great candidate for shared cache with PRAGMA read_uncommitted = true. If other threads may also need a write lock on that table, you should handle SQLITE_LOCKED by incrementing a waiter count and calling sqlite3_unlock_notify. The thread doing the inserting can check to see if anybody is waiting (blocked) and yield by committing the current transaction and waiting for the blocked thread to unblock. Be aware, you should also close any open cursors before yielding, because open cursors will prevent write locks and you'll waste time yielding for nothing.

John

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent
Sent: Tuesday, October 20, 2009 8:05 PM
To: sqlite-users@sqlite.org
Subject: [sqlite] commit time

i have a simple join table containing two ids from two other tables. i have an index on each of the ids in the join table.

CREATE TABLE ContentWordItem (word_id INT, item_id INT);
CREATE INDEX idx_ContentWordItem_word ON ContentWordItem(word_id); -- index to perform fast queries by word_id
CREATE INDEX idx_ContentWordItem_item ON ContentWordItem(item_id); -- index to perform fast deletes by item_id

i have a large amount of data to insert into this join table on a slow embedded device. i need to avoid locking this join table for more than a second or two at a time so that i can make queries on this table. so here's the question: how do i insert small chunks of data into this table w/o taking a hit each time i commit?

what i'm doing is:

* read a chunk of data from flat data file into vector of id pairs
* begin transaction
* loop thru vector of id pairs binding and inserting
* commit transaction
* repeat until data is exhausted

i'm seeing that the reading, binding, and inserting is very fast (300 ms) but the commit is taking upwards of 3 seconds. when i increase my chunk size by a factor of 10 the insert doesn't appear to take 10x longer but the commit still takes upwards of 3 seconds. the point is that the commit hit appears to be much greater than the insert hit but doesn't appear to scale directly.

it appears that the commit is updating the indexes and taking a long time. is this a correct evaluation? it also appears that the commit takes longer as the size of the table grows (i.e. the index is getting bigger). is this expected?

what i'm worried about is that by reducing the chunk size (to avoid locking the db for a long time) i add a significant amount of time to the insert process because the commits are costing several seconds. however, locking the db for a long time is not desirable. i'm also concerned about the commit time increasing over time as the amount of data in the table increases.

is there a better approach?

thanks
tom
Re: [sqlite] Slow SELECTs in application
Someone correct me if I'm wrong, but I don't think that UNIQUE (id_norm,word,wform) is going to have the desired result. Won't that create a single three-column unique index? I suspect this table needs a separate index for each. Just put the UNIQUE keyword (with no arguments) after the type on each column.

Secondly, I think the problem is AUTOINCREMENT. SQLite will alias id to the rowid (the internal autoincrementing id) if and only if the type is "INTEGER PRIMARY KEY". It is very picky about this. AUTOINCREMENT is probably messing it up (and is meaningless in SQLite anyway). If that doesn't help, we probably need to know the error.

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Unabashed
Sent: Wednesday, October 21, 2009 9:47 AM
To: sqlite-users@sqlite.org
Subject: [sqlite] Slow SELECTs in application

Hello! I'm using SQLite as the DB in my application. My problem consists of two aspects. First, it works great, but on large data SELECTs are very slow (10-20s!). Queries are in one transaction. My table structure is:

CREATE TABLE mgWords (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  id_norm INTEGER,
  word TEXT,
  wform TEXT)

It seems that the problem is in the sql. How do I optimize the table to have fast selects?

The second part of my trouble is in using unique values. When I try to change the structure to

CREATE TABLE mgWords (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  id_norm INTEGER,
  word TEXT,
  wform TEXT,
  UNIQUE (id_norm,word,wform))

and use

INSERT INTO mgWords (id_norm,word,wform) VALUES (0,'aword','awform')

it causes an error. I'm not good at sql, so I'll be very glad to receive a professional answer. Sorry, please, for my English - this is not my native language. Thank you for reading it :)

--
View this message in context: http://www.nabble.com/Slow-SELECTs-in-application-tp25992880p25992880.html
Sent from the SQLite mailing list archive at Nabble.com.
Re: [sqlite] Slow SELECTs in application
It isn't just speed. That is probably the cause of the insert error. A PRIMARY KEY column is implied UNIQUE and NOT NULL, but the insert doesn't specify a value for id. Since it isn't aliased to rowid (and therefore doesn't autoincrement) it would raise an error. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Keith Roberts Sent: Wednesday, October 21, 2009 12:36 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] Slow SELECTs in application On Wed, 21 Oct 2009, Unabashed wrote: > To: sqlite-users@sqlite.org > From: Unabashed> Subject: [sqlite] Slow SELECTs in application > > > Hello! > I'm using SQLite as DB in my application. My problem consists of two > aspects. First, it works great, but on large data SELECTs are very slow > (10-20s!). Queries are in one transaction. My table structure is: > CREATE TABLE mgWords ( > id INTEGER PRIMARY KEY AUTOINCREMENT, > id_norm INTEGER, > word TEXT, > wform TEXT) You could try removing the AUTOINCREMENT constraint to speed things up a little. All you need is: id INTEGER PRIMARY KEY, That will allow you to reference the rowid using the 'id' identifier. They both refer to the same thing. Please see: http://www.sqlite.org/autoinc.html for all the details. Kind Regards, Keith Roberts
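The aliasing itself is easy to verify. A minimal sketch, using Python's stdlib sqlite3 module (table and data invented): with a plain INTEGER PRIMARY KEY column, omitting id on insert still auto-assigns it, and id and rowid name the same column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, word TEXT)")

# id is an alias for the internal rowid, so omitting it on insert
# still auto-assigns the next integer key.
conn.execute("INSERT INTO t (word) VALUES ('first')")
conn.execute("INSERT INTO t (word) VALUES ('second')")

# Selecting both id and rowid shows they are the same column.
rows = conn.execute("SELECT id, rowid, word FROM t ORDER BY id").fetchall()
print(rows)
```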
Re: [sqlite] commit time
Read sort of does happen without a lock, but write requires a lock, and SQLite can't grab a write lock if another connection has open cursors (notwithstanding the fact that they technically don't have a lock.) It's complicated. Just trust me. You won't get that write lock while cursors are open, so you'll still have to handle SQLITE_LOCKED or pay the consequences. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent Sent: Wednesday, October 21, 2009 6:21 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] commit time if thread 1 opens a read cursor in read uncommitted mode it can block a write lock? i thought the read happens w/o a lock? -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Crenshaw Sent: Wednesday, October 21, 2009 12:03 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] commit time Good, a single write thread saves you all the hassle involved with yielding. Unfortunately, even without multiple writers blocking is still possible. If thread 1 opens a cursor, and thread 2 tries to write before that cursor has been closed, it will return SQLITE_LOCKED. Since any read query will return a cursor, there is always a possibility for blocking, and you need to handle SQLITE_LOCKED. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent Sent: Wednesday, October 21, 2009 2:09 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] commit time very good. i don't anticipate multiple writers so this should be pretty simple. 
-Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Crenshaw Sent: Wednesday, October 21, 2009 9:15 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] commit time Yes, you have to call sqlite3_enable_shared_cache before opening any database connections, then execute "PRAGMA read_uncommitted = true;" on each connection. Blocking can still happen in some situations, but you can handle it as I described in my original reply. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent Sent: Wednesday, October 21, 2009 12:05 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] commit time it sounds like this is the feature you recommend using: "A database connection in read-uncommitted mode _does not attempt to obtain read-locks before reading_ from database tables as described above. This can lead to inconsistent query results if another database connection modifies a table while it is being read, but it also means that a read-transaction opened by a connection in read-uncommitted mode can neither block nor be blocked by any other connection." this is precisely what i need. thanks very much. From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] On Behalf Of John Crenshaw [johncrens...@priacta.com] Sent: Tuesday, October 20, 2009 7:18 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] commit time Sounds like a great candidate for shared cache with PRAGMA read_uncommitted = true. If other threads may also need a write lock on that table, you should handle SQLITE_LOCKED by incrementing a waiter count and calling sqlite3_unlock_notify. The thread doing the inserting can check to see if anybody is waiting (blocked) and yield by committing the current transaction and waiting for the blocked thread to unblock. 
Be aware, you should also close any open cursors before yielding, because open cursors will prevent write locks and you'll waste time yielding for nothing. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent Sent: Tuesday, October 20, 2009 8:05 PM To: sqlite-users@sqlite.org Subject: [sqlite] commit time i have a simple join table containing two ids from two other tables. i have an index on each of the ids in the join table. CREATE TABLE ContentWordItem (word_id INT, item_id INT); CREATE INDEX idx_ContentWordItem_word ON ContentWordItem(word_id); // index to perform fast queries by word_id CREATE INDEX idx_ContentWordItem_item ON ContentWordItem(item_id); // index to perform fast deletes by item_id i have a large amount of data to insert into this join table on a slow embedded device. i need to avoid locking this join table for more than a second or two at a time so that i can make queries on this table. so here's the question: how do i insert small chunks of data into this table w/o taking a hit each time i commit? what i'm doing is: * read a chunk of data from flat data file into vector of id pairs *
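One common shape for the chunked insert, sketched here with Python's stdlib sqlite3 module (chunk size and data are invented stand-ins), is to batch the rows into many short transactions so each commit holds the write lock only briefly:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ContentWordItem (word_id INT, item_id INT)")
conn.execute("CREATE INDEX idx_ContentWordItem_word ON ContentWordItem(word_id)")
conn.execute("CREATE INDEX idx_ContentWordItem_item ON ContentWordItem(item_id)")

# Stand-in for the id pairs read from the flat data file.
pairs = [(w, i) for w in range(100) for i in range(10)]
CHUNK = 250  # tune so each transaction stays under the latency budget

for start in range(0, len(pairs), CHUNK):
    with conn:  # one short transaction per chunk; commits on exit
        conn.executemany(
            "INSERT INTO ContentWordItem (word_id, item_id) VALUES (?, ?)",
            pairs[start:start + CHUNK])

print(conn.execute("SELECT COUNT(*) FROM ContentWordItem").fetchone()[0])
```

Each commit still pays the index-update and fsync cost, so this trades total throughput for shorter lock hold times; the chunk size is the knob.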
Re: [sqlite] Inner Join Performance Issue
Try to EXPLAIN the query and verify that the index is actually used. There are a lot of reasons why this query would probably NOT be using the index. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Ralf Sent: Wednesday, October 21, 2009 5:50 PM To: 'General Discussion of SQLite Database' Subject: [sqlite] Inner Join Performance Issue Hello Forum, [>> ] I have a select that joins 15 Tables the where clause consist of 8 like relations (all fields are indexed), this is to implement a sort of "search engine". The performance is awful. It takes around 10sec. Is this how it should be or is there anything I can do? If you need more infos pls. let me know Thx Ralf ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] commit time
An open cursor will block. I've watched it. It was a major problem, and I spent many many hours stepping through SQLite before I finally figured it out. Once I carefully closed out cursors, the problem went away. (In my case I had a long running write process trying to commit a transaction so it could yield to another connection in a separate thread that wanted to write. If cursors were open on a table, the other connection would refuse to grab a write lock on that table, even though the transaction was committed and there were no open writers.) I don't remember where for sure (may have been in sqlite3BtreeBeginTrans) SQLite specifically checks for open cursors. The write lock doesn't stop you from reading, but an open cursor DOES stop you from writing. You have to check for SQLITE_LOCKED, no way around it. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Dan Kennedy Sent: Thursday, October 22, 2009 12:06 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] commit time On Oct 22, 2009, at 5:21 AM, Tom Broadbent wrote: > if thread 1 opens a read cursor in read uncommitted mode it can > block a write lock? i thought the read happens w/o a lock? If using read-uncommitted mode, a reader thread will not block a writer thread that is using the same shared-cache. Except, it does block a writer from modifying the database schema. Dan. > > -Original Message- > From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org > ] On Behalf Of John Crenshaw > Sent: Wednesday, October 21, 2009 12:03 PM > To: General Discussion of SQLite Database > Subject: Re: [sqlite] commit time > > Good, a single write thread saves you all the hassle involved with > yielding. Unfortunately, even without multiple writers blocking is > still > possible. If thread 1 opens a cursor, and thread 2 tries to write > before > that cursor has been closed, it will return SQLITE_LOCKED. 
Re: [sqlite] commit time
Dangerous and disturbing this puzzle is. Only a bug could have locked those connections. If I discover anything useful I'll report it separately (no need to hijack this topic for that.) John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Dan Kennedy Sent: Thursday, October 22, 2009 1:27 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] commit time On Oct 22, 2009, at 11:37 AM, John Crenshaw wrote: > An open cursor will block. I've watched it. It was a major problem, > and > I spent many many hours stepping through SQLite before I finally > figured > it out. Once I carefully closed out cursors, the problem went away. > (In > my case I had a long running write process trying to commit a > transaction so it could yield to another connection in a separate > thread > that wanted to write. If cursors were open on a table, the other > connection would refuse to grab a write lock on that table, even > though > the transaction was committed and there were no open writers.) > > I don't remember where for sure (may have been in > sqlite3BtreeBeginTrans) SQLite specifically checks for open cursors. > > The write lock doesn't stop you from reading, but an open cursor DOES > stop you from writing. You have to check for SQLITE_LOCKED, no way > around it. I don't understand the situation described in the first paragraph. But the statement above is at least not universally true. Tcl test cases "shared-[01].3.11" and "shared-[01].3.1.2" (see around line 229 of the test/shared.test file in the source distribution) are examples of one connection writing to a table while a second connection is scanning through the same table using an open cursor. In this case it is the "second connection" is operating in read_uncommitted mode. Dan. 
Re: [sqlite] Inner Join Performance Issue
An example from my own data:

explain query plan select * from categories where cat_name = ?

order | from | detail
------+------+-------
0     | 0    | TABLE categories WITH INDEX sqlite_autoindex_categories_1

You can tell it uses the index because it says so. If you omit the "query plan" part you will get a longer, more detailed result. If it has Idx* opcodes you know it is using an index to look up a record. I also get the same result (WITH INDEX) for:

explain query plan select * from categories where cat_name LIKE "foo%"
explain query plan select * from categories where cat_name LIKE "foo_"
explain query plan select * from categories where cat_name LIKE "foo_%"
explain query plan select * from categories where cat_name LIKE "foo_bar_%"

Oddly enough, the following does NOT use the index:

explain query plan select * from categories where cat_name LIKE "foo"

I'm sure this is a bug in the optimizer, because this query is index capable. Also, be aware that the following can never use the index:

explain query plan select * from categories where cat_name LIKE ?

The reason it can't use the index with a bound parameter is that the statement is compiled in advance, and has to select a plan that will work for any input. Since the bound parameter might not be index capable, it has to compile a query that uses a slower plan. If the optimizer problem for strings without wildcards is a performance concern, you can watch for this and build your query differently depending on whether the string has wildcards or not. BEWARE: to take advantage of the index with user supplied strings you will have to build your query as a string rather than a prepared statement. MAKE SURE YOU SANITIZE YOUR INPUTS! John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Ralf Sent: Thursday, October 22, 2009 4:55 PM To: 'General Discussion of SQLite Database' Subject: Re: [sqlite] Inner Join Performance Issue I just ran EXPLAIN, how can I tell if the Indexes are used?
I just read that with an operator "like '%a%'" SQLite won't use an index. Is this the case? Thanks Ralf
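Both cases can be checked directly with EXPLAIN QUERY PLAN. A minimal sketch, using Python's stdlib sqlite3 module (schema and index name invented); note that on current SQLite the LIKE optimization also requires a case-sensitive LIKE, or an index built with COLLATE NOCASE:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categories (cat_name TEXT, notes TEXT)")
conn.execute("CREATE INDEX idx_cat_name ON categories(cat_name)")
# LIKE is case-insensitive by default, which defeats the optimization
# against a plain (BINARY-collated) index.
conn.execute("PRAGMA case_sensitive_like = ON")

def plan(sql):
    # Concatenate the "detail" column of the EXPLAIN QUERY PLAN rows.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Literal prefix: SQLite rewrites LIKE 'foo%' as a range search on the index.
print(plan("SELECT * FROM categories WHERE cat_name LIKE 'foo%'"))
# Leading wildcard: no usable prefix, so a full table scan.
print(plan("SELECT * FROM categories WHERE cat_name LIKE '%foo%'"))
```

So the answer to Ralf's question is yes: '%a%' has no literal prefix, and no index can help it.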
Re: [sqlite] Grammar of "X is Y"
ROFL -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Griggs, Donald Sent: Tuesday, October 27, 2009 12:31 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] Grammar of "X is Y" Importance: Low Depends on what your definition of "is" is. (Sorry, non-English speakers. This is a tiny joke based in American political history)
Re: [sqlite] Slow SELECTs in application
Yeah, you don't just need an index on all columns. You need the right indexes with the right combination of columns. Also, use prepared statements. If you don't use prepared statements SQLite will have to recompile those queries at every execution, which can take some time. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Simon Slavin Sent: Tuesday, October 27, 2009 6:56 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Slow SELECTs in application On 27 Oct 2009, at 8:30am, Unabashed wrote: > I have two general types of selects for this table. First one is > "SELECT > word,wform,id_norm FROM mgWords WHERE id=" > and second is > "SELECT > id FROM mgWords WHERE (word='') AND (wform='') > AND > (id_norm=)". > So I tried to add indexes to all fields in my table: > CREATE INDEX id_norm ON mgWords (id_norm) > CREATE INDEX word ON mgWords (word) > CREATE INDEX wform ON mgWords (wform) Someone needs to explain to people what SQL indexes are for. For your first SELECT, CREATE INDEX id ON mgWords (id) For your second SELECT CREATE INDEX id ON mgWords (word,wform,id_norm,id) Simon.
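Simon's second suggestion is effectively a covering index, and because id aliases the rowid it doesn't even need to be listed: an index on the three WHERE columns can answer the whole query without touching the table. A minimal sketch, using Python's stdlib sqlite3 module (index name and values invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE mgWords (
    id INTEGER PRIMARY KEY, id_norm INTEGER, word TEXT, wform TEXT)""")
# One index tailored to the WHERE clause. Every index entry already
# carries the rowid, and id IS the rowid, so the SELECT id query can
# be answered from the index alone.
conn.execute("CREATE INDEX idx_lookup ON mgWords (word, wform, id_norm)")

plan = " ".join(row[-1] for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM mgWords "
    "WHERE word = 'w' AND wform = 'f' AND id_norm = 0"))
print(plan)
```

The plan should report a COVERING INDEX, which is the fast path the original poster wants for the second query shape.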
Re: [sqlite] Idea for improving page cache
Supposing that the reduced cache misses are worth it, I think it would be better to simply allocate the nodes from a pool. Allocating from a pool maximizes locality and prevents the overhead involved in each allocation. Since the nodes have static size, pool allocation is easy. This doesn't save the size of the pointers, but let's face it, a couple of pointers doesn't add up to much here. Pool allocation also doesn't impose any of the additional limitations that ULL would (for example, migration from list to btree would still be easy). John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Pavel Ivanov Sent: Tuesday, October 27, 2009 1:38 PM To: kennethinbox-sql...@yahoo.com; General Discussion of SQLite Database Subject: Re: [sqlite] Idea for improving page cache Are you sure that there will be improvement with ULL? If you're talking about improving due to CPU internal cache then first of all you have to store in the list pointers to pages, not pages themselves (you don't want to store several pages in one chunk of memory, do you?). So you're getting one more pointer dereference every time you go to the list. Then you have to store additional information in the page to remember where in the list pointer to this page is stored. And each time list nodes are split or combined you have to change this information in each page. And now the main argument: ULL is good when you want to save memory overhead (which is very questionable in case of page cache) and good in getting elements by index and traversal of the whole list. Last two operations are never executed in SQLite. So looking at all this I don't see how performance can be improved (for me it seems that it's quite the opposite). Did I overlook something? Pavel On Tue, Oct 27, 2009 at 1:07 PM, Kenwrote: > Hi All, > > I have an idea that could improve the page cache performance. 
> > Instead of using a regular linked list to connect pages that are on the cache > use an "unrolled linked list". On some architectures due to the CPU caching > the ULL is about 40 times faster. > > Still this is mostly insignificant to the speed of disk i/o but every bit > helps... > > Just an idea, not sure if its been considered, feasible or even worthwhile. > > Ken
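For what it's worth, the free-list mechanics of the fixed-size node pool being proposed can be sketched in a few lines. This is illustrative only; the cache-locality argument applies to a C implementation, and all names here are invented:

```python
# Fixed-size node pool with a free list. All nodes come from one batch
# allocated up front, so in C they would sit in contiguous memory.
class Node:
    __slots__ = ("value", "next")

class NodePool:
    def __init__(self, capacity):
        self._nodes = [Node() for _ in range(capacity)]
        self._free = self._nodes[:]  # free list of available nodes

    def alloc(self, value):
        node = self._free.pop()      # O(1), no per-node allocation
        node.value, node.next = value, None
        return node

    def free(self, node):
        self._free.append(node)      # return the node for reuse

pool = NodePool(1024)
a = pool.alloc("page 1")
pool.free(a)
b = pool.alloc("page 2")
print(a is b)  # the freed node was handed back out
```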
Re: [sqlite] Idea for improving page cache
"advantage" kind of depends. ULL is more specialized. You gain some benefit, but also lose some as well. For example, consider what is involved in doing a sorted insert into an ULL. On the other hand, you can get all of the same locality benefit with a pool allocation scheme. You don't reduce the pointer overhead this way, but that isn't really much of an issue. ULL requires less time to walk all nodes, but since you can't afford to keep ULL sorted, there aren't a lot of scenarios where that saves anything. In any case, a btree is better for sorted data, and a btree is well suited for pool allocation, not unrolling. Linked list nodes are an especially good candidate for pool allocation. They have uniform size, are generally used in large groups, and are large enough to support a linked list of deleted nodes. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Kristoffer Danielsson Sent: Tuesday, October 27, 2009 3:50 PM To: sqlite-users@sqlite.org Subject: Re: [sqlite] Idea for improving page cache I really like the concept of ULL. Check this one out: http://blogs.msdn.com/devdev/archive/2005/08/22/454887.aspx Don't know if would be of any use for SQLite, but it does indeed provide an advantage compared to regular linked lists. > Date: Tue, 27 Oct 2009 14:59:36 -0400 > From: johncrens...@priacta.com > To: sqlite-users@sqlite.org > Subject: Re: [sqlite] Idea for improving page cache > > Supposing that the reduced cache misses are worth it, I think it would be > better to simply allocate the nodes from a pool. Allocating from a pool > maximizes locality and prevents the overhead involved in each allocation. > Since the nodes have static size, pool allocation is easy. This doesn't save > the size of the pointers, but let's face it, a couple of pointers doesn't add > up to much here. 
Pool allocation also doesn't impose any of the additional > limitations that ULL would (for example, migration from list to btree would > still be easy). > > John > > -Original Message- > From: sqlite-users-boun...@sqlite.org > [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Pavel Ivanov > Sent: Tuesday, October 27, 2009 1:38 PM > To: kennethinbox-sql...@yahoo.com; General Discussion of SQLite Database > Subject: Re: [sqlite] Idea for improving page cache > > Are you sure that there will be improvement with ULL? > If you're talking about improving due to CPU internal cache then first > of all you have to store in the list pointers to pages, not pages > themselves (you don't want to store several pages in one chunk of > memory, do you?). So you're getting one more pointer dereference every > time you go to the list. Then you have to store additional information > in the page to remember where in the list pointer to this page is > stored. And each time list nodes are split or combined you have to > change this information in each page. > And now the main argument: ULL is good when you want to save memory > overhead (which is very questionable in case of page cache) and good > in getting elements by index and traversal of the whole list. Last two > operations are never executed in SQLite. > So looking at all this I don't see how performance can be improved > (for me it seems that it's quite the opposite). Did I overlook > something? > > Pavel > > On Tue, Oct 27, 2009 at 1:07 PM, Kenwrote: > > Hi All, > > > > I have an idea that could improve the page cache performance. > > > > Instead of using a regular linked list to connect pages that are on the > > cache use an "unrolled linked list". On some architectures due to the CPU > > caching the ULL is about 40 times faster. > > > > Still this is mostly insignificant to the speed of disk i/o but every bit > > helps... > > > > Just an idea, not sure if its been considered, feasible or even worthwhile. 
> > > > Ken > > ___ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users _ Hitta hetaste singlarna på MSN Dejting! http://dejting.se.msn.com/channel/index.aspx?trackingid=1002952 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org
Re: [sqlite] Grammar of "X is Y"
> There's nothing wrong with that. Not unless style counts for something. X is Y looks far too much like X as Y for my taste. I'd rather do a little extra typing to have clear logic than to have clearly unclear code like that. My first thought when I saw this was "doesn't he mean AS?" If I saw something like this in my code I'd expect some major comments to clarify why the code really does mean IS and not AS. Just my two cents. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Darren Duncan Sent: Tuesday, October 27, 2009 7:08 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] Grammar of "X is Y" Jay A. Kreibich wrote: > On Tue, Oct 27, 2009 at 04:15:57PM +, Tom Sillence scratched on the wall: >> because I really want to write neat queries like: >> >> select col1 is col2 from table > > Are you sure? You just want a result set of true/false values? There's nothing wrong with that. Booleans are values like anything else, and one should be able to store them as field values and return them in rowsets. And very useful in practice, when one considers all the facts one might want to store that are commonly expressed as true/false, such as in a users table column named "may_login" or "is_moderated". -- Darren Duncan
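The query under discussion is easy to try. A minimal sketch, using Python's stdlib sqlite3 module (table and data invented), comparing IS with = across NULLs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1, col2)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(1, 1), (1, 2), (None, None), (None, 1)])

# IS treats NULL as an ordinary comparable marker: NULL IS NULL is true
# (1), while NULL = NULL is NULL. Everywhere else the two operators agree.
rows = conn.execute("SELECT col1 IS col2, col1 = col2 FROM t").fetchall()
print(rows)
```

So "select col1 is col2 from table" really does produce a clean true/false column with no NULL holes, which is presumably why Tom wants it despite the IS/AS readability complaint.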
Re: [sqlite] Idea for improving page cache
PM To: General Discussion of SQLite Database Subject: Re: [sqlite] Idea for improving page cache On Tue, Oct 27, 2009 at 04:28:11PM -0400, John Crenshaw wrote: > "advantage" kind of depends. ULL is more specialized. You gain some > benefit, but also lose some as well. For example, consider what is > involved in doing a sorted insert into an ULL. On the other hand, you > can get all of the same locality benefit with a pool allocation > scheme. You don't reduce the pointer overhead this way, but that isn't > really much of an issue. I don't know that you get the same locality benefit with pools: with a ULL there's fewer prev/next pointers to take up valuable cache space. The need to "pullup"/"breakup" a ULL at times is annoying, and to minimize you probably have to waste some space in order to amortize the cost. As you say, there's a trade-off. Many optimizations result in more complex, error-prone, brittle software. Wasting space is not much of a problem for cache locality, and if you keep the wasted space well under 50% you're ahead of plain lists in terms of memory footprint. So, ignoring code complexity, ULL seems like a likely win-win performance-wise. Even code complexity-wise, ULLs allow random access to be much faster than with plain linked lists, which means you're more likely to use random access, which means you probably win code complexity-wise too. Of course, once you start leaving room for additions in each chunk, ULLs start resembling B-trees, which, I think, is your point: might as well go whole-hog. > ULL requires less time to walk all nodes, but since you can't afford > to keep ULL sorted, there aren't a lot of scenarios where that saves I don't think it's true that you can't afford to keep it sorted, but you probably have to waste space if you want it sorted and allowing most random inserts to not require spills. 
Also, just because a chunk is full, or even all chunks are full, does not mean you must move lots of memory around: you can just insert a new chunk and move some entries from the neighboring ones into it, thus creating some breathing space for further additions (at the cost of space waste). Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
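For readers unfamiliar with the structure being debated: an unrolled linked list (ULL) packs several elements into each node, trading some wasted slack space for fewer next-pointers and better cache locality. A minimal illustrative sketch, in Python for brevity (the names and the chunk capacity are made up; the split-in-half strategy mirrors the "breathing space" idea above):

```python
CAPACITY = 8  # elements per chunk; a real implementation would size this to cache lines

class Node:
    def __init__(self):
        self.items = []   # up to CAPACITY elements stored contiguously
        self.next = None  # one pointer per chunk, not per element

class UnrolledList:
    def __init__(self):
        self.head = Node()

    def append(self, value):
        # walk to the last chunk
        node = self.head
        while node.next is not None:
            node = node.next
        if len(node.items) == CAPACITY:
            # chunk full: split by moving half the items into a new chunk,
            # leaving breathing space in both (at the cost of space waste)
            new = Node()
            new.items = node.items[CAPACITY // 2:]
            node.items = node.items[:CAPACITY // 2]
            node.next = new
            node = new
        node.items.append(value)

    def __iter__(self):
        node = self.head
        while node is not None:
            yield from node.items
            node = node.next

ul = UnrolledList()
for i in range(10):
    ul.append(i)
print(list(ul))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

A sorted insert would additionally need to locate the right chunk and possibly split it mid-list, which is where the "pullup"/"breakup" annoyance mentioned above comes in.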
Re: [sqlite] Grammar of "X is Y"
Yeah, I tend to agree that null != null is confusing, however that is the way it is "supposed" to behave, so changing that would break a lot of code. If I had my way, and the behavior of NULL in operations COULD be changed, I would opt for the following:

NULL = NULL -> true
As a Boolean, NULL -> false
(NULL = anything other than NULL) -> NULL
(NULL {+,-,*,/,>,<} x) -> NULL
(NULL {AND,OR} x) -> x

The result would be that expressions with "null" are simply ignored, which as far as I can tell is nearly always the goal when you have to add "IS NULL" tests to expressions. Of course, a change like this would break TONS of code, so it will never happen, but I can dream. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Darren Duncan Sent: Tuesday, October 27, 2009 8:27 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] Grammar of "X is Y" John Crenshaw wrote: >> There's nothing wrong with that. > > Not unless style counts for something. X is Y looks far too much like X > as Y for my taste. I'd rather do a little extra typing to have clear > logic than to have clearly unclear code like that. My first thought when > I saw this was "doesn't he mean AS?" If I saw something like this in my > code I'd expect some major comments to clarify why the code really does > mean IS and not AS. I agree with you regarding syntax and style. My post was addressing just the "You just want a result set of true/false values?" comment as if that was saying "what reason is there to want a result set of true/false values"; it was not addressing the syntax/style issue. Regarding syntax and style, I would advocate that simple "=" comparisons being able to test for null would be superior, with the conception being that null is treated as a special marker that is not equal to any other value but is equal to itself. This is the semantics that SQLite's new "IS" follows I believe.
And then you don't need any "is"/"as" confusion. -- Darren Duncan > -Original Message- > [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Darren Duncan > > Jay A. Kreibich wrote: >> On Tue, Oct 27, 2009 at 04:15:57PM +, Tom Sillence scratched on > the wall: >>> because I really want to write neat queries like: >>> >>> select col1 is col2 from table >> Are you sure? You just want a result set of true/false values? > > There's nothing wrong with that. Booleans are values like anything > else, and > one should be able to store them as field values and return them in > rowsets. > And very useful in practice, when one considers all the facts one might > want to > store that are commonly expressed as true/false, such as in a users > table column > named "may_login" or "is_moderated". -- Darren Duncan ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
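For contrast with the dream semantics above, this is what stock SQLite actually does with NULL today; a sketch via Python's sqlite3 module (None is how Python surfaces SQL NULL):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Comparisons and arithmetic with NULL propagate NULL:
cmp_row = con.execute("SELECT NULL = NULL, NULL + 1, NULL > 0").fetchone()
print(cmp_row)     # (None, None, None)

# AND/OR use three-valued logic; NULL only drops out when the other
# operand decides the result by itself:
decided = con.execute("SELECT NULL AND 0, NULL OR 1").fetchone()
print(decided)     # (0, 1)
undecided = con.execute("SELECT NULL AND 1, NULL OR 0").fetchone()
print(undecided)   # (None, None)

# "IS NULL" / "IS" is the escape hatch from the propagation:
is_null = con.execute("SELECT NULL IS NULL").fetchone()
print(is_null)     # (1,)
```

So the proposed "(NULL {AND,OR} x) -> x" rule differs from stock behavior only in the undecided cases shown above.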
Re: [sqlite] Grammar of "X is Y"
Meh, I don't want it THAT badly. I'm just saying that's how it should have been in the original design of the SQL language. In fact though, it probably wouldn't have mattered. Every different RDBMS seems to treat nulls differently in this regard, so I'm not even sure exactly what the spec says on the issue (most likely, SQLite is exactly in line with the spec, but I haven't bothered to confirm that.) For my own use, I'm willing to just be very specific where NULL is involved. Writing my own query parser to allow me to abuse the language isn't really worth it to me. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Darren Duncan Sent: Wednesday, October 28, 2009 12:09 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Grammar of "X is Y" John Crenshaw wrote: > Yeah, I tend to agree that null != null is confusing, however that is > the way it is "supposed" to behave, so changing that would break a lot > of code. If I had my way, and the behavior of NULL in operations COULD > be changed, I would opt for the following: > > NULL = NULL -> true > As a Boolean, NULL -> false > (NULL = anything other than NULL) -> NULL > (NULL {+,-,*,/,>,<} x) -> NULL > (NULL {AND,OR} x) -> x > > The result would be that expressions with "null" are simply ignored, > which as far as I can tell is nearly always the goal when you have to > add "IS NULL" tests to expressions. Of course, a change like this would > break TONS of code, so it will never happen, but I can dream. On the other hand, you could invent another RDBMS programming language that works the way you think SQL should be weren't it for exact backwards compatibility, and then just have a translation tool to convert code written in it to the behavior-equivalent SQL/etc code that a DBMS will actually run. The other language could be arbitrarily similar to SQL when you don't have a reason to be different. 
This is what I'm doing with my "Muldis D" language, which is close to being feature-complete. -- Darren Duncan ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Late data typing. Am I missing something?
SQLite's data typing means it can support any and all field types supported in any other SQL database. That's a big deal. For the most part, the proper method for accessing any given data is going to be simple and universal. Homegrown routines will only happen if people have specific homegrown needs. I'm a huge fan of strong typing, but I enforce that at the application level with wrapper classes for each table. I don't see a particular problem with SQLite's weak typing internally. SQLite has plenty of date editing routines. Dates are stored in a double as a Julian date. Check out the SQLite code to see how this was done. SQLite doesn't expose the date structure and routines at the C level (a major oversight IMO), so you'll need to lift the dates code and put it in a new public class. SQLite's understanding of "dates" is capable of supporting null, date, time, or datetime. The only real problem is that timezone is not stored, dates are always stored and retrieved in UTC, and dates with timezones are converted prior to storage. If you need to retain timezone you could replace all the date functions with your own version, and store the dates differently, or you could store the timezone separately. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Ted Rolle Sent: Wednesday, October 28, 2009 1:57 PM To: sqlite-users Subject: [sqlite] Late data typing. Am I missing something? Doesn't dynamic data typing lead to bad data? And proliferation of home-grown editing routines? It seems that a strict data typing at column definition time would be MUCH better. For instance, date-editing routines... Ted ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
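To illustrate the Julian-day storage John describes, a small sketch using Python's sqlite3 module (julianday() and datetime() are SQLite's built-in date functions; the table name is made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (at REAL)")  # store dates as Julian day numbers

# Convert a datetime string to a Julian day number for storage...
jd = con.execute("SELECT julianday('2000-01-01 00:00:00')").fetchone()[0]
con.execute("INSERT INTO events VALUES (?)", (jd,))
print(jd)    # 2451544.5

# ...and back to a human-readable UTC string when reading:
text = con.execute("SELECT datetime(at) FROM events").fetchone()[0]
print(text)  # 2000-01-01 00:00:00
```

The stored double carries no timezone, which is exactly the limitation described above: anything with an offset is converted to UTC before it becomes a Julian day.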
Re: [sqlite] Late data typing. Am I missing something?
Yeah, the code is fortunately all there, so once you know what you're looking for it is easy to copy out, but it should have been exposed in the API. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of O'Neill, Owen Sent: Wednesday, October 28, 2009 3:11 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] Late data typing. Am I missing something? You can get close if you put some check constraints on the columns. I must agree with other posters that the lack of an exposed timestamp type does feel like something of a gap. Owen -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Ted Rolle Sent: Wednesday, October 28, 2009 5:57 PM To: sqlite-users Subject: [sqlite] Late data typing. Am I missing something? Doesn't dynamic data typing lead to bad data? And proliferation of home-grown editing routines? It seems that a strict data typing at column definition time would be MUCH better. For instance, date-editing routines... Ted ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] shared cache mode and 'LOCKED'
It appears to be up to date. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of O'Neill, Owen Sent: Wednesday, October 28, 2009 1:45 PM To: General Discussion of SQLite Database Subject: [sqlite] shared cache mode and 'LOCKED' Hi Everyone, Does anyone know if this page is still up to date with respect to when you get "SQLITE_LOCKED" when operating in shared cache mode ? http://www.sqlite.org/cvstrac/wiki?p=DatabaseIsLocked (I'm trying to solve a two writers problem and am trying to understand the best way to solve it) (and I think part of my problem is not understanding the difference between 'locked' and 'busy' ) I've seen the notify example here. http://www.sqlite.org/unlock_notify.html and a 'busy' example here http://www.sqlite.org/cvstrac/wiki?p=MultiThreading http://www.sqlite.org/cvstrac/wiki?p=SampleCode Is it possible for something to be 'locked' then after being unlocked at it tries again it gets 'busy' ? Should my re-try strategy be the same or different for 'busy' and 'locked' and I guess if I get SQLITE_IOERR_BLOCKED (http://www.sqlite.org/c3ref/busy_handler.html) I should always back off and wait a while ? Any tips for the 'best' way to tackle this gratefully received. (I have one thread writing a lot but it can block for a 'long' time and still be ok (up to 5 seconds) - and another one mostly reading and doing a few occasional writes, but it can't block for long (>250ms) because it's servicing the UI and repainting will stop.) many thanks Owen ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SELECT * vs SELECT columns ?
You could use EXPLAIN to see if there is a different query plan, but I'd bet there isn't. * will generally be slower, just because you usually won't need EVERY column. If you can specify only certain columns, that will save you some time. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Kristoffer Danielsson Sent: Wednesday, October 28, 2009 12:51 PM To: sqlite-users@sqlite.org Subject: Re: [sqlite] SELECT * vs SELECT columns ? I don't know about SQLite, but in all SQL courses you learn that you should NEVER use the asterisk. The asterisk is merely there to let you quickly view data _manually_. > Date: Wed, 28 Oct 2009 16:02:01 +0200 > From: mi...@limbasan.ro > To: sqlite-users@sqlite.org > Subject: Re: [sqlite] SELECT * vs SELECT columns ? > > I would expect there to be a speed and memory performance *impact* if > the result set contains columns other than the three specified ones, > since obviously the library will need to allocate more memory to hold > the extra data. > > On 10/28/2009 03:52 PM, Pete56 wrote: > > I am searching across two joined tables and am interested in a few > > parameters: > > > > SELECT a.first a.third b.first FROM a JOIN b ON a.RowID = b.RowID WHERE > > value = :value > > > > Is there any speed or memory performance improvement by using SELECT *, > > rather than SELECT ? > > > > If I know there will only be one item (unique :value) selected (LIMIT 1) can > > I make any performance improvements ? > > > > -- > Cu stima, > Mihai Limbasan > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users _ Nya Windows 7 - Hitta en dator som passar dig! Mer information. 
http://windows.microsoft.com/shop ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
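A quick way to see the difference is to compare the result-set shapes; a sketch via Python's sqlite3 module (the table and column names are made up to echo the original question):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE a (first, second, third)")
con.execute("INSERT INTO a VALUES (1, 2, 3)")

# SELECT * marshals every column of every matching row:
star = con.execute("SELECT * FROM a")
star_cols = [d[0] for d in star.description]
print(star_cols)    # ['first', 'second', 'third']

# Naming only the columns you need shrinks what gets copied out:
narrow = con.execute("SELECT first, third FROM a")
narrow_cols = [d[0] for d in narrow.description]
print(narrow_cols)  # ['first', 'third']
row = narrow.fetchone()
print(row)          # (1, 3)
```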
Re: [sqlite] shared cache mode and 'LOCKED'
I don't know. Elsewhere it says you really shouldn't use the same connection in multiple threads. I use a different connection in each thread. With the shared cache, this results in very little overhead, so I'm unsure why you would need to do this the "not recommended" way. The contention between connections only applies to other processes if the shared cache is enabled. With the shared cache each process will lock the whole database, but connections in threads within that process will only lock individual tables. This is really the right way to do a multithreaded application, because otherwise contention is too great. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent Sent: Wednesday, October 28, 2009 3:32 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] shared cache mode and 'LOCKED' to be clear... "in other words, two threads sharing a connection in shared cache mode will always cause SQLITE_LOCKED (rather than SQLITE_BUSY) when contention occurs _between the two threads_. if contention occurs from another connection (i.e. a connection in a different process) SQLITE_BUSY will be returned." i believe this is correct. experts? -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent Sent: Wednesday, October 28, 2009 12:27 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] shared cache mode and 'LOCKED' i'm no expert on this, but my understanding is that since shared cache mode 'shares a connection' you won't get SQLITE_BUSY but rather SQLITE_LOCKED since the contention is 'internal' to the connection. in other words, two threads sharing a connection in shared cache mode will always cause SQLITE_LOCKED (rather than SQLITE_BUSY) when contention occurs. experts: pls correct me if i'm wrong here. 
here is the advice i received when asking about a similar situation: >> If other threads may also need a write lock on that table, you should >> handle SQLITE_LOCKED by incrementing a waiter count and calling >> sqlite3_unlock_notify. The thread doing the inserting can check to >> see if anybody is waiting (blocked) and yield by committing the >> current transaction and waiting for the blocked thread to unblock. Be >> aware, you should also close any open cursors before yielding, >> because open cursors will prevent write locks and you'll waste time >> yielding for nothing. >> >> John hope this helps (and isn't incorrect). thanks tom -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of O'Neill, Owen Sent: Wednesday, October 28, 2009 10:45 AM To: General Discussion of SQLite Database Subject: [sqlite] shared cache mode and 'LOCKED' Hi Everyone, Does anyone know if this page is still up to date with respect to when you get "SQLITE_LOCKED" when operating in shared cache mode ? http://www.sqlite.org/cvstrac/wiki?p=DatabaseIsLocked (I'm trying to solve a two writers problem and am trying to understand the best way to solve it) (and I think part of my problem is not understanding the difference between 'locked' and 'busy' ) I've seen the notify example here. http://www.sqlite.org/unlock_notify.html and a 'busy' example here http://www.sqlite.org/cvstrac/wiki?p=MultiThreading http://www.sqlite.org/cvstrac/wiki?p=SampleCode Is it possible for something to be 'locked' then after being unlocked at it tries again it gets 'busy' ? Should my re-try strategy be the same or different for 'busy' and 'locked' and I guess if I get SQLITE_IOERR_BLOCKED (http://www.sqlite.org/c3ref/busy_handler.html) I should always back off and wait a while ? Any tips for the 'best' way to tackle this gratefully received. 
(I have one thread writing a lot but it can block for a 'long' time and still be ok (up to 5 seconds) - and another one mostly reading and doing a few occasional writes, but it can't block for long (>250ms) because it's servicing the UI and repainting will stop.) many thanks Owen ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] shared cache mode and 'LOCKED'
Almost. Locking happens at a table level in this case, not a database level. Three different threads can all write at the same time, if they write to different tables. But, if two threads try to write to the same table at the same time, one of them will get SQLITE_LOCKED. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent Sent: Wednesday, October 28, 2009 3:46 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] shared cache mode and 'LOCKED' oh, right. my bad. i don't mean to share a connection between two threads, but rather that each thread (with its own connection) in the same process where shared cache mode is enabled will cause SQLITE_LOCKED error rather than SQLITE_BUSY error when these threads contend for the DB. is this right? -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Crenshaw Sent: Wednesday, October 28, 2009 12:38 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] shared cache mode and 'LOCKED' I don't know. Elsewhere it says you really shouldn't use the same connection in multiple threads. I use a different connection in each thread. With the shared cache, this results in very little overhead, so I'm unsure why you would need to do this the "not recommended" way. The contention between connections only applies to other processes if the shared cache is enabled. With the shared cache each process will lock the whole database, but connections in threads within that process will only lock individual tables. This is really the right way to do a multithreaded application, because otherwise contention is too great.
John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent Sent: Wednesday, October 28, 2009 3:32 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] shared cache mode and 'LOCKED' to be clear... "in other words, two threads sharing a connection in shared cache mode will always cause SQLITE_LOCKED (rather than SQLITE_BUSY) when contention occurs _between the two threads_. if contention occurs from another connection (i.e. a connection in a different process) SQLITE_BUSY will be returned." i believe this is correct. experts? -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent Sent: Wednesday, October 28, 2009 12:27 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] shared cache mode and 'LOCKED' i'm no expert on this, but my understanding is that since shared cache mode 'shares a connection' you won't get SQLITE_BUSY but rather SQLITE_LOCKED since the contention is 'internal' to the connection. in other words, two threads sharing a connection in shared cache mode will always cause SQLITE_LOCKED (rather than SQLITE_BUSY) when contention occurs. experts: pls correct me if i'm wrong here. here is the advice i received when asking about a similar situation: >> If other threads may also need a write lock on that table, you should >> handle SQLITE_LOCKED by incrementing a waiter count and calling >> sqlite3_unlock_notify. The thread doing the inserting can check to >> see if anybody is waiting (blocked) and yield by committing the >> current transaction and waiting for the blocked thread to unblock. Be >> aware, you should also close any open cursors before yielding, >> because open cursors will prevent write locks and you'll waste time >> yielding for nothing. >> >> John hope this helps (and isn't incorrect). 
thanks tom -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of O'Neill, Owen Sent: Wednesday, October 28, 2009 10:45 AM To: General Discussion of SQLite Database Subject: [sqlite] shared cache mode and 'LOCKED' Hi Everyone, Does anyone know if this page is still up to date with respect to when you get "SQLITE_LOCKED" when operating in shared cache mode ? http://www.sqlite.org/cvstrac/wiki?p=DatabaseIsLocked (I'm trying to solve a two writers problem and am trying to understand the best way to solve it) (and I think part of my problem is not understanding the difference between 'locked' and 'busy' ) I've seen the notify example here. http://www.sqlite.org/unlock_notify.html and a 'busy' example here http://www.sqlite.org/cvstrac/wiki?p=MultiThreading http://www.sqlite.org/cvstrac/wiki?p=SampleCode Is it possible for something to be 'locked' then after being unlocked at it tries again it gets 'busy' ? Should my re-try strategy be the same or different for 'busy' and 'locked' and I guess if I get SQLITE_IOERR_BLOCKED (http://www.sqlite.org/c3ref/busy_handler.html) I should always back off and wait a while ? Any tip
Re: [sqlite] How to input a double num?
Bad plan. Use prepared statements and bind. Otherwise you're going to create SQL injection vulnerabilities. Prepared statements are faster and easier to read anyway. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of liubin liu Sent: Wednesday, October 28, 2009 10:49 PM To: sqlite-users@sqlite.org Subject: [sqlite] How to input a double num? Now I use the sqlite3_mprintf() and the "%f" to get the double num. My code is below. Now there is a num like "212345678901234567890123456.988290112". With the way of "sqlite3_mprintf()" and "%f", the num is cut to "2123456789012346000.00". How to input the num "212345678901234567890123456.988290112"?

code:

...
char *query_format = "INSERT OR REPLACE INTO pow_value_st (id, valid, powervalue_1, powervalue_2) VALUES (%d, %d, %f, %f)";
char *query_string = NULL;
query_string = sqlite3_mprintf (query_format, index, tc->valid, tc->powervalue[0], tc->powervalue[1]);
printf ("%s\n", query_string);
...

-- View this message in context: http://www.nabble.com/How-to-input-a-double-num--tp26105457p26105457.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
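A sketch of the bound-parameter approach, using Python's sqlite3 module for brevity (in the C API the corresponding calls are sqlite3_prepare_v2 and sqlite3_bind_double; the schema loosely mirrors the one in the question):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE pow_value_st (id INTEGER PRIMARY KEY, valid INTEGER,"
            " powervalue_1 REAL, powervalue_2 REAL)")

big = 212345678901234567890123456.988290112  # parsed as the nearest IEEE double

# Binding passes the double straight through: no %f text formatting, no
# precision lost in conversion, and no SQL injection risk.
con.execute("INSERT OR REPLACE INTO pow_value_st VALUES (?, ?, ?, ?)",
            (1, 1, big, big))

stored = con.execute("SELECT powervalue_1 FROM pow_value_st").fetchone()[0]
print(stored == big)  # True: a REAL column round-trips the double exactly
```

Note that no method squeezes all 36 digits of that literal into a REAL column: an IEEE double carries only about 15-17 significant digits, so if every digit matters the value needs TEXT or a scaled-integer representation. Binding simply guarantees you keep the full double instead of whatever %f happens to print.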
Re: [sqlite] Late data typing. Am I missing something?
Strings have a number of other disadvantages in this case. They take more computations to compare, they take time to parse when you read them, and they take longer to build when you insert them. Generally, storing dates as a number of some sort is ideal. Building a query to return the value as a human readable string is fairly easy: SELECT datetime(2455133.71759947); -- returns '2009-10-29 05:13:20' I imagine the timezones aren't documented, because they aren't actually stored if the Julian format is used internally (they have to be converted to get the Julian in UTC.) If you use a string, it can store the timezone I guess, but it will cost you in terms of speed. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Dan Bishop Sent: Thursday, October 29, 2009 12:26 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Late data typing. Am I missing something? John Crenshaw wrote: > SQLite has plenty of date editing routines. Dates are stored in a double > as a Julian date. Well, that's one way of doing it. I store them as strings because I wanted a human-readable format. The downside is that this requires 19 bytes instead of 8. I wish SQLite could handle the storage optimization behind the scenes. > SQLite's understanding of "dates" is capable of > supporting null, date, time, or datetime. The only real problem is that > timezone is not stored, dates are always stored and retrieved in UTC, > and dates with timezones are converted prior to storage. Wow! I didn't realize that SQLite supported timezones, but sure enough, it does: sqlite> select datetime('2009-10-28T22:54:52-05:00'); 2009-10-29 03:54:52 Why isn't this documented at http://www.sqlite.org/lang_datefunc.html ? 
___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
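Dan's timezone discovery is easy to reproduce from any driver; a quick sketch via Python's sqlite3 module:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# An explicit UTC offset in the input string is honored, and the result
# comes back converted to UTC:
utc = con.execute("SELECT datetime('2009-10-28T22:54:52-05:00')").fetchone()[0]
print(utc)  # 2009-10-29 03:54:52

# The offset itself is gone, though: storing this string (or its Julian
# day) loses which timezone the value originally came from, as noted above.
```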
Re: [sqlite] shared cache mode and 'LOCKED'
Let's say we have the three connections in that diagram, and two tables named t1 and t2. I'll use a simple syntax to describe some concurrency scenarios:

con#>>t# will mean con# writes to t#
Commas will separate concurrent attempted operations
After the operations will be a pipe '|' followed by the error code that would result, if any

Here goes:

1. con1 >> t1, con2 >> t2 | SQLITE_BUSY
2. con2 >> t1, con3 >> t2 | SQLITE_OK
3. con1 >> t1, con2 >> t1 | SQLITE_BUSY
4. con2 >> t1, con3 >> t1 | SQLITE_LOCKED

Does that clarify this? John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent Sent: Wednesday, October 28, 2009 6:33 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] shared cache mode and 'LOCKED' i guess this isn't that complicated. the error codes even say basically what you've said: #define SQLITE_BUSY 5 /* The database file is locked */ #define SQLITE_LOCKED 6 /* A table in the database is locked */ i guess the point is that separate connections normally lock the entire DB file but in shared cache mode two connections (in the same process) can both have access to the DB file but not to the same table. you've said this below as well. the point is that in the diagram here (http://www.sqlite.org/sharedcache.html) if conn1 writes to tab1 then conn2 and conn3 will get SQLITE_BUSY, yes? if conn2 writes to tab1 then conn1 will get SQLITE_BUSY but conn3 will get SQLITE_LOCKED (if trying to write to tab1; will succeed if trying to write to tab2). correct? -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Crenshaw Sent: Wednesday, October 28, 2009 12:49 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] shared cache mode and 'LOCKED' Almost. Locking happens at a table level in this case, not a database level. Three different threads can all write at the same time, if they write to different tables.
But, if two threads write try to the same table at the same time, one of them will return SQLITE_LOCKED. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent Sent: Wednesday, October 28, 2009 3:46 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] shared cache mode and 'LOCKED' oh, right. my bad. i don't mean to share a connection between two threads, but rather that each thread (with its own connection) in the same process where shared cache mode is enabled will cause SQLITE_LOCKED error rather than SQLITE_BUSY error when these threads contend for the DB. is this right? -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Crenshaw Sent: Wednesday, October 28, 2009 12:38 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] shared cache mode and 'LOCKED' I don't know. Elsewhere it says you really shouldn't use the same connection in multiple threads. I use a different connection in each thread. With the shared cache, this results in very little overhead, so I'm unsure why you would need to do this the "not recommended" way. The contention between connections only applies to other processes if the shared cache is enabled. With the shared cache each process will lock the whole database, but connections in threads within that process will only lock individual tables. This is really the right way to do a multithreaded application, because otherwise contention is too great. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent Sent: Wednesday, October 28, 2009 3:32 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] shared cache mode and 'LOCKED' to be clear... 
"in other words, two threads sharing a connection in shared cache mode will always cause SQLITE_LOCKED (rather than SQLITE_BUSY) when contention occurs _between the two threads_. if contention occurs from another connection (i.e. a connection in a different process) SQLITE_BUSY will be returned." i believe this is correct. experts? -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Tom Broadbent Sent: Wednesday, October 28, 2009 12:27 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] shared cache mode and 'LOCKED' i'm no expert on this, but my understanding is that since shared cache mode 'shares a connection' you won't get SQLITE_BUSY but rather SQLITE_LOCKED since the contention is 'internal' to the connection. in other words, two threads sharing a connection in shared cache mode will always cause SQLITE_LOCKED (rather than SQLITE_BUSY) when contention occurs. experts: pls correct me if
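The plain SQLITE_BUSY case (two separate connections, no shared cache) is easy to reproduce; a sketch using Python's sqlite3 module and a scratch database file (the file and table names are made up):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "contention.db")

# timeout=0 disables the busy handler so the error surfaces immediately;
# isolation_level=None gives us explicit transaction control
con1 = sqlite3.connect(path, timeout=0, isolation_level=None)
con2 = sqlite3.connect(path, timeout=0, isolation_level=None)
con1.execute("CREATE TABLE t1 (x)")

con1.execute("BEGIN IMMEDIATE")        # con1 takes the write lock on the file
try:
    con2.execute("BEGIN IMMEDIATE")    # con2 contends for the same lock
    outcome = "ok"
except sqlite3.OperationalError as e:
    outcome = str(e)                   # "database is locked" -- SQLITE_BUSY
con1.execute("COMMIT")
print(outcome)

con1.close()
con2.close()
```

With a busy timeout (or a busy handler) installed, con2 would instead wait and retry for the configured interval before giving up. The table-level SQLITE_LOCKED variant additionally requires opening both connections with shared-cache mode enabled in the same process.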
Re: [sqlite] Late data typing. Am I missing something?
> *if* you know that the number *is* a date. If the column has a type of timestamp, it should be safe to always assume that it IS a date. Don't put non-dates in it, and there isn't a problem. I can't imagine a case where you would not know whether the value is a date. Am I missing something here? The datetime() function will accept Julian or string representations (including special strings like 'now') so even if you did something really nasty like store dates sometimes in Julian format and sometimes as strings, the date and time functions will STILL get things right, though sorting would be a mess and indexes would be useless. > This is fine as long as you always view your data with > application-specific tools and never with generic ones. Julian *is* the preferred internal format for dates in SQLite. There is nothing application specific about this. I actually tested the queries in a generic viewer. I used SELECT Julian('now'); in sqlite3Explorer to get the "2455133.71759947" value used in my original example. > it's still worth mentioning that [timezones] can be > part of the parameter. You are right of course. They should have been mentioned in the documentation. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Dan Bishop Sent: Thursday, October 29, 2009 3:15 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Late data typing. Am I missing something? John Crenshaw wrote: > Strings have a number of other disadvantages in this case. They take > more computations to compare, they take time to parse when you read > them, and they take longer to build when you insert them. Generally, > storing dates as a number of some sort is ideal. > I do agree with that. The problem is that the ideal way to store a date is different from the ideal way to *display* a date. And that the conversion between the two has to be done manually. 
> Building a query to return the value as a human readable string is > fairly easy: > SELECT datetime(2455133.71759947); -- returns '2009-10-29 05:13:20' > Yes, it's easy to do that -- *if* you know that the number *is* a date. This is fine as long as you always view your data with application-specific tools and never with generic ones. > I imagine the timezones aren't documented, because they aren't actually > stored if the Julian format is used internally (they have to be > converted to get the Julian in UTC.) If you use a string, it can store > the timezone I guess, but it will cost you in terms of speed. That's not the point. Timezones can't be part of the return value of the strftime/datetime/julianday functions, but it's still worth mentioning that they can be part of the parameter. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
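[Editor's note] The julianday()/datetime() round trip discussed above is easy to verify. A minimal sketch, using Python's stdlib sqlite3 module purely as a harness (the SQL is the point; any SQLite shell behaves the same):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# A date/time string converts to a Julian day number (a REAL)...
(jd,) = con.execute("SELECT julianday('2009-10-29 05:13:20')").fetchone()
print(jd)  # roughly 2455133.7176

# ...and a Julian day number converts back to a human-readable string.
(ts,) = con.execute("SELECT datetime(2455133.71759947)").fetchone()
print(ts)  # the value quoted in the thread
```

Sorting and indexing both work naturally on the REAL representation, which is the thread's argument for storing the number rather than the string.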
Re: [sqlite] Late data typing. Am I missing something?
>>> *if* you know that the number *is* a date.
>>
>> If the column has a type of timestamp, it should be safe to always
>> assume that it IS a date.
>
>sqlite> CREATE TABLE t1 (StartTime TIMESTAMP, Duration REAL);
>sqlite> CREATE TABLE t2 AS SELECT StartTime, StartTime + Duration /
>86400.0 AS StopTime FROM t1;
>sqlite> SELECT sql FROM sqlite_master WHERE name = 't2';
>CREATE TABLE t2(StartTime TIMESTAMP,StopTime)
>
>What column type?

Still a timestamp (assuming that you stored the dates as Julian values and that the math was therefore valid.) SQLite doesn't care what the column type is listed as so it doesn't matter what sqlite_master says. All that matters is how you used the data, and in this case the calculated value is still a timestamp.

>> Don't put non-dates in it, and there isn't a
>> problem. I can't imagine a case where you would not know whether the
>> value is a date. Am I missing something here?
>
>Consider the case of an application using an SQLite database to store
>its settings (like the Windows registry, but portable). The dynamic
>type system is great for this.
>
>CREATE TABLE Settings (Name TEXT PRIMARY KEY, Value BLOB);
>
>Name               Value
>-----------------  ---------------
>LogPath            'E:\log\FooApp'
>MaxAttachmentSize  250
>LastUpdate         2455130.1125
>
>Now, in the SQLite equivalent of regedit, how is it supposed to know
>that LastUpdate is timestamp 2009-10-25 14:42:00 but MaxAttachmentSize
>is NOT the date 2132-08-31 12:00:00? Without knowledge of the
>application that created this table, it can't.

A system like this would need a type column as well. Storing dates as text doesn't change that, because at some level you'll still need to distinguish between regular text, and a date stored as text. Once you add a type column, it is no longer ambiguous. Dates don't have a special data type internally so you have to choose to treat them as dates for them to be dates. 
You can store them as strings and accept the performance hit that will come with that, or you can store them as Julian dates, and accept a little extra typing when you look at your data in a generic viewer. Either way though, it's only a date if you treat it like one. John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
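[Editor's note] One hypothetical shape for the "type column" suggested above, using the Settings example from the quoted post (the Type values and the CASE trick are invented for illustration; Python's sqlite3 module is only the harness):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Settings (Name TEXT PRIMARY KEY, Value BLOB, Type TEXT)")
rows = [
    ("LogPath", r"E:\log\FooApp", "text"),
    ("MaxAttachmentSize", 250, "int"),
    ("LastUpdate", 2455130.1125, "date"),   # a Julian day number
]
con.executemany("INSERT INTO Settings VALUES (?, ?, ?)", rows)

# A generic viewer can now render the date without guessing which
# REAL values are timestamps and which are just numbers.
for name, value in con.execute(
        "SELECT Name, CASE Type WHEN 'date' THEN datetime(Value) ELSE Value END "
        "FROM Settings"):
    print(name, value)
```

The "extra typing in a generic viewer" John mentions is exactly the CASE/datetime() call above; the stored REAL stays sortable and indexable.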
Re: [sqlite] Some clarification needed about Unicode
1. My understanding is that it will convert between UTF-8 and UTF-16 when you use these functions. I haven't tested this though, so you might try a simple test app just to make sure. 2. UTF-8 is NOT the same as ASCII for values greater than 127. Similarly, UTF-16 is NOT the same as UCS-2 (the wide "Unicode" chars used by MS APIs), though it looks the same at low values. UTF-16 is a multibyte character set, while UCS-2 is always 2 bytes per character. You have to convert these values. Better to just use the regular UTF-8 versions. If you are only now internationalizing your code, and you've been passing 8-bit strings to SQLite, you may already have a problem, since some of the data stored may be invalid if you attempt to treat it as a UTF-8 string. You may find that you'll need to dump the data, then convert from ASCII to UTF-8 and rebuild the database. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of A.J.Millan Sent: Thursday, October 29, 2009 5:14 AM To: sqlite-users@sqlite.org Subject: [sqlite] Some clarification needed about Unicode Hi list: After some years using this wonderful tool, I have embraced the internationalization of an application, and despite some reading on this list and my own tests (not conclusive), I still have some obscure corners. [1] Supposing some textual data was already inserted as UTF-8 (the default mode) in a database, and a connection was opened with sqlite3_open(): does sqlite3_column_text16() retrieve correct UTF-16 content? That is, does SQLite do the conversion internally? [2] Assuming the previous (or UTF-16 content obtained by any other means) and development using MS VC++ for Win32, where sizeof(wchar_t) == 2: can the UTF-16 content be used to directly fill a wchar_t string? Any clarification would be greatly appreciated. 
A.J.Millan ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
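[Editor's note] John's distinction between UTF-16 and UCS-2 is easy to see by encoding one codepoint above U+FFFF. A small sketch (Python is used here purely as a convenient way to show the byte counts; the encodings themselves are what the thread is about):

```python
# U+00E9 (é) fits in one 16-bit code unit; U+1D11E (musical G clef) does not.
for ch in ("\u00e9", "\U0001d11e"):
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-le")
    print(f"U+{ord(ch):05X}: utf-8={len(utf8)} bytes, utf-16={len(utf16)} bytes")

# UCS-2 cannot represent U+1D11E at all; UTF-16 spends two 16-bit code
# units (a surrogate pair) on it. This is why a UTF-16 string cannot be
# handed blindly to an API that assumes one code unit per character.
```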
Re: [sqlite] Some clarification needed about Unicode
Yes, they are obscure, but that is the whole point of supporting Unicode. Because users want to enter obscure characters into your application ;) My main point is that you can't take the UTF-16 string and safely supply it to APIs which want UCS-2 encoded text, such as Win32 APIs (including things like SetWindowText()). Odds are that the only library you are using which supports UTF-16 is SQLite. You should always be converting the text to UCS-2 before you use it. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Dan Kennedy Sent: Thursday, October 29, 2009 6:39 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Some clarification needed about Unicode On Oct 29, 2009, at 4:41 PM, Jean-Christophe Deschamps wrote: > >> [1] Supposing some textual data already inserted as UTF-8 (default >> mode) in >> a dBase, and a connection opened with sqlite3_open(): Does a >> sqlite3_column_text16 retrieves a correct UTF-16 content? Is to >> say, do >> SQLite the convertion internally? >> >> [2] Assuming the previous -or a UTF-16 content obtained by any other >> means- >> and an develop using M$ VCpp for Windows32, say sizeof(w_char) == 2: >> can be >> used the UTF-16 content to directly fill an w_char string? > > Yes and yes, hopefully! Yes and yes it is. Of course, some unicode codepoints are encoded to 4 bytes using utf-16, they will span 2 of the entries in the array of w_char variables. I think (never checked though), that they are pretty obscure characters. Dan. > SQLite handles all necessary conversions and produces UTF-16 with > native endianness, whatever encoding was used at database creation. 
___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SELECT * vs SELECT columns ?
> If the schema changes, a listing of every column can be > invalidated, but the asterisk cannot. This is only partly true. At some point, the code is going to need to grab the individual fields, and that is the point where the asterisk fails to serve you well. If new fields are added to the schema, but those fields are never used, you sacrifice performance. On the other hand, if fields are added to the schema and you need to use them, you still have to modify the code to retrieve and use the column. Adding the new fields to the query at the same time isn't a big deal. John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Some clarification needed about Unicode
> there must exist zillions [working] wrappers to VC++. You would think. In fact, there are only a few, and most are not very good. I used the wrapper at Code Project as a base, then added handling for SQLITE_LOCKED, a date class, better blob handling, transaction support, and other useful enhancements as needed. All said it wasn't too bad. This is the route I would recommend. John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Some clarification needed about Unicode
> Certainly not! > > Win32 supports full Unicode 5.1 in UTF-16 for a long time. I double checked this just to be sure, and now I'm not. Here are the facts: 1. When Microsoft describes "Unicode" and "wide characters" the description always matches UCS-2 (they continually mention that "Unicode" strings are 2 bytes per character, so the character length is always half the number of bytes.) 2. MultiByteToWideChar supports a "MB_COMPOSITE" flag, which appears to give UTF-16 output. 3. MultiByteToWideChar also supports a "MB_PRECOMPOSED" flag, which appears to force UCS-2 output. 4. By default, MultiByteToWideChar precomposes when possible, and returns a composite character otherwise. Microsoft never seems to clearly identify whether the wide APIs should be given UTF-16 or UCS-2. Their guide on internationalization would seem to suggest that UCS-2 must be used, however, there is some reason to believe that perhaps UTF-16 is handled correctly as well. Couldn't find anything reliable one way or the other though. (Though there are plenty of folks taking whichever position, so at least I'm not the only one who's confused now.) John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Jean-Christophe Deschamps Sent: Thursday, October 29, 2009 9:18 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Some clarification needed about Unicode >My main point is that you can't take the UTF-16 string and safely supply >it to APIs which want UCS-2 encoded text, such as Win32 APIs (including >things like SetWindowText()). Odds are that the only library you are >using which supports UTF-16 is SQLite. You should always be converting >the text to UCS-2 before you use it. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Late data typing. Am I missing something?
> > I must agree with other posters that the lack of an exposed timestamp > > type does feel like something of a gap. > > Given the rather large number of reasonable ways to represent > timestamps, I have to disagree. I'd rather have the freedom to use > whichever representation is best for my needs. > > E.g., if a table contains rows representing Unix files, then I'll > probably want to store seconds since the Unix epoch because that will > mean fewer conversions, depending on how I use that table anyways. An internal timestamp type wouldn't force you to use it. Remember you still have dynamic typing. You could choose to store an integer instead and that would be fine. The point is that it would be good to have an internal type specifically for storing time, just like there are types for real numbers, integers, blobs, and text. Time could be stored as it is now (a real number) with perhaps a timezone, but would have a distinct type identifying it. I expect the reason this isn't done is that at the SQL level there is no wonderful way to recognize a constant as a "time". The data type would mostly only be relevant when binding, where the strong typing of the API clarifies things. John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
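[Editor's note] As an aside on what a "timestamp type at the binding level" can look like: Python's sqlite3 wrapper implements exactly this idea on top of SQLite's dynamic typing, keying off the declared column type. This is the wrapper's mechanism, not anything in SQLite's core; the adapter/converter pair below is our own, registered explicitly:

```python
import sqlite3
from datetime import datetime

# Store datetimes as ISO-8601 text; parse them back on the way out.
sqlite3.register_adapter(datetime, lambda dt: dt.isoformat(" "))
sqlite3.register_converter(
    "TIMESTAMP", lambda b: datetime.fromisoformat(b.decode()))

# PARSE_DECLTYPES tells the wrapper to consult the declared column type.
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
con.execute("CREATE TABLE t (StartTime TIMESTAMP)")
con.execute("INSERT INTO t VALUES (?)", (datetime(2009, 10, 29, 5, 13, 20),))

(start,) = con.execute("SELECT StartTime FROM t").fetchone()
print(type(start).__name__, start)  # a datetime object comes back out
```

As John notes, the data type is "mostly only relevant when binding"; the strong typing lives entirely in the binding layer, while SQLite just sees TEXT.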
Re: [sqlite] Some clarification needed about Unicode
> MB_COMPOSITE has nothing to do with surrogate pairs You're right. I was trying to determine whether the output was UTF-16 or UCS-2 based on whether the output might use multiple bytes to represent a character, which is where I got tripped up. > Do you believe _that's_ what differentiates UTF-16 and UCS-2? If so, you are > mistaken. No. If that were the difference it wouldn't be a big deal. The difference is an encoding difference, similar to UTF-8 vs. ASCII (but different...). UTF-16 will use either 2 or 4 bytes for a character, UCS-2 will always use 2 bytes. As a result, UCS-2 can't hold everything that UTF-16 can. > > Microsoft never seems to clearly identify whether the wide APIs should > > be given UTF-16 or UCS-2. > > You mean, which Unicode normalization form they expect No, I mean which encoding. You can't give a UTF-16 string to an API that only knows how to handle UCS-2 encoded data, just like you can't use a UTF-8 string when ASCII data is expected. When I tackled this nightmare the last time I was left with the understanding that the wide Win32 APIs expected data to be UCS-2 encoded. Now I'm no longer sure, and I can't find any reliable documentation on this either way. It would be good if the APIs accept UTF-16, because that would mean they also accept UCS-2, but I couldn't find anything reliable to support this idea. Some folks say yes. Some say no. The documentation says nothing. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Igor Tandetnik Sent: Thursday, October 29, 2009 5:08 PM To: sqlite-users@sqlite.org Subject: Re: [sqlite] Some clarification needed about Unicode John Crenshaw <johncrens...@priacta.com> wrote: > 2. MultiByteToWideChar supports a "MB_COMPOSITE" flag, which appears > to > give UTF-16 output. 
MB_COMPOSITE has nothing to do with surrogate pairs, and everything to do with whether, say, Latin-1 character Á (A with acute) is converted to a single character U+00C1, or two characters U+0041 U+0301 (capital A + combining acute accent). The latter is "composite", the former is "precomposed". Do you believe _that's_ what differentiates UTF-16 and UCS-2? If so, you are mistaken. The difference between the two is in how Unicode characters U+10000 and up are represented (as surrogate pairs in one case, unsupported in the other). U+0041 U+0301 is a valid UCS-2 sequence and a valid UTF-16 sequence. > Microsoft never seems to clearly identify whether the wide APIs should > be given UTF-16 or UCS-2. You mean, which Unicode normalization form they expect ( see http://en.wikipedia.org/wiki/Unicode_equivalence ), which, again, has absolutely nothing to do with UTF-16 vs UCS-2. The answer is, Win32 API can handle any normalization form as well as denormalized strings. FoldString API is provided to normalize strings to various normalization forms if desired. Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
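[Editor's note] Igor's precomposed-vs-composite distinction maps onto the Unicode normalization forms (NFC is precomposed, NFD is decomposed), which can be sketched directly:

```python
import unicodedata

precomposed = "\u00c1"        # Á as a single codepoint
composite = "\u0041\u0301"    # A + combining acute accent

print(precomposed == composite)  # different codepoint sequences
print(unicodedata.normalize("NFC", composite) == precomposed)
print(unicodedata.normalize("NFD", precomposed) == composite)

# Both sequences are valid UCS-2 *and* valid UTF-16; normalization form
# is orthogonal to the surrogate-pair question, which is Igor's point.
```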
Re: [sqlite] Some clarification needed about Unicode
> Don't worry: we're all confused with MS wording! For what I understand > having also myself tried to sort out the question, is that there is a > line drawn: before XP unicode support included was nothing else than > UCS-2 (W2K). Xp and post-XP system include Unicode 5.1 and use UTF-16 > encoding. This is from 2005: > > http://blogs.msdn.com/michkap/archive/2005/05/11/416552.aspx Thanks for the link. That clarifies things a lot. So, for the OP, if you are targeting Win2k, it would be a good idea to use UCS-2, not UTF-16, with any wide API calls. XP and above should (according to Kaplan and Chen) support UTF-16 for API calls. John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Late data typing. Am I missing something?
> Um...how do I go to the page that describes the date-time information > without just 'knowing' it's there? For example, there seems to be no > path to http://www.sqlite.org/lang_datefunc.html from > http://.sqlite.org. I used Google personally. "sqlite date functions" John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to input a double num?
> He's probably measuring the number of atoms in a city or something. LOL. Actually, my bet is that the project's field has more to do with number theory. That's the only likely way to get a number that large with such high precision. John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Some clarification needed about Unicode
> > No, I mean which encoding. You can't give a UTF-16 string to an API > > that only knows how to handle UCS-2 encoded data > > Well, most of the time, you can. Only in rare cases do you need to treat > surrogate pairs in special way. One such case, relevant to this discussion, > is converting UTF-16 to UTF-8 Two other examples would be collation and rendering, both of which may be likely results of using the strings with various APIs. In fact, internationalization would be easy if not for these cases where the string is actually USED for something. This is where you run into the gotchas. > So, from where I sit, Win32 API cheerfully accepts UTF-16. Can you show an example to the contrary? Another user dug up an article on Michael Kaplan's blog explaining the nature of Unicode in Windows. Apparently Win2k supported only UCS-2, while XP and above can handle UTF-16. So your experiments will probably show UTF-16 working, but *if* anyone needs to support Win2k, this becomes a consideration again. On Win2k systems, however, MultiByteToWideChar() should return UCS-2 data, so if you use that for the conversion to wide char, you should always be fine on whatever system. > > When I tackle this > > nightmare the last time I was left with the understanding that the > > wide Win32 APIs expected data to be UCS-2 encoded. Now I'm no longer > > sure, and I can't find any reliable documentation on this either way. > > It would be good if the APIs accept UTF-16 > > Which API calls specifically are you concerned about? There are very few cases > where the presence of surrogate pairs makes a difference. I believe you are > blowing the issue way out of proportion. SetWindowTextW() would be a wonderful example. Any API that attempts to understand the string in terms of characters would have a problem. That would include any API that may measure, draw, sort, or re-encode the string, whether directly or indirectly. That's a lot of APIs. 
That's why I was concerned about getting the encoding right. I use the wide APIs myself, and if I was making a horrible mistake I wanted to know. Besides, I figured the OP would prefer accurate information. Many thanks for making sure I had my facts straight. John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] feature proposal - strong but dynamic typing
Been watching this discussion go back and forth, and I'd like to weigh in. I'm generally a HUGE fan of strong typing, but this doesn't do it for me. To me, strongly typed means a compiler catches my type mismatches before the app goes out the door. In this case though, no matter what you do, a mistake with your use of a type can never show up at compile time. You'll only get the error at runtime. Frankly, once the mistake is made, and the code deployed, I expect the program to behave the best it possibly can. Making the application blow up, rather than storing the data and moving on, is not a great plan in my mind. The user doesn't *care* that a REAL was provided, but an INTEGER was expected. The user cares that the app exploded on them and didn't save the edit they just made. If strong data typing *at the SQL level* is that important to you, this is a job for constraints: CHECK(typeof(product_id) = 'integer') If you hate the idea of manually adding these constraints, you could even write something to automatically add them for you. If you hate the performance hit, you could add them only in your debug build. As far as making the core universally reject data that doesn't match its belief about the format that data should be in, I think this is a very bad idea. It would break plenty of old code to provide a feature that is already possible for which the benefit is questionable anyway. Since SQLite can't catch my type errors at application compile time anyway, I think I like the current behavior better. Just my 2 cents. John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
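[Editor's note] The CHECK(typeof(...)) idea above can be exercised directly. A sketch (table and column names are invented for the example; Python's sqlite3 module is only the harness, and its IntegrityError is how a failed CHECK surfaces there):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE orders (
        product_id INTEGER CHECK (typeof(product_id) = 'integer')
    )""")

con.execute("INSERT INTO orders VALUES (?)", (42,))  # passes the CHECK

try:
    # 'abc' cannot be coerced to an integer by column affinity, so it
    # would be stored as TEXT -- and the CHECK rejects it at runtime.
    con.execute("INSERT INTO orders VALUES (?)", ("abc",))
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

Note that column affinity still runs first: inserting the text '12' would be silently converted to the integer 12 and pass, which is arguably the graceful behavior John argues for.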
Re: [sqlite] How to input a double num?
> May I use sqlite3_bind_double() and sqlite3_prepare_v2() to solve the > problem. That won't fix it. Your number is too large to fit in any native data type. Even the plain math inside your own program won't work right, because the precision of the number is limited at the C level, not the SQLite or printf level. You'll need an arbitrary precision math library, sqlite3_bind_blob(), and sqlite3_prepare_v2(). You can't use sqlite3_bind_double() because your number is too big for a double. FYI, those arbitrary precision libraries are brutes, so brace yourself. John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
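[Editor's note] For completeness, a sketch of the store-the-digits variant of this approach, with Python's native bignums standing in for an arbitrary-precision library (in C you would serialize the library's representation and use sqlite3_bind_text() or sqlite3_bind_blob()). The point is that the digits round-trip exactly while a double cannot hold them:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE nums (val TEXT)")  # digits stored losslessly as text

big = 2**400 + 1  # far beyond the 15-17 significant digits of a double
con.execute("INSERT INTO nums VALUES (?)", (str(big),))

(stored,) = con.execute("SELECT val FROM nums").fetchone()
print(int(stored) == big)    # → True: the text round-trips exactly
print(float(stored) == big)  # → False: a double drops the trailing +1
```

The trade-off is that comparisons and arithmetic must happen in the application (or a custom SQL function), not in plain SQL, since SQLite would otherwise compare the values as text.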
Re: [sqlite] Idea for improving page cache
> Just for the sake of discussion I've attached a performance > graph for various C++ data structures plus the Unrolled LL. > The tests were run on a Dell Vostro 1400 laptop. As you can > see the graphs show the ULL to be quite efficient for > insert/delete from the front/back of the list. I believe this > is mainly due to the fact that a new node is not allocated > for the insert for each operation. Yes, a stack would be a good use for ULL because front/back insert/delete can be highly efficient, and you can afford 0 wasted space. I'd love to see the actual data you tried to attach, but I couldn't because the attachment was blank except for a message footer. John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Some clarification needed about Unicode
> http://codesnipers.com/?q=utf-8-versus-windows-unicode > > The author asserts that .NET is the only platform that offers full UTF-16 > support in the Windows API. The author is half mistaken, as was I. Michael Kaplan and Raymond Chen (big MS names many will recognize) clarified this. For Win2k, only UCS-2 is supported. Starting with XP, the Win32 APIs accept full UTF-16. On any version, MultiByteToWideChar() should return data in the proper encoding for that system. Igor and the others helped hash this through until the real answer could be found. John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] feature proposal - strong but dynamic typing
> I believe I understand Darren's point (whether or not I care for them > is another story). Yes, you've understood Darren for the most part, but clearly don't understand the objections. > On Fri, Oct 30, 2009 at 2:22 AM, Roger Binnswrote: > > -BEGIN PGP SIGNED MESSAGE- > > Hash: SHA1 > > > > Darren Duncan wrote: > >> But on a newer SQLite that implements the stronger typing support I > >> proposed, > >> when that feature is active then columns with declared types like > >> INTEGER/etc > >> would enforce that only values of that type are stored there, > > > > I might have misunderstood you. Do you really mean that a new SQLite > > version should enforce the types with 'UNIVERSAL' meaning any? Do you > > really expect everyone to have to upgrade their database schemas for this? > > No, (I think what) Darren is saying is that a column with type > UNIVERSAL will behave as if that column had no CHECKs at all. It would > not enforce any type, and behave, more or less, like any SQLite column > except for INTEGER PRIMARY KEY currently behaves. That is, UNIVERSAL > would allow storing anything in it. You said no, then answered yes. This proposal would require many thousands of existing schemas to be updated. Anyone who doesn't want the strong typing would have to update their schema to use the "UNIVERSAL" keyword. This isn't going to be acceptable to ANYBODY except the "strong typing" clan. Additionally, this would be quite the shock to users not participating in this thread, who may, without prior warning, see new random errors when they update. A likely sore spot is the TIMESTAMP which, due to the current lack of documentation and supporting APIs, may likely be used to store data in ANY of the 4 types right now. > > > >> shorthand for an appropriate CHECK constraint, > > > > Now I am even more confused. 
There is this alleged group of people out > > there who need type enforcing but are somehow unable to put in CHECK > > constraints (which also let you addition stuff like range checks and other > > restrictions on values), so the rest of us would have to start littering our > > schemas with 'UNIVERSAL' to cater for them? > > Any column not declared as UNIVERSAL, so, INTEGER, REAL, BLOB, TEXT, > perhaps even a new type called DATETIME, would behave as if CHECK > CONSTRAINT were defined on them, allowing only the declared type of > data to be stored in them. Fortunately, I don't think this is exactly what is being proposed. The proposal (as I read it) only does the strong type checking on column types it recognizes, and others are left to the current model. If strong typing were done on any column not declared as UNIVERSAL, this would wreck even more schemas, since there are certainly going to be countless schemas using data types other than those that would be implemented. SQLite currently takes *ANYTHING* as the typename. This means, for example, someone might choose to give each column a type based on the C/C++ type/class that they will use to manipulate it. Not good SQL, but functional SQLite and plenty clean. I personally use TIMESTAMP for what you called DATETIME and there are who knows how many different variations on that, used by other SQL engines. > I see no problem with the existing tools, but, on the other hand, I > really see no problem with Darren's suggestion as well other than it > might make SQLite less Lite and more Heavy. > > But, I certainly see no backward compatibility issues with Darren's > suggestion. His suggestion allows those who care for strong typing, > but are too lazy to do it themselves, will actually have it done for > them, and those who don't care for strong typing can use UNIVERSAL. This IS a backwards compatibility issue. People get the new version of the library, but it behaves substantially differently than the previous version. 
The behavioral difference may break their app, and worse, it will not break at compile time, but only at runtime. This is a backwards-compatibility break of the worst sort. Worse yet, this could affect users more transparently than you think. Take, for example, the web site written in PHP that uses SQLite. One day PHP is updated to use the new strongly typed SQLite, then Apache is updated to use the latest PHP, finally, one by one, web hosting providers throughout the world update to the latest Apache. And a whole rash of websites suddenly have problems. These sites changed NOTHING, but suddenly their sites don't work? I see SERIOUS problems with this proposal, especially in terms of backwards compatibility. John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Table within a table??
> Jay A. Kreibich wrote: >> On Tue, Nov 03, 2009 at 02:51:23AM -0800, CityDev scratched on the wall: >>> That just seems so contrary to the original idea of the relational >>> model that you shouldn't have any data whose meaning is not defined >>> by data (in the case of an array you need to understand the >>> significance of relative position - remember relations have no row >>> or column order to stop you playing that game). > > So what is a character string then? An *ordered* sequence of characters. And > yet this coexists just fine with the relational model. An "array" is just a > generalization of this concept. Except that the characters in a string lose all meaning when used individually. Arrays on the other hand, while ordered, generally hold sequences of data such that each element has substantial meaning individually, and may need to be queried against. In fact, even storing strings tends to cause problems, because often people want to query only a slice of a string and this often requires a full table scan. If you absolutely must store arrays, you can do this by storing them in blobs. If you need to query against the array elements, use a custom function (though, if you need to query against the elements in the array, you should REALLY be using a relational table.) As far as order goes, it is reasonably easy to order the rows using a field for that purpose. John ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
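[Editor's note] A sketch of the arrays-in-blobs approach with a custom function, as suggested above (the function name blob_contains and the packing format are invented; a relational child table remains the better design whenever you actually query elements, since a function like this forces a full scan):

```python
import array
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE samples (id INTEGER PRIMARY KEY, readings BLOB)")

# Pack a C-style array of 32-bit ints into a blob.
con.execute("INSERT INTO samples VALUES (1, ?)",
            (array.array("i", [3, 14, 159]).tobytes(),))

def blob_contains(blob, needle):
    """Return 1 if `needle` appears in the packed int array, else 0."""
    return 1 if needle in array.array("i", blob) else 0

con.create_function("blob_contains", 2, blob_contains)

print(con.execute(
    "SELECT id FROM samples WHERE blob_contains(readings, 14)").fetchall())
# → [(1,)]
```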
Re: [sqlite] User-defined infix functions
In your specific example you could simply define a custom "LIKE" function, and LIKE could become Unicode aware without any goofy new operators. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Jean-Christophe Deschamps Sent: Tuesday, November 03, 2009 7:51 PM To: SQLite mailing list Subject: [sqlite] User-defined infix functions Is it currently possible to specify that a user-defined function is of type infix, using the extension framework? It would be really easier to use, say a Unicode-aware LIKE named LIKEU under the infix form: ... test LIKEU pattern ... than ... LIKEU(pattern, test) ... Also converting existing statements from, for instance, the native LIKE to the new LIKEU would be _so_ easier as well! Would it be possible to have this feature someday, possibly as an optional parameter to the registering interface, or would it require too much deep surgery in the parser guts? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
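[Editor's note] This works because SQLite evaluates "X LIKE Y" by calling a SQL function named like(Y, X), and registering an application-defined like() overrides the built-in (at the cost of disabling the LIKE index optimization). A sketch with a Unicode case-folding like(); the wildcard translation here is a simplified illustration, not the full built-in semantics (no ESCAPE support):

```python
import re
import sqlite3

def unicode_like(pattern, value):
    # NULL in, NULL out, matching SQL semantics.
    if pattern is None or value is None:
        return None
    # Translate SQL wildcards (% and _) to a regex, compare case-folded.
    rx = "".join(".*" if c == "%" else "." if c == "_" else re.escape(c)
                 for c in pattern.casefold())
    return re.fullmatch(rx, value.casefold()) is not None

con = sqlite3.connect(":memory:")
# "X LIKE Y" is evaluated as like(Y, X): pattern first, then the value.
con.create_function("like", 2, unicode_like)

con.execute("CREATE TABLE t (name TEXT)")
con.execute("INSERT INTO t VALUES ('STRAßE')")

# The built-in LIKE only case-folds ASCII; casefold() maps ß to ss.
print(con.execute("SELECT name FROM t WHERE name LIKE 'strasse'").fetchall())
```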
Re: [sqlite] Local data structures vs sqlite
SQLite will have some overhead, no questions asked. If you aren't using indexes, a raw loop and raw data is going to be faster. A raw file will also always be smaller. That said, Dr. Hipp is right. If you are STORING the data, you should strongly consider using SQLite rather than proprietary file formats. SQLite is far more flexible, and the moment you need to adjust the schema, or add an index, SQLite will immediately become worth it. I wouldn't use SQLite for most in-memory data that never needs to be stored on disk, but I STRONGLY recommend SQLite for persistent data. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Kavita Raghunathan Sent: Thursday, November 05, 2009 5:44 PM To: General Discussion of SQLite Database Subject: [sqlite] Local data structures vs sqlite I saw a presentation on sqlite by Dr Hipp that mentioned that anytime I'm storing data in structures or tables, I should be thinking about using sqlite instead. Would it be more efficient to use the sqlite database to store a table that looks like this, where let's say I'm looking for the word "auto-align"? Would the query be quicker than searching through this table in a "for" or while loop? Assume the table has about 200 entries. I want to know if the performance will be better and if I should consider storing these constants in the database. . . 
{"giants", e_sf_attr_pm_ethernet_giants},
{"last_time_cleared", e_sf_attr_pm_ethernet_last_time_cleared},
{"port_counters_start", e_sf_attr_pm_ethernet_port_counters_start},
{"port_counters_end", e_sf_attr_pm_ethernet_port_counters_end},
{"mac_rcv_unicast", e_sf_attr_pm_ethernet_mac_rcv_unicast},
{"mac_rcv_multicast", e_sf_attr_pm_ethernet_mac_rcv_multicast},
{"mac_rcv_broadcast", e_sf_attr_pm_ethernet_mac_rcv_broadcast},
{"mac_xmit_unicast", e_sf_attr_pm_ethernet_mac_xmit_unicast},
{"mac_xmit_multicast", e_sf_attr_pm_ethernet_mac_xmit_multicast},
{"mac_xmit_broadcast", e_sf_attr_pm_ethernet_mac_xmit_broadcast},
{"mac_rcv_octet", e_sf_attr_pm_ethernet_mac_rcv_octet},
{"mac_xmit_octet", e_sf_attr_pm_ethernet_mac_xmit_octet},
{"mac_delay_exceed", e_sf_attr_pm_ethernet_mac_delay_exceed},
{"mac_mtu_exceed", e_sf_attr_pm_ethernet_mac_mtu_exceed},
{"mac_in_discard", e_sf_attr_pm_ethernet_mac_in_discard},
{"mac_out_discard", e_sf_attr_pm_ethernet_mac_out_discard},
{"mac_last_time_cleared", e_sf_attr_pm_ethernet_mac_last_time_cleared},
{"manual_align", e_sf_attr_pm_manual_alig},
{"auto_align", e_sf_attr_pm_auto_align},
{"initial_align", e_sf_attr_pm_initial_align},
{"seconds_on_align", e_sf_attr_pm_seconds_on_align},
{"align_start_time", e_sf_attr_pm_last_align_start_time},
{"align_start_trigger", e_sf_attr_pm_last_align_start_trigger},
{"align_start_azimuth", e_sf_attr_pm_last_align_start_azimuth},
{"align_start_elevation", e_sf_attr_pm_last_align_start_elevation},
{"align_start_rssi", e_sf_attr_pm_last_align_start_rssi},
{"align_start_ber", e_sf_attr_pm_last_align_start_ber},
{"align_end_time", e_sf_attr_pm_last_align_end_time},
. .

On 11/5/09 4:15 PM, "Beau Wilkinson" wrote:

> I really think this warrants further discussion. Perhaps the correct answer (that ARMs implement a non-standard FP type which is incompatible with Sqlite) is already out there, but I think the issues I raised with that answer should at least be addressed.
>
> Assuming (and perhaps this is the rub...) that Sqlite is built around C++ "float" and "double," then I fail to see how any FP system that is even plausibly useful could give the results cited by Mr Drozd. If I put (for example) the value 100.0 into a "double," and then transport or store/retrieve the binary representation somehow, and then take those bits and once more treat them as a "double," then I ought to get 100 (or at least something very, very close). These are the sorts of things that Sqlite should, to my mind at least, be doing with real number data, and it ought not to matter what the underlying representation is.
>
> And yet it has been put forth in this forum that such is not the case. Rather, the underlying representation must comply with the IEEE FP standard, or even basic operations will not work. And this is so certain, well-known, and reasonable that discussion amongst the plebeians is not warranted.
>
> How is this possible architecturally? The only explanation I can fathom is that Sqlite depends on the underlying representation following the IEEE standard at the bit level. For example, when doing sorts, maybe Sqlite is assuming the mantissae and exponents are in the bit ranges specified by IEEE, and that they are represented in the specified format (e.g. excess vs. complement notation) as well.
>
> If
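The lookup question in the message above (a ~200-entry name-to-constant table searched in a loop, versus an indexed SQLite query) can be sketched as follows. The table contents here are placeholders, not the poster's actual schema; the point is only that both approaches give the same answers, and that for 200 entries either will be fast enough that correctness and maintainability, not speed, should decide:

```python
import sqlite3

# Hypothetical stand-in for the ~200-entry name/enum table in the post.
entries = [("attr_%03d" % i, i) for i in range(200)]
entries.append(("auto_align", 999))

def loop_lookup(name):
    # The "for loop over a C table" approach: O(n) linear scan.
    for key, value in entries:
        if key == name:
            return value
    return None

# The SQLite approach: the PRIMARY KEY gives an indexed O(log n) lookup.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attrs(name TEXT PRIMARY KEY, value INTEGER)")
conn.executemany("INSERT INTO attrs VALUES (?, ?)", entries)

def db_lookup(name):
    row = conn.execute("SELECT value FROM attrs WHERE name = ?",
                       (name,)).fetchone()
    return row[0] if row else None
```

At n = 200 the per-call overhead of preparing and executing the SQL statement will typically dwarf the scan it replaces, which is exactly the "raw loop is faster" point made in the reply.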
Re: [sqlite] Local data structures vs sqlite
>> I wouldn't use SQLite for most in memory data that never needs to be stored on disk
>
> Even this depends entirely on your context. Of course if only a simple lookup in a table is more or less all you ever need, there is little point in SQLite. But if or when your requirements get more complex and things shift away from what basic data structures can do, SQLite can be something worth considering.

Yes. The operative word in my statement was "most". In other words, the vast majority of the time, data and structure needs are simple. Most data doesn't need to be stored and accessed in a way that makes SQL useful. Occasionally in-memory databases/tables are valuable, but they are the exception rather than the rule. I just don't want anyone saying "but this one dude told me to replace all my structs with SQLite memory tables."

> For instance, if the language you use doesn't offer native support for flexible data structures and fancy access to such data, you can feel much more comfortable using memory DB(s) than a collection of third-party libraries, not always consistent with each other, to achieve the same effect as SQLite can offer in minutes.

I think the best plan in this case would be to get a real language (the exception being if the language in question is assembly, in which case you didn't want SQLite anyway).

> I use AutoIt (a very complete Basic-like scripting language for Windows) to develop most applications I need. It doesn't offer structures, nor unions, nor objects, nor, say, associative arrays. But it only takes a dozen lines of code to have associative arrays available using an SQLite memory database, with more flexibility in usage than most AA implementations rigidly built into many fashionable languages.

I would go crazy if I had to use a language without any structures, objects, or arrays. How could a man LIVE without arrays? Heck, even assembly language has arrays and/or structures (allocate your memory and use offsets).
John
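The "dozen lines of code" claim about building an associative array on top of an in-memory SQLite database is easy to make concrete. A minimal sketch in Python (the class name and its small method set are this example's own, not anything from AutoIt or the thread):

```python
import sqlite3

class SQLiteDict:
    """A tiny associative array backed by an in-memory SQLite table."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE kv(k TEXT PRIMARY KEY, v)")

    def __setitem__(self, key, value):
        # INSERT OR REPLACE gives normal dict overwrite semantics.
        self._db.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))

    def __getitem__(self, key):
        row = self._db.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def __len__(self):
        return self._db.execute("SELECT COUNT(*) FROM kv").fetchone()[0]

d = SQLiteDict()
d["giants"] = 7
d["giants"] = 8   # overwrites, like a normal associative array
```

The flexibility argued for in the quoted message comes from the fact that, unlike a built-in associative array, this table can also be queried with arbitrary SQL (range scans, LIKE patterns, joins) at no extra implementation cost.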