Re: [sqlite] SQLite scalability
Valentin Davydov wrote: On Thu, Nov 21, 2013 at 11:43:32AM +0200, Baruch Burstein wrote: Hi all, I know SQLite is supposed to support DB sizes in the TB (I think the default configuration can reach 1TB). I am curious if anyone actually uses SQlite at anywhere near this. Yes. Does anyone use it regularly for DBs 500GB+ in size, or with tables containing 10 billion rows+? I've got an installation with SQLite DB of several terabytes in size. It contains about 20 billions thoroughly indexed records, and grows every day (more precisely, every night) by a few tens of millions of new records. How much concurrency does your use require? I've spent some efforts to eliminate concurrency in application. That is, updates and selects occur at very different times of the day. How long do selects take (assuming indexes are set correctly?) It depends of the size of the select. Single row is selected instantaneously. Check of the uniqueness takes about 1-2 minutes per 1 million of records, most of time being spent parsing SQL commands. Whereas aggregate functions over substantional fraction of the entire database, of course, take too long to be executed in real time. Use SQLightning instead - concurrency issues are irrelevant then, since writers don't block readers. And it will search multiple gigabytes per second, as opposed to your millions-per-minute figure above. Are there problems of locking ("normal" SQLite doesn't usually suffer from locking since transactions are very quick, but if transactions can be in the order of 100's of ms, I think locking can easily happen if the DB is accessed concurrently a few times a second, though I am not sure if this may only apply to writes). Yes. Single transaction (insertion of that tens of millions of new recors) takes hours in the worst case. There's no good reason for insertion of 10 million records to take hours. I understand that the answer to most of these questions can be very hardware (and software) dependent, Indeed not so. The only hardware capable of storing such amount of data is an array of magnetic disks, and their latency time (about 10-20 ms for random access) is much more than any reasonable software overhead. Even cache (internal SQLite page cache and/or operation system file cache) occupies the same memory and therefore has almost the same effect. The only software which determines the performance is SQLite itself, in my case, perhaps, trees rebalancing algorithm. 1TB SSDs are only ~$500; there's no reason to limit yourself to the slowness of magnetic disks these days. http://www.amazon.com/Samsung-Electronics-EVO-Series-2-5-Inch-MZ-7TE1T0BW/dp/B00E3W16OU SQLightning uses only the OS filesystem cache, so you get maximal use of the available system RAM instead of wasting half of it with redundant copies in application-level caches. but I am just trying to get a feel for SQLite's applicability for a project I am working on that may reach limits like these. The only definive SQLite limits are documentet in the relevant manual page. -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite scalability
On Thu, Nov 21, 2013 at 11:43:32AM +0200, Baruch Burstein wrote: > Hi all, > > I know SQLite is supposed to support DB sizes in the TB (I think the > default configuration can reach 1TB). I am curious if anyone actually uses > SQlite at anywhere near this. Yes. > Does anyone use it regularly for DBs 500GB+ > in size, or with tables containing 10 billion rows+? I've got an installation with SQLite DB of several terabytes in size. It contains about 20 billions thoroughly indexed records, and grows every day (more precisely, every night) by a few tens of millions of new records. > How much concurrency does your use require? I've spent some efforts to eliminate concurrency in application. That is, updates and selects occur at very different times of the day. > How long do selects take (assuming indexes are set correctly?) It depends of the size of the select. Single row is selected instantaneously. Check of the uniqueness takes about 1-2 minutes per 1 million of records, most of time being spent parsing SQL commands. Whereas aggregate functions over substantional fraction of the entire database, of course, take too long to be executed in real time. > Are there problems of locking ("normal" SQLite doesn't usually > suffer from locking since transactions are very quick, but if transactions > can be in the order of 100's of ms, I think locking can easily happen if > the DB is accessed concurrently a few times a second, though I am not sure > if this may only apply to writes). Yes. Single transaction (insertion of that tens of millions of new recors) takes hours in the worst case. > I understand that the answer to most of these questions can be very > hardware (and software) dependent, Indeed not so. The only hardware capable of storing such amount of data is an array of magnetic disks, and their latency time (about 10-20 ms for random access) is much more than any reasonable software overhead. Even cache (internal SQLite page cache and/or operation system file cache) occupies the same memory and therefore has almost the same effect. The only software which determines the performance is SQLite itself, in my case, perhaps, trees rebalancing algorithm. > but I am just trying to get a feel for > SQLite's applicability for a project I am working on that may reach limits > like these. The only definive SQLite limits are documentet in the relevant manual page. Valentin Davydov. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite scalability
On Thu, 21 Nov 2013 11:43:32 +0200 Baruch Bursteinwrote: > Hi all, > > I know SQLite is supposed to support DB sizes in the TB (I think the > default configuration can reach 1TB). I am curious if anyone actually uses > SQlite at anywhere near this. Does anyone use it regularly for DBs 500GB+ > in size, or with tables containing 10 billion rows+? How much concurrency > does your use require? How long do selects take (assuming indexes are set > correctly?) Are there problems of locking ("normal" SQLite doesn't usually > suffer from locking since transactions are very quick, but if transactions > can be in the order of 100's of ms, I think locking can easily happen if > the DB is accessed concurrently a few times a second, though I am not sure > if this may only apply to writes). > > I understand that the answer to most of these questions can be very > hardware (and software) dependent, but I am just trying to get a feel for > SQLite's applicability for a project I am working on that may reach limits > like these. Depends on what type/kind of use you need for your data. If you are going to do a lot insert, update, delete, perhaps sqlite isn't for you. If it's principal use is for select, when populate the tables, do it pre-sorted by the data primary key or by the colum which makes a better quality index for your use (look for 'low quality indexes' in sqlite docs, and do the opposite). Normalize your db as much as you can, but not more ;) . Sqlite allows you to attach up to 30 db files. Split your data between 2-3 db and put each one in different disks. Increase cache size before creating any table, other RDBMS uses a lot of memory from several MBs to GBs, give Sqlite cache 500MB-1GB for example. Write your queries in different ways and test which is better, there are some tricks with indexes, joins..., that can help you (study documentation) Set STATS3 or 4 for analyze your db data before creating any table. Set autovacuum full before creating any table, even if you don't plan to delete/update data. If I Remember Correctly (if not correct me please), autovacuum adds metadata to db file that allows Sqlite engine do some internal works faster. > > Thanks, > Baruch > > -- > ˙uʍop-ǝpısdn sı ɹoʇıuoɯ ɹnoʎ 'sıɥʇ pɐǝɹ uɐɔ noʎ ɟı > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users --- --- Eduardo Morras ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users