Re: [sqlite] FTS vs INDEX
2011/10/19 Fabian: > I always do inserts in batches of 100.000 rows, and after each batch I > manually merge the b-trees using: > > INSERT INTO table(table) VALUES('optimize'); > > Is there a possibility that it will do automatic maintenance half-way during > a batch? Or will it always wait untill the transaction is finished? I think you are victim of the premature optimization :) See documentation: "several different b-trees that are incrementally merged as rows are inserted, updated and deleted. This technique improves performance when writing to an FTS table, but causes some overhead for full-text queries that use the index." So you can work with a big FTS tables without using the "optimize" method. I use some FTS tables with tens of millions records and effect of the "optimize" isn't measurable. -- Best regards, Alexey Pechnikov. http://pechnikov.tel/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS vs INDEX
On Wed, Oct 19, 2011 at 12:50 PM, Fabianwrote: > 2011/10/19 Scott Hess >> To be clear, how it works is that new insertions are batched into a >> new index tree, with index trees periodically aggregated to keep >> selection efficient and to keep the size contained. So while the >> speed per insert should remain pretty stable constant, periodically an >> insert will require index maintenance, so that insert will be slower. >> If you have a lot of documents (or a small page cache) these >> maintenance events can get pretty expensive relative to the cost of a >> non-maintenance insert. So it's not a clear-cut win, but it probably >> would be interesting as an alternative sort of index for some tables. > > I always do inserts in batches of 100.000 rows, and after each batch I > manually merge the b-trees using: > > INSERT INTO table(table) VALUES('optimize'); > > Is there a possibility that it will do automatic maintenance half-way during > a batch? Or will it always wait untill the transaction is finished? It does it when it does it, in fact you're probably getting some small merges during this process already. If you're doing your batch inserts within a surrounding transaction, and are inserting documents by ascending docid (or letting the system choose docid), it can buffer up many updates in memory before flushing them to disk, which is pretty efficient. Inserting 100,000 documents this way will probably not hit any very large merges, unless your documents tend to have a very large number of unique terms. -scott ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS vs INDEX
2011/10/19 Scott Hess> > To be clear, how it works is that new insertions are batched into a > new index tree, with index trees periodically aggregated to keep > selection efficient and to keep the size contained. So while the > speed per insert should remain pretty stable constant, periodically an > insert will require index maintenance, so that insert will be slower. > If you have a lot of documents (or a small page cache) these > maintenance events can get pretty expensive relative to the cost of a > non-maintenance insert. So it's not a clear-cut win, but it probably > would be interesting as an alternative sort of index for some tables. > > I always do inserts in batches of 100.000 rows, and after each batch I manually merge the b-trees using: INSERT INTO table(table) VALUES('optimize'); Is there a possibility that it will do automatic maintenance half-way during a batch? Or will it always wait untill the transaction is finished? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS vs INDEX
On Wed, Oct 19, 2011 at 7:56 AM, Fabianwrote: > 2011/10/19 Alexey Pechnikov >> FTS use index multi-tree and de-facto has _no_ insert speed degradation. > > Thanks, that's good to hear! It makes me wonder why SQLite doesn't use that > same multi-tree mechanism for regular indexes, but that's a whole different > question. To be clear, how it works is that new insertions are batched into a new index tree, with index trees periodically aggregated to keep selection efficient and to keep the size contained. So while the speed per insert should remain pretty stable constant, periodically an insert will require index maintenance, so that insert will be slower. If you have a lot of documents (or a small page cache) these maintenance events can get pretty expensive relative to the cost of a non-maintenance insert. So it's not a clear-cut win, but it probably would be interesting as an alternative sort of index for some tables. -scott ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS vs INDEX
Nope -- didn't note the insert speed on that test. Why don't you take my benchmark data and test it yourself? Then post the results. The saying "your mileage may vary" comes to mind... Michael D. Black Senior Scientist Advanced Analytics Directorate Advanced GEOINT Solutions Operating Unit Northrop Grumman Information Systems From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of Fabian [fabianpi...@gmail.com] Sent: Wednesday, October 19, 2011 9:44 AM To: General Discussion of SQLite Database Subject: EXT :Re: [sqlite] FTS vs INDEX Very interesting benchmarks! However it seems to focus mainly on the speed of SELECT queries, and the total size of the resulting database on disk. But my main concern is about the speed of INSERT queries vs normal tables. Any chance you compared that too? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS vs INDEX
2011/10/19 Fabian: > Thanks, that's good to hear! It makes me wonder why SQLite doesn't use that > same multi-tree mechanism for regular indexes, but that's a whole different > question. It's impossible with SQLite3 database format. May be SQLite4 will be support it :) -- Best regards, Alexey Pechnikov. http://pechnikov.tel/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS vs INDEX
2011/10/19 Alexey Pechnikov> FTS use index multi-tree and de-facto has _no_ insert speed degradation. > Thanks, that's good to hear! It makes me wonder why SQLite doesn't use that same multi-tree mechanism for regular indexes, but that's a whole different question. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS vs INDEX
FTS use index multi-tree and de-facto has _no_ insert speed degradation. I did do test for 400+ millions of records. With b-tree index there is insert speed degradation: http://geomapx.blogspot.com/2010/04/sqlite-index-degradation-tests.html http://geomapx.blogspot.com/search?q=index+speed So FTS as hash-index is nice. 2011/10/19 Fabian: > Did anyone do some benchmarks how the insert-speed of FTS compares to a TEXT > INDEX column? I don't need many of the extra features of FTS, because I > always need to look up rows by prefix or exact match, and both can be > implemented efficiently via TEXT INDEX too. But if the overhead is > comparable, I'd rather use FTS. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- Best regards, Alexey Pechnikov. http://pechnikov.tel/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS vs INDEX
Very interesting benchmarks! However it seems to focus mainly on the speed of SELECT queries, and the total size of the resulting database on disk. But my main concern is about the speed of INSERT queries vs normal tables. Any chance you compared that too? 2011/10/19 Black, Michael (IS) <michael.bla...@ngc.com> > I recently benchmarked this...FTS4 has a prefix option that can make it > slightly faster than TEXT. Other than that it's about the same speed. > > http://www.mail-archive.com/sqlite-users@sqlite.org/msg64591.html > > > > The older part of the thread has the benchmark data > > > > Michael D. Black > > Senior Scientist > > Advanced Analytics Directorate > > Advanced GEOINT Solutions Operating Unit > > Northrop Grumman Information Systems > > > From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on > behalf of Fabian [fabianpi...@gmail.com] > Sent: Wednesday, October 19, 2011 9:20 AM > To: General Discussion of SQLite Database > Subject: EXT :[sqlite] FTS vs INDEX > > Did anyone do some benchmarks how the insert-speed of FTS compares to a > TEXT > INDEX column? I don't need many of the extra features of FTS, because I > always need to look up rows by prefix or exact match, and both can be > implemented efficiently via TEXT INDEX too. But if the overhead is > comparable, I'd rather use FTS. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS vs INDEX
I recently benchmarked this...FTS4 has a prefix option that can make it slightly faster than TEXT. Other than that it's about the same speed. http://www.mail-archive.com/sqlite-users@sqlite.org/msg64591.html The older part of the thread has the benchmark data Michael D. Black Senior Scientist Advanced Analytics Directorate Advanced GEOINT Solutions Operating Unit Northrop Grumman Information Systems From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of Fabian [fabianpi...@gmail.com] Sent: Wednesday, October 19, 2011 9:20 AM To: General Discussion of SQLite Database Subject: EXT :[sqlite] FTS vs INDEX Did anyone do some benchmarks how the insert-speed of FTS compares to a TEXT INDEX column? I don't need many of the extra features of FTS, because I always need to look up rows by prefix or exact match, and both can be implemented efficiently via TEXT INDEX too. But if the overhead is comparable, I'd rather use FTS. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] FTS vs INDEX
Did anyone do some benchmarks how the insert-speed of FTS compares to a TEXT INDEX column? I don't need many of the extra features of FTS, because I always need to look up rows by prefix or exact match, and both can be implemented efficiently via TEXT INDEX too. But if the overhead is comparable, I'd rather use FTS. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users