Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Alexey Pechnikov
2011/10/19 Fabian :
> I always do inserts in batches of 100.000 rows, and after each batch I
> manually merge the b-trees using:
>
>  INSERT INTO table(table) VALUES('optimize');
>
> Is there a possibility that it will do automatic maintenance half-way during
> a batch? Or will it always wait untill the transaction is finished?

I think you are victim of the premature optimization :)

See documentation:
"several different b-trees that are incrementally merged as rows are inserted,
updated and deleted. This technique improves performance when writing to an
FTS table, but causes some overhead for full-text queries that use the index."

So you can work with a big FTS tables without using the "optimize" method.
I use some FTS tables with tens of millions records and effect of the
"optimize"
isn't measurable.

-- 
Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Scott Hess
On Wed, Oct 19, 2011 at 12:50 PM, Fabian  wrote:
> 2011/10/19 Scott Hess 
>> To be clear, how it works is that new insertions are batched into a
>> new index tree, with index trees periodically aggregated to keep
>> selection efficient and to keep the size contained.  So while the
>> speed per insert should remain pretty stable constant, periodically an
>> insert will require index maintenance, so that insert will be slower.
>> If you have a lot of documents (or a small page cache) these
>> maintenance events can get pretty expensive relative to the cost of a
>> non-maintenance insert.  So it's not a clear-cut win, but it probably
>> would be interesting as an alternative sort of index for some tables.
>
> I always do inserts in batches of 100.000 rows, and after each batch I
> manually merge the b-trees using:
>
>  INSERT INTO table(table) VALUES('optimize');
>
> Is there a possibility that it will do automatic maintenance half-way during
> a batch? Or will it always wait untill the transaction is finished?

It does it when it does it, in fact you're probably getting some small
merges during this process already.  If you're doing your batch
inserts within a surrounding transaction, and are inserting documents
by ascending docid (or letting the system choose docid), it can buffer
up many updates in memory before flushing them to disk, which is
pretty efficient.  Inserting 100,000 documents this way will probably
not hit any very large merges, unless your documents tend to have a
very large number of unique terms.

-scott
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Fabian
2011/10/19 Scott Hess 

>
> To be clear, how it works is that new insertions are batched into a
> new index tree, with index trees periodically aggregated to keep
> selection efficient and to keep the size contained.  So while the
> speed per insert should remain pretty stable constant, periodically an
> insert will require index maintenance, so that insert will be slower.
> If you have a lot of documents (or a small page cache) these
> maintenance events can get pretty expensive relative to the cost of a
> non-maintenance insert.  So it's not a clear-cut win, but it probably
> would be interesting as an alternative sort of index for some tables.
>
>
I always do inserts in batches of 100.000 rows, and after each batch I
manually merge the b-trees using:

 INSERT INTO table(table) VALUES('optimize');

Is there a possibility that it will do automatic maintenance half-way during
a batch? Or will it always wait untill the transaction is finished?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Scott Hess
On Wed, Oct 19, 2011 at 7:56 AM, Fabian  wrote:
> 2011/10/19 Alexey Pechnikov 
>> FTS use index multi-tree and de-facto has _no_ insert speed degradation.
>
> Thanks, that's good to hear! It makes me wonder why SQLite doesn't use that
> same multi-tree mechanism for regular indexes, but that's a whole different
> question.

To be clear, how it works is that new insertions are batched into a
new index tree, with index trees periodically aggregated to keep
selection efficient and to keep the size contained.  So while the
speed per insert should remain pretty stable constant, periodically an
insert will require index maintenance, so that insert will be slower.
If you have a lot of documents (or a small page cache) these
maintenance events can get pretty expensive relative to the cost of a
non-maintenance insert.  So it's not a clear-cut win, but it probably
would be interesting as an alternative sort of index for some tables.

-scott
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Black, Michael (IS)
Nope -- didn't note the insert speed on that test.



Why don't you take my benchmark data and test it yourself?  Then post the 
results.



The saying "your mileage may vary" comes to mind...





Michael D. Black

Senior Scientist

Advanced Analytics Directorate

Advanced GEOINT Solutions Operating Unit

Northrop Grumman Information Systems


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Fabian [fabianpi...@gmail.com]
Sent: Wednesday, October 19, 2011 9:44 AM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] FTS vs INDEX

Very interesting benchmarks! However it seems to focus mainly on the speed
of SELECT queries, and the total size of the resulting database on disk. But
my main concern is about the speed of INSERT queries vs normal tables. Any
chance you compared that too?

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Alexey Pechnikov
2011/10/19 Fabian :
> Thanks, that's good to hear! It makes me wonder why SQLite doesn't use that
> same multi-tree mechanism for regular indexes, but that's a whole different
> question.

It's impossible with SQLite3 database format. May be SQLite4 will be
support it :)

-- 
Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Fabian
2011/10/19 Alexey Pechnikov 

> FTS use index multi-tree and de-facto has _no_ insert speed degradation.
>

Thanks, that's good to hear! It makes me wonder why SQLite doesn't use that
same multi-tree mechanism for regular indexes, but that's a whole different
question.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Alexey Pechnikov
FTS use index multi-tree and de-facto has _no_ insert speed degradation.
I did do test for 400+ millions of records.
With b-tree index there is insert speed degradation:
http://geomapx.blogspot.com/2010/04/sqlite-index-degradation-tests.html
http://geomapx.blogspot.com/search?q=index+speed

So FTS as hash-index is nice.

2011/10/19 Fabian :
> Did anyone do some benchmarks how the insert-speed of FTS compares to a TEXT
> INDEX column? I don't need many of the extra features of FTS, because I
> always need to look up rows by prefix or exact match, and both can be
> implemented efficiently via TEXT INDEX too. But if the overhead is
> comparable, I'd rather use FTS.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
Best regards, Alexey Pechnikov.
http://pechnikov.tel/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Fabian
Very interesting benchmarks! However it seems to focus mainly on the speed
of SELECT queries, and the total size of the resulting database on disk. But
my main concern is about the speed of INSERT queries vs normal tables. Any
chance you compared that too?

2011/10/19 Black, Michael (IS) <michael.bla...@ngc.com>

> I recently benchmarked this...FTS4 has a prefix option that can make it
> slightly faster than TEXT.  Other than that it's about the same speed.
>
> http://www.mail-archive.com/sqlite-users@sqlite.org/msg64591.html
>
>
>
> The older part of the thread has the benchmark data
>
>
>
> Michael D. Black
>
> Senior Scientist
>
> Advanced Analytics Directorate
>
> Advanced GEOINT Solutions Operating Unit
>
> Northrop Grumman Information Systems
>
> 
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on
> behalf of Fabian [fabianpi...@gmail.com]
> Sent: Wednesday, October 19, 2011 9:20 AM
> To: General Discussion of SQLite Database
> Subject: EXT :[sqlite] FTS vs INDEX
>
> Did anyone do some benchmarks how the insert-speed of FTS compares to a
> TEXT
> INDEX column? I don't need many of the extra features of FTS, because I
> always need to look up rows by prefix or exact match, and both can be
> implemented efficiently via TEXT INDEX too. But if the overhead is
> comparable, I'd rather use FTS.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS vs INDEX

2011-10-19 Thread Black, Michael (IS)
I recently benchmarked this...FTS4 has a prefix option that can make it 
slightly faster than TEXT.  Other than that it's about the same speed.

http://www.mail-archive.com/sqlite-users@sqlite.org/msg64591.html



The older part of the thread has the benchmark data



Michael D. Black

Senior Scientist

Advanced Analytics Directorate

Advanced GEOINT Solutions Operating Unit

Northrop Grumman Information Systems


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Fabian [fabianpi...@gmail.com]
Sent: Wednesday, October 19, 2011 9:20 AM
To: General Discussion of SQLite Database
Subject: EXT :[sqlite] FTS vs INDEX

Did anyone do some benchmarks how the insert-speed of FTS compares to a TEXT
INDEX column? I don't need many of the extra features of FTS, because I
always need to look up rows by prefix or exact match, and both can be
implemented efficiently via TEXT INDEX too. But if the overhead is
comparable, I'd rather use FTS.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] FTS vs INDEX

2011-10-19 Thread Fabian
Did anyone do some benchmarks how the insert-speed of FTS compares to a TEXT
INDEX column? I don't need many of the extra features of FTS, because I
always need to look up rows by prefix or exact match, and both can be
implemented efficiently via TEXT INDEX too. But if the overhead is
comparable, I'd rather use FTS.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users