We are using the FTS3 extension to sqlite to store large numbers of short
(~300 byte) documents. This is working very well and providing us with very
fast text search, but the behaviour around deletion of documents has me
confused.

Our system must control the the size of the database and will delete the
oldest documents when the database size breaches a certain limit. I now
understand from comments on this mailing list and elsewhere that this is not
an optimal pattern for the FTS extension as doclists for the oldest
documents are the least likely to be 'merged'.

My question is, does this actually work at all? If I delete a row from my
FTS4 table (resulting in a new empty doclist being added to the index), then
I subsequently add many (1000s) new documents and call the 'merge' function
several times (automerge is also enabled), is there any gaurentee that the
empty doclist and the populated doclist that it superseded will ever be
removed? My testing suggests this isn't the case.

I have a 1GB database with 6million documents. If I keep adding new
documents at around 1 per second and deleting documents when the size of the
data goes beyond 1GB, the size of the index seems to grow and the number of
documents I can store in the 1GB file seems decrease in a linear manner.

Calling the 'optimize' function seems to solve this issue (removing all the
dead doclists), but that isn't practical for our software, as it implies
some downtime for our high availablity service due to the long execution
time of the optimize function (Could be minutes for a 1GB file).

I have seen this
(http://sqlite.1065341.n5.nabble.com/fts3-database-grows-td42069.html) post
from 2008. However, it predates the 'automerge' and manual merge features,
and from the documentation I assumed these new features would delete all the
data related to deleted documents. Am I incorrect in my assumption?

Thanks for any clarification you can offer.



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/FTS3-4-merge-function-behaviour-when-deleting-rows-tp75370.html
Sent from the SQLite mailing list archive at Nabble.com.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to