On 12/25/2017 11:17 PM, Josu Diaz de Arcaya wrote:
Hi all,

I am very happy with the inclusion of LSM extension in sqlite3's code base.
I have a question regarding
the merging of segments within the database.

I noticed that autowork enabled was not enough for keeping disk consumption
at bay, the generated database
was N times larger than the generated by sqlite3.

If you're doing lots of operations on the same key - i.e. deleting and recreating the entries, or updating them often, then an LSM database can end up much larger than the equivalent b-tree - basically because it is so lazy about deleting old keys. Is your workload like that?

Using lsm_work(db,1-1,0)
after every transaction seems to
mitigate this and the disk footprint of the db gets back to normal. This
seem to be explained by Mr. Kennedy
here http://www.sqlite.org/src4/doc/tip/www/lsmusr.wiki#explicit_scheduling

* What does exactly do lsm_work(db,1,-1,0) ? My assumption is that it
merges all the on disk segments into one larger
segment, is this correct?

That's correct. Calling lsm_work(db, 1, -1, 0) merges all segments into one. i.e. it rewrites the entire database to be a single optimized b-tree. Calling it after every transaction seems like it would make the system much less efficient than a regular b-tree database.

* Doing this after every transactions of course takes some tool on
performance and everything goes significantly slower. What
is the best way of running lsm_work? in a separate thread like it is
mentioned in sqlite4's documentation?
http://www.sqlite.org/src4/doc/tip/www/lsmusr.wiki#explicit_scheduling

Using a background thread for merges is how Leveldb & co. work. It's not a bad idea I think.


* Finally, can I call lsm_work less aggressively in terms of parameters? I
am worried about virtual memory consumption so I'd rather not do
a full lsm_work every time, something like.. merge until there are no more
elements with delete markers on it?

I don't think there is a way to do that. And I'm not sure it would be substantially different from just optimizing the entire db anyway.

Setting LSM_CONFIG_AUTOMERGE to 2 might help a bit.

Dan.

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to