On 12/25/2017 11:17 PM, Josu Diaz de Arcaya wrote:
Hi all,
I am very happy with the inclusion of LSM extension in sqlite3's code base.
I have a question regarding
the merging of segments within the database.
I noticed that autowork enabled was not enough for keeping disk consumption
at bay, the generated database
was N times larger than the generated by sqlite3.
If you're doing lots of operations on the same key - i.e. deleting and
recreating the entries, or updating them often, then an LSM database can
end up much larger than the equivalent b-tree - basically because it is
so lazy about deleting old keys. Is your workload like that?
Using lsm_work(db,1-1,0)
after every transaction seems to
mitigate this and the disk footprint of the db gets back to normal. This
seem to be explained by Mr. Kennedy
here http://www.sqlite.org/src4/doc/tip/www/lsmusr.wiki#explicit_scheduling
* What does exactly do lsm_work(db,1,-1,0) ? My assumption is that it
merges all the on disk segments into one larger
segment, is this correct?
That's correct. Calling lsm_work(db, 1, -1, 0) merges all segments into
one. i.e. it rewrites the entire database to be a single optimized
b-tree. Calling it after every transaction seems like it would make the
system much less efficient than a regular b-tree database.
* Doing this after every transactions of course takes some tool on
performance and everything goes significantly slower. What
is the best way of running lsm_work? in a separate thread like it is
mentioned in sqlite4's documentation?
http://www.sqlite.org/src4/doc/tip/www/lsmusr.wiki#explicit_scheduling
Using a background thread for merges is how Leveldb & co. work. It's not
a bad idea I think.
* Finally, can I call lsm_work less aggressively in terms of parameters? I
am worried about virtual memory consumption so I'd rather not do
a full lsm_work every time, something like.. merge until there are no more
elements with delete markers on it?
I don't think there is a way to do that. And I'm not sure it would be
substantially different from just optimizing the entire db anyway.
Setting LSM_CONFIG_AUTOMERGE to 2 might help a bit.
Dan.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users