On Fri, 21 Nov 2014 14:01:39 -0500 (EST)
Joseph Fernandes <josfe...@redhat.com> wrote:

> 4) Therefore, we are looking at a datastore that can give us a very
> quick write (almost zero latency, as the recording is done inline
> w.r.t. file IO) and that has good data querying facilities (slight
> latency in the read is fine, but the freshness of the record data
> should be spot on).

This strikes me as a classic case for "record, then analyze".  I would
capture the data with something much cheaper than a database at write
time, and use SQLite afterward to decide what to do.

You don't really care if the recorded times are exactly right; a few
missed updates wouldn't affect the cold/hot status very much, and you
should be willing to lose a few in exchange for better write latency.
OTOH the maintenance operation isn't *very* time critical either; you
just can't afford to walk the whole tree first.

That suggests two possibilities for capture: 

1.  Keep a sequential file of {name,time} or {inode,time} pairs
(whichever is more convenient to use).  By opening with O_APPEND and
emitting each record as a single write(2), you get atomic appends and
uncorrupted records across threads and processes.  fsync(2) as
desired.  (See the first sketch after this list.)

2.  If in practice the above file grows too large, use a primitive
hashing store such as BerkeleyDB to capture counts by name/inode.  It's
not even obvious you need an external store; you might be able to get
away with a std::unordered_map in C++ and periodically serialize it
(second sketch below).  ISTM you don't need to worry about concurrency,
because a few missed updates here and there won't change much.
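
To make option 1 concrete, here is a minimal sketch in C++ over the
POSIX calls mentioned above.  The file name, record layout, and
function names are my own illustration, nothing Gluster-specific:

    // Option 1: append-only {inode,time} log.
    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdio>
    #include <ctime>
    #include <sys/types.h>

    // One record per call; a single write(2) on an O_APPEND
    // descriptor is appended atomically, so concurrent writers
    // never interleave partial records.
    void log_access(int fd, ino_t inode)
    {
        char buf[64];
        int n = snprintf(buf, sizeof buf, "%llu %lld\n",
                         (unsigned long long)inode,
                         (long long)time(nullptr));
        write(fd, buf, n);   // one write == one complete record
        // fsync(fd);        // only if per-record durability matters
    }

    int main()
    {
        int fd = open("access.log",
                      O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd < 0) return 1;
        log_access(fd, 12345);   // hypothetical inode number
        close(fd);
    }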
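
And a sketch of option 2, counting in memory and dumping periodically.
Again the names are illustrative, and per the concurrency note above
no locking is shown; add a mutex if lost updates turn out to matter:

    // Option 2: in-memory counts, serialized every so often.
    #include <cstdio>
    #include <ctime>
    #include <unordered_map>
    #include <sys/types.h>

    static std::unordered_map<ino_t, unsigned long> counts;

    void record_access(ino_t inode)
    {
        ++counts[inode];   // cheap in-memory increment
    }

    // Call from a timer, e.g. every few minutes; appends one
    // batch of {inode, count, timestamp} lines per flush.
    void flush_counts(const char *path)
    {
        FILE *f = fopen(path, "a");
        if (!f) return;
        long long now = time(nullptr);
        for (const auto &kv : counts)
            fprintf(f, "%llu %lu %lld\n",
                    (unsigned long long)kv.first, kv.second, now);
        fclose(f);
        counts.clear();
    }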

At maintenance time, scoop the file into a SQLite table, and you're
back where you started: the data is queryable with plain SQL, except
now you've already achieved the near-zero write-time latency you were
after.
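
That load step can be a short program against the SQLite C API (the
database, file, and table names here are assumed to match the option-1
sketch).  Doing all the INSERTs inside one transaction is what keeps
the bulk load fast:

    // Maintenance: bulk-load the log, then analyze in SQL.
    #include <cstdio>
    #include <sqlite3.h>

    int main()
    {
        sqlite3 *db;
        if (sqlite3_open("hotcold.db", &db) != SQLITE_OK) return 1;

        sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS access"
                         "(inode INTEGER, atime INTEGER)", 0, 0, 0);
        sqlite3_exec(db, "BEGIN", 0, 0, 0);  // one big transaction

        sqlite3_stmt *ins;
        sqlite3_prepare_v2(db, "INSERT INTO access VALUES (?,?)",
                           -1, &ins, 0);

        FILE *f = fopen("access.log", "r");
        long long inode, atime;
        while (f && fscanf(f, "%lld %lld", &inode, &atime) == 2) {
            sqlite3_bind_int64(ins, 1, inode);
            sqlite3_bind_int64(ins, 2, atime);
            sqlite3_step(ins);
            sqlite3_reset(ins);
        }
        if (f) fclose(f);
        sqlite3_finalize(ins);
        sqlite3_exec(db, "COMMIT", 0, 0, 0);

        // Hot/cold is now a plain query, e.g. last access per inode:
        //   SELECT inode, MAX(atime) FROM access GROUP BY inode;
        sqlite3_close(db);
    }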

HTH.  

--jkl