On 12/02/2009 10:55 PM, Pete Zaitcev wrote:
> I need a way to scan all objects that a Chunk node keeps. There's
> a function that does it already: fs_list_objs. Looking at it, is there
> a reason why it uses readdir instead of tchdbiternext?
The TC database master.tch stores table data, not object data.
master.tch is a (table name)->(table identifier) lookup table.
The object "database" remains 100% filesystem-based, with a fixed-length
metadata header prepended to each object.
That means per-object lookup and retrieval is super-quick, with the
kernel's pagecache and i/dcaches working hard for us.
However, the list-objects operation requires that we open each object's
file, and read the fixed-length metadata header. Objects not belonging
to the authenticated user are then discarded from the list-objects output.
A truly server-punishing operation -- opening and reading EVERY file's
fixed-length header -- but the thought was that list-objects would be so
infrequent (once daily? once per cluster boot?) that it would not
matter much.
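As a rough illustration of the scan described above, here is a minimal
sketch in Python (chunkd itself is C). The header layout, the field sizes,
and the names HDR_FMT, write_object and list_objects are assumptions made
up for this example; they are not chunkd's actual on-disk format or API.

```python
import os
import struct

# Hypothetical fixed-length header: 64-byte owner, 64-byte key,
# 8-byte data length. This layout is an assumption for illustration only.
HDR_FMT = "<64s64sQ"
HDR_LEN = struct.calcsize(HDR_FMT)


def write_object(dirpath, fname, owner, key, data):
    """Store one object as a file: fixed-length header, then the data."""
    hdr = struct.pack(HDR_FMT, owner.encode(), key.encode(), len(data))
    with open(os.path.join(dirpath, fname), "wb") as f:
        f.write(hdr + data)


def list_objects(dirpath, user):
    """Open every file, read only its header, keep objects owned by user."""
    found = []
    with os.scandir(dirpath) as entries:
        for entry in entries:
            if not entry.is_file():
                continue
            with open(entry.path, "rb") as f:
                raw = f.read(HDR_LEN)
            if len(raw) < HDR_LEN:
                continue  # too short to hold a header, skip it
            owner, key, size = struct.unpack(HDR_FMT, raw)
            if owner.rstrip(b"\0").decode() != user:
                continue  # belongs to someone else, discard
            found.append((key.rstrip(b"\0").decode(), size))
    return sorted(found)
```

Every call to list_objects costs one open and one read per file in the
directory, regardless of how many objects the user actually owns, which
is the expense noted above.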
If that assumption turns out to be invalid or unwise, we can certainly
change things (see below).
> In case of self-checking, scanning directories is undesirable, because
> if an object somehow (e.g. a hardware failure) ends up existing in the
> filesystem but without a corresponding entry in the TC database, it will
> incorrectly count as present.
I had considered storing object metadata in an additional TC database,
for a couple of reasons: much faster list-objects and object metadata
retrieval, and storage of small objects.
If some future chunkd stores object metadata in a TC database, yes,
inconsistencies could arise.
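To make the possible inconsistency concrete, here is a small Python
sketch of a cross-check between a metadata index and the files on disk.
The plain dict stands in for a hypothetical TC metadata database, and
keying it by file name is an assumption; check_consistency is a made-up
name, not an existing chunkd function.

```python
import os


def check_consistency(dirpath, index):
    """Compare files on disk against keys in a metadata index.

    Returns (orphans, missing):
      orphans - files on disk with no index entry
      missing - index entries with no file on disk
    """
    with os.scandir(dirpath) as entries:
        on_disk = {e.name for e in entries if e.is_file()}
    indexed = set(index)
    return sorted(on_disk - indexed), sorted(indexed - on_disk)
```

A readdir-based scan alone would report the orphans as present, while an
iteration over the index alone would report the missing ones as present;
a self-check presumably wants to look at both sides.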
Jeff