Jeff Garzik
Sun, 15 Nov 2009 02:47:41 -0800
On 11/14/2009 11:53 PM, Pete Zaitcev wrote:
On Sat, 14 Nov 2009 18:44:26 -0500, Jeff Garzik<j...@garzik.org> wrote:1) What is the point of db->del() in rep_add_nid() ? You are in the middle of a transaction, and you immediately overwrite that record in the same transaction. That is clearly unnecessary work, when db->put() will simply overwrite an existing record, if requested.That didn't work last I tested, but I forgot what the issue was. The code is copied from object.c. I'll review it.
The only code matching that pattern is object_put_end(), which also performs unnecessary work of needlessly calling db->del().
This should be recognized as nothing more than programmer laziness, enabled by the sharing of __object_del() between DEL and PUT, and the necessity of deleting associated records (ACLs) in conjunction with the main [deletion | update].
That does not apply to rep_add_nid(), which has neither object data nor related records to delete.
(patches accepted to remove ->del from PUT path, too, if you wish)
2) rep_scan(): I would rather not make the entire daemon non-responsive for multiple seconds. The database is already in multi-threaded mode, and db4 is free-threaded, so it would seem to make a lot more sense to simply g_thread_create() a thread to do this work.True, but multi-threading in tabled is a much bigger undertaking than just "simply" creating a thread. There's a lot of common state. I think the 2s problem is less urgent than processing massive replication more efficiently (from a Chunk down).
I only see one global variable reference in the whole nicely-designed, nicely self-contained replica.c file. "a lot of common state" seems like quite an exaggeration.
This patch resuscitates the worst application model from the late 1980's, cooperative multi-tasking. I do not want to walk down that road: such programs have weird pauses at weird times, and behave in a very non-deterministic manner.
Furthermore, I think you will find that threading gives you a lot more freedom to work in the background.
Also, there should be no need to scan a chunkd's keys at all, as long as the chunkd instance is still communicating with us.No, there's a need, although there's no code for it: the scan should probe keys by requesting their metadata, which forturously includes checksums. This is how the results of Chunk's self-test will be communicated to tabled.
What is the need? This description provides "what" and "how" but does not answer "why?"
Jeff
--
To unsubscribe from this list: send the line "unsubscribe hail-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html