Pete Zaitcev wrote:
On Wed, 29 Jul 2009 10:31:56 -0400, Jeff Garzik <[email protected]> wrote:

Is there someone taking the point for the chunkd development?
That's me, for the moment.  :)

I have a short to-do list for Chunk, after which I don't have
any particular plans:
 * Exit if CLD registration fails (maybe!).

Hopefully all this is wrapped up into libcldc, so that an application only needs to worry about major, abstracted events after calling new-session:

* no master, after a defined "hunt" procedure.

        This includes both init and master failure (as distinguished
        from fail-over).

        The application will need to be in the "no CLD session"
        state in both cases.

        And indeed, exit() might be the best way to do that.

* master fail-over

        Flush our [currently non-existent] CLD cache.

etc.
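
To make the two events above concrete, here is a rough sketch of how an application might react to them. Every name here (sess_event, app_state, handle_sess_event) is hypothetical and not taken from libcldc; the cache counter is a stand-in, since the CLD cache does not exist yet. The handler returns the desired action instead of calling exit() itself, so the caller decides how to shut down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event set; these names are NOT part of libcldc. */
enum sess_event {
	SESS_EV_NO_MASTER,       /* "hunt" procedure ended without a master */
	SESS_EV_MASTER_FAILOVER, /* a different master took over */
};

/* What the application should do after handling an event. */
enum app_action {
	APP_EXIT,     /* no CLD session; simplest response is exit() */
	APP_CONTINUE, /* keep running against the new master */
};

/* Hypothetical application state. */
struct app_state {
	bool   have_session;
	size_t n_cached;     /* stand-in for a future CLD cache */
};

static enum app_action handle_sess_event(struct app_state *st,
					 enum sess_event ev)
{
	assert(st != NULL);

	switch (ev) {
	case SESS_EV_NO_MASTER:
		/* Covers both init failure and master failure. */
		st->have_session = false;
		st->n_cached = 0;
		return APP_EXIT;
	case SESS_EV_MASTER_FAILOVER:
		/* Session survives, but cached data may be stale: flush it. */
		st->n_cached = 0;
		return APP_CONTINUE;
	}
	return APP_EXIT; /* unknown event: treat as fatal */
}
```

A caller would invoke exit() when the handler returns APP_EXIT, and carry on otherwise.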


 * Put our host into the CLD record, and the port.
 * Use base directory instead of Cell.
 * Switch to asprintf for CLD filenames, Geo.

agreed
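
For reference, a minimal sketch of what the asprintf switch could look like when combined with the base-directory item. The function name cld_file_path and the "<base_dir>/<name>" layout are assumptions made for illustration, not what chunkd actually does. Note that asprintf() is a GNU/BSD extension, hence the _GNU_SOURCE define.

```c
#define _GNU_SOURCE   /* asprintf() is a GNU/BSD extension, not ISO C */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build "<base_dir>/<name>" in a freshly allocated string.
 * Trailing slashes on base_dir are dropped so the result has
 * exactly one separator.  The path layout is an assumption.
 * Returns NULL on allocation failure; the caller must free() the result.
 */
static char *cld_file_path(const char *base_dir, const char *name)
{
	char *path = NULL;
	size_t len;

	assert(base_dir != NULL && name != NULL);

	len = strlen(base_dir);
	while (len > 0 && base_dir[len - 1] == '/')
		len--;

	/* On failure the contents of path are undefined, so return NULL. */
	if (asprintf(&path, "%.*s/%s", (int)len, base_dir, name) < 0)
		return NULL;
	return path;
}
```

The gain over a fixed-size buffer plus snprintf is that there is no length limit to guess and no silent truncation; the cost is one free() per filename.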


So far we have managed to hack on the same codebase with relative ease.
Just make sure to post patches early.

You should read the GoogleFS paper referenced on the chunkd wiki page:

        http://labs.google.com/papers/gfs-sosp2003.pdf

It describes the purpose and use of a chunk server, in the context of distributed cloud storage.

I think we're at a point where we have our own base of knowledge
and have evolved an overall architecture to the point where we don't
have to ape every little detail of Google architecture.

Well, until the wiki has a description of the basic idea of a chunk server, the Google paper will have to do.

The point is not that we are aping Google, but more to describe the general concept to someone who does not know what a chunk server is, and how a chunk server fits into the "grand design."


In particular I'm
going to fight hard any talk of Chunk doing its own replication,
for now at least.

WRT chunkd and replication, yes, that's fine for version 1.0.

But consider which is more likely to have bandwidth to spare:

        a) client -> service
                or
        b) service -> service

Of the two, I'd say "a" is a bit more likely to be remote (WAN) and have a slow-upload situation like my home cable modem (1 Mbps down, 50 kbps up), and "b" is more likely to be LAN.

Or to take converse logic -- is it likely that service->service replication is SLOWER than client->service replication?

Every way I look at it, client->{service,service,service} replication seems both easy... and potentially slower than alternatives :)
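
To put rough numbers on that, here is a back-of-the-envelope sketch. The 50 kbps uplink is the cable modem figure above; the 1 MB object size, the 100 Mbps LAN, the three replicas, and the function names are all assumptions picked for illustration. It ignores latency and protocol overhead, and it assumes the servers forward copies one after another.

```c
#include <assert.h>

/* Seconds for the client to upload one full copy to each replica. */
static double client_fanout_secs(double obj_bits, double uplink_bps,
				 int replicas)
{
	assert(uplink_bps > 0 && replicas > 0);
	return replicas * obj_bits / uplink_bps;
}

/*
 * Seconds when the client uploads once and the service forwards the
 * remaining (replicas - 1) copies sequentially over the LAN.
 */
static double service_chain_secs(double obj_bits, double uplink_bps,
				 double lan_bps, int replicas)
{
	assert(uplink_bps > 0 && lan_bps > 0 && replicas > 0);
	return obj_bits / uplink_bps + (replicas - 1) * obj_bits / lan_bps;
}
```

With an 8,000,000-bit (1 MB) object and three replicas, the client-side fan-out takes about 480 seconds, while one upload plus LAN forwarding takes about 160 seconds. The uplink dominates either way, so every extra copy pushed by the client costs a full upload.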

        Jeff

