Re: HAIL volunteer Rick Peralta

Jeff Garzik Wed, 29 Jul 2009 10:17:54 -0700

Pete Zaitcev wrote:

On Wed, 29 Jul 2009 10:31:56 -0400, Jeff Garzik <[email protected]> wrote:

Is there someone taking the point for the chunkd development?

That's me, for the moment.  :)


I have a short todo list for Chunk, after which I don't have
any particular plans:
 * Exit if CLD registration fails (maybe!).

Hopefully all this is wrapped up into libcldc, such that an application needs to only worry about major, abstracted events after calling new-session:


* no master, after defined "hunt" procedure.

        This includes both init and master failure (as distinguished
        from fail-over).

        The application will need to be in the "no CLD session"
        state in both cases.

        And indeed, exit() might be the best way to do that.

* master fail-over

        Flush our [currently non-existent] CLD cache.

etc.
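
To make the two events above concrete, here is a minimal sketch of how an application might react to them. It is an illustration only and does not show the libcldc API: the names CldEvent, ChunkdState and handle_cld_event are hypothetical, and the CLD cache is modelled as a plain dict because, as noted above, no such cache exists yet.

```python
from enum import Enum, auto


class CldEvent(Enum):
    """Hypothetical high-level events a CLD client library could report."""
    NO_MASTER = auto()        # "hunt" procedure finished without finding a master
    MASTER_FAILOVER = auto()  # a new master took over the session


class ChunkdState:
    """Hypothetical application state; not taken from the chunkd sources."""

    def __init__(self):
        self.has_session = True
        self.cld_cache = {}       # stand-in for the not-yet-existing CLD cache
        self.should_exit = False


def handle_cld_event(state, event):
    """Apply the reaction described in the message for each event."""
    if event is CldEvent.NO_MASTER:
        # Both init failure and master failure end up here:
        # the application is in the "no CLD session" state.
        state.has_session = False
        # Exiting might be the simplest way to reach that state.
        state.should_exit = True
    elif event is CldEvent.MASTER_FAILOVER:
        # Session survives, but cached CLD data can no longer be trusted.
        state.cld_cache.clear()
    return state
```

With this shape, the application only ever sees two outcomes after opening a session: keep running with an empty cache, or give up and exit.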

 * Put our host into the CLD record, and the port.
 * Use base directory instead of Cell.
 * Switch to asprintf for CLD filenames, Geo.


agreed

So far we have managed to hack on the same codebase with relative ease.
Just make sure to post patches early.
You should read the GoogleFS paper referenced on the chunkd wiki page:
http://labs.google.com/papers/gfs-sosp2003.pdf
It describes the purpose and use of a chunk server, in the context of distributed cloud storage.
I think we're at a point where we have our own base of knowledge
and evolved an overall architecture to the point we don't have to
ape every little detail of Google architecture.

Well, until the wiki has a description of the basic idea of a chunk server, the Google paper will have to do.

The point is not that we are aping Google, but more to describe the general concept to someone who does not know what a chunk server is, and how a chunk server fits into the "grand design."

In particular I'm
going to fight hard any talk of Chunk doing its own replication,
for now at least.


WRT chunkd and replication, yes, that's fine for version 1.0.

But consider which is more likely to have bandwidth to spare:

        a) client -> service
                or
        b) service -> service

Of the two, I'd say "a" is a bit more likely to be remote (WAN) and have a slow-upload situation like my home cable modem (1 mbps down, 50 kbps up), and "b" is more likely to be LAN.

Or to take converse logic -- is it likely that service->service replication is SLOWER than client->service replication?

Every way I look at it, client->{service,service,service} replication seems both easy... and potentially slower than alternatives :)
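
A rough back-of-the-envelope sketch of the comparison above. Only the 50 kbps uplink comes from this message; the 1 MB object size, the 100 Mbps LAN link, the three copies, and the assumption that copies are sent one after another are all made up for illustration.

```python
def transfer_seconds(object_bytes, link_bits_per_s, copies):
    """Time to push `copies` full copies of an object over one link, in seconds."""
    return object_bytes * 8 * copies / link_bits_per_s


OBJECT_BYTES = 1_000_000     # assumed 1 MB object
CLIENT_UPLINK = 50_000       # 50 kbps upload, as in the cable modem example
LAN_LINK = 100_000_000       # assumed 100 Mbps link between services
REPLICAS = 3                 # assumed replica count

# a) client uploads every copy itself over the slow uplink: 480 s
client_fanout = transfer_seconds(OBJECT_BYTES, CLIENT_UPLINK, REPLICAS)

# b) client uploads once, the service forwards the other copies over the LAN:
#    roughly 160 s, dominated by the single client upload
service_side = (transfer_seconds(OBJECT_BYTES, CLIENT_UPLINK, 1)
                + transfer_seconds(OBJECT_BYTES, LAN_LINK, REPLICAS - 1))
```

Under these assumptions the client-side fan-out costs about three times as long, since the LAN hops add well under a second.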


        Jeff


--
To unsubscribe from this list: send the line "unsubscribe hail-devel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
