Re: HAIL volunteer Rick Peralta

Jeff Garzik Wed, 29 Jul 2009 12:14:53 -0700

Fabian Deutsch wrote:

On Wednesday, 29.07.2009, at 13:59 -0400, Jeff Garzik wrote:
Or to take converse logic -- is it likely that service->service
replication is SLOWER than client->service replication?
Every way I look at it, client->{service,service,service} replication
seems both easy... and potentially slower than alternatives :)
To elaborate a bit more... there obviously are cases where you want
the client to be the genesis of parallel data streams into the cloud.
My point was more that there are real world situations where multiple
outgoing streams from the client are significantly slower than a single
stream into the cloud, plus asking the cloud to perform further copies.
Yes, I agree that we should have just one stream per BLOB to one chunkd,
but we might attach replication destinations when streaming this blob to
that chunkd.
The result is that we have just one stream to a chunkd instance,
including some replication destinations, and chunkd will happily spread
the replicas.
So we are just keeping the logic of where to replicate to away from
chunkd, and leaving it to the client (which can ask a third daemon) to
decide where to store the replicas.

I think we all agree on keeping the logic of where to replicate to away
from chunkd.

chunkd should be as dumb^H^H^Hsimple as possible, to permit maximum
flexibility of chunkd-based applications.

chunkd-based applications will be the ones making chunk load balancing
decisions, for example.

dsts[] = logic->getDstsFor(blob)
chunk->put(blob, dsts) /* Will return after successful replication. */
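
As a rough illustration of the two pseudocode lines above, here is a
minimal Python sketch. It is not chunkd's actual API: the Chunkd class,
its put() signature, and get_dsts_for() are hypothetical in-memory
stand-ins, and the placement rule is an arbitrary choice made only so the
example runs. It is meant to show the split being discussed: placement
is decided outside chunkd, and chunkd only follows the destination list
it is handed.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Chunkd:
    """Hypothetical in-memory stand-in for one chunkd instance."""
    name: str
    store: Dict[str, bytes] = field(default_factory=dict)

    def put(self, key: str, blob: bytes, dsts: List["Chunkd"]) -> int:
        """Store the blob locally, then copy it to each destination.

        Returns the number of copies written, and only returns once
        every copy has been made.
        """
        self.store[key] = blob
        copies = 1
        for dst in dsts:
            # chunkd makes no placement decision; it only follows the list.
            copies += dst.put(key, blob, [])
        return copies


def get_dsts_for(key: str, nodes: List[Chunkd], replicas: int) -> List[Chunkd]:
    """Hypothetical placement logic that lives outside chunkd.

    Picks `replicas` consecutive nodes, starting at an index derived
    from the key. A real application would apply its own policy here.
    """
    start = sum(key.encode()) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


nodes = [Chunkd(f"chunkd{i}") for i in range(4)]
# The client streams once to the first node and hands it the other two.
primary, *dsts = get_dsts_for("blob-1", nodes, replicas=3)
copies = primary.put("blob-1", b"payload", dsts)
```

The client opens a single stream to the primary, and the remaining
copies are made between the stand-in chunkd objects.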

The local in-cloud replication strategy, such as chaining or parallel,
could be passed too, but might not be as relevant as the destinations
themselves.

The more I think about this, the more I think this will simply become a
configuration setting of the storage pool[1], i.e. inside tabled or
nfs4d configuration.

That would permit local administrators to make a decision whether
chaining (from the client!) or parallel should be used.
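
To make the chaining-versus-parallel choice concrete, here is a small
Python sketch of a pool-level setting selecting the replication
strategy. Everything in it is an assumption made for illustration: the
"replication" key, the strategy names, and the send() callback are
hypothetical and do not come from tabled, nfs4d, or chunkd. The sketch
only records which party sends to which; it does not model real
concurrency or network transfer.

```python
from typing import Callable, Dict, List, Tuple

# send(source, destination, blob): hypothetical transfer callback.
Send = Callable[[str, str, bytes], None]


def replicate_parallel(client: str, dsts: List[str], blob: bytes, send: Send) -> None:
    """The client itself sends one stream to every destination."""
    for dst in dsts:
        send(client, dst, blob)


def replicate_chain(client: str, dsts: List[str], blob: bytes, send: Send) -> None:
    """The client sends once; each destination forwards to the next."""
    src = client
    for dst in dsts:
        send(src, dst, blob)
        src = dst


STRATEGIES: Dict[str, Callable[[str, List[str], bytes, Send], None]] = {
    "parallel": replicate_parallel,
    "chain": replicate_chain,
}

# Hypothetical storage pool setting chosen by the local administrator.
pool_config = {"replication": "chain"}

log: List[Tuple[str, str]] = []


def record(src: str, dst: str, blob: bytes) -> None:
    """Stand-in for a real transfer: just note who sent to whom."""
    log.append((src, dst))


replicate = STRATEGIES[pool_config["replication"]]
replicate("client", ["chunkd0", "chunkd1", "chunkd2"], b"payload", record)
```

With "chain" the client appears as a sender only once, while with
"parallel" it would be the sender of every transfer.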


All of this, it must be noted, is long term discussion.

As of today, chunkd is "de facto" coded to be parallel-from-client
because that's the only method possible today :)


        Jeff

[1] Or perhaps the concept of a storage pool -- a collection of chunkd
instances shared by multiple applications -- will have its own
configuration. Another long term discussion for another day...





