Re: HAIL volunteer Rick Peralta

Rick Peralta Fri, 31 Jul 2009 09:37:53 -0700

Hi All,

Thanks for inviting me to the forum and thanks to you all for making things 
happen!


My father said, "Don't change anything unless you know why."  Those words ring 
in my ears more and more after decades of system development.  It is my 
intention and hope to respect the wisdom of those words and to be clear about 
what the objectives of any endeavor are (including sloth ;^).

The chunkd effort caught my eye for a variety of reasons.  It is functionally 
very much like something I advocated for a long time ago; it is a relatively 
simple yet powerful machine; and it may benefit from some redesign for 
performance (my personal specialty).

The question at hand is: What truly needs to be done?  Bugs are bugs and one 
can debate one solution over another, but in the end it's about getting things 
to work well.  Multithreading the transport layer is probably a good idea, but 
some attention should be paid to why.  There are any number of other open 
issues that also deserve attention.  Coding is fine, but understanding 
what and why seems to be the first step.

In order to have a common basis for evaluation, I'd like to suggest a standard 
platform as the context for discussions: the current implementation of chunkd, 
running on a standard server (probably with a 32-bit address space), with 
gigabit Ethernet and a single disk (good for about 25 MB/s and 15 ms seek 
time).  More or different bulk storage, 10 GbE, IB or other high-bandwidth 
interconnects, and so forth can be treated as branches from the core model.
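
To put rough numbers on the baseline platform above, here is a minimal 
back-of-envelope sketch in Python.  It is not a measurement and not part of 
chunkd.  It assumes every chunk read costs one full seek followed by a 
sequential transfer, and that the network delivers the raw gigabit line rate 
with no protocol overhead.  The function name and the sample chunk sizes are 
my own, chosen only for illustration.

```python
# Back-of-envelope model of the baseline platform (assumptions, not measurements).
DISK_MB_PER_S = 25.0      # sequential disk rate quoted for the baseline disk
SEEK_S = 0.015            # 15 ms seek time quoted for the baseline disk
NET_MB_PER_S = 1000 / 8   # gigabit Ethernet at raw line rate (assumed, no overhead)


def chunk_throughput_mb_s(chunk_mb: float) -> float:
    """Effective MB/s when each chunk costs one seek plus a sequential transfer."""
    disk_time = SEEK_S + chunk_mb / DISK_MB_PER_S   # seconds to fetch one chunk
    disk_rate = chunk_mb / disk_time                # MB/s the disk can sustain
    return min(disk_rate, NET_MB_PER_S)             # the slower of disk and wire


if __name__ == "__main__":
    # Hypothetical chunk sizes, small to large.
    for size_mb in (0.004, 0.064, 1.0, 16.0):
        rate = chunk_throughput_mb_s(size_mb)
        print(f"{size_mb:8.3f} MB chunk -> {rate:6.2f} MB/s")
```

Under these assumptions the disk is always the limit, since 25 MB/s is well 
below the roughly 125 MB/s of the wire.  The break-even chunk size is 
25 MB/s x 15 ms = 0.375 MB; below that, seek time dominates and throughput 
falls off quickly, which is one reason the typical chunk size matters.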

The current implementation of chunkd generally resides in user space, on top 
of a standard file system (complete with caches, overhead and whatever else 
comes along).

PZ>
I have a short to-do list for Chunk, after which I don't have
any particular plans:
 * Exit if CLD registration fails (maybe!).
 * Put ourhost into the CLD record, and the port.
 * Use base directory instead of Cell.
 * Switch to asprintf for CLD filenames, Geo.

FD>
Yes. I also think that chunkd should not do its own replication, as the
strategy may be domain/application dependent. Therefore I'd appreciate it if
chunkd would provide some kind of "copy(dst,sha)" function, to be able
to directly copy to another chunkd instance.

JG>
Hopefully all this is wrapped up into libcldc...

JG>
* total single-node volume size:  one cheap SATA hard drive
* total number of chunks:  ==
        total number of tabled objects / number of storage nodes
* distribution of chunk sizes:  dependent upon the application using tabled
* aggregate bandwidth:  dependent upon the application using tabled

fbp>
Might we put some numbers to this?
Most notable are the typical chunk size and the number of supported clients.

 - Rick Peralta
    www.linkedin.com/in/rickperalta

