Hi All,
Thanks for inviting me to the forum and thanks to you all for making things
happen!
My father said, "Don't change anything unless you know why." Those words ring
in my ears more and more after decades of system development. It is my
intention and hope to respect the wisdom of those words and to be clear about
what the objectives of any endeavor are (including sloth ;^).
The chunkd effort caught my eye for a variety of reasons. It is functionally
very much like something I advocated a long time ago, it is a relatively
simple yet powerful machine, and it may benefit from some redesign for
performance (my personal specialty).
The question at hand is: What truly needs to be done? Bugs are bugs and one
can debate one solution over another, but in the end it's about getting things
to work well. Multithreading the transport layer is probably a good idea, but
we should be diligent about establishing why. There are any number of other
open issues that also deserve some attention. Coding is fine, but
understanding what and why seems to be the first step.
In order to have a common basis for evaluation, I'd like to suggest a standard
platform to consider in the context of discussions: the current implementation
of chunkd, running on a standard server (probably with a 32-bit address space),
with gigabit Ethernet and a single disk (good for about 25 MB/s and 15 ms seek
time). More or different bulk storage, 10 GbE, IB or other high-bandwidth
implementations and so forth can be considered as branches from the core
model.
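As a rough illustration of why chunk size matters on this baseline, here is a
back-of-envelope sketch. It assumes one seek per chunk followed by a sequential
transfer, decimal megabytes, and a raw 1 Gbit/s line rate of 125 MB/s with no
protocol overhead; the function names and the sample chunk sizes are mine and
purely hypothetical, not measurements of chunkd.

```python
# Back-of-envelope model of the baseline platform (a sketch, not a benchmark).
DISK_MB_PER_S = 25.0   # sequential disk rate, from the baseline above
SEEK_S = 0.015         # 15 ms seek time, from the baseline above
NET_MB_PER_S = 125.0   # ASSUMPTION: raw gigabit line rate, no protocol overhead


def effective_mb_per_s(chunk_mb, seeks_per_chunk=1):
    """Throughput when each chunk costs some seeks plus a sequential transfer.

    ASSUMPTION: no caching and no overlap between seek and transfer.
    The result is capped by the network rate.
    """
    disk_time = seeks_per_chunk * SEEK_S + chunk_mb / DISK_MB_PER_S
    return min(chunk_mb / disk_time, NET_MB_PER_S)


def break_even_chunk_mb():
    """Chunk size at which seek time equals sequential transfer time."""
    return SEEK_S * DISK_MB_PER_S


if __name__ == "__main__":
    # Hypothetical chunk sizes, chosen only to show the trend.
    for size_kb in (4, 64, 1024, 16384):
        rate = effective_mb_per_s(size_kb / 1000.0)
        print(f"{size_kb:>6} kB chunk: {rate:6.2f} MB/s")
    print(f"break-even chunk size: {break_even_chunk_mb():.3f} MB")
```

Under these assumptions the break-even chunk is 0.375 MB, where the disk
delivers half its sequential rate (12.5 MB/s). Below that size the disk is
seek-bound, and the single disk, not the gigabit link, is the limit throughout.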
The current implementation of chunkd generally resides in user space, over a
standard file system (complete with caches, overhead and whatever else comes
along).
PZ>
I have a short todo list for Chunk, after which I don't have
any particular plans:
* Exit if CLD registration fails (maybe!).
* Put ourhost into the CLD record, and the port.
* Use base directory instead of Cell.
* Switch to asprintf for CLD filenames, Geo.
FD>
Yes. I also think that chunkd should not do its own replication, as the
strategy may be domain/application dependent. Therefore I'd appreciate it if
chunkd would provide some kind of "copy(dst,sha)" function, to be able
to directly copy to another chunkd instance.
JG>
Hopefully all this is wrapped up into libcldc...
JG>
* total single-node volume size: one cheap SATA hard drive
* total number of chunks: ==
total number of tabled objects / number of storage nodes
* distribution of chunk sizes: dependent upon the application using tabled
* aggregate bandwidth: dependent upon the application using tabled
fbp>
Might we put some numbers to this?
Most notable is typical chunk size and number of supported clients.
- Rick Peralta
www.linkedin.com/in/rickperalta