On Mon, Jan 31, 2011 at 6:34 PM, Sean Bigdatafun <[email protected]> wrote:
> On Tue, Jan 25, 2011 at 12:37 PM, Nathan Rutman <[email protected]> wrote:
>> - Commodity hardware requires data replication for safety. The HDFS
>> implementation has three penalties: storage redundancy, network loading,
>> and blocking writes. By default, HDFS blocks are replicated 3x: local,
>> "nearby", and "far away" to minimize the impact of data center
>> catastrophe. In addition to the obvious 3x cost for storage, the result
>> is that every data block must be written "far away" - exactly the
>> opposite of the "Move Computation to Data" mantra. Furthermore, these
>> over-network writes are synchronous; the client write blocks until all
>> copies are complete on disk, with the longest latency path of 2 network
>> hops plus a disk write gating the overall write speed. Note that while
>> this would be disastrous for a general-purpose filesystem, with true
>> WORM usage it may be acceptable to penalize writes this way.
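To put the quoted write path (longest path of 2 network hops plus a disk write) into rough numbers, here is a back-of-envelope sketch. The throughput and block-size figures are assumptions chosen for illustration, not measurements from this thread; only the 95µs RTT figure comes from the discussion below.

```python
# Back-of-envelope estimate of the pipelined HDFS write described above:
# client -> DN1 -> DN2 -> DN3, critical path = 2 network hops + a disk write.
# All figures are assumed commodity-hardware values for illustration.

BLOCK_MB = 64                 # classic HDFS block size
NET_MBPS = 117.0              # ~1GbE payload throughput, MB/s (assumed)
DISK_MBPS = 80.0              # commodity 7200rpm sequential write (assumed)
HOP_RTT_S = 95e-6             # per-hop round-trip latency (from the thread)

transfer_s = BLOCK_MB / NET_MBPS     # time to stream the block over one link
disk_s = BLOCK_MB / DISK_MBPS        # time to persist one copy to disk

# The pipeline streams packets, so the three copies largely overlap in time;
# the critical path is roughly one transfer + two hop latencies + one disk
# write, not three sequential full copies.
critical_path_s = transfer_s + 2 * HOP_RTT_S + disk_s

print(f"transfer: {transfer_s:.3f}s  disk: {disk_s:.3f}s  "
      f"critical path: {critical_path_s:.3f}s")
```

Under these assumptions the two extra network hops add under a millisecond to a write that already takes on the order of a second, which previews the point made below: the hop latency is noise next to the disk.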
If you have a well designed network infrastructure, using local disks is
not noticeably faster than using a remote disk. The performance penalty
of network hops is negligible compared to the performance penalty of
going to disk.

>> - There is no data caching. When delivering large contiguous streaming
>> data, this doesn't matter. But when the read load is random, seeky, or
>> partial, this is a missing high-impact performance feature.

The caching is provided by the OS. The Linux page cache is doing a much
better job at caching and pre-fetching than a Hadoop DataNode can ever
dream of.

>> - In a WORM model, changing a small part of a file requires all the
>> file data to be copied, so e.g. database record modifications would be
>> very expensive.
>>
> Yes, can anyone answer this question?

HBase doesn't modify files. It only streams edits to a log, and
sometimes writes full files at once.

>> - Moving computation is cheaper than moving data. But the data
>> nonetheless always has to be moved: either read off of a local hard
>> drive or read over the network into the compute node's memory. It is
>> not necessarily the case that reading a local hard drive is faster
>> than reading a distributed (striped) file over a fast network.
>> Commodity network (e.g. 1GigE), probably yes. But a fast (and
>> expensive) network (e.g. 4xDDR Infiniband) can deliver data
>> significantly faster than a local commodity hard drive.
>>
> I agree with this statement: "It is not necessarily the case that
> reading a local hard drive is faster than reading a distributed
> (striped) file over a fast network" - probably over Infiniband as well
> as a 10GigE network. And this is why I feel it might not be a good
> strategy for HBase to entirely tie its design to HDFS.

You don't need Infiniband or 10GigE. At StumbleUpon we have 1GbE
line-rate between every machine and the RTT latency across racks is
about 95µs. It's far, far more cost-effective than 10GbE or Infiniband.
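The append-only model behind "HBase doesn't modify files" can be sketched as a write-ahead-log pattern: every edit, including an overwrite, is a new log append, and complete immutable files are written in one pass when memory is flushed. This is a minimal illustration of the pattern, not HBase code; all class and method names are hypothetical.

```python
# Sketch of the append-only pattern described above: edits stream to a log,
# full files are written at once on flush. Names are hypothetical, not HBase's.
import json

class AppendOnlyStore:
    def __init__(self):
        self.log = []          # stands in for the write-ahead log on HDFS
        self.memstore = {}     # in-memory view of recent edits

    def put(self, key, value):
        # An "update" is just another log append; nothing is rewritten in place.
        self.log.append(json.dumps({"key": key, "value": value}))
        self.memstore[key] = value

    def flush(self):
        # Emit one complete, immutable snapshot in a single pass (WORM-friendly),
        # then start a fresh memstore.
        snapshot = sorted(self.memstore.items())
        self.memstore = {}
        return snapshot

store = AppendOnlyStore()
store.put("row1", "a")
store.put("row1", "b")   # overwrite = a second append, old entry untouched
print(len(store.log))    # 2 appends, zero in-place modifications
print(store.flush())     # [('row1', 'b')]
```

This is why the WORM write penalty quoted above doesn't bite: a record modification never triggers a copy of the whole file, only a small log append.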
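The claim that 1GbE is enough can be checked with rough arithmetic. The 95µs RTT is from this thread; the disk figures are assumed commodity-drive values for illustration.

```python
# Rough numbers behind "local disks are not noticeably faster than remote":
# compare a commodity disk against a clean 1GbE network. Disk figures are
# assumptions for illustration; the RTT is the one quoted in the thread.

GIGE_MBPS = 125.0        # 1GbE raw: 1000 Mbit/s / 8
DISK_SEQ_MBPS = 100.0    # commodity disk, sequential read (assumed)
DISK_SEEK_S = 8e-3       # average seek time (assumed)
NET_RTT_S = 95e-6        # cross-rack RTT quoted above

# Latency: a random read pays a seek; a remote read adds an RTT.
# The seek costs on the order of 80x more than the network round trip.
print(DISK_SEEK_S / NET_RTT_S)

# Throughput: local disk and 1GbE are in the same ballpark, so a remote
# sequential read is not "noticeably" slower on a well-provisioned network.
print(GIGE_MBPS / DISK_SEQ_MBPS)
```

Under these assumptions the disk, not the wire, dominates both the latency and the throughput budget, which is the cost argument for 1GbE over 10GbE or Infiniband.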
--
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com
