On Sat, Aug 28, 2010 at 1:38 PM, Ryan Rawson <[email protected]> wrote:
> One thought I had was if we have the Writable code, surely just
> putting a different transport around it wouldn't be THAT bad, right :-)
>
> Of course Writables are really tied to that DataInputStream or
> whatever, so we'd have to work on that. Benoit said something about
> Writables needing to do blocking reads and that causing issues, but
> there was a netty3 thing specifically designed to handle that by
> throwing and retrying the op later when there was more data.
>

The data transfer protocol actually doesn't do anything with Writables -
it's all hand-coded bytes going over the transport.

I have some code floating around somewhere for translating between
blocking IO and Netty - not sure where, though :) (Rough, untested
sketches of a few of the ideas in this thread are at the bottom of this
mail.)

-Todd

> On Sat, Aug 28, 2010 at 1:32 PM, Todd Lipcon <[email protected]> wrote:
> > On Sat, Aug 28, 2010 at 1:29 PM, Ryan Rawson <[email protected]> wrote:
> >
> >> a production server should be CPU bound, with memory caching etc. Our
> >> prod systems do see a reasonable load, and jstack always shows some
> >> kind of wait, generally...
> >>
> >> but we need more IO pushdown into HDFS. For example, if we are loading
> >> regions, why not do N at the same time? That figure N is probably more
> >> dependent on how many disks/node you have than anything else, really.
> >>
> >> For simple reads (e.g. hfile), would it really be that hard to
> >> retrofit some kind of async Netty-based API on top of the existing
> >> DFSClient logic?
> >
> > Would probably be a duplication rather than a retrofit, but it's
> > probably doable -- the protocol is pretty simple for reads, and
> > failure/retry is much less complicated compared to writes (though
> > still pretty complicated).
> >
> >> -ryan
> >>
> >> On Sat, Aug 28, 2010 at 1:11 PM, Todd Lipcon <[email protected]> wrote:
> >> > Depending on the workload, parallelism doesn't seem to matter much.
> >> > On my 8-core Nehalem test cluster with 12 disks per node, I'm always
> >> > network bound far before I'm CPU bound for most benchmarks, i.e.
> >> > jstacks show threads mostly waiting for IO to happen, not blocked on
> >> > locks.
> >> >
> >> > Is that not the case for your production boxes?
> >> >
> >> > On Sat, Aug 28, 2010 at 1:07 PM, Ryan Rawson <[email protected]> wrote:
> >> >
> >> >> bigtable was written for 1-core machines, with ~100 regions per
> >> >> box. Thanks to CMS, we generally can't run on < 4 cores, and at
> >> >> this point 16-core machines (with HTT) are becoming pretty
> >> >> standard.
> >> >>
> >> >> The question is, how do we leverage the ever-increasing sizes of
> >> >> machines and differentiate ourselves from bigtable? What did google
> >> >> do (if anything) to adapt to 16-core machines? We should be able to
> >> >> do quite a bit on a 20- or 40-node cluster.
> >> >>
> >> >> more thread parallelism?
> >> >
> >> > --
> >> > Todd Lipcon
> >> > Software Engineer, Cloudera
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera

--
Todd Lipcon
Software Engineer, Cloudera
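
P.S. The netty3 thing mentioned above is presumably ReplayingDecoder. A
minimal, untested sketch of the "throw and retry" trick -- the decode
logic is written as if all the bytes were already buffered, and the
framework re-runs it when more data shows up. The length-prefixed record
format here is made up for illustration:

import org.jboss.netty.buffer.ChannelBuffer;
import org.jboss.netty.channel.Channel;
import org.jboss.netty.channel.ChannelHandlerContext;
import org.jboss.netty.handler.codec.replay.ReplayingDecoder;
import org.jboss.netty.handler.codec.replay.VoidEnum;

// If a read runs past the bytes received so far, ReplayingDecoder throws
// internally and re-invokes decode() once more data arrives, so the code
// below can pretend its reads are blocking.
public class RecordDecoder extends ReplayingDecoder<VoidEnum> {
  @Override
  protected Object decode(ChannelHandlerContext ctx, Channel channel,
                          ChannelBuffer buffer, VoidEnum state) {
    int length = buffer.readInt();      // "replays" if < 4 bytes buffered
    byte[] payload = new byte[length];
    buffer.readBytes(payload);          // "replays" if the body isn't in yet
    return payload;                     // handed to the next handler upstream
  }
}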
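
For the "load N regions at the same time" idea, the simplest version is
just a bounded pool sized by spindles rather than cores. Region and
open() below are hypothetical stand-ins, not real HBase classes:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelRegionOpener {
  // Stand-in for whatever actually opens a region off HDFS.
  public interface Region { void open(); }

  public static void openAll(List<Region> regions, int disksPerNode)
      throws InterruptedException {
    // One open in flight per disk as a starting point; tune from there.
    ExecutorService pool = Executors.newFixedThreadPool(disksPerNode);
    for (final Region r : regions) {
      pool.execute(new Runnable() {
        public void run() {
          r.open();  // the blocking HDFS reads happen in here
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
  }
}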
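
And if the read path does get duplicated rather than retrofitted, the
client-facing surface might look something like this -- names and shape
are purely illustrative, nothing here exists in DFSClient today:

// A possible async, Netty-backed block reader. The callback would fire
// on the IO thread as packets arrive off the wire.
public interface AsyncBlockReader {
  void read(long blockId, long offset, int length, ReadCallback cb);

  public interface ReadCallback {
    void onData(byte[] chunk);   // may be invoked multiple times
    void onComplete();           // all requested bytes delivered
    void onError(Throwable t);   // connection/checksum/retry failure
  }
}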
