I don't think async is a magic bullet for its own sake; we've all seen the papers showing that blocking implementations can perform well. In particular, I don't think async buys a whole lot on the client side of a service, which is what HBase is to HDFS.
What about an HDFS call, localize(Path), that attempts to replicate the blocks for a file to the local datanode (if any) in a background thread? If RegionServers called that for their files every so often, you'd eliminate a lot of the bandwidth constraints, although the latency of establishing a local socket for every read would still be there. (Rough sketch at the bottom of this mail.)

On Sat, Aug 28, 2010 at 4:42 PM, Todd Lipcon <[email protected]> wrote:
> On Sat, Aug 28, 2010 at 1:38 PM, Ryan Rawson <[email protected]> wrote:
>
>> One thought I had was if we have the writable code, surely just
>> putting a different transport around it wouldn't be THAT bad right :-)
>>
>> Of course writables are really tied to that DataInputStream or
>> whatever, so we'd have to work on that. Benoit said something about
>> writables needing to do blocking reads and that causing issues, but
>> there was a netty3 thing specifically designed to handle that by
>> throwing and retrying the op later when there was more data.
>>
>
> The data transfer protocol actually doesn't do anything with Writables -
> it's all hand coded bytes going over the transport.
>
> I have some code floating around somewhere for translating between blocking
> IO and Netty - not sure where, though :)
>
> -Todd
>
>> On Sat, Aug 28, 2010 at 1:32 PM, Todd Lipcon <[email protected]> wrote:
>> > On Sat, Aug 28, 2010 at 1:29 PM, Ryan Rawson <[email protected]> wrote:
>> >
>> >> a production server should be CPU bound, with memory caching etc. Our
>> >> prod systems do see a reasonable load, and jstack always shows some
>> >> kind of wait generally...
>> >>
>> >> but we need more IO pushdown into HDFS. For example if we are loading
>> >> regions, why not do N at the same time? That figure N is probably
>> >> more dependent on how many disks/node you have than anything else
>> >> really.
>> >>
>> >> For simple reads (eg: hfile) would it really be that hard to retrofit
>> >> some kind of async netty based API on top of the existing DFSClient
>> >> logic?
>> >
>> > Would probably be a duplication rather than a retrofit, but it's probably
>> > doable -- the protocol is pretty simple for reads, and failure/retry is
>> > much less complicated compared to writes (though still pretty complicated)
>> >
>> >> -ryan
>> >>
>> >> On Sat, Aug 28, 2010 at 1:11 PM, Todd Lipcon <[email protected]> wrote:
>> >> > Depending on the workload, parallelism doesn't seem to matter much. On
>> >> > my 8-core Nehalem test cluster with 12 disks each, I'm always network
>> >> > bound far before I'm CPU bound for most benchmarks. ie jstacks show
>> >> > threads mostly waiting for IO to happen, not blocked on locks.
>> >> >
>> >> > Is that not the case for your production boxes?
>> >> >
>> >> > On Sat, Aug 28, 2010 at 1:07 PM, Ryan Rawson <[email protected]> wrote:
>> >> >
>> >> >> bigtable was written for 1 core machines, with ~ 100 regions per box.
>> >> >> Thanks to CMS we generally can't run on < 4 cores, and at this point
>> >> >> 16 core machines (with HTT) is becoming pretty standard.
>> >> >>
>> >> >> The question is, how do we leverage the ever-increasing sizes of
>> >> >> machines and differentiate ourselves from bigtable? What did google
>> >> >> do (if anything) to adapt to the 16 core machines? We should be able
>> >> >> to do quite a bit on a 20 or 40 node cluster.
>> >> >>
>> >> >> more thread parallelism?
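
P.S. Since no localize() call exists in HDFS today, here's a rough sketch of what the RegionServer side could look like -- every name in it (LocalizableFileSystem, localize, LocalizerChore) is made up for illustration:

import java.io.IOException;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.fs.Path;

// Hypothetical API -- this is the proposal above, not an existing HDFS call.
interface LocalizableFileSystem {
  // Ask the NameNode to re-replicate the file's blocks onto the datanode
  // co-located with this client, in a background thread on the cluster side.
  void localize(Path path) throws IOException;
}

// Periodic chore a RegionServer could run over its store files.
class LocalizerChore implements Runnable {
  private final LocalizableFileSystem fs;
  private final List<Path> storeFiles;

  LocalizerChore(LocalizableFileSystem fs, List<Path> storeFiles) {
    this.fs = fs;
    this.storeFiles = storeFiles;
  }

  public void run() {
    for (Path p : storeFiles) {
      try {
        fs.localize(p); // ideally a no-op when a local replica already exists
      } catch (IOException e) {
        // best effort: skip this file and retry on the next pass
      }
    }
  }

  static void schedule(LocalizerChore chore) {
    ScheduledExecutorService exec =
        Executors.newSingleThreadScheduledExecutor();
    exec.scheduleWithFixedDelay(chore, 10, 10, TimeUnit.MINUTES);
  }
}

The call would want to be idempotent and best-effort so that hammering it every few minutes stays cheap.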
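
Also, the netty3 thing mentioned up-thread for blocking-style reads is, I'm fairly sure, ReplayingDecoder: you write decode() as if every byte had already arrived, and when the buffer runs dry it throws internally, rewinds, and replays the call once more data shows up. A minimal length-prefixed-frame decoder:

import org.jboss.netty.buffer.ChannelBuffer;
import org.jboss.netty.channel.Channel;
import org.jboss.netty.channel.ChannelHandlerContext;
import org.jboss.netty.handler.codec.replay.ReplayingDecoder;
import org.jboss.netty.handler.codec.replay.VoidEnum;

public class LengthPrefixedDecoder extends ReplayingDecoder<VoidEnum> {
  @Override
  protected Object decode(ChannelHandlerContext ctx, Channel channel,
                          ChannelBuffer buffer, VoidEnum state) {
    // Looks like a blocking read, but if fewer than 4 bytes are buffered,
    // ReplayingDecoder throws internally, rewinds the buffer, and calls
    // decode() again once more data arrives.
    int length = buffer.readInt();
    return buffer.readBytes(length); // the decoded frame
  }
}

The catch is that decode() may run several times for a single frame, so it has to stay side-effect free until it returns (or use checkpoint() to save progress).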

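And on loading N regions at once: the natural shape is a fixed pool sized by spindles rather than cores, since opening regions is IO bound. Sketch only -- openRegion() stands in for the real open path, and numDisks would come from config:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ParallelRegionOpener {
  // Open all assigned regions with up to numDisks opens in flight at once.
  void openAll(List<String> regions, int numDisks) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(numDisks);
    for (final String region : regions) {
      pool.submit(new Runnable() {
        public void run() {
          openRegion(region); // placeholder for the actual open logic
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
  }

  private void openRegion(String region) {
    // read hfiles, replay logs, etc. -- the IO-bound part
  }
}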