One performance problem right now is our inability to push IO down into the kernel; this is where async APIs help. A full read in HBase might require reading 10+ files before ever returning a single row. Doing those reads in parallel would be nice, but spawning 10+ threads isn't really a good idea.
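To make that concrete, here's a minimal sketch (plain local-file Java, not HDFS code) of what issuing 10 reads in parallel looks like with an asynchronous channel instead of 10 blocked threads. The paths are made up, and the JDK's AsynchronousFileChannel may service these from an internal pool on some platforms, so treat it as an illustration of the programming model rather than guaranteed kernel AIO:

import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Future;

public class ParallelReadSketch {
  public static void main(String[] args) throws Exception {
    List<Path> files = new ArrayList<Path>();
    for (int i = 0; i < 10; i++) {
      files.add(Paths.get("/data/region/cf/hfile-" + i)); // hypothetical store file paths
    }

    List<AsynchronousFileChannel> channels = new ArrayList<AsynchronousFileChannel>();
    List<Future<Integer>> pending = new ArrayList<Future<Integer>>();

    // Issue all reads up front; none of them parks an application thread.
    for (Path p : files) {
      AsynchronousFileChannel ch = AsynchronousFileChannel.open(p, StandardOpenOption.READ);
      channels.add(ch);
      pending.add(ch.read(ByteBuffer.allocate(64 * 1024), 0L)); // first 64KB of each file
    }

    // All reads are already in flight, so waiting on each Future just collects
    // results; the waiting costs one caller thread total, not one per file.
    for (int i = 0; i < pending.size(); i++) {
      System.out.println(files.get(i) + ": " + pending.get(i).get() + " bytes");
      channels.get(i).close();
    }
  }
}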
Right now Hadoop scales by adding processes; we just don't have that option.

On Saturday, August 28, 2010, Todd Lipcon <[email protected]> wrote:
> Agreed, I think we'll get more bang for our buck by finishing up (reviving)
> patches like HDFS-941 or HDFS-347. Unfortunately performance doesn't seem to
> be the highest priority among our customers, so it's tough to find much time
> to work on these things until we really get stability up to par.
>
> -Todd
>
> On Sat, Aug 28, 2010 at 3:36 PM, Jay Booth <[email protected]> wrote:
>
>> I don't think async is a magic bullet for its own sake; we've all
>> seen those papers that show good performance from blocking
>> implementations. In particular, I don't think async is worth a whole
>> lot on the client side of a service, which HBase is to HDFS.
>>
>> What about an HDFS call for localize(Path) which attempts to replicate
>> the blocks for a file to the local datanode (if any) in a background
>> thread? If RegionServers called that function for their files every
>> so often, you'd eliminate a lot of bandwidth constraints, although the
>> latency of establishing a local socket for every read is still there.
>>
>> On Sat, Aug 28, 2010 at 4:42 PM, Todd Lipcon <[email protected]> wrote:
>> > On Sat, Aug 28, 2010 at 1:38 PM, Ryan Rawson <[email protected]> wrote:
>> >
>> >> One thought I had was that if we have the Writable code, surely just
>> >> putting a different transport around it wouldn't be THAT bad, right :-)
>> >>
>> >> Of course Writables are really tied to that DataInputStream or
>> >> whatever, so we'd have to work on that. Benoit said something about
>> >> Writables needing to do blocking reads and that causing issues, but
>> >> there was a Netty 3 feature specifically designed to handle that by
>> >> throwing and retrying the op later when there was more data.
>> >>
>> > The data transfer protocol actually doesn't do anything with Writables -
>> > it's all hand-coded bytes going over the transport.
>> >
>> > I have some code floating around somewhere for translating between blocking
>> > IO and Netty - not sure where, though :)
>> >
>> > -Todd
>> >
>> >> On Sat, Aug 28, 2010 at 1:32 PM, Todd Lipcon <[email protected]> wrote:
>> >> > On Sat, Aug 28, 2010 at 1:29 PM, Ryan Rawson <[email protected]> wrote:
>> >> >
>> >> >> A production server should be CPU bound, with memory caching etc. Our
>> >> >> prod systems do see a reasonable load, and jstack always shows some
>> >> >> kind of wait generally...
>> >> >>
>> >> >> But we need more IO pushdown into HDFS. For example, if we are loading
>> >> >> regions, why not do N at the same time? That figure N is probably
>> >> >> more dependent on how many disks/node you have than anything else
>> >> >> really.
>> >> >>
>> >> >> For simple reads (eg: HFile) would it really be that hard to retrofit
>> >> >> some kind of async Netty-based API on top of the existing DFSClient
>> >> >> logic?
>> >> >>
>> >> > Would probably be a duplication rather than a retrofit, but it's probably
>> >> > doable -- the protocol is pretty simple for reads, and failure/retry is much
>> >> > less complicated compared to writes (though still pretty complicated).
>> >> >
>> >> >> -ryan
>> >> >>
>> >> >> On Sat, Aug 28, 2010 at 1:11 PM, Todd Lipcon <[email protected]> wrote:
>> >> >> > Depending on the workload, parallelism doesn't seem to matter much. On my
>> >> >> > 8-core Nehalem test cluster with 12 disks each, I'm always network bound far
>> >> >> > before I'm CPU bound for most benchmarks, i.e. jstacks show threads mostly
>> >> >> > waiting for IO to happen, not blocked on locks.
>> >> >> >
>> >> >> > Is that not the case for your production boxes?
>> >> >> >
>> >> >> > On Sat, Aug 28, 2010 at 1:07 PM, Ryan Rawson <
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
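To sketch what the "async Netty-based API on top of the existing DFSClient logic" idea from the thread above might look like as an interface - nothing like this exists today, and every name here (AsyncBlockReader, ReadCallback) is hypothetical:

import java.nio.ByteBuffer;

/** Callback-style read API, in the spirit of the Netty idea discussed above. */
interface ReadCallback {
  void onData(ByteBuffer chunk);   // a packet of block data arrived
  void onComplete();               // the requested range has been fully read
  void onError(Throwable cause);   // connection/checksum failure; caller may retry another replica
}

interface AsyncBlockReader {
  /**
   * Ask for `length` bytes of a file starting at `offset`. Returns
   * immediately; data is delivered on an IO thread via the callback,
   * so many reads can be outstanding without one thread per read.
   */
  void read(String path, long offset, long length, ReadCallback cb);
}

With something shaped like this, a RegionServer could keep N reads outstanding per store file while the IO threads do the socket work - the same "do N at the same time" pushdown discussed in the thread.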
