I think my general point is we could hack up the hbase source, add
refcounting, circumvent the gc, etc., or we could demand more from the dfs.

If a variant of HDFS-347 were committed, reads could come from the Linux
buffer cache and life would be good.

The choice isn't simply fast hbase vs. slow hbase; there are elements of
bugs there as well.
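
For anyone skimming, here is a rough sketch of the two approaches discussed in the quoted thread below: copying block bytes into a direct byte buffer, and mmap'ing a local file read-only. It uses only plain java.nio and is not HBase or HDFS code. The class name OffHeapSketch, both method names, and the temp file standing in for a block are made up for illustration. Note that the mmap path does no checksum verification, which is one of the concerns raised in the thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; none of these names exist in HBase or HDFS.
public class OffHeapSketch {

    // Approach 1: copy the block into a direct buffer allocated outside the Java heap.
    static ByteBuffer copyToDirect(byte[] block) {
        ByteBuffer buf = ByteBuffer.allocateDirect(block.length);
        buf.put(block);
        buf.flip(); // switch from writing to reading
        return buf;
    }

    // Approach 2: map a local file read-only, so reads are served from the OS page cache.
    // No checksums are verified here, and there is no fallback to another replica.
    static MappedByteBuffer mapReadOnly(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] block = {1, 2, 3, 4};

        // A temp file stands in for a local block file (an assumption for this sketch).
        Path tmp = Files.createTempFile("block", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, block);

        ByteBuffer direct = copyToDirect(block);
        MappedByteBuffer mapped = mapReadOnly(tmp);

        System.out.println("direct: isDirect=" + direct.isDirect()
                + " remaining=" + direct.remaining());
        System.out.println("mapped: remaining=" + mapped.remaining()
                + " byte[2]=" + mapped.get(2));
    }
}
```

Either way the bytes stay off the Java heap; the difference is who owns the memory (the JVM's direct allocation vs. the kernel's page cache) and what happens when the underlying disk goes bad.
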
On Jul 9, 2011 12:25 PM, "M. C. Srivas" <mcsri...@gmail.com> wrote:
> On Fri, Jul 8, 2011 at 6:47 PM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:
>
>> There are a couple of things here, one is direct byte buffers to put the
>> blocks outside of heap, the other is MMap'ing the blocks directly from
>> the underlying HDFS file.
>
>
>> I think they both make sense. And I'm not sure MapR's solution will
>> be that much better if the latter is implemented in HBase.
>>
>
> There're some major issues with mmap'ing the local hdfs file (the "block")
> directly:
> (a) no checksums to detect data corruption from bad disks
> (b) when a disk does fail, the dfs could start reading from an alternate
> replica ... but that option is lost when mmap'ing and the RS will crash
> immediately
> (c) security is completely lost, but that is minor given hbase's current
> status
>
> For those hbase deployments that don't care about the absence of (a) and
> (b), especially (b), it's definitely a viable option that gives good perf.
>
> At MapR, we did consider similar direct-access capability and rejected it
> due to the above concerns.
>
>
>
>>
>> On Fri, Jul 8, 2011 at 6:26 PM, Ryan Rawson <ryano...@gmail.com> wrote:
>> > The overhead in a byte buffer is the extra integers to keep track of the
>> > mark, position, limit.
>> >
>> > I am not sure that putting the block cache in to heap is the way to go.
>> > Getting faster local dfs reads is important, and if you run hbase on top of
>> > Mapr, these things are taken care of for you.
>> > On Jul 8, 2011 6:20 PM, "Jason Rutherglen" <jason.rutherg...@gmail.com>
>> > wrote:
>> >> Also, it's for a good cause, moving the blocks out of main heap using
>> >> direct byte buffers or some other more native-like facility (if DBB's
>> >> don't work).
>> >>
>> >> On Fri, Jul 8, 2011 at 5:34 PM, Ryan Rawson <ryano...@gmail.com> wrote:
>> >>> Where? Everywhere? An array is 24 bytes, bb is 56 bytes. Also the API
>> >>> is...annoying.
>> >>> On Jul 8, 2011 4:51 PM, "Jason Rutherglen" <jason.rutherg...@gmail.com> wrote:
>> >>>> Is there an open issue for this? How hard will this be? :)
>> >>>
>> >
>>
