Can you pastebin the output of the lsr command on the table's dir?

Thx

J-D

On Tue, Nov 9, 2010 at 10:54 PM, Hari Sreekumar
<[email protected]> wrote:
> I checked the "browse filesystem" link in the web interface (50070). HBase
> creates a directory named after the table, and in that directory there are
> files which are 5-6 MB in size, on average. Some are only a few KB, and there
> are some of 12-13 MB, but most are around 6 MB. I was thinking these files
> are stored in 64 MB blocks, leading to the space usage.
>
> hari
>
> On Wed, Nov 10, 2010 at 11:56 AM, Jean-Daniel Cryans
> <[email protected]> wrote:
>
>> I'm pretty sure that's not how it's reported by the "du" command, but
>> I wouldn't expect to see files of 5MB on average. Can you be more
>> specific?
>>
>> J-D
>>
>> On Tue, Nov 9, 2010 at 9:58 PM, Hari Sreekumar <[email protected]>
>> wrote:
>> > Ah, so the bloat is not because of the files being 5-6 MB in size?
>> > Wouldn't a 6 MB file occupy 64 MB if I set the block size to 64 MB?
>> >
>> > hari
>> >
>> > On Wed, Nov 10, 2010 at 11:16 AM, Jean-Daniel Cryans
>> > <[email protected]> wrote:
>> >
>> >> Each value is stored with its full key, e.g. row key + family +
>> >> qualifier + timestamp + offsets. You don't give any information
>> >> regarding how you stored the data, but if you have large enough keys
>> >> then that could easily explain the bloat.
>> >>
>> >> J-D
>> >>
>> >> On Tue, Nov 9, 2010 at 9:21 PM, Hari Sreekumar
>> >> <[email protected]> wrote:
>> >> > Hi,
>> >> >
>> >> > Data seems to be taking up too much space when I put it into HBase.
>> >> > E.g., I have a 2 GB text file which seems to take up ~70 GB when I
>> >> > load it into HBase. I have the block size set to 64 MB and
>> >> > replication=3, which I think is the possible reason for this
>> >> > expansion. But if that is the case, how can I prevent it? Decreasing
>> >> > the block size will have a negative impact on performance, so is
>> >> > there a way I can increase the average size of the HBase-created
>> >> > files to be comparable to 64 MB? Right now they are ~5 MB on
>> >> > average. Or is something entirely different at work here?
>> >> >
>> >> > thanks,
>> >> > hari
>> >> >
>> >>
>> >
>>
>
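
On the block-size question raised in the thread: an HDFS file smaller than
the block size only uses its actual length on disk, per replica; the block is
not padded out to 64 MB. A rough sketch of that arithmetic follows. The
function name hdfs_usage and its defaults are made up for illustration, and
the numbers are just the ones mentioned in the thread (6 MB files, 64 MB
blocks, replication=3).

```python
import math

MB = 1024 * 1024

def hdfs_usage(file_size, block_size=64 * MB, replication=3):
    """Return (number of blocks, raw bytes used across all replicas).

    Hypothetical helper: it assumes a block holds only the bytes actually
    written to it, so a small file is not padded to the full block size.
    """
    blocks = math.ceil(file_size / block_size) if file_size else 0
    raw_bytes = file_size * replication
    return blocks, raw_bytes

# A 6 MB store file: one block, 18 MB of raw disk with three replicas.
small_blocks, small_raw = hdfs_usage(6 * MB)

# The original 2 GB of input, if it were stored as-is: 32 blocks, 6 GB raw.
big_blocks, big_raw = hdfs_usage(2 * 1024 * MB)
```

Under these assumptions replication alone accounts for a factor of 3, not
for the ~35x reported at the start of the thread.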

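J-D's point that each value is stored with its full key can be made concrete
with a rough per-cell estimate. The sketch below assumes the classic KeyValue
layout (4-byte key length, 4-byte value length, 2-byte row length, 1-byte
family length, 8-byte timestamp, 1-byte key type) and ignores compression,
indexes and other file-level overhead. The row key, family and qualifier are
invented for the example; the real schema is not given in the thread.

```python
# Fixed bytes per cell in the assumed KeyValue layout:
# key length (4) + value length (4) + row length (2) + family length (1)
# + timestamp (8) + key type (1)
KEYVALUE_FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1

def keyvalue_size(row, family, qualifier, value):
    """Approximate serialized size in bytes of one cell."""
    return (KEYVALUE_FIXED_OVERHEAD
            + len(row) + len(family) + len(qualifier) + len(value))

def bloat_factor(row, family, qualifier, value):
    """Stored bytes divided by the bytes of the value alone."""
    return keyvalue_size(row, family, qualifier, value) / len(value)

# Hypothetical cell: a short value behind a long key.
row = b"customer-0000012345"   # 19 bytes
family = b"details"            # 7 bytes
qualifier = b"first_name"      # 10 bytes
value = b"Hari"                # 4 bytes

size = keyvalue_size(row, family, qualifier, value)    # 20 + 19 + 7 + 10 + 4
factor = bloat_factor(row, family, qualifier, value)
```

With short values and long row keys, family names or qualifiers, the key
bytes dominate, and that per-cell factor is then multiplied by the
replication factor on disk.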