Can you pastebin the output of the lsr command on the table's dir? Thx
J-D

On Tue, Nov 9, 2010 at 10:54 PM, Hari Sreekumar <[email protected]> wrote:
> I checked the "browse filesystem" link in the web interface (50070). HBase
> creates a directory named after the table, and in the directory, there are
> files which are 5-6 MB in size, on average. Some are in KBs, and there are
> some of 12-13 MB size, but most are around 6 MB. I was thinking these files
> are stored in 64 MB blocks, leading to the space usage.
>
> hari
>
> On Wed, Nov 10, 2010 at 11:56 AM, Jean-Daniel Cryans
> <[email protected]> wrote:
>
>> I'm pretty sure that's not how it's reported by the "du" command, but
>> I wouldn't expect to see files of 5 MB on average. Can you be more
>> specific?
>>
>> J-D
>>
>> On Tue, Nov 9, 2010 at 9:58 PM, Hari Sreekumar <[email protected]>
>> wrote:
>> > Ah, so the bloat is not because of the files being 5-6 MB in size?
>> > Wouldn't a 6 MB file occupy 64 MB if I set block size as 64 MB?
>> >
>> > hari
>> >
>> > On Wed, Nov 10, 2010 at 11:16 AM, Jean-Daniel Cryans
>> > <[email protected]> wrote:
>> >
>> >> Each value is stored with its full key, e.g. row key + family +
>> >> qualifier + timestamp + offsets. You don't give any information
>> >> regarding how you stored the data, but if you have large enough keys
>> >> then it should easily explain the bloat.
>> >>
>> >> J-D
>> >>
>> >> On Tue, Nov 9, 2010 at 9:21 PM, Hari Sreekumar <[email protected]>
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > Data seems to be taking up too much space when I put it into HBase.
>> >> > E.g., I have a 2 GB text file which seems to be taking up ~70 GB when
>> >> > I dump it into HBase. I have block size set to 64 MB and replication=3,
>> >> > which I think is the possible reason for this expansion. But if that
>> >> > is the case, how can I prevent it? Decreasing the block size will have
>> >> > a negative impact on performance, so is there a way I can increase the
>> >> > average size of HBase-created files to be comparable to 64 MB? Right
>> >> > now they are ~5 MB on average. Or is this an entirely different thing
>> >> > at work here?
>> >> >
>> >> > thanks,
>> >> > hari
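
For a rough feel of how "each value is stored with its full key" can add up, here is a minimal Python sketch. It assumes the classic KeyValue layout (4-byte key length, 4-byte value length, 2-byte row length, 1-byte family length, 8-byte timestamp, 1-byte type) and multiplies by the HDFS replication factor. The row key, family, and qualifier below are made-up examples, not taken from Hari's schema, and the estimate ignores HFile indexes, compression, the write-ahead log, and store files that have not been compacted yet.

```python
# Fixed bytes per cell in the assumed KeyValue layout:
# key length (4) + value length (4) + row length (2)
# + family length (1) + timestamp (8) + key type (1)
FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def keyvalue_size(row: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Approximate on-disk size of one uncompressed cell, in bytes."""
    return FIXED_OVERHEAD + len(row) + len(family) + len(qualifier) + len(value)


def expansion_factor(cells, replication: int = 3) -> float:
    """Ratio of replicated stored bytes to raw value bytes.

    `cells` is an iterable of (row, family, qualifier, value) byte strings.
    """
    cells = list(cells)
    raw = sum(len(value) for _, _, _, value in cells)
    stored = sum(keyvalue_size(*cell) for cell in cells)
    return stored * replication / raw


if __name__ == "__main__":
    # Hypothetical cell: 40-byte row key, long-ish names, 5-byte value.
    cell = (b"r" * 40, b"details", b"customer_name", b"hello")
    print(keyvalue_size(*cell))      # bytes for this one cell
    print(expansion_factor([cell]))  # blow-up including replication=3
```

With short values and long row keys or column names, the per-cell key bytes dominate, so shortening the family and qualifier names is one way to test whether the keys explain the growth.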
