Thanks a lot for the quick test, Harsh. This will certainly help me. I'll see if I am missing something in my comparison of HDFS usage between HBase 0.90 and HBase 0.92.
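
For reference, here is roughly how I plan to repeat your test on my side before re-measuring. This is an untested sketch against a defaults-only 'test' table with family 'col1' as in your run; the row-key format and the 'q' qualifier are placeholders I picked:

import java.util.Random;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class HFileSizeTest {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();

    // Load 10k rows of 100 KB random bytes each, as in Harsh's test.
    // Assumes the table already exists: create 'test', 'col1'
    HTable table = new HTable(conf, "test");
    Random rnd = new Random();
    byte[] value = new byte[100 * 1024];
    for (int i = 0; i < 10000; i++) {
      rnd.nextBytes(value);
      Put put = new Put(Bytes.toBytes(String.format("row%05d", i)));
      put.add(Bytes.toBytes("col1"), Bytes.toBytes("q"), value); // 'q' is a made-up qualifier
      table.put(put);
    }
    table.close();

    // Flush and major-compact so only settled store files get measured.
    HBaseAdmin admin = new HBaseAdmin(conf);
    admin.flush("test");
    admin.majorCompact("test");
    // Note: majorCompact() only triggers the compaction; wait for the
    // write I/O to drop off (as Harsh did) before measuring.

    // Measure the table directory; assumes fs.default.name points at the
    // cluster's HDFS. This should report the same number as
    // `hadoop fs -dus /hbase/test`.
    FileSystem fs = FileSystem.get(conf);
    long bytes = fs.getContentSummary(new Path("/hbase/test")).getLength();
    System.out.println("/hbase/test occupies " + bytes + " bytes");
  }
}

If my numbers still differ after this, then the gap is in my data, not in the file format.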
Thanks Again,
Anil

On Tue, Aug 14, 2012 at 2:42 PM, Harsh J <[email protected]> wrote:

> Not wanting to have this thread too end up as a mystery result on the
> web, I did some tests. I loaded 10k rows (of 100 KB random chars each)
> into test tables on both 0.90 and 0.92, flushed them, major_compact'ed
> them (waited for completion and a drop in I/O write activity), and then
> measured them to find this:
>
> 0.92 takes a total of 1049661190 bytes under its /hbase/test directory.
> 0.90 takes a total of 1049467570 bytes under its /hbase/test directory.
>
> So… not much of a difference. It is still your data that counts. I
> believe what Anil may have had were merely additional, un-compacted
> stores?
>
> P.S. Note that my 'test' tables were all defaults. That is, merely
> "create 'test', 'col1'", nothing else, so block index entries must have
> been created for every row, as the block size is 64 KB by default while
> my rows are all 100 KB each.
>
> On Wed, Aug 15, 2012 at 2:25 AM, anil gupta <[email protected]> wrote:
> > Hi Kevin,
> >
> > If it's not possible to store a table in HFile v1 in HBase 0.92, then
> > my last option will be to store the data on a pseudo-distributed or
> > standalone cluster for the comparison.
> > The advantage of the current installation is that it is a fully
> > distributed cluster with around 33 million records in a table, so it
> > would give me a better estimate.
> >
> > Thanks,
> > Anil Gupta
> >
> > On Tue, Aug 14, 2012 at 1:48 PM, Kevin O'dell <[email protected]> wrote:
> >
> >> Do you not have a pseudo cluster for testing anywhere?
> >>
> >> On Tue, Aug 14, 2012 at 4:46 PM, anil gupta <[email protected]> wrote:
> >>
> >> > Hi Jerry,
> >> >
> >> > I am willing to do that, but the problem is that I wiped off the
> >> > HBase 0.90 cluster. Is there a way to store a table in HFile v1 in
> >> > HBase 0.92? If I can store a file in HFile v1 on 0.92, then I can
> >> > do the comparison.
> >> >
> >> > Thanks,
> >> > Anil Gupta
> >> >
> >> > On Tue, Aug 14, 2012 at 1:28 PM, Jerry Lam <[email protected]> wrote:
> >> >
> >> > > Hi Anil:
> >> > >
> >> > > Maybe you can try to compare the two HFile implementations
> >> > > directly? Say you write 1000 rows into HFile v1 format and then
> >> > > into HFile v2 format; you can then compare the sizes of the two
> >> > > directly.
> >> > >
> >> > > HTH,
> >> > >
> >> > > Jerry
> >> > >
> >> > > On Tue, Aug 14, 2012 at 3:36 PM, anil gupta <[email protected]> wrote:
> >> > >
> >> > > > Hi Zahoor,
> >> > > >
> >> > > > Then it seems like I might have missed something when doing the
> >> > > > HDFS usage estimation of HBase. I usually do
> >> > > > hadoop fs -dus /hbase/$TABLE_NAME to get the HDFS usage of a
> >> > > > table. Is this the right way? Since I wiped off the HBase 0.90
> >> > > > cluster, I can no longer look into its HDFS usage. Is it
> >> > > > possible to store a table in HFile v1 instead of HFile v2 in
> >> > > > HBase 0.92? That way I could do a fair comparison.
> >> > > >
> >> > > > Thanks,
> >> > > > Anil Gupta
> >> > > >
> >> > > > On Tue, Aug 14, 2012 at 12:13 PM, jmozah <[email protected]> wrote:
> >> > > >
> >> > > > > Hi Anil,
> >> > > > >
> >> > > > > I really doubt that there is a 50% drop in file sizes... As
> >> > > > > far as I know, there is no drastic space-conserving feature
> >> > > > > in v2. Just as an afterthought: do a major compact and check
> >> > > > > the sizes.
> >> > > > >
> >> > > > > ./Zahoor
> >> > > > > http://blog.zahoor.in
> >> > > > >
> >> > > > > On 15-Aug-2012, at 12:31 AM, anil gupta <[email protected]> wrote:
> >> > > > >
> >> > > > > > l
> >> > > > >
> >> > > >
> >> > > > --
> >> > > > Thanks & Regards,
> >> > > > Anil Gupta
> >> >
> >> > --
> >> > Thanks & Regards,
> >> > Anil Gupta
> >>
> >> --
> >> Kevin O'Dell
> >> Customer Operations Engineer, Cloudera
> >
> > --
> > Thanks & Regards,
> > Anil Gupta
>
> --
> Harsh J

--
Thanks & Regards,
Anil Gupta
