Thanks a lot for the quick test, Harsh. This will certainly help me. I'll see if I am missing something in my comparison of HDFS usage between HBase 0.90 and HBase 0.92.
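
For reference, here is roughly how I plan to repeat your test on my side before re-measuring. This is an untested sketch against a defaults-only 'test' table with family 'col1' as in your run; the row-key format and the 'q' qualifier are placeholders I picked:

import java.util.Random;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class HFileSizeTest {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();

    // Load 10k rows of 100 KB random bytes each, as in Harsh's test.
    // Assumes the table already exists: create 'test', 'col1'
    HTable table = new HTable(conf, "test");
    Random rnd = new Random();
    byte[] value = new byte[100 * 1024];
    for (int i = 0; i < 10000; i++) {
      rnd.nextBytes(value);
      Put put = new Put(Bytes.toBytes(String.format("row%05d", i)));
      put.add(Bytes.toBytes("col1"), Bytes.toBytes("q"), value); // 'q' is a made-up qualifier
      table.put(put);
    }
    table.close();

    // Flush and major-compact so only settled store files get measured.
    HBaseAdmin admin = new HBaseAdmin(conf);
    admin.flush("test");
    admin.majorCompact("test");
    // Note: majorCompact() only triggers the compaction; wait for the
    // write I/O to drop off (as Harsh did) before measuring.

    // Measure the table directory; assumes fs.default.name points at the
    // cluster's HDFS. This should report the same number as
    // `hadoop fs -dus /hbase/test`.
    FileSystem fs = FileSystem.get(conf);
    long bytes = fs.getContentSummary(new Path("/hbase/test")).getLength();
    System.out.println("/hbase/test occupies " + bytes + " bytes");
  }
}

If my numbers still differ after this, then the gap is in my data, not in the file format.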
Thanks Again,
Anil

On Tue, Aug 14, 2012 at 2:42 PM, Harsh J <[email protected]> wrote:

> Not wanting to have this thread too end up as a mystery result on the
> web, I did some tests. I loaded 10k rows (of 100 KB random chars each)
> into test tables on both 0.90 and 0.92, flushed them, major_compact'ed
> them (waited for completion and a drop in I/O write activity), and then
> measured them to find this:
>
> 0.92 takes a total of 1049661190 bytes under its /hbase/test directory.
> 0.90 takes a total of 1049467570 bytes under its /hbase/test directory.
>
> So… not much of a difference. It is still your data that counts. I
> believe what Anil may have had were merely additional, un-compacted
> stores?
>
> P.S. Note that my 'test' tables were all defaults. That is, merely
> "create 'test', 'col1'", nothing else, so block index entries must have
> been created for every row, as the block size is 64 KB by default while
> my rows are all 100 KB each.
>
> On Wed, Aug 15, 2012 at 2:25 AM, anil gupta <[email protected]> wrote:
> > Hi Kevin,
> >
> > If it's not possible to store a table in HFile v1 in HBase 0.92, then
> > my last option will be to store the data on a pseudo-distributed or
> > standalone cluster for the comparison.
> > The advantage of the current installation is that it is a fully
> > distributed cluster with around 33 million records in a table, so it
> > would give me a better estimate.
> >
> > Thanks,
> > Anil Gupta
> >
> > On Tue, Aug 14, 2012 at 1:48 PM, Kevin O'dell <[email protected]> wrote:
> >
> >> Do you not have a pseudo cluster for testing anywhere?
> >>
> >> On Tue, Aug 14, 2012 at 4:46 PM, anil gupta <[email protected]> wrote:
> >>
> >> > Hi Jerry,
> >> >
> >> > I am willing to do that, but the problem is that I wiped off the
> >> > HBase 0.90 cluster. Is there a way to store a table in HFile v1 in
> >> > HBase 0.92? If I can store a file in HFile v1 on 0.92, then I can
> >> > do the comparison.
> >> >
> >> > Thanks,
> >> > Anil Gupta
> >> >
> >> > On Tue, Aug 14, 2012 at 1:28 PM, Jerry Lam <[email protected]> wrote:
> >> >
> >> > > Hi Anil:
> >> > >
> >> > > Maybe you can try to compare the two HFile implementations
> >> > > directly? Say you write 1000 rows into HFile v1 format and then
> >> > > into HFile v2 format; you can then compare the sizes of the two
> >> > > directly.
> >> > >
> >> > > HTH,
> >> > >
> >> > > Jerry
> >> > >
> >> > > On Tue, Aug 14, 2012 at 3:36 PM, anil gupta <[email protected]> wrote:
> >> > >
> >> > > > Hi Zahoor,
> >> > > >
> >> > > > Then it seems like I might have missed something when doing the
> >> > > > HDFS usage estimation of HBase. I usually do
> >> > > > hadoop fs -dus /hbase/$TABLE_NAME to get the HDFS usage of a
> >> > > > table. Is this the right way? Since I wiped off the HBase 0.90
> >> > > > cluster, I can no longer look into its HDFS usage. Is it
> >> > > > possible to store a table in HFile v1 instead of HFile v2 in
> >> > > > HBase 0.92? That way I could do a fair comparison.
> >> > > >
> >> > > > Thanks,
> >> > > > Anil Gupta
> >> > > >
> >> > > > On Tue, Aug 14, 2012 at 12:13 PM, jmozah <[email protected]> wrote:
> >> > > >
> >> > > > > Hi Anil,
> >> > > > >
> >> > > > > I really doubt that there is a 50% drop in file sizes... As
> >> > > > > far as I know, there is no drastic space-conserving feature
> >> > > > > in v2. Just as an afterthought: do a major compact and check
> >> > > > > the sizes.
> >> > > > >
> >> > > > > ./Zahoor
> >> > > > > http://blog.zahoor.in
> >> > > > >
> >> > > > > On 15-Aug-2012, at 12:31 AM, anil gupta <[email protected]> wrote:
> >> > > > >
> >> > > > > > l
> >> > > > >
> >> > > >
> >> > > > --
> >> > > > Thanks & Regards,
> >> > > > Anil Gupta
> >> >
> >> > --
> >> > Thanks & Regards,
> >> > Anil Gupta
> >>
> >> --
> >> Kevin O'Dell
> >> Customer Operations Engineer, Cloudera
> >
> > --
> > Thanks & Regards,
> > Anil Gupta
>
> --
> Harsh J

--
Thanks & Regards,
Anil Gupta
