Cool. Now we have something on the record :-)

./Zahoor@iPad
On 15-Aug-2012, at 3:12 AM, Harsh J <[email protected]> wrote:

> Not wanting to have this thread too end up as a mystery-result on the
> web, I did some tests. I loaded 10k rows (of 100 KB random chars each)
> into test tables on both 0.90 and 0.92, flushed them, major_compact'ed
> them (waited for completion and for the drop in IO write activity), and
> then measured them to find this:
>
> 0.92 takes a total of 1049661190 bytes under its /hbase/test directory.
> 0.90 takes a total of 1049467570 bytes under its /hbase/test directory.
>
> So… not much of a difference. It is still your data that counts. I
> believe what Anil may have had were merely additional, un-compacted
> stores?
>
> P.S. Note that my 'test' tables were all defaults. That is, merely
> "create 'test', 'col1'", nothing else, so block index entries must have
> been created for every row, as the block size defaults to 64k while my
> rows are all 100k each.
>
> On Wed, Aug 15, 2012 at 2:25 AM, anil gupta <[email protected]> wrote:
>> Hi Kevin,
>>
>> If it's not possible to store a table in HFile v1 in HBase 0.92, then
>> my last option will be to store the data on a pseudo-distributed or
>> standalone cluster for the comparison.
>> The advantage of the current installation is that it's a fully
>> distributed cluster with around 33 million records in a table, so it
>> would give me a better estimate.
>>
>> Thanks,
>> Anil Gupta
>>
>> On Tue, Aug 14, 2012 at 1:48 PM, Kevin O'Dell <[email protected]> wrote:
>>
>>> Do you not have a pseudo cluster for testing anywhere?
>>>
>>> On Tue, Aug 14, 2012 at 4:46 PM, anil gupta <[email protected]> wrote:
>>>
>>>> Hi Jerry,
>>>>
>>>> I am willing to do that, but the problem is that I wiped off the
>>>> HBase 0.90 cluster. Is there a way to store a table in HFile v1 in
>>>> HBase 0.92? If I can store a file in HFile v1 in 0.92, then I can do
>>>> the comparison.
>>>>
>>>> Thanks,
>>>> Anil Gupta
>>>>
>>>> On Tue, Aug 14, 2012 at 1:28 PM, Jerry Lam <[email protected]> wrote:
>>>>
>>>>> Hi Anil:
>>>>>
>>>>> Maybe you can try comparing the two HFile implementations directly?
>>>>> Say, write 1000 rows into HFile v1 format and then into HFile v2
>>>>> format. You can then compare the sizes of the two directly?
>>>>>
>>>>> HTH,
>>>>>
>>>>> Jerry
>>>>>
>>>>> On Tue, Aug 14, 2012 at 3:36 PM, anil gupta <[email protected]> wrote:
>>>>>
>>>>>> Hi Zahoor,
>>>>>>
>>>>>> Then it seems like I might have missed something when estimating
>>>>>> the HDFS usage of HBase. I usually do "hadoop fs -dus
>>>>>> /hbase/$TABLE_NAME" to get the HDFS usage of a table. Is this the
>>>>>> right way? Since I wiped off the HBase 0.90 cluster, I cannot look
>>>>>> into its HDFS usage now. Is it possible to store a table in HFile
>>>>>> v1 instead of HFile v2 in HBase 0.92? That way I could do a fair
>>>>>> comparison.
>>>>>>
>>>>>> Thanks,
>>>>>> Anil Gupta
>>>>>>
>>>>>> On Tue, Aug 14, 2012 at 12:13 PM, jmozah <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Anil,
>>>>>>>
>>>>>>> I really doubt that there is a 50% drop in file sizes... As far
>>>>>>> as I know, there is no drastic space-conserving feature in v2.
>>>>>>> Just as an afterthought: do a major compact and check the sizes.
>>>>>>>
>>>>>>> ./Zahoor
>>>>>>> http://blog.zahoor.in
>>>>>>>
>>>>>>> On 15-Aug-2012, at 12:31 AM, anil gupta <[email protected]> wrote:
>>>>>>>
>>>>>>>> l
>>>>>>
>>>>>> --
>>>>>> Thanks & Regards,
>>>>>> Anil Gupta
>>>
>>> --
>>> Kevin O'Dell
>>> Customer Operations Engineer, Cloudera
>
> --
> Harsh J
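For the record, the two totals Harsh measured differ by less than 0.02%, which backs up his "not much of a difference" conclusion. A quick sanity check of the arithmetic (byte counts taken verbatim from his message; the percentage formula is just the standard relative difference):

```python
# Totals under /hbase/test after flush + major compaction, as reported above.
v2_bytes = 1049661190  # HBase 0.92 (HFile v2)
v1_bytes = 1049467570  # HBase 0.90 (HFile v1)

# Absolute and relative difference between the two on-disk footprints.
diff = v2_bytes - v1_bytes
pct = 100.0 * diff / v1_bytes
print(f"v2 is larger by {diff} bytes ({pct:.4f}% of the v1 total)")
```

So any 50% gap Anil observed almost certainly came from comparing a freshly compacted table against one with extra un-compacted store files, not from the HFile format change itself.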
