Re: How many column families in one table ?

Vimal Jain Mon, 01 Jul 2013 10:07:30 -0700

Sorry for the typo. Please ignore the previous mail; here is the corrected
one.
1) I have around 140 columns for each row. Out of the 140, around 100 columns
hold Java primitive data types; the remaining 40 columns contain serialized
Java objects as byte arrays (inside each object is an ArrayList). Yes, I do
delete data, but the frequency is very low (1 out of 5K operations). I
don't run any compaction.
2) I ran the scan while keeping an eye on CPU, IO and other system-related
parameters. I found them to be normal, with system load being 0.1-0.3.
3) Yes, I have 3 versions of each cell (the default value).
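
As a rough sanity check on the figures quoted in this thread (row count,
columns per row, full-scan time), the scan rate they imply can be worked out
with a few lines of Python. This is only a back-of-envelope sketch: it assumes
every row has all 140 columns populated and that the scan returns one version
per cell, and the function name is made up for illustration.

```python
# Back-of-envelope scan throughput from the figures given in this thread.
ROWS = 1_600_000          # approximate row count of the table
COLUMNS_PER_ROW = 140     # approximate columns (key-values) per row
SCAN_MINUTES = (30, 40)   # reported range for one full table scan


def scan_throughput(rows, columns_per_row, minutes):
    """Return (key_values, rows_per_sec, key_values_per_sec) for one full scan."""
    seconds = minutes * 60
    # Assumption: every column is present in every row, one version per cell.
    key_values = rows * columns_per_row
    return key_values, rows / seconds, key_values / seconds


for minutes in SCAN_MINUTES:
    kvs, rows_s, kvs_s = scan_throughput(ROWS, COLUMNS_PER_ROW, minutes)
    print(f"{minutes} min: {kvs:,} KVs, {rows_s:,.0f} rows/s, {kvs_s:,.0f} KVs/s")
```

With these inputs the scan works out to roughly 670-890 rows per second, or
about 93K-124K key-values per second.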




On Mon, Jul 1, 2013 at 10:33 PM, Vimal Jain <[email protected]> wrote:

> Hi Lars,
> 1)I have around 140 columns for each row , out of 140 , around 100 rows
> are holds java primitive data type , remaining 40 rows contains serialized
> java object as byte array. Yes , I do delete data but the frequency is very
> less ( 1 out of 5K operations ). I dont run any compaction.
> 2) I had ran scan keeping in mind the CPU,IO and other system related
> parameters.I found them to be normal with system load being 0.1-0.3.
> 3) Yes i have 3 versions of cell ( default value).
>
>
> On Mon, Jul 1, 2013 at 9:08 PM, lars hofhansl <[email protected]> wrote:
>
>> The performance you're seeing is definitely not typical. 'couple of
>> further questions:
>> - How large are your KVs (columns)?
>> - Do you delete data? Do you run major compactions?
>> - Can you measure: CPU, IO, context switches, etc, during the scanning?
>> - Do you have many versions of the columns?
>>
>>
>> Note that HBase is a key value store, i.e. the storage is sparse. Each
>> column is represented by its own key value pair, and HBase has to do the
>> work to reassemble the data.
>>
>>
>> -- Lars
>>
>>
>>
>> ________________________________
>>  From: Vimal Jain <[email protected]>
>> To: [email protected]
>> Sent: Monday, July 1, 2013 4:44 AM
>> Subject: Re: How many column families in one table ?
>>
>>
>> Hi,
>> We had some hardware constraints, along with the fact that our total data
>> size was in GBs.
>> That's why, to start with HBase, we first began with pseudo-distributed
>> mode and thought that, if required, we would upgrade to fully distributed
>> mode.
>>
>>
>>
>> On Mon, Jul 1, 2013 at 5:09 PM, Ted Yu <[email protected]> wrote:
>>
>> > bq. I have configured Hbase in pseudo distributed mode on top of HDFS.
>> >
>> > What was the reason for using pseudo distributed mode in production
>> > setup?
>> >
>> > Cheers
>> >
>> > On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain <[email protected]> wrote:
>> >
>> > > Thanks Dhaval/Michael/Ted/Otis for your replies.
>> > > Actually, I asked this question because I am seeing some performance
>> > > degradation in my production HBase setup.
>> > > I have configured Hbase in pseudo distributed mode on top of HDFS.
>> > > I have created 17 column families :( . I am actually using 14 out of
>> > > these 17 column families.
>> > > Each column family has on average around 8-10 column qualifiers, so in
>> > > total there are around 140 columns for each row key.
>> > > I have around 1.6 million rows in the table.
>> > > To completely scan the table for all 140 columns, it takes around
>> > > 30-40 minutes.
>> > > Is this normal, or should I redesign my table schema (probably merging
>> > > 4-5 column families into one, so that in the end I have just 3-4 CFs)?
>> > >
>> > >
>> > >
>> > > On Sat, Jun 29, 2013 at 12:06 AM, Otis Gospodnetic <
>> > > [email protected]> wrote:
>> > >
>> > > > Hm, works for me -
>> > > >
>> > > > http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fb&subj=Re+HBase+Column+Family+Limit+Reasoning
>> > > >
>> > > > Shorter version: http://search-hadoop.com/m/qOx8l15Z1q42
>> > > >
>> > > > Otis
>> > > > --
>> > > > Solr & ElasticSearch Support -- http://sematext.com/
>> > > > Performance Monitoring -- http://sematext.com/spm
>> > > >
>> > > >
>> > > >
>> > > > On Fri, Jun 28, 2013 at 8:40 AM, Vimal Jain <[email protected]> wrote:
>> > > > > Hi All ,
>> > > > > Thanks for your replies.
>> > > > >
>> > > > > Ted,
>> > > > > Thanks for the link, but it's not working. :(
>> > > > >
>> > > > >
>> > > > > On Fri, Jun 28, 2013 at 5:57 PM, Ted Yu <[email protected]> wrote:
>> > > > >
>> > > > >> Vimal:
>> > > > >> Please also refer to:
>> > > > >>
>> > > > >> http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fb&subj=Re+HBase+Column+Family+Limit+Reasoning
>> > > > >>
>> > > > >> On Fri, Jun 28, 2013 at 1:37 PM, Michel Segel <[email protected]> wrote:
>> > > > >>
>> > > > >> > Short answer... As few as possible.
>> > > > >> >
>> > > > >> > 14 CF doesn't make too much sense.
>> > > > >> >
>> > > > >> > Sent from a remote device. Please excuse any typos...
>> > > > >> >
>> > > > >> > Mike Segel
>> > > > >> >
>> > > > >> > On Jun 28, 2013, at 12:20 AM, Vimal Jain <[email protected]> wrote:
>> > > > >> >
>> > > > >> > > Hi,
>> > > > >> > > How many column families should be there in an hbase table?
>> > > > >> > > Is there any performance issue in read/write if we have more
>> > > > >> > > column families?
>> > > > >> > > I have designed one table with around 14 column families in
>> > > > >> > > it with each having on average 6 qualifiers.
>> > > > >> > > Is it a good design?
>> > > > >> > >
>> > > > >> > > --
>> > > > >> > > Thanks and Regards,
>> > > > >> > > Vimal Jain
>> > > > >> >
>> > > > >>
>> > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Thanks and Regards,
>> > > > > Vimal Jain
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Thanks and Regards,
>> > > Vimal Jain
>> > >
>> >
>>
>>
>>
>> --
>> Thanks and Regards,
>> Vimal Jain
>>
>
>
>
> --
> Thanks and Regards,
> Vimal Jain
>



-- 
Thanks and Regards,
Vimal Jain
