Sorry for the typo. Please ignore my previous mail; here is the corrected one.

1) I have around 140 columns in each row. Of these, around 100 columns hold Java primitive data types, and the remaining 40 columns contain serialized Java objects as byte arrays (each object contains an ArrayList). Yes, I do delete data, but very infrequently (about 1 in 5K operations). I don't run any compactions.
2) I ran the scan while watching CPU, IO and other system-related parameters. They all looked normal, with system load at 0.1-0.3.
3) Yes, I keep 3 versions of each cell (the default value).
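
For a rough sense of scale, the figures quoted further down this thread (about 1.6 million rows, about 140 columns per row, 30-40 minutes for a full scan) can be turned into a cell-throughput estimate. This is only a back-of-the-envelope sketch in Python: it assumes every row has all 140 columns populated and that only one version per cell is returned, neither of which is confirmed here. The function name `scan_throughput` is made up for illustration.

```python
def scan_throughput(rows, columns_per_row, minutes):
    """Return (total_cells, cells_per_second) for a full-table scan.

    Assumes every row carries all columns and one version per cell.
    """
    total_cells = rows * columns_per_row
    return total_cells, total_cells / (minutes * 60)


# Figures from this thread: ~1.6 million rows, ~140 columns, 30-40 minutes.
for minutes in (30, 40):
    cells, rate = scan_throughput(1_600_000, 140, minutes)
    print(f"{minutes} min: {cells:,} cells, about {rate:,.0f} cells/s")
```

Under those assumptions the scan touches roughly 224 million cells, which puts the observed rate somewhere around 93K-124K cells per second.
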
On Mon, Jul 1, 2013 at 10:33 PM, Vimal Jain <[email protected]> wrote:

> Hi Lars,
> 1)I have around 140 columns for each row , out of 140 , around 100 rows
> are holds java primitive data type , remaining 40 rows contains serialized
> java object as byte array. Yes , I do delete data but the frequency is very
> less ( 1 out of 5K operations ). I dont run any compaction.
> 2) I had ran scan keeping in mind the CPU,IO and other system related
> parameters.I found them to be normal with system load being 0.1-0.3.
> 3) Yes i have 3 versions of cell ( default value).
>
> On Mon, Jul 1, 2013 at 9:08 PM, lars hofhansl <[email protected]> wrote:
>
>> The performance you're seeing is definitely not typical. 'couple of
>> further questions:
>> - How large are your KVs (columns)?
>> - Do you delete data? Do you run major compactions?
>> - Can you measure: CPU, IO, context switches, etc, during the scanning?
>> - Do you have many versions of the columns?
>>
>> Note that HBase is a key value store, i.e. the storage is sparse. Each
>> column is represented by its own key value pair, and HBase has to do the
>> work to reassemble the data.
>>
>> -- Lars
>>
>> ________________________________
>> From: Vimal Jain <[email protected]>
>> To: [email protected]
>> Sent: Monday, July 1, 2013 4:44 AM
>> Subject: Re: How many column families in one table ?
>>
>> Hi,
>> We had some hardware constraints along with the fact that our total data
>> size was in GBs.
>> Thats why to start with Hbase , we first began with pseudo distributed
>> mode and thought if required we would upgrade to fully distributed mode.
>>
>> On Mon, Jul 1, 2013 at 5:09 PM, Ted Yu <[email protected]> wrote:
>>
>> > bq. I have configured Hbase in pseudo distributed mode on top of HDFS.
>> >
>> > What was the reason for using pseudo distributed mode in production setup ?
>> >
>> > Cheers
>> >
>> > On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain <[email protected]> wrote:
>> >
>> > > Thanks Dhaval/Michael/Ted/Otis for your replies.
>> > > Actually , i asked this question because i am seeing some performance
>> > > degradation in my production Hbase setup.
>> > > I have configured Hbase in pseudo distributed mode on top of HDFS.
>> > > I have created 17 Column families :( . I am actually using 14 out of these
>> > > 17 column families.
>> > > Each column family has around on average 8-10 column qualifiers so total
>> > > around 140 columns are there for each row key.
>> > > I have around 1.6 millions rows in the table.
>> > > To completely scan the table for all 140 columns , it takes around 30-40
>> > > minutes.
>> > > Is it normal or Should i redesign my table schema ( probably merging 4-5
>> > > column families into one , so that at the end i have just 3-4 cf ) ?
>> > >
>> > > On Sat, Jun 29, 2013 at 12:06 AM, Otis Gospodnetic <[email protected]> wrote:
>> > >
>> > > > Hm, works for me -
>> > > >
>> > > > http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fb&subj=Re+HBase+Column+Family+Limit+Reasoning
>> > > >
>> > > > Shorter version: http://search-hadoop.com/m/qOx8l15Z1q42
>> > > >
>> > > > Otis
>> > > > --
>> > > > Solr & ElasticSearch Support -- http://sematext.com/
>> > > > Performance Monitoring -- http://sematext.com/spm
>> > > >
>> > > > On Fri, Jun 28, 2013 at 8:40 AM, Vimal Jain <[email protected]> wrote:
>> > > > > Hi All ,
>> > > > > Thanks for your replies.
>> > > > >
>> > > > > Ted,
>> > > > > Thanks for the link, but its not working . :(
>> > > > >
>> > > > > On Fri, Jun 28, 2013 at 5:57 PM, Ted Yu <[email protected]> wrote:
>> > > > >
>> > > > >> Vimal:
>> > > > >> Please also refer to:
>> > > > >>
>> > > > >> http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fb&subj=Re+HBase+Column+Family+Limit+Reasoning
>> > > > >>
>> > > > >> On Fri, Jun 28, 2013 at 1:37 PM, Michel Segel <[email protected]> wrote:
>> > > > >>
>> > > > >> > Short answer... As few as possible.
>> > > > >> >
>> > > > >> > 14 CF doesn't make too much sense.
>> > > > >> >
>> > > > >> > Sent from a remote device. Please excuse any typos...
>> > > > >> >
>> > > > >> > Mike Segel
>> > > > >> >
>> > > > >> > On Jun 28, 2013, at 12:20 AM, Vimal Jain <[email protected]> wrote:
>> > > > >> >
>> > > > >> > > Hi,
>> > > > >> > > How many column families should be there in an hbase table ? Is there any
>> > > > >> > > performance issue in read/write if we have more column families ?
>> > > > >> > > I have designed one table with around 14 column families in it with each
>> > > > >> > > having on average 6 qualifiers.
>> > > > >> > > Is it a good design ?
>> > > > >> > >
>> > > > >> > > --
>> > > > >> > > Thanks and Regards,
>> > > > >> > > Vimal Jain
>> > > > >
>> > > > > --
>> > > > > Thanks and Regards,
>> > > > > Vimal Jain
>> > >
>> > > --
>> > > Thanks and Regards,
>> > > Vimal Jain
>>
>> --
>> Thanks and Regards,
>> Vimal Jain
>
> --
> Thanks and Regards,
> Vimal Jain

--
Thanks and Regards,
Vimal Jain
