I think the code in ScanQueryMatcher would suggest a performance limit somewhere in the high tens of thousands of columns. This is a completely unscientific guess, just from looking at the code.
Jesse Yates <[email protected]> wrote:

>Out of curiosity (haven't RTFM on this yet), do we have any hard
>bounds/performance impact on the max number of column families/qualifiers? Has
>that behavior changed with the dynamic CF stuff that fairly recently got
>rolled in?
>
>Further, any pointers on where to start digging into the code on this would be
>great!
>
>Thanks!
>
>- Jesse Yates
>
>Sent from my iPhone.
>
>On Dec 29, 2011, at 1:18 AM, lars hofhansl <[email protected]> wrote:
>
>> Less is not necessarily better. HBase can ignore stores (column families)
>> during a scan or get if no columns in that family were requested.
>>
>> So what you want to do is group columns that are typically queried together
>> in a single column family, and put
>> columns that are not typically queried together in separate families.
>>
>>
>> -- Lars
>>
>>
>> ----- Original Message -----
>> From: Rohit Kelkar <[email protected]>
>> To: [email protected]
>> Cc:
>> Sent: Wednesday, December 28, 2011 9:01 PM
>> Subject: Re: No. of families
>>
>> When we say fewer column families, how few is few? Is this guided by
>> a ratio of the number of rows stored in the HTable to the number of column
>> families? Or the number of tables to the number of column families? If I
>> understand correctly, the content of each column family is stored in a
>> separate file. So does it have anything to do with the disk space
>> allocated to Hadoop?
>>
>> - Rohit Kelkar
>>
>> On Wed, Dec 28, 2011 at 10:14 PM, Mohammad Tariq <[email protected]> wrote:
>>> Hi Doug,
>>>
>>> Thanks a lot for the reply. Yes, I had asked a similar
>>> question. Actually, I am stuck on a schema design issue. I am sorry,
>>> the intention was not to ask the same thing repeatedly. I'll try to
>>> figure it out with the help of the guidelines provided. Many thanks.
>>>
>>> Regards,
>>> Mohammad Tariq
>>>
>>>
>>> On Wed, Dec 28, 2011 at 7:24 PM, Doug Meil
>>> <[email protected]> wrote:
>>>>
>>>> Hi there-
>>>>
>>>> re: "number of CF's"
>>>>
>>>> Yes. Fewer is better.
>>>>
>>>> http://hbase.apache.org/book.html#schema
>>>>
>>>> re: "sub column families"
>>>>
>>>> There aren't "sub column families" - it's just columns (within a CF).
>>>>
>>>> http://hbase.apache.org/book.html#datamodel
>>>>
>>>>
>>>> If I am not mistaken you asked a similar question to the dist-list a few
>>>> weeks ago. The answers haven't changed.
>>>>
>>>>
>>>> On 12/28/11 2:53 AM, "Mohammad Tariq" <[email protected]> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> Having a small number of column families is advisable. Is it feasible to
>>>>> have 2 or 3 sub column families within a single column family? I
>>>>> want to store XML data in HBase and I have sub tags that may go down
>>>>> to 2 or 3 levels.
>>>>>
>>>>> Regards,
>>>>> Mohammad Tariq
>>>>>
>>>>
>>
>
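
To make Lars's point in the quoted thread concrete, here is a toy model (plain Python, not HBase code) of why grouping matters: each column family gets its own store, and a scan that only asks for one family never touches the others. The `Table` class, its methods, and the `meta`/`body` family names are all hypothetical and only meant as an illustration; the real logic lives in the HBase scanner code and is far more involved.

```python
# Toy model only: this is NOT the HBase API. All names are hypothetical.
class Table:
    def __init__(self, families):
        # One store per column family: family -> {row -> {qualifier -> value}}
        self.stores = {cf: {} for cf in families}
        # Records which stores the most recent scan had to read.
        self.stores_read = []

    def put(self, row, family, qualifier, value):
        self.stores[family].setdefault(row, {})[qualifier] = value

    def scan(self, families=None):
        # If no families are requested, every store has to be read.
        wanted = list(self.stores) if families is None else list(families)
        self.stores_read = wanted
        result = {}
        for cf in wanted:
            for row, cols in self.stores[cf].items():
                for qualifier, value in cols.items():
                    result.setdefault(row, {})[f"{cf}:{qualifier}"] = value
        return result


table = Table(["meta", "body"])
table.put("row1", "meta", "title", "t1")
table.put("row1", "body", "xml", "<doc/>")

# Only the "meta" store is read; the (potentially large) "body" store is skipped.
meta_only = table.scan(families=["meta"])
```

In this sketch, columns that are usually read together sit in the same family, so a typical query reads one store; a column that is rarely needed alongside them sits in its own family and is only read when asked for.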
