Yes, you are absolutely correct. HBase must materialize whatever part of
the row you retrieve, whether that is one column family, one column, a
list of columns, or the entire row. That data just has to fit into memory.
Fixing this requires an API change, and I'm not sure whether that is making
it into 0.21. But if you split the data up by column family as you
indicated, HBase only retrieves the data necessary.
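
To make the point above concrete, here is a minimal sketch in Python of a toy in-memory model. It is not the HBase client API: `ToyTable` and `materialized_bytes` are hypothetical names invented for illustration, and the only assumption carried over from HBase is that cells are stored separately per column family. (In the real Java client, I believe the equivalent restriction is made on the `Get` object, for example with `addFamily`.)

```python
from collections import defaultdict


class ToyTable:
    """Toy model only: one store per column family, not real HBase code."""

    def __init__(self):
        # family -> row key -> {qualifier: value}
        self._stores = defaultdict(dict)

    def put(self, row, family, qualifier, value):
        self._stores[family].setdefault(row, {})[qualifier] = value

    def get(self, row, families=None):
        """Materialize the requested part of a row in memory.

        families=None means "the entire row", so every store is read.
        """
        wanted = list(self._stores) if families is None else families
        result = {}
        for family in wanted:
            cells = self._stores.get(family, {}).get(row)
            if cells:
                result[family] = dict(cells)  # copy = what the reader must hold
        return result


def materialized_bytes(result):
    """Total size of the values a single read had to hold in memory."""
    return sum(len(v) for cells in result.values() for v in cells.values())


table = ToyTable()
table.put("user1", "meta", "name", b"Greg")
table.put("user1", "blob", "photo", b"x" * 1_000_000)

narrow = table.get("user1", families=["meta"])  # touches only the "meta" store
full = table.get("user1")                       # touches every store

print(materialized_bytes(narrow), materialized_bytes(full))
```

The narrow read never looks at the `blob` store, so the fat column costs nothing unless you ask for it. Reading the whole row still has to fit everything in memory at once.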

-ryan

On Wed, Nov 11, 2009 at 10:25 PM, Greg Cottman <greg.cott...@quest.com> wrote:
> Hi Ryan,
>
> If you only query columns from one column family though, won't HBase use data 
> locality to fetch only enough data to populate that column family?
>
> That way you can have rows with more columns in them, and still write 
> efficient queries that don't fetch all the irrelevant columns in a fat row.
>
> Cheers,
> Greg.
>
> -----Original Message-----
> From: Ryan Rawson [mailto:ryano...@gmail.com]
> Sent: Thursday, 12 November 2009 5:18 PM
> To: hbase-user@hadoop.apache.org
> Subject: Re: newbie question: what is better? one with a lot of keys OR a lot 
> of tables with fewer keys?
>
> Either is fine. When you read an entire row from HBase, it must
> materialize the entire row in RAM. Thus your table width is limited if
> you wish to read the entire row at a time.
>
> On Wed, Nov 11, 2009 at 9:45 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>> Continuing this question:
>>
>> Which is better for HBase, more rows with fewer columns or fewer rows with
>> more columns?
>>
>>
>> Jeff Zhang
>>
>>
>> On Thu, Nov 12, 2009 at 5:17 AM, TuxRacer69 <tuxrace...@gmail.com> wrote:
>>
>>> Thank you Jean-Daniel
>>>
>>>
>>> Jean-Daniel Cryans wrote:
>>>
>>>> Alex,
>>>>
>>>> In HBase it really makes more sense to put all the data you can in a
>>>> single table as it will be automatically partitioned and distributed
>>>> across the region servers (providing you have more than 256MB of
>>>> data).
>>>>
>>>> J-D
>>>>
>>>>
>>>
>>>
>>
>
