Yes.

Currently, if one column family triggers a split, then all of the column
families get split along with it. So if one family holds a large blob and the
other holds only small values, you're going to shoot yourself in the foot.

Are you filtering on any of the values in the 'info' family?
If not, you could try serializing the info data into a single record (Avro is
one example) and then storing everything in a single column family, where one
column contains the info record and the other column contains the blob.
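
A rough sketch of that layout, just to make it concrete. JSON stands in for
Avro here so the example runs with the standard library only, and the family
name "d", the column names, and the helper functions are all made up for
illustration; they are not HBase API calls.

```python
import json

# Hypothetical single-family layout: family "d" with two columns.
INFO_COL = b"d:info"   # serialized info record (JSON here, Avro in practice)
BLOB_COL = b"d:blob"   # raw PDF bytes


def build_row(page, sender_id, pdf_name, props, blob):
    """Pack all info fields into one record and return the row's cells."""
    info = {"pg": page, "id": sender_id, "nm": pdf_name, "props": props}
    return {
        INFO_COL: json.dumps(info, sort_keys=True).encode("utf-8"),
        BLOB_COL: blob,
    }


def read_info(cells):
    """Decode the info record back into a dict."""
    return json.loads(cells[INFO_COL].decode("utf-8"))


# User-defined properties live inside the record, not in their own columns.
row = build_row(3, "sender-42", "report.pdf", {"color": "blue"}, b"%PDF-1.4 ...")
info = read_info(row)
```

The trade-off is the one asked about above: once the info fields are packed
into one value, you can no longer filter on them individually on the server
side.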

Or you could use two tables with the same row key. But that would mean two
get()s per record. Having said that, if you were doing a table scan, you'd
scan only the info data and, based on the results, fetch back the blob.
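
Here is a sketch of the two-table access pattern. Plain dicts stand in for the
two tables, and the table contents, row keys, and the scan_then_fetch() helper
are invented for the example; a real client would issue a scan against the
info table and a get() against the blob table.

```python
# Two hypothetical tables that share the same row keys.
info_table = {
    b"doc1#p1": {"pg": 1, "id": "alice", "nm": "a.pdf"},
    b"doc1#p2": {"pg": 2, "id": "alice", "nm": "a.pdf"},
    b"doc2#p1": {"pg": 1, "id": "bob", "nm": "b.pdf"},
}
blob_table = {
    b"doc1#p1": b"%PDF page 1 of a",
    b"doc1#p2": b"%PDF page 2 of a",
    b"doc2#p1": b"%PDF page 1 of b",
}


def scan_then_fetch(info_table, blob_table, predicate):
    """Scan the small info rows; fetch a blob only for rows that match."""
    for row_key in sorted(info_table):        # row-key order, like a scan
        info = info_table[row_key]
        if predicate(info):
            # Second lookup, by the same row key, against the blob table.
            yield row_key, info, blob_table[row_key]


hits = list(scan_then_fetch(info_table, blob_table,
                            lambda info: info["id"] == "alice"))
```

The scan never touches the blobs for rows that don't match, which is the
point of keeping them in a separate table.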

HTH

-Mike

On Mar 20, 2012, at 3:56 AM, Laxman wrote:

> Do we see any problem with the below schema?
> 
>      family "info":
>          "info:pg" - keeps page number
>          "info:id" - sender ID
>          "info:nm" - pdf name
>          "info:prop_name" - column to hold property name
>          "info:prop_value" - column to hold property value
>      family "data":
>          "data:blob" - blob of pdf file
> 
> --
> Regards,
> Laxman
>> -----Original Message-----
>> From: Konrad Tendera [mailto:kon...@tendera.eu]
>> Sent: Monday, March 19, 2012 8:22 PM
>> To: user@hbase.apache.org
>> Subject: Rows vs. Columns
>> 
>> Hello,
>> 
>> I'm designing some schema for my use case and I'm considering what will
>> be better: rows or columns. Here's what I need - my schema actually
>> looks like this (it will be used for keeping small pdf files or
>> single pages of larger documents)
>> table files:
>>     family "info":
>>         "info:pg" - keeps page number
>>         "info:id" - sender ID
>>         "info:nm" - pdf name
>>         ***
>>     family "data":
>>         "data:blob" - blob of pdf file
>> 
>> Now let's get back to ***: each user can add multiple additional
>> properties ("name" - "value"), but let's assume that every user will be
>> so creative that no two names will be the same. I don't know how to solve
>> this problem: should each "name" be a new column ("info:name"), or should
>> I try to do this like it is said here:
>> http://hbase.apache.org/book.html#schema.smackdown.rowscols and make a
>> new row for each property?
>> 
>> K.
> 
> 
