Re: Scan.addFamiliy reduces results

lars hofhansl Thu, 15 Mar 2012 10:18:28 -0700

Hi haijia,

In that case HBase will still return the data for the columns in families B 
and C. But if you had added only family A, then HBase would return only those 
"rows" for which family A has any columns.
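
To make the behaviour above concrete, here is a minimal sketch in plain Java. 
It does not use the HBase client at all: the nested maps, the class name 
SparseScanSketch, and the scan() helper are hypothetical stand-ins that only 
model the (rowkey, columnfamily, column) -> value idea, under the assumption 
that a row is returned whenever at least one of the selected families has a 
column for it.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model of a sparse store: rowkey -> family -> column -> value.
// This is NOT the HBase API; every name here is made up for illustration.
public class SparseScanSketch {

    static Map<String, Map<String, Map<String, String>>> sampleTable() {
        Map<String, Map<String, Map<String, String>>> table = new TreeMap<>();
        put(table, "row1", "A", "c1", "v1");
        put(table, "row1", "B", "c1", "v2");
        put(table, "row2", "B", "c1", "v3"); // row2 has no data in family A
        put(table, "row2", "C", "c1", "v4");
        put(table, "row3", "C", "c1", "v5"); // row3 has data only in family C
        return table;
    }

    static void put(Map<String, Map<String, Map<String, String>>> table,
                    String row, String family, String column, String value) {
        table.computeIfAbsent(row, r -> new TreeMap<>())
             .computeIfAbsent(family, f -> new TreeMap<>())
             .put(column, value);
    }

    // Keeps only the cells in the selected families; a row with no
    // matching cells yields nothing and is therefore absent from the result.
    static Map<String, Map<String, Map<String, String>>> scan(
            Map<String, Map<String, Map<String, String>>> table,
            Set<String> families) {
        Map<String, Map<String, Map<String, String>>> results = new TreeMap<>();
        for (Map.Entry<String, Map<String, Map<String, String>>> row : table.entrySet()) {
            Map<String, Map<String, String>> matched = new TreeMap<>();
            for (Map.Entry<String, Map<String, String>> fam : row.getValue().entrySet()) {
                if (families.contains(fam.getKey())) {
                    matched.put(fam.getKey(), fam.getValue());
                }
            }
            if (!matched.isEmpty()) {
                results.put(row.getKey(), matched);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Map<String, String>>> table = sampleTable();
        // Families A, B and C selected: row2 still comes back, with B and C only.
        System.out.println(scan(table, Set.of("A", "B", "C")).keySet()); // [row1, row2, row3]
        // Only family A selected: just the rows that have a column in A.
        System.out.println(scan(table, Set.of("A")).keySet());           // [row1]
    }
}
```

In this model row2 plays the role of the row in your question: it has B and C 
but no A, and it is still returned when all three families are selected.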


-- Lars
________________________________

From: Haijia Zhou <[email protected]>
To: [email protected]; lars hofhansl <[email protected]> 
Sent: Thursday, March 15, 2012 10:12 AM
Subject: Re: Scan.addFamiliy reduces results


I have the same confusion. Say I added three column families A, B and C to 
the scan. If a row has data for column families B and C but no data for A, 
does that mean it won't be returned by the next() method?
What if the requirement is to get row data regardless of whether there's data 
for a specific column family or not?


On Thu, Mar 15, 2012 at 1:04 PM, lars hofhansl <[email protected]> wrote:

>Hi Peter,
>
>for HBase you have to keep in mind that it is a sparse columnar (or KeyValue) 
>store: (rowkey, columnfamily, column, TS) -> value
>
>A scan only returns those KeyValues that match the scan. So when you set 
>families on your scan you'll only get those rows for which the scan found any 
>columns in those families.
>
>Makes sense?
>
>-- Lars
>
>
>
>________________________________
> From: Peter Wolf <[email protected]>
>To: [email protected]
>Sent: Thursday, March 15, 2012 9:52 AM
>Subject: Re: Scan.addFamiliy reduces results
>
>
>Thanks Doug,
>
>I had read that, and I just read it again.  But I am missing something...
>
>Why does adding a family reduce the number of results?  Is there an
>implied filter of some form?  Does addFamily add some constraint on
>which rows are returned?
>
>Note that all my rows *ought* to have values in all the families.
>
>Thanks
>Peter
>
>On 3/15/12 12:39 PM, Doug Meil wrote:
>> re:  "However, I am getting different number of results, depending on
>> which families are added"
>>
>> Yes.
>>
>> I'd suggest you read this in the RefGuide.
>>
>> http://hbase.apache.org/book.html#datamodel
>>
>>
>>
>>
>>
>> On 3/15/12 12:08 PM, "Peter Wolf"<[email protected]>  wrote:
>>
>>> Hi all,
>>>
>>> I am doing a scan on a table with multiple families.  My code looks like
>>> this...
>>>
>>>          Scan scan = new Scan(calculateStartRowKey(a), calculateEndRowKey(b));
>>>
>>>          scan.setCaching(10000);
>>>          Filter filter = new SingleColumnValueFilter(xFamily, xColumn,
>>>                  CompareFilter.CompareOp.EQUAL, Bytes.toBytes(x));
>>>          scan.setFilter(filter);
>>>          scan
>>>                  .addFamily(xFamily)
>>>                  .addFamily(yFamily)
>>>                  .addFamily(zFamily);
>>>
>>>          ResultScanner scanner = hTable.getScanner(scan);
>>>
>>>          Iterator<Result>  it = scanner.iterator();
>>>          int resultCount = 0;
>>>          while (it.hasNext()) {
>>>                Result result = it.next();
>>>
>>>                resultCount++;
>>>          }
>>>
>>> However, I am getting different number of results, depending on which
>>> families are added.  For example these give different result counts
>>>
>>>          scan
>>>                  //.addFamily(xFamily)
>>>                  .addFamily(yFamily)
>>>                  .addFamily(zFamily);
>>> and
>>>          scan
>>>                  .addFamily(xFamily)
>>>                  .addFamily(yFamily)
>>>                  .addFamily(zFamily);
>>>
>>>
>>> There is no error message, and I don't see anything in the Scan
>>> documentation.  Does anyone know what is going on?
>>>
>>> Thanks
>>> Peter
>>>
>>>
>>>
>>
