I have the same confusion. Say I add three column families A, B and C to the scan. If a row has data for column families B and C but none for A, will it not be returned by the next() method? What if the requirement is to get the row data regardless of whether there's data for a specific column family or not?

On Thu, Mar 15, 2012 at 1:04 PM, lars hofhansl <[email protected]> wrote:

> Hi Peter,
>
> for HBase you have to keep in mind that it is a sparse columnar (or KeyValue)
> store: (rowkey, columnfamily, column, TS) -> value
>
> A scan only returns those KeyValues that match the scan. So when you set
> families on your scan you'll only get those rows for which the scan found
> any columns.
>
> Makes sense?
>
> -- Lars
>
> ________________________________
> From: Peter Wolf <[email protected]>
> To: [email protected]
> Sent: Thursday, March 15, 2012 9:52 AM
> Subject: Re: Scan.addFamiliy reduces results
>
> Thanks Doug,
>
> I had read that, and I just read it again. But I am missing something...
>
> Why does adding a family reduce the number of results? Is there an
> implied filter of some form? Does addFamily add some constraint on
> which rows are returned?
>
> Note that all my rows *ought* to have values in all the families.
>
> Thanks
> Peter
>
> On 3/15/12 12:39 PM, Doug Meil wrote:
> > re: "However, I am getting different number of results, depending on
> > which families are added"
> >
> > Yes.
> >
> > I'd suggest you read this in the RefGuide.
> >
> > http://hbase.apache.org/book.html#datamodel
> >
> > On 3/15/12 12:08 PM, "Peter Wolf"<[email protected]> wrote:
> >
> >> Hi all,
> >>
> >> I am doing a scan on a table with multiple families. My code looks like
> >> this...
> >>
> >> Scan scan = new Scan(calculateStartRowKey(a),
> >>     calculateEndRowKey(b));
> >>
> >> scan.setCaching(10000);
> >> Filter filter = new SingleColumnValueFilter(xFamily, xColumn,
> >>     CompareFilter.CompareOp.EQUAL, Bytes.toBytes(x));
> >> scan.setFilter(filter);
> >> scan
> >>     .addFamily(xFamily)
> >>     .addFamily(yFamily)
> >>     .addFamily(zFamily);
> >>
> >> ResultScanner scanner = hTable.getScanner(scan);
> >>
> >> Iterator<Result> it = scanner.iterator();
> >> int resultCount = 0;
> >> while (it.hasNext()) {
> >>     Result result = it.next();
> >>
> >>     resultCount++;
> >> }
> >>
> >> However, I am getting different number of results, depending on which
> >> families are added. For example these give different result counts
> >>
> >> scan
> >>     //.addFamily(xFamily)
> >>     .addFamily(yFamily)
> >>     .addFamily(zFamily);
> >> and
> >> scan
> >>     .addFamily(xFamily)
> >>     .addFamily(yFamily)
> >>     .addFamily(zFamily);
> >>
> >> There is no error message, and I don't see anything in the Scan
> >> documentation. Does anyone know what is going on?
> >>
> >> Thanks
> >> Peter
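
For reference, here is a minimal sketch of the behaviour Lars describes in the quoted text, as I read it. It does not use the HBase client at all: the table is a plain in-memory map, and the class and method names (SparseScanSketch, sampleTable, put, scan) are hypothetical, made up for illustration. The assumption it encodes is that a scan restricted to a set of families returns every row that has at least one cell in any of those families, so a row with data in B and C but none in A should still come back when A, B and C are all added.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: an in-memory stand-in for a sparse table, not the HBase API.
public class SparseScanSketch {

    // rowkey -> (family -> (qualifier -> value)); missing cells are simply not stored.
    public static Map<String, Map<String, Map<String, String>>> sampleTable() {
        Map<String, Map<String, Map<String, String>>> table = new TreeMap<>();
        put(table, "row1", "A", "q", "1");
        put(table, "row1", "B", "q", "2");
        put(table, "row1", "C", "q", "3");
        put(table, "row2", "B", "q", "4"); // row2 has no cells in family A
        put(table, "row2", "C", "q", "5");
        put(table, "row3", "A", "q", "6"); // row3 has cells only in family A
        return table;
    }

    public static void put(Map<String, Map<String, Map<String, String>>> table,
                           String row, String family, String qualifier, String value) {
        table.computeIfAbsent(row, r -> new TreeMap<>())
             .computeIfAbsent(family, f -> new TreeMap<>())
             .put(qualifier, value);
    }

    // Returns the keys of rows that have at least one cell in ANY requested family.
    public static List<String> scan(Map<String, Map<String, Map<String, String>>> table,
                                    Collection<String> families) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, Map<String, Map<String, String>>> entry : table.entrySet()) {
            for (String family : families) {
                if (entry.getValue().containsKey(family)) {
                    rows.add(entry.getKey());
                    break; // one matching family is enough to return the row
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Map<String, String>>> table = sampleTable();
        System.out.println(scan(table, Arrays.asList("A", "B", "C"))); // [row1, row2, row3]
        System.out.println(scan(table, Arrays.asList("A")));           // [row1, row3]
        System.out.println(scan(table, Arrays.asList("B", "C")));      // [row1, row2]
    }
}
```

Under that reading, a row only drops out when it has no cells in any of the selected families. Separately, the SingleColumnValueFilter javadoc says the tested column should itself be part of the scan, otherwise the filter treats it as missing and (unless setFilterIfMissing(true) is set) lets the row through. That may be what changes the count in Peter's code when xFamily is commented out, though I have not verified it against his table.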
