Hi,
I hope you can shed some light on the two scenarios below.
I have two small tables of 6000 rows each. Table 1 has only 1 column in
each of its rows; table 2 has 40 columns in each of its rows. Other than
that, the two tables are identical.
In both tables, only 1 row contains the column value that I am filtering
on, and the Scan performs correctly in both cases, returning only that
single result.
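For context, the tables were populated roughly like this (the row keys and
qualifier names here are placeholders):
HTable t = new HTable(conf, tableName);
for (int i = 0; i < 6000; i++) {
    Put put = new Put(Bytes.toBytes(String.format("row%05d", i)));
    // numColumns is 1 for table 1 and 40 for table 2
    for (int c = 0; c < numColumns; c++) {
        put.add(cf, Bytes.toBytes("q" + c), Bytes.toBytes("v" + c));
    }
    t.put(put);
}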
The code looks something like the following:
Scan scan = new Scan(startRow, stopRow); // start/stop rows cover all 6000 rows
scan.addColumn(cf, qualifier); // only return the column I am interested in
                               // (should be in only 1 row, with only 1 version)
Filter f1 = new InclusiveStopFilter(stopRow);
Filter f2 = new SingleColumnValueFilter(cf, qualifier,
        CompareFilter.CompareOp.EQUAL, value);
scan.setFilter(new FilterList(f1, f2)); // defaults to MUST_PASS_ALL
scan.setTimeRange(0, Long.MAX_VALUE);
scan.setMaxVersions(1);
ResultScanner rs = t.getScanner(scan); // t is the HTable under test
try {
    for (Result result : rs) {
        ...
    }
} finally {
    rs.close(); // always release the scanner
}
For table 1, rs.next() takes about 30ms.
For table 2, rs.next() takes about 180ms.
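(Timed by wrapping the first next() call, roughly like this:)
long start = System.nanoTime();
Result result = rs.next(); // the single matching row
long elapsedMs = (System.nanoTime() - start) / 1000000L;
System.out.println("next() took " + elapsedMs + "ms");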
Both scans return the exact same result. Why does it take so much longer
on table 2 to get the same result? The scan depth (number of rows) is the
same; the only difference is the row width. But I'm filtering on a single
column and returning only that column.
Am I missing something? As I increase the number of columns, the response
time gets worse. I would expect the response time to get worse when
increasing the number of rows, but not when increasing the number of
columns, since I'm returning only 1 column in both cases.
I appreciate any comments that you have.
-Tony
Tony Dean
SAS Institute Inc.
Principal Software Developer
919-531-6704