Okay, looking at the shell script and the code, it seems that when you use the count command, the user's filter is not taken into account. It adds a FirstKeyOnlyFilter and proceeds with the scan. :(
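
To make the difference concrete, here is a minimal plain-Java sketch of the behaviour described above. It uses no HBase classes: the class name, both method names, and the four-row sample are hypothetical, and the pass-everything predicate only stands in for the filter that count installs. It is a model of the reported symptom, not the shell's actual implementation.

```java
import java.util.List;
import java.util.function.Predicate;

// Plain-Java model of the reported behaviour; no HBase classes are involved
// and every name in this class is hypothetical.
final class FilterCountSketch {

    // Models a count that discards the caller's filter: userFilter is accepted
    // but deliberately never applied, so every row is counted.
    static long countIgnoringFilter(List<String> diagnoses, Predicate<String> userFilter) {
        Predicate<String> passEverything = value -> true; // stand-in for the installed filter
        return diagnoses.stream().filter(passEverything).count();
    }

    // Models what the poster expected: only rows matching the filter are counted.
    static long countHonoringFilter(List<String> diagnoses, Predicate<String> userFilter) {
        return diagnoses.stream().filter(userFilter).count();
    }

    public static void main(String[] args) {
        // Each element stands for one row's info:diagnosis value.
        List<String> diagnoses = List.of("cardiac", "renal", "cardiac", "renal");
        // Substring match, similar in spirit to the SubstringComparator in the thread.
        Predicate<String> cardiac = value -> value.contains("cardiac");

        System.out.println("ignoring filter: " + countIgnoringFilter(diagnoses, cardiac));
        System.out.println("honoring filter: " + countHonoringFilter(diagnoses, cardiac));
    }
}
```

With the sample above, the first call counts all four rows and the second counts only the two "cardiac" rows, which mirrors the 100,000 versus 50,000 results reported in the thread.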

Regards
Ram

On Mon, Dec 24, 2012 at 10:11 PM, Dalia Sobhy <[email protected]> wrote:

> yeah scan gives the correct number of rows, while count returns the total
> number of rows.
>
> Both are using the same filter, I even tried it using Java API, using row
> count method.
>
> rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan);
>
> I get the total number of rows not the number of rows filtered.
>
> So any idea ??
>
> Thanks Ram :)
>
> > Date: Mon, 24 Dec 2012 21:57:54 +0530
> > Subject: Re: Hbase Count Aggregate Function
> > From: [email protected]
> > To: [email protected]
> >
> > So you find that scan with a filter and count with the same filter is
> > giving you different results?
> >
> > Regards
> > Ram
> >
> > On Mon, Dec 24, 2012 at 8:33 PM, Dalia Sobhy <[email protected]> wrote:
> >
> > > Dear all,
> > >
> > > I have 50,000 row with diagnosis qualifier = "cardiac", and another 50,000
> > > rows with "renal".
> > >
> > > When I type this in Hbase shell,
> > >
> > > import org.apache.hadoop.hbase.filter.CompareFilter
> > > import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> > > import org.apache.hadoop.hbase.filter.SubstringComparator
> > > import org.apache.hadoop.hbase.util.Bytes
> > >
> > > scan 'patient', { COLUMNS => "info:diagnosis", FILTER =>
> > >     SingleColumnValueFilter.new(Bytes.toBytes('info'),
> > >     Bytes.toBytes('diagnosis'),
> > >     CompareFilter::CompareOp.valueOf('EQUAL'),
> > >     SubstringComparator.new('cardiac'))}
> > >
> > > Output = 50,000 row
> > >
> > > import org.apache.hadoop.hbase.filter.CompareFilter
> > > import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> > > import org.apache.hadoop.hbase.filter.SubstringComparator
> > > import org.apache.hadoop.hbase.util.Bytes
> > >
> > > count 'patient', { COLUMNS => "info:diagnosis", FILTER =>
> > >     SingleColumnValueFilter.new(Bytes.toBytes('info'),
> > >     Bytes.toBytes('diagnosis'),
> > >     CompareFilter::CompareOp.valueOf('EQUAL'),
> > >     SubstringComparator.new('cardiac'))}
> > >
> > > Output = 100,000 row
> > >
> > > Even though I tried it using Hbase Java API, Aggregation Client Instance,
> > > and I enabled the Coprocessor aggregation for the table.
> > > rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan)
> > >
> > > Also when measuring the improved performance on case of adding more nodes
> > > the operation takes the same time.
> > >
> > > So any advice please?
> > >
> > > I have been throughout all this mess from a couple of weeks
> > >
> > > Thanks,
