Oh... Oops..

Regards
Ram

On Wed, Jan 2, 2013 at 3:14 AM, Dalia Sobhy <[email protected]> wrote:

>
> Thanks Ram,
>
> Issue is resolved, I forgot to add
> scan.addFilter(fliterlist);
>
> That's why it was not filtering !!!
>
> > Date: Wed, 26 Dec 2012 21:11:32 +0530
> > Subject: Re: Hbase Count Aggregate Function
> > From: [email protected]
> > To: [email protected]
> >
> > Dalia,
> >
> > I tried out this example,
> >
> > {code}
> > private static final byte[] TEST_TABLE = Bytes.toBytes("TestTable");
> > private static final byte[] TEST_FAMILY = Bytes.toBytes("TestFamily");
> > private static final byte[] TEST_QUALIFIER = Bytes.toBytes("TestQualifier");
> > private static final byte[] TEST_MULTI_CQ = Bytes.toBytes("TestMultiCQ");
> >
> > private static byte[] ROW = Bytes.toBytes("testRow");
> > private static final int ROWSIZE = 20;
> > private static final int rowSeperator1 = 5;
> > private static final int rowSeperator2 = 12;
> > private static byte[][] ROWS = makeN(ROW, ROWSIZE);
> >
> > // Load ROWSIZE rows; TEST_QUALIFIER holds the row index as a long.
> > for (int i = 0; i < ROWSIZE; i++) {
> >   Put put = new Put(ROWS[i]);
> >   put.setWriteToWAL(false);
> >   Long l = new Long(i);
> >   put.add(TEST_FAMILY, TEST_QUALIFIER, Bytes.toBytes(l));
> >   table.put(put);
> >   Put p2 = new Put(ROWS[i]);
> >   p2.setWriteToWAL(false);
> >   p2.add(TEST_FAMILY, Bytes.add(TEST_MULTI_CQ, Bytes.toBytes(l)),
> >       Bytes.toBytes(l * 10));
> >   table.put(p2);
> > }
> >
> > // Count only the rows whose TEST_QUALIFIER value equals 4.
> > AggregationClient aClient = new AggregationClient(conf);
> > Scan scan = new Scan();
> > scan.addColumn(TEST_FAMILY, TEST_QUALIFIER);
> > final ColumnInterpreter<Long, Long> ci = new LongColumnInterpreter();
> > SingleColumnValueFilter scvf = new SingleColumnValueFilter(TEST_FAMILY,
> >     TEST_QUALIFIER, CompareOp.EQUAL, Bytes.toBytes(4L));
> > scan.setFilter(scvf);
> > long rowCount = aClient.rowCount(TEST_TABLE, ci, scan);
> > assertEquals(ROWSIZE, rowCount);
> > {code}
> >
> > So this assertion is failing, which means the filter is applied and it is
> > working as expected.
> > If you want to try it out, check out the testcase
> > in TestAggregateProtocol.testRowCountAllTable().
> > Just modify the testcase so that you pass a SingleColumnValueFilter. It is
> > working fine.
> >
> > Please check and let me know. Maybe I am making some mistake.
> >
> > Regards
> > Ram
> >
> > On Tue, Dec 25, 2012 at 11:25 PM, Dalia Sobhy <[email protected]> wrote:
> >
> > >
> > > Is there a problem in letting ID (rowkey) be an "int" value??
> > >
> > > > Date: Tue, 25 Dec 2012 22:44:00 +0530
> > > > Subject: Re: Hbase Count Aggregate Function
> > > > From: [email protected]
> > > > To: [email protected]
> > > >
> > > > @Dalia
> > > >
> > > > I think the aggregation client should work with what you have passed.
> > > > What I meant in the previous mail was with table.count() and now with
> > > > AggregationClient.
> > > > {code}
> > > > if (scan.getFilter() == null && qualifier == null)
> > > >   scan.setFilter(new FirstKeyOnlyFilter());
> > > > {code}
> > > >
> > > > So as you have passed the filter, it should work the way the SCVF
> > > > should work. I can check this out during free time (maybe tomorrow).
> > > > If not you can raise a bug. If it turns out to be fine then we can
> > > > close it out, otherwise it's better we fix it.
> > > > I can understand your urgency in this.
> > > >
> > > > Regards
> > > > Ram
> > > >
> > > > On Tue, Dec 25, 2012 at 10:27 PM, <[email protected]> wrote:
> > > >
> > > > > RowCount method accepts scan object where you can attach your custom
> > > > > filter.
> > > > >
> > > > > Cheers
> > > > >
> > > > > On Dec 25, 2012, at 8:42 AM, Dalia Sobhy <[email protected]>
> > > > > wrote:
> > > > >
> > > > > >
> > > > > > Do you mean I implement a new rowCount method in Aggregation Client
> > > > > > Class.
> > > > > >
> > > > > > I cannot understand, could u illustrate with a code sample Ram?
> > > > > >
> > > > > >>> Date: Tue, 25 Dec 2012 00:21:14 +0530
> > > > > >>> Subject: Re: Hbase Count Aggregate Function
> > > > > >>> From: [email protected]
> > > > > >>> To: [email protected]
> > > > > >>>
> > > > > >>> Hi
> > > > > >>> You could have a custom filter implemented which is similar to
> > > > > >>> FirstKeyOnlyFilter.
> > > > > >>> Implement the filterKeyValue method such that it should match your
> > > > > >>> keyvalue (the specific qualifier that you are looking for).
> > > > > >>>
> > > > > >>> Deploy it in your cluster. It should work.
> > > > > >>>
> > > > > >>> Regards
> > > > > >>> Ram
> > > > > >>>
> > > > > >>> On Mon, Dec 24, 2012 at 10:35 PM, Dalia Sobhy <[email protected]> wrote:
> > > > > >>>
> > > > > >>>>
> > > > > >>>> So do you have a suggestion how to enable/work the filter?
> > > > > >>>>
> > > > > >>>>> Date: Mon, 24 Dec 2012 22:22:49 +0530
> > > > > >>>>> Subject: Re: Hbase Count Aggregate Function
> > > > > >>>>> From: [email protected]
> > > > > >>>>> To: [email protected]
> > > > > >>>>>
> > > > > >>>>> Okie, seeing the shell script and the code I feel that while you
> > > > > >>>>> use this counter, the user's filter is not taken into account.
> > > > > >>>>> It adds a FirstKeyOnlyFilter and proceeds with the scan. :(.
> > > > > >>>>>
> > > > > >>>>> Regards
> > > > > >>>>> Ram
> > > > > >>>>>
> > > > > >>>>> On Mon, Dec 24, 2012 at 10:11 PM, Dalia Sobhy <[email protected]> wrote:
> > > > > >>>>>
> > > > > >>>>>>
> > > > > >>>>>> yeah scan gives the correct number of rows, while count returns
> > > > > >>>>>> the total number of rows.
> > > > > >>>>>>
> > > > > >>>>>> Both are using the same filter, I even tried it using Java API,
> > > > > >>>>>> using row count method.
> > > > > >>>>>>
> > > > > >>>>>> rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan);
> > > > > >>>>>>
> > > > > >>>>>> I get the total number of rows, not the number of rows filtered.
> > > > > >>>>>>
> > > > > >>>>>> So any idea ??
> > > > > >>>>>>
> > > > > >>>>>> Thanks Ram :)
> > > > > >>>>>>
> > > > > >>>>>>> Date: Mon, 24 Dec 2012 21:57:54 +0530
> > > > > >>>>>>> Subject: Re: Hbase Count Aggregate Function
> > > > > >>>>>>> From: [email protected]
> > > > > >>>>>>> To: [email protected]
> > > > > >>>>>>>
> > > > > >>>>>>> So you find that scan with a filter and count with the same
> > > > > >>>>>>> filter is giving you different results?
> > > > > >>>>>>>
> > > > > >>>>>>> Regards
> > > > > >>>>>>> Ram
> > > > > >>>>>>>
> > > > > >>>>>>> On Mon, Dec 24, 2012 at 8:33 PM, Dalia Sobhy <[email protected]> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> Dear all,
> > > > > >>>>>>>>
> > > > > >>>>>>>> I have 50,000 rows with diagnosis qualifier = "cardiac", and
> > > > > >>>>>>>> another 50,000 rows with "renal".
> > > > > >>>>>>>>
> > > > > >>>>>>>> When I type this in Hbase shell,
> > > > > >>>>>>>>
> > > > > >>>>>>>> import org.apache.hadoop.hbase.filter.CompareFilter
> > > > > >>>>>>>> import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> > > > > >>>>>>>> import org.apache.hadoop.hbase.filter.SubstringComparator
> > > > > >>>>>>>> import org.apache.hadoop.hbase.util.Bytes
> > > > > >>>>>>>>
> > > > > >>>>>>>> scan 'patient', { COLUMNS => "info:diagnosis", FILTER =>
> > > > > >>>>>>>> SingleColumnValueFilter.new(Bytes.toBytes('info'),
> > > > > >>>>>>>> Bytes.toBytes('diagnosis'),
> > > > > >>>>>>>> CompareFilter::CompareOp.valueOf('EQUAL'),
> > > > > >>>>>>>> SubstringComparator.new('cardiac'))}
> > > > > >>>>>>>>
> > > > > >>>>>>>> Output = 50,000 row
> > > > > >>>>>>>>
> > > > > >>>>>>>> import org.apache.hadoop.hbase.filter.CompareFilter
> > > > > >>>>>>>> import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> > > > > >>>>>>>> import org.apache.hadoop.hbase.filter.SubstringComparator
> > > > > >>>>>>>> import org.apache.hadoop.hbase.util.Bytes
> > > > > >>>>>>>>
> > > > > >>>>>>>> count 'patient', { COLUMNS => "info:diagnosis", FILTER =>
> > > > > >>>>>>>> SingleColumnValueFilter.new(Bytes.toBytes('info'),
> > > > > >>>>>>>> Bytes.toBytes('diagnosis'),
> > > > > >>>>>>>> CompareFilter::CompareOp.valueOf('EQUAL'),
> > > > > >>>>>>>> SubstringComparator.new('cardiac'))}
> > > > > >>>>>>>> Output = 100,000 row
> > > > > >>>>>>>>
> > > > > >>>>>>>> Even though I tried it using Hbase Java API, Aggregation Client
> > > > > >>>>>>>> Instance, and I enabled the Coprocessor aggregation for the table.
> > > > > >>>>>>>> rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan)
> > > > > >>>>>>>>
> > > > > >>>>>>>> Also when measuring the improved performance on case of adding
> > > > > >>>>>>>> more nodes the operation takes the same time.
> > > > > >>>>>>>>
> > > > > >>>>>>>> So any advice please?
> > > > > >>>>>>>>
> > > > > >>>>>>>> I have been through all this mess for a couple of weeks
> > > > > >>>>>>>>
> > > > > >>>>>>>> Thanks,
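
For anyone reading this thread later, the symptom (a count that equals the total number of rows) and the fix (attach the filter to the scan before counting) can be modelled without a cluster. This is only a plain-Java sketch under stated assumptions: `Row`, `ScanModel`, `rowCount` and `sampleTable` are hypothetical stand-ins invented for illustration and are not HBase classes or methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java model of the thread's problem; none of these types are HBase classes.
class FilteredCountSketch {

    // Hypothetical stand-in for a row with a single "info:diagnosis" value.
    static final class Row {
        final String key;
        final String diagnosis;

        Row(String key, String diagnosis) {
            this.key = key;
            this.diagnosis = diagnosis;
        }
    }

    // Hypothetical stand-in for a scan that may or may not carry a filter.
    static final class ScanModel {
        private Predicate<Row> filter; // null means no filter was attached

        void setFilter(Predicate<Row> filter) {
            this.filter = filter;
        }

        Predicate<Row> getFilter() {
            return filter;
        }
    }

    // Counts the rows admitted by the scan's filter; with no filter, every row counts.
    static long rowCount(List<Row> table, ScanModel scan) {
        Predicate<Row> f = scan.getFilter();
        return table.stream().filter(r -> f == null || f.test(r)).count();
    }

    // Builds a table with the same number of "cardiac" and "renal" rows.
    static List<Row> sampleTable(int perDiagnosis) {
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < perDiagnosis; i++) {
            rows.add(new Row("c" + i, "cardiac"));
            rows.add(new Row("r" + i, "renal"));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Row> table = sampleTable(5);

        // Filter was built but never attached: the count covers the whole table.
        ScanModel forgotten = new ScanModel();
        System.out.println("without filter: " + rowCount(table, forgotten));

        // Filter attached to the scan: only matching rows are counted.
        ScanModel attached = new ScanModel();
        attached.setFilter(r -> r.diagnosis.contains("cardiac"));
        System.out.println("with filter: " + rowCount(table, attached));
    }
}
```

With five rows per diagnosis, the first count covers all ten rows and the second only the five "cardiac" ones, which mirrors the 100,000 versus 50,000 difference reported at the start of the thread.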
