Thanks James, I tried both and that helped.
On Fri, Jun 5, 2015 at 9:44 AM James Taylor <[email protected]> wrote: > Hi Vijay, > You've got a couple of options: > 1) Force the query to do a skip scan (the Phoenix equivalent of the > FuzzyRowKeyFilter) by adding a hint like this: > > select /*+ SKIP_SCAN */ * from my_table where cid = ? and tid = ? > > By default, Phoenix won't do a skip scan when there's gaps in the pk > columns filtered on by your where clause, but the hint will force it to be > used. > > 2) Add a secondary index on cid, tid. I suspect it wasn't being used > because you weren't including all the columns from your data table in your > index. Try creating a secondary index like this: > > create index cid_tid_idx on my_table(cid, tid) include (col1, col2, > ...) > > You could alternatively force Phoenix to use your secondary index by using > the INDEX hint on your select statement. For more detail and other options > for secondary indexes, see > http://phoenix.apache.org/secondary_indexing.html > > Thanks, > James > > On Fri, Jun 5, 2015 at 7:18 AM, Hemal Parekh <[email protected]> wrote: > >> Vijay, >> >> You can try this. >> >> select * from my_table where <primary key> like '?%' and tid = ? >> (argument ? is customerid) >> >> This will do a range scan and a filter on server side. Other option is to >> change primary key design to (customerId int, transactionId varchar, >> timestamp >> BigInt). This will allow to query on primary key directly resulting in >> range scan. >> >> select * from my_table where <primary key> like '?%' (your argument >> should include both customerid and transactionid) >> >> Above will return all timestamps versioned records for a matching >> combination of customerid and transactionid. >> >> >> Thanks, >> Hemal >> >> >> >> On Fri, Jun 5, 2015 at 10:00 AM, Vijay Kukkala <[email protected]> >> wrote: >> >>> Cluster configuration: Phoenix 4.0.2 with Hbase 0.98 HDP 2.1 >>> >>> One of our table has a primary key (customerId int, timestamp BigInt, >>> transactionId varchar). >>> One of our use cases is to retrieve records by customerId and >>> transactionID. >>> >>> select * from my_table where cid = ? and tid = ? >>> >>> looking at the explain plan, it resulted in a range scan on customerID >>> with a filter on transactionID. >>> >>> I tried to create an index on (CustomerID, transactionID) but didn't >>> result any change in the explain plan. >>> >>> Trying to understand if there is a better way to handle this. In Hbase, >>> there is a fuzzyRowKeyFilter which allows to query on parts on the key. >>> >>> Would appreciate if someone can point me in the right direction. >>> thanks, >>> Vijay >>> >>> >> >> >> -- >> >> Hemal Parekh >> Senior Data Warehouse Architect >> m. 240.449.4396 >> [image: Bitscopic Inc] <http://bitscopic.com> >> >> >
