Hi James,

Thanks for the hints on the paged query. I have one additional question, about range scans based on a region's start and stop row keys. Is it possible to do such a range scan in Phoenix, given the start/stop row key bytes of the HBase region?
i.e. SELECT col1, col2, ... FROM TABLE_A WHERE RK BETWEEN startRowKey AND stopRowKey

where the row key in HBase is composed of several Phoenix columns (e.g. bigint, int, boolean, and varchar). Also, if the table is salted, will the paged query syntax still yield a correct range scan for a given region?

Thanks,
Li

On Mon, Apr 18, 2016 at 3:17 PM, Li Gao <[email protected]> wrote:

> Hi James.
>
> I see, [2] might work for my use case.
>
> Thanks,
> Li
>
>
> On Mon, Apr 18, 2016 at 2:54 PM, James Taylor <[email protected]>
> wrote:
>
>> Thanks for the clarification, Li. Are you essentially trying to make
>> Phoenix multi-client node? Our idea for that is Drillix [1]. Short term, if
>> you know the split points, you could use our row value constructor syntax
>> [2] to do the above.
>>
>> Thanks,
>> James
>>
>>
>> [1]
>> https://apurtell.s3.amazonaws.com/phoenix/Drillix+Combined+Operational+%26+Analytical+SQL+at+Scale.pdf
>> [2] https://phoenix.apache.org/paged.html
>>
>> On Mon, Apr 18, 2016 at 2:18 PM, Li Gao <[email protected]> wrote:
>>
>>> Hi James,
>>>
>>> Thanks for the quick reply. It is helpful but not sure it can solve the
>>> issue we have. Let me state use case in another way to make it more
>>> obvious.
>>>
>>> Say Table A has 10 regions spread across 10 HBase nodes, in addition I
>>> have 10 data processor machines (not the same as the hbase cluster) that
>>> can each independently issue a query to Phoenix to retrieve part of the
>>> table.
>>>
>>> Ideally I am looking for something like:
>>>
>>> SELECT col1,col2,... FROM TABLE_A WHERE (i.e. region=1,2,3,4...)
>>>
>>> So each processor machine can issue a region-specific query and retrieve
>>> a non-overlapping piece of the table projection. I am not sure how such
>>> Phoenix query can be constructed.
>>>
>>> Hope this clarifies the question.
>>>
>>> Thanks,
>>> Li
>>>
>>> On Mon, Apr 18, 2016 at 2:09 PM, James Taylor <[email protected]>
>>> wrote:
>>>
>>>> Phoenix already does this (and to a finer, configurable granularity).
>>>> See https://phoenix.apache.org/update_statistics.html
>>>>
>>>> Thanks,
>>>> James
>>>>
>>>> On Mon, Apr 18, 2016 at 2:08 PM, Li Gao <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> In Phoenix is it possible to query the data by region splits? i.e. if
>>>>> Table A has 10 regions on the cluster, how I can issue 10 concurrent
>>>>> queries to Table A so that each query covers exactly 1 region for the
>>>>> table? This is helpful for us to split the queries across multiple
>>>>> processor machines and help us build MPP query connector for Phoenix.
>>>>>
>>>>> Any hints would be appreciated.
>>>>>
>>>>> Thanks,
>>>>> Li
>>>>>
>>>>
>>>>
>>>
>>
>
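
P.S. To make the question at the top concrete, this is roughly the per-region query I have in mind, written with the row value constructor syntax from [2]. It is only a sketch: PK1..PK4 are placeholder names for the four primary key columns (bigint, int, boolean, varchar), and it assumes each region's start and stop row key bytes can be decoded into values for those columns, which I am not sure is always the case, particularly for a salted table.

```sql
-- Sketch only: PK1..PK4 are hypothetical names for the primary key columns.
-- The first four bind values would come from the region's start row key
-- (inclusive), the last four from its stop row key (exclusive).
SELECT col1, col2
FROM TABLE_A
WHERE (PK1, PK2, PK3, PK4) >= (?, ?, ?, ?)
  AND (PK1, PK2, PK3, PK4) <  (?, ?, ?, ?)
```

Each processor machine would bind the boundaries of a different region. For the first and last regions, whose start or stop key is empty, the corresponding bound would simply be dropped.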
