Phoenix does support intelligent ways of doing this when you query by columns, since it is a SQL engine.
The parallelism there comes from guideposts: evenly spaced row keys stored in a separate stats table. When you run a query, Phoenix internally spawns parallel scans using those guideposts, which makes the query faster.

Regards
Ram

On Fri, Oct 21, 2016 at 1:26 PM, Anil <[email protected]> wrote:

> Thank you Ram.
>
> "So now you are spawning those many scan threads equal to the number of
> regions" - YES
>
> There are two ways of scanning a region in parallel:
>
> 1. Scan a region with a start row and stop row in parallel with a single
> scan operation on the server side, and HBase takes care of the parallelism
> internally.
> 2. Transform the start row and stop row of a region into a number of start
> and stop rows (by some criteria) and spawn a scan query for each start and
> stop row.
>
> #1 is not supported (as you also said).
>
> I am looking for #2. I checked the Phoenix documentation and code. It seems
> to me that Phoenix is doing #2. I looked into the Phoenix code and could not
> understand it completely.
>
> The use case is very simple. HBase is not good (at least in terms of
> performance for OLTP) at querying by all columns (other than the row key)
> and sorting by all columns of a row. Even Phoenix is not.
>
> So I am planning to load the HBase/Phoenix table into an in-memory database
> for faster access.
>
> Scanning a big region sequentially leads to a longer load time, so I am
> looking for ways to minimize the load time.
>
> Hope this helps.
>
> Thanks.
>
>
> On 21 October 2016 at 09:30, ramkrishna vasudevan <[email protected]> wrote:
>
> > Hi Anil
> >
> > So now you are spawning those many scan threads equal to the number of
> > regions.
> > bq. Is there any way to scan a region in parallel?
> > You mean you want to scan in parallel within a region? Which means that
> > you want to split a single query into N small scans and read and
> > aggregate on the client side/server side?
> >
> > Currently you cannot do that. Once you set a start and stop row, the scan
> > will determine which region it belongs to and retrieve the data
> > sequentially in that region (it applies the filtering that you do during
> > the course of the scan).
> >
> > Have you tried Apache Phoenix? It's a SQL wrapper over HBase, and there
> > you could do parallel scans for a given SQL query if there are some
> > guideposts collected. Such things cannot be an integral part of HBase.
> > But I fear that, as I am not aware of your use case, we cannot suggest
> > anything on this.
> >
> > Regards
> > Ram
> >
> >
> > On Fri, Oct 21, 2016 at 8:40 AM, Anil <[email protected]> wrote:
> >
> > > Any pointers?
> > >
> > > On 20 October 2016 at 18:15, Anil <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I am loading an HBase table into an in-memory DB to support filtering,
> > > > ordering and pagination.
> > > >
> > > > I am scanning each region and inserting the data into the in-memory DB.
> > > > Each region scan is done in a single thread, so the regions are
> > > > scanned in parallel with each other.
> > > >
> > > > Is there any way to scan a region in parallel? Any pointers would be
> > > > helpful.
> > > >
> > > > Thanks
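
Approach #2 from the thread (turning one region's start and stop row into several smaller start/stop pairs and issuing one scan per pair) can be sketched on the client side as below. This is only a rough illustration, not how Phoenix does it: Phoenix uses collected guideposts, whereas this sketch simply interpolates evenly between the two keys as if they were unsigned numbers. The class name `RegionRangeSplitter` and its methods are hypothetical, and the sketch assumes both keys are non-empty and that the start row sorts before the stop row (an empty HBase stop row, meaning "end of table", is not handled). If keys are not uniformly distributed, the resulting sub-ranges will hold uneven amounts of data.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: splits [startRow, stopRow) into n contiguous sub-ranges. */
public class RegionRangeSplitter {

    /**
     * Returns n + 1 boundary keys. Consecutive pairs (b[i], b[i+1]) are the
     * start row (inclusive) and stop row (exclusive) of one sub-scan.
     * Boundaries may repeat when the key range is narrower than n.
     */
    public static List<byte[]> split(byte[] startRow, byte[] stopRow, int n) {
        // Right-pad both keys with zero bytes to a common length so they can
        // be compared and interpolated as unsigned big-endian integers.
        int len = Math.max(startRow.length, stopRow.length);
        BigInteger lo = new BigInteger(1, Arrays.copyOf(startRow, len));
        BigInteger hi = new BigInteger(1, Arrays.copyOf(stopRow, len));
        if (n < 1 || lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("need n >= 1 and startRow < stopRow");
        }
        BigInteger span = hi.subtract(lo);
        List<byte[]> bounds = new ArrayList<>();
        bounds.add(startRow);
        for (int i = 1; i < n; i++) {
            // Boundary i sits at lo + span * i / n.
            BigInteger mid = lo.add(span.multiply(BigInteger.valueOf(i))
                                        .divide(BigInteger.valueOf(n)));
            bounds.add(toFixedLength(mid, len));
        }
        bounds.add(stopRow);
        return bounds;
    }

    /** Converts a non-negative value back to exactly len big-endian bytes. */
    private static byte[] toFixedLength(BigInteger value, int len) {
        byte[] raw = value.toByteArray(); // may carry a sign byte or be shorter than len
        byte[] out = new byte[len];
        int copy = Math.min(raw.length, len);
        System.arraycopy(raw, raw.length - copy, out, len - copy, copy);
        return out;
    }
}
```

Each consecutive pair of boundaries could then be used as the start and stop row of its own HBase scan, with the scans submitted to a thread pool so that one large region is read by several threads instead of one.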
