Thank you Ram. Now it's clear. I will take a look at it. Thanks again.

On 21 October 2016 at 14:25, ramkrishna vasudevan <[email protected]> wrote:

> Phoenix does support intelligent ways of querying by columns, since it
> is a SQL engine.
>
> There the parallelism happens by using guideposts - those are evenly
> spaced row keys stored in a separate stats table. So when you run a
> query, Phoenix internally spawns parallel scan queries using those
> guideposts, which makes querying faster.
>
> Regards
> Ram
>
> On Fri, Oct 21, 2016 at 1:26 PM, Anil <[email protected]> wrote:
>
> > Thank you Ram.
> >
> > "So now you are spawning those many scan threads equal to the number of
> > regions" - YES
> >
> > There are two ways of scanning a region in parallel:
> >
> > 1. Scan a region with a start row and stop row in parallel with a single
> >    scan operation on the server side, and HBase takes care of the
> >    parallelism internally.
> > 2. Transform the start row and stop row of a region into a number of
> >    start and stop rows (by some criteria) and spawn a scan query for
> >    each start and stop row.
> >
> > #1 is not supported (as you also said).
> >
> > I am looking for #2. I checked the Phoenix documentation and code, and
> > it seems to me that Phoenix is doing #2. I looked into the Phoenix code
> > and could not understand it completely.
> >
> > The use case is very simple. HBase is not good (at least in terms of
> > performance for OLTP) at querying by all columns (other than the row
> > key) and sorting by all columns of a row. The same goes for Phoenix.
> >
> > So I am planning to load the HBase/Phoenix table into an in-memory
> > database for faster access.
> >
> > Scanning a big region sequentially will lead to a longer load time, so
> > I am looking for ways to minimize the load time.
> >
> > Hope this helps.
> >
> > Thanks.
> >
> > On 21 October 2016 at 09:30, ramkrishna vasudevan <[email protected]> wrote:
> >
> > > Hi Anil
> > >
> > > So now you are spawning those many scan threads equal to the number of
> > > regions.
> > >
> > > bq. Is there any way to scan a region in parallel ?
> > > You mean you want to scan in parallel within a region? Meaning that
> > > you want to split a single query into N small scans and read and
> > > aggregate on the client side/server side?
> > >
> > > Currently you cannot do that. Once you set a start and stop row, the
> > > scan determines which region it belongs to and retrieves the data
> > > sequentially in that region (applying whatever filtering you specify
> > > during the course of the scan).
> > >
> > > Have you tried Apache Phoenix? It's a SQL wrapper over HBase, and
> > > there you could do parallel scans for a given SQL query if some
> > > guideposts have been collected. Such things cannot be an integral part
> > > of HBase. But I fear that, as I am not aware of your use case, we
> > > cannot suggest much on this.
> > >
> > > Regards
> > > Ram
> > >
> > > On Fri, Oct 21, 2016 at 8:40 AM, Anil <[email protected]> wrote:
> > >
> > > > Any pointers ?
> > > >
> > > > On 20 October 2016 at 18:15, Anil <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I am loading an HBase table into an in-memory DB to support
> > > > > filtering, ordering and pagination.
> > > > >
> > > > > I am scanning each region and inserting the data into the
> > > > > in-memory DB. Each region scan is done in a single thread, so the
> > > > > regions are scanned in parallel.
> > > > >
> > > > > Is there any way to scan a region in parallel? Any pointers would
> > > > > be helpful.
> > > > >
> > > > > Thanks
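
Approach #2 from the thread (turning one region's start and stop row into several sub-ranges and running one scan per sub-range) could be sketched roughly as follows. This is a minimal Python sketch, not HBase or Phoenix code: `split_key_range` and `parallel_scan` are hypothetical helpers, `scan_range` stands in for whatever client call actually scans `[start, stop)`, and the split points are computed arithmetically rather than read from Phoenix guideposts. It assumes both keys are non-empty, so the open-ended first and last regions of a table are not handled.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple

KeyRange = Tuple[bytes, bytes]


def split_key_range(start: bytes, stop: bytes, parts: int) -> List[KeyRange]:
    """Split [start, stop) into at most `parts` contiguous sub-ranges.

    Keys are treated as unsigned byte strings in lexicographic order.
    Both keys are right-padded with zero bytes to a common width and
    interpolated as integers, so the cut points are synthetic keys,
    not rows that necessarily exist in the table.
    """
    if parts < 1:
        raise ValueError("parts must be >= 1")
    if not start or not stop or start >= stop:
        raise ValueError("need non-empty keys with start < stop")

    # One extra byte of width gives finer cut points for short keys.
    width = max(len(start), len(stop)) + 1
    lo = int.from_bytes(start.ljust(width, b"\x00"), "big")
    hi = int.from_bytes(stop.ljust(width, b"\x00"), "big")

    cuts: List[bytes] = []
    prev = lo
    for i in range(1, parts):
        cut = lo + (hi - lo) * i // parts
        if cut > prev:  # skip duplicate cut points in very narrow ranges
            cuts.append(cut.to_bytes(width, "big"))
            prev = cut

    keys = [start] + cuts + [stop]
    return list(zip(keys, keys[1:]))


def parallel_scan(
    ranges: List[KeyRange],
    scan_range: Callable[[bytes, bytes], Iterable[bytes]],
    max_workers: int = 4,
) -> List[bytes]:
    """Run scan_range(start, stop) for every sub-range on a thread pool.

    Results are concatenated in sub-range order, so the output stays
    sorted if each individual scan returns its rows in key order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(scan_range, s, e) for s, e in ranges]
        return [row for f in futures for row in f.result()]
```

Arithmetic cut points only balance the work when row keys are spread evenly across the range; cut points taken from collected statistics, as Ram describes for Phoenix guideposts, would follow the actual data instead.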
