Sorry Sasi, I missed your last mails. It seems that either your table has only one region or the query touches only one region, because of the monotonically increasing key ['MK00100','YOU',4]. The varying performance is probably because your filters are aggressive and skip a lot of rows between matches (*0* (7965 ms), *2041* (7155 ms), *4126* (1630 ms)), which is why the server takes time before it returns the next rows.
Can you try again after salting the table? https://phoenix.apache.org/salted.html

On Wed, Sep 28, 2016 at 10:47 AM, Sasikumar Natarajan <sasi...@gmail.com> wrote:

> Does anyone have suggestions for the performance issue discussed in this
> thread? Your suggestions would help me resolve this issue.
>
> Infrastructure details:
>
> Azure HDInsight HBase
>
> Type       Node Size  Cores  Nodes
> Head       D3 V2      8      2
> Region     D3 V2      16     4
> ZooKeeper  D3 V2      12     3
>
> Thanks,
> Sasikumar Natarajan.
>
> On Fri, Sep 23, 2016 at 7:57 AM, Sasikumar Natarajan <sasi...@gmail.com>
> wrote:
>
>> Also, it's not only the first call to ResultSet.next() that takes time.
>>
>> When we iterate over the ResultSet, it takes a long time initially and
>> then iterates faster. After a few iterations it again takes some time,
>> and this goes on.
>>
>> Sample observation:
>>
>> Total rows available on ResultSet: 5130
>>
>> Statement.executeQuery() has taken: 702 ms
>>
>> ResultSet indices at which a long time has been taken: *0* (7965 ms),
>> *2041* (7155 ms), *4126* (1630 ms)
>>
>> On Fri, Sep 23, 2016 at 7:52 AM, Sasikumar Natarajan <sasi...@gmail.com>
>> wrote:
>>
>>> Hi Ankit,
>>> Where does the server processing happen: on the HBase cluster or on
>>> the server where Phoenix core runs?
>>>
>>> PFB the details you have asked for.
>>>
>>> Query:
>>>
>>> SELECT col1, col2, col5, col7, col11, col12 FROM SPL_FINAL where
>>> col1='MK00100' and col2='YOU' and col3=4 and col5 in (?,?,?,?,?)
>>> and ((col7 between to_date('2016-08-01 00:00:00.000') and
>>> to_date('2016-08-05 23:59:59.000')) or (col8 between
>>> to_date('2016-08-01 00:00:00.000') and
>>> to_date('2016-08-05 23:59:59.000')))
>>>
>>> Explain plan:
>>>
>>> CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER SPL_FINAL
>>> ['MK00100','YOU',4]
>>>     SERVER FILTER BY (COL5 IN ('100','101','105','234','653') AND
>>>     ((COL7 >= TIMESTAMP '2016-08-01 00:00:00.000' AND
>>>     COL7 <= TIMESTAMP '2016-08-05 23:59:59.000') OR
>>>     (COL8 >= TIMESTAMP '2016-08-01 00:00:00.000' AND
>>>     COL8 <= TIMESTAMP '2016-08-05 23:59:59.000')))
>>>
>>> DDL:
>>>
>>> CREATE TABLE IF NOT EXISTS SPL_FINAL
>>> (col1 VARCHAR NOT NULL,
>>> col2 VARCHAR NOT NULL,
>>> col3 INTEGER NOT NULL,
>>> col4 INTEGER NOT NULL,
>>> col5 VARCHAR NOT NULL,
>>> col6 VARCHAR NOT NULL,
>>> col7 TIMESTAMP NOT NULL,
>>> col8 TIMESTAMP NOT NULL,
>>> ext.col9 VARCHAR,
>>> ext.col10 VARCHAR,
>>> pri.col11 VARCHAR[], //this column contains 3600 items in every row
>>> pri.col12 VARCHAR,
>>> ext.col13 BOOLEAN
>>> CONSTRAINT SPL_FINAL_PK PRIMARY KEY (col1, col2, col3, col4, col5, col6,
>>> col7, col8)) COMPRESSION='SNAPPY';
>>>
>>> Thanks,
>>> Sasikumar Natarajan.
>>>
>>> On Thu, Sep 22, 2016 at 12:36 PM, Ankit Singhal
>>> <ankitsingha...@gmail.com> wrote:
>>>
>>>> Share some more details about the query, DDL and explain plan. In
>>>> Phoenix, there are cases where we do some server processing at the
>>>> time when rs.next() is called the first time, but subsequent next()
>>>> calls should be faster.
>>>>
>>>> On Thu, Sep 22, 2016 at 9:52 AM, Sasikumar Natarajan
>>>> <sasi...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> I'm using the Apache Phoenix core 4.4.0-HBase-1.1 library to query
>>>>> the data available on the Phoenix server.
>>>>>
>>>>> preparedStatement.executeQuery() seems to take little time, but
>>>>> entering *while (rs.next()) {}* takes a long time. I would like to
>>>>> know what is causing the delay in making the ResultSet ready.
>>>>> Please share your thoughts on this.
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Sasikumar Natarajan
>>>>
>>>
>>> --
>>> Regards,
>>> Sasikumar Natarajan
>>
>> --
>> Regards,
>> Sasikumar Natarajan
>
> --
> Regards,
> Sasikumar Natarajan
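
P.S. To make the salting suggestion at the top of this mail a bit more concrete: salting prepends a small hash-derived byte to every row key, so rows that share the leading key ['MK00100','YOU',4] stop being contiguous and can be scanned from several regions in parallel. As far as I know it is requested with the SALT_BUCKETS table option at CREATE TABLE time (for example SALT_BUCKETS=8 next to COMPRESSION='SNAPPY'; the value 8 is only an assumption), so the table would have to be recreated and reloaded. The Java sketch below only illustrates the idea. The class SaltSketch, the helpers bucketFor and distribution, the "|" separator and the hash itself are all hypothetical stand-ins, not Phoenix's implementation.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: mimics the idea of a salt byte (hash of the row key
// modulo the bucket count). The hash is a stand-in, not Phoenix's own code.
final class SaltSketch {

    // Hypothetical helper: map a row key to a bucket in [0, buckets).
    static int bucketFor(byte[] rowKey, int buckets) {
        int hash = 1;
        for (byte b : rowKey) {
            hash = 31 * hash + b;
        }
        // floorMod keeps the result non-negative even if hash overflowed.
        return Math.floorMod(hash, buckets);
    }

    // Count how many of `rows` synthetic keys land in each bucket.
    static int[] distribution(int rows, int buckets) {
        int[] counts = new int[buckets];
        for (int col4 = 0; col4 < rows; col4++) {
            // Same leading key as in the thread; only the trailing part varies.
            String key = "MK00100|YOU|4|" + col4;
            counts[bucketFor(key.getBytes(StandardCharsets.UTF_8), buckets)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 5130 is the row count reported in the thread; 8 buckets is assumed.
        int[] counts = distribution(5130, 8);
        for (int i = 0; i < counts.length; i++) {
            System.out.println("bucket " + i + ": " + counts[i] + " rows");
        }
    }
}
```

Running main prints how many of the synthetic keys fall into each bucket. With a salted table the client would issue one scan per bucket instead of a single range scan, at the cost of range scans no longer reading rows in pure key order from one place.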