Re: Phoenix ResultSet.next() takes a long time for first row
Sorry Sasi, I missed your last mails.

It seems that either the table has only one region or the query touches only one region, because of the monotonically increasing key ['MK00100','YOU',4]. The varying performance is probably because your filters are aggressive and skip lots of rows between matches, which is why the server takes time at some indices: 0 (7965 ms), 2041 (7155 ms), 4126 (1630 ms).

Can you try again after salting the table?
https://phoenix.apache.org/salted.html

On Wed, Sep 28, 2016 at 10:47 AM, Sasikumar Natarajan wrote:
> Does anyone have suggestions for the performance issue discussed in this
> thread? Your suggestions would help me resolve this issue.
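For readers following the salting suggestion, here is a minimal sketch of what a salted variant of the table could look like, issued over JDBC. The table name SPL_FINAL_SALTED, the class and method names, and the bucket count are assumptions made for illustration; only the column list comes from the DDL posted in this thread. As far as I know, SALT_BUCKETS can only be set when a table is created, which is why the sketch creates a new table rather than altering SPL_FINAL.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class SaltedTableSketch {

    /**
     * Builds the DDL for a salted variant of the table from this thread.
     * The name SPL_FINAL_SALTED is hypothetical; the columns are copied
     * from the DDL posted in the thread.
     */
    static String saltedDdl(int saltBuckets) {
        // Phoenix documents SALT_BUCKETS as a value between 1 and 256.
        if (saltBuckets < 1 || saltBuckets > 256) {
            throw new IllegalArgumentException("saltBuckets must be in 1..256");
        }
        return "CREATE TABLE IF NOT EXISTS SPL_FINAL_SALTED ("
            + " col1 VARCHAR NOT NULL,"
            + " col2 VARCHAR NOT NULL,"
            + " col3 INTEGER NOT NULL,"
            + " col4 INTEGER NOT NULL,"
            + " col5 VARCHAR NOT NULL,"
            + " col6 VARCHAR NOT NULL,"
            + " col7 TIMESTAMP NOT NULL,"
            + " col8 TIMESTAMP NOT NULL,"
            + " ext.col9 VARCHAR,"
            + " ext.col10 VARCHAR,"
            + " pri.col11 VARCHAR[],"
            + " pri.col12 VARCHAR,"
            + " ext.col13 BOOLEAN"
            + " CONSTRAINT SPL_FINAL_SALTED_PK PRIMARY KEY"
            + " (col1, col2, col3, col4, col5, col6, col7, col8))"
            + " COMPRESSION='SNAPPY', SALT_BUCKETS=" + saltBuckets;
    }

    public static void main(String[] args) throws SQLException {
        // args[0]: the Phoenix JDBC URL for your cluster (not given in the thread).
        try (Connection conn = DriverManager.getConnection(args[0]);
             Statement stmt = conn.createStatement()) {
            // 4 matches the number of region nodes mentioned in the thread;
            // it is only a starting guess, not a tuned value.
            stmt.execute(saltedDdl(4));
        }
    }
}
```

Existing rows would still have to be copied into the new table, for example with an UPSERT ... SELECT from SPL_FINAL, and the query pointed at the new name before comparing timings.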
Re: Phoenix ResultSet.next() takes a long time for first row
Does anyone have suggestions for the performance issue discussed in this thread? Your suggestions would help me resolve this issue.

Infrastructure details:

Azure HDInsight HBase

Type        Node Size   Cores   Nodes
Head        D3 V2       8       2
Region      D3 V2       16      4
ZooKeeper   D3 V2       12      3

Thanks,
Sasikumar Natarajan.

On Fri, Sep 23, 2016 at 7:57 AM, Sasikumar Natarajan wrote:
> Also, it is not only the first call to ResultSet.next() that takes time.
Re: Phoenix ResultSet.next() takes a long time for first row
Also, it is not only the first call to ResultSet.next() that takes time.

When we iterate over the ResultSet, it takes a long time initially and then iterates faster. After a few more iterations it again takes some time, and this pattern keeps repeating.

Sample observation:

Total rows available in the ResultSet: 5130
Statement.executeQuery() took: 702 ms
ResultSet indices at which next() took a long time: 0 (7965 ms), 2041 (7155 ms), 4126 (1630 ms)

On Fri, Sep 23, 2016 at 7:52 AM, Sasikumar Natarajan wrote:
> Hi Ankit,
> Where does the server processing happen: on the HBase cluster, or on
> the server where Phoenix core runs?

--
Regards,
Sasikumar Natarajan
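Per-call timings like the ones in the sample observation above can be collected with a small probe around ResultSet.next(). This is a minimal sketch using only the standard java.sql API; the class name SlowNextProbe, the method slowCalls and the thresholdMs parameter are hypothetical names chosen for illustration, not something taken from the poster's code.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

public class SlowNextProbe {

    /**
     * Iterates the whole ResultSet and returns, in call order, a map of
     * call index to elapsed milliseconds for every next() call that took
     * at least thresholdMs. Index 0 is the first call. The final call,
     * which returns false, is timed as well; its index equals the row count.
     */
    static Map<Integer, Long> slowCalls(ResultSet rs, long thresholdMs)
            throws SQLException {
        Map<Integer, Long> slow = new LinkedHashMap<>();
        int index = 0;
        while (true) {
            long start = System.nanoTime();
            boolean hasRow = rs.next();
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMs >= thresholdMs) {
                slow.put(index, elapsedMs);
            }
            if (!hasRow) {
                break;
            }
            index++;
        }
        return slow;
    }
}
```

Calling slowCalls(rs, 1000) right after executeQuery() would list every call that took a second or more, which makes it easy to see whether the slow indices move when the table layout or the query changes.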
Re: Phoenix ResultSet.next() takes a long time for first row
Hi Ankit,

Where does the server processing happen: on the HBase cluster, or on the server where Phoenix core runs?

Please find below the details you asked for.

Query:

    SELECT col1, col2, col5, col7, col11, col12
    FROM SPL_FINAL
    where col1='MK00100' and col2='YOU' and col3=4
      and col5 in (?,?,?,?,?)
      and ((col7 between to_date('2016-08-01 00:00:00.000') and to_date('2016-08-05 23:59:59.000'))
        or (col8 between to_date('2016-08-01 00:00:00.000') and to_date('2016-08-05 23:59:59.000')))

Explain plan:

    CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER SPL_FINAL ['MK00100','YOU',4]
        SERVER FILTER BY (COL5 IN ('100','101','105','234','653') AND ((COL7 >= TIMESTAMP '2016-08-01 00:00:00.000' AND COL7 <= TIMESTAMP '2016-08-05 23:59:59.000') OR (COL8 >= TIMESTAMP '2016-08-01 00:00:00.000' AND COL8 <= TIMESTAMP '2016-08-05 23:59:59.000')))

DDL:

    CREATE TABLE IF NOT EXISTS SPL_FINAL (
        col1 VARCHAR NOT NULL,
        col2 VARCHAR NOT NULL,
        col3 INTEGER NOT NULL,
        col4 INTEGER NOT NULL,
        col5 VARCHAR NOT NULL,
        col6 VARCHAR NOT NULL,
        col7 TIMESTAMP NOT NULL,
        col8 TIMESTAMP NOT NULL,
        ext.col9 VARCHAR,
        ext.col10 VARCHAR,
        pri.col11 VARCHAR[], -- this column contains 3600 items in every row
        pri.col12 VARCHAR,
        ext.col13 BOOLEAN
        CONSTRAINT SPL_FINAL_PK PRIMARY KEY (col1, col2, col3, col4, col5, col6, col7, col8)
    ) COMPRESSION='SNAPPY';

Thanks,
Sasikumar Natarajan.

On Thu, Sep 22, 2016 at 12:36 PM, Ankit Singhal wrote:
> Share some more details about the query, DDL and explain plan.
Re: Phoenix ResultSet.next() takes a long time for first row
Please share some more details about the query, the DDL and the explain plan. In Phoenix, there are cases where some server-side processing happens when rs.next() is called for the first time, but subsequent next() calls should be faster.

On Thu, Sep 22, 2016 at 9:52 AM, Sasikumar Natarajan wrote:
> preparedStatement.executeQuery() seems to be taking less time. But to
> enter into while (rs.next()) {} takes a long time.
Phoenix ResultSet.next() takes a long time for first row
Hi,

I'm using the Apache Phoenix core 4.4.0-HBase-1.1 library to query data available on a Phoenix server.

preparedStatement.executeQuery() seems to take little time, but entering the while (rs.next()) {} loop takes a long time. I would like to know what causes the delay in making the ResultSet ready. Please share your thoughts on this.

--
Regards,
Sasikumar Natarajan