Re: Phoenix ResultSet.next() takes a long time for first row

2016-09-28 Thread Ankit Singhal
Sorry Sasi, I missed your last mails.

It seems that either the table has only one region or the query touches only
one region, because of the monotonically increasing key
['MK00100','YOU',4]. The varying performance is probably because your
filters are aggressive and skip lots of rows between matches (row 0: 7965 ms,
row 2041: 7155 ms, row 4126: 1630 ms), and that is why the server is taking
time.

Can you try again after salting the table?
https://phoenix.apache.org/salted.html
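
For illustration, a minimal sketch of what a salted version of the table
might look like, written as a small Java helper since the client here is
JDBC-based. The table name SPL_FINAL_SALTED, the bucket count of 8, and the
class and method names are assumptions made up for this example and are not
from the thread. SALT_BUCKETS is the Phoenix table option described on the
page above; as far as I know it has to be chosen when the table is created,
so existing data would need to be reloaded into the new table.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper: builds and runs a salted variant of the thread's DDL.
final class SaltedTableSketch {

    // Returns CREATE TABLE text with SALT_BUCKETS appended to the table options.
    static String saltedDdl(int buckets) {
        // Phoenix documents SALT_BUCKETS as a value between 1 and 256.
        if (buckets < 1 || buckets > 256) {
            throw new IllegalArgumentException("SALT_BUCKETS must be in 1..256");
        }
        return String.join("\n",
            "CREATE TABLE IF NOT EXISTS SPL_FINAL_SALTED (", // assumed table name
            "  col1 VARCHAR NOT NULL,",
            "  col2 VARCHAR NOT NULL,",
            "  col3 INTEGER NOT NULL,",
            "  col4 INTEGER NOT NULL,",
            "  col5 VARCHAR NOT NULL,",
            "  col6 VARCHAR NOT NULL,",
            "  col7 TIMESTAMP NOT NULL,",
            "  col8 TIMESTAMP NOT NULL,",
            "  ext.col9 VARCHAR,",
            "  ext.col10 VARCHAR,",
            "  pri.col11 VARCHAR[],",
            "  pri.col12 VARCHAR,",
            "  ext.col13 BOOLEAN",
            "  CONSTRAINT SPL_FINAL_SALTED_PK PRIMARY KEY",
            "    (col1, col2, col3, col4, col5, col6, col7, col8))",
            "COMPRESSION='SNAPPY', SALT_BUCKETS=" + buckets);
    }

    // Executes the DDL on an already opened Phoenix JDBC connection.
    static void createSaltedTable(Connection conn, int buckets) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(saltedDdl(buckets));
        }
    }

    public static void main(String[] args) {
        // Prints the statement only; no cluster is contacted here.
        System.out.println(saltedDdl(8));
    }
}
```

With a salted table the same row-key prefix is spread over several buckets,
so the scan can run in parallel across region servers instead of on one
region. The right bucket count depends on the cluster and would need testing.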






Re: Phoenix ResultSet.next() takes a long time for first row

2016-09-27 Thread Sasikumar Natarajan
Does anyone have suggestions for the performance issue discussed in this
thread? Your suggestions would help me resolve it.

Infrastructure details:

Azure HDInsight HBase

Type       Node Size  Cores  Nodes
Head       D3 V2      8      2
Region     D3 V2      16     4
ZooKeeper  D3 V2      12     3

Thanks,
Sasikumar Natarajan.





-- 
Regards,
Sasikumar Natarajan


Re: Phoenix ResultSet.next() takes a long time for first row

2016-09-22 Thread Sasikumar Natarajan
Also, it's not only the first call to ResultSet.next() that takes a long
time.

When we iterate over the ResultSet, it takes a long time initially and then
iterates faster. After some more iterations, a call takes a long time again,
and this pattern repeats.

Sample observation:

Total rows available in the ResultSet: 5130

Statement.executeQuery() took: 702 ms

ResultSet indices at which next() took a long time: 0 (7965 ms),
2041 (7155 ms), 4126 (1630 ms)
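
For anyone who wants to reproduce this kind of measurement, here is a rough
sketch of how the slow next() calls could be located. It is not the code
used for the numbers above; the class name, the RowCursor interface, and the
threshold are assumptions for illustration. RowCursor exists only so the
probe can be tried without a cluster; with JDBC you would pass rs::next.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical probe: times every next() call and records the slow ones.
final class NextLatencyProbe {

    // Minimal stand-in for ResultSet.next(); rs::next fits this shape.
    @FunctionalInterface
    interface RowCursor {
        boolean next() throws Exception;
    }

    // One slow call: the zero-based index of the call and how long it took.
    static final class SlowCall {
        final int rowIndex;
        final long millis;

        SlowCall(int rowIndex, long millis) {
            this.rowIndex = rowIndex;
            this.millis = millis;
        }

        @Override
        public String toString() {
            return rowIndex + " (" + millis + " ms)";
        }
    }

    // Drains the cursor and returns every call that took at least thresholdMillis.
    // The final call that returns false is timed as well.
    static List<SlowCall> findSlowCalls(RowCursor cursor, long thresholdMillis)
            throws Exception {
        List<SlowCall> slow = new ArrayList<>();
        int index = 0;
        while (true) {
            long start = System.nanoTime();
            boolean hasRow = cursor.next();
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMillis >= thresholdMillis) {
                slow.add(new SlowCall(index, elapsedMillis));
            }
            if (!hasRow) {
                break;
            }
            index++;
        }
        return slow;
    }
}
```

Usage with a JDBC ResultSet would be along the lines of
NextLatencyProbe.findSlowCalls(rs::next, 1000), which lists the indices
where a single call took a second or more.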




-- 
Regards,
Sasikumar Natarajan


Re: Phoenix ResultSet.next() takes a long time for first row

2016-09-22 Thread Sasikumar Natarajan
Hi Ankit,
   Where does the server processing happen: on the HBase cluster, or on the
server where Phoenix core runs?

Please find below the details you asked for.

Query:

SELECT col1, col2, col5, col7, col11, col12 FROM SPL_FINAL where
col1='MK00100' and col2='YOU' and col3=4 and col5 in (?,?,?,?,?) and ((col7
between to_date('2016-08-01 00:00:00.000') and to_date('2016-08-05
23:59:59.000')) or (col8 between to_date('2016-08-01 00:00:00.000') and
to_date('2016-08-05 23:59:59.000')))


Explain plan:

CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER SPL_FINAL ['MK00100','YOU',4]
SERVER FILTER BY (COL5 IN ('100','101','105','234','653') AND ((COL7 >=
TIMESTAMP '2016-08-01 00:00:00.000' AND COL7 <= TIMESTAMP '2016-08-05
23:59:59.000') OR (COL8 >= TIMESTAMP '2016-08-01 00:00:00.000' AND COL8 <=
TIMESTAMP '2016-08-05 23:59:59.000')))

DDL:

CREATE TABLE IF NOT EXISTS SPL_FINAL
(col1 VARCHAR NOT NULL,
col2 VARCHAR NOT NULL,
col3 INTEGER NOT NULL,
col4 INTEGER NOT NULL,
col5 VARCHAR NOT NULL,
col6 VARCHAR NOT NULL,
col7 TIMESTAMP NOT NULL,
col8 TIMESTAMP NOT NULL,
ext.col9 VARCHAR,
ext.col10 VARCHAR,
pri.col11 VARCHAR[], //this column contains 3600 items in every row
pri.col12 VARCHAR,
ext.col13 BOOLEAN
CONSTRAINT SPL_FINAL_PK PRIMARY KEY (col1, col2, col3, col4, col5, col6,
col7, col8)) COMPRESSION='SNAPPY';

Thanks,
Sasikumar Natarajan.



-- 
Regards,
Sasikumar Natarajan


Re: Phoenix ResultSet.next() takes a long time for first row

2016-09-22 Thread Ankit Singhal
Please share some more details about the query, the DDL, and the explain plan.
In Phoenix, there are cases where we do some server processing when
rs.next() is called for the first time, but subsequent next() calls should be
faster.



Phoenix ResultSet.next() takes a long time for first row

2016-09-21 Thread Sasikumar Natarajan
Hi,
I'm using the Apache Phoenix core 4.4.0-HBase-1.1 library to query the data
available on a Phoenix server.

preparedStatement.executeQuery() seems to take little time, but entering
the while (rs.next()) {} loop takes a long time. I would like to know what
is causing the delay in making the ResultSet ready. Please share your
thoughts on this.


-- 
Regards,
Sasikumar Natarajan