Re: Phoenix ResultSet.next() takes a long time for first row

Ankit Singhal Wed, 28 Sep 2016 03:21:50 -0700

Sorry Sasi, missed your last mails.

It seems that the table has only one region, or that the query touches
only one region, because of the monotonically increasing key
['MK00100','YOU',4]. The varying performance is probably because your
filters are aggressive and skip lots of rows in between (*0* (7965 ms),
*2041* (7155 ms), *4126* (1630 ms)), which is why the server is taking
time.


Can you try again after salting the table?
https://phoenix.apache.org/salted.html
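
To illustrate why salting helps with a key like this, here is a minimal
sketch in Java. It is not Phoenix's actual implementation; it only shows
how hashing the row key into a fixed number of buckets spreads rows that
share the prefix ['MK00100','YOU',4] across buckets instead of leaving
them next to each other. The class name SaltSketch, the simple
31-multiplier hash, the bucket count of 8, and the '|'-joined key format
are all assumptions made for illustration. On a real table the bucket
count is set with the SALT_BUCKETS property when the table is created,
as described on the page above.

```java
import java.nio.charset.StandardCharsets;

class SaltSketch {
    // Illustrative hash only (an assumption): maps a row key to a bucket
    // in the range [0, buckets). Phoenix computes its salt byte internally.
    public static int saltBucket(byte[] rowKey, int buckets) {
        int hash = 1;
        for (byte b : rowKey) {
            hash = 31 * hash + b;
        }
        // hash % buckets lies strictly between -buckets and buckets,
        // so Math.abs cannot overflow here.
        return Math.abs(hash % buckets);
    }

    // Counts how many of the generated keys land in each bucket.
    public static int[] distribute(int rows, int buckets) {
        int[] counts = new int[buckets];
        for (int col4 = 0; col4 < rows; col4++) {
            // Simplified stand-in for the composite row key: a fixed
            // prefix followed by an increasing col4 value.
            String key = "MK00100|YOU|4|" + col4;
            counts[saltBucket(key.getBytes(StandardCharsets.UTF_8), buckets)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int buckets = 8; // hypothetical SALT_BUCKETS value
        int[] counts = distribute(1000, buckets);
        for (int i = 0; i < buckets; i++) {
            System.out.println("bucket " + i + ": " + counts[i] + " rows");
        }
    }
}
```

With salting, a scan over the same key prefix can be split into one scan
per bucket and run in parallel, rather than as a single scan over one
region.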




On Wed, Sep 28, 2016 at 10:47 AM, Sasikumar Natarajan <sasi...@gmail.com>
wrote:

> Does anyone have suggestions for the performance issue discussed in this
> thread? Your suggestions would help me resolve this issue.
>
> Infrastructure details:
>
> Azure HDInsight HBase
>
> Type       Node Size   Cores   Nodes
> Head       D3 V2       8       2
> Region     D3 V2       16      4
> ZooKeeper  D3 V2       12      3
>
> Thanks,
> Sasikumar Natarajan.
>
>
> On Fri, Sep 23, 2016 at 7:57 AM, Sasikumar Natarajan <sasi...@gmail.com>
> wrote:
>
>> Also, it is not only the first call to ResultSet.next() that takes
>> time.
>>
>> When we iterate over the ResultSet, it takes a long time initially and
>> then iterates faster. Then, after a few iterations, it takes some time
>> again, and this goes on.
>>
>>
>>
>> Sample observation:
>>
>>
>>
>> Total Rows available on ResultSet : 5130
>>
>> Statement.executeQuery() has taken : 702 ms
>>
>> ResultSet indices at which a long time was taken : *0* (7965 ms),
>> *2041* (7155 ms), *4126* (1630 ms)
>>
>> On Fri, Sep 23, 2016 at 7:52 AM, Sasikumar Natarajan <sasi...@gmail.com>
>> wrote:
>>
>>> Hi Ankit,
>>>            Where does the server processing happen: on the HBase
>>> cluster or on the server where Phoenix core runs?
>>>
>>> Please find below the details you asked for:
>>>
>>> Query:
>>>
>>> SELECT col1, col2, col5, col7, col11, col12 FROM SPL_FINAL where
>>> col1='MK00100' and col2='YOU' and col3=4 and col5 in (?,?,?,?,?) and ((col7
>>> between to_date('2016-08-01 00:00:00.000') and to_date('2016-08-05
>>> 23:59:59.000')) or (col8 between to_date('2016-08-01 00:00:00.000') and
>>> to_date('2016-08-05 23:59:59.000')))
>>>
>>>
>>> Explain plan:
>>>
>>> CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER SPL_FINAL
>>> ['MK00100','YOU',4]
>>>     SERVER FILTER BY (COL5 IN ('100','101','105','234','653') AND
>>> ((COL7 >= TIMESTAMP '2016-08-01 00:00:00.000' AND COL7 <= TIMESTAMP
>>> '2016-08-05 23:59:59.000') OR (COL8 >= TIMESTAMP '2016-08-01 00:00:00.000'
>>> AND COL8 <= TIMESTAMP '2016-08-05 23:59:59.000')))
>>>
>>> DDL:
>>>
>>> CREATE TABLE IF NOT EXISTS SPL_FINAL
>>> (col1 VARCHAR NOT NULL,
>>> col2 VARCHAR NOT NULL,
>>> col3 INTEGER NOT NULL,
>>> col4 INTEGER NOT NULL,
>>> col5 VARCHAR NOT NULL,
>>> col6 VARCHAR NOT NULL,
>>> col7 TIMESTAMP NOT NULL,
>>> col8 TIMESTAMP NOT NULL,
>>> ext.col9 VARCHAR,
>>> ext.col10 VARCHAR,
>>> pri.col11 VARCHAR[], //this column contains 3600 items in every row
>>> pri.col12 VARCHAR,
>>> ext.col13 BOOLEAN
>>> CONSTRAINT SPL_FINAL_PK PRIMARY KEY (col1, col2, col3, col4, col5, col6,
>>> col7, col8)) COMPRESSION='SNAPPY';
>>>
>>> Thanks,
>>> Sasikumar Natarajan.
>>>
>>> On Thu, Sep 22, 2016 at 12:36 PM, Ankit Singhal <
>>> ankitsingha...@gmail.com> wrote:
>>>
>>>> Share some more details about the query, DDL and explain plan. In
>>>> Phoenix, there are cases where we do some server processing the first
>>>> time rs.next() is called, but subsequent next() calls should be faster.
>>>>
>>>> On Thu, Sep 22, 2016 at 9:52 AM, Sasikumar Natarajan <sasi...@gmail.com
>>>> > wrote:
>>>>
>>>>> Hi,
>>>>>     I'm using Apache Phoenix core 4.4.0-HBase-1.1 library to query the
>>>>> data available on Phoenix server.
>>>>>
>>>>> preparedStatement.executeQuery() seems to take little time, but
>>>>> entering *while (rs.next()) {}* takes a long time. I would like to
>>>>> know what is causing the delay in making the ResultSet ready. Please
>>>>> share your thoughts on this.
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Sasikumar Natarajan
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Sasikumar Natarajan
>>>
>>
>>
>>
>> --
>> Regards,
>> Sasikumar Natarajan
>>
>
>
>
> --
> Regards,
> Sasikumar Natarajan
>
