AW: AW: AW: Paging performance problem

Hohl, Gerrit Thu, 07 Jan 2021 00:01:17 -0800

Hi Rick,

good to know. 😉


Regards
Gerrit

-----Ursprüngliche Nachricht-----
Von: Rick Hillegas <[email protected]> 
Gesendet: Dienstag, 5. Januar 2021 20:00
An: Derby Discussion <[email protected]>; Hohl, Gerrit <[email protected]>
Betreff: Re: AW: AW: Paging performance problem

Hi Gerrit,

I'm glad that you found a satisfactory answer to your performance question. For 
the record, I can find my away around query plans printed in German.

Cheers,
-Rick

On 1/5/21 12:31 AM, Hohl, Gerrit wrote:
> Hi Rick,
>
>
> unfortunately I had to realize that the output of the query plans are printed 
> by Derby in German.
> And I'm not quiet sure how to change that.
>
> But we maybe found the reason: The slower version uses the "HEAP" (based on 
> the runtime statistics) to get the result while the faster version uses only 
> the index ("projection").
>
> The reason behind it seems that the slower version loads records from the 
> table (I guess that is what HEAP means - loading records into memory).
> It gets the create_timestamp from the index, but any other column - even if 
> it is only the "id" - from the table.
> It even doesn't matter much - at least in the described case - if we only 
> query for the "id" or also for the other columns.
>
> The faster version uses a "projection" and works only on the index, no access 
> on the table needed.
> If we put the faster version into JOIN like I described in my mail earlier we 
> then load only the needed records from the table.
>
> I hope I could explain the magic behind it a little bit, so helps others who 
> may come across similar issues.
>
>
> Best regards,
> Gerrit
>
> -----Ursprüngliche Nachricht-----
> Von: Rick Hillegas <[email protected]>
> Gesendet: Montag, 4. Januar 2021 16:43
> An: Derby Discussion <[email protected]>; Hohl, Gerrit 
> <[email protected]>
> Betreff: Re: AW: Paging performance problem
>
> Hi Gerrit,
>
> It's hard to say without seeing the query plans for these scenarios.
> What query plans do you see when you follow the instructions in the "Working 
> with RunTimeStatistics" section of the Derby Tuning Guide:
> http://db.apache.org/derby/docs/10.15/tuning/ctundepth13055.html
>
> -Rick
>
> On 1/4/21 7:08 AM, Hohl, Gerrit wrote:
>> Hello everyone,
>>
>> I guess I found at least one solution:
>>
>> CREATE INDEX data_create_timestamp_id_key ON data (create_timestamp 
>> ASC, id ASC)
>>
>> SELECT d.* FROM (SELECT id FROM data ORDER BY create_timestamp OFFSET
>> 4499990 ROWS FETCH NEXT 10 ROWS ONLY) as tmp, data d WHERE (tmp.id =
>> d.id)
>>
>> Takes less than 4 seconds with my test table.
>> But I'm wondering if that is really the best solution, especially as I'm 
>> still not sure about the reason.
>>
>> Isn't that new index I created also an unbalanced binary tree? Shouldn't it 
>> take the same amount of time?
>> But if I drop it and have only the primary key and data_create_timestamp_key 
>> index, the query takes even longer than my originally one (100s ).
>>
>> In addition: I passed a simply SQL query. Wondering if I also can recreate 
>> that with JPA/JPQL...
>>
>> Regards,
>> Gerrit
>>
>> Von: Hohl, Gerrit
>> Gesendet: Montag, 4. Januar 2021 15:43
>> An: Derby Discussion <[email protected]>
>> Betreff: Paging performance problem
>>
>> Hello everyone,
>>
>> we're trying to use OFFSET and LIMIT for paging. But as our tables our 
>> growing, we're running in some performance problems we didn't expect.
>>
>> Let's say we have the following structure (the table can also have more 
>> columns, but for the sake of simplicity...):
>>
>> CREATE TABLE data (id INTEGER NOT NULL, create_timestamp TIMESTAMP 
>> NOT NULL, PRIMARY(id)); CREATE INDEX data_create_timestamp_key ON 
>> data (create_timestamp ASC);
>>
>> We have around 5m records in it. And now we do the two following queries:
>>
>> SELECT * FROM data ORDER BY create_timestamp OFFSET 0 ROWS FETCH NEXT
>> 10 ROWS ONLY; SELECT * FROM data ORDER BY create_timestamp OFFSET
>> 4499990 ROWS FETCH NEXT 10 ROWS ONLY;
>>
>> While the first query returns right after it was started, the 2nd query 
>> takes almost a minute to complete.
>> First we thought that Derby might not have used the index for some reasons.
>> But the runtime statistics showed that it is using it.
>>
>> We assume that it is due to the fact that the index might be some kind of 
>> unbalanced binary tree.
>> Is that true? And is there any work-around?
>>
>> Gruß
>> Gerrit
>>
>> P.S.: We still use Apache Derby 10.14.2.0 as we're still tight to Java 8.
>>

AW: AW: AW: Paging performance problem

Reply via email to