Dear All, compliments of the day.

Very sorry for a long text ahead.

Using the H2 database for some reporting, I stumbled over a few performance 
concerns.
For a small bank, we use the H2 database as backend for an IFRS compliant 
accounting software and build with it the financial reports, like trial 
balance and disclosures. (With the same software, our prime customers use 
Oracle on quite large servers with more than 24 core and 128 GB Ram).

The reporting related workflow executes 6 simple steps:

*Report Definition:*

1) Define the Report Source Query and persist that into a table. If that 
table does not exist, it is built by a CREATE TABLE ... AS SELECT ... FROM 
statement. If that table exists, an INSERT INTO ... SELECT ... FROM is used 
instead.
Also that table will be indexed automatically, based on the WHERE clause.

2) Define many small Aggregate Query Definitions based on that Source Query 
Table

3) Define the mapping between the result sets and columns of these 
Aggregate Query Definitions and Excel Spreadsheets and store everything in 
an Report Template XML file

*Report Compilation:*

4) Select the Report Template XML file

5) Execute the Report Source Query

6) Execute the Aggregate Queries and transfer the data into the Excel file

In general, step  6) will be very fast and step 5) can be monitored and 
tuned in order to make it fast enough (usually we use the "Long-Ops", 
Oracle ADDM and SQL Tuner and apply SQL Profiles)

Now my challenge: The relevant customer is small enough (30k live accounts) 
and H2 performs extremely well normally. Also the Trialbalance and the 
Disclosures are built well and within 1-2 minutes -- *but only on the first 
execution after the program start*.

Whenever I *start exactly the same report directly after the first report 
is built* (without any change of data or report definitions, without any 
concurrent activity etc.), it will build the Trial Balance fast and *then 
it is stuck forever* on the Report Source Query for the Disclosures. I have 
no idea why. Killing the program and restarting it will give the same 
result: first execution very fast, following execution does not return data 
forever, but only for that Disclosure Report. Trialbalance is still fast 
and both queries (Trialbalance and Disclosures) are fairly complex with a 
lot of joins and sub-queries.

My questions are: how come? Should a repeated execution not be faster 
because of the query cache? And how can I see and monitor, what the 
database is actually doing and what execution plan has been applied for the 
running queries? (I know how to get an execution plan before execution, but 
not ex-post for an already running query.)

Are there any tools similar to "Long-Ops" and/or V$SESSION?
And how am I supposed to monitor and analyze such situations? 

Important for the records: I hate Oracle and love H2 database very much and 
mention the Oracle tool-set only as illustration, what I am looking for. I 
am convinced that H2 can handle the load and the queries very well in 
general, but only something goes wrong after the first executions. I want 
to find and fix that cause.

Thank you so much already and best regards!
Andreas

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/h2-database/ebf1800a-7478-4310-88d4-1a94c60ffe45%40googlegroups.com.

Reply via email to