[h2] Re: some performance numbers

Brian Craft Wed, 09 Oct 2013 21:04:45 -0700

Continued posting in my own thread, in case it's useful to anyone. ;)

Copying the db to a new db with PAGE_SIZE of 8k lowered the average query 
time by a factor of 20: from 10 sec to 0.5 sec.


I should copy it to a db with the original PAGE_SIZE to see if insert order 
affects anything (since the SCRIPT dumps operate one table at a time, it 
gets written in a very different order than the original db).


On Wednesday, October 9, 2013 12:26:31 PM UTC-7, Brian Craft wrote:
>
> After more investigation I strongly suspect that IO is my bottleneck, but 
> the disk read rate is low because there are very many seeks being generated 
> by h2.
>
> I expect PAGE_SIZE is the best way to address this. Is that correct?
>
> Does row insert order affect this much? For example, would alternate 
> inserts into two tables result in the data being more spatially distributed 
> on the disk than writing first one table, then the other?
>
>
> On Tuesday, October 8, 2013 5:41:57 PM UTC-7, Brian Craft wrote:
>>
>> I'm wondering if these numbers look typical of h2, or if I still have 
>> some bottleneck I haven't identified.
>>
>> I have a probes table (mentioned in other threads) with one index over 
>> two columns: eid (int) and name (varchar). (eid, name) is unique. For each 
>> eid there might be 10k-500k rows.
>>
>> The table has about 38M rows. A query that retrieves 500 rows matching a 
>> single eid with 240k rows, takes about 5 seconds. 
>>
>> This is using the TABLE() inner join, instead of a big IN query. The 
>> explain analyze output shows it hitting the index.
>>
>> If I close the db and re-open it, the same query takes much less than a 
>> second, which I suspect reflects the OS block cache. Though, as I mentioned 
>> earlier, the disk doesn't seem very strained during the first query. I'm 
>> not sure what that means.
>>
>> The query:
>> SELECT  `probes`.`name` as `gene`, `probes`.`id`  FROM `probes`       
>> INNER JOIN TABLE(name varchar=?) T ON T.`name`=`probes`.`name`       WHERE 
>> (`probes`.`eid` = ?)
>>
>> The explain analyze:
>>
>> SELECT
>>     PROBES.NAME AS GENE,
>>     PROBES.ID
>> FROM TABLE(NAME VARCHAR=?1) T
>>     /* function */
>>     /* scanCount: 501 */
>> INNER JOIN PUBLIC.PROBES
>>     /* PUBLIC.EID_NAME_INDEX_8: EID = ?2
>>         AND NAME = T.NAME
>>      */
>>     ON 1=1
>>     /* scanCount: 1000 */
>> WHERE (PROBES.EID = ?2)
>>     AND (T.NAME = PROBES.NAME)
>>
>>
>>

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/groups/opt_out.

[h2] Re: some performance numbers

Reply via email to