[ https://issues.apache.org/jira/browse/DERBY-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059896#comment-13059896 ]

Knut Anders Hatlen commented on DERBY-4620:
-------------------------------------------

The data structures are somewhat wasteful, but I'm not sure there's much we 
can do about it. An SQLInteger object takes 16 bytes of memory, which sounds 
large compared to a primitive int's 4 bytes, but it's no more than a 
java.lang.Integer object, which also weighs in at 16 bytes. As long as we're 
using java.util.HashMap to store the hash table in memory, we cannot use 
primitives as keys, so I can't see that we can do any better than 
java.lang.Integer/iapi.types.SQLInteger for single-column keys.
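
To make the boxing cost concrete, here's a minimal standalone sketch (not 
Derby code; the class name and row count are made up for illustration) of why 
a HashMap-based hash table pays the object cost even for plain int keys:

    import java.util.HashMap;
    import java.util.Map;

    public class BoxedKeys {
        public static void main(String[] args) {
            // java.util.HashMap cannot hold primitive keys, so every int
            // key is autoboxed to a java.lang.Integer: ~16 bytes per key
            // object, the same footprint as an iapi.types.SQLInteger.
            Map<Integer, int[]> table = new HashMap<>();
            for (int key = 0; key < 100_000; key++) {
                table.put(key, new int[] { key }); // key boxed here
            }
            System.out.println("entries: " + table.size());
        }
    }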

For multi-column keys consisting of simple values we might be able to do 
better, since for example an int[] array is more compact than an Object[] 
array. But we could only take advantage of that if all the columns in the key 
are of the same type (an int[] array can only store ints). And we'd also need 
special logic to handle the cases where we're joining columns of different 
types (for example INT columns in one table joined with SMALLINT columns in 
another table), cases where the SQLInteger and SQLSmallint classes handle the 
comparison correctly with no extra logic. So I have the feeling that reducing 
the memory usage significantly would be difficult without adding a lot of 
complexity.
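
A purely hypothetical sketch of such a compact key (not a proposed patch): 
an int[] can't serve as a HashMap key directly, since arrays compare by 
identity, so a wrapper is still needed, and a SMALLINT column would have to 
be widened to int on its way into the array — the logic that 
SQLInteger/SQLSmallint give us for free today:

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;

    public final class IntArrayKey {
        private final int[] columns;

        public IntArrayKey(int[] columns) {
            this.columns = columns;
        }

        @Override
        public boolean equals(Object o) {
            // Value-based comparison; a bare int[] key would compare
            // by identity and never match a freshly built probe key.
            return o instanceof IntArrayKey
                    && Arrays.equals(columns, ((IntArrayKey) o).columns);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(columns);
        }

        public static void main(String[] args) {
            Map<IntArrayKey, String> table = new HashMap<>();
            // A SMALLINT value would be widened here: (int) smallintValue.
            table.put(new IntArrayKey(new int[] { 1, 2 }), "row 1");
            System.out.println(table.get(new IntArrayKey(new int[] { 1, 2 })));
        }
    }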

Quick update about the testing of the patch: There was one failure in the JUnit 
regression tests, but it looked unrelated (in InterruptResilienceTest). Will 
file a bug for that. Derbyall had failures in lang/subquery.sql and 
lang/wisconsin.java because of changes in some of the query plans (expected 
hash scans, but found index scans and nested loop joins). The repro for the bug 
finally completed successfully after 10 hours when run with -Xmx62M.

> Query optimizer causes OOM error on a complex query
> ---------------------------------------------------
>
>                 Key: DERBY-4620
>                 URL: https://issues.apache.org/jira/browse/DERBY-4620
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.5.3.0
>         Environment: java version "1.6.0_17"
> Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
> Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing)
> Linux rocio.xxx 2.6.24-27-generic #1 SMP Fri Mar 12 01:10:31 UTC 2010 i686 
> GNU/Linux
>            Reporter: Chris Wilson
>              Labels: derby_triage10_8
>         Attachments: estimate-sizes.diff, query-plan.txt
>
>
> I have a query that generates about 2,000 rows in a summary table. It runs 
> fast enough (~30 seconds) on MySQL. The same query on Derby runs for about 10 
> minutes and then fails with an OutOfMemoryError.
> I have created a test case to reproduce the problem. It's not minimal, 
> because it relies on a rather large dataset, and it's not trivial, but I 
> don't mind doing a bit of work trimming it down if someone can point me in 
> the right direction.
> You can check out the test case from our Subversion server here:
>   http://rita.wfplogistics.org/svn/trackrita/rita/doc/derby-oom-slow-query
> which includes a pre-built Derby database in "testdb.derby". If this database 
> is deleted, "test.sh" will recreate it, but that takes about 10-15 minutes.
> Just modify the script "test.sh" to point to your Derby libraries, and run it 
> (or just execute the commands in "movement_complete.sql") to demonstrate the 
> problem. You can view the source of that file online here:
>   http://rita.wfplogistics.org/trac/browser/rita/conf/movement_complete.sql
> The first "insert into movement_complete" (starting around line 4) takes 
> about 15 seconds to complete and inserts 5890 rows. The second, starting 
> around line 54, does not complete in reasonable time on Derby. On MySQL, it 
> runs in 28 seconds and inserts 2038 rows. On Derby, after 10 minutes I get:
> JAVA ERROR: java.lang.OutOfMemoryError: Java heap space
> ij> ERROR X0Y67: Cannot issue rollback in a nested connection when there 
> is a pending operation in the parent connection.
> (process exits)
> It does not output the query plan in this case.
> Following the suggestion of Bryan Pendleton, I tried increasing the JVM 
> memory limit from the default to 1024m, and this allows the query to finish 
> executing quite quickly. I guess that means the optimiser simply needs a 
> lot of memory to optimise the query, and with the default settings it 
> spends forever in GC before finally hitting OOM and giving up.
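> With the jars from test.sh, the invocation with the increased limit looks 
> roughly like this (the jar paths are placeholders; point them at your Derby 
> installation):
>
>   java -Xmx1024m -cp /path/to/derby.jar:/path/to/derbytools.jar \
>       org.apache.derby.tools.ij movement_complete.sql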

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
