Hi John, After a quick read of your mail, it seems to me some of the OutOfMemoryErrors you are seeing may be because of the leak described in DERBY-210. I have a patch for this attached to the JIRA issue. If it is okay, can you please try re-running the tests after applying this patch? I think it should make some difference in the memory usage.
Thanks, Deepa On 1/27/06, John Embretsen <[EMAIL PROTECTED]> wrote: > Hi all, > > As some of you may have noticed if you have tried testing Derby with > DOTS, or read the document "Using DOTS as Apache Derby System Test" > posted on the Apache Derby web site at > http://db.apache.org/derby/integrate/index.html, it is likely that you > will see OutOfMemoryErrors (OOME) in the logs (derby.log and DOTS > error log) when running (long enough) with default JVM configuration. > > I am currently doing some testing using DOTS, and can verify this > unfortunate fact. It is the Derby Network Server that runs out of memory > (details to follow). Since I do not fully accept the "explanation" of > what may be causing this in the above mentioned document, I would like > to share some of my findings with the community, so that we can > > a) Determine with certainty the source of the errors: Derby, DOTS, > Java,... ? > b) Figure out why this is happening > c) Find out what we can do to fix this, other than increasing the Java > Heap size ad infinitum. > > This is not the shortest E-mail you've seen, but please bear with me, as > this is (IMHO) a non-trivial problem which requires a certain amount of > information for us to be able to find the best solution. > > > Test setup > -------------------------------------- > > I've been running DOTS tests with various configurations to reduce the > risk of this being a user error, but here's the main setup I've been > using: > > * DOTS (v1.1.1), with mostly "default" settings except a few, such as: > * DURATION > 48 hours > * CPU_TARGET = 70 % > * AUTO_MODE = yes > * Derby 10.1.2.1, Client/Server mode, ClientDriver (client and server on > separate hosts) > * Default Derby settings > * Sun's JVM version 1.5.0 (I also see OutOfMemoryErrors using other VM > versions) > > DOTS has been set up with the help of the documents posted on the above > mentioned web site, and the DOTS User's Guide at > http://ltp.sourceforge.net/dotshowto.php. I am mostly using the test > case ATCJ2, which is "simulating" an online auction application, > although I have also seen the same type of error when using other test > cases. > > > Symptoms > -------------------------------------- > > Using a typical heap size (e.g. 128 MB) and size of the Permanent > Generation Space (aka. PermSpace, which probably is the most interesting > part of the heap in this context, as I will explain below) (e.g. 64 MB), > I usually see OutOfMemoryErrors within the first few hours of the test > run. > > Here's an example of the first occurrence of an OutOfMemoryError in > derby.log. The Derby Network server was started 2006-01-24 11:09:22.931 > GMT, and the DOTS test case was started on the client host a few seconds > later: > > 2006-01-24 14:11:22.278 GMT Thread[DRDAConnThread_19,5,main] Wrote class > org.apache.derby.exe.ac6ca0115bx0108xfc16xe0e7x000007823a688d6 to file > ac6ca0115bx0108xfc16xe0e7x000007823a688d6.class. Please provide support with > the file and the following exception information: java.lang.OutOfMemoryError: > PermGen space > 2006-01-24 14:11:22.279 GMT Thread[DRDAConnThread_19,5,main] (XID = 212876), > (SESSIONID = 18), (DATABASE = > /export/home2/tmp/je159969/DOTS/db/server/TESTDB), (DRDAID = > O19F70EC.D025-939843809132829725{19}), Cleanup action starting > 2006-01-24 14:11:22.279 GMT Thread[DRDAConnThread_19,5,main] (XID = 212876), > (SESSIONID = 18), (DATABASE = > /export/home2/tmp/je159969/DOTS/db/server/TESTDB), (DRDAID = > O19F70EC.D025-939843809132829725{19}), Failed Statement is: SELECT > BID_PRICE,BID_COUNT FROM ITEM WHERE ITEMID = 'ITEM7990' > java.lang.OutOfMemoryError: PermGen space > Cleanup action completed > > The test (and the Derby Server) continued to run, but kept throwing > OutOfMemoryErrors, and to a larger degree failed to execute statements > sent by the client. After a while (~2 days), the Network Server was not > able to do much of anything. Within that time, about 800 classes of the > type mentioned in the log excerpt above ended up in > ${derby.system.home}, although I did not tell Derby to dump generated > classes. > > I was monitoring the server heap using jstat (a Sun JDK monitoring tool, > see http://java.sun.com/j2se/1.5.0/docs/tooldocs/share/jstat.html ), and > as you can see from the attached graph (PU_jstat-gc-t.png) based on the > jstat output, the PermSpace was filling up rather quickly. After about 3 > hours (more accurately 10920 seconds), the first OutOfMemoryError was > thrown. I've attached a graph showing the first 4 hours (14400 seconds) > of the test run, so that you can see in more detail what happened > (PU-first4hrs_jstat-gc-t.png). > > I've also seen errors of the type "OutOfMemoryError: Java Heap Space" in > derby.log when running DOTS, but "OutOfMemoryError: PermGen space" is > the (by far) most frequently occurring OOME. > > The type of error and Mean Time To Failure (MTTF) of course varies with > your test setup, JVM configuration as well as the hardware you are > using. However, I believe a maximum heap size of 128 MB and maximum > PermSpace of 64 MB is close to the typical default values for JVMs (both > client and server mode) on today's desktops. > > > Source of errors, Chain of events > -------------------------------------- > > Since I don't know much about the internals of Derby I'm not very sure > about any of this, but here's what I think, based on the log files and > JVM behavior: > > The DOTS test case contains both prepared and unprepared statements. As > you can see from the log excerpt above, the statement which "caused" the > OOME in this case (other statements have been reported as the failing > ones in other test runs) was > > SELECT BID_PRICE,BID_COUNT FROM ITEM WHERE ITEMID = 'ITEM7990' > > Looking at the DOTS source code for ATCJ2.java, we see that this is the > String getBidItemSQL concatenated with the String "'ITEM7990'". > (You can get the source code for DOTS at > http://sourceforge.net/project/showfiles.php?group_id=3382 ) > > Here are three other relevant code lines (download DOTS to see full > context): > > Statement stmt = null; > > stmt = > conn.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,ResultSet.CONCUR_READ_ONLY); > > rs = stmt.executeQuery(getBidItemSQL + "'" + itemID + "'"); > > The statement and the result set are explicitly closed iff rs.next() > returns true. > > I guess that since a PreparedStatement is not being used here, the Derby > engine compiles a new statement like this every time the method doBid() > is being called (which is quite often), since the item id is varying quite > a lot. > > The test case also use PreparedStatements. Some of these are "normal", > meaning that the statement strings contain question mark placeholders, > and that setXXX and getXXX methods are used to set/get the values. The > odd thing is that the PreparedStatements (at least some of them) are not > reused, but closed, nulled, and re-prepared. Although this is not > recommended practice performance-wise, I would like to think that this > should not be the cause of a Derby memory leak, although DERBY-210 > describes issues with the Derby Server when such statements are _not_ > explicitly closed by the user. > > The third kind of statement used in the DOTS test case is > PreparedStatements with some question mark placeholders, but where an > item id is appended to the end this statement string every time it is > used. Here is an example: > > /** SQL for updating bid related columns in ITEM table */ > private final static String updateBidItemSQL = > "UPDATE ITEM SET BID_PRICE = ?,BID_COUNT=? WHERE ITEMID = "; > > I don't know the rationale behind doing it this way. It may be just "sloppy > code", or it may be intentional. > > Now, it seems to me that the sum of all this is that lots and lots of > different statements are being compiled and executed by the derby engine > in a relatively short period of time, which involves a whole lot of > generated byte code. This might explain why we run out of PermSpace... > > As far as I understand (I am not an expert in this area), the PermSpace > part of the heap is where the actual classes (not the object instances) > are stored during JVM execution. It seems that during such test runs > Derby generates new classes faster than the JVM is able to garbage > collect them, thus eventually running out of PermSpace. > > My limited experiments with somewhat larger heaps and PermSpaces > indicate that the problem does not go away, it just takes a bit longer > to fill up the PermSpace. > > The problem is probably worse when using a 1.3.x JVM (I did not try > this), since, according to > http://www.informit.com/guides/content.asp?g=java&seqNum=98&rl=1 , 1.3.x > versions of the JVM do not unload classes from its PermSpace. > > > Now what? > -------------------------------------- > > I am interested in finding out what is causing these errors, > and what can be done to avoid them in the future. > > It might be that changing the source code of DOTS will make the problem > go away (for example by changing the way PreparedStatements are used, or > by going through the code and making sure all statements and result sets > are explicitly closed); I have not tried this yet. > > However, shouldn't Derby be able to handle even this type of code in a > more graceful manner? Or are the DOTS test cases so "poorly" designed > that we cannot expect any relational database system to run them > successfully for more than a few hours with default JVM settings? > > I have noticed that there are several open or recently resolved > JIRA-issues treating OutOfMemoryErrors (e.g. DERBY-210, DERBY-443, > DERBY-557, DERBY-756), but I have not (yet) been able to link either of > those directly with the symptoms I see using DOTS. > > I have not run any DOTS tests recently using the Derby trunk, but perhaps > others out there have tried this? Do you get OutOfMemoryErrors? > > Has anyone tried DOTS (relatively long test runs) with other databases? > > More testing and closer inspection of some of the DOTS source code is on > my agenda - but if any of you have any ideas, or comments, feel free to > speak out :) > > Thanks for reading this, > > > -- > John > > >
