Re: [Resin-interest] Memory and slowdown...
Well, based on previous advice we are inspecting both the profile and the heap dump. Both point to a problem with one of our business objects (referenced previously: ShallowSongBO), so I implemented a simple instance counter on that object. There is a static variable. It is incremented in the constructor and decremented by finalize. Both are synchronized on the class "ShallowSongBO", so the counter should be thread safe. I simply print the count each time an instance is created or finalized.

On my box, and on production to start with, the instance count is well behaved. It goes up for a while and then at some point it falls off. On production, it eventually begins to ONLY increase and never goes down. I suspect that other objects have the same problem; it is just that we load many, many songs compared to other classes. In fact, the heap dump shows that the object we load the second most is also hanging around.

It looks to me like garbage collection just stops at some point. As an experiment I've implemented a URL that allows me to execute System.gc(). This is probably a placebo and should not be needed, but I have to wait for the problem to manifest itself again later today or tonight to know whether requesting garbage collection has any effect.

Andrew

Bill Au wrote:
> I would take some thread dumps during heavy load to see what is going on. JConsole with the JTop plug-in can show you the threads that are using the most CPU.
>
> Bill
>
> On Tue, Apr 8, 2008 at 12:06 AM, Knut Forkalsrud [EMAIL PROTECTED] wrote:
>> On Apr 7, 2008, at 9:18 AM, Sandeep Ghael wrote:
>>> <jvm-arg>-Xdebug</jvm-arg>
>>
>> In my experience the debug switch sometimes causes the JVM to behave erratically under heavy load. I would get rid of it and try again.
>>
>> -Knut

___
resin-interest mailing list
resin-interest@caucho.com
http://maillist.caucho.com/mailman/listinfo/resin-interest
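A minimal sketch of the kind of counter described in this message, for readers who want to try the same experiment. The class name CountedSong is a hypothetical stand-in for ShallowSongBO, whose real code is not shown in the thread, and the per-instance printing is left out.

```java
// Hypothetical stand-in for ShallowSongBO; only the counting logic is shown.
class CountedSong {
    // Number of instances constructed but not yet finalized.
    private static int liveCount = 0;

    CountedSong() {
        adjust(+1);
    }

    // A static synchronized method locks on CountedSong.class,
    // which matches "synchronized on the class" in the message.
    private static synchronized void adjust(int delta) {
        liveCount += delta;
    }

    static synchronized int liveCount() {
        return liveCount;
    }

    // Runs only if and when the collector decides to finalize this object.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            adjust(-1);
        } finally {
            super.finalize();
        }
    }

    public static void main(String[] args) {
        CountedSong[] held = new CountedSong[3];
        for (int i = 0; i < held.length; i++) {
            held[i] = new CountedSong();
        }
        // The array keeps all three reachable, so none can be finalized yet.
        System.out.println("live instances: " + liveCount());
    }
}
```

A count that only rises tells you that instances are not being finalized. That is what you would see if they were still reachable from somewhere, so it does not by itself show that the collector has stopped. Note also that finalize() is deprecated in current Java releases and was never guaranteed to run promptly.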
Re: [Resin-interest] Memory and slowdown...
Andrew Fritz wrote:
> It looks to me like garbage collection just stops at some point.

You might want to read Bruce Eckel's old blog and search for the entry 'Destructors in GCed languages': http://onthethought.blogspot.com/ It might shed some light on your problem.

-Rob

Robert Leland
INTEGRITY One Partners
Re: [Resin-interest] Memory and slowdown...
[I sent this yesterday morning, but I don't think it made it to the list. Trying once more.]

Right off the bat, that's a big stack size to be using. I'm assuming you're on a 32-bit machine? If so, then the max addressable space of your process is 2G, which includes the Java heap plus the overhead needed for (what I call) native memory allocation, which includes the memory needed for thread allocation. It doesn't matter that your machine has 3.3G of RAM if it's 32-bit. So, with a 1500M heap, you're only leaving about 500M for the JVM and other native memory allocation.

Personally, we run our Resin servers (which are on Windows) with a 128K stack size with no problems and, when we were still on 32-bit, it bought us a lot of time while we finished upgrading to 64-bit.

It's best practice to set ms and mx (and to use -server, which is passed to Resin as -J-server as the first parameter) because it tells Java to grab the entire needed amount of heap right when it starts. What I surmise might be happening to you is that your Java process starts with 64M (if running in client mode) or 128M (if running in server mode) because you're not specifying the ms value. But each thread is consuming 4M of native memory, so when your heap tries to grow beyond a certain size, trying to reach that 1500M limit, it can't get there because too much of the 2G of addressable space is being consumed by your threads.

Rob

On Apr 7, 2008, at 09:18, Sandeep Ghael wrote:
> <jvm-arg>-Xmx1500m</jvm-arg>
> <jvm-arg>-Xss4m</jvm-arg>
> <thread-max>256</thread-max>
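The arithmetic behind this argument can be written out as a short sketch. The figures (2G addressable, -Xmx1500m, -Xss4m, thread-max 256, 128K stacks) come from the thread; the class and method names are hypothetical, and the calculation assumes the worst case in which every allowed thread exists at once and reserves its full stack.

```java
// Back-of-the-envelope address-space arithmetic for a 32-bit JVM process.
class AddressSpaceBudget {
    // Worst-case stack reservation in KB if every allowed thread exists at once.
    static long stackReserveKb(int maxThreads, long stackKbPerThread) {
        return (long) maxThreads * stackKbPerThread;
    }

    // Address space left for everything that is neither heap nor thread stacks.
    // A negative result means the configuration cannot fit.
    static long remainingKb(long addressableKb, long heapKb, long stackKb) {
        return addressableKb - heapKb - stackKb;
    }

    public static void main(String[] args) {
        long addressable = 2048L * 1024;                // 2G figure from the message
        long heap = 1500L * 1024;                       // -Xmx1500m
        long stacks4m = stackReserveKb(256, 4 * 1024);  // -Xss4m, thread-max 256
        long stacks128k = stackReserveKb(256, 128);     // 128K stacks, same thread-max

        System.out.println("4M stacks:   " + stacks4m / 1024 + "M reserved, "
                + remainingKb(addressable, heap, stacks4m) / 1024 + "M left");
        System.out.println("128K stacks: " + stacks128k / 1024 + "M reserved, "
                + remainingKb(addressable, heap, stacks128k) / 1024 + "M left");
    }
}
```

With these inputs, 256 threads at 4M come to 1024M of stack, which together with a 1500M heap overshoots the 2G figure by 476M. At 128K per thread the stacks total 32M and about 516M remains for the JVM itself and other native allocations.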
Re: [Resin-interest] Memory and slowdown...
On Apr 8, 2008, at 7:55 AM, Andrew Fritz wrote:
> I suspect that other objects have the same problem; it is just that we load many, many songs compared to other classes. In fact, the heap dump shows that the object we load the second most is also hanging around.

Good. That's the first step in tracking down the problem.

> It looks to me like garbage collection just stops at some point.

Probably not. At this point in debugging, it's better to assume the JVM is running perfectly, no matter how tempting it might be to blame it. :)

Most likely, there's some reference somewhere to your ShallowSongBO that shouldn't exist. This is where a more sophisticated profiler would help tremendously, although it's still possible to track down.

Some things to check:

1) Is the number of Hibernate sessions expected, or is it larger than normal? If you see 1000s of sessions, then something is holding the sessions beyond the end of the request, and those sessions are holding your object.

2) Is the number of Quercus Env objects expected? Similarly, if something's holding the Env beyond the request, that might be either holding your object directly or holding the Hibernate sessions. (If this is happening, it might be a Quercus bug.)

3) Look for other containers, or things that should only have 1 object per request. If the number of those objects is significantly bigger than the number of requests, that's the place to start.

You might want to use <jvm-arg>-Xrunhprof:heap=sites</jvm-arg>. It'll give you similar information to Resin's heap dump, but might be more useful (although it's a bit more complicated to debug).

-- Scott
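The pattern Scott describes, a per-request container outliving its request and pinning everything it loaded, can be illustrated with a small single-threaded sketch. FakeSession and OPEN_SESSIONS are hypothetical stand-ins and are not Hibernate or Quercus classes; the point is only that counting the containers against the number of requests exposes the leak.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; these are not Hibernate or Quercus classes.
class LeakSketch {
    // A per-request unit of work that keeps references to what it has loaded.
    static class FakeSession {
        final List<Object> loaded = new ArrayList<>();
    }

    // A long-lived holder, for example a static field that survives across requests.
    static final List<FakeSession> OPEN_SESSIONS = new ArrayList<>();

    // Leaky: the session is registered but never released.
    static void handleRequestLeaky() {
        FakeSession s = new FakeSession();
        OPEN_SESSIONS.add(s);
        s.loaded.add(new Object()); // stands in for a loaded business object
    }

    // Well behaved: the session is always released at the end of the request.
    static void handleRequestClean() {
        FakeSession s = new FakeSession();
        OPEN_SESSIONS.add(s);
        try {
            s.loaded.add(new Object());
        } finally {
            OPEN_SESSIONS.remove(s);
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) handleRequestClean();
        System.out.println("after clean requests: " + OPEN_SESSIONS.size());
        for (int i = 0; i < 1000; i++) handleRequestLeaky();
        System.out.println("after leaky requests: " + OPEN_SESSIONS.size());
    }
}
```

In the leaky case the garbage collector is working correctly: every loaded object is still reachable through the long-lived list, so none of them may be collected.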
Re: [Resin-interest] Memory and slowdown...
Oh, I'm not blaming the JVM. I'm sure it is working as designed. But it sounds like you are sure that GC is running. I'll look into the Hibernate session and Quercus Env objects and see what the heap dump shows relative to them.

Andrew
Re: [Resin-interest] Memory and slowdown...
> It's best practice to set ms and mx (and using -server, which is passed to resin as -J-server as the first parameter) because it tells java to grab the entire needed amount of heap right when it starts.

Just to clarify, -J-server is for Resin 3.0. For Resin 3.1, you add a jvm-arg in resin.conf for each argument to pass to the server's JVM:

<jvm-arg>-server</jvm-arg>
<jvm-arg>-Xms...</jvm-arg>
<jvm-arg>-Xmx...</jvm-arg>

-- Sam
Re: [Resin-interest] Memory and slowdown...
Well, if you tell your JVM to log GC activity, you will be able to see what the GC is doing:

<jvm-arg>-Xloggc:${server.root}/log/gc.log</jvm-arg>
<jvm-arg>-XX:+PrintGCTimeStamps</jvm-arg>
<jvm-arg>-XX:+PrintGCDetails</jvm-arg>

Andrew Fritz wrote:
> Oh, I'm not blaming the JVM. I'm sure it is working as designed. But it sounds like you are sure that GC is running.

Eric S. Kreiser
Senior Software Architect
Mzinga
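Besides a GC log, the same question ("is the collector still running?") can be checked from inside the JVM with the standard java.lang.management beans. The sketch below is one possible way to do that; the class name GcSnapshot and its methods are hypothetical, and nothing here is specific to Resin.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Reads collector activity and heap usage through the platform MXBeans.
class GcSnapshot {
    // Sum of collections reported by all collectors.
    // A collector may report -1 when the count is undefined; those are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // One-line summary: used heap plus per-collector count and accumulated time.
    static String describe() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        StringBuilder sb = new StringBuilder();
        sb.append("heap used=").append(heap.getUsed() / (1024 * 1024)).append("M");
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(", ").append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(" timeMs=").append(gc.getCollectionTime());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe());
        // Allocate some short-lived garbage, then look again.
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
        }
        System.out.println(describe());
    }
}
```

If the collection counts keep climbing while used heap stays near the maximum, the collector is running but finding little to free, which points toward retained references.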
Re: [Resin-interest] Memory and slowdown...
We got the heap dump working. We didn't have it enabled in the config... I'm spending some time looking through the profiles and heap dump now to see if I can find anything of interest.

Andrew

Scott Ferguson wrote:
> On Apr 2, 2008, at 8:21 AM, Andrew Fritz wrote:
>> Our production servers have their maximum memory set to 2048m. Everything is fine for a while. Eventually the java process ends up with all 2048m allocated. At this point server load starts going up and response time gets bad. Eventually requests start timing out. Restarting the server fixes the problem instantly and everything is good again.
>>
>> Occasionally one of the servers will do this on its own, presumably because it reaches the 1m free threshold. That appears to be too small a margin, and a restart is needed well before there is only 1m left, so I adjusted the minimum free memory from 1m to 24m. That seems like a bandage though.
>>
>> The heap dump returned a blank page so I'm not sure what was going on there. I'm just curious if anyone has any theories about what might be eating up memory over time. We are using Hibernate and PHP and of course Java.
>>
>> Andrew
>
> Does the heap dump page work for you in a normal situation, i.e. before you start running out of memory?
>
> That's really the first place to start looking. The leaking memory might be obvious from Resin's heap dump page. If it's not enough information, the next step would be to use a more sophisticated memory profiler.
>
> -- Scott
Re: [Resin-interest] Memory and slowdown...
Hi Scott,

I work with Andrew and we are still fighting this problem. Thanks for the advice; we are analyzing the heap dump as you suggest.

For added color, linked is an image of the load on one of our servers (courtesy of Munin). The other server behaves similarly, but the two do not manifest the problem in concert (this is a clustered environment with 2 servers). You can see that the memory usage climbs to the point where the server begins to encounter high load. The server load drops dramatically, along with memory usage, when the server is restarted (either manually or automatically).

http://sandeepghael.com/ServerMemoryPattern.jpg

I was reading this Caucho Resin page on performance tuning of the JVM and have a few questions:

http://www.caucho.com/resin-3.0/performance/jvm-tuning.xtp

1) Why is it best practice to set "-Xms and maximum -Xmx heap sizes to the same value"? Currently we are setting -Xmx at 1500m with -Xms undefined.

2) I actually experimented with lowering the max heap size to 1024M, and the problem seems to occur faster. We thought that lowering the JVM heap size might prevent OS swap if that was the problem.

3) If -Xss is 4m and we have 256 max threads, that means we should account for the OS committing 4m*256=1G for stack space. Correct?

4) If our machine has 3.3G of RAM, what is best practice in terms of memory allocation for the JVM vs. the rest of the OS?

Our conf file is below.

regards,
Sandeep

(clustered environment)

<!--
   - The JVM arguments
  -->
<jvm-arg>-Xmx1500m</jvm-arg>
<jvm-arg>-Xss4m</jvm-arg>
<jvm-arg>-Xdebug</jvm-arg>
<jvm-arg>-Dcom.sun.management.jmxremote</jvm-arg>

<!--
   - Uncomment to enable admin heap dumps
   - <jvm-arg>-agentlib:resin</jvm-arg>
  -->

<watchdog-arg>-Dcom.sun.management.jmxremote</watchdog-arg>

<!--
   - Configures the minimum free memory allowed before Resin
   - will force a restart.
  -->
<memory-free-min>24M</memory-free-min>

<!-- Maximum number of threads. -->
<thread-max>256</thread-max>

<!-- Configures the socket timeout -->
<socket-timeout>65s</socket-timeout>

<!-- Configures the keepalive -->
<keepalive-max>128</keepalive-max>
<keepalive-timeout>15s</keepalive-timeout>
Re: [Resin-interest] Memory and slowdown...
So after doing some more research, and comparing the profile of one server (just restarted) to that of the other server in the loaded state, I've got a lead, so to speak, but I'm not sure what it means...

On the loaded server about 34% of the time is spent in "readNative", compared to 80%+ on the unloaded server. On the loaded server java.lang.Class.getInterfaces() is using up 27% of the run time. The stack trace is:

at java.lang.Class.getInterfaces()
at org.hibernate.intercept.FieldInterceptionHelper.extractFieldInterceptor()
at org.hibernate.intercept.FieldInterceptionHelper.clearDirty()
at org.hibernate.event.def.DefaultFlushEntityEventListener.isUpdateNecessary()
at org.hibernate.event.def.DefaultFlushEntityEventListener.onFlushEntity()
at org.hibernate.event.def.AbstractFlushingEventListener.flushEntities()
at org.hibernate.event.def.AbstractFlushingEventListener.flushEverythingToExecutions()
at org.hibernate.event.def.DefaultAutoFlushEventListener.onAutoFlush()
at org.hibernate.impl.SessionImpl.autoFlushIfRequired()
at org.hibernate.impl.SessionImpl.list()
at org.hibernate.impl.QueryImpl.list()
at org.mwm.musicmeta.db.MusicMetaDataHibernateReader.readShallowSongsForAlbum()
at sun.reflect.GeneratedMethodAccessor704.invoke()
at sun.reflect.DelegatingMethodAccessorImpl.invoke()
at java.lang.reflect.Method.invoke()
at com.caucho.quercus.env.JavaMethod.invoke()

This appears to be Hibernate related. Our "ShallowSongBO" is also the third item on the heap dump, right behind String and char[] (which is pretty insane). I can't conceive of why these might be hanging around, but the ShallowSongBO is loaded regularly from PHP.

Does anyone have any thoughts about what in this combination might be causing a persistent object to hang around beyond the request's lifetime if I'm closing out the Hibernate session correctly?

Andrew
Re: [Resin-interest] Memory and slowdown...
On Apr 7, 2008, at 9:18 AM, Sandeep Ghael wrote:

> Hi Scott,
>
> I work with Andrew and we are still fighting this problem. Thanks for the advice; we are analyzing the heap dump as you suggest.
>
> For added color to the problem, linked is an image of one of our servers' load (courtesy of Munin). The other server behaves similarly, but the two do not manifest the problem in concert (this is a cluster environment with 2 servers). You can see that the mem usage climbs to the point where the server begins to encounter high load. The server load will drop dramatically along with mem usage when the server is restarted (either manually or automatically).
>
> http://sandeepghael.com/ServerMemoryPattern.jpg
>
> I was reading this Caucho Resin page on perf tuning of the JVM and have a few questions:
>
> http://www.caucho.com/resin-3.0/performance/jvm-tuning.xtp
>
> 1) Why is it best practice to set -Xms and maximum -Xmx heap sizes to the same value? Currently we are setting -Xmx at 1500m with -Xms undefined.

I'm not sure this one is a big deal. The GC adaptively increases the minimum value as your application starts. So setting -Xms may improve startup time slightly, but shouldn't affect steady-state performance.

> 2) I actually experimented with lowering the max heap size to 1024M, and the problem seems to occur faster. We thought that lowering the JVM heap size might prevent OS swap if that was the problem.

That's very possible. You really don't want to be swapping during a GC.

> 3) If -Xss is 4m, and we have 256 max threads, does that mean we should account for the OS to commit 4m*256=1G for stack space? Correct?

Right, but 4m should be overkill unless you have a deeply recursive program.

> 4) If our machine has 3.3G RAM, what is best practice in terms of mem allocation for the JVM vs the rest of the OS?

I'm not sure, actually. Others might have better suggestions.

-- Scott
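The stack arithmetic from question 3 can be checked in a few lines of Java. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, 3.3G is rounded to 3379 MB, and the budget ignores everything the JVM allocates outside the heap and thread stacks. It also treats -Xss times thread-max as a worst case; an operating system will typically commit stack pages only as they are touched, so real usage may well be lower.

```java
/** Rough memory budget for the settings quoted in this thread (illustrative only). */
final class StackBudget {

    /** Worst-case stack reservation in MB: per-thread stack size times max thread count. */
    static long stackReserveMb(long stackSizeMb, int maxThreads) {
        return stackSizeMb * maxThreads;
    }

    /** RAM left (in MB) after subtracting the max heap and the worst-case thread stacks. */
    static long remainingMb(long ramMb, long maxHeapMb, long stackReserveMb) {
        return ramMb - maxHeapMb - stackReserveMb;
    }

    public static void main(String[] args) {
        long ramMb = 3379;      // assumption: 3.3G * 1024, rounded down
        long maxHeapMb = 1500;  // -Xmx1500m
        int maxThreads = 256;   // thread-max

        // Compare the configured 4m stacks with a hypothetical 1m setting.
        for (long stackSizeMb : new long[] {4, 1}) {
            long stacks = stackReserveMb(stackSizeMb, maxThreads);
            System.out.println("-Xss" + stackSizeMb + "m: stacks up to " + stacks
                    + " MB, left for everything else: "
                    + remainingMb(ramMb, maxHeapMb, stacks) + " MB");
        }
    }
}
```

If the worst case ever materialized with 4m stacks, heap plus stacks would leave well under 1G for the OS and any other processes, which is one reason a smaller -Xss may be worth trying.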
Re: [Resin-interest] Memory and slowdown...
On Apr 7, 2008, at 9:18 AM, Sandeep Ghael wrote:

> <jvm-arg>-Xdebug</jvm-arg>

In my experience the debug switch sometimes causes the JVM to behave erratically under heavy load. I would get rid of it and try again.

-Knut
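After editing the config and restarting, one way to confirm that the debug switch is really gone is to ask the running JVM for its startup arguments. The sketch below uses the standard RuntimeMXBean; the class name, the helper method, and the list of flag prefixes it looks for are assumptions chosen for illustration, not anything Resin provides.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Reports debugger-related flags among the JVM's startup arguments (illustrative). */
final class DebugFlagCheck {

    /** Returns the arguments that enable debugging, in the order they were given. */
    static List<String> debugFlags(List<String> jvmArgs) {
        List<String> found = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.equals("-Xdebug")
                    || arg.startsWith("-Xrunjdwp")
                    || arg.startsWith("-agentlib:jdwp")) {
                found.add(arg);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // getInputArguments() lists the options passed to this JVM at launch.
        List<String> flags =
                debugFlags(ManagementFactory.getRuntimeMXBean().getInputArguments());
        System.out.println(flags.isEmpty()
                ? "No debug flags found"
                : "Debug flags still active: " + flags);
    }
}
```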
[Resin-interest] Memory and slowdown...
Our production servers have their maximum memory set to 2048m. Everything is fine for a while. Eventually the Java process ends up with all 2048m allocated. At this point server load starts going up and response time gets bad. Eventually requests start timing out. Restarting the server fixes the problem instantly and everything is good again.

Occasionally one of the servers will do this on its own, presumably because it reaches the 1m free threshold. That appears to be too small a margin, and a restart is needed well before there is only 1m left, so I adjusted the minimum free memory from 1m to 24m. That seems like a bandage though.

The heap dump returned a blank page so I'm not sure what was going on there. I'm just curious if anyone has any theories about what might be eating up memory over time. We are using Hibernate and PHP and of course Java.

Andrew
Re: [Resin-interest] Memory and slowdown...
On Apr 2, 2008, at 8:21 AM, Andrew Fritz wrote:

> Our production servers have their maximum memory set to 2048m. Everything is fine for a while. Eventually the Java process ends up with all 2048m allocated. At this point server load starts going up and response time gets bad. Eventually requests start timing out. Restarting the server fixes the problem instantly and everything is good again.
>
> Occasionally one of the servers will do this on its own, presumably because it reaches the 1m free threshold. That appears to be too small a margin, and a restart is needed well before there is only 1m left, so I adjusted the minimum free memory from 1m to 24m. That seems like a bandage though.
>
> The heap dump returned a blank page so I'm not sure what was going on there. I'm just curious if anyone has any theories about what might be eating up memory over time. We are using Hibernate and PHP and of course Java.

Does the heap dump page work for you in a normal situation, i.e. before you start running out of memory?

That's really the first place to start looking. The leaking memory might be obvious from Resin's heap dump page. If it's not enough information, the next step would be to use a more sophisticated memory profiler.

-- Scott
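Alongside the heap dump page, heap headroom can also be sampled from inside the JVM with the standard java.lang.management API, which makes it easier to see whether usage creeps steadily toward the limit. The following is a minimal sketch: the class name, the five-sample loop, and the comparison against 24M are illustrative choices, and it does not claim to reproduce how Resin itself evaluates memory-free-min.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

/** Samples heap usage and flags low headroom (illustrative helper, not part of Resin). */
final class HeapWatch {

    /** Bytes between current heap use and the heap maximum, or -1 if no maximum is defined. */
    static long headroomBytes(MemoryUsage heap) {
        long max = heap.getMax();
        return max < 0 ? -1 : max - heap.getUsed();
    }

    /** True when a maximum is defined and the headroom has dropped below minFreeBytes. */
    static boolean belowThreshold(MemoryUsage heap, long minFreeBytes) {
        long headroom = headroomBytes(heap);
        return headroom >= 0 && headroom < minFreeBytes;
    }

    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long minFreeBytes = 24L * 1024 * 1024; // mirrors the 24M value from the config

        // Take a handful of samples one second apart; a real monitor would run longer.
        for (int i = 0; i < 5; i++) {
            MemoryUsage heap = memory.getHeapMemoryUsage();
            System.out.printf("used=%d MB, headroom=%d MB, below 24M: %b%n",
                    heap.getUsed() >> 20,
                    headroomBytes(heap) >> 20,
                    belowThreshold(heap, minFreeBytes));
            Thread.sleep(1000);
        }
    }
}
```

Logging these numbers over a day should show whether the used figure keeps ratcheting upward after each collection, which would point toward a leak rather than a simple sizing problem.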