> From: Leo Sutic [mailto:[EMAIL PROTECTED]]
>
> > From: Berin Loritsch [mailto:[EMAIL PROTECTED]]
> >
> > > Assume you have a CM that automatically reclaims all components
> > > after each request. That is, for Cocoon, when the request comes
> > > in, the CM starts keeping track of what components have been
> > > taken out, and when the request has been processed, they are
> > > release()'d (or equivalent method).
> > >
> > > Now introduce pooled components.
> > >
> > > If more than pool-max components are looked-up during
> > > the request you are not performing well, as you empty
> > > the pool.
> >
> > I thought I already did introduce pooled components. It's
> > really simple. The GC process for components releases
> > them--just like we currently do. The GC process is done
> > after the Response is committed.
>
> The scenario was when more than pool-max lookups had been
> done before the GC kicks in. Suppose you have a pool-max of 3:
>
> public void handleRequest () {
>     someMethod ();
>     someMethod ();
>     someMethod ();
>     someMethod ();
> }

And this is different from the current state of affairs, how? If a
request requires 5 transformer instances, and you have your pool max
set to 3, you will still experience a slowdown. This is no different
from automatically releasing a component when the request is handled.

> public void someMethod () {
>     manager.lookup (ExpensiveButPooledComponent.ROLE);
>     ...
> }
>
> With an explicit release() this could be made not to drain
> the pool. With GC you can not, unless you set the timeout
> ridiculously low.

With an explicit release() you are in the same boat as the GC method.
For Cocoon we have a really simple lifetime for requested components:
the length of a request. It's not that hard to implement or to
comprehend. It is also pretty easy to manage the instances available.

Many of the components that are currently pooled can be moved to a
PerThread policy. All we need is a ThreadLocal variable to create the
instance of the object. This accounts for a large majority.
Unfortunately, the core components in Cocoon have an interface that
is not friendly, and we need a unique instance for every request.
In other words, pooling.

I am also advocating that the current pipeline component interfaces
be changed. The Generator, Transformer, and Serializer implement SAX
methods--which is mixing concerns. They should instead return an
object that implements them. Now, we can set it up so that we can have
a new version of the interface without breaking backwards
compatibility with current components--but that is a subject for
another thread.

There is something inherently wrong when the only options available
to you are to pool the components or to create them new every time.
The interface is wrong. It adds overhead and leads to long, drawn-out
witch hunts to find where the component references are leaking.
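
A minimal sketch of the PerThread policy mentioned above, assuming a
modern JDK (ThreadLocal.withInitial needs Java 8 or later). The
PerThreadHolder class is hypothetical and is not part of Avalon or
Cocoon; it only illustrates one instance per thread with no release().

```java
import java.util.function.Supplier;

// Hypothetical holder: each thread lazily gets its own component instance,
// so callers never need to hand anything back.
final class PerThreadHolder<T> {
    private final ThreadLocal<T> instances;

    PerThreadHolder(Supplier<? extends T> factory) {
        this.instances = ThreadLocal.withInitial(factory);
    }

    // Returns the calling thread's instance, creating it on first use.
    T get() {
        return instances.get();
    }
}

final class PerThreadDemo {
    public static void main(String[] args) throws InterruptedException {
        PerThreadHolder<StringBuilder> holder =
                new PerThreadHolder<>(StringBuilder::new);

        StringBuilder mine = holder.get();
        StringBuilder[] other = new StringBuilder[1];

        // A second thread looks up the same role and gets its own instance.
        Thread worker = new Thread(() -> other[0] = holder.get());
        worker.start();
        worker.join();

        System.out.println("same thread reuses instance: " + (mine == holder.get()));
        System.out.println("other thread gets its own:   " + (mine != other[0]));
    }
}
```

This only works for components whose state is consistent within a
thread, which is exactly the restriction discussed in this message.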

If we can design the components so that they can either be shared
among all threads (optimal), or at the very least ensure that one
instance per thread is sufficient, then we have something where the
framework is no longer the issue and we no longer need the release()
mechanism.

The issues come with the forcing of Poolable. That decision should be
something that the container can decide to implement if it wants
to--possibly to save instances so that the number of instances of a
component is smaller than the number of threads.

However, the interfaces for the Cocoon pipeline components are broken.
A Generator should return an XMLSource, a Transformer should return an
interface that merges XMLSource and ContentHandler, and a Serializer
should return a ContentHandler. That way we can have something as
simple as:

    XMLSource source = generator.getXMLSource("file", uri);
    XMLSource trans = source;
    Iterator xformers = transformers.iterator();

    while ( xformers.hasNext() )
    {
        // each entry names one transformer by type and uri
        Struct entry = (Struct) xformers.next();
        // the returned object must also be a ContentHandler (see above)
        XMLSource newTrans = transformers.getPipeline(entry.type, entry.uri);
        trans.setContentHandler(newTrans);
        trans = newTrans;
    }

    trans.setContentHandler( serializer.getHandler("svg2png") );
    source.execute();

As ContentHandler.endDocument() is called on each item, they are
automatically returned to their pools. It's not bad. Not to mention,
the current style generators, transformers, and serializers would be
usable as the return values--so that everyone's hard work is not
wasted.

> > The GC routine for the container collects any components that
> > need to be reclaimed into the pool. As a result we will have
> > fewer dangling components than is currently possible. Right
> > now, we have the equivalent of C++ memory allocation. The
> > onus is on the developer to get it right. The GC brings the
> > component into the Java age where GC is the norm.
> > You don't have to worry about deleting everything you new in Java,
> > the user doesn't have to worry about releasing everything you lookup.
>
> Well that's fine in theory, but in practice you will end up tweaking
> and tweaking your GC timeouts and pool sizes, getting bizarre
> errors along the way.

You already have to screw with pool sizes. The GC element is not
going to make things less predictable on that front. In fact, there
is a good chance it will make them *more* predictable.

As to timeouts, we can use one policy for the container type. For
example, Cocoon would benefit from a request based approach. Other
containers may have to use a timeout based approach. It's up to the
container.

Are timeouts sufficient? No. Does it add additional complexity for
the container? Yes. Does it help the developer? Absolutely.

> > Example:
> >
> > Proxy that releases the component instance after a timeout of
> > 100 ms will wait as a container of nothing until it is either
> > GC'd by the JVM or until an interface method has been called.
> > In that case, the call blocks until a new Component instance
> > is pulled from the pool. The method is then called.
>
> But component state is lost in the "refresh". Meaning that for
> a SAX transformer or *any other component with state* you have
> screwed up the processing. (So don't allow components with
> state, then - well, then they are all ThreadSafe and we do not need
> pools.)

See above. The Cocoon pipeline component interfaces are really
screwed up in this respect. Keeping a component's state per thread
should be sufficient. Anything that is more granular than that needs
a different treatment.

> The basis of GC is that you can unambiguously tell when an
> object is no longer used - when it can not possibly be used.
> The speedups we have in pooling is due to explicitly telling
> the container that this object can be reclaimed, thus keeping
> the object count low.

In Cocoon we have the advantage of knowing that.
A pipeline component cannot possibly be used past the processing of a
request. It makes for a really simple GC mechanism.

> > I do not want any more work on the client. Let the container
> > be smart and the client be dumb.
>
> Agreed! But what you propose is simply too complex to ever
> work in practice. There are just too many restrictions on how
> a component may behave, too many parameters for the GC
> policy. Too much that can go wrong.

I am finding more and more that what people are calling components
are nothing more than Objects that observe the Bridge pattern. They
implement an interface, introduce a few lifecycle methods, etc. If
they are objects then they should be treated as such. If a pooled
object requires an explicit return to a pool, then that decision
should be made in the GeneratorManager, or the TransformerManager,
etc. Not in the core lookup mechanism.

> Add in different GC policies for different containers and
> you end up with making the whole thing more complex instead of
> less.
>
> Summary: GC of components...
>
> ...means that components may not have state and be pooled.

It means that the state has to be at least consistent within a
thread. Of course, the proxy can maintain the state as well--but that
is more complexity as well....

> ...means that you always risk draining the pool.

That is the notion that I am trying to dispel. It means that there
are fewer memory leaks in Cocoon, because what one developer forgot
to release is not going to hurt everyone else.

> ...means a load of GC policy parameters for the client.

? I don't get this at all. GC policy is a function of the
container--the client has no say in its use. The JVM does not have
programmatic hooks to allow you to modify at runtime what GC policy
it has. The fact that it has a System.gc() method is too much IMO to
give to a client.
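
To make the request-based reclaim discussed in this thread concrete,
here is a rough sketch under stated assumptions. SimplePool and
RequestScopedManager are hypothetical names, not Avalon or Cocoon
classes, and the code assumes a request is handled on a single thread
and that Java 8 or later is available.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical bounded pool: keeps at most `max` idle instances.
final class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int max;
    private int created;

    SimplePool(int max, Supplier<T> factory) {
        this.max = max;
        this.factory = factory;
    }

    synchronized T acquire() {
        T instance = idle.poll();
        if (instance == null) {       // pool drained: pay for a new instance
            instance = factory.get();
            created++;
        }
        return instance;
    }

    synchronized void release(T instance) {
        if (idle.size() < max) {      // surplus instances are simply dropped
            idle.push(instance);
        }
    }

    synchronized int createdCount() { return created; }
    synchronized int idleCount() { return idle.size(); }
}

// Hypothetical manager: remembers what each thread looked up during a
// request and returns everything in one sweep when the request ends.
final class RequestScopedManager<T> {
    private final SimplePool<T> pool;
    private final ThreadLocal<List<T>> takenThisRequest =
            ThreadLocal.withInitial(ArrayList::new);

    RequestScopedManager(SimplePool<T> pool) {
        this.pool = pool;
    }

    T lookup() {
        T component = pool.acquire();
        takenThisRequest.get().add(component);
        return component;
    }

    // Meant to be called by the container after the response is committed.
    void endRequest() {
        List<T> taken = takenThisRequest.get();
        for (T component : taken) {
            pool.release(component);
        }
        taken.clear();
    }
}

final class RequestScopeDemo {
    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(3, StringBuilder::new);
        RequestScopedManager<StringBuilder> manager = new RequestScopedManager<>(pool);

        for (int i = 0; i < 4; i++) {
            manager.lookup();         // one lookup more than pool-max
        }
        manager.endRequest();         // the client never calls release()

        System.out.println("created=" + pool.createdCount()
                + " idle=" + pool.idleCount());
    }
}
```

With a pool-max of 3 and four lookups in one request, the fourth
lookup pays for an extra instance, and after endRequest() three idle
instances are back in the pool. A forgotten release() cannot leak past
the request, which is the point being argued above.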