Re: AW: Multithreading FOP ?
Peter B. West wrote:
> Regarding configuration in FOP, it is interesting to note that there are two different config hierarchies depending on whether the environment is uniform, as, e.g., in a single thread, or diverse, as in the example Arnd offered. (That is, a separate process constructs stylesheet information and other variables into an instance-specific storage location, and invokes a fop thread with a reference to that location.)
>
> In the first case, the config hierarchy is:
>
> system config
> user config
> command line

This is an application-centric view. These are the traditional contexts used in many other applications to avoid repeatedly setting almost static values while still allowing flexibility. We are dealing with other points of view:
- FOP used embedded in some other software consuming and producing byte streams
- FOP as part of other software performing its own drawing and user interaction
- FOP as a self-contained application

Your config hierarchy applies only to the last. The first application of FOP should make only minimal assumptions about available infrastructure and other environmental issues. In particular, it should not assume there are even config files or command line parameters directly regarding FOP to read. There is no use in distinguishing between various config files and command line parameters. The mechanisms used should be universally available, like Java properties for very static stuff and fallback values, and interfaces for supplying user font data, for which there could be a default implementation that reads the data from URLs, which by default are resolved to certain file URLs.

Around this core could be a shell which provides FOP-specific config files and command line parameter filtering, much like the X Window System libraries provide similar stuff. This can be used both for embedding FOP elsewhere and for a stand-alone FOP application.
For the core, I still think the factory pattern used by JAXP is sensible: the factory is a singleton which serves as a template for the FO processors. All default configuration values are kept there; they are initialized with fallback values, hard coded or read from Java properties, when the factory is created, and they can be changed at any time. After a processor is created, the settings for this processor can still be changed until the formatting run starts. After the run ends, the processor object is discarded (no reset) and new processors are obtained from the factory. There are no MT issues because all methods on the factory and the processor object are synchronized on their respective objects, and the objects hold all necessary data. This model suits the servlet environment particularly well.

It could be claimed that some stuff from a processor run could be reused and should be cached, for example the FO tree. But I think you can just as well cache the produced byte stream. It's another matter in the case of the AWT renderer, where there could be many re-renderings from the same area tree (console refreshes, scrolling etc). But then, the AWT rendering processor is not necessarily discarded after each display refresh.

Regards
J.Pietschmann
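The factory model described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names `FoProcessorFactory` and `FoProcessor`, the option key `baseDir`, and the property name `fop.sketch.baseDir` are all hypothetical and are not part of FOP or JAXP.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical processor: owns a private copy of its settings. */
final class FoProcessor {
    private final Map<String, String> settings;
    private boolean started = false;

    FoProcessor(Map<String, String> defaults) {
        // Snapshot of the factory defaults; later factory changes do not leak in.
        this.settings = new HashMap<>(defaults);
    }

    /** Settings may be changed only until the formatting run starts. */
    synchronized void setOption(String key, String value) {
        if (started) {
            throw new IllegalStateException("formatting run already started");
        }
        settings.put(key, value);
    }

    synchronized String getOption(String key) {
        return settings.get(key);
    }

    /** Stand-in for a formatting run; uses only instance data. */
    synchronized String run(String input) {
        started = true;
        return "[" + settings.get("baseDir") + "] " + input;
    }
}

/** Hypothetical singleton factory that serves as a template for processors. */
final class FoProcessorFactory {
    private static final FoProcessorFactory INSTANCE = new FoProcessorFactory();
    private final Map<String, String> defaults = new HashMap<>();

    private FoProcessorFactory() {
        // Fallback value, hard coded or read from a Java system property.
        defaults.put("baseDir", System.getProperty("fop.sketch.baseDir", "."));
    }

    static FoProcessorFactory getInstance() {
        return INSTANCE;
    }

    /** Defaults can be changed at any time; they affect only new processors. */
    synchronized void setDefault(String key, String value) {
        defaults.put(key, value);
    }

    synchronized FoProcessor newProcessor() {
        return new FoProcessor(defaults);
    }
}

final class FactoryDemo {
    public static void main(String[] args) {
        FoProcessorFactory factory = FoProcessorFactory.getInstance();
        factory.setDefault("baseDir", "/tmp/defaults");

        FoProcessor job = factory.newProcessor();
        job.setOption("baseDir", "/tmp/job-a"); // per-instance override
        System.out.println(job.run("invoice.fo"));
        // The processor is discarded after the run; the next job asks the factory again.
    }
}
```

Because each processor copies the defaults at creation, two processors created from the same factory can run in different threads with different settings without touching shared state.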
Re: AW: Multithreading FOP ?
Jeremias Maerki wrote:
> As far as I know, you can't assume that, because the statics are global within the context of a ClassLoader and not the JVM. Normally though, you have only a simple ClassLoader hierarchy. In that case your assumption is right. But it isn't as soon as some complex classloading is done (as in EJB servers, web containers, Avalon Phoenix etc). In that regard, Arnd could (probably!) have run two differently configured FOPs in one VM, but not without major headaches, I think. That's why I personally would like to get rid of all unnecessary statics that could be in the way of using FOP in a multithreaded and multi-configuration environment.

If I remember correctly from the Java 2 VM specification, a class is actually identified by the combination of its fully qualified name and its "defining" class loader. If a class of the same name is loaded (ok, "defined", to be precise) by two different class loaders, then these two classes ARE really considered different. This is roughly similar to the situation with class and package names: java.sql.Date and java.util.Date ARE different classes.

So, technically, globals ARE unique within a VM, because each static belongs to exactly one such class. This may sound like nit-picking, but then perhaps it's not. Not that I would advocate using statics based on this.. 8-)

If you think I'm wrong in this, please tell me. I think I know a few places where I rely on this (hopefully correct) knowledge.

Arnd Beissner
--
Cappelino Informationstechnologie GmbH
Arnd Beißner
Bahnhofstr. 3, 71063 Sindelfingen, Germany
Email: [EMAIL PROTECTED]
Phone: +49-7031-463458
Mobile: +49-173-3016917
Re: AW: Multithreading FOP ?
Hi Peter (Comments inline) On 14.04.2002 04:15:25 Peter B. West wrote: > Joerg, > > Thanks, it does answer my questions, and raises a few others. I'm > heartened by this, because what you have described is the inappropriate > use of global data in a multi-threaded context. I'm interested because > I like statics. They are smaller and faster; what's not to like? > Before continuing with them, though, I need to make sure that I > understand the problems. I agree with you that statics are convenient. I used to work with them a lot, but stopped in most cases since we started to use Avalon in our applications. They were too much of a stone in the way of using the same package/component multiple times in different ways. For example we had a static class for JNDI access. That's nice as long as you're accessing only one JNDI server. Today we've got an Avalon-style component named JNDIProvider that's configured over an XML file (using ExcaliburComponentManager). All we have to do to access a second JNDI server at the same time is to provide another JNDIProvider under a different name and use that name in the configuration of the JNDI-consuming component. > The scenario you describe is a doomed attempt to globalise local data. > However, there are times when some initialization of truly global data > is required, yet it cannot be accomplished with static{} blocks. What's > needed is a one-time initialization method. This can be synchronized. > Protect the initialization method or some initialization object, and > set a flag within the method to declare that the job is done. Test this > on entry, after synchronization. Because none of the processes which > require the data will attempt access until after init, access itself > does not need to be synchronized. (I'm assuming here that all statics > are global relative to the JVM.) As far as I know, you can't assume that, because the statics are global within the context of a ClassLoader and not the JVM. 
Normally though, you have only a simple ClassLoader hierarchy. In that case your assumption is right. But it isn't as soon as some complex classloading is done (as in EJB servers, web containers, Avalon Phoenix etc). In that regard, Arnd could (probably!) have run two differently configured FOPs in one VM, but not without major headaches, I think. That's why I personally would like to get rid of all unnecessary statics that could be in the way of using FOP in a multithreaded and multi-configuration environment. Configuration IMO should be done directly on the component and not in a global way. You gain a lot of flexibility.

For illustration, I've just done a quick search over the Avalon subprojects Framework, Excalibur and Phoenix to see how many non-final static variables are used. I've found a few, but I'm convinced that most should also have been marked final because they are actually constants. Most other instances are used in the context of wrapping other non-Avalon parts that make use of statics.

What I want to say is this: I believe that it's possible (and worthwhile) to remove the statics where FOP configuration is concerned. A possible way to do this is using the facilities that Avalon provides. The Driver class (or something similar) will be a container for all FOP-related components/classes and will be the central point of configuration. The container will propagate the configuration down to its children (following the Inversion of Control pattern). That way we can have something similar to the JAXP approach of configuration (as Joerg proposes).

> It seems to me, of what I have heard so far, that there is no problem with statics _per se_. If they are used with an awareness of the possibility of multi-threading, they should present no special difficulties. I have heard it said, though, that statics are forbidden in EJB environments. Is this true? If so, what are the special constraints that apply to EJBs?

See Arved's comment.
> Regarding configuration in FOP, it is interesting to note that there are > two different config hierarchies depending on whether the environment is > uniform, as, e.g., in a single thread, or diverse, as in the example > Arnd offered. (That is, a separate process constructs stylesheet > information and other variables into an instance-specific storage > location, and invokes a fop thread with a reference to that location.) > > In the first case, the config hierarchy is: > > system config > user config > command line > > despite the fact that the user config file may be specified on the > command line. Other data from the command line will override > assignments in the user config (else why specify them?) > > In the second scenario, the most instance-specific data is in the user > config file (if that is being used to pass the instance data) or in some > other instance-specific config source. So the hierarchy looks like: > > system config > command line > user config
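The precedence orders quoted above boil down to a stack of configuration layers in which a later layer overrides an earlier one. A minimal sketch, assuming a hypothetical `LayeredConfig` class and made-up option keys (nothing here is an existing FOP API):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical layered configuration: layers are listed lowest priority first. */
final class LayeredConfig {
    private final List<Map<String, String>> layers;

    LayeredConfig(List<Map<String, String>> layers) {
        this.layers = new ArrayList<>(layers);
    }

    /** Returns the value from the highest-priority layer that defines the key, or null. */
    String get(String key) {
        String value = null;
        for (Map<String, String> layer : layers) {
            if (layer.containsKey(key)) {
                value = layer.get(key); // later layers override earlier ones
            }
        }
        return value;
    }
}

final class LayeredConfigDemo {
    public static void main(String[] args) {
        Map<String, String> system = new HashMap<>();
        system.put("baseDir", "/etc/fop");
        system.put("strokeSVGText", "true");

        Map<String, String> user = new HashMap<>();
        user.put("baseDir", "/home/user/fop");

        Map<String, String> instance = new HashMap<>();
        instance.put("baseDir", "/var/spool/job-17");

        List<Map<String, String>> order = new ArrayList<>();
        order.add(system);   // lowest priority
        order.add(user);
        order.add(instance); // highest priority
        LayeredConfig config = new LayeredConfig(order);

        System.out.println(config.get("baseDir"));       // /var/spool/job-17
        System.out.println(config.get("strokeSVGText")); // true
    }
}
```

Switching between the two hierarchies discussed in the thread is then only a matter of the order in which the layers are added.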
Re: AW: Multithreading FOP ?
Peter B. West wrote:
> I like statics. They are smaller and faster; what's not to like? Before continuing with them, though, I need to make sure that I understand the problems.

Yes to smaller and faster, though if the existence of statics forces you to use synchronization, your mileage may vary 8-). And indeed there's a general - though not widely known - problem with statics (if properly encapsulated) in Java, as follows:

Since JDK 1.1, there's a concept of unloading of classes. If a class is implemented following the singleton pattern (for example using the widely used "private constructor + getInstance method" mechanism), you may have moments (unexpected by you) in the lifetime of the VM where only the class itself holds a reference to its single instance. If that happens, the class is eligible for garbage collection. If it is actually garbage collected, this either just impacts performance or crashes the app, because you never anticipated re-initialization of the class. I typically work around this by maintaining a "singleton registry" that keeps a reference to each singleton instance in my app. This problem does not occur very often, but if it occurs, it's pretty nasty. 8-)

> does not need to be synchronized. (I'm assuming here that all statics are global relative to the JVM.)

Yes, that's what the spec says (assuming that by 'global' you mean there's just the one instance).

> It seems to me, of what I have heard so far, that there is no problem with statics _per se_. If they are used with an awareness of the possibility of multi-threading, they should present no special difficulties. I have heard it said, though, that statics are forbidden in EJB environments. Is this true? If so, what are the special constraints that apply to EJBs?

Ok, concerning EJBs, statics (besides constants, of course) are forbidden because you don't instantiate the server-side EJBs, the application server does.
There's also an activation mechanism in addition to instantiation, used to implement EJB instance pooling on the server. Since this makes statics extremely dangerous in EJBs, they are forbidden. I wouldn't be surprised if some appservers even enforced this rule in their classloader.

> Regarding configuration in FOP, it is interesting to note that there are two different config hierarchies depending on whether the environment is uniform, as, e.g., in a single thread, or diverse, as in the example Arnd offered. (That is, a separate process constructs stylesheet information and other variables into an instance-specific storage location, and invokes a fop thread with a reference to that location.)

> In the second scenario, the most instance-specific data is in the user config file (if that is being used to pass the instance data) or in some other instance-specific config source. So the hierarchy looks like:
>
> system config
> command line
> user config
>
> or
>
> system config
> user config
> command line
> instance config
>
> I like the second idea better.

Me too. However, what about:

> system config
> user config
> instance config

To me, command line configuration is just an example of instance configuration, not an added level. In the FOP application, the commandline configuration IS the instance configuration. In embedded uses of FOP, the commandline never arrives in FOP's classes.

> Not knowing a great deal about JVMs and class loaders, I'm curious to know how dynamic data can be introduced into threads started within a pre-existing JVM. One solution of Arnd's problem would seem to be to control the process of setting up the FOP thread configuration subdirectories from within the JVM, and allow for new FOP objects to be initialised with this information. That is not a general solution to the problem though. How is it usually done?
Had this been possible with FOP, I'd have created two separate instances of FOP on startup of my application. Then, each instance would have run in a separate thread. As for a more general approach, I do not see a perfect one-size-fits-all solution. I see two extremes (perhaps the middle between the extremes is best, perhaps it's not): 1. On the one hand, removing all statics in FOP would give maximum performance for each single instance as synchronization could be eliminated completely. This should outweigh possible penalties for carrying references to "global" objects. 2. Use excessive pooling of "expensive" objects, including fonts, font metrics, factories of any kind, just everything that's expensive to create and somewhat likely to be reused by other "jobs" in a server environment. This is both much more complex to get right and also carries an overhead for synchronization in many places. Which way to go? This should depend on where FOP is expected to be used. For most of "my" applications, 2. would be better, though I could perfectly live with 1. Going wit
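
The "singleton registry" workaround mentioned earlier in this message could look roughly like the sketch below. All names (`SingletonRegistry`, `FontCache`) are hypothetical. Note as an assumption worth checking against your JVM: under the current Java Language Specification a class can be unloaded only when its defining class loader becomes unreachable, so on modern VMs such a registry is generally not needed for classes loaded by the application class loader.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry that holds a strong reference to every singleton instance. */
final class SingletonRegistry {
    private static final Map<Class<?>, Object> INSTANCES = new ConcurrentHashMap<>();

    private SingletonRegistry() {
    }

    /** Registers the instance unless one is already present; returns the registered one. */
    static <T> T register(Class<T> type, T instance) {
        Object previous = INSTANCES.putIfAbsent(type, instance);
        return type.cast(previous != null ? previous : instance);
    }

    /** Returns the registered instance, or null if none was registered. */
    static <T> T lookup(Class<T> type) {
        return type.cast(INSTANCES.get(type));
    }
}

/** Example singleton using "private constructor + getInstance". */
final class FontCache {
    private static FontCache instance;

    private FontCache() {
    }

    static synchronized FontCache getInstance() {
        if (instance == null) {
            // The registry keeps a second reference besides the class's own static field.
            instance = SingletonRegistry.register(FontCache.class, new FontCache());
        }
        return instance;
    }
}

final class SingletonRegistryDemo {
    public static void main(String[] args) {
        FontCache first = FontCache.getInstance();
        FontCache second = FontCache.getInstance();
        System.out.println(first == second);                                    // true
        System.out.println(first == SingletonRegistry.lookup(FontCache.class)); // true
    }
}
```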
RE: AW: Multithreading FOP ?
> -----Original Message-----
> From: Peter B. West [mailto:[EMAIL PROTECTED]]
> Sent: April 13, 2002 11:15 PM
> To: [EMAIL PROTECTED]
> Subject: Re: AW: Multithreading FOP ?
>
[ SNIP ]
> It seems to me, of what I have heard so far, that there is no problem with statics _per se_. If they are used with an awareness of the possibility of multi-threading, they should present no special difficulties. I have heard it said, though, that statics are forbidden in EJB environments. Is this true? If so, what are the special constraints that apply to EJBs?

Being distributed is the major factor. This is also true for servlets that are J2EE-compatible.

Arved
Re: AW: Multithreading FOP ?
Joerg, Thanks, it does answer my questions, and raises a few others. I'm heartened by this, because what you have described is the inappropriate use of global data in a multi-threaded context. I'm interested because I like statics. They are smaller and faster; what's not to like? Before continuing with them, though, I need to make sure that I understand the problems. The scenario you describe is a doomed attempt to globalise local data. However, there are times when some initialization of truly global data is required, yet it cannot be accomplished with static{} blocks. What's needed is a one-time initialization method. This can be synchronized. Protect the initialization method or some initialization object, and set a flag within the method to declare that the job is done. Test this on entry, after synchronization. Because none of the processes which require the data will attempt access until after init, access itself does not need to be synchronized. (I'm assuming here that all statics are global relative to the JVM.) It seems to me, of what I have heard so far, that there is no problem with statics _per se_. If they are used with an awareness of the possibility of multi-threading, they should present no special difficulties. I have heard it said, though, that statics are forbidden in EJB environments. Is this true? If so, what are the special constraints that apply to EJBs? Regarding configuration in FOP, it is interesting to note that there are two different config hierarchies depending on whether the environment is uniform, as, e.g., in a single thread, or diverse, as in the example Arnd offered. (That is, a separate process constructs stylesheet information and other variables into an instance-specific storage location, and invokes a fop thread with a reference to that location.) In the first case, the config hierarchy is: system config user config command line despite the fact that the user config file may be specified on the command line. 
Other data from the command line will override assignments in the user config (else why specify them?)

In the second scenario, the most instance-specific data is in the user config file (if that is being used to pass the instance data) or in some other instance-specific config source. So the hierarchy looks like:

system config
command line
user config

or

system config
user config
command line
instance config

I like the second idea better.

Not knowing a great deal about JVMs and class loaders, I'm curious to know how dynamic data can be introduced into threads started within a pre-existing JVM. One solution of Arnd's problem would seem to be to control the process of setting up the FOP thread configuration subdirectories from within the JVM, and allow for new FOP objects to be initialised with this information. That is not a general solution to the problem though. How is it usually done?

Peter

J.Pietschmann wrote:
> The problem is multiple threads accessing static class data, which is really global.
> Well, the standard scenario is: There are multiple threads sleeping while waiting for requests. One thread wakes up, sets the FOP baseDir, creates a Driver instance and starts rendering. Just before the thread is about to resolve a URI for an external graphic, it is suspended and another thread gets a chance to run. It reads its request, sets the global baseDir to something else, and is itself suspended in favour of the first thread, which reads the now changed value for baseDir from the configuration, and explodes.
>
> It doesn't help to make the Driver methods synchronized, because there are two instances of the driver object :-( You would have to lock the global configuration data so that the second thread would have to wait until the first finishes processing. Of course, this nullifies the advantages of using multithreading, especially on MP machines.
>
> I like the approach JAXP did for transformers.
> You have a factory where you can set default stuff so that you don't have to do this every time an individual processor is created, and you can override settings on the individual instances. The individual processor instances never access global data after creation.
>
> Does this answer your question(s)?
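
The one-time initialization described earlier in this message (synchronize, test a flag on entry, set it when done) might be sketched like this. The class `GlobalTables` and its contents are hypothetical. One assumption needs stating loudly: the unsynchronized read is safe only if every reading thread either calls `init()` itself first or is started after `init()` has completed, since thread start (like the synchronized block) provides the necessary visibility guarantee.

```java
/** Hypothetical holder of truly global, read-only data with one-time initialization. */
final class GlobalTables {
    private static boolean initialized = false;
    private static int initCount = 0; // only here to show that the work runs once
    private static String[] entityNames;

    private GlobalTables() {
    }

    /** Synchronized on the class; the flag is tested after the lock is acquired. */
    static synchronized void init() {
        if (initialized) {
            return; // job already done by an earlier caller
        }
        entityNames = new String[] {"amp", "lt", "gt"};
        initCount++;
        initialized = true;
    }

    /** Unsynchronized access; callers must have called init() or been started after it. */
    static String entityName(int index) {
        return entityNames[index];
    }

    static synchronized int initCount() {
        return initCount;
    }
}

final class OneTimeInitDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                GlobalTables.init();        // every worker may call it; only one does the work
                GlobalTables.entityName(0); // read without further locking
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(GlobalTables.initCount()); // 1
    }
}
```

This pattern suits data that never changes after setup. It does not help with per-request settings such as baseDir, which is the case Joerg describes.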
Re: AW: Multithreading FOP ?
Peter B. West wrote:
> Please indulge my ignorance again. May I assume that it is not possible to run two main()s in the same VM?

Not in the sense you probably mean.

> From this discussion so far I have gained much more insight into the nervousness about statics. Is the problem that servers want to execute multiple instances of classes within the one VM? Are there other problems?

The problem is multiple threads accessing static class data, which is really global.

Well, the standard scenario is: There are multiple threads sleeping while waiting for requests. One thread wakes up, sets the FOP baseDir, creates a Driver instance and starts rendering. Just before the thread is about to resolve a URI for an external graphic, it is suspended and another thread gets a chance to run. It reads its request, sets the global baseDir to something else, and is itself suspended in favour of the first thread, which reads the now changed value for baseDir from the configuration, and explodes.

It doesn't help to make the Driver methods synchronized, because there are two instances of the driver object :-( You would have to lock the global configuration data so that the second thread would have to wait until the first finishes processing. Of course, this nullifies the advantages of using multithreading, especially on MP machines.

I like the approach JAXP did for transformers. You have a factory where you can set default stuff so that you don't have to do this every time an individual processor is created, and you can override settings on the individual instances. The individual processor instances never access global data after creation.

Does this answer your question(s)?

J.Pietschmann
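
To make the scenario above concrete, here is a small sketch that replays the problematic interleaving step by step in a single thread, so the outcome is deterministic, and then shows the per-instance alternative. The classes `GlobalOptions`, `GlobalDriver` and `InstanceDriver` are hypothetical stand-ins, not FOP's actual Options or Driver classes.

```java
/** Hypothetical global configuration, shared by everything in the class loader. */
final class GlobalOptions {
    static String baseDir;

    private GlobalOptions() {
    }
}

/** Driver that reads the global setting at the moment it resolves a URI. */
final class GlobalDriver {
    String resolve(String href) {
        return GlobalOptions.baseDir + "/" + href;
    }
}

/** Driver that captured its own baseDir at creation and never looks at globals again. */
final class InstanceDriver {
    private final String baseDir;

    InstanceDriver(String baseDir) {
        this.baseDir = baseDir;
    }

    String resolve(String href) {
        return baseDir + "/" + href;
    }
}

final class BaseDirDemo {
    public static void main(String[] args) {
        // Request A: sets the global baseDir and creates its driver.
        GlobalOptions.baseDir = "/jobs/a";
        GlobalDriver driverA = new GlobalDriver();

        // Request B gets scheduled before A resolves its graphic.
        GlobalOptions.baseDir = "/jobs/b";

        // Request A resumes and now sees B's directory.
        System.out.println(driverA.resolve("logo.png")); // /jobs/b/logo.png

        // Per-instance configuration is immune to that interleaving.
        InstanceDriver safeA = new InstanceDriver("/jobs/a");
        InstanceDriver safeB = new InstanceDriver("/jobs/b");
        System.out.println(safeA.resolve("logo.png"));   // /jobs/a/logo.png
        System.out.println(safeB.resolve("logo.png"));   // /jobs/b/logo.png
    }
}
```

Synchronizing `resolve` on the driver would not change the first result, because the two requests use different driver objects while sharing one static field.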
Re: AW: Multithreading FOP ?
Folks,

Please indulge my ignorance again. May I assume that it is not possible to run two main()s in the same VM?

From this discussion so far I have gained much more insight into the nervousness about statics. Is the problem that servers want to execute multiple instances of classes within the one VM? Are there other problems?

Peter
Re: AW: Multithreading FOP ?
Peter B. West wrote:
> Please excuse my ignorance of these issues, but what mechanisms would folks expect to use to set per-invocation configurations for FOP?
>
>> One problem you may run across is that configuration in FOP is held in global objects.
>> Besides not being thread-safe you will not be able to run multiple FOP threads with different configuration settings. If you want to investigate this, look for the class Options.

I probably need to give some context for this:

What I wanted to do was optimize large-scale PDF generation with FOP. For a customer, I needed to format thousands of documents with 1 or 2 pages each (invoices and stuff). For each document, the appropriate stylesheet and associated files (logos, language or country specific files) are - if necessary - extracted from a database (that keeps everything) and stored in a subdirectory. Then FOP is invoked with basedir set to that subdirectory. If, for performance reasons, I want two instances of FOP on a multiprocessor machine, I cannot set basedir separately without starting two VMs. I finally decided to create a separate VM for each FOP "printer" and feed each FOP VM via RMI.

Ok, what would I want? First, I'd like to be able to create multiple instances of FOP without side-effects. I'd say this is an issue for servlet users, too. Second, I'd say almost everything should be configurable on a per-invocation basis. Logging is one exception that comes to mind.

By the way, there's a positive thing about FOP that I can report: With a single instance of FOP, I have produced about 10,000 PDFs of medium complexity (tables, small images, font embedding, separate) without any apparent memory leaks or instabilities. Peak memory usage on Linux during that job: about 30MB. I'd say that kind of stability is more than you could say for most commercial products. 8-)

Arnd Beissner
--
Cappelino Informationstechnologie GmbH
Arnd Beißner
Bahnhofstr.
3, 71063 Sindelfingen, Germany Email: [EMAIL PROTECTED] Phone: +49-7031-463458 Mobile: +49-173-3016917
Re: AW: Multithreading FOP ?
Please excuse my ignorance of these issues, but what mechanisms would folks expect to use to set per-invocation configurations for FOP?

Peter

Chaumette, Patrick wrote:
> Thanks for the infos,
>
> also got this from Arnd
>
> --
> One problem you may run across is that configuration in FOP is held in global objects.
> Besides not being thread-safe you will not be able to run multiple FOP threads with different configuration settings. If you want to investigate this, look for the class Options.
>
> I ran across this problem during invoice printing on multiprocessor machines and finally decided to start a separate VM for each FOP instance.
>
> Hope this helps,
>
> Arnd Beissner