Re: AW: Multithreading FOP ?

2002-05-13 Thread J.Pietschmann

Peter B. West wrote:
> Regarding configuration in FOP, it is interesting to note that there are 
> two different config hierarchies depending on whether the environment is 
> uniform, as, e.g., in a single thread, or diverse, as in the example 
> Arnd offered.  (That is, a separate process constructs stylesheet 
> information and other variables into an instance-specific storage 
> location, and invokes a fop thread with a reference to that location.)
> 
> In the first case, the config hierarchy is:
> 
> system config
> user config
> command line

This is an application-centric view. And these are the traditional
contexts used in many other applications in order to avoid repeated
setting of almost static stuff while still allowing flexibility.

We are dealing with other points of view:
- FOP used embedded in some other software consuming and producing
   byte streams
- FOP as part of other software performing its own drawing and user
   interaction
- FOP as a self contained application
Your config hierarchy applies only to the last.

The first application of FOP should only make minimal assumptions
about available infrastructure and other environmental issues. In
particular, it should not assume there are even config files or
command line parameters directly regarding FOP to read. There is
no use to distinguish between various config files and command line
parameters. The mechanisms used should be universally available,
like Java properties for very static stuff and fallback values
and interfaces for supplying user font data, for which there
could be a default implementation which reads the data from URLs,
which by default are resolved to certain file URLs.

Around this core could be a shell which provides for FOP specific
config files and command line parameter filtering, much like the
XWindows libraries provide similar stuff. This can be used both
for embedding FOP elsewhere as well as for a stand alone FOP
application.

For the core, I'm still thinking the factory pattern used by JAXP
is sensible: the factory is a singleton which serves as a template
for the FO processors. All default configuration values are kept
there, they are initialized with fallback values hard coded or
read from java properties when the factory is created and can
be changed at any time. After a processor is created, the settings
for this processor can still be changed until the formatting run
starts. After the run ends, the processor object is discarded (no
reset) and new processors are obtained from the factory. There
are no MT issues because all methods on the factory and the
processor object are synchronized at their respective objects,
and the objects hold all necessary data. This model suits the
servlet environment particularly well. It could be claimed that
some stuff from a processor run could be reused and should be
cached, for example the FO tree. But I think you can as well
cache the produced byte stream. It's another matter in case
of the AWT renderer, there could be possibly many rerenderings
from the same area tree (console refreshes, scrolling etc). But
then, the AWT rendering processor is not necessarily discarded
after each display refresh.
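
The factory model described above can be sketched roughly as follows. This
is only an illustration under stated assumptions: FoProcessorFactory,
FoProcessor, the "baseDir" key and the "fosketch.baseDir" system property
are all made-up names, not part of FOP or JAXP, and the "run" merely stands
in for a real formatting run.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical factory: a singleton holding defaults, handing out fresh processors. */
final class FoProcessorFactory {
    private static final FoProcessorFactory INSTANCE = new FoProcessorFactory();
    private final Map<String, String> defaults = new HashMap<>();

    private FoProcessorFactory() {
        // Hard-coded fallback, overridable through a Java system property (name is invented).
        defaults.put("baseDir", System.getProperty("fosketch.baseDir", "."));
    }

    static FoProcessorFactory getInstance() {
        return INSTANCE;
    }

    /** Defaults may be changed at any time; only processors created later see the change. */
    synchronized void setDefault(String key, String value) {
        defaults.put(key, value);
    }

    /** Each processor gets its own copy of the defaults, so nothing is shared afterwards. */
    synchronized FoProcessor newProcessor() {
        return new FoProcessor(new HashMap<>(defaults));
    }
}

/** Hypothetical single-use processor; settings are frozen once the run has started. */
final class FoProcessor {
    private final Map<String, String> settings;
    private boolean started;

    FoProcessor(Map<String, String> settings) {
        this.settings = settings;
    }

    synchronized void set(String key, String value) {
        if (started) {
            throw new IllegalStateException("formatting run already started");
        }
        settings.put(key, value);
    }

    synchronized String get(String key) {
        return settings.get(key);
    }

    /** Placeholder for the formatting run; a real processor would render output here. */
    synchronized String run() {
        started = true;
        return "formatted with baseDir=" + settings.get("baseDir");
    }
}
```

A caller would obtain the factory once, call newProcessor() per job,
override settings such as "baseDir" on that processor only, run it, and
then drop the object.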

Regards
J.Pietschmann


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, email: [EMAIL PROTECTED]




Re: AW: Multithreading FOP ?

2002-04-14 Thread Arnd Beißner

Jeremias Maerki wrote:

> As far as I know, you can't assume that, because the statics
> are global within the context of a ClassLoader and not the JVM.
> Normally though, you have only a simple ClassLoader hierarchy.
> In that case your assumption is right. But it isn't as soon as
> some complex classloading is done (as in EJB servers,
> web containers, Avalon Phoenix etc). In that regard, Arnd could
> (probably!) have run two differently configured FOPs in one VM,
> but not without major headaches, I think. That's why I personally
> would like to get rid of all unnecessary statics that could
> be in the way of using FOP in a multithreaded and multi-configuration
> environment.

If I remember correctly from the Java 2 VM specification, a class
specification is actually a combination of the full package name
and - the "defining" class loader. If a class of the same package name
is loaded (ok, "defined", to be precise) by two different class loaders,
then these two classes ARE really considered different. This is roughly
similar to the problem with class and package names: java.sql.Date and
java.util.Date ARE different classes. So, technically, globals ARE unique
within a VM.
This may sound like nit-picking, but then perhaps it's not.

Not that I would advocate using statics based on this.. 8-)

If you think I'm wrong in this, please tell me. I think I know a few
places where I rely on this (hopefully correct) knowledge.

Arnd Beissner
--
Cappelino Informationstechnologie GmbH
Arnd Beißner
Bahnhofstr. 3, 71063 Sindelfingen, Germany
Email: [EMAIL PROTECTED]
Phone: +49-7031-463458
Mobile: +49-173-3016917






Re: AW: Multithreading FOP ?

2002-04-14 Thread Jeremias Maerki

Hi Peter

(Comments inline)

On 14.04.2002 04:15:25 Peter B. West wrote:
> Joerg,
> 
> Thanks, it does answer my questions, and raises a few others.  I'm 
> heartened by this, because what you have described is the inappropriate 
> use of global data in a multi-threaded context.  I'm interested because 
> I like statics.  They are smaller and faster; what's not to like? 
>  Before continuing with them, though, I need to make sure that I 
> understand the problems.

I agree with you that statics are convenient. I used to work with them a
lot, but stopped in most cases since we started to use Avalon in our
applications. They were too much of a stone in the way of using the same
package/component multiple times in different ways.

For example we had a static class for JNDI access. That's nice as long
as you're accessing only one JNDI server. Today we've got an
Avalon-style component named JNDIProvider that's configured over an XML
file (using ExcaliburComponentManager). All we have to do to access a
second JNDI server at the same time is to provide another JNDIProvider
under a different name and use that name in the configuration of the
JNDI-consuming component.

> The scenario you describe is a doomed attempt to globalise local data. 
>  However, there are times when some initialization of truly global data 
> is required, yet it cannot be accomplished with static{} blocks.  What's 
> needed is a one-time initialization method.  This can be synchronized. 
>  Protect the initialization method or some initialization object, and 
> set a flag within the method to declare that the job is done.  Test this 
> on entry, after synchronization.  Because none of the processes which 
> require the data will attempt access until after init, access itself 
> does not need to be synchronized.  (I'm assuming here that all statics 
> are global relative to the JVM.)

As far as I know, you can't assume that, because the statics are global
within the context of a ClassLoader and not the JVM. Normally though,
you have only a simple ClassLoader hierarchy. In that case your
assumption is right. But it isn't as soon as some complex classloading
is done (as in EJB servers, web containers, Avalon Phoenix etc). In that
regard, Arnd could (probably!) have run two differently configured FOPs
in one VM, but not without major headaches, I think. That's why I
personally would like to get rid of all unnecessary statics that could
be in the way of using FOP in a multithreaded and multi-configuration
environment.

Configuration IMO should be done directly on the component and not in a
global way. You gain a lot of flexibility.

For illustration, I've just done a quick search over the Avalon
subprojects Framework, Excalibur and Phoenix to see how many non-final static
variables are used. I've found a few, but I'm convinced that most should
also have been marked final because they are actually constants. Most
other instances are used in a context of wrapping other non-Avalon parts
that make use of statics. What I want to say is this: I believe that
it's possible (and worthwhile) to remove the statics where FOP
configuration is concerned. A possible way to do this is using the
facilities that Avalon provides. The Driver class (or something similar)
will be a container for all FOP-related components/classes and will be
the central point of configuration. The container will propagate the
configuration down to its children (following the Inversion of Control
pattern). That way we can have something similar to the JAXP approach of
configuration (as Joerg proposes).

> It seems to me, of what I have heard so far, that there is no problem 
> with statics _per se_.  If they are used with an awareness of the 
> possibility of multi-threading, they should present no special 
> difficulties.  I have heard it said, though, that statics are forbidden 
> in EJB environments.  Is this true?  If so, what are the special 
> constraints that apply to EJBs?

See Arved's comment.

> Regarding configuration in FOP, it is interesting to note that there are 
> two different config hierarchies depending on whether the environment is 
> uniform, as, e.g., in a single thread, or diverse, as in the example 
> Arnd offered.  (That is, a separate process constructs stylesheet 
> information and other variables into an instance-specific storage 
> location, and invokes a fop thread with a reference to that location.)
> 
> In the first case, the config hierarchy is:
> 
> system config
> user config
> command line
> 
> despite the fact that the user config file may be specified on the 
> command line.  Other data from the command line will override 
> assignments in the user config (else why specify them?)
> 
> In the second scenario, the most instance-specific data is in the user 
> config file (if that is being used to pass the instance data) or in some 
> other instance-specific config source.  So the hierarchy looks like:
> 
> system config
> command line
> user config

Re: AW: Multithreading FOP ?

2002-04-14 Thread Arnd Beißner

Peter B. West wrote:

> I like statics.  They are smaller and faster; what's not to like? 
> Before continuing with them, though, I need to make sure that I 
> understand the problems.

Yes to smaller and faster, though if the existence of statics forces
you to use synchronization, your mileage may vary 8-). And indeed
there's a general - though not widely known - problem with statics
(if properly encapsulated) in Java, as follows:

Since JDK 1.1, there's a concept of unloading of classes. If a class
is implemented after the singleton pattern (for example using the
widely used "private constructor + getInstance method" mechanism), you may
have moments (unexpected by you) in the lifetime of the VM where
only the class itself holds a reference to its single instance.
If that happens, the class is eligible for garbage collection.
If it is actually garbage collected, this either just impacts performance
or crashes the app, because you never anticipated re-initialization of
the class. I typically work around this by maintaining a "singleton
registry" that keeps a reference to each singleton instance in my app.
This problem does not occur very often, but if it occurs, it's pretty
nasty. 8-)

> does not need to be synchronized.  (I'm assuming here that all statics 
> are global relative to the JVM.)

Yes, that's what the spec says (assuming that by 'global' you mean
there's just the one instance).

> It seems to me, of what I have heard so far, that there is no problem 
> with statics _per se_.  If they are used with an awareness of the 
> possibility of multi-threading, they should present no special 
> difficulties.  I have heard it said, though, that statics are forbidden 
> in EJB environments.  Is this true?  If so, what are the special 
> constraints that apply to EJBs?

Ok, concerning EJBs, statics (besides constants, of course) are forbidden
because you don't instantiate the server-side EJBs, the application server
does. There's also an activation mechanism in addition to instantiation,
used to implement EJB instance pooling on the server. Since this makes
statics extremely dangerous in EJBs, they are forbidden. I wouldn't be
surprised if some appservers would even enforce this rule in their
classloader.

> Regarding configuration in FOP, it is interesting to note that there are 
> two different config hierarchies depending on whether the environment is 
> uniform, as, e.g., in a single thread, or diverse, as in the example 
> Arnd offered.  (That is, a separate process constructs stylesheet 
> information and other variables into an instance-specific storage 
> location, and invokes a fop thread with a reference to that location.)

> In the second scenario, the most instance-specific data is in the user 
> config file (if that is being used to pass the instance data) or in some 
> other instance-specific config source.  So the hierarchy looks like:
>
> system config
> command line
> user config
>
> or
>
> system config
> user config
> command line
> instance config
>
>I like the second idea better.

Me too. However, what about:

> system config
> user config
> instance config

To me, command line configuration is just an example of instance
configuration, not an added level. In the FOP application, the command
line configuration IS the instance configuration. In embedded uses of
FOP, the command line never arrives in FOP's classes.
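
As a rough illustration of the three-level hierarchy (system, user,
instance), the layering can be expressed with the defaults chaining that
java.util.Properties already offers. LayeredConfig and the keys used in
the usage note are invented for this sketch; FOP's own configuration
classes are not shown here.

```java
import java.util.Properties;

/** Hypothetical helper: stacks three config levels, the most specific one winning. */
final class LayeredConfig {
    private LayeredConfig() {
    }

    /**
     * Builds system -> user -> instance. A lookup on the result checks the
     * instance level first, then falls back to user, then to system.
     */
    static Properties layer(Properties system, Properties user, Properties instance) {
        Properties userLayer = new Properties(system);        // system values act as defaults
        userLayer.putAll(user);                               // user entries override them
        Properties instanceLayer = new Properties(userLayer); // user layer acts as defaults
        instanceLayer.putAll(instance);                       // instance entries override both
        return instanceLayer;
    }
}
```

In a stand-alone application the instance level would be filled from the
command line; in an embedded use the host software would fill it directly,
for example with a per-job "baseDir", and read values back with
getProperty().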

> Not knowing a great deal about JVMs and class loaders, I'm curious to 
> know how dynamic data can be introduced into threads started within a 
> pre-existing JVM.  One solution of Arnd's problem would seem to be to 
> control the process of setting up the FOP thread configuration 
> subdirectories from within the JVM, and allow for new FOP objects to be 
> initialised with this information.  That is not a general solution of 
> the problem though.  How is it usually done?

Had this been possible with FOP, I'd have created two separate instances
of FOP on startup of my application. Then, each instance would have run
in a separate thread. As for a more general approach, I do not see a
perfect one-size-fits-all solution.

I see two extremes (perhaps the middle between the extremes is best,
perhaps it's not):

1. On the one hand, removing all statics in FOP would give maximum
   performance for each single instance as synchronization could be
   eliminated completely. This should outweigh possible penalties for
   carrying references to "global" objects.

2. Use excessive pooling of "expensive" objects, including fonts, font
   metrics, factories of any kind, just everything that's expensive to
   create and somewhat likely to be reused by other "jobs" in a server
   environment. This is both much more complex to get right and also
   carries an overhead for synchronization in many places.

Which way to go? This should depend on where FOP is expected to be used.
For most of "my" applications, 2. would be better, though I could
perfectly live with 1.
Going wit

RE: AW: Multithreading FOP ?

2002-04-13 Thread Arved Sandstrom

> -Original Message-
> From: Peter B. West [mailto:[EMAIL PROTECTED]]
> Sent: April 13, 2002 11:15 PM
> To: [EMAIL PROTECTED]
> Subject: Re: AW: Multithreading FOP ?
>
[ SNIP ]
> It seems to me, of what I have heard so far, that there is no problem
> with statics _per se_.  If they are used with an awareness of the
> possibility of multi-threading, they should present no special
> difficulties.  I have heard it said, though, that statics are forbidden
> in EJB environments.  Is this true?  If so, what are the special
> constraints that apply to EJBs?

Being distributed is the major factor. This is also true for servlets that
are J2EE-compatible.

Arved






Re: AW: Multithreading FOP ?

2002-04-13 Thread Peter B. West

Joerg,

Thanks, it does answer my questions, and raises a few others.  I'm 
heartened by this, because what you have described is the inappropriate 
use of global data in a multi-threaded context.  I'm interested because 
I like statics.  They are smaller and faster; what's not to like? 
 Before continuing with them, though, I need to make sure that I 
understand the problems.

The scenario you describe is a doomed attempt to globalise local data. 
 However, there are times when some initialization of truly global data 
is required, yet it cannot be accomplished with static{} blocks.  What's 
needed is a one-time initialization method.  This can be synchronized. 
 Protect the initialization method or some initialization object, and 
set a flag within the method to declare that the job is done.  Test this 
on entry, after synchronization.  Because none of the processes which 
require the data will attempt access until after init, access itself 
does not need to be synchronized.  (I'm assuming here that all statics 
are global relative to the JVM.)
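
The one-time initialization described here might look like the sketch
below. GlobalTables, InitDemo and the table contents are invented for
illustration. One caveat is worth stating: under the Java memory model the
unsynchronized read is only safe if every thread has itself passed through
the synchronized init() before reading, which is what the sketch assumes.

```java
/** Hypothetical holder for truly global, read-only data with one-time setup. */
final class GlobalTables {
    private static boolean initialized; // guarded by the class lock
    private static int initCount;       // counts real initializations, for the demo only
    private static String[] table;

    private GlobalTables() {
    }

    /** Synchronized on the class; the flag is tested on entry, after taking the lock. */
    static synchronized void init() {
        if (initialized) {
            return;
        }
        table = new String[] {"serif", "sans-serif", "monospace"};
        initCount++;
        initialized = true;
    }

    /** Unsynchronized read; callers are assumed to have called init() first. */
    static String lookup(int index) {
        return table[index];
    }

    static synchronized int initCount() {
        return initCount;
    }
}

/** Small driver: several threads race to initialize, only one does the work. */
final class InitDemo {
    private InitDemo() {
    }

    static int initFromThreads(int n) {
        Thread[] workers = new Thread[n];
        for (int i = 0; i < n; i++) {
            workers[i] = new Thread(() -> {
                GlobalTables.init();
                GlobalTables.lookup(0);
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return GlobalTables.initCount();
    }
}
```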

It seems to me, of what I have heard so far, that there is no problem 
with statics _per se_.  If they are used with an awareness of the 
possibility of multi-threading, they should present no special 
difficulties.  I have heard it said, though, that statics are forbidden 
in EJB environments.  Is this true?  If so, what are the special 
constraints that apply to EJBs?

Regarding configuration in FOP, it is interesting to note that there are 
two different config hierarchies depending on whether the environment is 
uniform, as, e.g., in a single thread, or diverse, as in the example 
Arnd offered.  (That is, a separate process constructs stylesheet 
information and other variables into an instance-specific storage 
location, and invokes a fop thread with a reference to that location.)

In the first case, the config hierarchy is:

system config
user config
command line

despite the fact that the user config file may be specified on the 
command line.  Other data from the command line will override 
assignments in the user config (else why specify them?)

In the second scenario, the most instance-specific data is in the user 
config file (if that is being used to pass the instance data) or in some 
other instance-specific config source.  So the hierarchy looks like:

system config
command line
user config

or

system config
user config
command line
instance config

I like the second idea better.

Not knowing a great deal about JVMs and class loaders, I'm curious to 
know how dynamic data can be introduced into threads started within a 
pre-existing JVM.  One solution of Arnd's problem would seem to be to 
control the process of setting up the FOP thread configuration 
subdirectories from within the JVM, and allow for new FOP objects to be 
initialised with this information.  That is not a general solution of 
the problem though.  How is it usually done?

Peter


J.Pietschmann wrote:

> The problem is multiple threads accessing static class data,
> which is really global.
> Well, the standard scenario is: There are multiple threads
> sleeping while waiting for requests. One thread wakes up,
> sets the FOP baseDir, creates a Driver instance and starts
> rendering. Just before the thread is about to resolve an URI
> for an external graphic, it is suspended and another thread
> gets a chance to run, it reads its request, sets the global
> baseDir to something else, and is itself suspended in favour
> of the first thread, which reads the now changed value for
> baseDir from the configuration, and explodes.
>
> It doesn't help to make the Driver methods synchronized,
> because there are two instances of the driver object :-(
> you would have to lock the global configuration data so
> that the second thread would have to wait until the first
> finishes processing. Of course, this nullifies the advantages
> of using multithreading, especially on MP machines.
>
> I like the approach JAXP did for transformers. You have
> a factory where you can set default stuff so that you
> don't have to do this every time an individual processor
> is created, and you can override settings on the individual
> instances. The individual processor instances never access
> global data after creation.
>
> Does this answer your question(s)?








Re: AW: Multithreading FOP ?

2002-04-13 Thread J.Pietschmann

Peter B. West wrote:
> Please indulge my ignorance again.  May I assume that it is not possible 
> to run two main()s in the same VM?

Not in the sense you probably mean.

>  From this discussion so far I have gained much more insight into the 
> nervousness about statics.  Is the problem that servers want to execute 
> multiple instances of classes within the one VM?  Are there other problems?

The problem is multiple threads accessing static class data,
which is really global.
Well, the standard scenario is: There are multiple threads
sleeping while waiting for requests. One thread wakes up,
sets the FOP baseDir, creates a Driver instance and starts
rendering. Just before the thread is about to resolve an URI
for an external graphic, it is suspended and another thread
gets a chance to run, it reads its request, sets the global
baseDir to something else, and is itself suspended in favour
of the first thread, which reads the now changed value for
baseDir from the configuration, and explodes.

It doesn't help to make the Driver methods synchronized,
because there are two instances of the driver object :-(
you would have to lock the global configuration data so
that the second thread would have to wait until the first
finishes processing. Of course, this nullifies the advantages
of using multithreading, especially on MP machines.
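
The interleaving described above can be replayed deterministically in a
single thread, which makes the failure easy to see. GlobalOptions,
SketchDriver and the paths are invented stand-ins, not FOP's actual
Options or Driver classes. Note that both driver methods are synchronized,
yet each driver locks only itself, so the shared static is unprotected.

```java
/** Stand-in for global configuration: one value per class loader. */
final class GlobalOptions {
    static String baseDir;

    private GlobalOptions() {
    }
}

/** Stand-in driver; synchronized methods lock this instance, not the global data. */
final class SketchDriver {
    synchronized void configure(String dir) {
        GlobalOptions.baseDir = dir;
    }

    synchronized String resolve(String relative) {
        return GlobalOptions.baseDir + "/" + relative;
    }
}

/** Replays the schedule from the text step by step, without real threads. */
final class InterleavingDemo {
    private InterleavingDemo() {
    }

    static String replay() {
        SketchDriver first = new SketchDriver();
        SketchDriver second = new SketchDriver();
        first.configure("/jobs/a");       // thread 1 sets its base directory
        second.configure("/jobs/b");      // thread 1 is suspended, thread 2 sets its own
        return first.resolve("logo.png"); // thread 1 resumes and reads thread 2's value
    }
}
```

Here replay() yields "/jobs/b/logo.png" for the first job, although that
job asked for "/jobs/a".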

I like the approach JAXP did for transformers. You have
a factory where you can set default stuff so that you
don't have to do this every time an individual processor
is created, and you can override settings on the individual
instances. The individual processor instances never access
global data after creation.
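
For reference, this is roughly how the JAXP pattern looks from the caller's
side, using the standard javax.xml.transform API. JaxpStyleDemo is an
invented wrapper, and the resolver is a do-nothing placeholder; the point
is only that a setting changed on one Transformer does not leak into
another one from the same factory.

```java
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;

/** Invented wrapper showing factory-level defaults and per-instance overrides. */
final class JaxpStyleDemo {
    private JaxpStyleDemo() {
    }

    static Transformer[] twoTransformers() {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            // Factory-level setting; returning null asks the processor to
            // resolve the URI itself.
            factory.setURIResolver((href, base) -> null);

            Transformer first = factory.newTransformer();  // identity transformer
            Transformer second = factory.newTransformer();

            // Per-instance override: affects only 'first'.
            first.setOutputProperty(OutputKeys.INDENT, "yes");
            return new Transformer[] {first, second};
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```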

Does this answer your question(s)?

J.Pietschmann






Re: AW: Multithreading FOP ?

2002-04-13 Thread Peter B. West

Folks,

Please indulge my ignorance again.  May I assume that it is not possible 
to run two main()s in the same VM?

 From this discussion so far I have gained much more insight into the 
nervousness about statics.  Is the problem that servers want to execute 
multiple instances of classes within the one VM?  Are there other problems?

Peter






Re: AW: Multithreading FOP ?

2002-04-12 Thread Arnd Beißner

Peter B. West wrote:
> Please excuse my ignorance of these issues, but what mechanisms would 
>folks expect to use to set per-invocation configurations for FOP?
>
>>One problem you may run across is that configuration in FOP is held in
>>global objects. Besides not being thread-safe you will not be able to
>>run multiple FOP threads with different configuration settings. If you
>>want to investigate this, look for the class Options.

I probably need to give some context for this:

What I wanted to do was optimize large-scale PDF generation with FOP. For a customer,
I needed to format thousands of documents with 1 or 2 pages each (invoices and stuff).
For each document, the appropriate stylesheet and associated files (logos, language or
country specific files) are - if necessary - extracted from a database (that keeps everything)
and stored in a subdirectory. Then FOP is invoked with basedir set to that subdirectory.
If, for performance reasons, I want two instances of FOP on a multiprocessor machine, I cannot
set basedir separately without starting two VMs.

I finally decided to create a separate VM for each FOP "printer" and feed each FOP VM via
RMI.

Ok, what would I want?

First, I'd like to be able to create multiple instances of Fop without side-effects.
I'd say this is an issue for servlet-users, too.

Second, I'd say almost everything should be configurable on a per-invocation basis.
Logging is one exception that comes to mind.

By the way, there's a positive thing about FOP that I can report:
With a single instance of FOP, I have produced about 10,000 PDFs of medium
complexity (tables, small images, font embedding, separate) without any apparent memory
leaks or instabilities. Peak memory usage on Linux during that job: about 30MB.
I'd say that kind of stability is more than you could say for most commercial products. 8-)

Arnd Beissner
--
Cappelino Informationstechnologie GmbH
Arnd Beißner
Bahnhofstr. 3, 71063 Sindelfingen, Germany
Email: [EMAIL PROTECTED]
Phone: +49-7031-463458
Mobile: +49-173-3016917


Re: AW: Multithreading FOP ?

2002-04-11 Thread Peter B. West

Please excuse my ignorance of these issues, but what mechanisms would 
folks expect to use to set per-invocation configurations for FOP?

Peter

Chaumette, Patrick wrote:

>Thanks for the infos,
>
>also got this from Arnd
>
>
>--
>One problem you may run across is that configuration in FOP is held in
>global objects. Besides not being thread-safe you will not be able to
>run multiple FOP threads with different configuration settings. If you
>want to investigate this, look for the class Options.
>
>I ran across this problem during invoice printing on multiprocessor
>machines and finally decided to start a separate VM for each FOP
>instance.
>
>Hope this helps, 
>
>Arnd Beissner 
>


