On 14.08.2011 14:18, Sylvain Wallez wrote:
On 12/08/11 21:08, Thorsten Scherler wrote:
Hi all,
I am migrating a customer's StAX-based code to Cocoon 3 (c3) StAX, since
the resulting code will be much more generic and easier to understand.
In my case I need to process all files from different folders, parse
them and invoke a second pipeline from the main pipe.
Meaning I have one principal pipeline which I need to repeat x times.
I started to create the pipeline and it works very nicely; however, I
have run into some drawbacks when reusing the pipe.
I found that you can execute a Java-based pipe exactly once. There
is no method to reset the pipe. My plan was to inject the pipeline
into my main code and then configure it on the fly (reusing the same pipe
on different files).
Furthermore, there is no way to dynamically change the individual
components once they have been added to the pipe.
I mean
Pipeline<StAXPipelineComponent> pipeStAX = new NonCachingPipeline<StAXPipelineComponent>();
pipeStAX.addComponent(new XMLGenerator(input));
...
pipeStAX.setup(System.out);
pipeStAX.execute();
Now my question is how people feel about:
a) Making Java-based pipes resettable, e.g. pipeStAX.reset()
b) Adding a method like pipeStAX.getComponent(int i) to retrieve the
component at position i.
a) What exactly should Pipeline.reset() do? (Besides calling reset on
each component)
And what should a component do during a reset?
I think components can be configured/set up as often as you like.
b) If you construct the components directly, can't you keep a reference
to them and just call the setters/methods directly when needed?
I guess I don't understand why the pipeline is not reusable in your case
or what you need to reconfigure between the runs.
Maybe you need x different pipelines for x different configurations?
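A minimal sketch of the "keep a reference and call the setters" idea from (b). FileSource, MiniPipeline and ReconfigureDemo are hypothetical stand-ins made up for illustration, not Cocoon 3 classes, and the sketch assumes the components carry no leftover state from one run to the next:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-ins for illustration only; these are NOT Cocoon 3 classes.
class ReconfigureDemo {

    // Plays the role of a generator whose input can be changed through a setter.
    static final class FileSource {
        private String path;
        void setPath(String path) { this.path = path; }
        String read() { return "content of " + path; }
    }

    // Reads from the source, then applies each transformation step in order.
    static final class MiniPipeline {
        private final FileSource source;
        private final List<UnaryOperator<String>> steps = new ArrayList<>();

        MiniPipeline(FileSource source) { this.source = source; }

        MiniPipeline add(UnaryOperator<String> step) {
            steps.add(step);
            return this;
        }

        String execute() {
            String data = source.read();
            for (UnaryOperator<String> step : steps) {
                data = step.apply(data);
            }
            return data;
        }
    }

    // Builds the pipeline once and re-runs it for every path.
    static List<String> runAll(List<String> paths) {
        FileSource source = new FileSource();   // the caller keeps this reference
        MiniPipeline pipe = new MiniPipeline(source).add(s -> "[" + s + "]");
        List<String> results = new ArrayList<>();
        for (String path : paths) {
            source.setPath(path);               // reconfigure via the kept reference
            results.add(pipe.execute());
        }
        return results;
    }

    public static void main(String[] args) {
        for (String result : runAll(Arrays.asList("a.xml", "b.xml"))) {
            System.out.println(result);
        }
    }
}
```

Whether the real components tolerate being re-run like this is exactly the open question in this thread; the sketch only shows that no getComponent(int) accessor is needed when the calling code constructs the components itself.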
Although reset() can allow pipeline reuse, it won't solve the problem
when you have multiple concurrent threads that could benefit from
reusing the pipeline.
Cocoon 2.x had component pools to allow reuse in a multithreaded
context while avoiding the big cost of reparsing the component's
configuration, but this proved to have a significant overhead.
A solution that wouldn't require many changes in the current API would
be to require pipelines and pipeline components to be Cloneable, so
that you could build a pipeline instance once at startup and then
clone it each time you need to use it. That would require component
writers to be careful about cloneability though.
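A rough sketch of what the Cloneable approach could look like. The Step contract, Prefixer and MiniPipeline are hypothetical names invented for this example; the existing Cocoon component interfaces do not declare clone(). It also shows where component writers would have to be careful: per-run state must be reset in the copy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; NOT the Cocoon 3 API.
class ClonePipelineDemo {

    // Assumed component contract: every step must provide a public clone().
    interface Step extends Cloneable {
        String apply(String in);
        Step clone();
    }

    static final class Prefixer implements Step {
        private final String prefix;  // configuration, safe to share between copies
        private int calls;            // per-run state, must not leak into copies

        Prefixer(String prefix) { this.prefix = prefix; }

        @Override public String apply(String in) {
            calls++;
            return prefix + in;
        }

        int calls() { return calls; }

        @Override public Prefixer clone() {
            try {
                Prefixer copy = (Prefixer) super.clone();
                copy.calls = 0;       // start the copy with fresh run state
                return copy;
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);  // cannot happen: we implement Cloneable
            }
        }
    }

    static final class MiniPipeline implements Cloneable {
        private List<Step> steps = new ArrayList<>();

        MiniPipeline add(Step step) {
            steps.add(step);
            return this;
        }

        String execute(String input) {
            String data = input;
            for (Step step : steps) {
                data = step.apply(data);
            }
            return data;
        }

        // Deep copy: the clone gets its own list and its own component instances.
        @Override public MiniPipeline clone() {
            try {
                MiniPipeline copy = (MiniPipeline) super.clone();
                copy.steps = new ArrayList<>();
                for (Step step : steps) {
                    copy.steps.add(step.clone());
                }
                return copy;
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);
            }
        }
    }

    public static void main(String[] args) {
        // Built once at startup; never executed directly.
        MiniPipeline prototype = new MiniPipeline().add(new Prefixer("x-"));
        // Each request or thread works on its own clone.
        System.out.println(prototype.clone().execute("a"));
        System.out.println(prototype.clone().execute("b"));
    }
}
```

The prototype itself is never executed, so its components stay in their configured but unused state.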
Sylvain
Pipelines are not thread-safe!
I think the effort required to make them thread-safe is far too great
given the (IMO negligible) benefits.
Since everyone can create their own pipeline components there is no way
to guarantee that it will work correctly all the time.
(I don't think "should work in multi-threaded environments if the
component developer didn't make a mistake" should appear in any
documentation)
In the case mentioned above (direct Pipeline API calls) component
instances are created by the user's code, so the responsibility of doing
that efficiently and correctly is the user's and not ours, IMO.
Something like a component factory / provider is currently well outside
the Pipeline API's responsibilities - actually it's part of the sitemap
- and I think it should stay that way.
I see the Pipeline API as a small library that provides some helpful
classes, which you use in a very controlled and precise manner (like
commons-lang, commons-io, etc.)
Not like a full execution environment with its own flow of control (you
get that when you use cocoon-servlet with sitemaps).
If you really need/want more efficient construction of components, give
this task to someone who specializes in that.
Make a Spring context and use prototype beans or even create an object
pool, or use some other dependency injection container you like.
I don't think we should try to compete with those frameworks on their
home field.
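To illustrate the idea without pulling in a container, here is a plain-Java sketch in which a java.util.function.Supplier stands in for a prototype-scoped bean: every call hands out a fresh pipeline, so nothing is shared between threads. OneShotPipeline and PipelineFactoryDemo are hypothetical names, and the pipeline is deliberately single-use so that accidental reuse would fail loudly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

// Hypothetical types for illustration only; NOT the Cocoon 3 API.
class PipelineFactoryDemo {

    // A deliberately single-use, non-thread-safe pipeline stand-in.
    static final class OneShotPipeline {
        private final String suffix;
        private boolean used;

        OneShotPipeline(String suffix) { this.suffix = suffix; }

        String execute(String input) {
            if (used) {
                throw new IllegalStateException("pipeline already executed");
            }
            used = true;
            return input + suffix;
        }
    }

    // Each task asks the factory for its own instance; results keep input order.
    static List<String> processAll(List<String> inputs, Supplier<OneShotPipeline> factory) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String input : inputs) {
                Callable<String> task = () -> factory.get().execute(input);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // In a DI container this supplier would be replaced by a prototype-scoped bean lookup.
        Supplier<OneShotPipeline> factory = () -> new OneShotPipeline(".out");
        System.out.println(processAll(Arrays.asList("a", "b", "c"), factory));
    }
}
```

The construction logic lives entirely in the supplier, which is the part a container or an object pool would take over.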
Steven