Re: Reuse of pipelines in java

Steven Dolg Tue, 16 Aug 2011 00:26:01 -0700

On 14.08.2011 14:18, Sylvain Wallez wrote:

On 12/08/11 21:08, Thorsten Scherler wrote:

Hi all,


I am migrating a StAX development from a customer to c3 StAX, since the
resulting code will be much more generic and understandable.

In my case I need to process all files from different folders, parse
them and invoke a second pipeline from the main pipe.

Meaning I have one principal pipeline which I need to repeat x times.
I started to create the pipeline and it works very nicely; however, I
have run into some downsides when reusing the pipe.

I found that you can execute a Java-based pipe exactly once. There
is no method to reset the pipe. My plan was to inject the pipeline
into my main code and then configure it on the fly (reusing the same
pipe on different files).

Furthermore, there is no way to dynamically change the individual
components once they have been added to the pipe.

I mean

Pipeline<StAXPipelineComponent> pipeStAX = new NonCachingPipeline<StAXPipelineComponent>();
pipeStAX.addComponent(new XMLGenerator(input));
...
pipeStAX.setup(System.out);
pipeStAX.execute();

Now my question is how people feel about:
a) Making Java-based pipes resettable: pipeStAX.reset()
b) Adding a method like pipeStAX.getComponent(int i) to retrieve the
component at position i.

a) What exactly should Pipeline.reset() do? (Besides calling reset on
each component)

And what should a component do during a reset?
I think components can be configured/set up as often as you like.

b) If you construct the components directly, can't you keep a reference
to them and just call the setters/methods directly when needed?
I guess I don't understand why the pipeline is not reusable in your case
or what you need to reconfigure between the runs.

Maybe you need x different pipelines for x different configurations?
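
To make the "keep a reference" suggestion concrete, here is a minimal
sketch. ToySource and ToyPipeline are hypothetical stand-ins written for
this illustration, not Cocoon classes, and the sketch assumes a pipeline
that tolerates repeated execute() calls, which is exactly the point the
thread leaves open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-ins; none of these types belong to the Cocoon Pipeline API.
class ToySource {
    private String input = "";

    void setInput(String input) { this.input = input; }

    String read() { return input; }
}

class ToyPipeline {
    private final ToySource source;
    private final List<UnaryOperator<String>> steps = new ArrayList<>();

    ToyPipeline(ToySource source) { this.source = source; }

    ToyPipeline addStep(UnaryOperator<String> step) {
        steps.add(step);
        return this;
    }

    // Runs every step over whatever the source currently holds.
    String execute() {
        String value = source.read();
        for (UnaryOperator<String> step : steps) {
            value = step.apply(value);
        }
        return value;
    }
}

class KeepReferenceSketch {
    public static void main(String[] args) {
        ToySource source = new ToySource();   // reference kept by the caller
        ToyPipeline pipeline = new ToyPipeline(source)
                .addStep(String::trim)
                .addStep(String::toUpperCase);

        for (String file : List.of(" a.xml ", " b.xml ")) {
            source.setInput(file);            // reconfigure through the kept reference
            System.out.println(pipeline.execute());
        }
    }
}
```

With this shape no getComponent(int) accessor is needed, because the
caller never lets go of the component it wants to reconfigure.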

Although reset() can allow pipeline reuse, it won't solve the problem
when you have multiple concurrent threads that could benefit from
reusing the pipeline.
Cocoon 2.x had component pools to allow reuse in a multithreaded
context while avoiding the big cost of reparsing the component's
configuration, but this proved to have a significant overhead.
A solution that wouldn't require many changes in the current API would
be to require pipelines and pipeline components to be Cloneable, so
that you could build a pipeline instance once at startup and then
clone it each time you need to use it. That would require component
writers to be careful about cloneability though.
Sylvain
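
The Cloneable idea could look roughly like the sketch below. The types
CloneableStep, PrefixStep and PrototypePipeline are invented for this
illustration and are not part of the Cocoon Pipeline API. The sketch
assumes a deep copy, so that a clone never shares mutable component
state with the prototype it came from.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical component contract; not part of the Cocoon Pipeline API.
interface CloneableStep extends Cloneable {
    String apply(String in);

    CloneableStep clone();
}

class PrefixStep implements CloneableStep {
    private String prefix;

    PrefixStep(String prefix) { this.prefix = prefix; }

    void setPrefix(String prefix) { this.prefix = prefix; }

    @Override
    public String apply(String in) { return prefix + in; }

    @Override
    public PrefixStep clone() {
        try {
            // A field-by-field copy is enough here: the only field is an immutable String.
            return (PrefixStep) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

class PrototypePipeline implements Cloneable {
    private List<CloneableStep> steps = new ArrayList<>();

    PrototypePipeline add(CloneableStep step) {
        steps.add(step);
        return this;
    }

    String execute(String input) {
        String value = input;
        for (CloneableStep step : steps) {
            value = step.apply(value);
        }
        return value;
    }

    // Deep copy: every step is cloned, so copies never share mutable state.
    @Override
    public PrototypePipeline clone() {
        try {
            PrototypePipeline copy = (PrototypePipeline) super.clone();
            copy.steps = new ArrayList<>();
            for (CloneableStep step : steps) {
                copy.steps.add(step.clone());
            }
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

class CloneSketch {
    public static void main(String[] args) {
        PrefixStep prefix = new PrefixStep("out/");
        PrototypePipeline prototype = new PrototypePipeline().add(prefix);

        PrototypePipeline perRun = prototype.clone();
        prefix.setPrefix("changed/");         // touches the prototype's step only

        System.out.println(perRun.execute("a.xml"));
        System.out.println(prototype.execute("a.xml"));
    }
}
```

The care required of component writers shows up in PrefixStep.clone():
a component holding a mutable field (a buffer, a stream, a collection)
would have to copy that field explicitly, or the clones would interfere
with each other.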


Pipelines are not thread-safe!

I think the effort required to make them thread-safe is far too great
given the (IMO negligible) benefits.
Since everyone can create their own pipeline components there is no way
to guarantee that it will work correctly all the time.
(I don't think "should work in multi-threaded environments if the
component developer didn't make a mistake" should appear in any
documentation)

In the case mentioned above (direct Pipeline API calls) component
instances are created by the user's code, so the responsibility of doing
that efficiently and correctly is the user's and not ours, IMO.
Something like a component factory / provider is currently well outside
the Pipeline API's responsibilities - actually it's part of the sitemap
- and I think it should stay that way.
I see the Pipeline API as a small library that provides some helpful
classes, which you use in a very controlled and precise manner (like
commons-lang, commons-io, etc.)
Not like a full execution environment with its own flow of control (you
get that when you use cocoon-servlet with sitemaps).

If you really need/want more efficient construction of components, give
this task to someone who specializes in that.
Make a Spring context and use prototype beans or even create an object
pool, or use some other dependency injection container you like.
I don't think we should try to compete with those frameworks on their
home field.


Steven
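
As a rough illustration of handing construction to something outside
the pipeline, the sketch below uses a plain java.util.function.Supplier
in the role that a prototype-scoped bean or an object pool would play.
OneShotPipeline and FactorySketch are hypothetical names made up for
this example; the single-use check is an assumption that mirrors the
"execute exactly once" behaviour described at the start of the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

// Hypothetical single-use pipeline; not the Cocoon Pipeline API.
class OneShotPipeline {
    private final List<UnaryOperator<String>> steps = new ArrayList<>();
    private boolean executed;

    OneShotPipeline add(UnaryOperator<String> step) {
        steps.add(step);
        return this;
    }

    String execute(String input) {
        if (executed) {
            throw new IllegalStateException("pipeline already executed");
        }
        executed = true;
        String value = input;
        for (UnaryOperator<String> step : steps) {
            value = step.apply(value);
        }
        return value;
    }
}

class FactorySketch {
    // Every get() builds a fresh, fully configured pipeline instance.
    static final Supplier<OneShotPipeline> FACTORY =
            () -> new OneShotPipeline()
                    .add(String::trim)
                    .add(s -> "<doc>" + s + "</doc>");

    public static void main(String[] args) {
        for (String name : List.of(" a ", " b ")) {
            // One instance per run: nothing to reset, nothing shared between runs.
            System.out.println(FACTORY.get().execute(name));
        }
    }
}
```

Because each caller obtains its own instance, the same pattern carries
over to several threads without making the pipeline itself thread-safe.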
