RE: Cocoon resource publishing

Unico Hommes Wed, 25 Jun 2003 09:34:25 -0700


> > > I think the alternative constructor would do it definitely.
> > > Although I would miss all the functionality that CocoonBean
> > > provides for creating and initializing Cocoon.
> 
> > <uv>
> > So what functionality are you referring to here? Surely you don't
> > want to go configuring the Cocoon instance if it has already been
> > configured elsewhere (by its containing servlet). </uv>
> >
> > <uh>
> > No, but I could use it wherever I do create and configure it. Be
> > that in a servlet or in an Avalon embedded component. Although it
> > wouldn't be a problem to do that without the help of a utility
> > class and I am doing exactly that right now, my suggestion was that
> > it might be useful. </uh>
> 
> Okay. But why do you want to create a Cocoon object as 
> independent from the Cocoon bean? Why can't the bean create 
> and configure it for you?


1) because CocoonBean is single threaded and I need to run concurrent
requests.
2) because I want to share the same Cocoon instance for performance.

> > <uv>
> > Are you suggesting that we might want the bean to be able to
> > generate a page from a webapp that isn't being served by the
> > servlet? </uv>
> > 
> > <uh>
> > I'm saying that I see two separate areas of concern that CocoonBean 
> > could be useful for me. One is to help create and configure Cocoon, 
> > the other is to help run Cocoon and gather information about these 
> > runs. The former would be useful for me in a thread safe type 
> > component where I create and manage a shared cocoon instance, the 
> > latter I'd like to use in the per-thread situation of a publication 
> > request. </uh>
> 
> I'm afraid I don't quite understand. Can you explain more? I 
> don't quite understand what you mean by these two scenarios. 
> You want the bean to create and configure cocoon (which I 
> presume it already does). By the latter, are you referring to 
> stuff like following links?

Yes. The code in CocoonBean can be separated into two categories. One is
for creating and configuring the Cocoon instance it is going to use, the
other is using it. Cocoon is thread-safe. CocoonBean is not.
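
To make the threading point concrete, here is a minimal sketch of the arrangement I have in mind: one shared, thread-safe instance and one lightweight, non-thread-safe bean per thread. SharedProcessor and PublishingBean are hypothetical stand-ins, not the actual Cocoon or CocoonBean classes, and the constructor taking an existing instance is the proposed one, not something that exists today.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the shared, thread-safe Cocoon instance.
final class SharedProcessor {
    private final AtomicInteger served = new AtomicInteger();

    // Only touches an atomic counter, so concurrent calls are safe.
    String process(String uri) {
        served.incrementAndGet();
        return "<page uri='" + uri + "'/>";
    }

    int served() {
        return served.get();
    }
}

// Hypothetical stand-in for a bean built around an existing instance.
// It keeps per-run state in a plain list, so it is NOT thread-safe:
// create one per thread.
final class PublishingBean {
    private final SharedProcessor processor;
    private final List<String> generated = new ArrayList<>();

    PublishingBean(SharedProcessor processor) {
        this.processor = processor;
    }

    void publish(String uri) {
        generated.add(processor.process(uri));
    }

    List<String> generated() {
        return generated;
    }
}

final class SharedInstanceDemo {
    // Starts `threads` workers; each gets its own bean around one shared
    // processor. Returns the total number of requests the processor saw.
    static int run(int threads, int urisPerThread) {
        SharedProcessor shared = new SharedProcessor();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                PublishingBean bean = new PublishingBean(shared);
                for (int i = 0; i < urisPerThread; i++) {
                    bean.publish("/doc/" + id + "/" + i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.served();
    }
}
```

The expensive object is created once and the per-run bookkeeping stays confined to the thread that owns the bean.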


> > <uv>
> > So, what do you mean by 'shared between clients'? 
> > </uv>
> > 
> > <uh>
> > I mean that the same Cocoon instance be accessible from different
> > locations in the system. One client of this "Cocoon service" would
> > be the HttpServlet that handles regular http calls. Another a mail
> > system that publishes pages over smtp, and yet another that
> > receives commands to push pages onto a remote server as part of a
> > publication action.
> 
> Why do you want to share the same Cocoon instance? Is it a 
> performance thing?

Yes.

> 
> > Actually that was what I started out from, because the publication
> > system I have now follows this exact approach. It accesses the same
> > Cocoon instance that is used to handle http requests as well. Since
> > it is separated from the http servlet I needed to put the Cocoon
> > instance somewhere from where I could access it from both
> > locations.
> 
> So are you saying that you create a CocoonBean around the 
> Cocoon instance created by the HTTP Servlet and hand that 
> bean over to other systems for them to use in rendering URIs?

No, I have an Avalon component that creates and manages Cocoon. The http
servlet and publishing service get it from a ServiceManager they
receive. I don't use CocoonBean now.
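
Roughly, the setup looks like the sketch below. This is a simplified illustration only: SimpleServiceManager is a hypothetical registry that imitates lookup-by-role and is not Avalon's ServiceManager, and ManagedCocoon, ServletClient and PublisherClient are made-up names standing in for my real components.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal registry that imitates lookup-by-role.
// It is NOT Avalon's ServiceManager, just enough to show the idea.
final class SimpleServiceManager {
    private final Map<String, Object> services = new HashMap<>();

    void register(String role, Object service) {
        services.put(role, service);
    }

    Object lookup(String role) {
        Object service = services.get(role);
        if (service == null) {
            throw new IllegalArgumentException("No service for role: " + role);
        }
        return service;
    }
}

// Hypothetical component that creates and owns the single instance.
final class ManagedCocoon {
    static final String ROLE = "ManagedCocoon";

    String render(String uri) {
        return "rendered:" + uri;
    }
}

// Stand-in for the http servlet: obtains the instance, never creates it.
final class ServletClient {
    private final ManagedCocoon cocoon;

    ServletClient(SimpleServiceManager manager) {
        this.cocoon = (ManagedCocoon) manager.lookup(ManagedCocoon.ROLE);
    }

    ManagedCocoon cocoon() {
        return cocoon;
    }
}

// Stand-in for the publishing service: same lookup, same instance.
final class PublisherClient {
    private final ManagedCocoon cocoon;

    PublisherClient(SimpleServiceManager manager) {
        this.cocoon = (ManagedCocoon) manager.lookup(ManagedCocoon.ROLE);
    }

    ManagedCocoon cocoon() {
        return cocoon;
    }
}
```

Both clients end up holding the same object, so neither one has to know how it was created or configured.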

> > But it isn't necessarily required for me to keep these two clients
> > separated. It's just the way it works for me now, and divorcing
> > Cocoon management from the code using it, is something that would
> > be generally useful I think. </uh>
> 
> That is the primary purpose of the bean.

Except that it's not usable in a multi-threaded environment unless you
create one CocoonBean per thread.

> > The basic requirement that I have is that of a webservice that
> > pushes files onto a target location such as a remote FTP server. I
> > consider two different approaches. One is to integrate the service
> > with Cocoon as it now runs as a servlet or come up with an
> > implementation that is separate from it.
> 
> This is exactly what I have in mind for the bean/cli. I made 
> the bean write to ModifiableSources so that it can write 
> directly to such things as remote FTP servers. So I would 
> create a PublishingGenerator or PublishingTransformer that 
> takes in a configuration (like the current cli.xconf) which 
> tells the publisher what to do.

The first generation of my publisher was actually triggered by a Cocoon
Action. Sitemap parameters controlled the way the publisher was run. We
experienced problems with it because it ate memory, crashing our system
when publication requests came in concurrently. Apart from that,
configuration was becoming a drag for our sitemap editors and clogged up
the sitemaps with unreadable action logic.

The problem with running the publisher from sitemap components is that
there is no access to the Cocoon object from there (it would be a rather
awkward design too). So requests can't be run locally.

> That then gets hold of a Bean and hands the info to the bean 
> for processing. The bean is then responsible for generating 
> the pages and dispatching them to their final location (by 
> simply opening an output stream on a modifiable source). The 
> Bean could then write to a specified listener every time a 
> file write is completed, which is then passed on to the next 
> stage of the pipeline.
> 
> So if you get it so that the Cocoon HTTP servlet can call the 
> bean, you get delivery to multiple destinations via multiple 
> protocols pretty much for free. If you need to create 
> modifiableSources for your protocols, you can then also give 
> those sources to the whole Avalon and Cocoon communities.

:) As it happens I committed a patch for an FTPSource to Avalon a few
days ago that was applied yesterday.

> 
> > In the former case, publication requests might involve a protocol
> > that is similar to Cocoon views. I am thinking that the sitemap
> > could have a "targets" section that defines publication locations,
> > the default being just the stream to the requesting client browser.
> > Then, similar to the way views work, a cocoon request could be made
> > by optionally specifying the target to publish to:
> > http://cocoon/resource-to-publish?cocoon-target=myliveserver
> >
> > The other possibility is keeping the publication service separate
> > from the CocoonServlet. Configuration would be outside the sitemap
> > and I would need a way to share CocoonServlet's Cocoon and
> > PublicationService's Cocoon.
> 
> I would stick for the moment with the xconf format that the
> bean (well, actually the CLI) uses. It would be possible to make
> the bean configure itself from an xconf file passed in as SAX
> events, or as a Configuration object, which would enable it to be
> configured in a number of ways.
> 
> But if the publicationService is just running as a Cocoon 
> sitemap component, then I'd suggest it gets its configuration 
> from its incoming SAX stream (probably identified with an 
> appropriate namespace). Then the site builder can decide 
> exactly where the info comes from, and adapt it as required, 
> even at runtime.
> 
> > I think I prefer the first approach because it wouldn't require
> > additional setup and configuration is right where you'd expect it
> > to be. Seems cleaner somehow. It also circumvents the issue of
> > sharing the Cocoon instances we were discussing before. On the
> > other hand, http being a request-response type of affair, the
> > browser response is not well defined. There's also the additional
> > complexity added to the core of Cocoon.
> 
> To my mind, the browser response in such a situation is a 
> report saying whether or not the generation of pages was successful.

That's an option. However, some publications can take a long time when
they also involve crawling and the sites are large and/or slow. Also,
publication processing like this is very resource intensive. For this
reason we had to process the requests asynchronously, queueing them if
a lot of concurrent requests came in.
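
As a rough illustration of that queueing, here is a minimal sketch using a fixed-size thread pool from java.util.concurrent. PublicationQueue and its publish() placeholder are hypothetical names, not our production code; the point is only that submitting returns at once while a bounded number of runs execute and the rest wait their turn.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical queue for publication runs.
final class PublicationQueue {
    private final ExecutorService workers;

    PublicationQueue(int maxConcurrentRuns) {
        // A fixed pool runs at most maxConcurrentRuns tasks at a time;
        // further submissions wait in the pool's internal queue.
        this.workers = Executors.newFixedThreadPool(maxConcurrentRuns);
    }

    // Returns immediately. The Future completes when the run has finished,
    // so the caller can answer the client right away and report later.
    Future<String> submit(String uri, String target) {
        return workers.submit(() -> publish(uri, target));
    }

    // Placeholder for the real, expensive work (crawling, rendering,
    // writing to the target location).
    private String publish(String uri, String target) {
        return "published " + uri + " to " + target;
    }

    // Stops accepting new runs; already queued runs still complete.
    void shutdown() {
        workers.shutdown();
    }
}
```

With something like this, the response to the requesting browser can simply acknowledge that the run was accepted, and the success report can be collected from the Future once the run is done.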

> 
> I would say there's two sides to the approach I would 
> recommend: small steps, and keep it compatible where 
> possible. Let's identify small ways we can get the 
> functionality we want, whilst respecting the interfaces that 
> others are quite possibly using. So, what first small steps 
> would assist you in your work?

Well, I'd like to use CocoonBean for publication runs, but I want to
reuse an existing Cocoon instance. So I think adding a constructor to
CocoonBean like you proposed would be a good start.

After that there are several possibilities:

1) Separate publisher service:

- In order to create a CocoonServlet subclass that gets its Cocoon from
elsewhere instead of creating it itself: change the access modifier of
the CocoonServlet.getCocoon() method from private to protected.

- A CocoonService that wraps Cocoon to embed it in an Avalon container (I
already have part of that code) and/or a JNDI resource factory that makes
the Cocoon object available to the webapp through JNDI (this does not
require running an Avalon-enabled web environment).

- Write publisher service.

2) Integrated publisher service:

- Decide on a protocol for issuing publication requests. E.g.
http://cocoon/resource-to-publish?cocoon-target=named-target

- Decide on a place to put configuration. (web.xml, sitemap,
cocoon.xconf, targets.xconf?)

- Modify CocoonServlet to understand publication requests.

3) ... ?
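
For option 2, the request protocol could be handled along the lines of the sketch below. Only the cocoon-target parameter name comes from the example URL above; TargetResolver, the "client" default name and the FTP location used in the usage note are assumptions made up for illustration, and nothing like this exists in CocoonServlet today.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical resolver mapping a request to a named publication target.
final class TargetResolver {
    static final String PARAM = "cocoon-target";
    // Assumed name for the default: stream back to the requesting client.
    static final String DEFAULT_TARGET = "client";

    private final Map<String, String> targets = new LinkedHashMap<>();

    // Registers a named target, e.g. read from a "targets" config section.
    void define(String name, String location) {
        targets.put(name, location);
    }

    // Returns the target name given in the query string, or the default
    // when the parameter is absent.
    static String requestedTarget(String url) {
        String query = URI.create(url).getQuery();
        if (query == null) {
            return DEFAULT_TARGET;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals(PARAM)) {
                return pair.substring(eq + 1);
            }
        }
        return DEFAULT_TARGET;
    }

    // Resolves the request to a configured location; unknown names fail
    // loudly instead of silently falling back to the client stream.
    String resolve(String url) {
        String name = requestedTarget(url);
        if (name.equals(DEFAULT_TARGET)) {
            return DEFAULT_TARGET;
        }
        String location = targets.get(name);
        if (location == null) {
            throw new IllegalArgumentException("Unknown target: " + name);
        }
        return location;
    }
}
```

For example, after define("named-target", "ftp://example.org/live"), a request for http://cocoon/resource-to-publish?cocoon-target=named-target would resolve to that FTP location, while the same URL without the parameter would resolve to the default client stream.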


Judging from the fact that most cocooners aren't exactly milling around in
excitement about this publisher service, I think keeping it as a separate
component would be preferred.


Regards,
Unico
