Nicola Ken Barozzi wrote:
...

Why is it so slow?
Mostly because it generates each source three times.

* to get the links.

* to get the mime type, for each link whose mime type is not known yet

* to get the page itself

Note: It gets the page with all the links translated, using the data gathered in the previous step.
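
The three generations per source described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `SITE`, `generate`, `get_links`, `get_type`, `get_page` and `crawl` are hypothetical names invented for the illustration, not Cocoon classes or methods, and the handling of the start page's own type is a guess.

```python
from collections import Counter

# Hypothetical in-memory "site": URI -> (mime type, outgoing links).
SITE = {
    "index.html": ("text/html", ["a.html", "logo.png"]),
    "a.html": ("text/html", ["index.html"]),
    "logo.png": ("image/png", []),
}

generations = Counter()  # how many times each source has been generated


def generate(uri):
    """Stand-in for one full pipeline run; every call costs one generation."""
    generations[uri] += 1
    return SITE[uri]


def get_links(uri):
    return generate(uri)[1]


def get_type(uri):
    return generate(uri)[0]


def get_page(uri, types):
    mime, links = generate(uri)
    # Links are translated with the types gathered in the earlier calls.
    return mime, [(link, types[link]) for link in links]


def crawl(start):
    types, pages, queue = {}, {}, [start]
    while queue:
        uri = queue.pop(0)
        if uri in pages:
            continue
        links = get_links(uri)                # call 1: the links
        for link in links:
            if link not in types:             # only links of unknown type
                types[link] = get_type(link)  # call 2: the mime type
        if uri not in types:                  # assumption: start page only
            types[uri] = get_type(uri)
        pages[uri] = get_page(uri, types)     # call 3: the page itself
        queue.extend(links)
    return pages
```

In this model, after `crawl("index.html")` every entry in `generations` is 3, which is the cost the rest of the discussion tries to reduce.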


To do this it uses two environments, the FileSavingEnvironment and the LinkSamplingEnvironment.

...

The three calls to Cocoon can be reduced quite easily to two, by making the call to the FileSavingEnvironment return both things at the same time and using those.

Please clarify: which two things?


Or by caching the result as the proposed Ant task in Cocoon scratchpad does.

The problem arises with the LinkSamplingEnvironment, because it uses a Cocoon view to get the links. Thus we need to ask Cocoon two things, the links and the contents.

We can combine getType and getLinks calls into one, see below.


Let's leave aside the view concept for now, and think about how to sample links from a content being produced.

We can use a LinkSamplingPipeline.
Yes, a pipeline that introduces a connector just after the "content"-tagged sitemap component and saves the links found in the environment.

Mmmm... Correction: a pipeline that introduces a LinkSamplingTransformer right before the serializer. You can't get links from the content view because it might (will) have none yet. Links must be sampled right before the serializer, as the links view does.
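
A minimal sketch of why the sampling point matters, assuming a pipeline modelled as a chain of Python generators. All names here (`Environment`, `menu_transformer`, `link_sampler`, `html_serializer`, `run`) are hypothetical and do not correspond to real Cocoon components; the event tuples merely stand in for SAX events.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    """Hypothetical holder for what a single pipeline run leaves behind."""
    links: list = field(default_factory=list)
    content_type: str = ""


def generator():
    # Raw content events: there are no links at this stage yet.
    yield ("text", "Welcome")
    yield ("include", "menu")


def menu_transformer(events):
    # A later transformer is what actually introduces the links.
    for kind, value in events:
        if kind == "include" and value == "menu":
            yield ("link", "about.html")
            yield ("link", "docs/index.html")
        else:
            yield (kind, value)


def link_sampler(events, env):
    # Pass-through stage: records each link, forwards every event unchanged.
    for kind, value in events:
        if kind == "link":
            env.links.append(value)
        yield (kind, value)


def html_serializer(events, env):
    env.content_type = "text/html"  # the type is recorded in the same run
    parts = []
    for kind, value in events:
        parts.append(f'<a href="{value}"/>' if kind == "link" else value)
    return "".join(parts)


def run(env, sample_after_generator=False):
    events = generator()
    if sample_after_generator:
        events = link_sampler(events, env)  # too early: sees no links
    events = menu_transformer(events)
    if not sample_after_generator:
        events = link_sampler(events, env)  # right before the serializer
    return html_serializer(events, env)
```

With the sampler placed right before the serializer, one call to `run` fills in both `env.links` and `env.content_type`; with `sample_after_generator=True` the same run records no links at all.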


Thus after the call we would have in the environment the result, the type and the links, all in one call.

Type and links - yes, I agree. Content - no, we won't get correct content because the links will not be translated in this content. And the produced content is impossible to "re-link" afterwards, because it can be in any binary format supporting links (MS Excel, PDF, MS Word, ...).

But there is hope of getting everything at once, if the LinkSamplingTransformer is also a LinkTranslatingTransformer and calls Main back on every new link (recursive processing, as opposed to the iterative processing in the current implementation of Main). The drawback of the recursive approach is increased memory consumption.
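
The recursive idea can be sketched roughly as follows. This is an illustration only: `SITE`, `EXTENSIONS` and `process` are hypothetical names, and the sketch assumes that a page's mime type is already known when its pipeline is set up, which is what lets it cope with cyclic links.

```python
# Hypothetical site: URI -> (mime type, outgoing links).
SITE = {
    "index": ("text/html", ["guide", "report"]),
    "guide": ("text/html", ["index", "report"]),
    "report": ("application/pdf", []),
}
EXTENSIONS = {"text/html": ".html", "application/pdf": ".pdf"}


def process(uri, names, pages, depth=1, stats=None):
    """Generate `uri` once, recursing into every link not seen before."""
    if stats is None:
        stats = {"max_depth": 0, "generations": 0}
    mime, links = SITE[uri]            # the single generation of this source
    stats["generations"] += 1
    stats["max_depth"] = max(stats["max_depth"], depth)
    # Assumption: the type is known before any link streams by, so pages
    # that link back to this one can already translate the link.
    names[uri] = uri + EXTENSIONS[mime]
    translated = []
    for link in links:
        if link not in names:          # new link: call back in right away
            process(link, names, pages, depth + 1, stats)
        translated.append(names[link])
    pages[names[uri]] = translated     # this page stayed open until now
    return stats
```

Each source is generated once here, but a page remains in flight while everything reachable from it is processed; `max_depth` is a crude proxy for the extra memory that costs.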


<snip/>

Vadim



