Date: 2005-02-25T07:17:41
Editor: NicoVerwer
Wiki: Cocoon Wiki
Page: ProfilingPipelinesWithBigSaxEventStreams
URL: http://wiki.apache.org/cocoon/ProfilingPipelinesWithBigSaxEventStreams
= Profiling pipelines with big SAX-event streams =
The profiling pipelines, `ProfilingCachingProcessingPipeline` and
`ProfilingNonCachingProcessingPipeline`, are useful for measuring the performance
of your Cocoon application.
However, when the XML that you process is really big (millions of SAX events),
these pipelines sometimes fail with an error like:
{{{
org.apache.cocoon.ProcessingException: Failed to execute pipeline.:
org.xml.sax.SAXException: Index too large
}}}
This error is caused by the `XMLByteStreamCompiler`, which is used to store a
compiled form of the SAX-event stream in a byte array (I am not sure why that
is needed).
Measuring performance can be very important precisely when you are dealing with
large documents, so if the profiling pipelines give up, some other method
must be used.
'''Warning: In the rest of this page, some ugly coding techniques are used.
Do not use the code given here as an example of how to program Cocoon
components.'''
Instead of one of the profiling pipelines, use a normal (non-profiling, and
probably non-caching) pipeline. The performance measuring is done
by a new transformer, the `LOWProfilerTransformer` (see the attachment).
'LOWProfiler' stands for 'Log-file Output Writing Profiler'; the name also hints that
the standard profiler is preferable.
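The attachment is not reproduced on this page, so the following is only a minimal sketch of the general idea, not the actual `LOWProfilerTransformer`. It assumes that the measurement boils down to timing the interval between `startDocument` and `endDocument` as events stream past, without buffering them. The class name `SaxTimingFilter` and the helper `profile` are hypothetical, and the sketch uses the plain JDK `XMLFilterImpl` rather than Cocoon's transformer interfaces.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/**
 * Hypothetical illustration: a pass-through SAX filter that counts events
 * and times the stream. It never stores the events, so its memory use does
 * not grow with the size of the document.
 */
class SaxTimingFilter extends XMLFilterImpl {
    private long startNanos;
    private long elapsedNanos = -1;   // -1 until endDocument has been seen
    private long eventCount;

    @Override
    public void startDocument() throws SAXException {
        eventCount = 1;
        startNanos = System.nanoTime();   // start the clock on the first event
        super.startDocument();
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        eventCount++;
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        eventCount++;
        super.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        eventCount++;
        super.characters(ch, start, length);
    }

    @Override
    public void endDocument() throws SAXException {
        eventCount++;
        elapsedNanos = System.nanoTime() - startNanos;   // stop the clock
        super.endDocument();
    }

    long getEventCount() { return eventCount; }

    long getElapsedNanos() { return elapsedNanos; }

    /** Parses the given XML string through a fresh filter and returns it. */
    static SaxTimingFilter profile(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        SaxTimingFilter filter = new SaxTimingFilter();
        filter.setParent(reader);
        filter.setContentHandler(new DefaultHandler());   // discard the output
        filter.parse(new InputSource(new StringReader(xml)));
        return filter;
    }
}
```

Calling `SaxTimingFilter.profile("<a><b/></a>")` and reading `getEventCount()` and `getElapsedNanos()` gives the two numbers that a real component would presumably write to its log. Because nothing is buffered, this approach does not run into the size limit that stops the profiling pipelines.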
The `LOWProfilerTransformer` must be inserted in the pipeline, just before the
serializer, e.g.,
{{{
<map:match pattern="test">
  <map:generate src="bigfile.xml"/>
  ...do some transformations etc...
  <map:transform type="lowprofiler"/>
  <map:serialize type="xml"/>
</map:match>
}}}
You will need to declare the transformer in the components section of the sitemap:
{{{
<map:transformers default="xslt">
  <map:transformer name="lowprofiler"
                   src="org.apache.cocoon.transformation.LOWProfilerTransformer"
                   logger="lowprofiler"/>
</map:transformers>
}}}
The `logger` attribute refers to a new log file, to which the profiling results
are written. Its log target and category are declared in `logkit.xconf`:
{{{
<cocoon id="lowprofiler">
  <filename>${context-root}/WEB-INF/logs/lowprofiler.log</filename>
  <format type="cocoon">
    %7.7{priority} %{time} [%{category}] (%{uri}) %{thread}/%{class:short}: %{message}\n
  </format>
  <append>true</append>
</cocoon>
...
<category log-level="INFO" name="lowprofiler">
  <log-target id-ref="lowprofiler"/>
</category>
}}}