Date: 2005-02-25T07:17:41
   Editor: NicoVerwer
   Wiki: Cocoon Wiki
   Page: ProfilingPipelinesWithBigSaxEventStreams
   URL: http://wiki.apache.org/cocoon/ProfilingPipelinesWithBigSaxEventStreams


= Profiling pipelines with big SAX-event streams =

The profiling pipelines, `ProfilingCachingProcessingPipeline` and 
`ProfilingNonCachingProcessingPipeline`, are useful for measuring the performance 
of your Cocoon application.
However, when the XML that you process is really big (millions of SAX events), 
these pipelines sometimes crash with an error like:

{{{
org.apache.cocoon.ProcessingException: Failed to execute pipeline.: 
org.xml.sax.SAXException: Index too large
}}}

This error is caused by the `XMLByteStreamCompiler`, which is used to store a 
compiled form of the SAX-event stream in a byte array (I am not sure why that 
is needed).
Measuring performance can be especially important when you are dealing with 
large documents, but if the profiling pipelines give up, some other method 
must be used.

'''Warning: In the rest of this page, some ugly coding techniques are used.
Do not use the code given here as an example of how to program Cocoon 
components.'''

Instead of one of the profiling pipelines, use a normal (non-profiling, and 
probably non-caching) pipeline. The performance measurement is done 
by a new transformer, the `LOWProfilerTransformer` (see the attachment).
'LOWProfiler' stands for 'Log-file Output Writing Profiler'; the name also 
indicates that the standard profiler is preferable.
Insert the `LOWProfilerTransformer` in the pipeline just before the 
serializer, e.g.,

{{{
      <map:match pattern="test">
        <map:generate src="bigfile.xml"/>
        <!-- ...do some transformations etc... -->
        <map:transform type="lowprofiler"/>
        <map:serialize type="xml"/>
      </map:match>
}}}
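
The general idea of measuring inside the SAX stream, rather than buffering it, can be sketched with plain JDK classes. This is not the attached `LOWProfilerTransformer`, and it does not use any Cocoon API: the class name `EventCountingFilter`, the choice of which events to count, and the use of `java.util.logging` are all assumptions made for illustration. The filter passes every event straight through, so memory use does not grow with the size of the document.

```java
import java.io.StringReader;
import java.util.logging.Logger;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

/**
 * Hypothetical pass-through SAX filter (not the attached transformer):
 * counts a few kinds of SAX events and logs the elapsed time at endDocument.
 */
class EventCountingFilter extends XMLFilterImpl {
    // Logger name chosen to mirror the "lowprofiler" category; an assumption.
    private static final Logger LOG = Logger.getLogger("lowprofiler");

    private long eventCount;
    private long startNanos;

    EventCountingFilter(XMLReader parent) {
        super(parent);
    }

    long getEventCount() {
        return eventCount;
    }

    @Override
    public void startDocument() throws SAXException {
        eventCount = 1;                 // reset, and count this event
        startNanos = System.nanoTime(); // start the clock
        super.startDocument();
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        eventCount++;
        super.startElement(uri, localName, qName, atts); // forward unchanged
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        eventCount++;
        super.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        eventCount++;
        super.characters(ch, start, length);
    }

    @Override
    public void endDocument() throws SAXException {
        eventCount++;
        super.endDocument();
        long millis = (System.nanoTime() - startNanos) / 1_000_000;
        LOG.info(eventCount + " SAX events in " + millis + " ms");
    }

    public static void main(String[] args) throws Exception {
        XMLReader parser =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        EventCountingFilter filter = new EventCountingFilter(parser);
        // Parsing through the filter makes it sit between parser and consumer.
        filter.parse(new InputSource(new StringReader("<a><b>text</b></a>")));
        System.out.println("events counted: " + filter.getEventCount());
    }
}
```

In a real pipeline the filter would sit where the sitemap above places the transformer, so the time it reports covers everything upstream of the serializer.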

You will need to declare the transformer:

{{{
    <map:transformers default="xslt">
      <map:transformer name="lowprofiler"
        src="org.apache.cocoon.transformation.LOWProfilerTransformer"
        logger="lowprofiler"/>
    </map:transformers>
}}}

As the `logger` attribute shows, the profiling results are written to a 
separate log file. It is declared in `logkit.xconf`:

{{{
    <cocoon id="lowprofiler">
      <filename>${context-root}/WEB-INF/logs/lowprofiler.log</filename>
      <format type="cocoon">
        %7.7{priority} %{time}   [%{category}] (%{uri}) %{thread}/%{class:short}: %{message}\n
      </format>
      <append>true</append>
    </cocoon>
  ...
    <category log-level="INFO" name="lowprofiler">
        <log-target id-ref="lowprofiler"/>
    </category>
}}}