Lars,

are you aware that the Java VM doesn't automatically use all available
memory? IIRC, on Windows it uses a maximum of 64 MB by default. You can
increase that by adding a parameter like -Xmx192m to the java startup
command. If you search the Cocoon wiki or the internet you'll find
plenty of information on that.
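For example, with Tomcat you'd typically pass the flag through an
environment variable before starting it. This is a sketch from memory
(the exact variable name -- CATALINA_OPTS vs. JAVA_OPTS -- depends on
your Tomcat version and startup scripts, so verify against catalina.sh
or catalina.bat):

```shell
# Let the JVM start with 64 MB of heap and grow up to 192 MB.
# CATALINA_OPTS is read by Tomcat's startup scripts; check your own
# catalina.sh/catalina.bat to confirm which variable they honour.
export CATALINA_OPTS="-Xms64m -Xmx192m"
./catalina.sh start
```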

As for the XSLT transforms: yes, these build a complete model of your
document in memory before doing the transform, though it's an optimized
model that should be much smaller than a typical DOM. From your earlier
non-updateable-dom problems, I'd assume that the
SourceWritingTransformer also builds a DOM in memory. By default the
serialized output of a pipeline is also completely buffered before
sending it to the client, see the explanation for "outputBufferSize" in
the default root sitemap on how to avoid that.
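From memory, the relevant fragment in the root sitemap looks roughly
like the one below -- treat the class name and the exact semantics of
the value as approximations, and check the comments in your own
sitemap.xmap:

```xml
<map:pipes default="caching">
  <map:pipe name="caching"
            src="org.apache.cocoon.components.pipeline.impl.CachingProcessingPipeline">
    <!-- IIRC, a value of 0 deactivates buffering so output is flushed
         to the client as it is serialized, instead of the whole
         response being held in memory first. -->
    <parameter name="outputBufferSize" value="0"/>
  </map:pipe>
</map:pipes>
```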

On Tue, 2004-08-03 at 05:25, Lars Huttar wrote:
> Dear Cocoon gurus,              [Cocoon 2.1.2, Tomcat 4.1]
> 
> We have an application where we need to generate an index
> from a large database. We seem to be running out of memory
> even in getting the unprocessed data out of the database.
> 
> We initially did (sitemap pseudocode)
>   - xsp query to get rows from "Index" table of database
>   - XSLT transformation that groups together rows with certain identical fields
>   - XSLT transformation that wraps "source:write" markup around
>     the XML
>   - the write-source transformer to put the XML into a file
>   - (serialize as XML)
> 
> This worked for small rowsets, but when we jump from 3700 to
> 9500 rows, it fails, with the message
> org.apache.cocoon.ProcessingException: Exception in ServerPagesGenerator.generate():
> java.lang.RuntimeException: org.apache.cocoon.ProcessingException: insertFragment:
> fragment is required.
> 
> which sounds like the write-source transformer is complaining that it didn't
> get its "fragment" (data to write to the file), so I supposed
> there was a failure before the write-source transformer.
> I wondered if the XSLT transformations were each building
> a DOM for the entire input. This would account for running out
> of memory.
> 
> So I tried reducing the pipeline to just obtaining the data
> and writing it to a file without grouping.
> First I tried
> 
>   - xsp query to get rows from "Index" table of database
>   - XSLT transformation that just wraps "source:write" markup around
>     the XML
>   - the write-source transformer to put the XML into a file
> 
> but this failed too, and of course it has an XSLT transformation
> which is suspect -- is it building a DOM? So next I tried
> 
>   - file generator to get a file that contained a source:write
>     wrapper around a cinclude statement
>   - cinclude transformer to get the data
>   - the write-source transformer to put the XML into a file
> 
> And in a separate pipeline called by the cinclude statement,
> 
>   - xsp query to get rows from "Index" table of database
> 
> But this still failed!
> 
> So now I'm wondering how it's possible to process big sets
> of data at all in Cocoon. We thought SAX meant that the XML
> data was sent piece-by-piece down the pipeline, serially,
> so you didn't run out of memory when you had a big XML data
> file. Does using XSLT mess that up by building DOMs?
> What about cinclude?
> What *can* you use to get lots of data from a database
> and process it without having to have it all in memory
> at once? Does this task need to be done outside of Cocoon?
> 
> Of course, we can split the operation up into little pieces;
> but we don't want to go through that hassle if it's avoidable.
> 
> Is it possible that I'm missing the point completely and
> there's something other than memory that's causing the
> operation to fail?
> By the way this machine has 384MB, and another I was testing
> on had 512MB. They both failed at about the same point.
> 
> Thanks for any explanations or suggestions...
> Lars

-- 
Bruno Dumon                             http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
[EMAIL PROTECTED]                          [EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
