Henning, I don't have a lot of time right now to dig into this, so I'll only remark on a few things after a first read of your post...

On 24.05.2007 01:19:29 Henning P. Schmiedehausen wrote:
> Hi,
>
> I was thinking of opening a Bugzilla report for this, however I think
> it is better to discuss this first to narrow this down further.
>
> First things first: I have test cases that allow this bug to be
> reproduced. They are available for download as an Eclipse project at
> http://people.apache.org/~henning/fopdemo.zip
>
> One can build these tests using the included ant script, which builds
> two jars: target/domdemo.jar and target/filedemo.jar
>
> These are run with java -jar target/<jarname>.jar <threads> <jobs> and
> require at least Java 1.5. <threads> is the number of render-threads
> to spawn and <jobs> the number of jobs to run.
>
> On to my problem: A few weeks ago, I inherited an application in a
> customer project which generates PDF reports out of a Data
> Warehouse. I was recruited to amend some performance problems and in
> the end I was able to optimize it to use a single XSL transformation
> driving FOP to generate the PDF reports.
>
> We are talking about 70,000 PDF reports ATM, wanting to scale up to
> six-digit numbers. The application is running heavily multithreaded on
> serious hardware: A Sun Fire 6800 with 24 processor cores and a whopping
> 48 Gigs of RAM (we have to share with the actual data warehouse,
> though, so our Zone does have a lot of CPU but 'only' 10 Gigs of the
> RAM...)
>
> A few words about the reports: Every report contains a few shared
> bitmap images (GIF and PNG) and each report also contains ~10 SVG
> images which are put into the XSL:FO source using a style sheet and
> then rendered into the output PDF. Each of the SVG images for each of
> the reports is generated and used only on a single report (no
> re-use). Internally, the images are part of a DOM tree that gets
> transformed by an XSL template. Pretty much what
> AbstractRenderer::createPdf in the test case does.
>
> This code is well tested and works in production.
>
> We had to exchange one of the bitmap graphics that are used on all
> reports with an SVG. This is similar to what the filedemo.jar
> does. The FO source now contains a
>
> <fo:block>
>   <fo:external-graphic src="logo.svg" />
> </fo:block>
>
> instead of
>
> <fo:block>
>   <fo:external-graphic src="logo.png" />
> </fo:block>
>
> This works great using a single thread (java -jar target/filedemo.jar 1 15).
>
> As soon as we used more than one thread, the application started
> throwing exceptions. Strange exceptions:
>
> 2007-05-23 23:55:20,875 [pool-1-thread-2] ERROR SVGUserAgent - SVG
> Errorjar:file:/home/henning/workspace/FOP%20Demo/target/filedemo.jar!/logo.svg:
> The attribute "style" represents an invalid CSS declaration
> ("fill-rule:nonzero;fill:#D9D9D9;stroke:#D9D9D9;stroke-width:0.254;stroke-miterlimit:4;").
> Original message:
>
> org.w3c.dom.DOMException:
> jar:file:/home/henning/workspace/FOP%20Demo/target/filedemo.jar!/logo.svg:
> The attribute "style" represents an invalid CSS declaration
> ("fill-rule:nonzero;fill:#D9D9D9;stroke:#D9D9D9;stroke-width:0.254;stroke-miterlimit:4;").
> Original message:
>
> at
> org.apache.batik.css.engine.CSSEngine.getCascadedStyleMap(CSSEngine.java:835)
> at
> org.apache.batik.css.engine.CSSEngine.getComputedStyle(CSSEngine.java:878)
<snip/>

Doesn't really look like FOP is involved here. Have you tried running this with the latest Batik beta release?

>
> This gets worse using more threads but is easily reproducible on a
> single CPU machine. To me this looks like a static buffer or variable
> getting garbled / hit by multiple threads at the same time.
>
> I looked at this and decided that maybe the internal cache might be
> the culprit and decided to load the SVG as a DOM object and then use a
> different XSL transformation (see dom-stylesheet.xsl):
>
> ...
> <xsl:element name="fo:instream-foreign-object">
>   <xsl:value-of disable-output-escaping="yes"
>       select="/report/svg-content/logo"/>
>   <xsl:apply-templates select="/report/svg-content/logo/*" mode="subset"/>
> </xsl:element>
> ...
>
> <xsl:template match="*" mode="subset">
>   <xsl:copy>
>     <xsl:copy-of select="@*"/>
>     <xsl:apply-templates mode="subset"/>
>   </xsl:copy>
> </xsl:template>
>
> to copy the DOM nodes from the document into the FO output
> stream. This is what the domdemo.jar does. The problem here is that I
> needed serious hardware (as described above) to get this bug to show
> up. Sometimes it also works if you run exactly the same number of
> threads as you have CPUs (i.e. 2 for a dual-core machine). It never
> fails with only a single render thread.
>
> It shows up by FOP emitting messages like this:
>
> 2007-05-24 00:00:56,274 [pool-1-thread-1] INFO RenderDOM - Job 0 writing PDF
> 2007-05-24 00:00:56,301 [pool-1-thread-2] WARN FOTreeBuilder - Mismatch:
> instream-foreign-object (http://www.w3.org/1999/XSL/Format) vs. block
> (http://www.w3.org/1999/XSL/Format)
> 2007-05-24 00:00:56,301 [pool-1-thread-2] WARN FOTreeBuilder - Mismatch:
> instream-foreign-object (http://www.w3.org/1999/XSL/Format) vs. flow
> (http://www.w3.org/1999/XSL/Format)
> 2007-05-24 00:00:56,301 [pool-1-thread-2] WARN FOTreeBuilder - Mismatch:
> instream-foreign-object (http://www.w3.org/1999/XSL/Format) vs. page-sequence
> (http://www.w3.org/1999/XSL/Format)
> 2007-05-24 00:00:56,301 [pool-1-thread-2] WARN FOTreeBuilder - Mismatch:
> instream-foreign-object (http://www.w3.org/1999/XSL/Format) vs. root
> (http://www.w3.org/1999/XSL/Format)
> 2007-05-24 00:00:56,302 [pool-1-thread-2] ERROR FOTreeBuilder -
> javax.xml.transform.TransformerException: java.lang.NullPointerException
> SystemId Unknown; Line #17; Column #94; java.lang.NullPointerException

No stack trace here? Normally, the "Mismatch" messages are only follow-up errors.
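
If the logging setup swallows the trace, something like the following might help to dig it out. This is only a minimal sketch, not FOP code: `RootCauseDemo` and `rootCause` are names made up for illustration, and the wrapped NullPointerException in `main` is simulated rather than taken from your run. It follows the `getCause()` chain of the TransformerException down to the innermost exception and prints that one's stack trace.

```java
import javax.xml.transform.TransformerException;

final class RootCauseDemo {

    /** Follows getCause() links until the innermost throwable is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        // Stop when there is no further cause (or a throwable names itself as cause).
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Simulated stand-in for what the log shows: an NPE wrapped by the transformer.
        TransformerException wrapped =
                new TransformerException(new NullPointerException("simulated"));

        // Print the trace of the innermost exception instead of the wrapper.
        rootCause(wrapped).printStackTrace();
    }
}
```

Calling `rootCause(e).printStackTrace()` in the catch block around the transformation should then show where the NullPointerException actually comes from.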

> I'm not sure whether this is a problem with the
> importNode(logoDocument.getDocumentElement()) (and Xalan is not thread
> safe here) or further down the chain when the PDF gets created and the
> resulting FO tree gets read.
>
> Putting the
>
> Node node = document.importNode(logoDocument.getDocumentElement(), true);
> child.appendChild(node);
>
> lines into a synchronized(logoDocument) { } block makes the problem
> either disappear (on my dual core Intel here, I can currently not test
> on the big iron); also removing the 'static' declaration from
> logoDocument in RenderDOM (have a single logo document for each Job,
> which is a performance penalty) does.
>
> I looked a bit through the sources of FOP and Batik and at least
> PDFSVGHandler::renderSVGDocument(RenderContext, Document) looks
> non-threadsafe to me: it builds a BridgeContext and a GVTBuilder, then
> runs GVTBuilder::build(BridgeContext, Document); this in turn runs
> BridgeContext::setDocument(Document) which does not look threadsafe
> and also BridgeContext::initializeDocument which definitely is not
> threadsafe (just place a breakpoint on the very first if (eng == null)
> and step with two threads simultaneously in).

I can't see anything wrong in PDFSVGHandler concerning thread-safety, but maybe I'm missing it. BridgeContext is a Batik thing. I can't say anything without diving in deeper. Maybe Cameron is listening in on this and can comment on it.

>
> I'm not sure if this is actually a bug (the filedemo thing IMHO is) or
> just me hitting a corner case. The rendering works fine without a
> hitch as long as no SVG is shared between multiple documents getting
> rendered by different threads.

Henning, please be prepared to be on your own on this. Our project is lacking resources lately. I will likely not have time to help in a timely manner.

Good luck anyway!

Jeremias Maerki
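
PS: For reference, the synchronized workaround you describe could look roughly like the sketch below. It is an illustration under assumptions, not the code from your test case: `SharedLogoImport`, `LOGO`, `importLogo` and `runJobs` are hypothetical names, and the tiny in-memory SVG stands in for the real logo document. Every read of the shared document goes through one lock, while each job builds its own target document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class SharedLogoImport {
    private static final String SVG_NS = "http://www.w3.org/2000/svg";

    // Hypothetical stand-in for the static logoDocument shared by all jobs.
    private static final Document LOGO = newLogo();

    private static Document newDocument() {
        try {
            // A fresh factory per call, so no parser objects are shared between threads.
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    private static Document newLogo() {
        Document doc = newDocument();
        Element svg = doc.createElementNS(SVG_NS, "svg");
        svg.appendChild(doc.createElementNS(SVG_NS, "rect"));
        doc.appendChild(svg);
        return doc;
    }

    /** Deep-copies the shared logo into the target document while holding the lock. */
    static Node importLogo(Document target) {
        synchronized (LOGO) {
            return target.importNode(LOGO.getDocumentElement(), true);
        }
    }

    /** Runs the given number of jobs on a thread pool and counts the complete imports. */
    static int runJobs(int threads, int jobs) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < jobs; i++) {
                Callable<Boolean> job = () -> {
                    Document target = newDocument(); // private to this job
                    Element holder = target.createElement("holder");
                    target.appendChild(holder);
                    holder.appendChild(importLogo(target));
                    // The copied <svg> element must still contain its single child.
                    return holder.getFirstChild().getChildNodes().getLength() == 1;
                };
                results.add(pool.submit(job));
            }
            int ok = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    ok++;
                }
            }
            return ok;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runJobs(4, 20) + " of 20 jobs imported the logo");
    }
}
```

The lock only covers the copy out of the shared document, so the rest of each job still runs in parallel. Whether that is cheaper than one logo document per job would have to be measured on the real hardware.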