Hi Laurent,

> In real world use cases, it's acceptable to support index entries
> only at the end of the numbering sequence or in another numbering
> sequence, so let's do post-processing. There are plenty of issues
> to solve but they are mostly related to I/Os and XSL and Novelang
> design so I won't discuss them in this list.
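
In case a sketch of that post-processing step helps: one could walk the area tree XML and collect, for every index key, the pages it occurs on. This is only a rough sketch under assumptions. The names pageViewport, nr and prod-id are assumed to match FOP's area tree XML, the "idx.<key>.<counter>" id convention is made up for illustration, and the class IndexPageCollector is hypothetical, not part of FOP.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class IndexPageCollector {

    // Hypothetical id convention for index anchors: "idx.<key>.<counter>".
    static final String PREFIX = "idx.";

    /** Parses area tree XML from a string (stand-in for the DOM of a real run). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse area tree", e);
        }
    }

    /** Maps each index key to the sorted set of page numbers it occurs on. */
    static Map<String, SortedSet<Integer>> collect(Document areaTree) {
        Map<String, SortedSet<Integer>> pagesByKey = new TreeMap<>();
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            // Every area whose prod-id starts with the assumed prefix.
            NodeList marked = (NodeList) xPath.evaluate(
                    "//*[starts-with(@prod-id, '" + PREFIX + "')]",
                    areaTree, XPathConstants.NODESET);
            for (int i = 0; i < marked.getLength(); i++) {
                Element area = (Element) marked.item(i);
                String id = area.getAttribute("prod-id");
                // "idx.apple.2" -> "apple"; ids without a counter keep the rest.
                int end = id.lastIndexOf('.');
                if (end < PREFIX.length()) {
                    end = id.length();
                }
                String key = id.substring(PREFIX.length(), end);
                // Page number of the enclosing page (assumed attribute "nr").
                String nr = xPath.evaluate("ancestor::pageViewport/@nr", area);
                pagesByKey.computeIfAbsent(key, k -> new TreeSet<>())
                        .add(Integer.parseInt(nr));
            }
        } catch (Exception e) {
            throw new IllegalStateException("Cannot read area tree", e);
        }
        return pagesByKey;
    }

    public static void main(String[] args) {
        // Heavily simplified stand-in for real area tree output.
        String xml = "<areaTree>"
                + "<pageViewport nr='3'><block prod-id='idx.apple.1'/></pageViewport>"
                + "<pageViewport nr='5'><block prod-id='idx.apple.2'/>"
                + "<block prod-id='idx.pear.1'/></pageViewport>"
                + "</areaTree>";
        System.out.println(collect(parse(xml))); // prints {apple=[3, 5], pear=[5]}
    }
}
```

Keeping the page numbers in a sorted set means repeated occurrences on the same page are dropped without extra work.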

I'm just thinking: if we are restricted to an index in a separate
page-sequence after the actual entries, wouldn't it be possible DURING
layout (when creating the KnuthSequence?) to look forward (or back),
modify the entries (which should already know their page numbers), and
then lay out only once?

> One question left, however. I wonder how to hint the FO document for
> generating Area Tree or Intermediate Format that I could reparse
> easily, for locating pages containing index entries, and extracting
> index keys and lists of page numbers.

I don't know if that helps you, but I generate the area tree as a DOM
Document and then read data from it. The blocks I'm interested in are
marked with known ids.

private org.w3c.dom.Document multipass(StreamSource source) {
    try {
        FopFactory fopFactory = getFopFactory();
        FOUserAgent foUserAgent = fopFactory.newFOUserAgent();
        SAXTransformerFactory mpFactory = getMultipassFactory();
        Transformer transformer = mpFactory.newTransformer();

        // Collect the area tree XML in a DOM instead of writing it out.
        TransformerHandler handler = mpFactory.newTransformerHandler();
        DOMResult domResult = new DOMResult();
        handler.setResult(domResult);

        // The XMLRenderer mimics the PDF renderer so that it uses the
        // same font setup as the final output.
        org.apache.fop.render.Renderer targetRenderer =
            foUserAgent.getRendererFactory().createRenderer(
                foUserAgent, MimeConstants.MIME_PDF);
        XMLRenderer renderer = new XMLRenderer();
        renderer.mimicRenderer(targetRenderer);
        renderer.setContentHandler(handler);
        renderer.setUserAgent(foUserAgent);
        foUserAgent.setRendererOverride(renderer);

        Fop fop = fopFactory.newFop(foUserAgent);
        Result res = new SAXResult(fop.getDefaultHandler());
        transformer.transform(source, res);
        org.w3c.dom.Document doc =
            (org.w3c.dom.Document) domResult.getNode();

        // Drop everything but the last page from the document, for
        // performance reasons. IMPORTANT, trust me!!
        while (!doc.getDocumentElement().getLastChild().equals(
                doc.getDocumentElement().getFirstChild())) {
            doc.getDocumentElement().removeChild(
                doc.getDocumentElement().getFirstChild());
        }

        // Read the data from the document and collect the entries to kill
        // (the strings in the set are reference ids of page number entries).
        Set<String> idsToKill = new HashSet<String>();
        XPathFactory factory = XPathFactory.newInstance();
        XPath xPath = factory.newXPath();
        NodeList nl = (NodeList) xPath.evaluate(
            "//blo...@prod-id='" + pageKey + "']", doc, XPathConstants.NODESET);
        if (nl.getLength() <= 0) {
            return null; // page block not found, nothing to check
        }
        Node root = nl.item(0);
        for (String key : references.keySet()) {
            Set<String> uniques = new HashSet<String>();
            Set<String> toCheck = references.get(key);
            if (toCheck.size() > 1) { // a single entry is always unique, so skip it
                for (String check : toCheck) {
                    String val = xPath.evaluate(
                        ".//te...@prod-id='" + check + "']/word/text()",
                        root, XPathConstants.STRING).toString();
                    if (uniques.contains(val)) {
                        idsToKill.add(check); // same page number seen before
                    } else {
                        uniques.add(val);
                    }
                }
            }
        }
        // With these ids I iterate over my structure and remove the elements
        // whose ids are in the set. Which probably won't help you much.
    } catch (TransformerException e) {
        System.out.println(e.getMessage());
        e.printStackTrace();
    } catch (FOPException e) {
        System.out.println(e.getMessage());
        e.printStackTrace();
    // } catch (IOException e) {
    //     System.out.println(e.getMessage());
    //     e.printStackTrace();
    } catch (SAXException e) {
        System.out.println(e.getMessage());
        e.printStackTrace();
    } catch (XPathExpressionException e) {
        // xPath.evaluate() declares this checked exception
        System.out.println(e.getMessage());
        e.printStackTrace();
    }
    return null;
}

Best regards

Georg Datterl

------ Contact ------

Georg Datterl

Geneon media solutions gmbh
Gutenstetter Straße 8a
90449 Nürnberg

HRB Nürnberg: 17193
Managing director: Yong-Harry Steiert

Tel.: 0911/36 78 88 - 26
Fax: 0911/36 78 88 - 20

www.geneon.de

Other members of the Willmy MediaGroup:

IRS Integrated Realization Services GmbH: www.irs-nbg.de
Willmy PrintMedia GmbH: www.willmy.de
Willmy Consult & Content GmbH: www.willmycc.de

-----Original Message-----
From: Laurent Caillette [mailto:laurent.caille...@ullink.com]
Sent: Wednesday, 19 August 2009 14:56
To: fop-dev@xmlgraphics.apache.org
Subject: RE: Referencing multiple pages for index entries

Thanks Georg, the "Index and Pagenumbers" discussion is of great interest.

To sum up:

- By design, FOP doesn't re-layout after page number citations (as shown
  by Andreas D.). So a FOP extension won't solve my case if I want
  nicely-formatted index entries.
- Complete support of XSL 1.1 indexes means supporting growing and
  shrinking entries. When not at the very end of the page sequence, this
  implies multi-pass layout.
- By design, FOP doesn't support multi-pass layout.
- But FOP allows post-processing through Area Tree Format or Intermediate
  Format.

In real world use cases, it's acceptable to support index entries only at
the end of the numbering sequence or in another numbering sequence, so
let's do post-processing. There are plenty of issues to solve but they are
mostly related to I/Os and XSL and Novelang design so I won't discuss them
in this list.

One question left, however.
I wonder how to hint the FO document for generating Area Tree or
Intermediate Format that I could reparse easily, for locating pages
containing index entries, and extracting index keys and lists of page
numbers.

Thanks all,

c.

-----Original Message-----
From: Georg Datterl [mailto:georg.datt...@geneon.de]
Sent: Wednesday, 19 August 2009 12:08
To: fop-dev@xmlgraphics.apache.org
Subject: AW: Referencing multiple pages for index entries

Hi Laurent,

I had the same problem, except for the "5-7". I only had to remove
multiple entries with identical page numbers. A search for the buzzword
"index" in the archives should unearth that thread ("Index and
Pagenumbers").
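
For the "5-7" part, once the page numbers of an entry are known, collapsing them is plain list processing. The following is a minimal sketch, not code from the thread: the class PageRangeFormatter and its format method are hypothetical names, and writing two consecutive pages as a range ("3-4") is an assumption about the desired style.

```java
import java.util.SortedSet;
import java.util.TreeSet;

class PageRangeFormatter {

    /**
     * Formats page numbers as e.g. "3, 5-7, 12". Identical page numbers
     * collapse into one entry, consecutive pages into a range.
     */
    static String format(int... pages) {
        // Sort and drop duplicates in one go.
        SortedSet<Integer> unique = new TreeSet<Integer>();
        for (int page : pages) {
            unique.add(page);
        }

        StringBuilder out = new StringBuilder();
        Integer start = null;    // first page of the current run
        Integer previous = null; // last page seen so far
        for (int page : unique) {
            if (start == null) {
                start = page;
            } else if (page != previous + 1) {
                // Gap found: emit the finished run and start a new one.
                append(out, start, previous);
                start = page;
            }
            previous = page;
        }
        if (start != null) {
            append(out, start, previous);
        }
        return out.toString();
    }

    private static void append(StringBuilder out, int start, int end) {
        if (out.length() > 0) {
            out.append(", ");
        }
        out.append(start);
        if (end > start) {
            out.append('-').append(end);
        }
    }

    public static void main(String[] args) {
        System.out.println(format(5, 6, 7, 5, 9)); // prints 5-7, 9
    }
}
```

Since this runs on the collected page numbers and not on the layout, it only fits the case discussed in the thread where the index sits at the end or in its own page-sequence, so that longer or shorter entries cannot shift the pages being cited.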