RE: Absolute position of original text in final PDF

Igor Rosenberg Tue, 12 Jan 2010 08:41:19 -0800

Hello,
My original problem was: 
        how can I know the position in the final PDF of the original text?


I've just discovered the following method: 
  org.apache.fop.render.pdf.PDFRenderer.saveBlockPosIfTargetable(Block)

The javadoc says: 
        * id + absolute position will be saved. The saved position is
        * only correct if this function is called at the very start of
        * renderBlock!

This method is always called when a Block gets rendered. I guess that if I 
modify this method, I can retrieve the absolute position of every Block. I'll 
try to fiddle with that. Since I can infer from the Blocks the text they 
contain, I reckon I'll have the solution for building my bounding boxes.

Cheers
Igor

-----Original Message-----
From: Georg Datterl [mailto:[email protected]] 
Sent: Tuesday, 12 January 2010 13:40
To: [email protected]
Subject: AW: Absolute position of original text in final PDF

Hi Igor, 

I get the area tree as a DOM Node and select the nodes I'm interested in with 
XPath expressions.
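
A minimal sketch of that approach, using only the JDK's own DOM and XPath 
classes. The sample document is hand-written and heavily simplified; only the 
element name block and the attributes ipd, bpd and prod-id are taken from this 
thread, and the class and method names are made up for illustration. Real FOP 
area tree output has more nesting levels, which the "//" in the expression 
skips over.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AreaTreeXPathSketch {

    // Hand-written, heavily simplified stand-in for an area tree document.
    // Real FOP output has more nesting levels and many more attributes.
    static final String SAMPLE =
          "<areaTree><pageSequence><pageViewport><page>"
        + "<block ipd='451000' bpd='17000' prod-id='#N10011'/>"
        + "<block ipd='451000' bpd='9000'/>"
        + "<block ipd='451000' bpd='23000' prod-id='#N1001F'/>"
        + "</page></pageViewport></pageSequence></areaTree>";

    /** Returns one "id ipd bpd" line per block element that carries a prod-id. */
    static List<String> describeBlocks(String areaTreeXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(areaTreeXml)));
            // "//block" matches at any depth, so the elements in between do not matter.
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//block[@prod-id]", doc, XPathConstants.NODESET);
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                Element block = (Element) hits.item(i);
                lines.add(block.getAttribute("prod-id")
                        + " ipd=" + block.getAttribute("ipd")
                        + " bpd=" + block.getAttribute("bpd"));
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("could not query the area tree", e);
        }
    }

    public static void main(String[] args) {
        for (String line : describeBlocks(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

The block without a prod-id is skipped by the predicate, so only blocks that 
can be traced back to an id in the source XML are reported.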

Regards,
 
Georg Datterl
 
------ Contact ------
 
Georg Datterl
 
Geneon media solutions gmbh
Gutenstetter Straße 8a
90449 Nürnberg
 
HRB Nürnberg: 17193
Managing Director: Yong-Harry Steiert 

Tel.: 0911/36 78 88 - 26
Fax: 0911/36 78 88 - 20
 
www.geneon.de
 
Other members of the Willmy MediaGroup:
 
IRS Integrated Realization Services GmbH:    www.irs-nbg.de 
Willmy PrintMedia GmbH:                            www.willmy.de
Willmy Consult & Content GmbH:                 www.willmycc.de 
-----Original Message-----
From: Igor Rosenberg [mailto:[email protected]] 
Sent: Tuesday, 12 January 2010 13:18
To: [email protected]
Subject: RE: Absolute position of original text in final PDF

Hello,
Does anyone know how to walk down the Area Tree looking for specific elements 
(in my case Blocks) and infer each element's x and y coordinates? The Area 
Tree is truly complicated. Since it is hierarchical, is there a way of 
visiting all the nodes without having to know each node's exact class? 
Regards
Igor


-----Original Message-----
From: Georg Datterl [mailto:[email protected]]
Sent: Tuesday, 12 January 2010 10:40
To: [email protected]
Subject: AW: Absolute position of original text in final PDF

Hi Igor, 

Since you already work with the area tree, try this: don't autogenerate the 
id, but give each text in your XML a unique id and set it as the block id. In 
the area tree you can then find the block that contains your text (prod-id is 
your id). This block has a width (ipd) and a height (bpd) as well as your id 
(prod-id). As for the positioning, I have not done that yet, and a quick try 
did not give me a clear picture. In the worst case you can fake the horizontal 
starting point from the known left margin and calculate the vertical starting 
point by adding up the bpd of the previous blocks in the area tree. But I 
guess those who know the area tree better may have a better solution for that.
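
The worst-case fallback described above can be sketched as a running sum. 
Everything here is a simplifying assumption rather than FOP behaviour: the 
blocks are taken to be stacked directly below each other on a single page, 
spacing between blocks and page breaks are ignored, and all values share one 
unit. The class names BlockArea and Box are hypothetical and exist only for 
this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BlockOffsetSketch {

    /** Minimal stand-in for a block area; id is null when the block has no id. */
    static final class BlockArea {
        final String id;
        final int ipd; // width
        final int bpd; // height
        BlockArea(String id, int ipd, int bpd) {
            this.id = id;
            this.ipd = ipd;
            this.bpd = bpd;
        }
    }

    /** Estimated rectangle, measured from the top-left corner of the page. */
    static final class Box {
        final String id;
        final int x, y, w, h;
        Box(String id, int x, int y, int w, int h) {
            this.id = id;
            this.x = x;
            this.y = y;
            this.w = w;
            this.h = h;
        }
        @Override
        public String toString() {
            return "Bounding_box {id=\"" + id + "\", x=" + x + ", y=" + y
                    + ", w=" + w + ", h=" + h + "}";
        }
    }

    /** x is faked from the left margin; y is the sum of the bpd of all earlier blocks. */
    static List<Box> estimate(List<BlockArea> blocksInOrder, int leftMargin, int topMargin) {
        List<Box> boxes = new ArrayList<>();
        int y = topMargin;
        for (BlockArea block : blocksInOrder) {
            if (block.id != null) {
                boxes.add(new Box(block.id, leftMargin, y, block.ipd, block.bpd));
            }
            y += block.bpd; // blocks without an id still take up vertical space
        }
        return boxes;
    }

    public static void main(String[] args) {
        List<BlockArea> blocks = Arrays.asList(
                new BlockArea("#N10011", 451000, 17000),
                new BlockArea(null, 451000, 9000),
                new BlockArea("#N1001F", 451000, 23000));
        for (Box box : estimate(blocks, 72000, 72000)) {
            System.out.println(box);
        }
    }
}
```

Because the estimate ignores spacing and page breaks, it should only be 
treated as a rough approximation of where a block ends up.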

Regards,
 
Georg Datterl
 
-----Original Message-----
From: Igor Rosenberg [mailto:[email protected]]
Sent: Monday, 11 January 2010 18:44
To: [email protected]
Subject: RE: Absolute position of original text in final PDF

Hi,

The ids get generated on the fly during the XML to FO transformation, for the 
original <text> tags, and get output in the FO format as <fo:block id=...> . 
This is expressed as 

                                        <fo:block id="#{generate-id()}" 

in the XSL document. See generate-id() in 
http://www.w3schools.com/XSL/func_generateid.asp . So basically, I'm just doing 
the XSLT transformation (my own format to FO) and adding the id generation as 
a bonus. The XSLT and FO details are below.

Cheers

Igor

 

------

 

The relevant parts of the XSL look like this:

        <xsl:template match="body">
                <fo:block space-after.optimum="3pt" space-before.optimum="4pt">
                        <xsl:apply-templates/>
                </fo:block>
        </xsl:template>

        <xsl:template match="text">
                <xsl:choose>
                        <xsl:when test="@alignment='Left'">
                                <fo:block id="#{generate-id()}" font-size="8pt"
                                          font-weight="normal" font-family="sans-serif"
                                          line-height="9pt" space-after.optimum="8pt"
                                          text-align="left">
                                        <xsl:value-of select="."/>
                                </fo:block>
                        </xsl:when>
                        <!-- the excerpt ends here; branches for other alignments are not shown -->
                </xsl:choose>
        </xsl:template>
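
To see what ids such a template produces, the transformation can be run on 
its own with the JDK's built-in XSLT processor. This is only a sketch: the 
stylesheet below is a cut-down version that keeps nothing but the id handling, 
the input document is invented, and the class name is hypothetical. The exact 
id strings depend on the XSLT processor, and the XSLT 1.0 specification does 
not oblige a processor to generate the same ids on every run.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class GenerateIdSketch {

    // Cut-down stylesheet: only the id handling from the excerpt is kept.
    static final String XSL =
          "<xsl:stylesheet version='1.0'"
        + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + "    xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "  <xsl:template match='body'>"
        + "    <fo:block><xsl:apply-templates select='text'/></fo:block>"
        + "  </xsl:template>"
        + "  <xsl:template match='text'>"
        + "    <fo:block id='#{generate-id()}'><xsl:value-of select='.'/></fo:block>"
        + "  </xsl:template>"
        + "</xsl:stylesheet>";

    // Invented input document with two <text> elements.
    static final String XML =
        "<body><text>Address 1 line</text><text>Address 2 line</text></body>";

    /** Applies the stylesheet to the input and returns the serialized FO. */
    static String transform(String xsl, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    /** Collects every id attribute value that starts with '#'. */
    static List<String> extractIds(String fo) {
        List<String> ids = new ArrayList<>();
        Matcher m = Pattern.compile("\\bid=[\"'](#[^\"']+)[\"']").matcher(fo);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        String fo = transform(XSL, XML);
        System.out.println(fo);
        System.out.println(extractIds(fo));
    }
}
```

Each <text> element gets its own id, but nothing ties that id back to an 
identifier chosen in the source XML.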

  

------

A section of the output FO, as produced by Apache FOP, looks like this 
(indented by me). Note how the ids appear in the fo:block tags:

<fo:block>
  <fo:block text-align="right" space-after.optimum="8pt" line-height="9pt"
            font-family="sans-serif" font-weight="normal" font-size="8pt" id="#N10011">
    Address 1 line
  </fo:block>
  <fo:block text-align="right" space-after.optimum="8pt" line-height="9pt"
            font-family="sans-serif" font-weight="normal" font-size="8pt" id="#N10017">
    Address 2 line
  </fo:block>
  <fo:block text-align="center" space-after.optimum="10pt" line-height="13pt"
            font-family="sans-serif" font-weight="normal" font-size="12pt" id="#N1001F">
    Another text line in different font
  </fo:block>
</fo:block>

 

 

From: Peter Hancock [mailto:[email protected]]
Sent: Monday, 11 January 2010 18:30
To: [email protected]
Subject: Re: Absolute position of original text in final PDF

 

Hi Igor,

It is not clear to me how these <text> elements are defined. Are they in your 
XML input? If so, how do you transform them to FO while retaining the id 
attribute? Could you provide a small example of the XML and the corresponding 
XSL that you want to use as input to FOP?

Thanks,

Pete

On Mon, Jan 11, 2010 at 4:39 PM, Igor Rosenberg 
<[email protected]> wrote:

Dear FOP mailing list readers,

 

I've been fighting with the Apache FOP source for a week, but I can't solve my 
problem alone... 

 

One of the features of the application I'm writing produces a PDF based on an 
XML document that follows a simple schema (header info, tables, images and 
text, but nothing fancy). Generating the FO and then the PDF are the easy 
steps; FOP does the job marvelously. Now I need to output to the user the 
coordinates of bounding boxes. Those bounding boxes must represent the 
placement in the PDF of the original text within the XML. To provide an example: 

 

If I had in my original XML

<text id="xxx">This text appears somewhere in the PDF</text>

I would want, during the XML to PDF process, to output something like

            Bounding_box {id="xxx", x=34, y=45, w=444, h=25}

I understand this as "the original text of tag xxx is contained in the PDF in 
the rectangle starting at point (34,45), of width 444 and height 25".

(If the text is split across several pages or areas, receiving a list of 
rectangles would be fine.)

 

To summarize: how can I know the position in the final PDF of the original text?

 

I've tried decorating different classes of FOP, looking at the FOTreeBuilder 
and the AreaTreeParser, but failed to preserve the identifier of the original 
text tags. 

I'd prefer staying with release 0.95, but can also use the trunk if required. 

 

While browsing, I thought that the accessibility features might help, but 
couldn't figure out how:

http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/documentation/content/xdocs/trunk/accessibility.xml?view=markup

I thought of relying on the Area Tree, but couldn't retrieve the original ids 
that were set on the original XML tags. 

                http://wiki.apache.org/xmlgraphics-fop/AreaTreeXMLDocumentation

                http://old.nabble.com/Area-Tree-Handling-to24431098.html

 

Thanks for any  help

 

Igor 

 
