Hello,

I'm working on an application that merges XFDF annotation files into PDF files, and I noticed some strange behaviour with certain types of text annotations.
It looks like text that is not contained in a span is ignored when merging. One user uploaded this annotation (not the actual texts) from an older Acrobat:

<freetext width="2.000000" color="#FFFFFF" creationdate="D:20180910162711+02'00'" flags="print" date="D:20180911172716+02'00'" page="0" rect="1136.342529,3886.797363,1221.432617,4367.977539" rotation="90" subject="Textfeld" title="username">
<contents-richtext>
<body xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/"
      xfa:APIVersion="Acrobat:11.0.15" xfa:spec="2.0.2"
      style="font-size:11.0pt;text-align:left;color:#0000FF;font-weight:bold;font-style:normal;font-family:Arial;font-stretch:normal">
<p dir="ltr">ABC <span style="text-decoration:underline">DEF</span> GHI
</p>
<p dir="ltr"><span style="font-weight:normal">More text
</span></p>
<p dir="ltr"><span style="font-weight:normal">More text</span></p>
<p dir="ltr"><span style="font-weight:normal">More text</span></p>
</body>
</contents-richtext>
<defaultappearance>0 0 1 rg /Arial,Bold 11 Tf</defaultappearance>
<defaultstyle>font: bold Arial 11.0pt; text-align:left; color:#0000FF </defaultstyle>
</freetext>

After merging, the texts "ABC" and "GHI" are gone: they are neither displayed on the page nor shown in the comments area in Acrobat Reader.

When I tried to create a similar annotation using a current Acrobat Reader DC, I got:

<freetext color="#FFFFFF" creationdate="D:20180913132943+02'00'" flags="print" date="D:20180913132956+02'00'" page="0" rect="181.799377,672.266907,326.595337,723.213623" subject="Textfeld" title="keggenhoff">
<contents-richtext>
<body xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/"
      xfa:APIVersion="Acrobat:18.11.0" xfa:spec="2.0.2"
      style="font-size:12.0pt;text-align:left;color:#FF0000;font-weight:normal;font-style:normal;font-family:Helvetica,sans-serif;font-stretch:normal">
<p dir="ltr"><span style="font-family:Helvetica">ABC</span><span style="text-decoration:word;font-family:Helvetica"> DEF</span><span style="font-family:Helvetica"> GHI</span></p>
</body>
</contents-richtext>
<defaultappearance>0.898 0.1333 0.2157 rg /Helv 12 Tf</defaultappearance>
<defaultstyle>font: Helvetica,sans-serif 12.0pt; text-align:left; color:#E52237 </defaultstyle>
</freetext>

When I merge this annotation with the PDF, the text is complete. However, when I remove the span tags around ABC and GHI, both texts are again missing after merging.

Now my question is whether the (ancient) Acrobat should have included span tags there, or whether PDFBox should process text that is not inside a span. I tested this with PDFBox 2.0.6 and 2.0.11, and the behaviour was identical.

Thanks in advance,
Kai Keggenhoff
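
P.S. A possible workaround until this is clarified might be to preprocess the XFDF so that any bare text directly inside a <p> is wrapped in its own span before merging. The following is only a minimal sketch using the JDK's DOM API. The class name LooseTextWrapper and the method wrapLooseText are hypothetical and not part of PDFBox, and the idea that the wrapped text then survives the merge is an assumption based on the observation above that span-wrapped text is kept.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not a PDFBox class): wraps bare text inside XHTML <p>
// elements in <span> elements so that every piece of text sits inside a span.
final class LooseTextWrapper {

    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    static String wrapLooseText(String xfdf) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // the rich text body lives in the XHTML namespace
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xfdf)));

            NodeList paragraphs = doc.getElementsByTagNameNS(XHTML_NS, "p");
            for (int i = 0; i < paragraphs.getLength(); i++) {
                Element p = (Element) paragraphs.item(i);

                // Snapshot the children first, because the loop below modifies the tree.
                List<Node> children = new ArrayList<>();
                for (Node c = p.getFirstChild(); c != null; c = c.getNextSibling()) {
                    children.add(c);
                }

                for (Node child : children) {
                    boolean looseText = child.getNodeType() == Node.TEXT_NODE
                            && !child.getTextContent().trim().isEmpty();
                    if (looseText) {
                        // Put a new, unstyled span where the text node was and move the text into it.
                        Element span = doc.createElementNS(XHTML_NS, "span");
                        p.replaceChild(span, child);
                        span.appendChild(child);
                    }
                }
            }

            // Serialize the modified document back to a string.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not process XFDF", e);
        }
    }
}
```

Calling LooseTextWrapper.wrapLooseText(xfdfString) on the first annotation above should turn "ABC " and " GHI" into two new spans while leaving the existing underlined span untouched. The new spans carry no style attribute, so they would be expected to inherit the formatting given on the body element.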