Dave,

Thanks for the response.  My detailed responses to your questions are below.  
In general, I think I see that it (rightly) won't do what I'm trying to do.  
So, no worries.

Thanks again, 
Chris

-----Original Message-----
From: David Bertoni [mailto:[email protected]] 
Sent: Wednesday, May 20, 2009 1:34 PM
To: [email protected]
Subject: Re: problem with special characters / entities

Heinz, Chris wrote:
> Hey, I'm a noob here, so if anyone wants to point me to the archives of 
> this mailing list to search for my problem, that's fine.
> 
> My problem is that I have three special characters being placed into 
> formatted text:  return, non-breaking spaces, and soft hyphens.  I can 
> input them as &#x0D;, &#xA0;, and &#xAD;.  The first two Xerces handles 
> fine, the third I seem to be getting a standard hyphen???
Have you examined the content of the document to verify this?  I don't 
know of any code in Xerces-C that would translate a soft hyphen to a 
regular hyphen.
>>> I think this was on my application end.  My duh.

> But when I 
> write them out, they go in as non-printing control characters.  Xerces 
> can import those fine, so I can round trip, but, the non-printing 
> characters aren't too user-friendly.
I'm not sure I understand your question and the problems you're seeing.
Are you trying to configure the serializer so it generates entity 
references for certain characters?  If so, there's no way to do that.
>>> Yah, I guess that's what I'm trying to do.  You're right, these are legal 
>>> Windows-1252 characters, why should Xerces do anything to them?

> 
> I have defined in my dtd file:
> 
> <!ENTITY return "&#x0D;">
> <!ENTITY nbsp "&#xA0;">
> <!ENTITY softhyphen "&#xAD;">
In general, the DTD is processed by the parser, the entities are 
expanded, and their identities are lost. There is no connection between 
the DTD in the input document, and the document the serializer generates.
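
The loss of entity identity described above can be seen in a minimal sketch. 
It uses Python's standard xml.etree.ElementTree rather than Xerces-C, purely 
as an illustration of the general parser behaviour; the internal DTD subset 
and the <doc> element are assumptions made so the example is self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the entity names mirror the DTD quoted above, but are
# declared in an internal subset so the example needs no external file.
SOURCE = (
    '<!DOCTYPE doc [\n'
    '  <!ENTITY nbsp "&#xA0;">\n'
    '  <!ENTITY softhyphen "&#xAD;">\n'
    ']>\n'
    '<doc>a&nbsp;b&softhyphen;c</doc>'
)

root = ET.fromstring(SOURCE)

# The parser has already expanded both references into plain characters.
parsed_text = root.text

# Serializing again emits the characters themselves; the entity names and
# the DOCTYPE are not part of the tree, so they cannot be written back out.
reserialized = ET.tostring(root, encoding="unicode")
```

After parsing, the tree holds only U+00A0 and U+00AD as ordinary characters, 
so a serializer has nothing that would tell it to write &nbsp; or 
&softhyphen; again.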

> 
> And tried &return;, etc, that didn't seem to work at all.
Didn't seem to work in what way?
>>> I put &return; into the input stream, it seemed to be completely ignored -- 
>>> I got no character placed at that position.  Not a biggie, &#x0d; works 
>>> fine.

> I've checked DomOptions and looked at DOMSerializer, haven't seen 
> anything that looks like it would help.
The usual way to handle this is to specify US-ASCII as the encoding. 
Since that encoding only supports characters below 128, all other 
characters will be written as numeric character references.
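
As a rough illustration of the principle (not of the Xerces-C API), the 
sketch below uses Python's standard xml.etree.ElementTree, which follows the 
same rule when asked for a US-ASCII result. The element name "inline" and the 
sample text are assumptions for the example only.

```python
import xml.etree.ElementTree as ET

# Build a small element whose text holds a non-breaking space (U+00A0)
# and a soft hyphen (U+00AD), two of the characters discussed above.
elem = ET.Element("inline")
elem.text = "non\u00a0breaking soft\u00adhyphen"

# UTF-8 can represent both characters, so they are written as raw bytes.
as_utf8 = ET.tostring(elem, encoding="utf-8")

# US-ASCII cannot, so the serializer falls back to character references.
as_ascii = ET.tostring(elem, encoding="us-ascii")
```

ElementTree writes the references in decimal (&#160; and &#173;), which is 
equivalent to the hexadecimal forms &#xA0; and &#xAD;.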

However, that will not solve the problem with the U+000D, which should 
already be written as a numeric character reference.  If that's not the 
case, the Xerces serializer has a bug.
>>> I've attached the output file created (note, via a MemBufFormatTarget 
>>> rather than directly to a file); there are single x0Ds at the end of each 
>>> of the <fo:inline>s in the 2nd <fo:block>.  Note, Xerces did not create the 
>>> overall XML of this file, just the document embedded in the CDATA section.

Dave

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

<?xml version="1.0"?>
<File>
<FileName>file1-out.xml</FileName>
<FileContents><![CDATA[ <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE dlg:dxf-text SYSTEM "dialogue.dtd">
<dlg:dxf-text xmlns:dlg="http://www.exstream.com/2003/XSL/Dialogue" schemaVersion="2.0" xmlns:dxf="http://www.exstream.com/2008/XSL/DXF" xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <fo:flow display-align="auto" height="2503.00lu" margin-bottom="0.00lu" margin-left="0.00lu" margin-right="0.00lu" margin-top="0.00lu" width="7917.00lu">
    <fo:block end-indent="0lu" keep-together="auto" keep-with-next="auto" line-height="140lu" space-after="0lu" space-before="0lu" start-indent="10lu" tab-ruler="-1" text-align="justify" text-indent="0lu" usage-rule="Rule|0|">
      <dlg:tab-ruler default-tab="0.00lu" id="" list-type="none" number-indent="0" number-string="" number-type="num" user-set-color="false" user-set-type="false">
        <dlg:tab-stop tab-align="left" tab-char="0" tab-indent="500.00lu"/>
        <dlg:tab-stop tab-align="left" tab-char="0" tab-indent="1390.00lu"/>
        <dlg:tab-stop tab-align="left" tab-char="0" tab-indent="2000.00lu"/>
      </dlg:tab-ruler>
      <fo:inline color="" font-family="Arial" font-size="9.00pt" font-style="normal" letter-spacing="0.00pt" text-decoration="underline">OWNERSHIP OF PROPERTY: </fo:inline>
      <fo:inline color="" font-family="Arial" font-size="9.00pt" font-style="normal" letter-spacing="0.00pt">Borrower represents that the Property is owned by Borrower free and clear of all liens and encumbrances except those of which Borrower has informed Lender in writing. Prior to any default, Borrower may keep and use the Property at Borrower's own risk, subject to the provisions of the Uniform­itationality Commercial Code. If the Property includes a motor vehicle or a mobile home, Borrower will, upon request, deliver the certificate of title to the motor vehicle or a mobile home to Lender.  EDITED.</fo:inline>
    </fo:block>
    <fo:block end-indent="0lu" keep-together="auto" keep-with-next="auto" line-height="140lu" space-after="0lu" space-before="70lu" start-indent="10lu" tab-ruler="-1" text-align="left" text-indent="0lu" usage-rule="Rule|0|">
      <dlg:tab-ruler default-tab="1000.00lu" id="" list-type="none" number-indent="0" number-string="" number-type="num" user-set-color="false" user-set-type="false"/>
      <fo:inline color="" font-family="Arial" font-size="9.00pt" font-style="normal" letter-spacing="0.00pt" text-decoration="underline">USE OF PROPERTY: </fo:inline>
      <fo:inline color="" font-family="Arial" font-size="9.00pt" font-style="normal" letter-spacing="0.00pt">Borrower will not sell, lease, encumber, or otherwise dispose of the Property without Lender's prior written consent.
</fo:inline>
      <fo:inline color="" font-family="Arial" font-size="9.00pt" font-style="normal" letter-spacing="0.00pt">Borrower will keep the Property at Borrower's address (as shown on page 1) unless Lender has granted permission in writing for the Property to be
</fo:inline>
      <fo:inline color="" font-family="Arial" font-size="9.00pt" font-style="normal" letter-spacing="0.00pt">located elsewhere. The Property will be used only in the state in which Borrower lives unless the Property is a motor vehicle, in which case it will be
</fo:inline>
      <fo:inline color="" font-family="Arial" font-size="9.00pt" font-style="normal" letter-spacing="0.00pt">used outside the state only in the course of Borrower's normal use of the Property. Borrower will not use or permit the use of the Property for hire or
</fo:inline>
      <fo:inline color="" font-family="Arial" font-size="9.00pt" font-style="normal" letter-spacing="0.00pt">for illegal purposes.  Non breaking spaces here.</fo:inline>
    </fo:block>
    <fo:block end-indent="0lu" keep-together="auto" keep-with-next="auto" line-height="140lu" space-after="0lu" space-before="73lu" start-indent="10lu" tab-ruler="-1" text-align="justify" text-indent="0lu" usage-rule="Rule|0|">
      <dlg:tab-ruler default-tab="247.00lu" id="" list-type="none" number-indent="0" number-string="" number-type="num" user-set-color="false" user-set-type="false"/>
      <fo:inline color="" font-family="Arial" font-size="9.00pt" font-style="normal" letter-spacing="0.00pt" text-decoration="underline">TAXES AND FEES: </fo:inline>
      <fo:inline color="" font-family="Arial" font-size="9.00pt" font-style="normal" letter-spacing="0.00pt">Borrower will pay all taxes, assessments, and other fees payable on the Property, including, but not limited to any fee required by a public official to record the satisfaction of this loan, and/or the reconveyance of a Deed of Trust, and/or the release of Lender's interest in the Property. If Borrower fails to pay such amounts, Lender may pay such amounts for Borrower and the amounts paid by Lender will be added to the unpaid balance of the loan.</fo:inline>
    </fo:block>
  </fo:flow>

</dlg:dxf-text>
]]></FileContents>
</File>