The document would need to have a DTD, but you don't need to be validating.
Among other things, "ignorable whitespace" is always assessed when the
document has a DTD which has been read, regardless of whether you've
enabled validation or not.
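
A minimal sketch of the behaviour described above, assuming the JAXP SAX parser bundled with the JDK (which is derived from Xerces) and a small made-up document with an internal DTD subset. The class name IgnorableWhitespaceDemo, the CountingHandler helper and the sample DTD are illustrative only and are not from the original thread. Validation is switched off, yet the whitespace between tags in the element-only content of <siteinfo> should reach ignorableWhitespace() rather than characters().

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class IgnorableWhitespaceDemo {

    /** Hypothetical helper: records how the parser reports text and whitespace. */
    static class CountingHandler extends DefaultHandler {
        int ignorable = 0;                               // chars seen by ignorableWhitespace()
        final StringBuilder text = new StringBuilder();  // chars seen by characters()

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) {
            ignorable += length;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Parses the given XML without validation and returns the filled handler. */
    static CountingHandler parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(false); // the DTD is still read, just not enforced
        XMLReader reader = factory.newSAXParser().getXMLReader();
        CountingHandler handler = new CountingHandler();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler;
    }

    /** Assumed sample document: the DTD declares element-only content for siteinfo. */
    static String sampleDocument() {
        return "<!DOCTYPE siteinfo [\n"
             + "  <!ELEMENT siteinfo (sitename, generator)>\n"
             + "  <!ELEMENT sitename (#PCDATA)>\n"
             + "  <!ELEMENT generator (#PCDATA)>\n"
             + "]>\n"
             + "<siteinfo>\n"
             + "  <sitename>Wikipedia</sitename>\n"
             + "  <generator>MediaWiki 1.17wmf1</generator>\n"
             + "</siteinfo>";
    }

    public static void main(String[] args) throws Exception {
        CountingHandler h = parse(sampleDocument());
        System.out.println("characters(): " + h.text);
        System.out.println("ignorable whitespace chars: " + h.ignorable);
    }
}
```

A handler that simply leaves ignorableWhitespace() as a no-op (the DefaultHandler default) then never sees that whitespace in characters(). Other SAX implementations may behave differently, so treat this as something to verify against the parser you actually use.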

Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrgla...@ca.ibm.com
E-mail: mrgla...@apache.org

kesh...@us.ibm.com wrote on 07/11/2011 10:52:32 PM:

> If you are validating against a DTD, and IF the enclosing element
> does not have mixed content, look at the SAX/DOM definitions of
> "ignorable whitespace" and how to handle it. (The term is
> unfortunate; it's better described as "whitespace in element-only
> content".)
>
> If you are not validating the document, the parser can not make this
> distinction and you must do so in your application code.
>
>
> ______________________________________
> "You build world of steel and stone
> I build worlds of words alone
> Skilled tradespeople, long years taught:
> You shape matter; I shape thought."
> (http://www.songworm.com/lyrics/songworm-parody/ShapesofShadow.html)
>

>
> From: Albretch Mueller <lbrt...@gmail.com>
> To: j-users@xerces.apache.org
> Date: 07/11/2011 06:13 PM
> Subject: dismissing characters such as carriage returns and spaces after an
> ending and before an starting tag ...
>
> ~
> I am XMLRead[er|ing] an XML file (which I am validating using the
> specified schema) that looks like this:
> ~
> <mediawiki xmlns="http://www.mediawiki.org/xml/export-0.5/"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.5/
> http://www.mediawiki.org/xml/export-0.5.xsd" version="0.5"
> xml:lang="en">
>  <siteinfo>
>    <sitename>Wikipedia</sitename>
>    <base>http://en.wikipedia.org/wiki/Main_Page</base>
>    <generator>MediaWiki 1.17wmf1</generator>
>    <case>first-letter</case>
>    <namespaces>
>      <namespace key="-2" case="first-letter">Media</namespace>
>      <namespace key="109" case="first-letter">Book talk</namespace>
>    </namespaces>
>  </siteinfo>
> </mediawiki>
> ~
> What do you do in order for the ContentHandler not to report as
> "characters" such character sequences after an ending tag and before a
> starting tag?
> ~
> Thank you
> lbrtchx
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: j-users-unsubscr...@xerces.apache.org
> For additional commands, e-mail: j-users-h...@xerces.apache.org
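
For the case raised in the quoted reply, where the parser cannot tell which whitespace is ignorable and the application has to decide, one possible approach is sketched here. It buffers character data and drops any run between two tags that consists only of whitespace. The class name WhitespaceSkippingHandler and its collect() helper are hypothetical. This is a heuristic: it would also drop whitespace-only text that is significant in mixed content, so it only suits documents shaped like the one in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class WhitespaceSkippingHandler extends DefaultHandler {
    // characters() may deliver one text run in several chunks, so buffer first
    private final StringBuilder buffer = new StringBuilder();
    // text runs that contained something other than whitespace
    final List<String> textRuns = new ArrayList<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flush();
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flush();
    }

    /** Keeps the buffered run unless it is empty or whitespace-only. */
    private void flush() {
        String run = buffer.toString();
        buffer.setLength(0);
        if (!run.trim().isEmpty()) {
            textRuns.add(run);
        }
    }

    /** Parses the XML (no DTD, no validation) and returns the non-blank text runs. */
    static List<String> collect(String xml) throws Exception {
        WhitespaceSkippingHandler handler = new WhitespaceSkippingHandler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.textRuns;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<siteinfo>\n  <sitename>Wikipedia</sitename>\n"
                   + "  <case>first-letter</case>\n</siteinfo>";
        System.out.println(collect(xml));
    }
}
```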
