HTML != XML. Try an HTML parser like NekoHTML [1].
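
To illustrate the difference: XML itself predefines only five entities (lt, gt, amp, apos and quot), so an XML parser treats "&raquo;" as undeclared unless a DTD declares it. Below is a minimal sketch using only the JDK's JAXP API. The class name EntityDemo, the helper bodyText and the sample markup are made up for illustration, and declaring the entity in an internal subset is only one possible workaround for input that is otherwise well-formed XML. It does not replace a real HTML parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EntityDemo {
    // Hypothetical helper: returns the document's text content,
    // or null if the XML parser rejects the markup.
    static String bodyText(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(markup)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Rejected: "raquo" is an HTML entity, not one XML knows about.
        String html = "<!DOCTYPE html><html><body>a &raquo; b</body></html>";
        // Accepted: a numeric character reference needs no declaration.
        String numeric = "<html><body>a &#187; b</body></html>";
        // Accepted: the entity is declared in an internal DTD subset.
        String declared = "<!DOCTYPE html [<!ENTITY raquo \"&#187;\">]>"
            + "<html><body>a &raquo; b</body></html>";

        System.out.println(bodyText(html));
        System.out.println(bodyText(numeric));
        System.out.println(bodyText(declared));
    }
}
```

Real-world HTML usually breaks XML rules in other ways too (unclosed tags, unquoted attributes), which is why an HTML parser is the better fit.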

Please note that, judging by your stack trace, you're not using Apache
Xerces at all. The com.sun.org.apache.* classes are Oracle's fork of the
codebase, bundled with the JDK. We have no influence over it.
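
If you want to check which implementation JAXP actually hands you, here is a rough sketch. The class name WhichParser and the helper origin are made up for illustration. It only looks at the package prefix of the DocumentBuilder class, which is a heuristic, not an official API for identifying the parser:

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

public class WhichParser {
    // Hypothetical helper: classify a parser implementation by package prefix.
    static String origin(String className) {
        if (className.startsWith("com.sun.org.apache.xerces.internal.")) {
            return "JDK-internal fork";
        }
        if (className.startsWith("org.apache.xerces.")) {
            return "Apache Xerces";
        }
        return "other";
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        String name = builder.getClass().getName();
        // Print the concrete class and where it appears to come from.
        System.out.println(name + " -> " + origin(name));
    }
}
```

What it prints depends on your classpath and JAXP configuration.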

Thanks.

[1] http://nekohtml.sourceforge.net/

Michael Glavassevich
XML Technologies and WAS Development
IBM Toronto Lab
E-mail: mrgla...@ca.ibm.com
E-mail: mrgla...@apache.org

laredotornado <laredotorn...@gmail.com> wrote on 30/03/2012 05:16:13 PM:

> Hi,
> 
> I'm using Java 6 and the latest version of Xerces.  I'm trying to parse
> an HTML document that begins like this ...
> 
>     <!DOCTYPE html>
> 
> and later references the entity "&raquo;".  Parsing dies with the
> exception ...
> 
> org.xml.sax.SAXParseException: The entity "raquo" was referenced, but not declared.
>    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:249)
>    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
>    at com.myco.myproject.util.XmlUtilities.getStringAsDocument(XmlUtilities.java:147)
>    at com.myco.myproject.util.NetUtilities.getUrlAsDocument(NetUtilities.java:65)
>    at com.myco.myproject.parsers.impl.AbstractMetromixParser.parsePage(AbstractMetromixParser.java:107)
>    at com.myco.myproject.parsers.impl.AbstractMetromixParser.getEvents(AbstractMetromixParser.java:76)
>    at com.myco.myproject.domain.EventFeed.refresh(EventFeed.java:81)
>    at com.myco.myproject.domain.EventFeed.getEvents(EventFeed.java:72)
>    at com.myco.myproject.parsers.impl.MetromixParserTest.testParser(MetromixParserTest.java:21)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>    at java.lang.reflect.Method.invoke(Method.java:597)
>    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>    at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:74)
>    at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:83)
>    at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:72)
>    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:231)
>    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>    at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
>    at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:71)
>    at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:174)
>    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
>    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
>    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
>    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
>    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
>    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
> 
> Is there any way to tell the parser to ignore these types of entities it
> cannot resolve?  If not, what resolver do I have to plug in?
> 
> Thanks, - Dave
> -- 
> View this message in context:
> http://old.nabble.com/Way-it-ignore-entity-reference-resolving--tp33544935p33544935.html
> Sent from the Xerces - J - Users mailing list archive at Nabble.com.
