Donald Ball wrote:
> (sent to cocoon-users, no help there...)
>
> hey guys. i'm trying to retrieve some xml content over http to begin one
> of my pipelines:
>
> /nlm/query?author=Smith
>
> <map:match pattern="nlm/query">
>   <map:match type="request" pattern="author">
>     <map:generate src="http://www.ncbi.nlm.nih.gov/entrez/utils/pmqty.fcgi?db=PubMed&mode=XML&dispmax=999&term={1}[au]"/>
>     <map:serialize type="xml"/>
>   </map:match>
> </map:match>
>
> the xml returned from the nih server will begin like so:
>
> <?xml version="1.0"?>
> <!DOCTYPE QueryResult PUBLIC "-//NLM//DTD QueryResult, 22 Jan 2002//EN"
>   "/entrez/query/DTD/pmqty_020122.dtd" >
> <QueryResult>
>
> unfortunately, i get an exception when cocoon tries to parse this
> document. it claims that it cannot access the dtd:
>
> java.net.MalformedURLException: no protocol: /entrez/query/DTD/pmqty_020122.dtd
>   at java.net.URL.<init>(URL.java:473)
>   at java.net.URL.<init>(URL.java:376)
>   at java.net.URL.<init>(URL.java:330)
>   at org.apache.xerces.impl.XMLEntityManager.startEntity(XMLEntityManager.java:731)
>   at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:691)
>   at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:258)
>   at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(XMLDocumentScannerImpl.java:811)
>   at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:333)
>   at org.apache.xerces.parsers.StandardParserConfiguration.parse(StandardParserConfiguration.java:525)
>   at org.apache.xerces.parsers.StandardParserConfiguration.parse(StandardParserConfiguration.java:581)
>   at org.apache.xerces.parsers.XMLParser.parse(XMLParser.java:147)
>   at org.apache.xerces.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1157)
>   at org.apache.avalon.excalibur.xml.JaxpParser.parse(JaxpParser.java:241)
>   at org.apache.cocoon.components.source.AbstractStreamSource.toSAX(AbstractStreamSource.java:204)
>   at org.apache.cocoon.generation.FileGenerator.generate(FileGenerator.java:142)
>
> shouldn't it be trying to download the DTD from this url:
>
> http://www.ncbi.nlm.nih.gov/entrez/query/DTD/pmqty_020122.dtd
>
> where it does, in fact, live?
>
> i did manage to work around this problem using the excellent entity
> catalogs facility, and i suspect that's what we'll want to use in the long
> term, but i would like to track down why this isn't working as (i think)
> it ought to. thanks in advance.
>
> - donald
Good to hear that the entity catalogs worked for you.

I think the reason you cannot do without the entity catalog resolver is that the document type declaration in the XML instance document does not use a full URL, i.e. http://www.ncbi.nlm.nih.gov/entrez/qu... So the parser is trying to find the DTD at the root of your local filesystem, i.e. /entrez/qu...

--David
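
To make the difference between the bare path and the full URL concrete, here is a minimal Java sketch. It uses only java.net, not Cocoon or Xerces, and the class name, method names and the choice of base URL (the pmqty.fcgi address from the sitemap, query string dropped) are assumptions made for illustration. It shows that building a URL from the system identifier alone fails with the same "no protocol" message as in the trace, while resolving it against the URL the document came from gives the address Donald expected.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical demo class, not part of Cocoon or Xerces.
class DtdResolutionSketch {
    // System identifier exactly as it appears in the DOCTYPE quoted above.
    static final String SYSTEM_ID = "/entrez/query/DTD/pmqty_020122.dtd";

    // Assumed base: the URL the document was fetched from, query string omitted.
    static final String BASE = "http://www.ncbi.nlm.nih.gov/entrez/utils/pmqty.fcgi";

    // Tries to build a URL from the bare system identifier and returns
    // the exception message, or null if no exception was thrown.
    static String bareUrlError() {
        try {
            new URL(SYSTEM_ID);
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    // Resolves the system identifier against the base URL, following the
    // usual relative-reference rules implemented by java.net.URI.
    static String resolveAgainstBase() {
        return URI.create(BASE).resolve(SYSTEM_ID).toString();
    }

    public static void main(String[] args) {
        System.out.println("bare path:  " + bareUrlError());
        System.out.println("with base:  " + resolveAgainstBase());
    }
}
```

The second call returns http://www.ncbi.nlm.nih.gov/entrez/query/DTD/pmqty_020122.dtd. A SAX parser can only do this kind of resolution if it knows the document's own address, which is normally supplied through InputSource.setSystemId. Whether Cocoon passes that address along for an http source is not something this thread establishes, so treat that part as a guess to check.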