Jörg,

Thank you for adding the issue to JIRA.

In response to your comment on point #2, my approach to URI handling is driven 
by specific needs and may not suit everyone. These needs are:

* Reduce the cost of accessing the DTD. It's typically static content and 
could be read directly from the file system, using a path relative to the XML 
feed whose format it defines. Ideally it's served without going through a 
Cocoon pipeline.

* The XML feed contents change many times a day. Depending on feed size, the 
feed may or may not be compressed into a zip archive. If it is, the archive is 
placed in the same directory where the feed would otherwise be, and the Cocoon 
pipeline uses a ResourceExistsSelector to look for <somefeed>.zip, falling back 
to <somefeed>.xml in its absence.

Using the original implementation (with the ":" you spotted missing) would 
require the DTD to be inserted into the archive, not to mention the overhead of 
going through another instance of the Cocoon pipeline to obtain the DTD from a 
zip:file... location. That is too costly for the scalable system we need to 
maintain.
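
To make the difference concrete, here is a minimal Java sketch of the two URI 
shapes under discussion. It is not the actual ZipSource code: the class name 
ZipUriSketch, its method names and the file:/data/feeds/... paths are made up 
for illustration, and java.net.URI only stands in for whatever the XML parser 
really does when it resolves a relative SYSTEM identifier.

```java
import java.net.URI;

class ZipUriSketch {

    // URI shape of the corrected implementation: protocol, ":", archive URI, "!/", entry path.
    static String zipUri(String protocol, String archiveUri, String filePath) {
        return protocol + ":" + archiveUri + "!/" + filePath;
    }

    // Alternative shape: drop the zip protocol and place the entry next to the archive.
    static String siblingUri(String archiveUri, String filePath) {
        int idx = archiveUri.lastIndexOf('/');
        // Keep everything up to and including the last "/", or nothing if there is none.
        String dir = idx < 0 ? "" : archiveUri.substring(0, idx + 1);
        return dir + filePath;
    }

    // Resolve a relative DTD reference against a hierarchical base URI.
    static String resolveDtd(String baseUri, String relativeDtd) {
        return URI.create(baseUri).resolve(relativeDtd).toString();
    }

    public static void main(String[] args) {
        String archive = "file:/data/feeds/somefeed.zip"; // hypothetical location
        System.out.println(zipUri("zip", archive, "somefeed.xml"));
        String base = siblingUri(archive, "somefeed.xml");
        System.out.println(base);
        System.out.println(resolveDtd(base, "somefeed.dtd"));
    }
}
```

With the sibling form, a relative DTD reference ends up in the directory that 
holds the archive, so the DTD can stay on the file system and out of the zip.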

Btw, if the DTD path is absolute, my understanding is that the getURI method is 
not invoked at all, as the location of the source becomes irrelevant. This may 
explain why the bug wasn't spotted in the first place, as I guess in most cases 
the DTD path is absolute (or the declaration is missing altogether).


Regards,

-Leonid


-----Original Message-----
From: Joerg Heinicke [mailto:[EMAIL PROTECTED] 
Sent: Monday, March 12, 2007 6:58 PM
To: [email protected]
Subject: Re: File Generator and Compressed XML

Hi Leonid,

thanks for your detailed investigation. I added it to Jira [1] and 
fixed part 1. In addition to the issue with protocolEnd, there was an 
error in getURI() of ZipSource. Instead of

   return this.protocol + this.archive.getURI() + "!/" + this.filePath;

it has to be

   return this.protocol + ":" + this.archive.getURI() + "!/" + this.filePath;

On 12.03.2007 15:52, Leonid Geller wrote:

> 2. When using a SYSTEM identifier with relative DTD path, the XML parser will 
> look for the file relative to the URI of the zipped source, 
> zip:archive.zip!/source.xml which is obviously going to fail.
> 
> Here, the solution is to have the source implementation class (in this case 
> org.apache.cocoon.components.source.impl.ZipSource) to change getURI method 
> to return source.xml based on archive.zip location, w/o the zip protocol. 
> Current implementation:
> 
>       return this.protocol + this.archive.getURI() + "!/" + this.filePath;
> 
> is not going to work. Something like this will:
> 
>       int iZipIdx = this.archive.getURI().lastIndexOf("/");
>       if (iZipIdx < 0) iZipIdx = 0;
>       return this.archive.getURI().substring(0,iZipIdx)+"/"+ this.filePath;

For your 2nd part I don't know if I can follow or agree. Let's assume 
there is a file "zip:file://test.zip!/test.xml" (as in the test case). 
In which way do you expect a relative path to be resolved? IMO 
zip:file://test.zip is the context and must not be left. So with a 
relative "test.dtd" it should be "zip:file://test.zip!/test.dtd". Even 
an absolute path like "/test.dtd" should stay in the zip archive 
context, while I'd assume a path with protocol to be resolved out of 
this context.
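
The resolution rules described above could be sketched roughly as follows. 
This is only an illustration of the expected behaviour, not code from Cocoon: 
the class ZipContextResolver and its resolve method are hypothetical, and the 
"has a protocol" check is a simple pattern match that ignores edge cases such 
as Windows drive letters.

```java
class ZipContextResolver {

    // Resolve a system identifier against a base URI of the form "<archive>!/<entry>".
    static String resolve(String baseZipUri, String systemId) {
        // A path with its own protocol is resolved outside the zip context.
        if (systemId.matches("^[a-zA-Z][a-zA-Z0-9+.-]*:.*")) {
            return systemId;
        }
        int bang = baseZipUri.indexOf("!/");
        if (bang < 0) {
            throw new IllegalArgumentException("not a zip URI: " + baseZipUri);
        }
        String archivePart = baseZipUri.substring(0, bang); // e.g. "zip:file://test.zip"
        String entryPath = baseZipUri.substring(bang + 2);  // e.g. "test.xml"

        // An absolute path stays in the archive and starts at the archive root.
        if (systemId.startsWith("/")) {
            return archivePart + "!" + systemId;
        }
        // A relative path is resolved against the directory of the current entry.
        int slash = entryPath.lastIndexOf('/');
        String dir = slash < 0 ? "" : entryPath.substring(0, slash + 1);
        return archivePart + "!/" + dir + systemId;
    }

    public static void main(String[] args) {
        String base = "zip:file://test.zip!/test.xml";
        System.out.println(resolve(base, "test.dtd"));
        System.out.println(resolve(base, "/test.dtd"));
        System.out.println(resolve(base, "http://example.org/test.dtd"));
    }
}
```

In this sketch both "test.dtd" and "/test.dtd" stay inside the archive, and 
only an identifier carrying its own protocol leaves it.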

Regards
Jörg

[1] https://issues.apache.org/jira/browse/COCOON-2022

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



