-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Simon (and everyone),

I read that the CLI problem has been fixed. It did affect me (and I'm
happy to see it fixed), but it is not the problem I am having right now,
which applies to both the CLI and the servlet.

One more thing I've found out in the meantime: Debugging the serialization
process shows that when the Serializer's startElement() method is invoked
for an HTML <a> element, the href attribute and its value are present.
However, the attribute does not show up in the result document. This
applies both when using the HTMLSerializer (for HTML 4.01) and the
XMLSerializer (for XHTML 1.0).
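
To narrow this down outside Cocoon, something like the following might
help. It is only a minimal sketch: it assumes the serializer ultimately
hands its SAX events to a JAXP TransformerHandler, and the class name
HrefCheck and the sample href value are made up for illustration. It feeds
a single <a> element with an href attribute into whatever
TransformerFactory is configured and returns the serialized text.

```java
import java.io.StringWriter;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical stand-alone check, not part of Cocoon.
class HrefCheck {
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    static String serialize() {
        try {
            // Uses whichever TransformerFactory JAXP selects (Xalan, Saxon, ...).
            SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            StringWriter out = new StringWriter();
            handler.setResult(new StreamResult(out));

            // The attribute is un-prefixed, so it is in no namespace.
            AttributesImpl attrs = new AttributesImpl();
            attrs.addAttribute("", "href", "href", "CDATA", "index.html");

            handler.startDocument();
            handler.startPrefixMapping("", XHTML);
            handler.startElement(XHTML, "a", "a", attrs);
            char[] text = "Home".toCharArray();
            handler.characters(text, 0, text.length);
            handler.endElement(XHTML, "a", "a");
            handler.endPrefixMapping("");
            handler.endDocument();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize());
    }
}
```

Running it once per implementation (selected through the
javax.xml.transform.TransformerFactory system property) should show
whether the href gets lost in the serializer itself or somewhere earlier
in the pipeline.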

In addition, when using the XMLSerializer, I am also getting unexpected
results for the output of <p> elements. They contain unnecessary xmlns
attributes. They do represent the correct namespace, but if I understand
the XML namespace spec correctly, the xmlns attribute should not be output
if the default namespace has already been defined somewhere up the tree and
not redefined since (which is the case -- the root <html> element contains
the default namespace declaration).
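
The redundant declarations can be counted with a small stand-alone test.
Again, this is just a sketch under assumptions: XmlnsCheck and
countXmlns() are hypothetical names, and the events are sent straight to a
JAXP TransformerHandler rather than through Cocoon. The default namespace
is declared once, on the root element, and the <p> element merely uses it.

```java
import java.io.StringWriter;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical stand-alone check, not part of Cocoon.
class XmlnsCheck {
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    static String serialize() {
        try {
            SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            StringWriter out = new StringWriter();
            handler.setResult(new StreamResult(out));

            AttributesImpl none = new AttributesImpl();
            handler.startDocument();
            // Declare the default namespace once, for the root element only.
            handler.startPrefixMapping("", XHTML);
            handler.startElement(XHTML, "html", "html", none);
            handler.startElement(XHTML, "p", "p", none);
            handler.endElement(XHTML, "p", "p");
            handler.endElement(XHTML, "html", "html");
            handler.endPrefixMapping("");
            handler.endDocument();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts default namespace declarations (xmlns="..."); prefixed
    // declarations such as xmlns:x="..." are deliberately not matched.
    static int countXmlns(String xml) {
        int count = 0;
        for (int i = xml.indexOf("xmlns="); i >= 0; i = xml.indexOf("xmlns=", i + 1)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String xml = serialize();
        System.out.println(xml);
        System.out.println("xmlns declarations: " + countXmlns(xml));
    }
}
```

If my reading of the spec is right, a count above one would mean the
serializer is repeating the declaration on the <p> element.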

This appears to be a Xalan-related issue. As I've found out, the problem
does not occur when using Saxon 6.5.3. IIUC the serialization-related
Cocoon code contains some workarounds for old Xalan bugs (in
org.apache.cocoon.serialization.AbstractTextSerializer). Could these
workarounds be causing some unwanted side effects?

Best regards,
Florian


On Saturday 20 September 2003 23:35, Simon Mieth wrote:
| Hi Florian,
|
| I have read your older message and the messages linked there. I'm not sure
| I understand all of your problem. There was a problem with the CLI,
| which is fixed in the current CVS version. The problem was that the last
| link on each page was not processed, so if you only have one link, it stops
| crawling at this point, and otherwise the last link and the pages that
| depend on it would not be crawled. This is fixed now. But the problem you
| have now should not depend on this fixed bug.
|
| Regards,
|
| Simon

- --
Florian G. Haas <[EMAIL PROTECTED]> http://members.ycn.com/~fgh/en/

GnuPG key ID: 0x46D00BE3
Key fingerprint: 18B4 3E7B 191E F534 254A  1F7C 816D 950B 46D0 0BE3

My GnuPG key is available from the public PGP key server at
pgp.mit.edu (and various other key servers).

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE/bbOGgW2VC0bQC+MRAvlVAJkBjF1NBPszWQvkS4zEkmRCyKjxnACg4cxS
QEZv+MpjqDFlq6j+aI8UI0w=
=x/YX
-----END PGP SIGNATURE-----
