On 1/13/06, Bob Harner <[EMAIL PROTECTED]> wrote:
> On 1/13/06, Andreas Hartmann <[EMAIL PROTECTED]> wrote:
> > Josias Thoeny schrieb:
> > > On Thu, 2006-01-12 at 11:11 -0500, Bob Harner wrote:
> > >> The behavior seems to be correct for links to documents, but in my
> > >> case the link is actually to an asset (a JPEG file).  So even though
> > >> the file really exists, it isn't actually a "document".  I believe
> > >> LinkRewritingTransformer should either ignore (leave in place) such
> > >> absolute links to assets or correctly check for their existence.  What
> > >> do you think?
> > >
> > > The easiest way is probably to extend the DefaultDocumentBuilder so
> > > that it recognizes only URLs ending in .html as document URLs. This
> > > way the LinkRewritingTransformer should leave all other URLs (e.g.
> > > with a .jpg extension) in place.
> >
> >  > However, if you plan to use the proxy mechanism and you want the asset
> >  > urls to be rewritten using a proxy url prefix, you have to do it the
> >  > other way (recognize the assets as documents). I'm not sure if this
> >  > can be done easily, though.
> >
> > It's about time to handle documents and assets in the same way ... :(
> >
> > -- Andreas
>
> We don't use the proxy mechanism, but we do export the pages to static
> files and serve them up through a separate web server.  We already
> extended StaticHtmlExporter to translate the URLs of such links (to
> both assets and documents) in the document.
>
> I think I will extend (subclass) DefaultDocumentBuilder.java and see
> how that goes.
>

I'm back on this finally and am finding it more difficult than
expected.  There are other problems with LinkRewritingTransformer in
1.2.4 & 1.2.x (and apparently 1.4 too).  It seems far too limited in
what it tries to do:

1) It only rewrites <a href="foo"> tags.  But URLs needing rewriting
can appear in several other tags as well:  <img src="foo">,
<script src="foo">, <object data="foo">, <meta http-equiv="refresh"
content="2;url=foo">, <link href="foo">, <embed src="foo">, <form
action="foo"> and probably others.  (And IIUC XHTML 2.0 (future) will
allow an href attribute on *any* element.)  Rewriting such links is an
assumed feature of any mature CMS, IMHO.  I don't think it is really
reasonable to prohibit URLs starting with "/".  In fact, I'd go
further and say that Lenya should even rewrite links whose URLs
contain the same host name as the page, to make them relative.

2) It relies heavily on the DefaultDocumentBuilder class, whose
isDocument() method simplistically returns true for any URL starting
with something like "/lenya/mypub/authoring/", even if the URL points
to an asset, not a CMS document.  In contrast, note that the sitemaps
verify that the URL ends in ".html" before assuming that it really is
a CMS document.
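
To make both points concrete, here is a rough sketch of what I have in
mind.  It is not Lenya code: the class name LinkRules and both methods
are made up for illustration, and I am assuming the area prefix is
passed in as a plain string.  It only shows (1) a lookup of which
attribute carries the URL for each of the tags listed above, and (2) a
document check that also requires the ".html" suffix, the way the
sitemaps do.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Lenya. */
final class LinkRules {

    // Element name -> attribute that carries a URL, for the tags listed above.
    // <meta http-equiv="refresh"> is left out on purpose: its URL is embedded
    // inside the "content" value and would need separate parsing.
    private static final Map<String, String> URL_ATTRIBUTES = new LinkedHashMap<>();
    static {
        URL_ATTRIBUTES.put("a", "href");
        URL_ATTRIBUTES.put("link", "href");
        URL_ATTRIBUTES.put("img", "src");
        URL_ATTRIBUTES.put("script", "src");
        URL_ATTRIBUTES.put("embed", "src");
        URL_ATTRIBUTES.put("object", "data");
        URL_ATTRIBUTES.put("form", "action");
    }

    private LinkRules() {
    }

    /** Returns the URL-bearing attribute for an element, or null if none is known. */
    static String urlAttributeFor(String elementName) {
        return URL_ATTRIBUTES.get(elementName.toLowerCase(Locale.ROOT));
    }

    /**
     * Returns true only if the URL lies under the given area prefix AND its
     * path (ignoring any query string or fragment) ends in ".html".
     */
    static boolean isDocumentUrl(String url, String areaPrefix) {
        if (url == null || !url.startsWith(areaPrefix)) {
            return false;
        }
        int end = url.length();
        int query = url.indexOf('?');
        int fragment = url.indexOf('#');
        if (query >= 0) {
            end = Math.min(end, query);
        }
        if (fragment >= 0) {
            end = Math.min(end, fragment);
        }
        return url.substring(0, end).endsWith(".html");
    }
}
```

With something like this, a transformer could look up the attribute for
each element it sees, and a URL such as
"/lenya/mypub/authoring/images/photo.jpg" would no longer be mistaken
for a document.  Whether asset URLs should then be left alone or
rewritten by some other rule is a separate decision.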

Are links handled any better in 1.4?

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
