Bob Harner wrote:
As briefly discussed on the user list recently (subject: "Losing
hyperlinks - what xsl removes them?"), the LinkRewritingTransformer
seems to need some improvements so that it can rewrite all types of
links.  It currently only rewrites <a href="foo"> where foo is a
document-relative URI.  I'm sure I'm NOT the best person to do so
(being much less familiar with 1.4 than 1.2.x), but I've been looking
over the code and humbly offer the following initial thoughts.  Your
advice and guidance are eagerly sought...

1) <editorial>We have really overloaded the word "resource" in Lenya &
Cocoon, haven't we?  Sometimes it means "an asset or a CMS document"
(per http://wiki.apache.org/lenya/ProposalArchitecture), or sometimes
it is specifically just an asset (per Resource.java).  The word is also
used in sitemap files to refer to a reusable part of a pipeline.
Elsewhere it refers vaguely to a "miscellaneous related file" (the
lenya/resources dir).  Sometimes it means the amount of memory, hard
drive space, and CPU cycles available.  And Document Types are now
officially Resource Types.

Actually, in the repo API I called them "Document Types" again. I'm
still not sure if the term "Resource" or "Document" is appropriate for
a "content item". Or maybe "content item" is really superior.

The terms "content type" and "document type" are already taken. But IMO
we should just use the same term as for content items, regardless of
any existing usage.

How about this hierarchy:

- Publication
  - Area
    - Content
      - ContentNode (belongs to a ContentType)
        - ContentItem (a language version)
          - (Content)Version (of the version history)
    - Structures (more general than Sites)
      - Structure
        - StructureNode (references ContentNode or ContentItem)


This overloading of terminology makes it harder to learn Lenya. I think
"Content", "Content Item", and "Content Type" are probably much better
terms for a CMS to use. Precise and unambiguous terminology is always a
good thing.</editorial>

2) As Andreas said a couple weeks ago, "It's about time to handle
documents and assets in the same way".  I think there is a need for a
common interface shared by both CMS documents and assets, so both can
be handled uniformly -- particularly for link rewriting, where the
URIs of both CMS documents and assets need to be rewritten in the
same way.  This would be, perhaps, "ContentItem".  And both Document
and Resource (which maybe should be named Asset?) should implement
this interface and DefaultDocument and Resource should extend a
DefaultContentItem class.  Or is there a better idea?
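
To make the proposal in point 2 concrete, here is a minimal Java sketch
of a shared interface. Only the names ContentItem and
DefaultContentItem come from the message above; SketchDocument,
SketchAsset, UniformLinkRewriter and the prefix-swapping rule are
hypothetical stand-ins and do not reflect the existing Lenya classes.

```java
// Hypothetical common interface for anything whose URI may need rewriting.
interface ContentItem {
    String getUri();
}

// Hypothetical shared base class, as suggested in the message.
abstract class DefaultContentItem implements ContentItem {
    private final String uri;

    protected DefaultContentItem(String uri) {
        this.uri = uri;
    }

    @Override
    public String getUri() {
        return uri;
    }
}

// Stand-in for a CMS document (not Lenya's DefaultDocument).
final class SketchDocument extends DefaultContentItem {
    SketchDocument(String uri) {
        super(uri);
    }
}

// Stand-in for an asset (not Lenya's Resource).
final class SketchAsset extends DefaultContentItem {
    SketchAsset(String uri) {
        super(uri);
    }
}

// Illustrative rewriter: it only sees ContentItem, so documents and
// assets go through exactly the same code path.
final class UniformLinkRewriter {
    static String rewrite(ContentItem item, String fromPrefix, String toPrefix) {
        String uri = item.getUri();
        if (uri.startsWith(fromPrefix)) {
            return toPrefix + uri.substring(fromPrefix.length());
        }
        return uri; // URIs outside the prefix are left untouched
    }
}
```

The point of the sketch is only that the rewriter never needs to ask
whether it is holding a document or an asset.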

I'm not even sure if we need the separation between Documents and
Assets. Maybe there is a way to handle both of them uniformly.
I'd rather add specific functionality:

- Can the content item input/output XML?
- How is the content item rendered when it is referenced by another
  content item?
- What are the presentation options?
- ...

IMO additional, asset-specific functionality could be handled by an
asset-management module or something like this, not by the core API.


3) I think maybe the link rewriting should be done when a CMS document
is published, deactivated, or exported, rather than every time it is
displayed.

The problem is that the document has to be updated when *another*
document is changed/removed. This means when you deactivate a document,
you have to remove the links from all documents which are referencing
this document. I agree that this would be a good thing, but with the
current architecture it is a very time-consuming operation.


This change would be a performance boost for every page. Or am I
missing something in why it needs to be done at display time?
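
To illustrate the bookkeeping that publish-time rewriting implies: when
a document is deactivated, every document linking to it has to be found
and rewritten. The sketch below assumes a hypothetical reverse index
(BacklinkIndex, keyed by made-up document ids); nothing like it is
claimed to exist in Lenya, and building and maintaining such an index
is exactly the cost in question.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not an existing Lenya class.
final class BacklinkIndex {
    // target document id -> ids of the documents that link to it
    private final Map<String, Set<String>> referrers = new HashMap<>();

    // Record that sourceId contains a link to targetId.
    void addLink(String sourceId, String targetId) {
        referrers.computeIfAbsent(targetId, k -> new HashSet<>()).add(sourceId);
    }

    // Documents that would have to be rewritten if targetId were
    // deactivated. Returns an empty set when nothing links to it.
    Set<String> referrersOf(String targetId) {
        Set<String> found = referrers.get(targetId);
        if (found == null) {
            return Collections.emptySet();
        }
        return Collections.unmodifiableSet(found);
    }
}
```

Without an index of this kind, finding the referrers means scanning
every document in the area, which is why the operation is expensive
with the current architecture.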

4) LinkRewritingTransformer relies heavily on the
DefaultDocumentBuilder class, whose isDocument() method simplistically
returns true for any URLs starting like "/lenya/mypub/authoring/"
even if the URL points to an asset, not a CMS document.  In contrast,
note that the sitemaps verify that the URL ends in ".html" before
assuming that a URL is really a CMS document.  Should
DefaultDocumentBuilder's isDocument() method be changed to look for
the ".html" ending?  (But do CMS documents *always* have an ".html"
ending?)

No, we can't do this. This is another reason why I think that the
DocumentBuilder concept is doomed (see the thread "Mapping URLs to
documents"). This is quite a complex and fundamental issue, and IMO we
have to come to a decision here first.
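
For reference, the two checks discussed in point 4 can be contrasted in
a few lines of Java. This is a sketch only: UrlChecks and its method
names are hypothetical, and the code is not taken from
DefaultDocumentBuilder or from the sitemaps.

```java
// Hypothetical helper contrasting the two checks; not Lenya code.
final class UrlChecks {
    // Prefix-only test, as the message describes isDocument():
    // any URL under the area prefix counts as a document.
    static boolean matchesByPrefix(String url, String areaPrefix) {
        return url.startsWith(areaPrefix);
    }

    // Prefix plus ".html" suffix, mirroring what the sitemaps are
    // said to verify before treating a URL as a CMS document.
    static boolean matchesByPrefixAndSuffix(String url, String areaPrefix) {
        return url.startsWith(areaPrefix) && url.endsWith(".html");
    }
}
```

With the prefix "/lenya/mypub/authoring/", an asset URL such as
"/lenya/mypub/authoring/index/logo.png" passes the first check but not
the second. The second check would, however, also reject any document
whose URL does not end in ".html", which is why it cannot simply be
adopted.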

Thanks for bringing this up!

-- Andreas

