Re: Meta data based link manager

Andreas Hartmann Sat, 05 Apr 2008 03:32:30 -0700

Andreas Hartmann wrote:

Michael Wechner wrote:
Andreas Hartmann wrote:
Hi Lenya devs,
currently, the only available LinkManager implementation is the ContentLinkManager. When you ask it to return all links that point to a particular document, it parses all (!) other documents in the same area and extracts the links based on the link XPaths of the resource type. As you can imagine, that can take a while, especially in large publications. I experienced this with the docu publication. If you want to deactivate a page, you can fetch a coffee in the meantime, and even drink it (at least if it's an espresso).
In a discussion on the Jackrabbit mailing list, Bertrand Delacretaz suggested to extract all links that are contained in a document before saving it,
I guess you mean rather just after it has been saved successfully, right?
Saved in the sense of committed to the file system. The extraction happens in Persistable.save(), which is called right before Transactionable.save() in Session.commit(), which writes the content from the in-memory session to the file system. This happens only after successful validation when you use the standard Lenya editors.

BTW, an implication of this approach is that the MetaDataLinkManager doesn't find any new links that were added in the current session. And it still finds old links that were removed in the current session. Do you think this is acceptable? Another option might be to use a persistent map which maps each document to all referencing documents.
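
To make the map alternative concrete, here is a minimal in-memory sketch of such a reverse map. It is only an illustration: the class name ReverseLinkMap, its method names and the use of plain strings as document identifiers are assumptions, not Lenya API, and persistence is left out entirely. Whether such a map would be updated inside the session or only on commit is left open.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the persistent map; not part of Lenya. */
final class ReverseLinkMap {

    // source document -> targets it currently links to
    private final Map<String, Set<String>> outgoing = new HashMap<>();
    // target document -> sources that currently link to it
    private final Map<String, Set<String>> incoming = new HashMap<>();

    /** Replaces the recorded links of one source document, e.g. when it is saved. */
    void update(String source, Set<String> newTargets) {
        // Remove the source from every target it linked to before.
        Set<String> oldTargets = outgoing.getOrDefault(source, Collections.emptySet());
        for (String target : oldTargets) {
            Set<String> sources = incoming.get(target);
            if (sources != null) {
                sources.remove(source);
                if (sources.isEmpty()) {
                    incoming.remove(target);
                }
            }
        }
        // Register the source under every target it links to now.
        for (String target : newTargets) {
            incoming.computeIfAbsent(target, k -> new HashSet<>()).add(source);
        }
        outgoing.put(source, new HashSet<>(newTargets));
    }

    /** Returns all documents that reference the given target (read-only view). */
    Set<String> referencingDocuments(String target) {
        return Collections.unmodifiableSet(
                incoming.getOrDefault(target, Collections.emptySet()));
    }
}
```

With this layout, looking up the referencing documents is a single map access, at the price of keeping two maps consistent on every save.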


-- Andreas

-- Andreas
and store them in the meta data. Now, since all Lenya meta data are indexed, this link list can be used for a Lucene search. The query looks like this (special characters have to be escaped):
\{http\://apache.org/lenya/metadata/link/1.0\}outgoingLinks:lenya\-document\:1aca68c0\-0243\-11dd\-881a\-f3cc793eb58e\*
The name of the meta data field is

  {http://apache.org/lenya/metadata/link/1.0}outgoingLinks

The term value is

  lenya-document:1aca68c0-0243-11dd-881a-f3cc793eb58e*
Note the wildcard at the end of the term value. It includes URLs with an attached language or publication parameter. The link manager uses some post-search checks to verify that only the actually linked documents are listed (conforming to the declared LinkResolver implementation).
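
The escaping in the query above can be sketched as follows. This is a hand-rolled illustration, not the code from the MetaDataLinkManager: the class LinkQuerySketch and its methods are hypothetical, and the set of escaped characters is an assumption chosen so that the result matches the quoted query. Lucene's own QueryParser.escape(String) does the same kind of job, though the exact character set may differ between Lucene versions. The sketch reproduces the string exactly as quoted, including the escaped trailing asterisk; how the search API treats that asterisk is not covered here.

```java
final class LinkQuerySketch {

    // Assumed set of query-syntax characters that need a backslash prefix.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?";

    /** Prefixes every special character in the input with a backslash. */
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    /** Joins the escaped field name and the escaped term value with ':'. */
    static String buildQuery(String field, String term) {
        return escape(field) + ":" + escape(term);
    }

    public static void main(String[] args) {
        String field = "{http://apache.org/lenya/metadata/link/1.0}outgoingLinks";
        String term = "lenya-document:1aca68c0-0243-11dd-881a-f3cc793eb58e*";
        // Prints the escaped query string built from the two values above.
        System.out.println(buildQuery(field, term));
    }
}
```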
With the MetaDataLinkManager, the deactivate screen appears virtually immediately. I guess this scales nicely with a large number of documents (as well as Lucene scales). If you have an index spanning multiple publications, you can even detect links from other publications.
I have the MetaDataLinkManager in my local sandbox. It depends on the search API which I have posted on the user list. If you want to take a closer look at the classes, I can upload a ZIP somewhere, or maybe I can extract a patch.
that would be great
Replacing the ContentLinkManager with the MetaDataLinkManager would require "touching" all documents so that the links are extracted (they are indexed automatically when the session is committed).
Is anybody interested in this feature?
sure, I think it would be a great improvement

Cheers

Michael
-- Andreas



--
Andreas Hartmann, CTO
BeCompany GmbH
http://www.becompany.ch
Tel.: +41 (0) 43 818 57 01

