Santiago Gala wrote:
>
> Raphael Luta wrote:
>
> >
> > This should be handled by the feed declaration: currently OCS only
> > defines attributes for specifying content refresh rate (and we can
> > use this information to induce a polling rate) but it should also
> > add an allowed hit rate, a la ICE specification.
> >
>
> The problem is that there is no direct relation between provider and channels.
> I.E., the OCS can be refreshed every XXX time, but the channels are not
> tagged, at least for some of the formats. Also, the problem with providers
> delivering several channels is that they will be updated, currently, in
> directory order, which will very possibly be alphabetical, so the same server
> will be hit in parallel. I'll look for a simple way to avoid this problem.
>
I'm not sure I understand what you're saying here: OCS defines the
following attributes to tag the different channels:

<ocs:updatePeriod>
    This optional element is used to describe the period over which the channel
    format is updated. Acceptable values are: Hourly, Daily,
    Weekly, Monthly, Yearly. If omitted, daily is assumed.

<ocs:updateFrequency>
    This optional element is used to describe the frequency of updates in relation
    to the update period. The value indicates how many times in
    that period the channel is updated. For example, an updatePeriod of
    daily, and an updateFrequency of 2 indicates the channel format is
    updated twice daily. If omitted a value of 1 is assumed for this
    element. Note, this element's value must be a positive integer greater than
    zero.

<ocs:updateBase>
    This optional element defines a base date which can be used to
    calculate the publishing schedule. The date format takes the form:
    yyyy-mm-ddThh:mm

If a server wants to recommend an access policy for its channels,
it *should* set these attributes.
Note: this information could be extended with ideas borrowed from
ICE: allowed time frame for refresh, etc.
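
To make the attribute semantics above concrete, here is a minimal sketch of how a polling interval might be derived from updatePeriod and updateFrequency. The class and method names (OcsRefresh, periodLength, pollingInterval) are hypothetical and not part of Jetspeed or OCS, and treating Monthly as 30 days and Yearly as 365 days is my own approximation, since the spec text quoted above does not define their exact length.

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper, not an existing Jetspeed class.
final class OcsRefresh {

    /** Length of one update period. Monthly/Yearly are approximations (assumption). */
    static Duration periodLength(String updatePeriod) {
        // OCS: if updatePeriod is omitted, daily is assumed.
        String p = (updatePeriod == null)
                ? "daily"
                : updatePeriod.trim().toLowerCase(Locale.ROOT);
        switch (p) {
            case "hourly":  return Duration.ofHours(1);
            case "daily":   return Duration.ofDays(1);
            case "weekly":  return Duration.ofDays(7);
            case "monthly": return Duration.ofDays(30);   // approximation
            case "yearly":  return Duration.ofDays(365);  // approximation
            default:
                throw new IllegalArgumentException("Unknown updatePeriod: " + updatePeriod);
        }
    }

    /** Interval between polls: one period divided by the number of updates in it. */
    static Duration pollingInterval(String updatePeriod, Integer updateFrequency) {
        // OCS: if updateFrequency is omitted, 1 is assumed; it must be greater than zero.
        int frequency = (updateFrequency == null) ? 1 : updateFrequency;
        if (frequency <= 0) {
            throw new IllegalArgumentException("updateFrequency must be > 0: " + frequency);
        }
        return periodLength(updatePeriod).dividedBy(frequency);
    }
}
```

With an updatePeriod of daily and an updateFrequency of 2 this yields a 12-hour interval, matching the "updated twice daily" example in the element description. updateBase is left out of the sketch; it would only shift where in the period the polls fall.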
Jetspeed should behave this way:
- first fetch the OCS feed from the URL configured in its properties
- read the updateFrequency for the OCS entry and set the update rate
for this feed to this value
- scan the OCS feed for defined channels and schedule their retrieval
using the update attributes when available or using a default
policy if not available
Please note that the document retrieval workflow is:
- fetch a document
- process a document based on its schema/mime-type
- store the document in the cache
- schedule a refresh
Such a workflow holds whether we're dealing with feeds or channels.
In the case of channels, the process step is currently empty but
could conceivably be non-empty (a tag filter, for example).
In the case of feeds, the process step is a transformation that
creates entries in the Registry.
So actually all we need to make the aggregation system work
in Jetspeed is to implement classes to enforce this workflow
and feed it an initial list of URLs (of whatever type you like)
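
As a rough illustration of the fetch / process / store / schedule workflow described above, here is a minimal sketch. Everything in it is hypothetical: RetrievalWorkflow, the Scheduler interface and the in-memory map standing in for the cache are names I made up for the example, not existing Jetspeed classes, and documents are reduced to plain strings.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical sketch of the four-step workflow; not an existing Jetspeed class.
final class RetrievalWorkflow {

    /** Hypothetical scheduling hook; a real one might wrap a timer or executor. */
    interface Scheduler {
        void scheduleRefresh(String url, Duration delay);
    }

    private final Function<String, String> fetcher;  // url -> raw document
    private final UnaryOperator<String> processor;   // identity for channels, transformation for feeds
    private final Scheduler scheduler;
    private final Duration refreshDelay;
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for the real cache

    RetrievalWorkflow(Function<String, String> fetcher,
                      UnaryOperator<String> processor,
                      Scheduler scheduler,
                      Duration refreshDelay) {
        this.fetcher = fetcher;
        this.processor = processor;
        this.scheduler = scheduler;
        this.refreshDelay = refreshDelay;
    }

    /** Runs the same four steps for any URL, whether it names a feed or a channel. */
    void retrieve(String url) {
        String raw = fetcher.apply(url);               // 1. fetch the document
        String processed = processor.apply(raw);       // 2. process it (may be a no-op)
        cache.put(url, processed);                     // 3. store it in the cache
        scheduler.scheduleRefresh(url, refreshDelay);  // 4. schedule a refresh
    }

    /** Returns the cached document for the URL, or null if it was never retrieved. */
    String cached(String url) {
        return cache.get(url);
    }
}
```

The only thing that differs between feeds and channels here is the processor passed in: UnaryOperator.identity() would model today's empty channel step, while a feed would pass whatever transformation creates the Registry entries.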
Mhh.. guess I've drifted away from the subject again... ;)
>
> We are considering having only one server using the cache, while the rest
> read it as a read-only resource. Also, we could consider having the cache
> refresh as an external process, sharing only the directory with the jetspeed
> machine.
>
Yes, I see Jetspeed operation the same way: the aggregation system
and the portal system should really be 2 different blocks runnable
on different systems and in different numbers (i.e. 2 portals and 1
cache; 3 specialized caches and 2 portals...). I think sharing
a directory is a good first step...
>
> I'll look at both approaches. Nothing final by now.
>
I already started work on a revamped aggregation system as described
above a few months ago, but it stalled because of a massive
real-work attack. :(
--
Raphaël Luta - [EMAIL PROTECTED]