Hello Daan,

> Situation:
> We have some xml files that contain metadata we want to be 
> able to query the repository for. So, we set up extractors 
> that periodically (every 5
> seconds) extract these properties and set them on the file as 
> custom webdav properties.

I do not quite understand this. Do you mean that in dasl-indexer.xml you
configure <cron>0/5 * * * * ? *</cron>, or do you mean something else?
If you mean the cron job in dasl-indexer.xml: it means that every 5
seconds a batch index runs when there are documents in the queue, i.e.
when documents have been added, deleted or modified. If you mean
something else by the 5 seconds, I would really like some extra info,
because I am in the dark.

> 
> Now whenever we update any of these metadata files, the 
> repository notices the files have changed and tries to re 
> index them, so far so good.
> However, when we try to query the repo for any of these 
> custom properties whilst the extractor is updating the index, 
> we don't get back the result set we expected.

Do you mean that, for example, a webdav property changes from 1 to 2,
you query for the value '1', and you still get the document even though
the property has already changed to '2'? That can happen: when you query
the repository directly after the change, before the indexer has run,
the index is not yet updated, so you get an 'old' result. I am not sure
what kind of app you are developing (could you explain a little?), but
basically it means that an app that modifies data and searches the
repository at the same time can appear to get stale data.

Basically, our applications are a standalone cms that connects to the
repository, mainly using propfinds/patch/put etc., and standalone
frontend sites that connect to the repository. The searching is mainly
done from the sites, which cache results heavily. This cache is event
based: sites serve cached responses, which may be evicted when the
repository sends an event (jms) that something changed. This jms event
is always sent *after* the indexing took place.
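
To make the caching idea concrete, here is a minimal sketch of such an
event-based cache. It is only an illustration, not the actual site code:
the class name, the method names and the search function are all
hypothetical, and the real eviction is triggered by a jms listener rather
than by a direct method call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical site-side cache; an illustration, not the actual implementation. */
class EventEvictingCache {
    // Cached search responses, keyed by the query string.
    private final Map<String, String> results = new ConcurrentHashMap<>();
    // Stand-in for the real (expensive) repository search.
    private final Function<String, String> search;

    EventEvictingCache(Function<String, String> search) {
        this.search = search;
    }

    /** Serves the cached response; searches the repository only on a miss. */
    String get(String query) {
        return results.computeIfAbsent(query, search);
    }

    /**
     * Assumed to be called when the repository reports a change, for
     * example from a jms listener. Because that event arrives after
     * indexing, the next search runs against an up-to-date index.
     */
    void onRepositoryChanged() {
        results.clear();
    }
}
```

Clearing the whole cache on every event is the simplest choice for a
sketch; a real site could evict only the entries affected by the change.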

But you might be developing without the cms and have your own custom
app that reads, searches and writes to the repository, right?

> 
> Is there a configuration setting that fixes this issue, or an 
> easy workaround?

For the slide repository, we did not implement a 'block searches while
there are documents in the index queue' feature, which I suppose is what
you are actually looking for (I am not sure though).

The new repository we are developing, with Jackrabbit as its base, has
this behavior built in by default. I am not sure how urgent your issue
is. You might reduce the cron interval from 5 sec to 1 sec, though this
might still be too slow for you. I hope you also realize we never had
this issue ourselves.
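
If a shorter interval is not enough, a client-side stopgap could be to
repeat the search a few times until the expected value shows up. Below
is a minimal sketch under that assumption. The repository offers no such
call; the helper name and its parameters are made up for illustration,
and the search itself is passed in as a plain function.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical client-side helper; not part of the repository API. */
final class SearchRetry {
    private SearchRetry() {}

    /**
     * Runs the search up to maxAttempts times, pausing delayMillis
     * between attempts, and stops as soon as the result looks current.
     * Returns the last result, which may still be stale.
     */
    static <T> T searchUntil(Supplier<T> search, Predicate<T> isExpected,
                             int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        T result = search.get();
        for (int attempt = 1; attempt < maxAttempts && !isExpected.test(result); attempt++) {
            try {
                Thread.sleep(delayMillis); // give the batch indexer time to run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                break;
            }
            result = search.get();
        }
        return result;
    }
}
```

With the indexer running every 5 seconds, the attempts and delay would
have to cover at least that window. Note that this only hides the delay
from the caller; it does not make the index update any faster.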

Please let me know whether I am correct,

Regards Ard

> 
> Daan Hoogenboezem
> 
> ********************************************
> Hippocms-dev: Hippo CMS development public mailinglist
> 