Hello Daan,

> Situation:
> We have some xml files that contain metadata we want to be
> able to query the repository for. So, we set up extractors
> that periodically (every 5 seconds) extract these properties
> and set them on the file as custom webdav properties.

I do not understand this. Do you mean that in dasl-indexer.xml you
configure <cron>0/5 * * * * ? *</cron>, or do you mean something else?
If you mean the cron job in dasl-indexer.xml, then every 5 seconds a
batch index run takes place whenever there are documents in the queue,
i.e. when documents have been added, deleted or modified. If you mean
something else by the 5 seconds, I would really like some extra info,
because I am in the dark.

> Now whenever we update any of these metadata files, the
> repository notices the files have changed and tries to
> re-index them, so far so good.
> However, when we try to query the repo for any of these
> custom properties whilst the extractor is updating the index,
> we don't get back the result set we expected.

Do you mean that, for example, a webdav property changes from 1 to 2,
you then query for the value '1', and you still get the document even
though the property has already changed to '2'? This can happen: when
you query the repository directly after the change, before the indexing
has run, the index is not yet updated, so you get an 'old' result.

I am not sure what kind of app you are developing (you might explain a
little), but basically, when an app modifies data and searches the
repository at the same time, it can appear to get stale data.

In our applications, a standalone CMS connects to the repository using
mainly propfind/proppatch/put etc., and standalone frontend sites
connect to the same repository. The searching is mainly done from the
sites, which cache results heavily. This cache is event based: sites
serve cached responses, which are evicted when the repository sends a
JMS event that something changed. This JMS event is always sent *after*
the indexing has taken place. But you might be developing without the
CMS, with your own custom app reading, searching and writing to the
repository, right?
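
To make the timing concrete, here is a toy sketch in Python of a
repository whose searches read an index that is only updated by a
periodic batch run. It is not the Slide or Hippo code: the class name
ToyRepository, its methods and the path /docs/a.xml are all hypothetical
and only model the behaviour described above.

```python
# Toy model only: every name here is hypothetical, not a Slide or Hippo API.
class ToyRepository:
    def __init__(self):
        self.properties = {}  # document path -> current property value
        self.index = {}       # document path -> value as last indexed
        self.queue = []       # paths waiting for the next batch index run

    def set_property(self, path, value):
        """Write a property and queue the document for indexing."""
        self.properties[path] = value
        if path not in self.queue:
            self.queue.append(path)

    def run_batch_index(self):
        """Stand-in for the cron-triggered job: index everything queued."""
        for path in self.queue:
            self.index[path] = self.properties[path]
        self.queue.clear()

    def search(self, value):
        """Searches read the index, not the live properties."""
        return sorted(p for p, v in self.index.items() if v == value)


repo = ToyRepository()
repo.set_property("/docs/a.xml", "1")
repo.run_batch_index()

repo.set_property("/docs/a.xml", "2")  # property changes from 1 to 2
stale_hits = repo.search("1")          # queried before the next index run
early_hits = repo.search("2")

repo.run_batch_index()                 # the next cron tick
fresh_hits = repo.search("2")
old_hits = repo.search("1")
```

In this model, a search issued between the write and the next batch run
still matches the old value and misses the new one. After the batch run,
the results agree with the stored property again.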

> Is there a configuration setting that fixes this issue, or an
> easy workaround?

For the Slide repository, we did not implement 'blocking search while
there are documents in the index queue', which I suppose is what you
are actually looking for (I am not sure, though). The new repository we
are developing, with Jackrabbit as its base, has this behavior built in
by default. I am not sure how urgent your issue is. You might reduce the
cron interval from 5 sec to 1 sec, though this still might be too slow
for you. I hope you also realize that we never ran into this issue
ourselves.

Please let me know whether I am correct.

Regards,
Ard

> Daan Hoogenboezem
>
> ********************************************
> Hippocms-dev: Hippo CMS development public mailinglist
