Hello, I have almost finished a new DIH EntityProcessor, which I am calling the ManifestEntityProcessor. It is designed around the idea that whatever daemon maintains your set of a few hundred thousand XML documents is likely to drop a report or log file describing what has changed within your content store. This assumes a file-based content repository.
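To make the idea concrete, here is a minimal Java sketch of turning manifest lines into a list of files to index. It is not the actual processor code. The class and method names (ManifestSketch, filesToAdd) are hypothetical, and it assumes one reading of the matching rules: the first capture group of the add regex holds the file name, an allow regex filters those names, and relative names are resolved against a base directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of DIH or of the real processor.
public class ManifestSketch {

    // Returns the paths named by manifest lines that match addRegex,
    // keeping only names accepted by allowRegex.
    static List<Path> filesToAdd(List<String> manifestLines,
                                 String addRegex,
                                 String allowRegex,
                                 String baseDir) {
        Pattern add = Pattern.compile(addRegex);
        Pattern allow = Pattern.compile(allowRegex);
        List<Path> result = new ArrayList<>();
        for (String line : manifestLines) {
            Matcher m = add.matcher(line);
            // Assumption: the add regex has at least one capture group
            // and group 1 holds the file or path name.
            if (!m.find() || m.groupCount() < 1 || m.group(1) == null) {
                continue;
            }
            String name = m.group(1).trim();
            if (name.isEmpty() || !allow.matcher(name).matches()) {
                continue;
            }
            Path p = Paths.get(name);
            // Relative names are taken to be relative to baseDir.
            if (!p.isAbsolute()) {
                p = Paths.get(baseDir).resolve(p);
            }
            result.add(p.normalize());
        }
        return result;
    }
}
```

With an add regex of (.*)$ and an allow regex of ^.*\.xml$, a manifest line such as a/b.xml would be resolved under the base directory, while a line naming a .txt file would be skipped.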
The ManifestEntityProcessor is used as follows:

    <entity name="jc"
            processor="ManifestEntityProcessor"
            baseDir="/Volumes/Techmore/ts/aaa/schema/data"
            rootEntity="false"
            dataSource="null"
            allowRegex="^.*\.xml$"
            manifestFileName="/Volumes/ts/man-find.txt"
            manifestAddRegex="(.*)$"
            >

The idea is that you have a log file or other report, perhaps from tar or zip, and you wish to use it to control the indexing of the new content. The new entity fields are as follows.

manifestFileName is the name of the manifest file. If this value is relative, it is assumed to be relative to baseDir. Required.

manifestAddRegex is a required regex identifying lines which, when matched, should cause docs to be added to the index.

manifestDelRegex is an optional regex identifying documents which, when matched, should be deleted from the index. **PLANNED**

allowRegex is a required regex identifying the portion of the ADD/DELete line found above which contains the file or pathname to be ADDed or DELeted. If the resulting value is relative, it is assumed to be relative to baseDir.

What do I do next? Raise a JIRA issue and add the code? Is DIH the right place to add this? Suggestions for a different name? Suggestions on how to do the delete bit from within an entity?

Regards Fergus.

-- 
===============================================================
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021

Unix/Mac/Intranets             Analyst Programmer
===============================================================