Hi All,
I’m trying to reduce the time spent on incremental runs of the crawlers (HTTP, file system, file share) by supplying a list of changed files (created/modified and deleted). The challenge is how to feed the crawler such a list. There are good interfaces for this (the JSON API and the scripting language), but:

1) No deletion command is sent to the index for NOT-FOUND (i.e. deleted) entries from the modification list if the crawler hasn’t indexed those files before.

2a) Re-using one “incremental” job: the crawler would delete previously indexed documents if they no longer appear on the modification list.

2b) Re-creating the “incremental” job every time: the crawler would delete ALL previously indexed documents from the index when the job is deleted.

So at the moment I see no way to do incremental indexing based on a modification list without extending the functionality of the framework — or maybe I’ve missed something and there are features I’m not aware of?

Thanks!

--
rgds,
Konstantin
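
P.S. To make the idea concrete, here is a minimal sketch of the workaround I have in mind: split the modification list so that creations/modifications become a seed list for the crawler, while deletions are turned into delete-by-term commands sent directly to the index, bypassing the crawler entirely (which would sidestep problems 1 and 2 above). The entry shape, the action names, and the “url” field in the delete command are all my own assumptions, not the framework’s actual API:

```python
import json

def build_actions(mod_list):
    """Split a modification list into (a) paths to enqueue for the
    crawler and (b) delete-by-term commands to send straight to the
    index. Field/action names are assumptions for illustration only."""
    enqueue, delete = [], []
    for entry in mod_list:
        if entry["action"] in ("created", "modified"):
            # Changed files go to the crawler as an explicit seed list.
            enqueue.append(entry["path"])
        elif entry["action"] == "deleted":
            # Deleted files never reach the crawler; build a direct
            # index deletion command instead (hypothetical shape).
            delete.append({"delete": {"field": "url",
                                      "value": entry["path"]}})
    return enqueue, delete

mod_list = [
    {"path": "file:///share/a.txt", "action": "modified"},
    {"path": "file:///share/b.txt", "action": "deleted"},
]
enqueue, delete = build_actions(mod_list)
print(json.dumps({"enqueue": enqueue, "delete": delete}, indent=2))
```

The point is that the deletion side never depends on the crawler’s own job state, so neither 2a nor 2b applies to it — but of course this only works if the index exposes a deletion endpoint the script can call.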
