Matthias W. wrote:
Hi, I've got a text file with all the URLs to index; I don't want to crawl URLs before indexing.
Just having the URLs isn't the same as having an index; you would still need to fetch them. You can inject your URL list into a clean crawldb and fetch only those URLs with the inject, generate, and fetch commands. Then you can use the index command to index them.
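A minimal sketch of that sequence, assuming a Nutch 0.8/1.x-style command line; the crawl/ and urls/ paths are just placeholders (urls/ holds your text file), and exact arguments can vary by version:

  # Seed a fresh crawldb from the URL list; urls/ contains the text file.
  bin/nutch inject crawl/crawldb urls

  # One generate/fetch round fetches exactly the injected URLs,
  # so no outlinks get crawled.
  bin/nutch generate crawl/crawldb crawl/segments
  s=`ls -d crawl/segments/* | tail -1`
  bin/nutch fetch $s

  # Update the crawldb, build the linkdb, and index the segment.
  bin/nutch updatedb crawl/crawldb $s
  bin/nutch invertlinks crawl/linkdb $s
  bin/nutch index crawl/index crawl/crawldb crawl/linkdb $s

Because you stop after a single generate/fetch round, nothing beyond your injected list gets fetched.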
How to do this? Also, I'm creating an index in a temporary folder, and on success I want to overwrite the old index. How do I check in the shell script whether the crawl (index) command was successful?
You could check the size of the new index. You could also check it programmatically through Lucene, e.g. by opening it with an IndexReader and verifying the document count.
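For the shell side, here is a sketch assuming the bin/nutch wrapper propagates the JVM's exit status (worth verifying on your version); the NEW_INDEX and LIVE_INDEX paths are hypothetical, and $s is the fetched segment from the commands above:

  #!/bin/sh
  NEW_INDEX=/tmp/index.new     # index built in the temporary folder
  LIVE_INDEX=/data/index       # index your searcher reads

  bin/nutch index "$NEW_INDEX" crawl/crawldb crawl/linkdb $s
  # A nonzero exit status means the indexer failed.
  if [ $? -ne 0 ]; then
      echo "indexing failed, keeping old index" >&2
      exit 1
  fi

  # Crude size check: older Lucene index directories contain a
  # 'segments' file; a missing or empty one means the index is unusable.
  if [ ! -s "$NEW_INDEX/segments" ]; then
      echo "new index looks empty, keeping old index" >&2
      exit 1
  fi

  # Swap the new index into place, keeping the old one as a backup.
  rm -rf "$LIVE_INDEX.old"
  [ -d "$LIVE_INDEX" ] && mv "$LIVE_INDEX" "$LIVE_INDEX.old"
  mv "$NEW_INDEX" "$LIVE_INDEX"

Note that a running searcher usually keeps the old index files open, so it needs to reopen (or be restarted) after the swap to pick up the new index.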
Dennis
