Hello,

We have a small proof-of-concept project underway in which our 
departmental search team will be crawling our records via one or another 
of the SuperCat formats. These will be indexed in our department's Autonomy 
search engine and reported on for feasibility, issues, etc., with regard to 
integration with our so-called "Knowledge search gateway". The intent is for 
the library catalogue to be another "source" that can be searched 
alongside our Intranet (web), internal blogs, wikis, etc.

I'll be suggesting they can crawl basically anytime as long as they are 
reasonably throttled if they work during business hours; however, since this 
will be new for us, I'd also like to schedule a crawl under "dragster" type 
conditions and obtain some metrics from the various crawl attempts 
(e.g., at what crawl rate does our server start to sweat?).
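For the "dragster" runs, something like the following rough sketch is what I have in mind for capturing server load alongside a crawl attempt — just an illustrative Python snippet (it assumes a Unix-like EG server where `os.getloadavg()` is available; the duration and interval values are placeholders):

```python
import os
import time

def sample_load(duration_s=600, interval_s=5):
    """Record the 1-minute load average at fixed intervals.

    Run this on the server while a crawl attempt is in progress, then
    correlate the timestamped samples with the crawler's request rate
    afterward. Illustrative only; os.getloadavg() is available on
    Linux and other Unix-likes.
    """
    samples = []
    end = time.time() + duration_s
    while time.time() < end:
        one_min, _, _ = os.getloadavg()  # 1-minute load average
        samples.append((time.time(), one_min))
        time.sleep(interval_s)
    return samples
```

Repeating that across crawls at different throttle settings should show roughly where load starts climbing.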

My question: I'll ask our search guy to keep some stats, but I also wonder 
what the best or recommended ways are to monitor performance metrics under 
crawl conditions from the EG server side of things. We can get some start/end 
times from the logs and draw some conclusions that way, but any other advice 
would be helpful too.
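For the log angle, beyond start/end times I was picturing something like the quick sketch below: bucketing an Apache-style access log by minute to see request rates over the course of a crawl. The SuperCat-style URLs in the sample lines are just made-up illustrations, not actual paths from our server:

```python
import re
from collections import Counter

# Capture the minute portion of the timestamp field in a
# common/combined-format access log line, e.g.
#   10.0.0.1 - - [12/Mar/2009:14:01:32 -0400] "GET /... HTTP/1.1" 200 512
TS_RE = re.compile(r'\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2}')

def requests_per_minute(log_lines):
    """Count requests in each one-minute bucket of an access log."""
    buckets = Counter()
    for line in log_lines:
        m = TS_RE.search(line)
        if m:
            buckets[m.group(1)] += 1
    return buckets

# Hypothetical sample lines for illustration only.
sample = [
    '10.0.0.1 - - [12/Mar/2009:14:01:32 -0400] "GET /opac/extras/supercat/retrieve/marcxml/record/1 HTTP/1.1" 200 512',
    '10.0.0.1 - - [12/Mar/2009:14:01:45 -0400] "GET /opac/extras/supercat/retrieve/marcxml/record/2 HTTP/1.1" 200 498',
    '10.0.0.1 - - [12/Mar/2009:14:02:03 -0400] "GET /opac/extras/supercat/retrieve/marcxml/record/3 HTTP/1.1" 200 530',
]
print(requests_per_minute(sample))
# Counter({'12/Mar/2009:14:01': 2, '12/Mar/2009:14:02': 1})
```

Pairing those per-minute counts with load samples from the same window would give a first cut at "requests per minute vs. server strain."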

Thanks,

George Duimovich
NRCan Library / Bibliothèque de RNCan
