- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: How can I avoid bombarding the servers?
You can do several things:
1. Ask webmasters to add a Crawl-delay rule to the DataparkSearch section of robots.txt (dpsearch complies with this directive).
2. Use the -p switch for indexer to specify a pause, in milliseconds, between consecutive page fetches by the same indexing thread.
3. By default, dpsearch uses a seed technique to merge pages in the url table. You can disable this with the -r switch for indexer. If you are using this switch, please stop using it.
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=02;topic_id=1157474787
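
For point 1, here is a rough sketch of what such a robots.txt section might look like and how a crawler reads it. It uses Python's standard urllib.robotparser only as an illustration, not dpsearch's own parser. The 5-second value and the helper name crawl_delay_for are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a webmaster might publish.
# The 5-second delay is an illustrative value, not a recommendation.
ROBOTS_TXT = """\
User-agent: DataparkSearch
Crawl-delay: 5
"""


def crawl_delay_for(agent, robots_txt=ROBOTS_TXT):
    """Return the Crawl-delay (in seconds) that applies to `agent`, or None."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


if __name__ == "__main__":
    print(crawl_delay_for("DataparkSearch"))
    print(crawl_delay_for("OtherBot"))
```

Only the DataparkSearch section carries the rule here, so other crawlers get no delay from this file.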
