On Wed, Jun 23, 2010 at 5:27 PM, Dennis Kubes <[email protected]> wrote:
> You may still see some urls that *seem* to be outside of your domains list
> while using the domain urlfilter. Remember the following:
>
>   1. Urls are checked in order of domain suffix, domain name, and
>      hostname. If you have .com and something.net, urls in
>      something.com will also get picked up.
>   2. This doesn't handle redirects, it only handles generated urls. If
>      your domain urls file has something.com and the original url is
>      http://something.com/something.html but redirects to
>      http://ww2.something.net/redirect/login.html for example, the url
>      will still get crawled and saved.
>
> For verification grep through the logs to be sure. Be aware of the
> redirects if you see a few urls that don't match your patterns. If you see
> a lot that don't match then something isn't working.
>
> Dennis
>

Thanks Dennis, that makes sense. The domain filter seems to be working and is all I need for now.

-Max
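
For anyone following along, the check order from point 1 of the quoted message can be illustrated with a minimal Python sketch. This is not Nutch's code. The names candidate_keys, load_entries and is_allowed are hypothetical, the "domain name" is approximated as the last two labels of the host, and a leading dot on an entry such as .com is simply stripped.

```python
from urllib.parse import urlparse


def load_entries(lines):
    """Normalize domain-file entries (assumption: one entry per line)."""
    entries = set()
    for line in lines:
        entry = line.strip().lstrip(".").lower()
        if entry and not entry.startswith("#"):
            entries.add(entry)
    return entries


def candidate_keys(url):
    """Return the keys to test, in order: suffix, domain name, hostname."""
    host = (urlparse(url).hostname or "").lower()
    if not host:
        return []
    labels = host.split(".")
    suffix = labels[-1]             # simplification: last label only
    domain = ".".join(labels[-2:])  # simplification: last two labels
    return [suffix, domain, host]


def is_allowed(url, entries):
    """A url passes if any of its keys appears in the entries."""
    return any(key in entries for key in candidate_keys(url))


# The example from the message: .com plus something.net
entries = load_entries([".com", "something.net"])
print(is_allowed("http://something.com/something.html", entries))
print(is_allowed("http://ww2.something.net/redirect/login.html", entries))
```

With only something.com in the list, the same check rejects http://ww2.something.net/redirect/login.html. That is the kind of url that stands out when grepping the logs: as point 2 says, it can still be crawled and saved because it was reached through a redirect.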

