Harvey Kane wrote:
> Hi Neil,
>
> Are you sure search engines are hitting those 410 pages? I have found
> that when you remove all links to the old site, the search engines have
> no way of visiting the old URLs anymore - meaning they never get to see
> your 301, and never get to see the 410.
>

Yeah, I'm seeing the requests in the Apache log. MSN at least is trying every couple of days, getting a 301 and then a 410. I would expect it to de-index. I've tried searching for any backlinks to these pages but haven't found any.

> My solution to that is to create a list of all the deleted pages that
> search engines have in their index (scrape the search engine index for
> this), and link to them somewhere - linking to your deleted pages in the
> XML sitemap is a good idea, or create a 'deleted pages' page on your
> site and discreetly link to it. Search engines follow your links, see
> the 301, see the 410, then remove the page from the index. That's the
> theory anyway.
>

That sounds peculiar... I don't think they're following a link at the moment. I think they're just being stubborn, possibly because those URLs could have been very popular search clickthroughs. I don't think I can be bothered munging up the sitemap just to point to a small handful of these URLs. Besides, they're on a different domain to the new site, and XML sitemaps don't do cross-domain?

> If you are saying that MSN is following your 301 and actually hitting
> the 410, then not removing the page from the index, that does seem odd.
>

Does indeed. Google isn't doing it, just MSN. Some of the pages are actually leading to a 404 rather than a 410 due to a bug in my pattern matching, and it's still retrying them too, so it's not just ignorance of the 410 code.

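For anyone wanting to repeat the log check described above, here is a minimal sketch in Python. It is not the method used in this thread. It assumes the Apache "combined" log format and assumes that MSN's crawler identifies itself with "msnbot" in its User-Agent string; the function name `tally_crawler_hits` is purely illustrative.

```python
import re
from collections import Counter

# Pattern for Apache's "combined" log format. This is an assumption:
# adjust it if the server is configured with a custom LogFormat.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tally_crawler_hits(lines, agent_substring="msnbot"):
    """Count (path, status) pairs for requests from one crawler.

    A request is attributed to the crawler when its User-Agent contains
    agent_substring (case-insensitive). "msnbot" is an assumed default.
    Lines that do not match the combined format are skipped.
    """
    hits = Counter()
    needle = agent_substring.lower()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # not a combined-format line
        if needle in match.group("agent").lower():
            key = (match.group("path"), int(match.group("status")))
            hits[key] += 1
    return hits
```

Passing an open log file (or any iterable of lines) to `tally_crawler_hits` gives a count per URL and status code, which makes it easy to see whether a crawler keeps coming back for URLs that already answer 301, 410 or 404.
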
I think the next thing I'll try is putting a special robots.txt file on the old domain and making the redirect ignore that file. That way, if they hit the old domain and ask for it, it'll tell them they're not allowed to index anything on that domain (and hopefully they'll therefore stop hitting the old URLs). I also considered just rewriting these old URLs to the homepage or something, so that at least they'll get some content and figure out it's not relevant anymore.

Neil

--
NZ PHP Users Group: http://groups.google.com/group/nzphpug
