Harvey Kane wrote:
> Hi Neil,
>
> Are you sure search engines are hitting those 410 pages? I have found 
> that when you remove all links to the old site, the search engines have 
> no way of visiting the old URLs anymore - meaning they never get to see 
> your 301, and never get to see the 410.
>   
Yeah, I'm seeing the requests in the Apache log. MSN at least is trying
every couple of days, getting a 301 and then a 410. I would expect it to
de-index them. I've tried searching for any backlinks to these pages but
haven't found any.
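
For anyone wanting to check the same thing in their own logs, here is a
minimal sketch that counts which status codes a given crawler received
per URL. It assumes Apache's "combined" log format; the function name
crawler_hits, the "msnbot" substring default and the sample lines are
invented for illustration and are not taken from the real log.

```python
import re
from collections import Counter

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def crawler_hits(lines, agent_substring="msnbot"):
    """Count (path, status) pairs for requests whose User-Agent
    contains agent_substring (case-insensitive)."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and agent_substring.lower() in match.group("agent").lower():
            hits[(match.group("path"), int(match.group("status")))] += 1
    return hits


# Made-up sample lines, for illustration only.
sample = [
    '203.0.113.5 - - [10/Oct/2008:13:55:36 +1300] "GET /old-page.html HTTP/1.1" '
    '301 245 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"',
    '203.0.113.5 - - [10/Oct/2008:13:55:37 +1300] "GET /old-page.html HTTP/1.1" '
    '410 312 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"',
    '198.51.100.7 - - [10/Oct/2008:14:02:10 +1300] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0"',
]

for (path, status), count in sorted(crawler_hits(sample).items()):
    print(path, status, count)
```

Run against a real access log by passing an open file object instead of
the sample list. Lines that don't match the pattern are skipped.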
> My solution to that is to create a list of all the deleted pages that 
> search engines have in their index (scrape the search engine index for 
> this), and link to them somewhere - linking to your deleted pages in the 
> XML sitemap is a good idea, or create a 'deleted pages' page on your 
> site and discreetly link to it. Search engines follow your links, see 
> the 301, see the 410, then remove the page from the index. That's the 
> theory anyway.
>   
That sounds peculiar... I don't think they're following a link at the
moment. I think they're just being stubborn, possibly because those URLs
could have been very popular search clickthroughs. I can't really be
bothered munging up the sitemap just to point to a small handful of
these URLs. Besides, they're on a different domain to the new site, and
I don't think XML sitemaps do cross-domain?
> If you are saying that MSN is following your 301 and actually hitting 
> the 410, then not removing the page from the index, that does seem odd.
>   
Does indeed. Google isn't doing it, just MSN. Some of the pages are
actually leading to 404 rather than 410 due to a bug in my pattern
matching, and it's still retrying them too, so it's not just ignorance
of a 410 code.

I think the next thing I'll try is putting a special robots.txt file on
the old domain and excluding that file from the redirect, so if they hit
the old domain and ask for it, it'll tell them they're not allowed to
crawl anything on that domain (and hopefully therefore they won't hit
the old URLs).
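
A rough sketch of that idea, assuming a blanket "Disallow: /" for every
crawler. The handler function, the example.com hostnames and the
robots.txt contents are all hypothetical. On a real site the exception
would live in the web server's redirect rules; the Python below only
models the decision and checks the robots.txt with the standard
library's parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the old domain: ask every crawler to stay away.
OLD_DOMAIN_ROBOTS_TXT = "User-agent: *\nDisallow: /\n"


def handle_old_domain_request(path, new_site="http://www.example.com"):
    """Return (status, body_or_location) for a request to the old domain.

    /robots.txt is served directly; everything else is redirected.
    """
    if path == "/robots.txt":
        return 200, OLD_DOMAIN_ROBOTS_TXT
    return 301, new_site + path


# Check how a well-behaved crawler would interpret the file.
parser = RobotFileParser()
parser.parse(OLD_DOMAIN_ROBOTS_TXT.splitlines())
blocked = not parser.can_fetch("msnbot", "http://old.example.com/old-page.html")

print(handle_old_domain_request("/robots.txt")[0])
print(handle_old_domain_request("/old-page.html"))
print("msnbot blocked:", blocked)
```

One thing to keep in mind: robots.txt controls crawling rather than
indexing, so a crawler that obeys it will stop requesting the old URLs
and will also stop seeing the 301 and the 410.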

I also considered just rewriting these old URLs to the homepage or
something, that way at least they'll get some content and figure out
it's not relevant anymore.


Neil

--~--~---------~--~----~------------~-------~--~----~
NZ PHP Users Group: http://groups.google.com/group/nzphpug
-~----------~----~----~----~------~----~------~--~---