Rereading what you posted, by "such a prominent place on Google", did
you mean specifically the ".4" and ".3" links that show up below
sqlalchemy when www.sqlalchemy.org is returned in the search results?
Those are what google calls "sitelinks". You can tell them not to use
certain pages as sitelinks via the google webmaster tools. They'll
remove them, but they don't get replaced, so you'll just end up with two
fewer sitelinks.

On Thu, May 21, 2009 at 8:03 PM, Bobby Impollonia <[email protected]> wrote:
>>  otherwise if you have any advice on how to get 0.4/0.3
>> delisted from such a prominent place on Google, that would be
>> appreciated.
>
> The simplest thing to do is to append:
> Disallow: /docs/04/
> Disallow: /docs/03/
>
> to the file:
> http://www.sqlalchemy.org/robots.txt
>
> This tells google (and all well-behaved search engines) not to index
> those urls (and anything under them). The next time the googlebot
> comes through, it will see the new robots.txt and remove those pages
> from its index. This will take a couple weeks at most.
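>
> For example, the Disallow lines need to sit under a User-agent line to
> take effect, so assuming your existing robots.txt has (or gets) a
> catch-all "User-agent: *" section, the relevant part of the file would
> end up looking roughly like:
>
> User-agent: *
> Disallow: /docs/04/
> Disallow: /docs/03/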
>
> You can learn more about robots.txt here:
> http://www.robotstxt.org/
>
> The disadvantage to doing it that way is that you will lose the google
> juice (pagerank) for inbound links to the old documentation.
>
> An alternative approach that gets around this is to use a <link
> rel="canonical" ...> tag in the <head> of each page of the 04 and 03
> documentation, pointing to the corresponding page of the 05 documentation
> as its "canonical" url.
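>
> As a sketch (I'm assuming here that the 05 docs mirror the old paths
> under /docs/05/, and "ormtutorial.html" is just a made-up example page),
> a page like /docs/04/ormtutorial.html would carry this in its <head>:
>
> <link rel="canonical" href="http://www.sqlalchemy.org/docs/05/ormtutorial.html" />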
>
> By doing this, you are claiming that the 04/03 documentation pages
> are "duplicates" of the corresponding 05 pages. Google juice from
> inbound links to an old documentation page will accrue to the
> appropriate 05 documentation page instead.
>
> However, strictly speaking, the different versions aren't quite
> "duplicates", so you might be pushing the boundaries of what is
> allowed a bit by claiming they are.
>
> Here is more info on rel="canonical" from google:
> http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html
>
> A similar approach would be to do a 301 redirect from each old
> documentation page to the corresponding 05 documentation page, but
> only if the visitor is the googlebot. This is straightforward to
> implement with mod_rewrite (the googlebot can be recognized by its
> user-agent string), but probably a bad idea since google usually
> considers it "cloaking" to serve different content to the googlebot
> than to regular visitors.
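>
> For the record, the rules would be something like the following in an
> .htaccess file at the site root (the paths and the user-agent match are
> assumptions on my part, and again, this risks being treated as cloaking):
>
> RewriteEngine On
> RewriteCond %{HTTP_USER_AGENT} Googlebot
> RewriteRule ^docs/04/(.*)$ /docs/05/$1 [R=301,L]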
>
> You should also consider submitting an XML sitemap to google via the
> google webmaster tools. This allows you to completely spell out for
> them the structure of the site and what you want indexed.
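>
> A sitemap is just an XML file listing the urls you want crawled; a
> minimal one would look like this (the urls here are only placeholders):
>
> <?xml version="1.0" encoding="UTF-8"?>
> <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
>   <url><loc>http://www.sqlalchemy.org/</loc></url>
>   <url><loc>http://www.sqlalchemy.org/docs/05/</loc></url>
> </urlset>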
>
> I also noticed that your current robots.txt file disallows indexing of
> anything under /trac/. It would be nice to let google index bugs in trac
> so that someone who searches google for sqlalchemy help can come
> across an extant bug describing their problem. In addition, you have
> links on the front page ("changelog" and "what's new") that go to urls
> under /trac/, so google will not follow those links due to your
> robots.txt.
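>
> If you wanted to open up just the tickets while keeping the rest of trac
> blocked, google honors an Allow directive (it's a non-standard extension,
> but the major engines support it), so something like this should work,
> assuming tickets live under /trac/ticket/ as in a default trac setup:
>
> User-agent: *
> Allow: /trac/ticket/
> Disallow: /trac/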
>
