On Mon, Jan 16, 2012 at 15:10, Ronald Chmara <[email protected]> wrote:
> http://lists.pdxlinux.org/pipermail/plug/
>      <META NAME="robots" CONTENT="noindex,follow">
>
> http://lists.pdxlinux.org/pipermail/plug/2012-January/thread.html
>     <META NAME="robots" CONTENT="noindex,follow">
>
> http://lists.pdxlinux.org/pipermail/plug/2012-January/074836.html
>   <META NAME="robots" CONTENT="index,nofollow">

This looks okay to me.  From what I read, the index pages say not to
index them but to follow their links.  An actual message page says to
index it but not to follow any links it contains.  Perhaps the crawler
can't reach the index pages because nothing already in its database
links to them in a way it can follow.  Digging...
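
To make that reading concrete, here is a minimal Python sketch that
splits a robots META CONTENT value into "may index" and "may follow"
flags.  The function name parse_robots_meta is made up for this
example, and it only knows the index/noindex/follow/nofollow/none
keywords; real crawlers understand more directives than this.

```python
def parse_robots_meta(content):
    """Return (may_index, may_follow) for a robots META CONTENT value.

    Hypothetical helper for illustration only. Anything not explicitly
    forbidden is treated as allowed, which is the usual default.
    """
    directives = {d.strip().lower() for d in content.split(",")}
    blocked_all = "none" in directives  # "none" means noindex,nofollow
    may_index = not blocked_all and "noindex" not in directives
    may_follow = not blocked_all and "nofollow" not in directives
    return may_index, may_follow


# The three pages quoted above and the CONTENT value each one carries.
pages = {
    "/pipermail/plug/": "noindex,follow",
    "/pipermail/plug/2012-January/thread.html": "noindex,follow",
    "/pipermail/plug/2012-January/074836.html": "index,nofollow",
}

for path, content in pages.items():
    may_index, may_follow = parse_robots_meta(content)
    print(path, "index" if may_index else "skip",
          "follow" if may_follow else "stop")
```

So the index pages act as stepping stones to the messages, and the
messages themselves are the only pages meant to end up in the index.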

http://pdxlinux.org/robots.txt has disallow entries for /mailman/ and
/pipermail/, but this applies only to the pdxlinux.org domain(?), not
to lists.pdxlinux.org.  The email archives are also linked to directly
from pdxlinux.org/mail/, which is crawlable.  Digging...
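
A rough sketch of that per-host behaviour, using Python's standard
urllib.robotparser.  The rules below are my assumption of what the
pdxlinux.org file looks like (I only know about the two disallow
entries; the "User-agent: *" line is a guess), and I'm also assuming
lists.pdxlinux.org serves no robots.txt of its own.  The allowed()
helper is made up for this example.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Assumed contents of http://pdxlinux.org/robots.txt (not a verbatim copy).
ASSUMED_RULES = [
    "User-agent: *",
    "Disallow: /mailman/",
    "Disallow: /pipermail/",
]

# A crawler looks up robots.txt separately for each host name.
parsers = {"pdxlinux.org": RobotFileParser()}
parsers["pdxlinux.org"].parse(ASSUMED_RULES)


def allowed(url, agent="*"):
    """Hypothetical helper: check a URL against its own host's rules."""
    host = urlparse(url).hostname
    parser = parsers.get(host)
    if parser is None:
        # Assumption: a host with no robots.txt allows everything.
        return True
    return parser.can_fetch(agent, url)


for url in (
    "http://pdxlinux.org/pipermail/plug/",
    "http://pdxlinux.org/mail/",
    "http://lists.pdxlinux.org/pipermail/plug/",
):
    print(url, allowed(url))
```

If those assumptions hold, the disallow entries never come into play
for the archive URLs, since those live on the lists.pdxlinux.org host.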

The two messages from October that Google did index are showing up
because they're linked to from a web history of the #orlug IRC
channel.  This was the only place they were linked from according to
Google:
http://home.borked.us/irc/urllog.shtml

Bing shows only 12 messages from the mailing list archive in all of
2011.  Bing, however, also doesn't show much for previous years (159
pages indexed for all years...they lie and say 1,510 results until you
get to page 9 and then it changes to 159).

Anyone else with ideas?  Is there a place someone could've submitted a
URL to tell Google we don't want to be indexed anymore, or is it
possible they just haven't got around to doing it in a while?
_______________________________________________
PLUG mailing list
[email protected]
http://lists.pdxlinux.org/mailman/listinfo/plug
