I can't see the recent messages when searching via Startpage, but I do see
them when going straight to Google from the Google corporate network, which
I assume hits the "bleeding edge" index. My guess is that the archive is
only picked up when Google recrawls the entire web, not in the incremental
crawls. That would mean it is crawled less often, and each crawl takes
longer to reach the public results because the big index needs more
extensive QA.

On Fri, Aug 26, 2016 at 1:05 AM Georgi Guninski <gunin...@guninski.com>
wrote:

> On Fri, Aug 26, 2016 at 12:23:00AM -0700, Riad S. Wahby wrote:
> > Georgi Guninski <gunin...@guninski.com> wrote:
> > > Which parts of the cpunks.org web archive are indexed by google?
> >
> > No idea. I had assumed that without a robots.txt Google would just
> > slurp up everything, but perhaps that's not true. Anyhow, I added
> > one just in case (though you'll have to clear your browser's cache
> > because the blanket rewrite rule uses 301s).
> >
>
> I suppose the redirected robots.txt is not a problem and it used to
> work.
>
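
On the robots.txt point quoted above: a quick offline way to confirm that a
given robots.txt permits crawling is Python's standard urllib.robotparser.
This is only a sketch. The robots.txt contents and the URL below are
hypothetical, since the actual file on cpunks.org is not shown in this
thread, and is_allowed is an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical permissive robots.txt (an empty Disallow allows everything).
# The real file added on cpunks.org is not quoted in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical archive URL, used for illustration only.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "https://cpunks.org/some/archive/page"))
```

This checks only what the file permits. It says nothing about whether or
when a crawler actually visits, and it does not follow the 301 redirect
mentioned in the quoted message.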
