I can't see the recent messages when searching via Startpage, but I do see them when going straight to Google from the Google corporate network, which I assume hits the "bleeding edge" index. My guess is that the archive is only crawled when Google recrawls the entire web, not during the incremental crawls. If so, it gets crawled less often, and each crawl takes longer to reach the public index because the big index needs more extensive QA.
On Fri, Aug 26, 2016 at 1:05 AM Georgi Guninski <gunin...@guninski.com> wrote:

> On Fri, Aug 26, 2016 at 12:23:00AM -0700, Riad S. Wahby wrote:
> > Georgi Guninski <gunin...@guninski.com> wrote:
> > > Which parts of the cpunks.org web archive are indexed by google?
> >
> > No idea. I had assumed that without a robots.txt Google would just
> > slurp up everything, but perhaps that's not true. Anyhow, I added
> > one just in case (though you'll have to clear your browser's cache
> > because the blanket rewrite rule uses 301s).
> >
>
> I suppose the redirected robots.txt is not a problem and it used to
> work.
>
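
For anyone who wants to rule out the robots.txt itself, here is a rough sketch of checking whether a given set of rules would let a crawler fetch an archive page. It uses Python's standard `urllib.robotparser`. The rule texts, the `crawler_allowed` helper and the example.org URL are all made up for illustration; I have not looked at what the real file on the archive says.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rule sets, not the contents of the real file.
ALLOW_ALL = "User-agent: *\nDisallow:\n"    # empty Disallow permits everything
BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # blocks the whole site

# Placeholder URL standing in for a page of the web archive.
ARCHIVE_URL = "https://example.org/archive/2016-August/"

if __name__ == "__main__":
    print(crawler_allowed(ALLOW_ALL, "Googlebot", ARCHIVE_URL))
    print(crawler_allowed(BLOCK_ALL, "Googlebot", ARCHIVE_URL))
```

This only says what the rules permit, not what Google has actually indexed. To test the live file instead of a string, `RobotFileParser` also has `set_url()` and `read()`, which fetch the file over HTTP.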