As for the scope of logging 301s, 302s, and 404s, I don't think we need
to check EOL content for those.
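
To make that scoping concrete, here is a minimal sketch of tallying only
the 301/302/404 responses that fall outside EOL content. It is not taken
from the existing sitemap toolkit: the function names, the release sets,
and the assumption that the release series is the first path segment of
the URL are all hypothetical and used only for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical release sets, for illustration only. The proposal is to
# pull the supported releases automatically rather than hard-code them.
KNOWN_RELEASES = {"mitaka", "newton", "ocata", "pike"}
SUPPORTED_RELEASES = {"ocata", "pike"}

# Status codes the proposal wants to keep track of in the logs.
TRACKED_STATUSES = {301, 302, 404}


def is_eol_url(url):
    """Return True if the first path segment names an unsupported release.

    Assumes a layout such as https://docs.openstack.org/<release>/...;
    URLs without a release segment are treated as in scope.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:
        return False
    first = segments[0].lower()
    return first in KNOWN_RELEASES and first not in SUPPORTED_RELEASES


def tally_statuses(responses):
    """Count tracked status codes, skipping anything under EOL content.

    `responses` is an iterable of (url, status_code) pairs.
    """
    counts = Counter()
    for url, status in responses:
        if status in TRACKED_STATUSES and not is_eol_url(url):
            counts[status] += 1
    return counts
```

In a real crawl the (url, status) pairs would come from the crawler's
responses; the point here is only that EOL paths are filtered out before
anything is logged.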

As we are about to approve https://review.openstack.org/#/c/507629/, we
also want everybody to understand that broken links found in EOL content
won't be fixed, since EOL content will receive no further updates.

Cheers,
pk


On Thu, 5 Oct 2017 22:51:31 -0400 (EDT)
"[email protected]" <[email protected]> wrote:

> Hello all!
> 
> As you may be aware, sitemap generation for docs.openstack.org is currently
> done via a manually triggered scrapy process. It also scrapes the entirety
> of docs.openstack.org, which makes processing slow. To improve the
> efficiency of this process, I would like to propose the following updates
> to the sitemap generation toolkit:
>     * tracking (in logs) of 301s, 302s, and 404s,
>     * automatic pull of supported releases,
>     * cron-managed automatic updates,
>     * setup of Google Webmaster tools (https://www.google.com/webmasters/), and
>     * a few style cleanups.
>     
> Beyond this, implementing more targeted crawling would improve the processing
> speed and scope massively. This is, however, a bit of a complicated matter,
> as it requires us to decide what, exactly, defines scope relevance, in order
> to limit the crawl domain.
> 
> These are, of course, only our preliminary findings, and we would love to hear
> some feedback about alternate methods and possible tricky aspects of the
> suggested changes. What do you think? Let us know!
> 
> 
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: [email protected]?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
