On Thu, 3 Oct 2019 at 09:45, Vladimir Sitnikov <[email protected]> wrote:
>
> Milamber>I just upload a robots.txt file with disallow all into each RC into my
> Milamber>public html folder on home.apache.org.
>
> Can you please clarify?
>
> https://www.robotstxt.org/robotstxt.html says the file should be placed at
> the root directory.
>
> Then Google's documentation says:
>
> https://support.google.com/webmasters/answer/6062608?hl=en
> Google>A robotted page can still be indexed if linked to from from other sites
> Google>While Google won't crawl or index the content blocked by robots.txt,
> we might still find and index a disallowed URL if it is linked from other
> places on the web
>
> Which means "Google would discover the link to the preview from a mailing
> list archive", so the only feasible option is to remove the page completely
> (or move it to a new place that is never mentioned in the mailing lists"

As the doc says: Google won't index the contents.
So surely that means it won't return matches to the staging site based on
content? I think that will address your concern (assuming the robots file
is in the correct place to prevent content indexing).

There's nothing we can do to stop the URL itself from being indexed, but
does that matter provided that the content it holds is not indexed?

Regardless of where the staging content is held, it's a good idea to stop
it from being indexed.

> Vladimir
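
For reference, here is a rough sketch of the two points above: where a crawler looks for robots.txt, and what a "disallow all" file does once it is found there. It uses Python's standard urllib.robotparser. The staging URL (including the ~someuser path) and the robots_url_for helper are made up for illustration, and "Googlebot" is simply the user agent I picked to test with.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url_for(page_url):
    """Hypothetical helper: crawlers fetch robots.txt from the host root,
    not from the directory that holds the page."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A "disallow all" robots.txt, as described in the thread.
DISALLOW_ALL = ["User-agent: *", "Disallow: /"]

parser = RobotFileParser()
parser.parse(DISALLOW_ALL)

# Assumed staging location, for illustration only.
staging_page = "https://home.apache.org/~someuser/rc1/index.html"

robots_location = robots_url_for(staging_page)
crawl_allowed = parser.can_fetch("Googlebot", staging_page)

print(robots_location)
print(crawl_allowed)
```

So the rules only take effect if the file is served from the host root. A copy sitting inside a per-user or per-RC folder would not be consulted by a crawler that follows the convention.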
