[Wikidata-bugs] [Maniphest] [Commented On] T144308: [Task] Disallow Special:GoToLinkedPage in wikidata.org/robots.txt

2016-11-10 Thread thiemowmde
thiemowmde added a comment. We changed our robots.txt two and a half weeks ago. Re-visiting possibly millions of URLs in such a short time is something neither we nor Google want. At the moment there are 28,000 left, it seems. The links we want to exclude are tools and never meant to be indexed.

[Wikidata-bugs] [Maniphest] [Commented On] T144308: [Task] Disallow Special:GoToLinkedPage in wikidata.org/robots.txt

2016-11-09 Thread Sjoerddebruin
Sjoerddebruin added a comment. https://www.google.com/search?q=Ethiopian+wolf+site%3Awikidata.org was visited yesterday and is still indexed by Google.

[Wikidata-bugs] [Maniphest] [Commented On] T144308: [Task] Disallow Special:GoToLinkedPage in wikidata.org/robots.txt

2016-10-24 Thread Mbch331
Mbch331 added a comment. Request on https://www.wikidata.org/wiki/MediaWiki_talk:Robots.txt#Also_exclude_URLs_with_question_marks is done.

[Wikidata-bugs] [Maniphest] [Commented On] T144308: [Task] Disallow Special:GoToLinkedPage in wikidata.org/robots.txt

2016-10-12 Thread thiemowmde
thiemowmde added a comment. As you said, Special:GoToLinkedPage redirects and does not output HTML (except for the form, which is a single page, and the reason why Disallow: /wiki/Special:GoToLinkedPage should not be used). The target pages of these redirects are Wikipedia articles. They should be

[Wikidata-bugs] [Maniphest] [Commented On] T144308: [Task] Disallow Special:GoToLinkedPage in wikidata.org/robots.txt

2016-10-12 Thread TheDJ
TheDJ added a comment. FYI: you disallowed crawling; that doesn't mean you disallowed indexing for modern search engines. If another indexed page links to the URL, Google will still index it. To quote: When you block URLs from being indexed in Google via robots.txt, they may still show those

[Wikidata-bugs] [Maniphest] [Commented On] T144308: [Task] Disallow Special:GoToLinkedPage in wikidata.org/robots.txt

2016-09-23 Thread Mbch331
Mbch331 added a comment. In T144308#2662636, @Sjoerddebruin wrote: Google still indexes these pages: https://www.google.nl/search?client=safari=en=African+wild+dog+site:wikidata.org=UTF-8=UTF-8_rd=cr=G1HlV9TLHemRwAKeurXICg (notice that the first result has a cached version from yesterday) Probably

[Wikidata-bugs] [Maniphest] [Commented On] T144308: [Task] Disallow Special:GoToLinkedPage in wikidata.org/robots.txt

2016-09-23 Thread Sjoerddebruin
Sjoerddebruin added a comment. Google still indexes these pages: https://www.google.nl/search?client=safari=en=African+wild+dog+site:wikidata.org=UTF-8=UTF-8_rd=cr=G1HlV9TLHemRwAKeurXICg (notice that the first result has a cached version from yesterday)

[Wikidata-bugs] [Maniphest] [Commented On] T144308: [Task] Disallow Special:GoToLinkedPage in wikidata.org/robots.txt

2016-09-01 Thread TheDJ
TheDJ added a comment. You'll probably want to consider adapting the extension to enforce this in a better way for all users. Perhaps the X-Robots-Tag HTTP header can be used to remove the redirect from the index. Not sure; redirects can be a bit problematic in that way.
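
For illustration only, the X-Robots-Tag idea could look roughly like the sketch below. The function name build_redirect_headers, the 302 status code, and the example target URL are assumptions made for this sketch and are not taken from the extension. Whether search engines honor the header on a redirect response is exactly the open question raised in the comment, and a crawler only sees the header if robots.txt still allows it to fetch the URL.

```python
def build_redirect_headers(target_url: str) -> tuple[int, dict[str, str]]:
    """Return a status code and headers for a redirect that asks not to be indexed.

    Hypothetical helper: the status code and header set are assumptions for
    this sketch, not the behavior of Special:GoToLinkedPage.
    """
    headers = {
        # Where the client is sent.
        "Location": target_url,
        # Ask search engines not to index this URL itself.
        "X-Robots-Tag": "noindex",
    }
    return 302, headers


status, headers = build_redirect_headers("https://de.wikipedia.org/wiki/Beispiel")
print(status, headers["X-Robots-Tag"])
```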

[Wikidata-bugs] [Maniphest] [Commented On] T144308: [Task] Disallow Special:GoToLinkedPage in wikidata.org/robots.txt

2016-08-30 Thread hoo
hoo added a comment. What about URLs like title=Special:GoToLinkedPage=dewiki=Q123456?
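
The question above can be checked with Python's standard urllib.robotparser, as a rough sketch. The rule below is a hypothetical example, not the actual wikidata.org robots.txt, and the three URLs (including the /w/index.php path) are assumed forms chosen for illustration. The sketch suggests that a path-prefix rule leaves the form page crawlable but does not cover the title= query-string form.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule for illustration; not the real wikidata.org robots.txt.
RULES = """\
User-agent: *
Disallow: /wiki/Special:GoToLinkedPage/
"""


def can_crawl(rules: str, url: str) -> bool:
    """Return True if a generic crawler may fetch url under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch("*", url)


# Assumed URL forms, chosen for this sketch.
path_form = "https://www.wikidata.org/wiki/Special:GoToLinkedPage/dewiki/Q123456"
form_page = "https://www.wikidata.org/wiki/Special:GoToLinkedPage"
query_form = "https://www.wikidata.org/w/index.php?title=Special:GoToLinkedPage"

for url in (path_form, form_page, query_form):
    print(can_crawl(RULES, url), url)
```

Covering the query-string form would need an additional rule. Note also that robots.txt only controls crawling, so a URL blocked this way can still appear in an index if other pages link to it.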