[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

Eran Roz <eran@outlook.com> changed:

           What    |Removed |Added
           CC      |        |eran@outlook.com
           Blocks  |        |14720

--
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
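The feature this bug requests — emitting a percent-encoded twin for each Disallow line so crawlers match both spellings — can be sketched roughly as below. This is a hypothetical standalone script, not MediaWiki code; the function name and behavior are assumptions for illustration only:

```python
from urllib.parse import quote

def expand_robots_entry(path: str) -> list[str]:
    """Return a Disallow path plus its percent-encoded variant.

    Crawlers match robots.txt rules against the literal URL path,
    so /wiki/Foo:Bar and /wiki/Foo%3ABar are distinct rules.
    """
    # Encode everything except '/' so path segments stay intact.
    encoded = quote(path, safe="/")
    variants = [path]
    if encoded != path:
        variants.append(encoded)
    return variants

for p in expand_robots_entry("/wiki/Wikipedia:Arbitration/Example"):
    print("Disallow:", p)
```

A generator like this would let editors keep maintaining only the human-readable form in MediaWiki:Robots.txt while the encoded lines are derived automatically.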
[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

Nemo <federicol...@tiscali.it> changed:

           What      |Removed |Added
           Severity  |major   |enhancement
[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

Nemo <federicol...@tiscali.it> changed:

           What       |Removed       |Added
           Component  |Site requests |General/Unknown
[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

Mark A. Hershberger <m...@everybody.org> changed:

           What      |Removed       |Added
           Priority  |Unprioritized |Lowest
           CC        |              |m...@everybody.org

--- Comment #5 from Mark A. Hershberger <m...@everybody.org> 2011-06-15 01:30:51 UTC ---
(In reply to comment #0)
> MediaWiki should generate these extra rules automatically for users.

(In reply to comment #4)
> MediaWiki:Robots.txt is a WMF hack, there's no such feature in MediaWiki core.

Now, how to prioritize...
[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

Alexandre Emsenhuber [IAlex] <ialex.w...@gmail.com> changed:

           What       |Removed         |Added
           CC         |                |ialex.w...@gmail.com
           Component  |General/Unknown |Site requests
           Product    |MediaWiki       |Wikimedia

--- Comment #4 from Alexandre Emsenhuber [IAlex] <ialex.w...@gmail.com> 2011-05-28 19:20:50 UTC ---
Changing product/component to Wikimedia/Site requests; MediaWiki:Robots.txt is a WMF hack, there's no such feature in MediaWiki core.
[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

--- Comment #2 from FT2 <ft2.w...@gmail.com> 2011-05-27 13:51:10 UTC ---
Very useful, but we need to check what happens when some of the alternate URLs for a canonical page are excluded by robots.txt and some aren't. Does the crawler apply the robots.txt rule it has for the canonical page to all the alternatives, or does it get confused?

Example: /Wikipedia%3AArbitration%2FExample declares /Wikipedia:Arbitration/Example as its canonical link, but one of these is NOINDEXed via robots.txt or its header and the other isn't. Knowing the canonical link helps identify them as duplicates of the same page, but does it guarantee both will be treated as NOINDEXed when only one of them is? Or do we still have to cover all variants of the URL in robots.txt?
[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

--- Comment #3 from FT2 <ft2.w...@gmail.com> 2011-05-27 17:43:31 UTC ---
To clarify: URL variants where robots.txt or header tags prohibit spidering will probably be excluded from spidering in the first place. So Google will be left to collate the URL variants it came across where robots.txt or header tags _didn't_ prevent spidering -- plus a canonical setting which states these are all the same page. I.e. this setting could help avoid duplicates, but my guess is it probably _won't_ prevent URLs that robots.txt or header tags failed to stop from being listed in results.
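FT2's concern can be made concrete: the two URL spellings decode to the same title, yet a literal robots.txt prefix match treats them as different paths, so a rule that lists only the unencoded spelling misses the encoded one. A minimal sketch, assuming a naive prefix matcher (the behavior described by the original robots.txt convention, not Google's actual parser):

```python
from urllib.parse import unquote

def is_disallowed(path: str, rules: list[str]) -> bool:
    # Naive prefix matching against literal Disallow rules.
    return any(path.startswith(rule) for rule in rules)

# Only the unencoded spelling is listed in robots.txt.
rules = ["/Wikipedia:Arbitration"]

plain = "/Wikipedia:Arbitration/Example"
encoded = "/Wikipedia%3AArbitration%2FExample"

# Both spellings name the same page...
assert unquote(plain) == unquote(encoded)

# ...but only one of them is blocked by the rule.
print(is_disallowed(plain, rules))    # True
print(is_disallowed(encoded, rules))  # False
```

This is exactly the gap the bug asks to close: generating the encoded line automatically would make both spellings match the same rule set.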