[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt

2014-04-22 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

Eran Roz <eran@outlook.com> changed:

   What   | Removed | Added
   -------+---------+-----------------
   CC     |         | eran@outlook.com
   Blocks |         | 14720



[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt

2013-04-04 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

Nemo <federicol...@tiscali.it> changed:

   What     | Removed | Added
   ---------+---------+------------
   Severity | major   | enhancement



[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt

2013-04-04 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

Nemo <federicol...@tiscali.it> changed:

   What      | Removed       | Added
   ----------+---------------+----------------
   Component | Site requests | General/Unknown



[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt

2011-06-14 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

Mark A. Hershberger <m...@everybody.org> changed:

   What     | Removed       | Added
   ---------+---------------+-------------------
   Priority | Unprioritized | Lowest
   CC       |               | m...@everybody.org

--- Comment #5 from Mark A. Hershberger <m...@everybody.org> 2011-06-15 01:30:51 UTC ---
(In reply to comment #0)
> MediaWiki should generate these extra rules automatically for users.

(In reply to comment #4)
> MediaWiki:Robots.txt is a WMF hack; there's no such feature in MediaWiki
> core.

Now, how to prioritize...
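
Comment #0's request amounts to emitting a percent-encoded twin of each
Disallow line. A minimal Python sketch of that transformation, assuming
crawlers match robots.txt prefixes literally; add_encoded_variants is a
hypothetical helper, not MediaWiki or WMF code, and real requests may
encode further characters (the example in comment #2 also has '/' as %2F):

    from urllib.parse import quote

    def add_encoded_variants(robots_lines):
        """For each Disallow rule, also emit a percent-encoded variant
        (e.g. ':' -> '%3A'), keeping '/' as a path separator."""
        out = []
        for line in robots_lines:
            out.append(line)
            if line.lower().startswith("disallow:"):
                path = line.split(":", 1)[1].strip()
                encoded = quote(path, safe="/")  # ':' -> '%3A', '/' kept
                if encoded != path:
                    out.append("Disallow: " + encoded)
        return out

    print("\n".join(add_encoded_variants([
        "User-agent: *",
        "Disallow: /wiki/Wikipedia:Arbitration/Example",
    ])))
    # User-agent: *
    # Disallow: /wiki/Wikipedia:Arbitration/Example
    # Disallow: /wiki/Wikipedia%3AArbitration/Example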



[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt

2011-05-28 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

Alexandre Emsenhuber [IAlex] <ialex.w...@gmail.com> changed:

   What      | Removed         | Added
   ----------+-----------------+---------------------
   CC        |                 | ialex.w...@gmail.com
   Component | General/Unknown | Site requests
   Product   | MediaWiki       | Wikimedia

--- Comment #4 from Alexandre Emsenhuber [IAlex] <ialex.w...@gmail.com> 2011-05-28 19:20:50 UTC ---
Changing product/component to Wikimedia/Site requests: MediaWiki:Robots.txt
is a WMF hack; there's no such feature in MediaWiki core.



[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt

2011-05-27 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

--- Comment #2 from FT2 <ft2.w...@gmail.com> 2011-05-27 13:51:10 UTC ---
Very useful; however, we need to check what happens if some of the
alternate URLs for the canonical page are excluded by robots.txt and some
aren't.

Does it apply the robots.txt rule it has for the canonical page to all
alternatives, or does it get confused?

Example:

/Wikipedia%3AArbitration%2FExample is stated to have
/Wikipedia:Arbitration/Example as its canonical link. However, one of
these is NOINDEXed via robots.txt or in its header and one isn't. Knowing
the canonical link helps to identify these as duplicates of the same page.
But does it guarantee both of these will be treated as NOINDEXed if one of
them is and the other isn't? Or do we still have to cover all variants of
the URL in robots.txt?
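
To illustrate the concern: if the crawler compares request URLs
byte-for-byte against robots.txt prefixes (the premise of this bug), the
encoded twin of a disallowed path slips through. A toy literal matcher in
Python, assuming no percent-decoding on the crawler's side:

    def blocked(path, disallow_prefixes):
        # Naive, literal robots.txt prefix match: no percent-decoding.
        return any(path.startswith(p) for p in disallow_prefixes)

    rules = ["/wiki/Wikipedia:Arbitration/Example"]
    print(blocked("/wiki/Wikipedia:Arbitration/Example", rules))      # True
    print(blocked("/wiki/Wikipedia%3AArbitration%2FExample", rules))  # False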



[Bug 29162] Automatically add encoded URL lines for entries in MediaWiki:robots.txt

2011-05-27 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

--- Comment #3 from FT2 <ft2.w...@gmail.com> 2011-05-27 17:43:31 UTC ---
To clarify: URL variants where robots.txt or header tags prohibit
spidering will probably be excluded from spidering in the first place. So
Google will be left to collate those URL variants it came across where
robots.txt or header tags _didn't_ prevent spidering, plus a canonical
setting which states these are all the same page.

That is, this setting could help avoid duplicates, but my guess is it
probably _won't_ prevent URLs not stopped by robots.txt or header tags
from being listed in results.
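
Sketching that collation with hypothetical data: under literal matching,
robots.txt decides which variants get crawled at all, and rel=canonical
only merges duplicates among the variants that were actually crawled:

    DISALLOW = ["/wiki/Wikipedia:Arbitration/Example"]
    VARIANTS = [
        "/wiki/Wikipedia:Arbitration/Example",
        "/wiki/Wikipedia%3AArbitration%2FExample",
    ]

    # Only the exact disallowed form is excluded from the crawl.
    crawled = [u for u in VARIANTS
               if not any(u.startswith(p) for p in DISALLOW)]
    print(crawled)  # ['/wiki/Wikipedia%3AArbitration%2FExample']
    # The surviving variant can still appear in results; rel=canonical
    # only tells the engine which title to credit for it.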
