https://bugzilla.wikimedia.org/show_bug.cgi?id=62494

            Bug ID: 62494
           Summary: Add exception in robots.txt to allow the Internet
                    Archiver to index action=raw
           Product: Wikimedia
           Version: wmf-deployment
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: Unprioritized
         Component: Site requests
          Assignee: wikibugs-l@lists.wikimedia.org
          Reporter: nathanlarson3...@gmail.com
                CC: benap...@gmail.com,
                    bugzilla+org.wikime...@tuxmachine.com,
                    dereck...@espace-win.org, g...@wikimedia.org,
                    tom...@twkozlowski.net, wikimedia.b...@snowolf.eu
       Web browser: ---
   Mobile Platform: ---

Presently, robots.txt has "Disallow: /w/". Thus, the raw wikitext of pages isn't
accessible via the Internet Archive; see e.g.
https://web.archive.org/web/20140307111730/http://en.wikipedia.org/w/index.php?title=Main_Page&action=raw

This is in contrast to sites like WikiIndex, which allow it:
https://web.archive.org/web/20131021230044/http://wikiindex.org/index.php?title=Welcome&action=raw

We should allow the Internet Archiver to index these pages so that the raw
wikitext will be available for future generations, even if the page goes away.
See
[[mw:Manual:Robots.txt#Allow_indexing_of_raw_pages_by_the_Internet_Archiver]].
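
As a rough sketch of what such an exception might look like, the Python below embeds a hypothetical robots.txt fragment and checks it with a small matcher. The "ia_archiver" user-agent group, the Allow pattern, wildcard support, and longest-match precedence are all assumptions here, not the deployed Wikimedia configuration or confirmed crawler behaviour. The helper names (parse_groups, pattern_to_regex, can_fetch) are made up for illustration.

```python
import re

# Hypothetical robots.txt fragment: the ia_archiver group and its Allow
# pattern are assumptions, not the actual Wikimedia robots.txt.
ROBOTS_TXT = """\
User-agent: ia_archiver
Allow: /w/index.php?*action=raw
Disallow: /w/

User-agent: *
Disallow: /w/
"""

def parse_groups(text):
    """Return {user_agent: [(is_allow, pattern), ...]} for each group."""
    groups, agent = {}, None
    for line in text.splitlines():
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            agent = value.lower()
            groups.setdefault(agent, [])
        elif field in ("allow", "disallow") and agent is not None and value:
            groups[agent].append((field == "allow", value))
    return groups

def pattern_to_regex(pattern):
    # '*' matches any run of characters; everything else is literal.
    return re.compile(".*".join(re.escape(part) for part in pattern.split("*")))

def can_fetch(groups, agent, path):
    """Longest matching pattern wins; Allow wins ties; no match means allowed."""
    rules = groups.get(agent.lower(), groups.get("*", []))
    matches = [(len(pat), allow) for allow, pat in rules
               if pattern_to_regex(pat).match(path)]
    return max(matches)[1] if matches else True

groups = parse_groups(ROBOTS_TXT)
raw_path = "/w/index.php?title=Main_Page&action=raw"
edit_path = "/w/index.php?title=Main_Page&action=edit"
```

Under these assumptions, can_fetch(groups, "ia_archiver", raw_path) is allowed, while edit_path stays blocked for ia_archiver and raw_path stays blocked for every other crawler. Whether the Internet Archive's crawler honours wildcards and this precedence would need to be verified before deploying anything like it.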

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
