[MediaWiki-commits] [Gerrit] MediaWiki.php: Redirect non-standard title urls to canonical - change (mediawiki/core)

2015-06-30 Thread jenkins-bot (Code Review)
jenkins-bot has submitted this change and it was merged.

Change subject: MediaWiki.php: Redirect non-standard title urls to canonical
..


MediaWiki.php: Redirect non-standard title urls to canonical

Urls that use the page's title and no extra query parameters now redirect
to the standard url format.

Previously we only did this for variations of the title value (e.g. 
Foo%20Bar),
not for variations of the overall url structure (like title=Foo - /wiki/Foo).

Existing redirect (unchanged):
 /wiki/Foo%20Bar
 /w/index.php?title=Foo%20Bar

New redirects:
 /wiki/Foo_Bar?action=view
 /w/index.php?title=Foo_Bar
 /w/index.php?title=Foo_Baraction=view

Any intentional (or unintentional) ways a url can be rewritten by the server,
such as /?title=Foo_Bar in case of Wikimedia, are redirected as well.

While this has been a problem for many years, it went unnoticed until
recently when Google started to index significantly more results of
the /?title=name form. This query returns About 3,220,000 results:
https://google.com/search?q=site:en.wikipedia.org+inurl:title+-intitle:title

The only change in logic is that the titlekey comparison is now no longer a
factor in deciding whether to redirect. Instead the existing comparison for the
entire url is used to cover this.

However I kept titlekey comparison in the redirect-loop check as otherwise this
check would throw on all canonical page views where no redirect can be made.
Added a comment explaining how this redirect loop was possible.

Bug: T67402
Change-Id: I88ed3525141c765910e66188427b9aab36b958a9
---
M includes/MediaWiki.php
M tests/phpunit/includes/MediaWikiTest.php
2 files changed, 21 insertions(+), 11 deletions(-)

Approvals:
  Aaron Schulz: Looks good to me, approved
  BBlack: Looks good to me, but someone else must approve
  jenkins-bot: Verified



diff --git a/includes/MediaWiki.php b/includes/MediaWiki.php
index 932dea2..5510d35 100644
--- a/includes/MediaWiki.php
+++ b/includes/MediaWiki.php
@@ -279,6 +279,8 @@
 * - Normalise empty title:
 *   /wiki/ - /wiki/Main
 *   /w/index.php?title= - /wiki/Main
+* - Normalise non-standard title urls:
+*   /w/index.php?title=Foo_Bar - /wiki/Foo_Bar
 * - Don't redirect anything with query parameters other than 'title' 
or 'action=view'.
 *
 * @return bool True if a redirect was set.
@@ -289,8 +291,6 @@
 
if ( $request-getVal( 'action', 'view' ) != 'view'
|| $request-wasPosted()
-   || ( $request-getVal( 'title' ) !== null
-$title-getPrefixedDBkey() == 
$request-getVal( 'title' ) )
|| count( $request-getValueNames( array( 'action', 
'title' ) ) )
|| !Hooks::run( 'TestCanonicalRedirect', array( 
$request, $title, $output ) )
) {
@@ -305,7 +305,19 @@
}
// Redirect to canonical url, make it a 301 to allow caching
$targetUrl = wfExpandUrl( $title-getFullURL(), PROTO_CURRENT );
-   if ( $targetUrl == $request-getFullRequestURL() ) {
+
+   if ( $targetUrl != $request-getFullRequestURL() ) {
+   $output-setSquidMaxage( 1200 );
+   $output-redirect( $targetUrl, '301' );
+   return true;
+   }
+
+   // If there is no title, or the title is in a non-standard 
encoding, we demand
+   // a redirect. If cgi somehow changed the 'title' query to be 
non-standard while
+   // the url is standard, the server is misconfigured.
+   if ( $request-getVal( 'title' ) === null
+   || $title-getPrefixedDBkey() != $request-getVal( 
'title' )
+   ) {
$message = Redirect loop detected!\n\n .
This means the wiki got confused about what 
page was  .
requested; this sometimes happens when moving 
a wiki  .
@@ -327,9 +339,7 @@
}
throw new HttpError( 500, $message );
}
-   $output-setSquidMaxage( 1200 );
-   $output-redirect( $targetUrl, '301' );
-   return true;
+   return false;
}
 
/**
diff --git a/tests/phpunit/includes/MediaWikiTest.php 
b/tests/phpunit/includes/MediaWikiTest.php
index df94d3e..e196243 100644
--- a/tests/phpunit/includes/MediaWikiTest.php
+++ b/tests/phpunit/includes/MediaWikiTest.php
@@ -34,7 +34,7 @@
'url' = 
'http://example.org/w/index.php?title=Foo_Bar',
'query' = array( 'title' = 'Foo_Bar' ),
'title' = 'Foo_Bar',
-   'redirect' = false,
+   'redirect' = 

[MediaWiki-commits] [Gerrit] MediaWiki.php: Redirect non-standard title urls to canonical - change (mediawiki/core)

2015-06-19 Thread Krinkle (Code Review)
Krinkle has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/219446

Change subject: MediaWiki.php: Redirect non-standard title urls to canonical
..

MediaWiki.php: Redirect non-standard title urls to canonical

Urls that use the page's title and no extra query parameters now redirect
to the standard url format.

Previously we only did this for variations of the title value, not for 
variations
of the overlal url structure.

These already redirected:
  http://example.org/wiki/Foo%20Bar   
http://example.org/wiki/Foo_Bar
  http://example.org/w/index.php?title=Foo%20Bar  
http://example.org/wiki/Foo_Bar
  http://example.org/?title=Foo%20Bar 
http://example.org/wiki/Foo_Bar

We now redirect these as well:
  http://example.org/w/index.php?title=Foo_Bar
http://example.org/wiki/Foo_Bar
  http://example.org/?title=Foo_Bar   
http://example.org/wiki/Foo_Bar

The only change is that the titlekey comparison is now no longer a factor in
deciding whether to redirect. Instead the existing comparison for the entire url
is used to cover this.

However kept the titlekey check for the redirect-loop as otherwise this check
would throw on all canonical page views where no redirect can be made. Added a
comment explaining how a redirect loop was possible as the code looked redudant.

Bug: T67402
Change-Id: I88ed3525141c765910e66188427b9aab36b958a9
---
M includes/MediaWiki.php
1 file changed, 31 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/core 
refs/changes/46/219446/1

diff --git a/includes/MediaWiki.php b/includes/MediaWiki.php
index 5e3836f..2ab6b7a 100644
--- a/includes/MediaWiki.php
+++ b/includes/MediaWiki.php
@@ -276,6 +276,8 @@
 * - Normalise empty title:
 *   /wiki/ - /wiki/Main
 *   /w/index.php?title= - /wiki/Main
+* - Normalise non-standard title urls:
+*   /w/index.php?title=Foo_Bar - /wiki/Foo_Bar
 * - Don't redirect anything with query parameters other than 'title' 
or 'action=view'.
 *
 * @return bool True if a redirect was set.
@@ -286,8 +288,6 @@
 
if ( $request-getVal( 'action', 'view' ) != 'view'
|| $request-wasPosted()
-   || ( $request-getVal( 'title' ) !== null
-$title-getPrefixedDBkey() == 
$request-getVal( 'title' ) )
|| count( $request-getValueNames( array( 'action', 
'title' ) ) )
|| !Hooks::run( 'TestCanonicalRedirect', array( 
$request, $title, $output ) )
) {
@@ -302,28 +302,38 @@
}
// Redirect to canonical url, make it a 301 to allow caching
$targetUrl = wfExpandUrl( $title-getFullURL(), PROTO_CURRENT );
-   if ( $targetUrl == $request-getFullRequestURL() ) {
-   $message = Redirect loop detected!\n\n .
-   This means the wiki got confused about what 
page was  .
-   requested; this sometimes happens when moving 
a wiki  .
-   to a new server or changing the server 
configuration.\n\n;
 
-   if ( $this-config-get( 'UsePathInfo' ) ) {
-   $message .= The wiki is trying to interpret 
the page  .
-   title from the URL path portion 
(PATH_INFO), which  .
-   sometimes fails depending on the web 
server. Try  .
-   setting \\$wgUsePathInfo = false;\ 
in your  .
-   LocalSettings.php, or check that 
\$wgArticlePath  .
-   is correct.;
-   } else {
-   $message .= Your web server was detected as 
possibly not  .
-   supporting URL path components 
(PATH_INFO) correctly;  .
-   check your LocalSettings.php for a 
customized  .
-   \$wgArticlePath setting and/or toggle 
\$wgUsePathInfo  .
-   to true.;
+   // If there is no title, or the title is in a non-standard 
encoding,
+   // we demand a redirect. If cgi somehow changed the 'title' 
query to be non-standard
+   // while the url is stadnard, the server is misconfigured.
+   if ( $targetUrl == $request-getFullRequestURL() ) {
+   if ( $request-getVal( 'title' ) === null
+   || $title-getPrefixedDBkey() != 
$request-getVal( 'title' )
+   ) {
+   $message = Redirect loop detected!\n\n .
+   This means the wiki got