[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2012-04-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

Krinkle <krinklem...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|1.16.0                      |1.16.x

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
You are on the CC list for the bug.

___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-31 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

--- Comment #9 from Kiran Jonnalagadda <j...@pobox.com> 2010-10-31 11:47:34 UTC ---
Thank you, Daniel. This is a great explanation.

However, the exact same problem exists without the patch, in MediaWiki's
default configuration with ugly URLs. Consider the URLs it would generate for
an installation in /wiki:

/wiki/index.php?title=Example_page
/wiki/index.php?title=Example_page&month=11&year=2010

There is no way to use robots.txt to disallow the second URL without blocking
the first. This, therefore, is a problem for the Semantic Result Formats
extension to fix. It is not introduced by this patch.

As far as I can tell, there is no use case within MediaWiki's default
configuration that touches line 824 of Title.php -- there is never a URL that
includes a query but not an action parameter (apart from the default 'view'
action, which also bypasses line 824).

Does MediaWiki come with unit tests to verify this? I'm new to the source.

It appears that the patch touches an area of code that is rarely used and hence
has been long overlooked. The indexing problem it potentially causes with SRF
Calendar already exists without the patch. The patch only makes the function
behave consistently.



[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-31 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

--- Comment #10 from Kiran Jonnalagadda <j...@pobox.com> 2010-10-31 12:05:31 UTC ---
One possible solution is to define a new 'null' or 'none' action in
$wgActionPaths that gets looked up when no action is called for (this being
different from the default 'view' action).

getLocalURL can then use the installation's preferred URL syntax instead of
assuming the use of pretty or ugly URLs.
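
A minimal sketch of how such a lookup could behave, written in Python rather
than MediaWiki's PHP. The 'none' key, the ACTION_PATHS dictionary and
path_for_query() are hypothetical names used only for illustration; none of
them exist in MediaWiki.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode

# Hypothetical stand-in for $wgActionPaths. The 'none' entry is the proposal
# from this comment, not an existing MediaWiki setting.
ACTION_PATHS = {
    "edit": "/edit/$1",
    "none": "/wiki/$1",
}


def path_for_query(dbkey: str, query: str, action_paths: dict) -> Optional[str]:
    """Return a URL built from action_paths, or None if no entry applies."""
    params = parse_qsl(query, keep_blank_values=True)
    # Pick the action named in the query, or fall back to the 'none' entry.
    action = next((v for k, v in params if k == "action"), None)
    key = action if action is not None else "none"
    template = action_paths.get(key)
    if template is None:
        return None  # the caller would fall back to the long index.php form
    # Keep every parameter except the action, which the path now encodes.
    rest = [(k, v) for k, v in params if k != "action"]
    url = template.replace("$1", dbkey)
    return url + ("?" + urlencode(rest) if rest else "")
```

With this table, a query without an action resolves through the 'none' entry,
while an action with no entry (such as action=raw) returns None and is left to
the existing code path.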



[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-31 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

--- Comment #11 from Kiran Jonnalagadda <j...@pobox.com> 2010-10-31 12:49:33 UTC ---
> As far as I can tell, there is no use case within MediaWiki's default
> configuration that touches line 824 of Title.php -- there is never a URL that
> includes a query but not an action parameter (apart from the default 'view'
> action, which also bypasses line 824).

Whoops. Completely wrong here. History browsing uses query parameters without
an action.



[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-31 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

--- Comment #12 from Kiran Jonnalagadda <j...@pobox.com> 2010-10-31 13:06:54 UTC ---
> Whoops. Completely wrong here. History browsing uses query parameters without
> an action.

... and those URLs are generated without touching line 824 of Title.php, so the
presence of this patch makes no difference.

From the documentation for getLocalURL:
http://svn.wikimedia.org/doc/classTitle.html#a75d9fae7aabf6187318c5a298b01c5ef

 $query Mixed: an optional query string; if not specified,
 $wgArticlePath will be used.

So the documentation is specific: $wgArticlePath is only used when no query is
specified. This patch does the logically expected thing, but violates the
documented expectation.



[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-31 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

--- Comment #13 from Daniel Friesen <mediawiki-b...@nadir-seen-fire.com> 2010-10-31 14:38:31 UTC ---
I just took an actual look at the patch and a larger look at the code, and I see
two things staring at me...

Firstly, it feels to me for some reason that something is wrong with the
$variant stuff... I'm not quite up to speed on variant handling, so I see three
possibilities. Either it's a MW bug where variant handling that should be in the
second block isn't there. Or variant handling is supposed to be done only for
/short/Urls, and the patch adds a new location where /short/Urls end up showing
up but is missing the variant handling code. Or it's only meant to happen when
$query is empty and I'm just too far behind.

Though that's probably less of an issue compared to this.

Looking at line 821 I see what seems to be part of a feature that appears to
allow you to pass "-" to getLocalURL as a method of saying to getLocalURL "I
don't have any query, but I want you to give me a long index.php style url
because I have a case where getting short urls will cause chaos, so NO short
urls", although it does look like a tweak would be nice so there is no trailing
"&".

The way the patch is written seems to break that... if this patch is applied it
looks like $query = "-" will start outputting short urls when getLocalURL was
explicitly asked to output long urls, and additionally this will be done
without the variant handling that would otherwise have been done.

Also, now that I notice the comment, the title= showing up in a short url does
not look like a proper implementation of a feature. Even without the other
issues I'd reject the patch till that was fixed.


Re-reading the comments again, it looks to me like you have a lingering
misunderstanding of $wgArticlePath, $wgActionPaths and what those blocks of code
actually do.

Firstly of course, that note on how robots.txt can't be used in the default
config is moot. The robots.txt stuff is an advanced setup for a robots-optimized
shorturl style configuration, and for the reasons you describe it requires use
of short urls to function. Just because the default configuration doesn't
support it doesn't mean that it should be broken universally (this patch) when
it is only supported in an advanced configuration. I should also point out that
if MediaWiki detects that the server is able to support it, MediaWiki actually
sets up $wgArticlePath to work in a /index.php/Article form, so robots.txt IS
actually usable by default on the right server by disallowing /index.php?.

Now for the actual misunderstanding. You commented:

> As far as I can tell, there is no use case within MediaWiki's default
> configuration that touches line 824 of Title.php -- there is never a URL that
> includes a query but not an action parameter (apart from the default 'view'
> action, which also bypasses line 824).

As I understand from this -- even taking your later comment on history into
account -- you believe that calls to getLocalURL containing a query with an
action parameter are always caught by the chunk of code between lines 807-819.
That is incorrect. The block of code there is ONLY ever applied if the
non-default $wgActionPaths (NOT $wgArticlePath, but $wgActionPaths) is set,
which in fact very, very few MediaWiki installations actually do. So in
actuality the code in lines 810-817, which does things with getLocalURL calls
that have an action= in the query, is almost NEVER run, at least not unless the
wiki is one of the rare few wikis that have specially configured $wgActionPaths.
The line 824 you believe is being bypassed by default is in fact NEVER EVER
bypassed by default. In the default config line 824 handles every single call to
getLocalURL which contains anything at all inside the $query.
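
The branch order described above can be modelled roughly as follows. This is a
simplified Python sketch, not the PHP in Title.php: variant handling and
interwiki titles are left out, and get_local_url() with its parameters
(article_path, script, action_paths) are illustrative stand-ins for
$wgArticlePath, $wgScript and $wgActionPaths.

```python
import re


def get_local_url(dbkey, query="", article_path="/wiki/$1",
                  script="/w/index.php", action_paths=None):
    """Simplified model of the branch order discussed in this comment."""
    if query == "":
        # No query at all: short URL built from the article path.
        return article_path.replace("$1", dbkey)
    if action_paths:
        # Only reached on wikis that configure action paths (off by default).
        m = re.match(r"^(.*&|)action=([^&]*)(&(.*)|)$", query)
        if m and m.group(2) in action_paths:
            # Remaining parameters, with the action itself removed.
            rest = (m.group(1) + (m.group(4) or "")).rstrip("&")
            url = action_paths[m.group(2)].replace("$1", dbkey)
            return url + ("?" + rest if rest else "")
    # Default route for every non-empty query, including the "-" marker.
    if query == "-":
        query = ""
    return f"{script}?title={dbkey}&{query}"
```

In this model, with action_paths left unset, every non-empty query ends up in
the final branch, and passing "-" yields a long URL that ends in a bare "&".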

And to re-iterate, the expected behavior for getLocalURL and the like is that
urls with no query, and not marked with a faux "-" query to disallow shorturls,
will have a shorturl returned if possible, while any url with a query is ALWAYS
returned in long index.php form. Changing things so that things that currently
output long urls output short urls will break things. I forgot about it, but
there is another reason why that /w/ robots.txt trick is used. The nature of
most pages that use queries is that they are dynamic. In other words they
likely contain links within them which could cause search engines to endlessly
spider dynamically generated content that the wiki does not want them to spider
(ie: search queries). So changing these to short urls may cause dynamic things
to suddenly start being spidered by search engines when they were originally
intended not to be. Because of this, I believe that ANY feature we add that
allows getLocalURL to return short urls with queries instead of long urls MUST
be an explicit opt-in where getLocalURL is passed an extra optional argument
telling it that shorturls with queries are ok and that any code changed to use
this be done by someone understanding

[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-31 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

--- Comment #14 from Kiran Jonnalagadda <j...@pobox.com> 2010-10-31 14:56:45 UTC ---
Thank you for the most excellent comment, Daniel. getLocalURL is far too core
to MediaWiki to make changes without understanding the implications thoroughly,
which I clearly don't.

I propose this ticket be marked INVALID.



[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-31 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

Bryan Tong Minh <bryan.tongm...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX



[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

Reedy <s...@reedyboy.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|easy                        |need-review, patch
            Summary|API call Title->getLocalURL |Title->getLocalURL does not
                   |does not respect            |respect $wgArticlePath when
                   |$wgArticlePath when given a |given a query
                   |query                       |
           Severity|enhancement                 |minor



[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

Niklas Laxström <niklas.laxst...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |niklas.laxst...@gmail.com

--- Comment #2 from Niklas Laxström <niklas.laxst...@gmail.com> 2010-10-30 19:05:32 UTC ---
I think this is by design. If you want prettier urls you can use
$wgActionPaths.



[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

Bryan Tong Minh <bryan.tongm...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bryan.tongm...@gmail.com

--- Comment #3 from Bryan Tong Minh <bryan.tongm...@gmail.com> 2010-10-30 19:13:09 UTC ---
This breaks urls like action=raw, which can only be used in ugly mode.



[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

Daniel Friesen <mediawiki-b...@nadir-seen-fire.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mediawiki-b...@nadir-seen-fire.com

--- Comment #4 from Daniel Friesen <mediawiki-b...@nadir-seen-fire.com> 2010-10-30 19:38:44 UTC ---
Additionally I believe this is by design, so that action=edit urls on wikis that
are configured like Wikipedia can be made to always have /w/index.php style
urls so they can be blanket blocked by robots.txt.

/wiki/Article?action=edit can't adequately be covered in robots.txt, and while
noindex is on the page itself it's much better if search engines can be
prevented from even wasting bandwidth by loading urls they are just going to
get a noindex on.

So you'll either have to use $wgActionPaths or start a bug asking for some sort
of $wgShortActions array which can be loaded with an explicit list of actions
that should take the /wiki/Article?action= form instead of the default
index.php?title=&action= form.
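
As a rough illustration of that idea, here is a Python sketch of an explicit
allow-list. $wgShortActions is only a suggestion in this comment and does not
exist in MediaWiki; SHORT_ACTIONS, action_url() and the paths used are
assumptions made for the example.

```python
# Hypothetical allow-list modelled on the suggested $wgShortActions array.
SHORT_ACTIONS = frozenset({"history"})


def action_url(dbkey, action, short_actions=SHORT_ACTIONS,
               article_path="/wiki/$1", script="/w/index.php"):
    """Use the short form only for actions that were explicitly listed."""
    if action in short_actions:
        return article_path.replace("$1", dbkey) + "?action=" + action
    # Everything else keeps the long form that robots.txt can blanket block.
    return f"{script}?title={dbkey}&action={action}"
```

Because unlisted actions such as edit stay under /w/, a blanket robots.txt
rule for that prefix keeps covering them.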



[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

--- Comment #5 from Kiran Jonnalagadda <j...@pobox.com> 2010-10-30 23:03:04 UTC ---
getLocalURL() already has special handling for when it sees 'action' in the
query parameters. This bug report and patch are for queries that do not specify
an action. Therefore, $wgActionPaths does not apply here and action=raw isn't
affected.

Here is where 'action' is checked for:
http://svn.wikimedia.org/viewvc/mediawiki/tags/REL1_16_0/phase3/includes/Title.php?view=markup#l804

The patch assumes $wgArticlePath is set up for pretty URLs. This may or may not
be handled by the wfAppendQuery() function called elsewhere from within
getLocalURL(). I'm not sure where wfAppendQuery() is defined.
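
For orientation, a query-appending helper of this kind usually only has to
choose between "?" and "&". The Python sketch below shows that general
technique under the assumption that wfAppendQuery() works along these lines;
append_query() is a made-up name and this is not MediaWiki's implementation.

```python
def append_query(url: str, query: str) -> str:
    """Join url and query with '?' or '&', whichever the URL still needs."""
    if query == "":
        return url
    # A URL that already carries a query string needs '&', otherwise '?'.
    separator = "&" if "?" in url else "?"
    return url + separator + query
```

Under that assumption the same call would suit both URL styles: a pretty path
gains "?month=11", while an index.php?title=... URL gains "&month=11".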



[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

Platonides <platoni...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |platoni...@gmail.com

--- Comment #6 from Platonides <platoni...@gmail.com> 2010-10-30 23:15:16 UTC ---
If you change a url like
/w/index.php?title=Opinion_calendar&month=11&year=2010
to /wiki/Opinion_calendar?month=11&year=2010

then robots.txt blocking /w/ won't apply and a spider will follow the whole
calendar.
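
This can be checked with a small Python sketch that uses the standard
library's urllib.robotparser. The two-line robots.txt and the "ExampleBot"
user agent are assumptions for the example; real crawlers may apply extra
matching rules beyond this simple prefix check.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt for a Wikipedia-style setup: scripts live under /w/.
robots = RobotFileParser()
robots.parse([
    "User-agent: *",
    "Disallow: /w/",
])


def crawlable(path: str) -> bool:
    """True if the assumed robots.txt lets a crawler fetch this path."""
    return robots.can_fetch("ExampleBot", path)


long_url = "/w/index.php?title=Opinion_calendar&month=11&year=2010"
short_url = "/wiki/Opinion_calendar?month=11&year=2010"

print(crawlable(long_url))   # long form sits under the blocked /w/ prefix
print(crawlable(short_url))  # short form falls outside that prefix
```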

I think this is a won't fix.



[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

--- Comment #7 from Kiran Jonnalagadda <j...@pobox.com> 2010-10-30 23:30:18 UTC ---
I've confirmed that the patch does the right thing even when using ugly URLs.

Platonides has a very good point. Maybe rel=nofollow applies there?



[Bug 25718] Title->getLocalURL does not respect $wgArticlePath when given a query

2010-10-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25718

--- Comment #8 from Daniel Friesen <mediawiki-b...@nadir-seen-fire.com> 2010-10-31 02:43:32 UTC ---
Even if you rel=nofollow a url, it's possible that someone may use WikiText
to link to the same url, and that link may not contain rel=nofollow. It's also
possible for the external link to be from another website, so there is no way
to ensure that there are never any non-nofollow links sitting around pointing
to urls that you don't want search engines to spider.

rel=nofollow also does not stop spiders from following a url. Spiders have
lately taken more of a stance of reading rel=nofollow as "I don't endorse this,
don't count it towards a pagerank" rather than "NEVER use this link to go to
that url". Even if you rel=nofollow something they may still follow the link to
that page, so the only way to keep a spider from ever spidering a url is to use
robots.txt; it's the only reliable tool. (We know rel=noindex is used on the
pages themselves, but who wants search engines spidering every noindex'ed page
they can see? It's a waste of requests on a high traffic wiki.)
