Re: [Wikitech-l] Accidental 302 hijack by interwiki index: Google v Wikimedia bug

2010-09-19 Thread Krinkle
It depends on the interwiki settings. Some have 'local' [1] enabled,  
some not.

See the documentation[1]:
 iw_local: informs MediaWiki how it should treat interwiki links  
 coming from external sources. If iw_local is 1, then it will treat  
 these links as though they were generated from within the local wiki.
 For example, the interwiki link fr: on the en.wikipedia.org project  
 has iw_local=1 set. Therefore, the link to 
 http://en.wikipedia.org/wiki/fr:Accueil 
  gracefully redirects you to the French Homepage (Accueil). However,  
 the Wikimedia foundation project site is flagged 0 on  
 en.wikipedia.org; the link to http://en.wikipedia.org/wiki/wikimedia:Home 
  does not work, even though [[wikimedia:Home]] would work if it were  
 on a local Wikipedia page.

http://en.wikipedia.org/wiki/commons:Apple
This example works fine, no need for any weird stuff.

http://en.wikipedia.org/wiki/mw:DB
This one doesn't work from the URL and only works when linked from within 
a page (like [[mw:DB]]) because iw_local is 0.

I think the Special:Search behaviour is a bug / undocumented feature, in 
that it ignores the 'local' setting.
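
For illustration, a minimal sketch of how code could inspect that flag 
through the Interwiki class (assuming the MediaWiki 1.16-era API; 
Interwiki::fetch() reads the interwiki table documented at [1]):

  // Sketch: check whether a prefix is flagged as 'local' (iw_local = 1).
  $iw = Interwiki::fetch( 'fr' );
  if ( $iw && $iw->isLocal() ) {
      // Prefixed titles arriving via the URL (e.g. /wiki/fr:Accueil)
      // are honoured and the request redirects to the target wiki.
  } else {
      // iw_local = 0: the prefix only works in wikitext links like [[mw:DB]].
  }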

--Krinkle

[1] http://www.mediawiki.org/wiki/Manual:Interwiki_table


Op 19 sep 2010, om 00:34 heeft MZMcBride het volgende geschreven:

 Aryeh Gregor wrote:
 Right now third-party software can do stuff like
 <a href="http://en.wikipedia.org/wiki/$1"> and replace $1 by user input,
 and it will work basically like [[$1]] typed on Wikipedia, and that's
 good.

 http://en.wikipedia.org/wiki/mw:Not_really...

 I still don't understand why users (third-party or not) are forced to use
 links http://en.wikipedia.org/wiki/Special:Search?search=mw:like_this in
 order to redirect properly.

 MZMcBride



 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] [Announce]: Mark Bergsma promotion to Operations Engineer Programs Manager

2010-09-19 Thread Federico Leva (Nemo)
On 15 September 2010 16:41, Domas Mituzas wrote:
  Hi!
 
  Erik gave an overview of how EPMs work a few days ago:
  
http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/49532
 
  What I learned is that the most important information should be put under 
  the most obscure subject lines, so that only people who really really 
  care would read that.

You're right... I added something here: 
http://wikimediafoundation.org/wiki/Engineering_Program_Manager ; it 
should be made more visible (please, someone, improve it).

Nemo

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] using parserTests code for selenium test framework

2010-09-19 Thread Dan Nessett
On Fri, 17 Sep 2010 19:13:33 +, Dan Nessett wrote:

 On Fri, 17 Sep 2010 18:40:53 +, Dan Nessett wrote:
 
 I have been tasked to evaluate whether we can use the parserTests db
 code for the selenium framework. I just looked it over and have serious
 reservations. I would appreciate any comments on the following
 analysis.
 
 The environment for selenium tests is different from that for
 parserTests. It is envisioned that multiple concurrent tests could run
 using the same MW code base. Consequently, each test run must:
 
 + Use a db that if written to will not destroy other test wiki
 information.
 + Switch in a new images and math directory so any writes do not
 interfere with other tests.
 + Maintain the integrity of the cache.
 
 Note that tests would *never* run on a production wiki (it may be
 possible to do so if they do no writes, but safety considerations
 suggest they should always run on test data, not production data). In
 fact, production wikis should always retain the setting
 $wgEnableSelenium = false, to ensure selenium tests are disabled.
 
 Given this background, consider the following (and feel free to comment
 on it):
 
 parserTests temporary table code:
 
 A fixed set of tables are specified in the code. parserTests creates
 temporary tables with the same name, but using a different static
 prefix. These tables are used for the parserTests run.
 
 Problems using this approach for selenium tests:
 
  + Selenium tests on extensions may require the use of extension-specific
 tables, the names of which cannot be enumerated in advance in the code.
 
 + Concurrent test runs of parserTests are not supported, since the
 temporary tables have fixed names and therefore concurrent writes to
 them by parallel test runs would cause interference.
 
 + Clean up from aborted runs requires dropping fossil tables. But, if a
 previous run tested an extension with extension-specific tables, there
 is no way for a test of some other functionality to figure out which
 tables to drop.
 
 For these reasons, I don't think we can reuse the parserTests code.
 However, I am open to arguments to the contrary.
 
 After reflection, here are some other problems.
 
 + Some tests assume the existence of data in the db. For example, the
 PagedTiffHandler tests assume the image Multipage.tiff is already
 loaded. However, this requires an entry in the image table. You could
 modify the test to clone the existing image table, but that means you
 have problems with:
 
 + Some tests assume certain data is *not* in the db. PagedTiffHandler
 has tests that upload images. These cannot already be in the images
 table. So, you can't simply clone the images table.
 
 All of this suggests to me that a better strategy is:
 
 + When the test run begins, clone a db associated with the test suite.
 
 + Switch the wiki to use this db and return a cookie or some other state
 information that identifies this test run configuration.
 
 + When the test suite runs, each wiki access supplies this state so the
 wiki code can switch in the correct db.
 
 + Cleanup of test runs requires removing the cloned db.
 
 + To handle aborted runs, there needs to be a mechanism to time out
 cloned dbs and the state associated with the test run.

Regardless of how we implement the persistent storage for managing test 
runs, there needs to be a way to trigger its use. To minimize the changes 
to core, we need a hook that runs after processing LocalSettings (and by 
implication DefaultSettings), but before any wiki state is accessed 
(e.g., before accessing the db, the images directory, any cached data). I 
looked at the existing hooks, but so far have not found one that appears 
suitable.

So, either we need to identify an appropriate existing hook, or we need 
to add a hook that meets the requirements.
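
To make the requirement concrete, here is a rough sketch of what such a 
hook could look like from the test side. The hook name 
'SeleniumSetupAfterConfig' is hypothetical; the point is only that it 
would fire after LocalSettings.php is processed but before any database, 
image directory, or cache access:

  // Hypothetical hook, registered from LocalSettings.php on the test wiki.
  // It switches the request to a per-test-run database and images directory
  // based on an identifier supplied by the test runner (here via a cookie).
  $wgHooks['SeleniumSetupAfterConfig'][] = 'wfSeleniumSwitchTestRun';

  function wfSeleniumSwitchTestRun() {
      global $wgEnableSelenium, $wgDBname, $wgUploadDirectory;
      if ( !$wgEnableSelenium || !isset( $_COOKIE['selenium_run_id'] ) ) {
          return true; // normal request: leave the configuration alone
      }
      $runId = preg_replace( '/[^A-Za-z0-9_]/', '', $_COOKIE['selenium_run_id'] );
      $wgDBname = "selenium_$runId";                      // cloned test db
      $wgUploadDirectory = "/tmp/selenium/$runId/images"; // per-run images
      return true;
  }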

-- 
-- Dan Nessett


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] using parserTests code for selenium test framework

2010-09-19 Thread Dan Nessett
On Sun, 19 Sep 2010 02:47:00 +0200, Platonides wrote:

 Dan Nessett wrote:
 What about memcached?
 (that would be a key based on the original db name)
 
 The storage has to be persistent to accommodate wiki crashes (e.g.,
 httpd crash, server OS crash, power outage). It might be possible to
 use memcachedb, but as far as I am aware that requires installing
 Berkeley DB, which complicates deployment.
 
 Why not employ the already installed DB software used by the wiki? That
 provides persistent storage and requires no additional software.
 
 My original idea was to use whatever ObjectCache the wiki used, but it
 could be forced to use the db as backend (that's the objectcache table).

My familiarity with the ObjectCache is casual. I presume it holds data 
that is set on particular wiki access requests and that data is then used 
on subsequent requests to make them more efficient. If so, then using a 
common ObjectCache for all concurrent test runs would cause interference 
between them. To ensure such interference doesn't exist, we would need to 
switch in a per-test-run ObjectCache (which takes us back to the idea of 
using a per-test-run db, since the ObjectCache is implemented using the 
objectcache table).

-- 
-- Dan Nessett


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] using parserTests code for selenium test framework

2010-09-19 Thread Platonides
Dan Nessett wrote:
 Platonides wrote:
 Dan Nessett wrote:
 What about memcached?
 (that would be a key based on the original db name)

 The storage has to be persistent to accommodate wiki crashes (e.g.,
 httpd crash, server OS crash, power outage). It might be possible to
 use memcachedb, but as far as I am aware that requires installing
 Berkeley DB, which complicates deployment.

 Why not employ the already installed DB software used by the wiki? That
 provides persistent storage and requires no additional software.

 My original idea was to use whatever ObjectCache the wiki used, but it
 could be forced to use the db as backend (that's the objectcache table).
 
 My familiarity with the ObjectCache is casual. I presume it holds data 
 that is set on particular wiki access requests and that data is then used 
 on subsequent requests to make them more efficient. If so, then using a 
 common ObjectCache for all concurrent test runs would cause interference 
 between them. To ensure such interference doesn't exist, we would need to 
 switch in a per-test-run ObjectCache (which takes us back to the idea of 
 using a per-test-run db, since the ObjectCache is implemented using the 
 objectcache table).

You load originaldb.objectcache, retrieve the specific configuration,
and switch into it.
For supporting many simultaneous configurations, the keyname could have
the instance (whatever that cookie is set to) appended, although those
dynamic configurations make me a bit nervous.
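
Something along these lines, I suppose. A sketch only; it assumes 
wfGetCache( CACHE_DB ), which is backed by the objectcache table of the 
current database, and a hypothetical 'selenium-config' key:

  // When a test run is set up: record its configuration in the original
  // database's objectcache table, keyed by the run identifier.
  $cache = wfGetCache( CACHE_DB );  // SqlBagOStuff on the objectcache table
  $cache->set( wfMemcKey( 'selenium-config', $runId ), array(
      'dbname'    => "selenium_$runId",
      'imagesdir' => "/tmp/selenium/$runId/images",
  ), 86400 ); // let stale entries expire after a day

  // On each request that carries the run's cookie: load and switch.
  $config = $cache->get( wfMemcKey( 'selenium-config', $runId ) );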


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] using parserTests code for selenium test framework

2010-09-19 Thread Dan Nessett
On Sun, 19 Sep 2010 23:42:08 +0200, Platonides wrote:

 Dan Nessett wrote:
 Platonides wrote:
 Dan Nessett wrote:
 What about memcached?
 (that would be a key based on the original db name)

 The storage has to be persistent to accommodate wiki crashes (e.g.,
 httpd crash, server OS crash, power outage). It might be possible to
 use memcachedb, but as far as I am aware that requires installing
 Berkeley DB, which complicates deployment.

 Why not employ the already installed DB software used by the wiki?
 That provides persistent storage and requires no additional software.

 My original idea was to use whatever ObjectCache the wiki used, but it
 could be forced to use the db as backend (that's the objectcache
 table).
 
 My familiarity with the ObjectCache is casual. I presume it holds data
 that is set on particular wiki access requests and that data is then
 used on subsequent requests to make them more efficient. If so, then
 using a common ObjectCache for all concurrent test runs would cause
 interference between them. To ensure such interference doesn't exist,
 we would need to switch in a per-test-run ObjectCache (which takes us
 back to the idea of using a per-test-run db, since the ObjectCache is
 implemented using the objectcache table).
 
 You load originaldb.objectcache, retrieve the specific configuration,
 and switch into it.
 For supporting many simultaneous configurations, the keyname could have
 the instance (whatever that cookie is set to) appended, although those
 dynamic configurations make me a bit nervous.

Well, this may work, but consider the following.

A nightly build environment (and even a local developer test environment) 
tests the latest revision using a suite of regression tests. These tests 
exercise the same wiki code, each parametrized by:

+ Browser type (e.g., Firefox, IE, Safari, Opera)
+ Database (e.g., MySQL, Postgres, SQLite)
+ OS platform (e.g., Linux, BSD unix variant, Windows variant)

A particular test environment may not support all permutations of these 
parameters (in particular a local developer environment may support only 
one OS), but the code mechanism for supporting the regression tests 
should. To ensure timely completion of these tests, they will almost 
certainly run concurrently.

So, when a regression test runs, it must not only retrieve the 
configuration data associated with it, but also create a test run 
environment (e.g., a test db, a test images directory, test cache data). 
The creation of this test run environment requires an identifier 
somewhere so its resources may be reclaimed when the test run completes 
or after an abnormal end of the test run.

Thus, the originaldb must not only hold configuration data with db keys 
identifying the particular test and its parameters, but also an 
identifier for the test run that can be used to reclaim resources if the 
test ends abnormally. The question is whether using a full wiki db for 
this purpose is advantageous or whether stripping out all of the other 
tables except the objectcache table is the best implementation strategy.
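
Whichever form the storage takes, cleanup would then amount to walking a 
run registry and reclaiming anything that has outlived its welcome. A 
sketch, in which the 'selenium-runs' registry key, the database naming 
scheme, and the two-hour timeout are all assumptions:

  // Reclaim resources from runs that were aborted or never cleaned up,
  // assuming each run registered a start timestamp under a shared key
  // (and that $runId was sanitized when it was registered).
  $cache = wfGetCache( CACHE_DB );
  $runs = $cache->get( wfMemcKey( 'selenium-runs' ) );
  if ( !is_array( $runs ) ) {
      $runs = array();
  }
  $dbw = wfGetDB( DB_MASTER );
  foreach ( $runs as $runId => $startedAt ) {
      if ( time() - $startedAt > 7200 ) { // run timed out after two hours
          $dbw->query( "DROP DATABASE IF EXISTS `selenium_$runId`" ); // MySQL-specific
          $cache->delete( wfMemcKey( 'selenium-config', $runId ) );
          unset( $runs[$runId] );
      }
  }
  $cache->set( wfMemcKey( 'selenium-runs' ), $runs );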

-- 
-- Dan Nessett


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] Upload action patrol

2010-09-19 Thread Krinkle
Hi all,

I've sent this to wikitech-l before, but I see now in the online archive 
that it didn't create a new thread; instead it was recognized as a reply 
to an old thread. Not sure what happened, so here it is again:



It's been roughly three years since I first saw this topic filed on 
Bugzilla[1], and before that it was often raised on IRC and on-wiki in 
discussions about how clumsy and impractical it is to systematically 
patrol uploads. Back then, from my point of view, this was about local 
uploads.

Nowadays I'm much more active on and for Wikimedia Commons, and not so 
much on local uploads.
With more and more wikis moving towards Commons, and the growth of the 
wikis themselves, it's about time we had at least some method of 
indicating that a file has been 'checked'. Or, to be more specific, of 
knowing what hasn't been checked.

On Commons there are several review systems for external resources that 
material is commonly imported from (such as Picasa and Flickr), and those 
work very well. Bots crawl recent uploads, and whenever a reference to 
Flickr is found the file is tagged as needing review. The easy ones are 
even reviewed by bots (something unique to Picasa and Flickr, since they 
are machine readable and license info can be verified automatically), and 
everything else (false matches and errors) is reviewed manually.

However, this covers only a tiny fraction of all the files on Commons.
Last March I raised the topic of edit patrol on Commons [2], and that has 
been a great success. We've got a team together, and every single 
anonymous edit made after April 1st, 2010 has been or will soon be 
patrolled [3]. Not once has the backlog gone past the 30-day expiration 
time of the recentchanges table.
The same has been kept up for new-page patrol as well, for several years.

Commons being primarily a media site, it's a bit awkward to have to say 
that we are totally unable to patrol uploads effectively.
We can't filter out uploads by bots or trusted users. We can't filter out 
what's already been patrolled by patrollers. It's just an incredible mess 
that sits there.

Several attempts have been made in the past to work around the software, 
but no matter how you try, a patrol flag would make things a whole lot 
easier.
Once there is the possibility to click a link and *poof* toggle that 
unpatrolled boolean, I'm sure it won't take long before nice AJAX tools 
appear to make this easier en masse, and a checklist / team will be 
formed to get the job done.

Alrighty, enough rant. What needs to be done for an implementation?

When I asked about this on IRC, somebody said that, although a bit of a 
workaround, we can do this already by means of new-page patrol in the 
File namespace.
Unless it's well hidden, this is false, because uploads don't create a 
patrollable entry for the upload log action, nor for the description page 
creation. As a matter of fact, the creation of those description pages 
isn't registered in the recentchanges table at all (Special:NewPages / 
Special:RecentChanges).

Depending on how uploads become patrollable, the above could actually be 
a good thing, since having to patrol both would be inefficient, and users 
don't necessarily associate uploading a file with creating a page anyway. 
Plus, it would mean duplicate entries in Special:RecentChanges (upload 
action / page creation).

Log actions are already present in the recentchanges table, so I'm 
guessing it doesn't take that much of a change to make uploads 
patrollable.
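
To illustrate (a sketch of the data model only, not a proposed 
implementation): the recentchanges row written for an upload log action 
already has an rc_patrolled field, so a [mark as patrolled] action would 
ultimately come down to something like:

  // Flip the patrolled flag on the recentchanges row for the upload log
  // action. $rcId is the rc_id of that row; a real implementation would go
  // through the proper patrol machinery (permissions, patrol log), not a
  // raw update.
  $dbw = wfGetDB( DB_MASTER );
  $dbw->update(
      'recentchanges',
      array( 'rc_patrolled' => 1 ),
      array( 'rc_id' => $rcId, 'rc_log_type' => 'upload' ),
      __METHOD__
  );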

One interesting thing about uploading (the same is true of moving and 
(un)protecting a page) is that it is also listed in the page history 
(instead of just in the logs), which means it is already very accessible 
to users and doesn't require a new system for deciding where the [mark 
as patrolled] links should appear.

For re-uploads the link would appear on the diff page (like with edits), 
and for new uploads on the first revision (although the latter may be 
subject to this bug: https://bugzilla.wikimedia.org/show_bug.cgi?id=15936, 
which I hope will be solved; it's not a show-stopper though, since as long 
as there is any way at all to get there, even if it requires going to 
Special:RecentChanges, that would be an incredible improvement over the 
current situation).

Greetings,
Krinkle


[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=9501
[2] 
http://commons.wikimedia.org/wiki/Commons:Village_pump/Archive/2010Mar#Marking_edits_as_patrolled
[3] 
http://commons.wikimedia.org/wiki/Commons:Counter_Vandalism_Unit#Anonymous_edits

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] Bugzilla Weekly Report

2010-09-19 Thread reporter
MediaWiki Bugzilla Report for September 13, 2010 - September 20, 2010

Status changes this week

Bugs NEW   :  83  
Bugs ASSIGNED  :  9   
Bugs REOPENED  :  13  
Bugs RESOLVED  :  58  

Total bugs still open: 4901

Resolutions for the week:

Bugs marked FIXED  :  36  
Bugs marked REMIND :  0   
Bugs marked INVALID:  8   
Bugs marked DUPLICATE  :  8   
Bugs marked WONTFIX:  3   
Bugs marked WORKSFORME :  2   
Bugs marked LATER  :  1   
Bugs marked MOVED  :  0   

Specific Product/Component Resolutions & User Metrics 

New Bugs Per Component

Site requests   5   
DonationInterface   3   
General/Unknown 3   
SemanticForms   3   
UsabilityInitiative 3   

New Bugs Per Product

MediaWiki   11  
Wikimedia   9   
MediaWiki extensions19  

Top 5 Bug Resolvers

roan.kattouw [AT] gmail.com 7   
niklas.laxstrom [AT] gmail.com  7   
jeluf [AT] gmx.de   7   
tparscal [AT] wikimedia.org 6   
innocentkiller [AT] gmail.com   4   


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l