Re: [Wikitech-l] Accidental 302 hijack by interwiki index: Google v Wikimedia bug
It depends on the interwiki settings. Some have 'local' enabled, some not. See the documentation [1]:

  iw_local: informs MediaWiki how it should treat interwiki links coming from external sources. If iw_local is 1, then it will treat these links as though they were generated from within the local wiki.

For example, the interwiki prefix fr: on en.wikipedia.org has iw_local=1 set. Therefore the link http://en.wikipedia.org/wiki/fr:Accueil gracefully redirects you to the French homepage (Accueil). However, the Wikimedia Foundation project site is flagged 0 on en.wikipedia.org; the link http://en.wikipedia.org/wiki/wikimedia:Home does not work, even though [[wikimedia:Home]] would work if it were on a local Wikipedia page.

http://en.wikipedia.org/wiki/commons:Apple
This example works fine, no need for any weird stuff.

http://en.wikipedia.org/wiki/mw:DB
This one doesn't, and only works when done from within a page (like [[mw:DB]]), because iw_local is 0.

I think the Special:Search entry is a bug / undocumented feature in that it ignores the 'local' setting.

--Krinkle

[1] http://www.mediawiki.org/wiki/Manual:Interwiki_table

On 19 Sep 2010, at 00:34, MZMcBride wrote:
> Aryeh Gregor wrote:
>> Right now third-party software can do stuff like <a href="http://en.wikipedia.org/wiki/$1"> and replace $1 by user input, and it will work basically like [[$1]] typed on Wikipedia, and that's good.
>
> http://en.wikipedia.org/wiki/mw:Not_really...
>
> I still don't understand why users (third-party or not) are forced to use links like http://en.wikipedia.org/wiki/Special:Search?search=mw:like_this in order to redirect properly.
>
> MZMcBride

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
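The iw_local behavior described above can be sketched as follows. This is an illustrative toy, not MediaWiki's actual title-resolution code; the interwiki rows are a hypothetical subset of en.wikipedia.org's interwiki table.

```python
# Sketch of the iw_local rule: an interwiki prefix in an externally
# supplied /wiki/<title> URL only redirects when iw_local is 1.
# Hypothetical subset of the interwiki table for en.wikipedia.org:
INTERWIKI = {
    "fr":      {"url": "http://fr.wikipedia.org/wiki/$1",       "iw_local": 1},
    "commons": {"url": "http://commons.wikimedia.org/wiki/$1",  "iw_local": 1},
    "mw":      {"url": "http://www.mediawiki.org/wiki/$1",      "iw_local": 0},
}

def resolve_external_title(title):
    """Resolve a title arriving via a plain /wiki/<title> URL."""
    prefix, sep, rest = title.partition(":")
    row = INTERWIKI.get(prefix.lower())
    if sep and row and row["iw_local"]:
        # Prefix is flagged local: behave as if [[fr:Accueil]] were
        # followed on a local page, i.e. redirect to the target wiki.
        return ("redirect", row["url"].replace("$1", rest))
    # iw_local=0 or no prefix: treated as an ordinary local title.
    return ("local-page", title)

# resolve_external_title("fr:Accueil") -> ("redirect", "http://fr.wikipedia.org/wiki/Accueil")
# resolve_external_title("mw:DB")      -> ("local-page", "mw:DB")
```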
[Wikitech-l] [Announce]: Mark Bergsma promotion to Operations Engineer Programs Manager
On 15 September 2010 16:41, Domas Mituzas wrote:
> Hi!
>
> Erik gave an overview of how EPMs work a few days ago:
> http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/49532
>
> What I learned is that the most important information should be put under the most obscure subject lines, so that only people who really, really care would read it.

You're right... I added something here: http://wikimediafoundation.org/wiki/Engineering_Program_Manager . It should be more visible (please, someone improve it).

Nemo
Re: [Wikitech-l] using parserTests code for selenium test framework
On Fri, 17 Sep 2010 19:13:33, Dan Nessett wrote:
> On Fri, 17 Sep 2010 18:40:53, Dan Nessett wrote:
>> I have been tasked to evaluate whether we can use the parserTests db code for the Selenium framework. I just looked it over and have serious reservations. I would appreciate any comments on the following analysis.
>>
>> The environment for Selenium tests is different from that for parserTests. It is envisioned that multiple concurrent tests could run using the same MW code base. Consequently, each test run must:
>>
>> + Use a db that, if written to, will not destroy other test wiki information.
>> + Switch in new images and math directories so any writes do not interfere with other tests.
>> + Maintain the integrity of the cache.
>>
>> Note that tests would *never* run on a production wiki (it may be possible to do so if they do no writes, but safety considerations suggest they should always run on test data, not production data). In fact, production wikis should always retain the setting $wgEnableSelenium = false, to ensure Selenium tests are disabled.
>>
>> Given this background, consider the following (and feel free to comment on it):
>>
>> parserTests temporary table code: a fixed set of tables is specified in the code. parserTests creates temporary tables with the same names, but using a different static prefix. These tables are used for the parserTests run. Problems using this approach for Selenium tests:
>>
>> + Selenium tests of extensions may require extension-specific tables, the names of which cannot be enumerated in the code.
>> + Concurrent runs of parserTests are not supported, since the temporary tables have fixed names, and therefore concurrent writes to them by parallel test runs would cause interference.
>> + Cleaning up after aborted runs requires dropping fossil tables. But if a previous run tested an extension with extension-specific tables, there is no way for a test of some other functionality to figure out which tables to drop.
>
> For these reasons, I don't think we can reuse the parserTests code. However, I am open to arguments to the contrary.

After reflection, here are some other problems:

+ Some tests assume the existence of data in the db. For example, the PagedTiffHandler tests assume the image Multipage.tiff is already loaded. However, this requires an entry in the image table. You could modify the test to clone the existing image table, but that means you have problems with the next point:
+ Some tests assume certain data is *not* in the db. PagedTiffHandler has tests that upload images. These cannot already be in the image table. So you can't simply clone the image table.

All of this suggests to me that a better strategy is:

+ When the test run begins, clone a db associated with the test suite.
+ Switch the wiki to use this db and return a cookie or some other state information that identifies this test run configuration.
+ When the test suite runs, each wiki access supplies this state so the wiki code can switch in the correct db.
+ Cleanup of a test run requires removing the cloned db.
+ To handle aborted runs, there needs to be a mechanism to time out cloned dbs and the state associated with the test run.

Regardless of how we implement the persistent storage for managing test runs, there needs to be a way to trigger its use. To minimize the changes to core, we need a hook that runs after processing LocalSettings (and by implication DefaultSettings), but before any wiki state is accessed (e.g., before accessing the db, the images directory, or any cached data). I looked at the existing hooks, but so far have not found one that appears suitable. So either we need to identify an appropriate existing hook, or we need to add a hook that meets the requirements.

--
Dan Nessett
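The per-test-run strategy in the list above (clone a db, hand back a cookie, time out abandoned clones) could be sketched roughly like this. This is a hypothetical illustration, not MediaWiki code; the run registry, the `clone_database`/`drop_database` callbacks, and the TTL are all invented for the example.

```python
import time
import uuid

# Hypothetical registry of live test runs: run id -> {db name, creation time}.
RUNS = {}
RUN_TTL = 3600  # seconds before an abandoned run's resources may be reclaimed

def begin_test_run(template_db, clone_database):
    """Clone the suite's template db and return a run id (the 'cookie').

    clone_database is a caller-supplied function that copies template_db
    into a fresh database and returns the new database's name.
    """
    run_id = uuid.uuid4().hex
    RUNS[run_id] = {"db": clone_database(template_db), "created": time.time()}
    return run_id

def db_for_request(run_id):
    """Each wiki access supplies its run id; switch in the matching db."""
    return RUNS[run_id]["db"]

def reap_stale_runs(drop_database, now=None):
    """Reclaim clones left behind by aborted runs, via a timeout."""
    now = time.time() if now is None else now
    for run_id, run in list(RUNS.items()):
        if now - run["created"] > RUN_TTL:
            drop_database(run["db"])
            del RUNS[run_id]
```

A run that completes normally would drop its clone itself; `reap_stale_runs` is the safety net for the abnormal-end case mentioned above.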
Re: [Wikitech-l] using parserTests code for selenium test framework
On Sun, 19 Sep 2010 02:47:00 +0200, Platonides wrote:
> Dan Nessett wrote:
>>> What about memcached? (that would be a key based on the original db name)
>> The storage has to be persistent to accommodate wiki crashes (e.g., an httpd crash, server OS crash, or power outage). It might be possible to use memcachedb, but as far as I am aware that requires installing Berkeley DB, which complicates deployment. Why not employ the already installed DB software used by the wiki? That provides persistent storage and requires no additional software.
> My original idea was to use whatever ObjectCache the wiki used, but it could be forced to use the db as backend (that's the objectcache table).

My familiarity with the ObjectCache is casual. I presume it holds data that is set on particular wiki access requests and that is then used on subsequent requests to make them more efficient. If so, then using a common ObjectCache for all concurrent test runs would cause interference between them. To ensure such interference doesn't exist, we would need to switch in a per-test-run ObjectCache (which takes us back to the idea of using a per-test-run db, since the ObjectCache is implemented using the objectcache table).

--
Dan Nessett
Re: [Wikitech-l] using parserTests code for selenium test framework
Dan Nessett wrote:
> Platonides wrote:
>> Dan Nessett wrote:
>>>> What about memcached? (that would be a key based on the original db name)
>>> The storage has to be persistent to accommodate wiki crashes (e.g., an httpd crash, server OS crash, or power outage). It might be possible to use memcachedb, but as far as I am aware that requires installing Berkeley DB, which complicates deployment. Why not employ the already installed DB software used by the wiki? That provides persistent storage and requires no additional software.
>> My original idea was to use whatever ObjectCache the wiki used, but it could be forced to use the db as backend (that's the objectcache table).
> My familiarity with the ObjectCache is casual. I presume it holds data that is set on particular wiki access requests and that is then used on subsequent requests to make them more efficient. If so, then using a common ObjectCache for all concurrent test runs would cause interference between them. To ensure such interference doesn't exist, we would need to switch in a per-test-run ObjectCache (which takes us back to the idea of using a per-test-run db, since the ObjectCache is implemented using the objectcache table).

You load originaldb.objectcache, retrieve the specific configuration, and switch into it. For supporting many simultaneous configurations, the key name could have the instance (whatever that cookie is set to) appended, although those dynamic configurations make me a bit nervous.
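The keying scheme Platonides describes (configuration rows in the original db's objectcache table, with the instance id appended to the key name) could look roughly like this. A hypothetical sketch: the key layout and function names are invented, and a plain dict stands in for the objectcache table.

```python
# Stand-in for originaldb.objectcache: key -> value.
OBJECTCACHE = {}

def config_key(base_db, instance=None):
    """Key for a test-run configuration. The instance id (whatever the
    'cookie' is set to) is appended so that many simultaneous
    configurations can coexist under the same original db."""
    key = "testrun:config:%s" % base_db
    if instance is not None:
        key += ":%s" % instance
    return key

def store_config(base_db, instance, config):
    OBJECTCACHE[config_key(base_db, instance)] = config

def load_config(base_db, instance):
    """Load originaldb.objectcache, retrieve the specific configuration
    for this instance, and hand it back so the caller can switch into it."""
    return OBJECTCACHE[config_key(base_db, instance)]
```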
Re: [Wikitech-l] using parserTests code for selenium test framework
On Sun, 19 Sep 2010 23:42:08 +0200, Platonides wrote:
> Dan Nessett wrote:
>> Platonides wrote:
>>> Dan Nessett wrote:
>>>>> What about memcached? (that would be a key based on the original db name)
>>>> The storage has to be persistent to accommodate wiki crashes (e.g., an httpd crash, server OS crash, or power outage). It might be possible to use memcachedb, but as far as I am aware that requires installing Berkeley DB, which complicates deployment. Why not employ the already installed DB software used by the wiki? That provides persistent storage and requires no additional software.
>>> My original idea was to use whatever ObjectCache the wiki used, but it could be forced to use the db as backend (that's the objectcache table).
>> My familiarity with the ObjectCache is casual. I presume it holds data that is set on particular wiki access requests and that is then used on subsequent requests to make them more efficient. If so, then using a common ObjectCache for all concurrent test runs would cause interference between them. To ensure such interference doesn't exist, we would need to switch in a per-test-run ObjectCache (which takes us back to the idea of using a per-test-run db, since the ObjectCache is implemented using the objectcache table).
> You load originaldb.objectcache, retrieve the specific configuration, and switch into it. For supporting many simultaneous configurations, the key name could have the instance (whatever that cookie is set to) appended, although those dynamic configurations make me a bit nervous.

Well, this may work, but consider the following. A nightly build environment (and even a local developer test environment) tests the latest revision using a suite of regression tests. These tests exercise the same wiki code, each parametrized by:

+ Browser type (e.g., Firefox, IE, Safari, Opera)
+ Database (e.g., MySQL, Postgres, SQLite)
+ OS platform (e.g., Linux, a BSD Unix variant, a Windows variant)

A particular test environment may not support all permutations of these parameters (in particular, a local developer environment may support only one OS), but the code mechanism supporting the regression tests should. To ensure timely completion of these tests, they will almost certainly run concurrently.

So, when a regression test runs, it must not only retrieve the configuration data associated with it, it must also create a test run environment (e.g., a test db, a test images directory, test cache data). The creation of this test run environment requires an identifier somewhere so its resources may be reclaimed when the test run completes, or after an abnormal end of the test run. Thus, the originaldb must not only hold configuration data with db keys identifying the particular test and its parameters, but also an identifier for the test run that can be used to reclaim resources if the test ends abnormally.

The question is whether using a full wiki db for this purpose is advantageous, or whether stripping out all of the tables except the objectcache table is the better implementation strategy.

--
Dan Nessett
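The parameter permutations listed above can be enumerated mechanically; a restricted environment just passes a shorter list for the dimension it lacks. A hypothetical sketch (the parameter values are the examples from the mail, not an agreed-upon matrix):

```python
import itertools

# Hypothetical regression-test matrix built from the parameters above.
BROWSERS  = ["firefox", "ie", "safari", "opera"]
DATABASES = ["mysql", "postgres", "sqlite"]
PLATFORMS = ["linux", "bsd", "windows"]

def test_matrix(browsers=BROWSERS, databases=DATABASES, platforms=PLATFORMS):
    """Enumerate every (browser, db, os) permutation a build host supports.

    A local developer environment that supports only one OS would call
    test_matrix(platforms=["linux"]); the nightly build environment
    would use the full lists.
    """
    return list(itertools.product(browsers, databases, platforms))

def run_key(browser, db, platform, run_id):
    """Identifier for one concurrent run, usable both as the configuration
    db key and as the handle for reclaiming the run's resources."""
    return "%s-%s-%s-%s" % (browser, db, platform, run_id)
```

With the full example lists this yields 4 x 3 x 3 = 36 parameter combinations, which is why concurrent execution (and therefore per-run isolation) matters.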
[Wikitech-l] Upload action patrol
Hi all,

I've sent this to wikitech-l before, but I see now online that it didn't create a new thread and was instead recognized as a reply to an old thread. Not sure what happened, but here it is again:

It's been roughly three years since I first saw this topic filed on Bugzilla [1], and before that it was often raised on IRC and on-wiki in discussions about how clumsy and impractical it is to systematically patrol uploads. Back then, from my point of view, this was about local uploads. Nowadays I'm much more active on and for Wikimedia Commons, and not so much on local uploads. Obviously, with more and more wikis moving towards Commons and the growth of the wikis themselves, it's about time we get at least some kind of method of indicating that a file has been 'checked'. Or, to be more specific, of knowing what hasn't been checked.

On Commons there are several review systems for common external resources that material is imported from (such as Picasa and Flickr), and those work very well. Bots crawl recent uploads, and whenever a reference to Flickr is found the files are tagged as needing review; the easy ones are even reviewed by bots (something unique to Picasa and Flickr, since they are machine readable and license info can be automatically verified), and everything else (false matches and errors) is manually reviewed. However, this is just a tiny fraction of all the files on Commons.

Last March I raised the topic of edit patrol on Commons [2], and that has been a great success. We've got a team together, and every single anonymous edit made after April 1st, 2010 has been or will soon be patrolled [3]. Not once has it gone past the 30-day expiration time of the recentchanges table. The same has been kept up for new page patrol as well, for several years.

Commons being primarily a media site, it's a bit awkward to say that we are totally unable to patrol uploads effectively. We can't filter out uploads by bots or trusted users. We can't filter out what's been patrolled by patrollers. It's just an incredible mess that sits there. Several attempts have been made in the past to work around the software, but no matter how you try, a patrol flag would make things a whole lot easier. Once there is the possibility to click a link and *poof* toggle that unpatrolled boolean, I'm sure it won't take long before nice AJAX tools appear to make this easier en masse, and a checklist / team will be formed to get the job done.

Alright, enough ranting. What needs to be done for an implementation? When I asked about this on IRC, somebody said that, although a bit of a workaround, we can do this already by means of new page patrol in the File namespace. Unless it's well hidden, this is false, because uploads don't create a patrollable entry for the upload log action, nor for the description page creation. As a matter of fact, the creation of those description pages isn't registered in the recentchanges table at all (Special:NewPages / Special:RecentChanges).

Depending on how uploads become patrollable, the above could actually be a good thing, since having to patrol both would be inefficient, and uploading a file isn't necessarily associated with creating a page by users anyway. Plus, it would mean duplicate entries in Special:RecentChanges (upload action / page creation).

Log actions are already present in the recentchanges table, so I'm guessing it doesn't take that much of a change to make uploads patrollable. One interesting thing about uploading (the same is true of moving and (un)protecting a page) is that it is also listed in the page history (instead of just in the logs), which means it is already very accessible to users and doesn't require a new system for where the [mark as patrolled] links should appear: for re-uploads on the diff page (like with edits), and for new uploads on the first revision. (Although the latter may be subject to this bug: https://bugzilla.wikimedia.org/show_bug.cgi?id=15936 , which I hope will be solved. It's not a show stopper though; as long as there is any way at all to get there, even if it requires going through Special:RecentChanges, that would be an incredible improvement over the current situation.)

Greetings,
Krinkle

[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=9501
[2] http://commons.wikimedia.org/wiki/Commons:Village_pump/Archive/2010Mar#Marking_edits_as_patrolled
[3] http://commons.wikimedia.org/wiki/Commons:Counter_Vandalism_Unit#Anonymous_edits
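The workflow asked for above (filter out bot and already-patrolled uploads, then flip the unpatrolled boolean with one click) can be sketched against the recentchanges fields the mail mentions. A hypothetical illustration, not actual MediaWiki code: plain dicts stand in for recentchanges rows, and the column subset shown is only what the example needs.

```python
# Hypothetical recentchanges rows for two upload log actions.
recentchanges = [
    {"rc_id": 1, "rc_log_type": "upload", "rc_user": "NewUser",
     "rc_bot": 0, "rc_patrolled": 0},
    {"rc_id": 2, "rc_log_type": "upload", "rc_user": "TrustedBot",
     "rc_bot": 1, "rc_patrolled": 1},  # bot upload, auto-patrolled
]

def unpatrolled_uploads(rows):
    """The filter patrollers are missing today: upload actions that are
    neither by bots nor already checked."""
    return [r for r in rows
            if r["rc_log_type"] == "upload"
            and not r["rc_bot"] and not r["rc_patrolled"]]

def mark_patrolled(rows, rc_id):
    """The one-click *poof*: toggle the unpatrolled boolean on one row."""
    for r in rows:
        if r["rc_id"] == rc_id:
            r["rc_patrolled"] = 1
```

Once a flag like this exists per upload action, the AJAX tooling and patrol teams described above only need the filter and the toggle.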
[Wikitech-l] Bugzilla Weekly Report
MediaWiki Bugzilla Report for September 13, 2010 - September 20, 2010

Status changes this week:
  Bugs NEW      : 83
  Bugs ASSIGNED : 9
  Bugs REOPENED : 13
  Bugs RESOLVED : 58

Total bugs still open: 4901

Resolutions for the week:
  Bugs marked FIXED      : 36
  Bugs marked REMIND     : 0
  Bugs marked INVALID    : 8
  Bugs marked DUPLICATE  : 8
  Bugs marked WONTFIX    : 3
  Bugs marked WORKSFORME : 2
  Bugs marked LATER      : 1
  Bugs marked MOVED      : 0

Specific Product/Component Resolutions & User Metrics

New Bugs Per Component:
  Site requests         5
  DonationInterface     3
  General/Unknown       3
  SemanticForms         3
  UsabilityInitiative   3

New Bugs Per Product:
  MediaWiki             11
  Wikimedia             9
  MediaWiki extensions  19

Top 5 Bug Resolvers:
  roan.kattouw [AT] gmail.com     7
  niklas.laxstrom [AT] gmail.com  7
  jeluf [AT] gmx.de               7
  tparscal [AT] wikimedia.org     6
  innocentkiller [AT] gmail.com   4