Re: [Wikitech-l] Extension:OpenID 3.00 - Security Release
Was this the last blocker to getting the extension deployed?
Re: [Wikitech-l] Extension:OpenID 3.00 - Security Release
On 08.03.2013 10:07, Yuvi Panda wrote:
> Was this the last blocker to getting the extension deployed?

One, two or three further non-security-related patches will follow in the next few days; they improve the user GUI, especially the preferences tab for OpenID. Stay tuned...

Regards,
Tom
Re: [Wikitech-l] Extension:OpenID 3.00 - Security Release
On 03/08/2013 01:34 AM, Petr Bena wrote:
> this shouldn't be very dangerous

Even if it isn't in practice in the typical cases, it exposes a third party to a risk they are unable to assess if they use that OpenID. (And it doesn't require a 'crat going rogue even here -- renames are sometimes done without salting the former username, so an unrelated third party could create an account to reuse the username and then probe plausible consumers of the ID.)

-- Marc
Re: [Wikitech-l] Bug 1542 - Log spam blacklist hits
Hey guys,

Thanks for explaining it to me. Can I have your IRC handles? I still think I have many doubts. Is there a simpler bug related to the extension, so I can get an idea of how it works?

On Fri, Mar 8, 2013 at 5:23 AM, Chris Steipp cste...@wikimedia.org wrote:
> On Thu, Mar 7, 2013 at 1:34 PM, Platonides platoni...@gmail.com wrote:
>> On 07/03/13 21:03, anubhav agarwal wrote:
>>> Hey Chris, I was exploring the SpamBlacklist extension. I have some
>>> doubts I hope you could clear. Is there any place I can get
>>> documentation for the class SpamBlacklist in the file
>>> SpamBlacklist_body.php?
>
> There really isn't any documentation besides the code, but a couple
> more things you should look at. Notice that in SpamBlacklist.php there
> is the line $wgHooks['EditFilterMerged'][] =
> 'SpamBlacklistHooks::filterMerged';, which is the way that
> SpamBlacklist registers itself with MediaWiki core to filter edits. So
> when MediaWiki core runs the EditFilterMerged hooks (which it does in
> includes/EditPage.php, line 1287), all of the extensions that have
> registered a function for that hook are run with the passed-in
> arguments, so SpamBlacklistHooks::filterMerged is run. And
> SpamBlacklistHooks::filterMerged then just sets up and calls
> SpamBlacklist::filter. So that is where you can start tracing what is
> actually in the variables, in case Platonides' summary wasn't enough.
>
>>> In the function filter, what do the following variables represent?
>>
>> $title: Title object (includes/Title.php). This is the page where it
>> tried to save.
>> $text: Text being saved in the page/section.
>> $section: Name of the section, or ''.
>> $editpage: EditPage object if EditFilterMerged was called, null
>> otherwise.
>> $out: A ParserOutput object (actually, this variable name was a bad
>> choice, it looks like an OutputPage), see
>> includes/parser/ParserOutput.php.
>
>>> I have understood the following things from the code, please correct
>>> me if I am wrong. It extracts the edited text, and parses it to find
>>> the links.
>>
>> Actually, it uses the fact that the parser will have processed the
>> links, so in most cases it just obtains that information.
>
>>> It then replaces the links which match the whitelist regex,
>>
>> This doesn't make sense as you explain it. It builds a list of links,
>> and replaces whitelisted ones with '', i.e. removes whitelisted links
>> from the list.
>
>>> and then checks if there are some links that match the blacklist
>>> regex.
>>
>> Yes.
>
>>> If the check is non-empty you return the content matched.
>>
>> Right, $check will be non-0 if the links matched the blacklist.
>
>>> it already enters in the debug log if it finds a match
>>
>> Yes, but that is a private log. Bug 1542 talks about making that
>> accessible in the wiki.
>
> Yep. For example, see
> * https://en.wikipedia.org/wiki/Special:Log
> * https://en.wikipedia.org/wiki/Special:AbuseLog
>
>>> I guess the bug aims at creating a SQL table. I was thinking of the
>>> following fields to log: Title, Text, User, URLs, IP. I don't
>>> understand why you denied it.
>>
>> Because we don't like to publish the IPs *in the wiki*.
>
> The WMF privacy policy also discourages us from keeping IP addresses
> longer than 90 days, so if you do keep IPs, then you need a way to
> hide / purge them, and if they allow someone to see what IP address a
> particular username was using, then only users with checkuser
> permissions are allowed to see that. So it would be easier for you not
> to include it, but if it's desired, then you'll just have to build
> those protections out too.
>
>> I think the approach should be to log matches using the AbuseFilter
>> extension if that one is loaded.
>
> The AbuseFilter log format has a lot of data in it specific to
> AbuseFilter, and is used to re-test abuse filters, so adding these hits
> into that log might cause some issues. I think either the general log,
> or using a separate, new log table would be best. Just for some
> numbers, in the first 7 days of this month, we've had an average of
> 27,000 hits each day. So if this goes into an existing log, it's going
> to generate a significant amount of data.

--
Cheers,
Anubhav

Anubhav Agarwal | 4th Year | Computer Science Engineering | IIT Roorkee
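For anyone tracing this for the first time, here is a minimal sketch of the registration pattern Chris describes. Only the hook name and the registration line come from the thread; the handler body below is illustrative, not the actual SpamBlacklist code:

    <?php
    // In the extension's setup file (SpamBlacklist.php): register a handler
    // that MediaWiki core calls from EditPage when an edit is about to save.
    $wgHooks['EditFilterMerged'][] = 'SpamBlacklistHooks::filterMerged';

    class SpamBlacklistHooks {
        /**
         * EditFilterMerged handler (2013-era hook signature).
         * Returning true lets the save continue; false blocks it.
         */
        public static function filterMerged( $editPage, $text, &$hookError, $summary ) {
            // Illustrative stand-in for the real logic, which builds a
            // SpamBlacklist object and calls SpamBlacklist::filter() with
            // the title, text and section.
            if ( preg_match( '/spammy-domain\.example/', $text ) ) {
                $hookError = 'Edit matched the spam blacklist';
                return false; // block the save
            }
            return true;
        }
    }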
Re: [Wikitech-l] Bug 1542 - Log spam blacklist hits
csteipp. Feel free to ping me whenever.

On Mar 8, 2013 6:23 AM, anubhav agarwal anubhav...@gmail.com wrote:
> Hey guys,
>
> Thanks for explaining it to me. Can I have your IRC handles? I still
> think I have many doubts. Is there a simpler bug related to the
> extension, so I can get an idea of how it works?
>
> [snip]
Re: [Wikitech-l] Seemingly proprietary Javascript
On 06/03/13 13:34, Chad wrote:
> Jack Phoenix wrote:
>> we'll soon be debating about the very meaning of the word "is".
> Jack is not alone.

^^ Care to elaborate the meaning there?

-- Antoine "hashar" Musso
Re: [Wikitech-l] Github/Gerrit mirroring
On 05/03/13 11:27, Krinkle wrote:
> If all we do is immediately copy the PR, submit it to Gerrit and close
> the PR saying "Please create a WMFLabs account, learn all of fucking
> Gerrit, and then continue on Gerrit to finalise the patch", then we
> should just kill PR now.

That has always been my point. The code ultimately has to land in Gerrit, so to me there is no point in using GitHub pull requests. I guess the whole idea of using GitHub is public relations, to attract new people. Then, if a developer is not willing to learn Gerrit, their code is probably not worth the effort of integrating GitHub and Gerrit. That will just add more poor-quality code to your review queues.

-- Antoine "hashar" Musso
Re: [Wikitech-l] Github/Gerrit mirroring
> ... Then, if a developer is not willing to learn Gerrit, their code is
> probably not worth the effort of integrating GitHub and Gerrit. That
> will just add more poor-quality code to your review queues.

That seems like a pretty big assumption, and likely to be wrong. The simpler the code review process, the happier people will be to submit patches. Quality seems independent of that, and more likely linked to the ease of validating patches (linting, unit test requirements, good style guides, etc.). But that's just a guess. If deemed interesting, I would be glad to help quantify patch quality and analyze what helps improve it.
[Wikitech-l] Some Sort of Notice for Breaking Changes
Is there any way that extension developers can get some sort of notice for breaking changes, e.g., https://gerrit.wikimedia.org/r/50138? Luckily my extension's JobQueue implementation hasn't been merged yet, but if it had, I would have had no idea that it had been broken by core.

--
Tyler Romeo
Stevens Institute of Technology, Class of 2015
Major in Computer Science
www.whizkidztech.com | tylerro...@gmail.com
Re: [Wikitech-l] Github/Gerrit mirroring
On 03/08/2013 08:31 AM, Dan Andreescu wrote:
>> ... Then, if a developer is not willing to learn Gerrit, their code is
>> probably not worth the effort of integrating GitHub and Gerrit. That
>> will just add more poor-quality code to your review queues.

imho GitHub has the potential to get us a first patch from many contributors that won't arrive through gerrit.wikimedia.org first. It's just a lot simpler for GitHub users. Some of those patches will be good, some not so much, but that is probably also the case for first-time contributors in Gerrit.

When a developer submits a second and a third pull request via GitHub, then we can politely invite her to check http://www.mediawiki.org/wiki/Gerrit and join our actual development process.

--
Quim Gil
Technical Contributor Coordinator @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil
Re: [Wikitech-l] Github/Gerrit mirroring
I've been hosting my puppet-cdh4 (Hadoop) repository on GitHub for a while now. I am planning on moving this into Gerrit. I've been getting pretty high-quality pull requests for the last month or so from a couple of different users (including CentOS support, supporting MapReduce v1 as well as YARN, etc.):

https://github.com/wikimedia/puppet-cdh4/issues?page=1&state=closed

I'm happy to host this in Gerrit, but I suspect that contribution to this project will drop once I do. :/

On Mar 8, 2013, at 11:47 AM, Quim Gil q...@wikimedia.org wrote:
> imho GitHub has the potential to get us a first patch from many
> contributors that won't arrive through gerrit.wikimedia.org first. It's
> just a lot simpler for GitHub users. Some of those patches will be
> good, some not so much, but that is probably also the case for
> first-time contributors in Gerrit.
>
> [snip]
[Wikitech-l] JobQueue changes (Re: Some Sort of Notice for Breaking Changes)
On Fri, Mar 8, 2013 at 8:35 AM, Tyler Romeo tylerro...@gmail.com wrote:
> Is there any way that extension developers can get some sort of notice
> for breaking changes, e.g., https://gerrit.wikimedia.org/r/50138?
> Luckily my extension's JobQueue implementation hasn't been merged yet,
> but if it had, I would have had no idea that it had been broken by
> core.

Hi Tyler,

Sorry to hear that there might be a problem here. It's been a pet peeve of mine that we seem to be a little too eager to break backwards compatibility in places where it may not be necessary. That said, let's try to avoid a meta-process discussion before we collectively understand the example you are bringing up, and focus on the JobQueue.

As near as I can tell from a quick skim of the changeset you're referencing, Aaron's changes here are purely additive. Am I reading this wrong? Is there some other changeset that changes or removes existing interfaces that you meant to reference instead?

Rob
Re: [Wikitech-l] Github/Gerrit mirroring
On 8 Mar 2013 10:47, Quim Gil q...@wikimedia.org wrote:
> imho GitHub has the potential to get us a first patch from many
> contributors that won't arrive through gerrit.wikimedia.org first. It's
> just a lot simpler for GitHub users. Some of those patches will be
> good, some not so much, but that is probably also the case for
> first-time contributors in Gerrit.

+1. To me the need to create a Gerrit account is a huge barrier to entry. I think we are missing out on attracting small but useful patches from developers who are not heavily invested in the project and have no wish to become regular core contributors...

> When a developer submits a second and a third pull request via GitHub,
> then we can politely invite her to check
> http://www.mediawiki.org/wiki/Gerrit and join our actual development
> process.
Re: [Wikitech-l] JobQueue changes (Re: Some Sort of Notice for Breaking Changes)
On Fri, Mar 8, 2013 at 12:18 PM, Rob Lanphier ro...@wikimedia.org wrote:
> As near as I can tell from a quick skim of the changeset you're
> referencing, Aaron's changes here are purely additive. Am I reading
> this wrong? Is there some other changeset that changes or removes
> existing interfaces that you meant to reference instead?

At first glance it seems additive, but the change adds a new abstract method to the JobQueue class, meaning any child class of JobQueue that doesn't implement the new method will trigger a fatal error. To make it non-breaking, the function would have to have a default implementation in the main JobQueue class.

--
Tyler Romeo
Stevens Institute of Technology, Class of 2015
Major in Computer Science
www.whizkidztech.com | tylerro...@gmail.com
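A minimal PHP sketch of the failure mode Tyler describes; the class and method names here are hypothetical, not the ones from the actual changeset:

    <?php
    abstract class JobQueue {
        // Newly added abstract method: every existing subclass must now
        // implement it, or PHP refuses to load the subclass at all.
        abstract protected function doGetSize();

        // A non-breaking alternative would be a default implementation:
        //   protected function doGetSize() { return 0; }
    }

    // An extension's subclass, written before the change and not updated:
    class JobQueueFoo extends JobQueue {
        // No doGetSize() here, so merely loading this class is a fatal
        // error: "Class JobQueueFoo contains 1 abstract method and must
        // therefore be declared abstract or implement the remaining
        // methods".
    }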
[Wikitech-l] Wikimedia Hackathon Amsterdam 2013: Registration opened
Hi everyone,

Wikimedia Nederland invites all developers to the Wikimedia Hackathon, which will take place 24-26 May 2013. Registration is now open and also includes the possibility to apply for a travel, accommodation or full scholarship. You can find the form at https://docs.google.com/spreadsheet/viewform?formkey=dFg2SmRRbkpxNmxCcFNFdlduVlJuTUE6MQ#gid=0

The hackathon is an opportunity for all Wikimedia community developers and sysadmins to come together, squash bugs and write great new features and tools. Unlike previous years (2012, 2011, etc.), this hackathon won't be in Berlin, but in Amsterdam.

The event is open to a wide range of developers. We welcome both seasoned and new developers, as well as people working on MediaWiki, tools, pywikipedia, Wikidata, gadgets, extensions, templates ...

Please suggest and discuss topics at https://www.mediawiki.org/wiki/Amsterdam_Hackathon_2013/Topics . You can indicate that you're coming at https://www.mediawiki.org/wiki/Amsterdam_Hackathon_2013/Attendees and/or https://www.facebook.com/events/167285526755104/ . This doesn't replace registration, it's just to let others know what you're up to.

Keep an eye on https://www.mediawiki.org/wiki/Amsterdam_Hackathon_2013 for updates!

Maarten
Re: [Wikitech-l] Indexing structures for Wikidata
On Thu, Mar 7, 2013 at 12:50 PM, Denny Vrandečić denny.vrande...@wikimedia.de wrote:
> As you probably know, the search in Wikidata sucks big time. Until we
> have created a proper Solr-based search and deployed on that
> infrastructure, we would like to implement and set up a reasonable
> stopgap solution.
>
> The simplest and most obvious signal for sorting the items would be to
> 1) make a prefix search
> 2) weight all results by the number of Wikipedias the item links to
>
> This should usually provide the item you are looking for. Currently,
> the search order is random. Good luck with finding items like
> California, Wellington, or Berlin.
>
> Now, what I want to ask is: what would be the appropriate index
> structure for that table? The data is saved in the wb_terms table,
> which would need to be extended by a weight field. There is already a
> suggestion (based on discussions between Tim and Daniel K, if I
> understood correctly) to change the wb_terms table index structure (see
> https://bugzilla.wikimedia.org/show_bug.cgi?id=45529 ), but since we
> are changing the index structure anyway, it would be great to get it
> right this time. Anyone who can jump in? (Looking especially at Asher
> and Tim.) Any help would be appreciated.
>
> Cheers,
> Denny
>
> --
> Project director Wikidata
> Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
> Tel. +49-30-219 158 26-0 | http://wikimedia.de

AFAIK SQL isn't particularly good for indexing that type of query. You could maybe have a bunch of indexes for the first couple of letters of a term, and then after some point hope that things are narrowed down enough that just doing a prefix search is acceptable.

For example, you might have indexes on (wb_term(1), wb_weight), (wb_term(2), wb_weight), ..., (wb_term(7), wb_weight) and one on just wb_term. That way (I believe) you would be able to do efficient searches for a prefix ordered by weight, provided the prefix is less than 7 characters.

(7 was chosen arbitrarily out of a hat. Performance goes down as you add more indexes, from what I understand. I'm not sure how far you would be able to take this scheme before that becomes an issue. You could maybe enhance this by only showing search suggestion updates for every 2 characters the user enters, or something.)

--bawolff

p.s. Have not tested this, and talking a bit outside my knowledge area, so ymmv.
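To make the suggestion concrete, here is a rough sketch of the kind of query such prefix indexes would serve, using MediaWiki's database abstraction layer. The term_weight field is the proposed addition and does not exist yet, and the column names are illustrative, following the wb_terms naming on the bug:

    <?php
    // With a MySQL index like (term_text(4), term_weight), a four-letter
    // prefix search can be matched and weight-ordered from the index alone.
    $dbr = wfGetDB( DB_SLAVE );
    $rows = $dbr->select(
        'wb_terms',
        array( 'term_entity_id', 'term_text' ),
        // LIKE 'Berl%' -- buildLike() escapes the literal part for us.
        array( 'term_text' . $dbr->buildLike( 'Berl', $dbr->anyString() ) ),
        __METHOD__,
        array( 'ORDER BY' => 'term_weight DESC', 'LIMIT' => 10 )
    );
    foreach ( $rows as $row ) {
        // Highest-weight entities whose label starts with "Berl" come first.
        print "{$row->term_entity_id}: {$row->term_text}\n";
    }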
Re: [Wikitech-l] [Labs-l] Wikimedia Hackathon Amsterdam 2013: Registration opened
I have one question :) Why is the registration form asking me which year and month I will depart? Are you afraid some attendees are planning to stay for several years? :D

On Fri, Mar 8, 2013 at 6:54 PM, Maarten Dammers maar...@mdammers.nl wrote:
> Hi everyone,
>
> Wikimedia Nederland invites all developers to the Wikimedia Hackathon,
> which will take place 24-26 May 2013. Registration is now open and also
> includes the possibility to apply for a travel, accommodation or full
> scholarship.
>
> [snip]
Re: [Wikitech-l] Github/Gerrit mirroring
On Fri, 08 Mar 2013 17:07:18 +0100, Antoine Musso hashar+...@free.fr wrote:
> I guess the whole idea of using GitHub is public relations, to attract
> new people. Then, if a developer is not willing to learn Gerrit, their
> code is probably not worth the effort of integrating GitHub and Gerrit.
> That will just add more poor-quality code to your review queues.

This a hundred times. I manage a few (small) open-source projects at GitHub, and most of the patches I get are not even up to my standards (and those are significantly lower than WMF's). Submitting a patch to Gerrit, and even fixing it after code review, is not that hard. (Of course, more complicated operations like rebasing do suck, but you hopefully won't be doing that with your first patch.)

-- Matma Rex
Re: [Wikitech-l] JobQueue changes (Re: Some Sort of Notice for Breaking Changes)
Also, after doing a git-blame, I found https://gerrit.wikimedia.org/r/51886, which was also merged today. I could search through core for other changes like this, but it'd require an immense amount of time.

--
Tyler Romeo
Stevens Institute of Technology, Class of 2015
Major in Computer Science
www.whizkidztech.com | tylerro...@gmail.com
[Wikitech-l] Deployment highlights - 2013-03-08
Hello! This is your friendly weekly deployment highlights email. For the week of March 11th (next week), here are some things to be aware of:

* Scribunto (Lua) will be available on all wikis as of Wednesday the 13th.

* HTTPS for all logged-in users. This is planned to happen next week, but the exact deployment window is still to be determined. I will inform wikitech-l and -ambassadors when it is scheduled. See this bug for more info: https://bugzilla.wikimedia.org/show_bug.cgi?id=39380

Best,

Greg

--
Greg Grossmeier | GPG: B2FA 27B1 F7EB D327 6B8E A18D 1138 8E47 FAC8 1C7D
identi.ca: @greg
Re: [Wikitech-l] Deployment highlights - 2013-03-08
On 2013-03-08 11:26:38 -0800, Greg Grossmeier wrote:
> Hello! This is your friendly weekly deployment highlights email. For
> the week of March 11th (next week), here are some things to be aware
> of:

Also, regarding the Mobile Uploads feature on Wednesday the 13th:

* We're releasing a call to action to log in or sign up from the article upload feature, as well as the ability to donate images to Commons, to the full mobile web.

(sorry for the noise)

--
Greg Grossmeier | GPG: B2FA 27B1 F7EB D327 6B8E A18D 1138 8E47 FAC8 1C7D
identi.ca: @greg
Re: [Wikitech-l] Github/Gerrit mirroring
On 08/03/13 09:21, Jon Robson wrote:
> +1. To me the need to create a Gerrit account is a huge barrier to
> entry. I think we are missing out on attracting small but useful
> patches from developers who are not heavily invested in the project and
> have no wish to become regular core contributors...

Maybe Gerrit can be made to let one authenticate with their GitHub account?

-- Antoine "hashar" Musso
Re: [Wikitech-l] Github/Gerrit mirroring
On Fri, Mar 8, 2013 at 11:50 AM, Antoine Musso hashar+...@free.fr wrote:
> On 08/03/13 09:21, Jon Robson wrote:
>> +1. To me the need to create a Gerrit account is a huge barrier to
>> entry. I think we are missing out on attracting small but useful
>> patches from developers who are not heavily invested in the project
>> and have no wish to become regular core contributors...
>
> Maybe Gerrit can be made to let one authenticate with their GitHub
> account?

Nope. We use LDAP for auth with Gerrit, and it does not support having multiple authentication methods at the same time (nor do I really see it as worth the effort). Getting GitHub PRs into the Gerrit ecosystem is on the Gerrit roadmap, but we don't have a firm date just yet. I plan to announce this much more widely when we're close to that.

-Chad
Re: [Wikitech-l] Identifying pages that are slow to render
Federico Leva (Nemo) wrote:
> There's slow-parse.log, but it's private unless a solution is found for
> https://gerrit.wikimedia.org/r/#/c/49678/
> https://wikitech.wikimedia.org/wiki/Logs

"Separate slow-parse into public and private files":
https://bugzilla.wikimedia.org/show_bug.cgi?id=45830

https://gerrit.wikimedia.org/r/49678 was abandoned; it looks like https://gerrit.wikimedia.org/r/52608 is now the relevant Gerrit changeset.

MZMcBride
Re: [Wikitech-l] Some Sort of Notice for Breaking Changes
Partly related: to be fair, Aaron asked for comments about release notes and announcements some months ago (although in that case for schema changes), but there were none. http://lists.wikimedia.org/pipermail/wikitech-l/2012-November/064630.html

Nemo
Re: [Wikitech-l] Extension:OpenID 3.00 - Security Release
On Fri, Mar 8, 2013 at 1:07 AM, Yuvi Panda yuvipa...@gmail.com wrote:
> Was this the last blocker to getting the extension deployed?

On wikitech the blockers were the switch of the wiki name (from labsconsole to wikitech) and this. There are still some issues that need to be worked out for deployment on the main projects. Also, it needs a full review before deployment to the projects, and we need to work out how this will affect the OAuth plans. We have a kickoff meeting for this coming up soon. I'll send updates when that occurs.

For deployment on wikitech I think I'd like to wait for a full security review, so it may be a little while.

- Ryan
Re: [Wikitech-l] Some Sort of Notice for Breaking Changes
True, but schema changes are not as bad because they won't cause fatal errors in PHP. At the very least, if a schema change occurs, your wiki will still be operational.

--Tyler Romeo

On Mar 8, 2013 4:26 PM, Federico Leva (Nemo) nemow...@gmail.com wrote:
> Partly related: to be fair, Aaron asked for comments about release
> notes and announcements some months ago (although in that case for
> schema changes), but there were none.
> http://lists.wikimedia.org/pipermail/wikitech-l/2012-November/064630.html
Re: [Wikitech-l] Some Sort of Notice for Breaking Changes
https://bugzilla.wikimedia.org/show_bug.cgi?id=45915

I think we should use Twitter in addition to the mailing list. I am not a fan of all new tools, but many OSS projects (ownCloud, Mailvelope) post their breaking news there. We do not (yet).
Re: [Wikitech-l] Some Sort of Notice for Breaking Changes
On Fri, Mar 8, 2013 at 3:02 PM, Thomas Gries m...@tgries.de wrote:
> https://bugzilla.wikimedia.org/show_bug.cgi?id=45915
>
> I think we should use Twitter in addition to the mailing list. I am not
> a fan of all new tools, but many OSS projects (ownCloud, Mailvelope)
> post their breaking news there. We do not (yet).

You're joking, right?

-Chad
Re: [Wikitech-l] Some Sort of Notice for Breaking Changes
On 09.03.2013 00:04, Chad wrote:
>> I think we should use Twitter in addition to the mailing list.
>
> You're joking, right?
>
> -Chad

Why do you think I am joking? Major changes can be signalled there - or did I miss something?
Re: [Wikitech-l] Identifying pages that are slow to render
Antoine Musso hashar+...@free.fr wrote:
> On 06/03/13 22:05, Robert Rohde wrote:
>> On enwiki we've already made Lua conversions of most of the string
>> templates, several formatting templates (e.g. {{rnd}}, {{precision}}),
>> {{coord}}, and a number of others. And there is work underway on a
>> number of the more complex overhauls (e.g. {{cite}}, {{convert}}).
>> However, it would be nice to identify problematic templates that may
>> be less obvious.
>
> You can get in touch with Brad Jorsch and Tim Starling. They most
> probably have a list of templates that should quickly be converted to
> Lua modules. If we got {{cite}} out, that would already be a nice
> improvement :-]

Not really, given https://bugzilla.wikimedia.org/show_bug.cgi?id=45861

//Saper
Re: [Wikitech-l] Indexing non-text content in LuceneSearch
-----Original Message-----
From: wikitech-l-boun...@lists.wikimedia.org [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Brion Vibber
Sent: Thursday, March 7, 2013 9:59 PM
To: Wikimedia developers
Subject: Re: [Wikitech-l] Indexing non-text content in LuceneSearch

On Thu, Mar 7, 2013 at 11:45 AM, Daniel Kinzler dan...@brightbyte.de wrote:
> 1) create a specialized XML dump that contains the text generated by
> getTextForSearchIndex() instead of actual page content.

That probably makes the most sense; alternately, make a dump that includes both raw data and text for search. This also allows for indexing extra stuff for files -- such as extracted text from a PDF or DjVu, or metadata from a JPEG -- if the dump process etc. can produce appropriate indexable data.

> However, that only works if the dump is created using the PHP dumper.
> How are the regular dumps currently generated on WMF infrastructure?
> Also, would it be feasible to make an extra dump just for LuceneSearch
> (at least for wikidata.org)?

The dumps are indeed created via MediaWiki. I think Ariel or someone can comment with more detail on how it currently runs; it's been a while since I was in the thick of it.

> 2) We could re-implement the ContentHandler facility in Java, and
> require extensions that define their own content types to provide a
> Java-based handler in addition to the PHP one. That seems like a pretty
> massive undertaking of dubious value. But it would allow maximum
> control over what is indexed how.

No, don't do it :)

> 3) The indexer code (without plugins) should not know about Wikibase,
> but it may have hard-coded knowledge about JSON. It could have a
> special indexing mode for JSON, in which the structure is deserialized
> and traversed, and any values are added to the index (while the keys
> used in the structure would be ignored). We may still be indexing
> useless interna from the JSON, but at least there would be a lot fewer
> false negatives.

Indexing structured data could be awesome -- again I think of file metadata as well as wikidata-style stuff. But I'm not sure how easy that'll be. Should probably be in addition to the text indexing, rather than replacing it.

-- brion

I agree with Brion. Here are my 5 shekels' worth. To index non-MW dumps with LuceneSearch I would:

1. modify the daemon to read the custom dump format, or update the XML dump format to support a JSON dump (it uses the MWDumper codebase to do this now)
2. add a Lucene analyzer to handle the new data type, say a JSON analyzer
3. add a Lucene document type per the JSON-based Wikidata schema
4. update the query parser to handle the new queries and the modified Lucene documents
5. for bonus points, modify spelling correction and write a Wikidata ranking algorithm

But this would only solve reading the static dumps used to bootstrap the index; I would then have to change how MWSearch periodically polls Brion's OAIRepository to pull in updated pages.

Having coded some analytics over MWDumps from WMF/Wikia wikis for a research project, I can say this:

1. Most big dumps (e.g. historic) inherit the issues of wikitext, namely unescaped tags and entities which crash modern XML Java libraries - so escape your data and validate the XML!
2. The good old SAX code in MWDumper still works fine - so use it.
3. Use Lucene 2.4 with the deprecated old APIs.
4. Ariel is doing a great job (e.g. the 7z compression and the splitting of the dumps), but these are things MWDumper does not handle yet.

Finally, based on my work with the i18n team on TranslateWiki search: indexing JSON data with Solr + Solarium requires no search engine coding at all. You define the document schema, and use Solarium to push JSON and get results too. I could do a demo of how to do this at a coming hackathon if there is any interest; however, when I offered to replace LuceneSearch like this last October, the idea was rejected out of hand.

-- oren
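A minimal sketch of the Solarium approach Oren mentions, assuming Solarium 3.x and a Solr core whose schema.xml declares the fields used here; the field names and the sample entity are hypothetical:

    <?php
    require 'vendor/autoload.php';

    // Hypothetical Solr endpoint configuration.
    $client = new Solarium\Client( array(
        'endpoint' => array(
            'localhost' => array(
                'host' => '127.0.0.1',
                'port' => 8983,
                'path' => '/solr',
            ),
        ),
    ) );

    // Push one decoded JSON entity as a Solr document. Field names (id,
    // label, weight) are illustrative and must match the core's schema.xml.
    $entity = json_decode( '{"id":"Q64","label":"Berlin","sitelinks":280}', true );

    $update = $client->createUpdate();
    $doc = $update->createDocument();
    $doc->id     = $entity['id'];
    $doc->label  = $entity['label'];
    $doc->weight = $entity['sitelinks'];
    $update->addDocument( $doc );
    $update->addCommit();
    $client->update( $update );

    // Querying back is equally schema-driven: a prefix search on labels.
    $query  = $client->createSelect();
    $query->setQuery( 'label:Berl*' );
    $result = $client->select( $query );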
Re: [Wikitech-l] Mediawiki's access points and mw-config
On Wed, Feb 27, 2013 at 9:13 PM, Daniel Friesen dan...@nadir-seen-fire.com wrote:
> index.php, api.php, etc... provide entrypoints into the configured
> wiki. mw-config/ installs and upgrades the wiki, with much of itself
> disconnected from core code that requires a configured wiki. And after
> installation it can even be eliminated completely without issue.

I think this clarifies the issue for me. Correct me if I'm wrong, but basically the entry points are for continued, repeated use, for indeed *accessing* wiki resources (hence I suggest normalizing the name of these scripts to "access points" everywhere in the docs, because "entry" is a little more generic), while mw-config/index.php is a one-off script that has no use once the wiki installation is done. I'll update the docs on mw.org accordingly, to make this clear.

> I wouldn't even include mw-config in entrypoint modifications that
> would be applied to other entrypoint code.

You mean like this one: https://gerrit.wikimedia.org/r/#/c/49208/? I can understand, in the sense that it gives people the wrong idea regarding its relationship with the other access points, but if the documentation is clear, I see no reason not to have mw-config/index.php benefit from changes when the touched code is the part common to all *entry* points (in the strict meaning of files that can be used to enter the wiki from a web browser).

That said, and considering what Platonides mentioned:

> It was originally named config. It came from the link that sent you
> there: "You need to configure your wiki first." Then someone had
> problems with another program that was installed sitewide on their host
> appropriating the /config/ folder, so it was renamed to mw-config.

...I would suggest the mw-config directory be renamed to something that more clearly identifies its purpose. I'm thinking "first-run" or something to that effect. I'll submit a patchset proposing this.

--Waldir