Re: [Wikitech-l] HTTPS Wikipedia search for Firefox - update?
On Fri, Dec 28, 2012 at 10:58 PM, Jeremy Baron jer...@tuxmachine.com wrote:
> On Sat, Dec 29, 2012 at 6:52 AM, bawolff bawolff...@gmail.com wrote:
>> On Fri, Dec 28, 2012 at 1:50 PM, Ryan Lane rlan...@gmail.com wrote:
>>> There's no change. We're still waiting on MediaWiki changes to occur
>>> before we switch logged-in users to HTTPS by default.
>> [...]
>> Furthermore, what does making the Firefox search box use HTTPS have to
>> do with having users log in to secure by default? I suppose we might not
>> want people to lose their login if they're logged in over insecure HTTP
>> and search via Firefox with secure login - is that what you're concerned
>> about, or is it something else?
>
> I was thinking it was just wanting to ramp up load internally first
> (where it's really easy and fast to ramp back down if needed) and then
> expand to other places when we're more confident. Turning on HTTPS by
> default for all logged-in users is one of those ways to ramp up load in a
> controlled and easily reversible way.

This.

- Ryan
Re: [Wikitech-l] Fwd: Re: [Wikimedia-l] No access to the Uzbek Wikipedia in Uzbekistan
On Thu, Dec 27, 2012 at 7:35 PM, Marco Fleckinger
marco.fleckin...@wikipedia.at wrote:
> Do we have one extra machine left? Then we could set up this as a NAT
> router. This will replace another machine if we do not have one extra IP
> left. The original ports need to be forwarded to that then.
>
> Cheers
>
> Marco
>
> -------- Original message --------
> From: Leslie Carr lc...@wikimedia.org
> Sent: Fri Dec 28 00:03:33 CET 2012
> To: Wikimedia Mailing List wikimedi...@lists.wikimedia.org
> Subject: Re: [Wikimedia-l] No access to the Uzbek Wikipedia in Uzbekistan
>
> On Thu, Dec 27, 2012 at 2:37 PM, Marco Fleckinger
> marco.fleckin...@wikipedia.at wrote:
>> Leslie Carr lc...@wikimedia.org wrote:
>>> On Thu, Dec 27, 2012 at 1:39 PM, Marco Fleckinger
>>> marco.fleckin...@wikipedia.at wrote:
>>>> Just an idea, which is not very beautiful: what about a router
>>>> forwarding ports to the correct machine by using iptables? Would that
>>>> also work in connection with search engines?
>>>
>>> Are you suggesting we use different nonstandard ports for each
>>> different wiki/language combo that resides on the same IP?
>>
>> Yes, exactly!
>
> I guess that is theoretically possible with a more intrusive load
> balancer in the middle. We need the Host information from the HTTP header
> to be added, as we have our Varnish caches serving multiple services, not
> one (or more) per language/project combo. I'm pretty sure that LVS (which
> we use) doesn't have this ability. Some large commercial load balancers
> can rewrite some headers, but that would be a pretty intensive operation
> (think lots of CPU needed, since it needs to terminate SSL and then
> rewrite headers) and would probably be expensive. If you have another way
> you think we can do this, I am all ears!
>
> We may want to move this discussion to wikitech-l, as all the technical
> discussion probably bores most of the people on wikimedia-l.
>
> Leslie

Wikimedia is a pretty big player. Has anyone from the foundation with some
sort of fancy-sounding title called up the ISP in question and asked "wtf?"
The original email on wikimedia-l made it sound like the issue is
unintentional.

-bawolff
[Wikitech-l] No access to the Uzbek Wikipedia in Uzbekistan
bawolff wrote:
> Wikimedia is a pretty big player. Has anyone from the foundation with
> some sort of fancy-sounding title called up the ISP in question and asked
> "wtf?" The original email on wikimedia-l made it sound like the issue is
> unintentional.

Dear Bawolff,

as far as I know, no one from the WMF has called up the ISPs. Some local
Wikipedians have, but have received no answer.

Just to add: the issue discussed happens with all the ISPs in the country,
not just a single one.

For some background information, please take a look at these short
sections:

https://en.wikipedia.org/wiki/Human_rights_in_Uzbekistan#Internet
https://en.wikipedia.org/wiki/Uzbek_Wikipedia#Blocking_of_Wikipedia

With all best wishes.
[Wikitech-l] Time class in MediaWiki?
Hi,

I could find a method to convert a timestamp into the user's preferred
timezone in the Language class; that looks like the wrong place to me. Is
there any other way (think global function) to convert to the user's
timezone and preferred format?

Also, is there any common script to do this in JS? With reference to bug
43365.

--
Happy Holidays,
Nischay Nahata
nischayn22.in
Re: [Wikitech-l] Gerrit reviewer-bot
On 29 December 2012 01:08, Merlijn van Deen valhall...@arctus.nl wrote:
>> Could you make it so that I can subscribe to a file pattern, too? I'd
>> like to be added as a reviewer to all patch sets that include files with
>> i18n in the name (unless it's a patch set by L10n-bot).
>
> This will take some work (because I have to retrieve the patchsets from
> gerrit in some way), but it should be possible. I'll try to get this
> working!

A (somewhat hackish, but so is the entire bot) implementation is now
running - you can add a 'file_regexp' template parameter, which is matched
against all changed files. See [1] for examples and subscription ;-) - you
can either get added to matched changes within a repository (Alex's use
case) or to matched changes in all repositories (Siebrand and Raimond's use
case): add yourself under the '*' header for that.

To test the regexp, try [2] (method=search, options DOTALL and IGNORECASE).

Best,
Merlijn

[1] http://www.mediawiki.org/wiki/Git/Reviewers#*
[2] http://re-try.appspot.com/
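For readers wondering what "matched against all changed files" amounts to,
here is a rough sketch of the matching step, in JavaScript for illustration
only - the real bot is Python, and its exact behaviour (including the
DOTALL/IGNORECASE options mentioned above) is defined in the
gerrit-reviewer-bot repository; the pattern and file names below are made
up:

    // Hypothetical subscription from [[Git/Reviewers]]: file_regexp = i18n
    var fileRegexp = new RegExp( 'i18n', 'i' ); // case-insensitive match

    // Changed files reported by Gerrit for a new patch set (made-up names)
    var changedFiles = [ 'Translate.i18n.php', 'Translate.php', 'README' ];

    // The subscriber is added as a reviewer if any changed file matches.
    var shouldAddReviewer = changedFiles.some( function ( file ) {
        return fileRegexp.test( file );
    } );
    // shouldAddReviewer === true here, since 'Translate.i18n.php' matches.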
Re: [Wikitech-l] Gerrit reviewer-bot
Hi,

I just received an email from this bot; unfortunately, that was
undesirable. It's not necessary to cc the original author of a patch when
adding a reviewer, as is the case when adding a reviewer manually.

On Friday, December 28, 2012, Merlijn van Deen wrote:
> Hello all,
>
> To add to the great work by Ori and Jon, I have my own recent pet project
> to announce: Reviewer-bot, which adds reviewers to any new changes
> uploaded to Gerrit. The basic idea is as follows:
>
> 1) reviewer-bot listens to Gerrit's events stream
> 2) a contributor uploads a new change (only the first patchset is reacted
>    upon)
> 3) reviewer-bot checks http://www.mediawiki.org/wiki/Git/Reviewers to see
>    which reviewers would like to be added to new changes
> 4) reviewer-bot adds the reviewers to the change
>
> Obviously, it's still in its infancy, so it will probably crash every now
> and then. However, please try it by adding your gerrit username to the
> Reviewers page!
>
> Of course, I also have the obligatory 'Fork me on github!' notice: the
> code is available at https://github.com/valhallasw/gerrit-reviewer-bot .
>
> I hope that, together with the RSS-based approach by Ori and the daily
> digest approach by Jon, this will help to improve the Gerrit experience -
> especially for new developers!
>
> Best,
> Merlijn

--
Cheers,
Nischay Nahata
nischayn22.in
Re: [Wikitech-l] Gerrit reviewer-bot
Hi Nischay,

On 29 December 2012 16:30, Nischay Nahata nischay...@gmail.com wrote:
> I just received an email from this bot; unfortunately, that was
> undesirable. It's not necessary to cc the original author of a patch when
> adding a reviewer, as is the case when adding a reviewer manually.

Gerrit notifies the patch owner if someone else (in this case:
reviewer-bot) adds a reviewer (in this case: me) to a patch. I'm unaware of
an option to suppress these messages (and I'm also unsure whether that
would actually be desirable).

Best,
Merlijn
Re: [Wikitech-l] Unit tests scream for attention
On Sat, Dec 29, 2012 at 8:57 AM, bawolff bawolff...@gmail.com wrote:
> When I used to run unit tests, there were quite regularly issues where
> the unit tests assumed you had the default configuration, when they
> really should not assume such a thing. (That was of course a while ago,
> so things may have changed.)

This is an annoyance to me as well. So I went triaging and finally found
the issues that failed the unit tests for me. I have committed fixes for
them to gerrit:

https://gerrit.wikimedia.org/r/#/c/41362/
https://gerrit.wikimedia.org/r/#/c/41360/

Bryan
[Wikitech-l] Html comments into raw wiki code: can they be wrapped into parsed html?
I'd like to use HTML comments in raw wiki text, as effective,
server-inexpensive data containers that could be read and parsed by a JS
script in view mode. But I see that HTML comments written into raw wiki
text are stripped away by the parsing routines.

I can access the raw code of the current page in view mode from JS with an
index.php or an api.php call, and I do, but this is much more
server-expensive IMHO.

Is there any sound reason to strip HTML comments away? If there is no sound
reason, could such stripping be avoided?

Alex brollo
Re: [Wikitech-l] Html comments into raw wiki code: can they be wrapped into parsed html?
Perhaps you chose the wrong approach. Dig into HTML5 data attributes, for
example; that's a better data interface between wikipage code and the view.
You can then access them with the $(selector).data() method.

On Sun, Dec 30, 2012 at 12:23 AM, Alex Brollo alex.bro...@gmail.com wrote:
> I'd like to use HTML comments in raw wiki text, as effective,
> server-inexpensive data containers that could be read and parsed by a JS
> script in view mode. But I see that HTML comments written into raw wiki
> text are stripped away by the parsing routines.
>
> I can access the raw code of the current page in view mode from JS with
> an index.php or an api.php call, and I do, but this is much more
> server-expensive IMHO.
>
> Is there any sound reason to strip HTML comments away? If there is no
> sound reason, could such stripping be avoided?
>
> Alex brollo

--
Best regards (З павагай),
Pavel Selitskas (Павел Селіцкас)
Wizardist @ Wikimedia projects
p.selits...@gmail.com, +375257408304
Skype: p.selitskas
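A minimal sketch of what Pavel is suggesting, assuming the wiki's sanitizer
lets a data-* attribute through to the rendered page (as bawolff notes
later in this thread, that may not actually be allowed in wikitext); the
class and attribute names are made up:

    // Template/wikitext output (assumed to survive sanitization):
    //   <span class="book-meta" data-pages="412" data-source="scan"></span>

    // User script in view mode, using the jQuery shipped with MediaWiki:
    $( '.book-meta' ).each( function () {
        // jQuery's .data() reads HTML5 data-* attributes
        var pages = $( this ).data( 'pages' );   // 412 (coerced to a number)
        var source = $( this ).data( 'source' ); // "scan"
        console.log( pages, source );
    } );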
Re: [Wikitech-l] Html comments into raw wiki code: can they be wrapped into parsed html?
On 29/12/12 22:23, Alex Brollo wrote:
> I'd like to use HTML comments in raw wiki text, as effective,
> server-inexpensive data containers that could be read and parsed by a JS
> script in view mode. But I see that HTML comments written into raw wiki
> text are stripped away by the parsing routines.
>
> I can access the raw code of the current page in view mode from JS with
> an index.php or an api.php call, and I do, but this is much more
> server-expensive IMHO.
>
> Is there any sound reason to strip HTML comments away? If there is no
> sound reason, could such stripping be avoided?

They are wikitext comments, defined to be stripped from what the final user
sees. I think there is an extension that allows outputting HTML comments.
You can also use some tag attributes as containers.
Re: [Wikitech-l] Can we help Tor users make legitimate edits?
On 28/12/12 18:29, Tilman Bayer wrote:
> On Fri, Dec 28, 2012 at 1:26 AM, Sumana Harihareswara wrote:
>> I've floated this problem past Tor and privacy people, and here are a
>> few ideas:
>>
>> 1) Just use the existing mechanisms more leniently. Encourage the
>> communities (Wikimedia and Tor) to use
>> https://en.wikipedia.org/wiki/Wikipedia:Request_an_account (to get an
>> account from behind Tor) and to let more people get IP block exemptions
>> even before they've made any edits (30 people have gotten exemptions on
>> en.wp in 2012). Add encouraging "get an exempt account" language to the
>> "you're blocked because you're using Tor" messaging. Then if there's an
>> uptick in vandalism from Tor, they can just tighten up again.

This seems the right approach.

>> 2) Encourage people with closed proxies to re-vitalize
>> https://en.wikipedia.org/wiki/Wikipedia:WOCP . Problem: using closed
>> proxies is okay for people with some threat models but not others.

I didn't know about it. This is an interesting concept. It would be
possible to set up some 'public Wikipedia proxies' (e.g. by a European
chapter) and encourage their use. It would still be possible to checkuser
people going through them, but a 2-tier process would be needed (wiki
checkuser + proxy admin), thus protecting from a "rogue checkuser" (is that
the primary concern of good editors wishing to use proxies?). We could use
that setup to gain information about usage (e.g. it was 90% spam).

>> 3) Look at Nymble - http://freehaven.net/anonbib/#oakland11-formalizing
>> and http://cgi.soic.indiana.edu/~kapadia/nymble/overview.php . It would
>> allow Wikimedia to distance itself from knowing people's identities, but
>> still allow admins to revoke permissions if people acted up. The user
>> shows a real identity, gets a token, and exchanges that token over Tor
>> for an account. If the user abuses the site, Wikimedia site admins can
>> blacklist the user without ever being able to learn who they were or
>> what other edits they did. More: https://cs.uwaterloo.ca/~iang/ Ian
>> Goldberg's, Nick Hopper's, and Apu Kapadia's groups are all working on
>> Nymble or its derivatives. It's not ready for production yet, I bet, but
>> if someone wanted a Big Project
>
> As Brad and Ariel point out, Nymble in the form described on the linked
> project page does not seem to allow long-term blocks, and cannot deal
> with dynamic IPs. In other words, it would only provide the analogue of
> autoblock functionality for Tor users. The linked paper by Henry and
> Goldberg is more realistic about these limitations, discussing IP
> addresses only as one of several possible unique identifiers (§V). From
> the concluding remarks to that chapter, it seems most likely that they
> would recommend some form of PKI or government ID-based registration for
> our purposes.

Requiring a government ID for connecting through Tor would be even worse
for privacy. I completely agree that matching with the IP address used to
request the Nymble token is not enough. Maybe if the tokens were instead
based on ISP + zone geolocation, that could be a way. Still, that would
miss linkability for vandals who use e.g. both their home and work
connections.

>> 3a) A token authorization system (perhaps a MediaWiki extension) where
>> the server blindly signs a token, and then the user can use that token
>> to bypass the Tor blocks. (Tyler mentioned he saw this somewhere in a
>> Bugzilla suggestion; I haven't found it.)

Bug 3729?

>> Thoughts? Are any of you interested in working on this problem? #tor on
>> the OFTC IRC server is full of people who'd be interested in talking
>> about this.

This is a social problem. We have the tools to fix it (account creation +
IP block exemption). If someone asked me for that (in a project where I
can) because they are censored by their government, I would gladly grant
it. That also means that when they replaced 'Jimbo' with 'penis' 5 minutes
after getting their account, I would notice and kick them out.

In my experience, far more people are trying to use Tor on Wikipedia for
vandalising than for doing constructive edits / due to local censorship,
although I concede that it's probably the opposite on 'certain wikis' I
don't edit.

The problem with global solutions is vandals abusing them. "If I don't get
caught on 10 edits, I can edit through Tor" is a candle for vandals. Note
that "I don't get caught" is different from doing a constructive edit. An
idea would be to force some recaptcha-style work before giving out such
tokens, so even though we know they will abuse the system, we are still
using them as an improving force (although the ensuing vandalism could
still be worse than what we gained).

I also wonder if we are not aiming too high, trying to solve the anonymity
and traceability problems of the internet, while we have, for instance,
captchas forced on anons and newbies on a couple of wikis due to bot
vandalism done years ago (bug 41745).
Re: [Wikitech-l] Time class in MediaWiki?
On Sat, Dec 29, 2012 at 6:41 AM, Nischay Nahata nischay...@gmail.com wrote:
> I could find a method to convert a timestamp into the user's preferred
> timezone in the Language class; that looks like the wrong place to me. Is
> there any other way (think global function) to convert to the user's
> timezone and preferred format?

Date formatting is language-based, so the date formatting functions do
indeed live in Language. It's also based on user preferences, which makes
it a bit of an odd fit, but it's a legit localization thing. :)

Elsewhere in the code, timestamps are passed around in timezone-independent
formats based on UTC.

> Also, is there any common script to do this in JS? With reference to bug
> 43365.

I'm not sure if we have full localization for dates in JS...

...but you can use the browser's built-in support. You won't get the same
formatting, and it may not match the user's *time zone preference* in
MediaWiki... e.g.
https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Date/toLocaleString

-- brion
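A small sketch of the browser-side fallback Brion describes, assuming you
start from a UTC timestamp such as one returned by the API (the example
value is made up); note that it follows the browser's locale and time zone,
not the user's MediaWiki preferences:

    // UTC timestamp, e.g. as returned by the API (ISO 8601, example value)
    var ts = '2012-12-29T06:41:00Z';

    // Let the browser format it in its own locale and local time zone.
    // This will generally NOT match the user's MediaWiki date format or
    // time zone preference - it is only the built-in fallback.
    var localString = new Date( ts ).toLocaleString();
    console.log( localString );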
[Wikitech-l] Big data benefits and limitations (relevance: WMF editor engagement, fundraising, and HR practices)
I'm sending this to Wikimedia-l, Wikitech-l, and Research-l in case other
people in the Wikimedia movement or staff are interested in big data as it
relates to Wikimedia. I hope that those who are interested in discussions
about WMF editor engagement efforts, WMF fundraising, or WMF HR practices
will also find that this email interests them. Feel free to skip straight
to the links in the latter portion of this email if you're already familiar
with big data and its analysis and just want to see what other people are
writing about the subject.

* Introductory comments / my personal opinion

"Big data" refers to quantities of information that are so large that they
are difficult to analyze and may not be related internally in an obvious
way. See https://en.wikipedia.org/wiki/Big_data

I think that most of us would agree that moving much of an organization's
information into the cloud, and/or directing people to analyze massive
quantities of information, will not automatically result in better, or even
good, decisions based on that information. Also, I think that most of us
would agree that bigger and/or more accessible quantities of data do not
necessarily imply that the data are more accurate or more relevant for a
particular purpose. Another concern is the possibility of unwelcome
intrusions into sensitive information, including the possibility of data
breaches; imagine the possible consequences if a hacker broke into
supposedly secure databases held by Facebook or the Securities and Exchange
Commission.

We have an enormous quantity of data on Wikimedia projects, and many ways
that we can examine those data. As this Dilbert strip points out, context
is important, and looking at statistics devoid of their larger contexts can
be problematic: http://dilbert.com/strips/comic/1993-02-07/

Since data analysis is also something that Wikimedia does in the areas I
mentioned previously, I'm passing along a few links for those who may be
interested in the benefits and limitations of big data.

* Links:

From the Harvard Business Review:
http://hbr.org/2012/04/good-data-wont-guarantee-good-decisions/ar/1

From the New York Times:
https://www.nytimes.com/2012/12/30/technology/big-data-is-great-but-dont-forget-intuition.html
and
https://www.nytimes.com/2012/02/12/sunday-review/big-datas-impact-in-the-world.html

From the Wall Street Journal (this may be especially interesting to those
who are participating in the discussions on Wikimedia-l regarding how
Wikimedia selects, pays, and manages its staff):
http://online.wsj.com/article/SB1872396390443890304578006252019616768.html

And from English Wikipedia (:
https://en.wikipedia.org/wiki/Big_data
https://en.wikipedia.org/wiki/Data_mining
https://en.wikipedia.org/wiki/Business_intelligence

Cheers,
Pine
Re: [Wikitech-l] Html comments into raw wiki code: can they be wrapped into parsed html?
On Sat, Dec 29, 2012 at 6:59 PM, Platonides platoni...@gmail.com wrote:
>> Is there any sound reason to strip HTML comments away? If there is no
>> sound reason, could such stripping be avoided?

Comments can sometimes be used to get XSS in unexpected ways (like
conditional comments for IE). I think they're stripped because that was
easier than writing a sanitizer for them, and they're pretty useless.

If all else fails, you can do the hacky thing of stuffing information into
either a class attribute or a title attribute of an element. (data-*
attributes would be even better, but I don't know if those are allowed in
wikitext or not.)

--bawolff
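A rough sketch of the title-attribute workaround bawolff mentions, with a
made-up class name and key=value convention (a real gadget would want
something more robust):

    // Hypothetical wikitext/template output:
    //   <span class="index-data" title="pages=412;quality=3"></span>

    // Reading it back from a user script in view mode:
    var raw = $( '.index-data' ).attr( 'title' ) || '';
    var data = {};
    raw.split( ';' ).forEach( function ( pair ) {
        var kv = pair.split( '=' );
        if ( kv.length === 2 ) {
            data[ kv[ 0 ] ] = kv[ 1 ];
        }
    } );
    console.log( data.pages, data.quality ); // "412" "3" (as strings)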