Re: [Wikitech-l] Serious alternatives to Gerrit
But do we have a plan for improving Gerrit in a substantial way? Gerrit releases often, and every release is noticeably better than the last. Chad can push changes upstream and has been doing so. OpenStack (which has 180+ member companies that can potentially assist), Qt, and other substantial communities are using and improving Gerrit. I can get behind the decision to use a currently substandard tool in order to preserve Wikimedia's long-term freedom. But to stick with Gerrit, we must have a plan for fixing it that does not simply declare that the ability to make changes means the magic FOSS fairy will make it so. I don't know what it would take -- maybe a weekend in SF where we invite every Java and Prolog expert we can find? Paying a contractor or two to make the necessary fixes?

Make sure to add any complaints to the wiki page (http://www.mediawiki.org/wiki/Git/Gerrit_evaluation#The_case_against_Gerrit). Many claims that Gerrit sucks turn out to be due to misunderstandings of how Gerrit works. Many other claims are due to our workflow or our current access restrictions. Of course, many claims are legitimate, and we are reporting the issues, tracking them upstream, and in some cases pushing in fixes.

There are some substantial downsides to Gerrit, but I don't see alternatives that don't have equally substantial downsides. Sumana listed numerous downsides to using GitHub. We can't fix the downsides in GitHub; with Gerrit we at least have the ability to do so. Just to drive this point home a little more, let's look at OpenStack's reasoning for going with Gerrit (rather than GitHub): https://lists.launchpad.net/openstack/msg03457.html

> This isn't just about attracting scores of new volunteers or having a reputational economy. It's a push for change driven by the fact that Gerrit seriously undercuts developer productivity and happiness. When we've got so many difficult, ambitious projects under way, I think those are two things we should be prioritizing.
> By that measuring stick, Gerrit fails miserably and GitHub is a winner.

I'm not sure I agree with the claim that this seriously undercuts productivity. Is there any data to back this up?

I'd like to add a couple of major downsides to using GitHub:

1. It would be worrisome from a security point of view to host the operations repos on GitHub. I think it's very likely that those repos would stay in Gerrit if the devs decided to move to GitHub. This would mean that we'd have split workflows, forcing developers to use both.

2. We can't use GitHub Enterprise: https://enterprise.github.com/faq#faq-6

- Ryan

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Serious alternatives to Gerrit
On Tue, Jul 24, 2012 at 10:46 PM, Alolita Sharma alolita.sha...@gmail.com wrote:
>> Cool, Priestley is awesome. If he comes to visit we should prevent him from leaving :)
> +1. We should definitely think about adopting Phabricator as a project if we're going to invest in its core developer.

I think this should be reversed. If we're gonna use Phabricator, we should attract its core developer. We shouldn't switch code review systems just because we hired a famous person. I personally think that something like GitLab would be superior to Phabricator, but that's another discussion.

Roan
Re: [Wikitech-l] Serious alternatives to Gerrit
On Wed, Jul 25, 2012 at 12:39 AM, Ryan Lane rlan...@gmail.com wrote:
> 1. It would be worrisome from a security point of view to host the operations repos on GitHub. I think it's very likely that those repos would stay in Gerrit if the devs decided to move to GitHub. This would mean that we'd have split workflows, forcing developers to use both.

What's more, we also wouldn't want to host MW deployment code (the wmf/* branches) on GitHub for the same reason. Those aren't even separate repos; they're branches within the mediawiki/core repo. Having a split workflow for contributing code to the project on the one hand and merging it into deployment on the other would be even worse.

Roan
Re: [Wikitech-l] Creating a centralized access point for propriety databases/resources
You could always create an OpenVPN gateway that provides access. Many edu institutions have the same setup for accessing those resources.

DJ

On Mon, Jul 23, 2012 at 6:21 PM, Ocaasi Ocaasi wikioca...@yahoo.com wrote:
> Hi Folks! The problem: Many proprietary research databases have donated free access to select Wikipedia editors (Credo Reference, HighBeam Research, JSTOR). Managing separate account distribution for each service doesn't scale well. The idea: Centralize access to these separate resources behind a single secure (firewalled) gateway, to which accounts would be given to a limited number of approved users. After logging in to this single gateway, users would be able to enter any of the multiple participating research databases without needing to log in to each one separately. The question: What are the basic technical specifications for setting up such a system? What are open source options, ideally? What language would be ideal? What is required to host such a system? Can you suggest a sketch of the basic steps necessary to implement such an idea? Any advice, from basics to details, would be greatly appreciated. Thanks so much! Ocaasi http://enwp.org/User:Ocaasi
Re: [Wikitech-l] Serious alternatives to Gerrit
On 25 July 2012 10:39, Ryan Lane rlan...@gmail.com wrote:
> But do we have a plan for improving Gerrit in a substantial way?

Currently the platform guys are too busy to even report bugs upstream.

> I'm not sure I agree with the claim that this seriously undercuts productivity. Is there any data to back this up?

There is, and I wonder how you cannot be aware of it. Let me repeat what I've said before. I spent countless hours preparing translatewiki.net for git. Updating and committing to hundreds of repositories takes minutes. I'm glad we were able to automate that, except that it took two months to get it working as it was supposed to work right after the switch. 3rd-party users (me included) are all still left out in the cold: there are no docs or scripts to manage an installation with a subset of extensions from svn and git. Barely a day goes by without someone having some kind of problem with git or gerrit. And then we have these discussions about it. Where is the solution for my review workflow? Ever since the switch I spend more time on code review to review less code. On an individual level the wasted productivity might not seem like much (but it is: [1][2]); adding it all together, it is a lot. In your mind, do a rough estimate of all the time spent on preparing and executing the switch, plus all the time spent on learning and solving the issues. Then convert that to dollars and you get a rough price tag for the switch.

[1] Each new person needs to spend tens of hours learning to use our GiGeGat effectively -- if they don't give up
[2] Some people have waited multiple days for commit access and the creation of new repositories

-Niklas

--
Niklas Laxström
Re: [Wikitech-l] Serious alternatives to Gerrit
On Jul 24, 2012, at 9:09 PM, Sumana Harihareswara wrote:
> On 07/17/2012 08:41 PM, Rob Lanphier wrote:
>> It would appear from reading this page that the only alternative to Gerrit that has a serious following is GitHub. Is that the case?

There's some irony, yet it's so apropos, that now that Gerrit is finally stabilizing we're discussing alternatives. :-) Oh well… Here's my 2c on GitHub…

In an ignore-reality world, I suppose my personal choices would be 1) GitHub; 2) Phabricator; 3) everything else. But let's cross GitHub off that list (for WMF). Maybe in some future when our development process more closely models a seat-of-the-pants startup universe of "code first, break often, recover fast" we could consider GitHub for hosting some of our public repositories, but I don't see that happening anytime soon (ever?). … The nonstarter is that while we could host the public repositories, we do have a lot of non-public stuff in Gerrit right now. That stuff can't go into the cloud. Well, on to specifics…

> But I have a lot of reservations about using GitHub as our primary source control and code review platform. There's the free-as-in-freedom issue, of course,

Personally I think this ship sailed the day we used Google Apps for e-mail. :-)

> but I'm also concerned about flexibility, account management, fragmentation of community and duplication of tools, and their terms of service. == Flexibility == I see GitHub as kind of like a Mac.

This trope is too facile. But I do agree with what you are alluding to, which is that while it's fine for some, that doesn't mean it's fine for us -- especially us, in our current development process.

> It has a nice UI for the use case that its creators envision. It's fine for personal use.

A great many very large open source projects are currently using or hosted at GitHub (including node, jQuery, and our Android/PhoneGap app ;-))

> And if we try it, everything'll be great until we smack into an invisible brick wall.
> We'll want to work around one little thing, the way that we sneak around various issues in Gerrit, with hacks and searches and upgrades; if it's not in GitHub's web UI or API [3], we'll be stuck.

The API is simplistic but serviceable. However, the satellite tools built around it are either part of GitHub (their internal issue tracker, their own Ruby-based wiki Gollum, etc.) or are mostly commercial/cloud-based. For instance, among tools that assist in deployment from a GitHub repository (even if that were feasible for us, which it isn't), most seem to carry a hidden assumption that their users are Web 2.0 companies deploying on AWS… not to mention that usage of those tools clearly violates our policy and values.

> Right now we have our primary Git repo on our own machines, which is the ultimate backdoor. The way we have been modifying our tools, automating certain kinds of commits (like with l10n-bot), troubleshooting by looking at our logfiles, and generally customizing things to suit our weird needs -- GitHub is closed source and won't let us do that. We are not the typical use case for GitHub. Since we have hundreds of extensions, each with their own repository, we would have way more repositories and members than almost any other organization on there. So, one example: arbitrary sortability of lists of repositories. We could mod Gerrit to do it, but not GitHub. How would we centralize and list the repositories so they're easy to browse, search, edit, follow, and watch together? It looks like GitHub's less suitable for that, but I'd welcome examples of orgs that create their own sub-GitHub hubs.

Well, GitHub's modality doesn't prevent operating on the git repository through the API. But I agree: where is the support on our end for doing/writing these things when we already have something serviceable in Gerrit?

> == Accounts == By using GitHub, we would no longer be managing the user accounts.
> This would make single sign-on with other Wikimedia services (especially Labs) completely impossible.

Technically this integration could be done by users authorizing us to their accounts via OAuth2. It's not the same thing as what you're saying, though… it's kind of the opposite of what you're saying. What you want is what GitHub Enterprise is for.

> I mentioned above that GitHub seems more meant for single FLOSS projects than for confederations of related repositories. GitHub does not have the concept of groups, so granting access to collections of repos would be a time-consuming process. GitHub does not support branch-level permissions, either (it encourages forking and then merging back to master), and that does not seem as suitable for long-term collaborative branches.

This isn't quite true. GitHub does have the concept of groups (you can create as many as you want and control
Re: [Wikitech-l] Serious alternatives to Gerrit
I think Ori's comment that touched this off was tongue-in-cheek. :-)

On Jul 25, 2012, at 12:40 AM, Roan Kattouw wrote:
> On Tue, Jul 24, 2012 at 10:46 PM, Alolita Sharma alolita.sha...@gmail.com wrote:
>>> Cool, Priestley is awesome. If he comes to visit we should prevent him from leaving :)
>> +1. We should definitely think about adopting Phabricator as a project if we're going to invest in its core developer.
> I think this should be reversed. If we're gonna use Phabricator, we should attract its core developer. We shouldn't switch code review systems just because we hired a famous person. I personally think that something like GitLab would be superior to Phabricator, but that's another discussion. Roan
Re: [Wikitech-l] Serious alternatives to Gerrit
On Jul 25, 2012, at 12:39 AM, Ryan Lane wrote:
> Many claims that Gerrit sucks tend to be due to misunderstandings of how Gerrit works. Many other claims are due to our workflow or our current access restrictions. Of course, many claims are legitimate and we are reporting the issues, tracking them upstream, and in some cases pushing in fixes.

In their defense, I think a lot of it has to do with terrible UI/UX in Gerrit. The basics can be modified with CSS and templates (I believe we've done some of that), but that only goes so far. How do I modify JavaScript in Gerrit? I think it starts (and ends) somewhere in the hell that is GWT… and when GWT's pitch begins with something about how awesome it is to be able to write Ajax stuff in Java, I stop reading: http://www.flickr.com/photos/tychay/1388234558/ I've not added this complaint because David already put 90% of it in the #1 reason on his list. ;-)

Related is the fact that we seem to have a lot of PHP web dev expertise (for some reason), and Gerrit went from Python (serviceable) to Java (totally opaque). Apologies to those of you at the WMF who lurv themselves some Java… all two of you… and one of you is probably the guy who wrote the case against …

But it is true that a lot of the griping was related to people coming from an SVN/CVS model not understanding the Git model at all (in my more cynical moments, I feel that neither do Gerrit's developers :-D ).

Take care,

terry
Re: [Wikitech-l] Serious alternatives to Gerrit
On Wed, Jul 25, 2012 at 7:29 AM, Terry Chay tc...@wikimedia.org wrote:
> Related is the fact that we seem to have a lot of PHP web dev expertise (for some reason) and Gerrit went from Python (serviceable) to Java (totally opaque). Apologies to those of you at the WMF who lurv themselves some Java… all two of you… and one of you is probably the guy who wrote the case against

The more I've thought about it, the less I feel the language it's written in really matters at all. The number of people contributing upstream is always going to be relatively small, and as long as those who /want/ to contribute upstream are comfortable with it, it could be written in Cobol for all I care. It struck me the other day when the subject of bug-tracking tools came up again: had we been using $SOME_OTHER_PRODUCT and people were advocating switching to Bugzilla, I'm sure people would complain "omg, it's Perl -- we can't contribute upstream." But in reality, how many people *have* contributed upstream to Bugzilla? Most people file bugs in our tracker and they get re-filed upstream, which is perfectly fine as long as there's an upstream who responds -- and in this case there is. I think the choice of platform matters when we're talking about ease of installation/upgrading, to some degree, so we don't make the ops angry; but that's a total non-issue with Gerrit because installation/upgrades are very, very easy :)

-Chad
Re: [Wikitech-l] Creating a centralized access point for propriety databases/resources
Thanks for the tip! I'm trying to understand the differences between:
*phpMyAdmin
*SAML
*OpenID
*OpenVPN
Could you give me a quick insight into how they differ, their strengths/weaknesses, etc.? More details for The Wikipedia Library concept are at http://enwp.org/WP:TWL Cheers!

Jake Orlowitz
Wikipedia editor: Ocaasi http://enwp.org/User:Ocaasi
wikioca...@yahoo.com
484-380-3940

From: Derk-Jan Hartman d.j.hartman+wmf...@gmail.com
To: Ocaasi Ocaasi wikioca...@yahoo.com; Wikimedia developers wikitech-l@lists.wikimedia.org
Sent: Wednesday, July 25, 2012 4:26 AM
Subject: Re: [Wikitech-l] Creating a centralized access point for propriety databases/resources
> You could always create an OpenVPN gateway that provides access. Many edu institutions have the same setup to access those resources. DJ
Re: [Wikitech-l] Creating a centralized access point for propriety databases/resources
I can cover some of these:

*phpMyAdmin: This is an open source database manager for MySQL databases -- it won't work for what you want.

*SAML / *OpenID: From the page you link it looks like you know about these two; i.e. they act as sign-in gateways. OpenID is more indie, SAML is more enterprise -- otherwise there are no major differences in what they can achieve. The major bar to entry is getting the providers to add the ability to sign in using one of these methods. I'd personally recommend selecting OpenID, as it could then be used for a wider variety of logins around the web. AFAIK resources like Athens (i.e. similar to what you appear to want) tend to use SAML.

*OpenVPN: A VPN means setting up access to a pre-authorised network, through which you can then reach the restricted resource. I don't think it fits your use case.

Tom

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
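Besides the sign-in question, the "single gateway in front of many databases" idea the original post describes is essentially what library proxy services (EZproxy and similar) do: the user logs in once, and links to participating databases are rewritten so that all traffic flows through the gateway, which holds the actual subscription credentials. A minimal sketch of that URL-rewriting core, purely illustrative (the hostnames, `GATEWAY_HOST`, and `rewrite_for_proxy` are all made up for this example, not part of any existing system):

```python
# Illustrative sketch of the URL-rewriting idea behind a library proxy
# gateway. All names and hostnames here are hypothetical.

from urllib.parse import urlsplit, urlunsplit

# Databases the gateway is allowed to proxy (illustrative hostnames).
APPROVED_DATABASES = {"www.jstor.org", "www.credoreference.com", "www.highbeam.com"}

GATEWAY_HOST = "library-gateway.example.org"  # hypothetical gateway hostname


def rewrite_for_proxy(url: str) -> str:
    """Rewrite a database URL so the request goes through the gateway.

    Uses the common "hostname suffix" scheme: the database hostname is
    folded into the gateway's hostname, so www.jstor.org/stable/123
    becomes www.jstor.org.library-gateway.example.org/stable/123. The
    gateway then forwards the request using its own credentials.
    """
    parts = urlsplit(url)
    if parts.hostname not in APPROVED_DATABASES:
        raise ValueError(f"{parts.hostname} is not a participating database")
    proxied_host = f"{parts.hostname}.{GATEWAY_HOST}"
    return urlunsplit((parts.scheme, proxied_host, parts.path, parts.query, parts.fragment))
```

The authentication piece (OpenID, SAML, or a plain login form) then only has to protect the gateway itself, rather than every participating database.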
Re: [Wikitech-l] Creating a centralized access point for propriety databases/resources
Hi,

This looks similar to something I have been thinking about recently. However, I would go about it using OpenID, though that would require all the database sites to support OpenID. I think the extensions to do this with MediaWiki exist, but WMF projects do not trust/support this method of authentication. If all parties were to support this standard, it would be possible to develop a gadget which could log users into all the sites at once. Do you know how many users have been granted access to each database? This would be useful for estimating the importance/impact of this project.

Oren Bochman
Re: [Wikitech-l] Creating a centralized access point for propriety databases/resources
We currently have relationships with three separate resource databases:
*HighBeam, 1000 authorized accounts, 700 active (http://enwp.org/WP:HighBeam)
*JSTOR, 100 accounts, all active (http://enwp.org/WP:JSTOR)
*Credo, 400 accounts, all active (http://enwp.org/WP:CREDO)

No parties have agreed to participate in The Wikipedia Library *yet*, as it's still in the concept stage, but my initial projection is that 1000 editors would have access, with 100 additional users granted access per year. One of the challenges will be getting all the resource providers to agree on that number, but the hope is that once some do, it will create a cascade of adoption. So we're not looking at *thousands* of users, but more likely several hundred. Still, given the impact of our most active editors, 1000 of them with access to the library would have a significant impact. After all, we can't cannibalize these databases' subscription business by opening the library to ''all'' editors; it must be a carefully selected and limited group.
Re: [Wikitech-l] Creating a centralized access point for propriety databases/resources
Ocaasi, please centralize your notes, ideas, and plans regarding this here: https://www.mediawiki.org/wiki/AcademicAccess I know Chad Horohoe, Ryan Lane, and Chris Steipp might have things to say about this; per https://www.mediawiki.org/wiki/Wikimedia_Engineering/2012-13_Goals#Activities_12 their team aims to work on OAuth and OpenID within the next 11 months, and AcademicAccess is a possible beneficiary of that. Thanks!

--
Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation
[Wikitech-l] [Wikimedia-l] Questions about browser geolocalization improvement proposal
Hi all, I have started a thread on the Wikimedia-l mailing list[1a][1b] about a proposal for improvement[2] of the current geolocation system used by Geonotice[3]. In the follow-up e-mails we had some questions about how geolocation from the browser works. In particular: with geolocation via the browser, can one obtain information about who did or did not receive a geotargeted message? Is it possible to store the geolocation data or not? Summing up, we are wondering whether this system respects the privacy of users. Can you help us answer those concerns?

-- Forwarded message --
From: Cristian Consonni kikkocrist...@gmail.com
Date: 2012/7/25
Subject: Re: [Wikimedia-l] Geolocalization improvement proposal
To: Wikimedia Mailing List wikimedi...@lists.wikimedia.org

2012/7/24 birgitte...@yahoo.com:
> On Jul 24, 2012, at 11:14 AM, Cristian Consonni kikkocrist...@gmail.com wrote:
>> 2012/7/24 birgitte...@yahoo.com:
>>> The main question is whether the benefit from being able to connect people with local events is worth the risk of collecting more personalized data than we are accustomed to handling.
>> I could be wrong, but I don't think we would handle more data than we do now. We are not going to use that data, and as far as I know the data dies the moment the system has output the message.
> Maybe this is the area that needs more study. And I am probably the wrong person to try and even formulate technical questions, but is there a way to make use of this data without storing it? Without even knowing who received what personalized messages, unless, of course, they choose to respond?

Thanks in advance,
Cristian

[1a] http://lists.wikimedia.org/pipermail/wikimedia-l/2012-July/121236.html
[1b] http://lists.wikimedia.org/pipermail/wikimedia-l/2012-July/121281.html
[2] http://meta.wikimedia.org/wiki/Geonotice
[3] http://en.wikipedia.org/wiki/Wikipedia:Geonotice
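On the "use the data without storing it" question: one common privacy-preserving pattern is to ship the region definitions to the client and let the browser test its own coordinates locally, so the server only serves a static list of notices and never sees (or stores) anyone's location. The sketch below illustrates that membership test in Python (in a real Geonotice-style deployment it would run as JavaScript in the browser); the notices and bounding boxes are invented for the example:

```python
# Illustrative sketch of client-side geotargeting: the server publishes
# NOTICES; the membership test runs on the user's machine, so the user's
# coordinates never leave the browser and nothing needs to be stored.
# All notices and coordinates here are hypothetical.

# Hypothetical geotargeted notices, each with a (min, max) bounding box.
NOTICES = [
    {"message": "Wiki meetup in Milan", "lat": (45.0, 46.0), "lon": (8.5, 9.5)},
    {"message": "Hackathon in Berlin", "lat": (52.3, 52.7), "lon": (13.0, 13.8)},
]


def notices_for(lat: float, lon: float) -> list:
    """Return the messages whose bounding box contains (lat, lon).

    In the browser-based deployment this function is the only consumer of
    the coordinates, so the server cannot know who received which notice
    unless the user chooses to respond.
    """
    return [
        n["message"]
        for n in NOTICES
        if n["lat"][0] <= lat <= n["lat"][1] and n["lon"][0] <= lon <= n["lon"][1]
    ]
```

With this design, the answer to Birgitte's question is yes: the geotargeting decision can be made without the operator storing, or even receiving, any personalized location data.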
Re: [Wikitech-l] ANN: Multilingual Linked Open Data for Enterprise (MLODE) Leipzig, September 23-24-25, 2012
Hi Sebastian, I think we could contribute something to the meeting. What do you think? What form would make the most sense? Regards, Denny

2012/7/24 Sebastian Hellmann hellm...@informatik.uni-leipzig.de: *Save the date: Leipzig, Germany, 23-24-25 September 2012 http://sabre2012.infai.org/mlode Co-located with the Leipziger Semantic Web Day: http://aksw.org/lswt == Multilingual Linked Open Data for Enterprises == MLODE will bring together developers, data producers, academia and enterprises, and connect people, communities, data and industrial use cases. The workshop will be very interactive and you are expected to help us achieve common goals: * bootstrap and build a Linguistic Linked Open Data Cloud (LLOD): http://linguistics.okfn.org/resources/llod/ * establish best practices for multilingual linked open data * create incentives for businesses and lower the barrier to participation in LOD for natural language processing and internationalisation and localisation enterprises. We are expecting intensive participation by members of the following communities (these are teasers; see the **detailed descriptions for each community** further below): * DBpedia (http://dbpedia.org): DBpedia International now has over 10 language-specific chapters (such as http://el.dbpedia.org/). At the MLODE workshop there will be a DBpedia developers meetup. We will discuss the "Future of DBpedia" and create a common road map. If you want to get more involved in DBpedia, the workshop will be a good opportunity to meet the team. * Working Group for Open Data in Linguistics (OWLG, http://linguistics.okfn.org/): Now is the time to get your data into the LLOD cloud! We have created a development team that will convert your data to RDF and help establish links: http://code.google.com/p/mlode/. Please submit your data sets soon! (Furthermore, we will have a legal session to discuss licensing issues.)
* Multilingual Web (http://www.multilingualweb.eu): Free, open data and lexica; we will have a session discussing best practices for multilingual linked open data (http://mlode.okfnpad.org/best-practices-multilingual-lod) and compatibility with the RDF world via ITS 2.0. * Apache Stanbol (http://incubator.apache.org/stanbol/): Enterprises will have the chance to present their use cases during lightning talks, and we will have an Apache Stanbol booth and an install fest to show hands-on how combined usage of public and closed data can be achieved and what benefits firms can gain from using these rapidly increasing data pools. * Ontolex W3C Community Group (http://www.w3.org/community/ontolex/): The Monnet Challenge will provide a data bounty for developers who convert data sets using lemon. * Also: NLP2RDF (http://nlp2rdf.org) - the NIF project, DBpedia Spotlight (http://spotlight.dbpedia.org), Wiktionary2RDF (http://dbpedia.org/Wiktionary) How you can contribute: * Contact us if you are an enterprise and want to prepare a small presentation/lightning talk about your business use cases (using LOD) or problems you have (please see below for details) * Contact us if you want to give a short presentation on a relevant topic * We are looking for a sponsor for a DBpedia booth * Submit your data sets for the LLOD: http://code.google.com/p/mlode/ * Become a sponsor of the workshop: http://sabre2012.infai.org/mlode/funding?#sponsorship * Or donate money and help the individual communities: http://sabre2012.infai.org/mlode/Funding DBpedia is a good example of a freely available and open data set that was generated by crowd-sourcing and academia, and it has provided immense value to businesses and industry. We want to build on and continue this success in the areas of natural language processing enterprises and the internationalisation and localisation industries. 
The goal of the workshop is to bootstrap a Multilingual Linked Open Data cloud by bringing together many different linked open data sets and by creating synergy among different research and business communities. This workshop is aimed at researchers, industry, and commercial consumers of data produced by research. We hope for mutual benefits between (potentially non-commercial) data providers and enterprises: open source and open licences for software have shown that they can be successful in a commercial environment. How can we transfer these models to Multilingual Linked Open Data? And how can the transformation of currently monolingual Linked Open Data sources into a Multilingual Web of Open Data spur cross-linguistic research and commercial applications in internationalisation and localisation enterprises? = Sponsors = We would like to thank our sponsors for supporting the workshop: * The **MultilingualWeb-LT Working Group**
Re: [Wikitech-l] Creating a centralized access point for proprietary databases/resources
Hi Ocaasi, I agree that tighter work with the database providers is in order. 1000+ accounts for top contributors can make a significant impact on Wikipedia fact checking. Based on my experience at university (where I taught a lab class on reference database usage), there are many more options for how to do this. Most users in universities do not need to log in at all: they work within an IP range that is enabled for database access. Research libraries also implement floating licenses for databases that have limited access options. However, to implement this it is often necessary to work with a large database aggregator (which solves the tech issues), and the rest is handled by the operations staff of a university. Oren Bochman -Original Message- From: wikitech-l-boun...@lists.wikimedia.org [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Sumana Harihareswara Sent: Wednesday, July 25, 2012 4:16 PM To: Ocaasi Ocaasi; Wikimedia developers Subject: Re: [Wikitech-l] Creating a centralized access point for proprietary databases/resources Ocaasi, please centralize your notes, ideas, and plans regarding this here: https://www.mediawiki.org/wiki/AcademicAccess I know Chad Horohoe, Ryan Lane, and Chris Steipp might have things to say about this; per https://www.mediawiki.org/wiki/Wikimedia_Engineering/2012-13_Goals#Activities_12 their team aims to work on OAuth and OpenID within the next 11 months, and AcademicAccess is a possible beneficiary of that. Thanks! -- Sumana Harihareswara Engineering Community Manager Wikimedia Foundation On 07/25/2012 10:03 AM, Ocaasi Ocaasi wrote: We currently have relationships with three separate resource databases. 
*HighBeam, 1000 authorized accounts, 700 active (http://enwp.org/WP:HighBeam) *JSTOR, 100 accounts, all active (http://enwp.org/WP:JSTOR) *Credo, 400 accounts, all active (http://enwp.org/WP:CREDO) No parties have agreed to participate in The Wikipedia Library *yet*, as it's still in the concept stage, but my initial projection is that 1000 editors would have access to it, and that 100 additional users would be granted access per year. One of the challenges will be getting all the resource providers to agree on that number, but the hope is that once some do, it will create a cascade of adoption. So we're not looking at *thousands* of users, but more likely several hundred. Still, given the impact of our most active editors, 1000 of them with access to the library would have significant impact. After all, we can't cannibalize these databases' subscription business by opening the library to ''all'' editors. It must be a carefully selected and limited group. -Original Message- From: wikitech-l-boun...@lists.wikimedia.org [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Ocaasi Ocaasi Sent: Monday, July 23, 2012 6:22 PM To: wikitech-l@lists.wikimedia.org Subject: [Wikitech-l] Creating a centralized access point for proprietary databases/resources Hi Folks! The problem: Many proprietary research databases have donated free access to select Wikipedia editors (Credo Reference, HighBeam Research, JSTOR). Managing separate account distribution for each service doesn't scale well. The idea: Centralize access to these separate resources behind a single secure (firewalled) gateway, to which accounts would be given to a limited number of approved users. After logging in to this single gateway, users would be able to enter any of the multiple participating research databases without needing to log in to each one separately. The question: What are the basic technical specifications for setting up such a system? What are the open source options, ideally? What language would be ideal? 
What is required to host such a system? Can you suggest a sketch of the basic steps necessary to implement such an idea? Any advice, from basics to details, would be greatly appreciated. Thanks so much! Ocaasi http://enwp.org/User:Ocaasi ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Serious alternatives to Gerrit
On Wed, Jul 25, 2012 at 4:17 AM, Terry Chay tc...@wikimedia.org wrote: I think Ori's comment that touched this off was tongue-in-cheek. :-) Indeed, we're not hiring anyone at this point :). We're meeting folks from different code review projects as part of the process. We're also trying to connect in person with the Gerrit folks, probably in the same week. -- Erik Möller VP of Engineering and Product Development, Wikimedia Foundation Support Free Knowledge: https://wikimediafoundation.org/wiki/Donate ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] branched article history extension [Re: Article revision numbers]
Hi, I've started working on an extension to manage branching history, calling it Nonlinear. Here's the crude code: https://github.com/adamwight/Nonlinear Screenshot of the effect on revision history: [image: mediawiki screenshot] On 07/16/2012 04:10 PM, Platonides wrote: On 17/07/12 00:22, Adam Wight wrote: Hello comrades, I've run into a challenge too interesting to keep to myself ;) My immediate goal is to prototype an offline wikipedia, similar to Kiwix, which allows the end-user to make edits and synchronize them back to a central repository like enwiki. The catch is, how to insert these changes without edit conflicts? With linear revision numbering, I can't imagine a natural representation of the data, only some kind of ad-hoc sandbox solution. Extending the article revision numbering to represent a branching history would be the natural way to handle optimistic replication. Non-linear revisioning might also facilitate simpler models for page protection, and would allow the formation of multiple, independent consensuses. -Adam Wight Actually, the revision table allows for non-linear development (it stores from which version you edited the article). You could even make a version other than the one with the latest timestamp win (by changing page_rev). You will need to change the way of viewing history, however, and add a system to keep track of heads and merges. There may be some assumptions across the codebase about the latest revision being the active one, too. ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
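Platonides' observation that storing the parent of each revision turns a page's history into a graph with heads and merges can be sketched with a toy model. This is plain Python for illustration only, not MediaWiki's actual schema: each revision records its parent revision ids, heads are revisions nobody has built on, and a merge is simply a revision with two parents.

```python
class PageHistory:
    """Toy non-linear revision store: each revision records its parents.

    Illustrates how parent pointers make history a DAG; an offline edit
    synced later just becomes a second head instead of an edit conflict.
    """

    def __init__(self):
        self.revisions = {}  # rev_id -> list of parent rev_ids
        self.next_id = 1

    def save(self, parents=()):
        rev_id = self.next_id
        self.next_id += 1
        self.revisions[rev_id] = list(parents)
        return rev_id

    def heads(self):
        """Revisions that no other revision lists as a parent."""
        referenced = {p for parents in self.revisions.values() for p in parents}
        return sorted(r for r in self.revisions if r not in referenced)


h = PageHistory()
r1 = h.save()          # initial revision
r2 = h.save([r1])      # online edit
r3 = h.save([r1])      # offline edit, synced later: a branch, not a conflict
assert h.heads() == [r2, r3]   # two heads, so a merge is needed
r4 = h.save([r2, r3])  # merge revision with two parents
assert h.heads() == [r4]
```

The "system to keep track of heads and merges" Platonides mentions is exactly the `heads()` query plus merge revisions with multiple parents.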
Re: [Wikitech-l] Serious alternatives to Gerrit
* What does GitHub Enterprise buy us? Which of these issues would that fix? It's a self-hosted GitHub. It would allow us to have private repositories (good for deploys, ops, etc.) and manage our own user database (we could integrate with our own auth system), and probably waives the 13-and-under rule above. The price is too steep since it's a per-seat license. A nonstarter if the WMF is going to have to pay for every potential developer who wants to attach. As mentioned before, we can't use GitHub Enterprise at all, since it doesn't allow for hosting public repos. Let's ignore that it even exists. We do need a GitHub strategy -- to make our projects more discoverable, make use of more contributions, and participate in the GitHub reputational economy. So we must figure out the right ways to mirror and sync. But I doubt our own long-term needs would work well with using GitHub as our main platform. I'm 1000% with you on this. We should definitely at some point mirror our code on GitHub like the PHP project does http://www.php.net/git.php. Being able to publish and handle pull requests coming from GitHub would be a nice feature in Gerrit or any replacement. It'd be nice if others could have their own MW extensions or versions of extensions and core on GitHub and pull from us (and us from them), esp. for extensions that may need some love or have changes that don't satisfy the WMF code quality bar. Well, we can enable replication from Gerrit to GitHub. We haven't done so yet, but it's a feature that's available. - Ryan ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
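For reference, the replication Ryan mentions is driven by Gerrit's replication plugin, configured in `etc/replication.config` (gitconfig format). A minimal sketch that mirrors every Gerrit project to a GitHub organization might look like this; the organization name and SSH key setup here are assumptions, and `${name}` is the plugin's placeholder for the project name:

```ini
[remote "github"]
    url = git@github.com:wikimedia/${name}.git
    push = +refs/heads/*:refs/heads/*
    push = +refs/tags/*:refs/tags/*
```

The empty GitHub repositories have to exist (or be created via the GitHub API) before the first push; the plugin only replicates, it does not create remote repos on GitHub.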
Re: [Wikitech-l] Creating a centralized access point for proprietary databases/resources
I'm trying to understand the differences between: *phpMyAdmin *SAML *OpenID *OpenVPN You should only consider SAML and OpenID. More exactly, you should really only consider SAML, since the resources you are trying to connect to only support SAML, and not OpenID. We can use OpenID for proxied access to resources that don't support SAML, but it's very likely nearly all of the resources we're trying to access support SAML. Ideally we'd integrate central auth with something that supports multiple protocols. SimpleSAMLPHP supports SAML, OpenID, OAuth and a few other protocols. It also can handle the circles of trust that we'd need to create with the libraries/universities. - Ryan ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Serious alternatives to Gerrit
As mentioned before, we can't use GitHub Enterprise at all, since it doesn't allow for hosting public repos. Let's ignore that it even exists. Given that Wikipedia is one of the top 10 most visited sites on the Internet, we might be able to work out something special with them, though, right? I'm not saying we have to go down that route, nor have I even examined all the advantages and disadvantages of the idea. I feel, though, that the possibility exists and should be looked into. If I were GitHub and the WMF approached me about potentially using GitHub Enterprise for MediaWiki and MediaWiki extensions and NOT for creating a competing service to GitHub, then I would likely entertain the idea of crafting a special set of terms for them. Furthermore, I personally might even charge them differently, given that charging per developer would be crippling to an open-source project. Of course, I am not GitHub, nor can I anticipate what they might do or their internal policies, but I can speak for myself and how I would run a FOSS-focused company. Thank you, Derric Atzrott ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Creating a centralized access point for proprietary databases/resources
@ Ryan, If you say SAML is the best approach, then that's what we'll use. OpenID can be a backup for those that are not SAML compatible for some reason. @ Oren, we want to make it so that the vast majority of the work is done on our end if possible. Ideally, participating resource donors wouldn't have to do anything to their websites at all. That may not be realistic, but it's the direction I'd like to lean. Jake Orlowitz Wikipedia editor: Ocaasi http://enwp.org/User:Ocaasi wikioca...@yahoo.com From: Ryan Lane rlan...@gmail.com To: Ocaasi Ocaasi wikioca...@yahoo.com; Wikimedia developers wikitech-l@lists.wikimedia.org Cc: Derk-Jan Hartman d.j.hartman+wmf...@gmail.com Sent: Wednesday, July 25, 2012 2:04 PM Subject: Re: [Wikitech-l] Creating a centralized access point for propriety databases/resources I'm trying to understand the differences between: *phpMyAdmin *SAML *OpenID *OpenVPN You should only consider SAML and OpenID. More exactly, you should really only consider SAML, since the resources you are trying to connect to only support SAML, and not OpenID. We can use OpenID for proxied access to resources that don't support SAML, but it's very likely nearly all of the resources we're trying to access support SAML. Ideally we'd integrate central auth with something that supports multiple protocols. SimpleSAMLPHP supports SAML, OpenID, OAuth and a few other protocols. It also can handle the circles of trust that we'd need to create with the libraries/universities. - Ryan ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
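The shape of what Jake describes, one login at a Wikimedia-run gateway with each participating database trusting the gateway's assertion rather than keeping its own account list, is essentially SAML's identity-provider/service-provider split. A toy sketch of that trust relationship follows; the function names and the HMAC scheme are illustrative only (real SAML uses signed XML assertions and asymmetric keys, and SimpleSAMLphp's API looks nothing like this):

```python
import hashlib
import hmac
import secrets

# Key the gateway shares with trusted resource providers (assumption:
# in real SAML this would be the IdP's public signing certificate).
GATEWAY_KEY = secrets.token_bytes(32)

def issue_assertion(username):
    """Gateway (IdP role): sign a statement that `username` logged in here."""
    sig = hmac.new(GATEWAY_KEY, username.encode(), hashlib.sha256).hexdigest()
    return {"user": username, "sig": sig}

def provider_accepts(assertion):
    """A database (SP role): verify the gateway's signature instead of
    keeping its own password list for Wikipedia editors."""
    expected = hmac.new(GATEWAY_KEY, assertion["user"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, assertion["sig"])

token = issue_assertion("Ocaasi")
assert provider_accepts(token)          # one gateway login works everywhere
assert not provider_accepts({"user": "Ocaasi", "sig": "forged"})
```

The point of the sketch is the division of labor: the providers never see a password, which is why "the vast majority of the work is done on our end" is realistic once a provider supports SAML.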
Re: [Wikitech-l] Serious alternatives to Gerrit
On Tue, Jul 24, 2012 at 10:26 PM, Erik Moeller e...@wikimedia.org wrote: As one quick update, we're also in touch with Evan Priestley, who's no longer at Facebook and now running Phabricator as a dedicated open source project and potential business. If all goes well, Evan's going to come visit WMF sometime soon, which will be an opportunity to seriously explore whether Phabricator could be a viable long term alternative (it's probably not a near term one). Will post more details if this meeting materializes. Evan pointed me to the following pages which give some hints at Phabricator's longer term direction (requires logging in): Project roadmap: https://secure.phabricator.com/w/roadmap/ Plans for repo/project level permission management: https://secure.phabricator.com/T603 In prep for this, I started a section about Phabricator as an alternative. https://www.mediawiki.org/w/index.php?title=Git%2FGerrit_evaluation&diff=565357&oldid=565266 Steven ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Serious alternatives to Gerrit
On Wed, Jul 25, 2012 at 02:21:03PM -0400, Derric Atzrott wrote: As mentioned before, we can't use github enterprise at all, since it doesn't allow for hosting public repos. Let's ignore that it even exists. I feel like as Wikipedia is one of the top 10 most visited sites on the Internet we might be able to work out something special with them though right? I'm not saying we have to go down that route, nor have I even examined all the advantages and disadvantages of the idea. I feel though that the possibility exists though and should be looked into. If I was GitHub and the WMF approached me about potentially using GitHub Enterprise for MediaWiki and MediaWiki extensions and NOT for creating a competition service to GitHub, then I would likely entertain the idea of crafting a special set of terms for them. Furthermore, I personally might even charge them differently given that that charging per developer would be crippling to an open-source project. Of course, I am not GitHub, nor can I anticipate what they might do, nor their internal policies, but I can speak for myself and how I would run a FOSS focused company. I think the BitKeeper story is relevant here: BitKeeper was one of the first DVCSes. It was (is?) a proprietary for-profit tool that gave special free licenses to certain free/open-source software projects, like Linux. Linux was using it for years, due to having some features that were unique at the time (and Linus liking it), although it was a controversial choice in the community. At one point, due to some reverse engineering (basically typing help at its server and showing that in a talk) by some community members, the company behind BitKeeper decided to revoke this free (as in beer) license from the community members, effectively halting Linux development. Git was first published a week after that. Now, the situation is a bit different here. 
But I can certainly imagine getting this special license exception revoked, GitHub Enterprise discontinued as a product, or whatever else. Can you imagine the disruption that would cause to our development and operations? Regards, Faidon ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] suggestion: replace CAPTCHA with better approaches
Hi, Wikipedia's CAPTCHA is a great opportunity for getting useful work done by humans. This is now called a [[game with a purpose]]. I think we can ideally use it to help: * OCR Wikisource text, like reCAPTCHA does * translate article fragments, using geo-location of editors: Translate [xyz-known] [...] Translate [xyz-new] [...], checked using the BLEU metric, etc. * get more opinions on spam edits: Is this diff [spam] [good faith edit] [ok]? * collect linguistic information on different language editions: Is XYZ a [verb] / [noun] / [adjective] ... [other]? * disambiguate: Is [xyz-known] [xyz] ... [xyz] ... [xyz] ... Is [yzx-unknown] [yzx1] ... [yzx1] ... [yzx1] ... etc. This way, if people feel motivated to cheat at the CAPTCHA, they will end up helping Wikipedia; it is up to us to balance things out. I'm pretty sure users will be less annoyed at solving CAPTCHAs that actually contribute some value. -Original Message- From: wikitech-l-boun...@lists.wikimedia.org [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of matanya Sent: Tuesday, July 24, 2012 4:12 PM To: wikitech-l@lists.wikimedia.org Subject: [Wikitech-l] suggestion: replace CAPTCHA with better approaches Over the last few months the spam rate stewards deal with has been rising. I suggest we implement a new mechanism: instead of giving the user a CAPTCHA to solve, give him an image from Commons and ask him to add a brief description in his own language. We can give him two images, one with a known description and the other with an unknown one; after enough users translate the unknown one in the same way, we can use it as a verified translation. We rely on the known image description to decide whether to allow the user to create the account. Is it possible to embed a file from Commons in the login page? Is it possible to parse the entered text and store it? Benefits: A) it would be harder for bots to create automated accounts. B) We will get translations into many languages with little effort from the users signing up. 
What do you think? ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
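The pairing scheme in matanya's proposal, one challenge with a known answer that gates the user and one unknown challenge whose answers are aggregated until enough users agree, can be sketched as follows. This is a simplified model of the reCAPTCHA-style approach, not an existing extension; the item names and the consensus threshold are made up:

```python
from collections import Counter

CONSENSUS = 3  # answers that must agree before an unknown item is "verified"

known = {"img_cat.png": "cat"}              # gold items with trusted answers
unknown_votes = {"img_new.png": Counter()}  # items we want labeled

def check_captcha(known_item, known_answer, unknown_item, unknown_answer):
    """Gate on the known item; harvest the answer for the unknown one."""
    if known[known_item].lower() != known_answer.strip().lower():
        return False                        # failed the gold item: reject
    unknown_votes[unknown_item][unknown_answer.strip().lower()] += 1
    return True

def promote_verified():
    """Move unknown items to the trusted set once consensus is reached."""
    for item, votes in list(unknown_votes.items()):
        if not votes:
            continue
        answer, count = votes.most_common(1)[0]
        if count >= CONSENSUS:
            known[item] = answer
            del unknown_votes[item]

for _ in range(3):
    assert check_captcha("img_cat.png", "cat", "img_new.png", "dog")
promote_verified()
assert known["img_new.png"] == "dog"   # the crowd's label is now trusted
```

Because the user cannot tell which of the two items is the gold one, cheating on the unknown item does not pay, which is the property that makes reCAPTCHA-style harvesting work.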
Re: [Wikitech-l] Category Internationalization.
I can get all the categories with the ParserOutput, but I can't add or delete a category. I'm using updateCategoryCounts() in LinksUpdate but it doesn't work. On Tue, Jul 24, 2012 at 3:23 PM, Platonides platoni...@gmail.com wrote: On 24/07/12 19:45, Mauricio Etchevest wrote: Hi! I'm working on an extension for MediaWiki, and I need to detect when a categorization is made on an article. So I search for the annotation with the keyword 'category', but then I need to detect categorizations in other languages. How can I get the translation of the keyword 'category'? Which is the best way to add/remove an article from a category? Thanks! You're doing it the wrong way. You want to call getCategoryLinks() on the parser output. Take a look at LinksUpdate.php ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] suggestion: replace CAPTCHA with better approaches
This way if people feel motivated at cheating at captcha they will end up helping Wikipedia It is up to us to try to balance things out. I'm pretty sure users will be less annoyed at solving captchas that actually contribute some value. Obligatory XKCD: https://xkcd.com/810/ The best CAPTCHAs are the kind that do this. Look at how hard it is to beat reCAPTCHA because they have taken this approach. One must be careful, though, that the CAPTCHA is constructed such that it won't be as simple as a lookup, and will actually require some thought (so that probably eliminates the noun/verb/adjective idea). This idea has my support. Thank you, Derric Atzrott ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Serious alternatives to Gerrit
On Jul 25, 2012, at 11:36 AM, Faidon Liambotis wrote: think the BitKeeper story is relevant here Yes, good point. Honestly, before we talk GitHub or GitLab, we should consider whether we are willing to rethink our model of handling code submissions to be more pull-requesty. These two systems don't really have pre-commit code review in the traditional sense (correct me if I'm wrong) and I don't think there is a way to bolt this on. ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Serious alternatives to Gerrit
On Wed, Jul 25, 2012 at 3:18 PM, Terry Chay tc...@wikimedia.org wrote: On Jul 25, 2012, at 11:36 AM, Faidon Liambotis wrote: think the BitKeeper story is relevant here Yes, good point. Honestly before we talk GitHub or GitLab, we should consider if we are willing to rethink our model of handling code submissions to be more Pull-requesty. These two systems don't really have pre commit code review in the traditional sense (correct me if I'm wrong) and I don't think there is a way to bolt this on. Yup. A better understanding of our overall code submission workflow would be very useful in taking the next big step (GitHub or git/Phabricator or whatever). Alolita ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Serious alternatives to Gerrit
Well, there's more to it than pushing changes upstream. Modifying/customizing Gerrit's UI is supposed to be doable without pushing back upstream. As a relative novice of GWT, modifying/customizing the UI in Gerrit seems rather opaque to me. But before I go whole hog on Phabricator, I'd have to look more into how it handles templating and customization… I think there is a large class of changes we might want to make that aren't so programmy as to involve using Conduit or Arcanist to get it done. In any case, I think at the moment Gerrit does what we want and it's finally usably fast. I'm sure off the top of my head I can think of a number of things that keep something written in PHP (Phabricator) from matching Gerrit's feature set (LDAP? ACLs?), and pushing upstream patches isn't high on my list of pros/cons for any of these systems. But things are changing here at a rapid pace, so I'm not going to say Gerrit now, Gerrit forever. ;-) On Jul 25, 2012, at 5:01 AM, Chad wrote: On Wed, Jul 25, 2012 at 7:29 AM, Terry Chay tc...@wikimedia.org wrote: Related is the fact that we seem to have a lot of PHP web dev expertise (for some reason) and Gerrit went from Python (serviceable) to Java (totally opaque). Apologies to those of you at the WMF who lurv themselves some Java… all two of you… and one of you is probably the guy who wrote the case against The more I've thought about it, the less I feel the language it's written in really matters at all. The number of people contributing upstream is always going to be relatively small, and as long as those who /want/ to contribute upstream are comfortable with it, it could be written in Cobol for all I care. It kinda struck me the other day when the subject of bug-tracking tools came up again. Had we been using $SOME_OTHER_PRODUCT and people were advocating switching to Bugzilla, I'm sure people would complain omg, it's Perl--we can't contribute upstream. But in reality, how many people *have* contributed upstream to Bugzilla? 
Most people file bugs in our tracker and they get re-filed upstream, which is perfectly fine as long as there's an upstream who responds, which in this case there is. I think the choice of platform matters when we're talking about ease of installation/upgrading to some degree, so we don't make the ops angry, but that's a total non-issue with Gerrit because installation/upgrades are very very easy :) -Chad ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Serious alternatives to Gerrit
On Wed, Jul 25, 2012 at 7:29 AM, Terry Chay tc...@wikimedia.org wrote: In their defense, I think a lot of it has to do with terrible UI/UX in Gerrit. The basics can be modified by CSS and templates (I believe we've done some), but it only goes so far. How do I modify Javascript in Gerrit? I think it starts (and ends) somewhere in the hell that is GWT… and when GWT begins with something about how awesome it is to be able to write Ajax stuff using Java, I stop reading http://www.flickr.com/photos/tychay/1388234558/ This seems to be a common misconception about skinning Gerrit, so please allow me to take a moment to clear this up. To deliver custom CSS (or HTML, or Javascript), we can do that with stock Gerrit /right now/. Right now, two big issues stand in the way of *really* skinning Gerrit the way we'd like: 1) GWT's CSS is included last, after your custom site CSS. This is stupid, but fixable (and actually, you can work around it by slapping !important on everything, but that's silly and I want to fix it for real). 2) Right now, most classes aren't considered public-facing, so the names are randomly reassigned when Gerrit is recompiled. This is easily fixable, as classes where we want the name to remain stable (and there are some already) can be marked with an annotation and therefore made public. This is a one-line fix per class. I haven't actually tried doing custom Javascript yet, but it should be completely doable via the GerritSite.html header that you can customize (in fact, I've got some other non-JS customizations I want to roll out there soon). Gerrit skinning isn't nearly as scary as playing with GWT (which you only really need to know if you're trying to actually modify the forms/etc. that are being delivered). Once we get the labs installation of Gerrit back up (working on it!), I'd love to grant access to some CSS gurus amongst us who'd be willing to try coming up with a prettier skin for Gerrit. 
-Chad ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Serious alternatives to Gerrit
On Wed, Jul 25, 2012 at 6:09 AM, Sumana Harihareswara suma...@wikimedia.org wrote: On 07/17/2012 08:41 PM, Rob Lanphier wrote: It would appear from reading this page that the only alternative to Gerrit that has a serious following is GitHub. Is that the case? We definitely need a GitHub *strategy*. GitHub draws together tons of open source contributors. So we ought to address: * pull requests. People *will* clone our projects onto GitHub and end up submitting pull requests there; we have to find or make tools to sync those, or at least get notified about them and make it easy to pull them into whatever we use. [0] [1] * discoverability. Having a presence on GitHub gets us publicity to a lot of potential contributors. * reputation. People on GitHub want credit, in their system, for their commits. It'd help us to give them that somehow. I'm sure the approach doesn't scale the best, but OpenStreetMap has a mirror of its code on github, while the official repository is self-hosted within OSM. http://git.openstreetmap.org/rails.git/shortlog People can and do regularly submit pull requests from Github and they get merged in. https://github.com/openstreetmap/openstreetmap-website I thought we used to have a mirror of MediaWiki on github but maybe that was when we were using SVN. There's also the Wikimedia mobile stuff on github, and I'm curious how that's working in terms of incorporating volunteer contributions. Cheers, Katie [snip] (Thanks to Chad and RobLa for talking through much of this with me.) -- Sumana Harihareswara Engineering Community Manager Wikimedia Foundation ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l -- Board member, Wikimedia District of Columbia http://wikimediadc.org @wikimediadc / @wikimania2012 ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Serious alternatives to Gerrit
On 25/07/12 19:49, Ryan Lane wrote: None of these issues are Gerrit specific. You are complaining about the switch from svn to git. Yes, we know there was productivity lost in the switchover. We're discussing alternatives to git, not the switchover, though. - Ryan We're discussing alternatives to *gerrit*, not to git. ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Serious alternatives to Gerrit
On 25/07/12 06:09, Sumana Harihareswara wrote: == A couple open questions == * What's the FLOSS project on GitHub that's most like us, in terms of size, number of unique repositories, privacy concerns, robustness needs, and so on? How are they dealing with these issues? I think the PHP project. They have lots of repositories. It is a bit mixed: on the one hand their core repos are at http://git.php.net/, with github (https://github.com/php) being a mirror. Pull notifications go to a mailing list and there is a tool for closing them: http://qa.php.net/pulls; we could ask David Soria Parra if we want more info or want to copy something from their setup. On the other hand, many PECL extensions have moved to their own repos at github, but I don't really know how to find them. They seem to be scattered under their own users, like https://github.com/php-memcached-dev. PEAR is more consistent, with all the extensions grouped under the same user: https://github.com/pear ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Serious alternatives to Gerrit
On Wed, Jul 25, 2012 at 3:31 PM, Chad innocentkil...@gmail.com wrote:

> I haven't actually tried doing custom Javascript yet, but it should be completely doable via the GerritSite.html header that you can customize (in fact, I've got some other non-JS customizations I want to roll out there soon).

I've done this, stealing some code from the OpenStack-CI people. It's in an abandoned change in the puppet repo: https://gerrit.wikimedia.org/r/#/c/3285/2/files/gerrit/skin/GerritSiteHeader.html

Roan
Re: [Wikitech-l] suggestion: replace CAPTCHA with better approaches
On Thu, Jul 26, 2012 at 5:00 AM, Derric Atzrott datzr...@alizeepathology.com wrote:

> This way if people feel motivated at cheating at captcha they will end up helping Wikipedia. It is up to us to try to balance things out. I'm pretty sure users will be less annoyed at solving captchas that actually contribute some value. Obligatory XKCD: https://xkcd.com/810/ ;-)

The best CAPTCHAs are the kind that do this. Look at how hard it is to beat reCAPTCHA because they have taken this approach. One must be careful, though, that the CAPTCHA is constructed so that it won't be as simple as a lookup, and will actually require some thought (so that probably eliminates the noun, verb, adjective idea). This idea has my support.

We should use fewer CAPTCHAs. If the problem is spam, we should build better new-URL review systems. There are externally managed spam lists that we could use to identify spammers.

'New URLs' could be defined as domain names that have not been in the external links table for more than 24 hrs. Addition of these new URLs could be smartly throttled: un-autoconfirmed edits which include new URLs could be throttled so that the URL can only be added to a single article for the first 24 hours. That allows a new user to make use of a new domain name unimpeded, but they can only use it on one page for the first 24 hrs. If the new URL was spam, it will hopefully be removed within 24 hrs, which resets the clock for the spammer; i.e. they can only add the spam to one page every 24 hrs.

Another idea is for the wiki to ask a user who adds new URLs to review three recent edits that included new URLs, indicating whether or not each new URL was spam and should be removed. This may be unworkable, because a spam bot could use the linksearch tool to check whether a link is good or not.

--
John Vandenberg
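[Editor's aside: the throttling scheme described above can be sketched roughly as follows. This is a minimal illustration of the *idea* only, not MediaWiki code; the class, method, and parameter names are all hypothetical, and a real implementation would live in the edit filter and query the external-links table directly.]

```python
import time

DAY = 24 * 60 * 60  # the proposed 24-hour window, in seconds


class NewUrlThrottle:
    """Sketch of the proposal: an un-autoconfirmed user may introduce
    a 'new' domain on only one page per 24-hour window."""

    def __init__(self):
        # domain -> (timestamp of first use, page it was allowed on)
        self.first_use = {}

    def allow_edit(self, domain, page_id, domain_known_for_24h, now=None):
        """domain_known_for_24h: True if the domain has been in the
        external-links table for more than 24 hours, i.e. is not 'new'."""
        now = time.time() if now is None else now
        if domain_known_for_24h:
            return True  # established domain: no throttle
        seen = self.first_use.get(domain)
        if seen is None:
            # First use of a new domain: allow it, remember which page.
            self.first_use[domain] = (now, page_id)
            return True
        first_ts, allowed_page = seen
        if now - first_ts >= DAY:
            # Window expired (e.g. the spam was removed and the clock
            # reset): treat the domain as new again.
            self.first_use[domain] = (now, page_id)
            return True
        # Within the window: only the original page may use the domain.
        return page_id == allowed_page
```

Under this sketch a spammer gets at most one page per domain per 24 hours, while a legitimate new user can use a new domain freely on the one page they are actually editing.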
[Wikitech-l] Problem with saving edits
I have been having problems saving some edits for the last week or so. This is not associated with an edit conflict. I'm wondering if anyone else has experienced this problem?

--
James Heilman, MD, CCFP-EM, Wikipedian
The Wikipedia Open Textbook of Medicine
www.opentextbookofmedicine.com
Re: [Wikitech-l] Problem with saving edits
James Heilman wrote:

> I have been having problems saving some edits the last week or so. This is not associated with an edit conflict. Wondering if anyone else has experience this problem?

You're being incredibly vague here. On which wikis have you had this problem? Does it happen every time, or just sometimes? Do you get a specific error message? What happens when you try to save the edits? Does this happen while logged in and while logged out? Does this happen in a specific Web browser, or in every Web browser? With a specific computer, or any computer? All these details and more will help you get a useful reply. :-)

MZMcBride
[Wikitech-l] IRC office hours with the Analytics team, Monday, July 30, 2012 at 19:00 UTC
Hi all,

You are cordially invited to the first ever IRC office hours of the Foundation's recently formed Analytics team, taking place in #wikimedia-analytics on Freenode on Monday, July 30 at 19:00 UTC / noon PT (http://www.timeanddate.com/worldclock/fixedtime.html?hour=19&day=30&month=07&year=2012).

It is an opportunity to ask all your analytics- and statistics-related questions about Wikipedia and the other Wikimedia projects, in particular regarding the Wikimedia Report Card and the upcoming Kraken analytics platform. See also the blog post that the team just published: https://blog.wikimedia.org/2012/07/25/meet-the-analytics-team/ , as well as https://www.mediawiki.org/wiki/Analytics

General information about IRC office hours is available at https://meta.wikimedia.org/wiki/IRC_office_hours .

Regards,
--
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB
Re: [Wikitech-l] Serious alternatives to Gerrit
> We're discussing alternatives to *gerrit*, not to git.

Heh. Sorry, yes, that's what I meant.

- Ryan
Re: [Wikitech-l] suggestion: replace CAPTCHA with better approaches
On Wed, 25 Jul 2012 17:55:06 -0700, John Vandenberg jay...@gmail.com wrote:

> We should use less CAPTCHAs. If the problem is spam, we should build better new URL review systems. There are externally managed spam lists that we could use to identify spammers.
> [snip]
Your proposal fails to account for two important facts:

- A lot of spam may not even add links to the page.
- Don't underestimate bot programming. I've seen bots in the wild that wait for autoconfirmed status and then spam. If there is some pattern that can be followed to get access to spam the wiki, bots will be programmed to use that pattern to bypass spam limits.

--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]