Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Ryan Lane
 But do we have a plan for improving Gerrit in a substantial way?


Gerrit releases often, and every release is considerably better than the last.
Chad can push, and has been pushing, changes upstream. OpenStack (which has
180+ companies that can potentially assist), Qt, and other substantial
communities are using and improving Gerrit.

 I can get behind the decision to use a currently substandard tool in order
 to preserve Wikimedia's long term freedom. But to stick with Gerrit, we
 must have a plan for fixing it that does not simply declare that the
 ability to make changes means that the magic FOSS fairy will make it so. I
 don't know what it would take -- maybe a weekend in SF where we invite
 every Java and Prolog expert we can find? Paying a contractor or two to
 make the necessary fixes?


Make sure to add any complaints to the wiki page
(http://www.mediawiki.org/wiki/Git/Gerrit_evaluation#The_case_against_Gerrit).
Many claims that Gerrit sucks tend to stem from misunderstandings
of how Gerrit works. Many other claims are due to our workflow or our
current access restrictions. Of course, many claims are
legitimate, and we are reporting the issues, tracking them upstream,
and in some cases pushing in fixes.

There are some substantial downsides to Gerrit, but I don't see
alternatives that don't have equally substantial downsides. Sumana
listed numerous downsides to using Github. We can't fix the downsides
in Github; we at least have the ability to do so with Gerrit. Just to
drive this point home a little more, let's look at OpenStack's
reasoning for going with Gerrit (rather than Github):

https://lists.launchpad.net/openstack/msg03457.html

 This isn't just about attracting scores of new volunteers or having a
 reputational economy. It's a push for change driven by the fact that
 Gerrit seriously undercuts developer productivity and happiness. When we've
 got so many difficult, ambitious projects under way, I think those are two
 things we should be prioritizing. By that measuring stick, Gerrit fails
 miserably and GitHub is a winner.


I'm not sure I agree with the claim that this seriously undercuts
productivity. Is there any data to back this up?

I'd like to add a couple of major downsides to using Github:

1. It would be worrisome from a security point of view to host the
operations repos on Github. I think it's very likely that those repos
would stay in Gerrit if the devs decided to move to Github. This would
mean that we'd have split workflows, forcing developers to use both.

2. We can't use github enterprise:

https://enterprise.github.com/faq#faq-6

- Ryan

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Roan Kattouw
On Tue, Jul 24, 2012 at 10:46 PM, Alolita Sharma
alolita.sha...@gmail.com wrote:
 Cool, Priestley is awesome. If he comes to visit we should prevent him from
 leaving :)

 +1. We should definitely think about adopting Phabricator as a project if
 we're going to invest in its core developer.

I think this should be reversed. If we're gonna use Phabricator, we
should attract its core developer. We shouldn't switch code review
systems just because we hired a famous person.

I personally think that something like GitLab would be superior to
Phabricator, but that's another discussion.

Roan

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Roan Kattouw
On Wed, Jul 25, 2012 at 12:39 AM, Ryan Lane rlan...@gmail.com wrote:
 1. It would be worrisome from a security point of view to host the
 operations repos on Github. I think it's very likely that those repos
 would stay in Gerrit if the devs decided to move to Github. This would
 mean that we'd have split workflows, forcing developers to use both.

What's more, we also wouldn't want to host MW deployment code (the
wmf/* branches) on Github for the same reason. Those aren't even
separate repos, they're branches within the mediawiki/core repo.
Having a split workflow for contributing code to the project on the
one hand and merging it into deployment on the other hand would be
even worse.

Roan

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Creating a centralized access point for proprietary databases/resources

2012-07-25 Thread Derk-Jan Hartman
You could always create an OpenVPN gateway that provides access.  Many edu
institutions have the same setup to access those resources.
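
A minimal sketch of what such a gateway's OpenVPN server config might look
like (everything below is illustrative: the certificate names, the client
address pool, and the provider netblocks, which the providers would have to
whitelist):

  # /etc/openvpn/server.conf -- illustrative gateway config
  port 1194
  proto udp
  dev tun
  ca ca.crt
  cert gateway.crt
  key gateway.key
  dh dh1024.pem
  # hand out client addresses from a private pool
  server 10.8.0.0 255.255.255.0
  # route only the research databases through the tunnel, not all client
  # traffic (made-up provider netblocks)
  push "route 192.0.2.0 255.255.255.0"
  push "route 198.51.100.0 255.255.255.0"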

DJ

On Mon, Jul 23, 2012 at 6:21 PM, Ocaasi Ocaasi wikioca...@yahoo.com wrote:

 Hi Folks!
 The problem: Many proprietary research databases have donated free access
 to select Wikipedia editors (Credo Reference, HighBeam Research, JSTOR).
 Managing separate account distribution for each service doesn't scale well.
 The idea: Centralize access to these separate resources behind a single
 secure (firewalled) gateway, to which accounts would be given to a limited
 number of approved users. After logging in to this single gateway, users
 would be able to enter any of the multiple participating research databases
 without needing to log in to each one separately.
 The question: What are the basic technical specifications for setting up
 such a system? What are the open source options, ideally? What language would
 be ideal? What is required to host such a system? Can you suggest a sketch
 of the basic steps necessary to implement such an idea?
 Any advice, from basics to details would be greatly appreciated.  Thanks
 so much!
 Ocaasi
 http://enwp.org/User:Ocaasi
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Niklas Laxström
On 25 July 2012 10:39, Ryan Lane rlan...@gmail.com wrote:
 But do we have a plan for improving Gerrit in a substantial way?

Currently the platform guys are too busy to even report bugs upstream.


 I'm not sure I agree with the claim that this seriously undercuts
 productivity. Is there any data to back this up?

There is, and I wonder how you cannot be aware of that. Let me repeat
what I've said before.

I spent countless hours preparing translatewiki.net for git. Updating
and committing to hundreds of repositories takes minutes. I'm glad we
were able to automate that, except that it took two months to get it
working as it was supposed to work right after the switch.
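
(For the curious, the automation boils down to looping over every checkout.
A rough PHP sketch with a made-up directory layout; the real scripts do
more, of course:)

  <?php
  // Rough sketch: batch-update hundreds of extension checkouts.
  foreach ( glob( '/srv/checkouts/extensions/*', GLOB_ONLYDIR ) as $repo ) {
      // Pull the latest upstream changes for each repository in turn.
      passthru( 'cd ' . escapeshellarg( $repo ) . ' && git pull --quiet' );
  }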

Third-party users (me included) are all still left out in the cold. There
are no docs or scripts to manage an installation with a subset of extensions
now that we have moved from svn to git.
Barely a day goes by without someone having some kind of problem with
git or gerrit. And then we have these discussions about it.
Where is the solution for my review workflow? Ever since the switch I
spend more time on code review to review less code.

At the individual level the wasted productivity might not seem much (but it
is: [1][2]); adding it all together, though, it is a lot.
In your mind, make a rough estimate of all the time spent preparing
and executing the switch, and all the time spent learning about and solving
the issues. Then convert that to dollars and you get a rough price tag
for the switch.

[1] Each new person needs to spend tens of hours to learn to use our
GiGeGat effectively - if they don't give up
[2] Some people have waited for commit access and creation of new
repositories for multiple days

  -Niklas

-- 
Niklas Laxström

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Terry Chay
On Jul 24, 2012, at 9:09 PM, Sumana Harihareswara wrote:

 On 07/17/2012 08:41 PM, Rob Lanphier wrote:
 It would appear from reading this page that the only alternative to
 Gerrit that has a serious following is GitHub.  Is that the case?

 


There's some irony, and yet it's so apropos, that now that Gerrit is
finally stabilizing we're discussing alternatives. :-)

Oh well… Here's my 2c on GitHub…

In an ignore-reality world, I suppose my personal choices would be 1)
GitHub; 2) Phabricator; 3) everything else. But let's cross GitHub off that
list (for WMF).

Maybe in some future, when our development process more closely models a
seat-of-the-pants startup universe of "code first, break often, recover fast",
we could consider GitHub for hosting some of our public repositories, but I
don't see that happening anytime soon (ever?).

…

The nonstarter is that while we could host the public repositories, we 
do have a lot of non-public stuff in Gerrit right now. That stuff can't go into 
the cloud.

Well, on to specifics…

 But I have a lot of reservations about using GitHub as our primary
 source control and code review platform.  There's the free-as-in-freedom
 issue, of course,

Personally I think this ship sailed the day we used Google Apps for 
e-mail. :-)

 but I'm also concerned about flexibility, account
 management, fragmentation of community and duplication of tools, and
 their terms of service.
 
 == Flexibility ==
 I see GitHub as kind of like a Mac.

This trope is too facile. But I do agree with what you are alluding to,
which is that while it's fine for some, that doesn't mean it's fine for us,
especially given our current development process.

  It has a nice UI for the use case
 that its creators envision.  It's fine for personal use.

A great many very large open source projects are currently using or 
hosted at GitHub (including node, jQuery, and our Android/PhoneGap app ;-))

  And if we try
 it, everything'll be great until we smack into an invisible brick
 wall.  We'll want to work around one little thing, the way that we sneak
 around various issues in Gerrit, with hacks and searches and upgrades;
 if it's not in GitHub's web UI or API [3], we'll be stuck.

The API is simplistic but serviceable. However, the satellite tools
that are built around it are either part of GitHub (their internal issue
tracker, their own Ruby-based wiki Gollum, etc.) or are mostly for commercial
use/cloud-based. For instance, among tools that assist in deployment from a GitHub
repository (even if that were feasible for us, which it isn't), most seem to
have a hidden assumption that these are Web 2.0 companies deploying on AWS… not
to mention that usage of those tools clearly violates our policy and values.

 Right now we have our primary Git repo on our own machines, which is the
 ultimate backdoor. The way we have been modifying our tools, automating
 certain kinds of commits (like with l10n-bot), troubleshooting by
 looking at our logfiles, and generally customizing things to suit our
 weird needs -- GitHub is closed source and won't let us do that.  We are
 not the typical use case for GitHub.  Since we have hundreds of
 extensions, each with their own repository, we would have way more
 repositories and members than almost any other organization on there.
 So, one example: arbitrary sortability of lists of repositories.  We
 could mod Gerrit to do it, but not GitHub.  How would we centralize and
 list the repositories so they're easy to browse, search, edit, follow,
 and watch them together?  It looks like GitHub's less suitable for that,
 but I'd welcome examples of orgs that create their own sub-GitHub hubs.

Well, GitHub's modality doesn't prevent operating on the git repository
through the API. But I agree: where is the support on our end for
doing/writing these things when we already have something serviceable in Gerrit?

 == Accounts ==
 By using GitHub, we would no longer be managing the user accounts. This
 would make single sign-on with other Wikimedia services (especially
 Labs) completely impossible.

Technically this integration could be done by users authorizing us
access to their accounts via OAuth2. It's not the same thing as what you're
saying though… it's kind of the opposite of what you're saying. What you want
is what GitHub Enterprise is for.
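
(To illustrate the shape of that flow, here is a minimal PHP sketch of
GitHub's OAuth2 web flow. The client ID/secret are placeholders for a
registered OAuth application, and error handling is omitted:)

  <?php
  // Minimal sketch of GitHub's OAuth2 web flow (placeholder credentials).
  $clientId     = 'OUR_CLIENT_ID';
  $clientSecret = 'OUR_CLIENT_SECRET';

  if ( !isset( $_GET['code'] ) ) {
      // Step 1: send the user to GitHub to authorize our application.
      header( 'Location: https://github.com/login/oauth/authorize?client_id='
          . urlencode( $clientId ) );
      exit;
  }

  // Step 2: exchange the temporary code for an access token.
  $context = stream_context_create( array( 'http' => array(
      'method'  => 'POST',
      'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
      'content' => http_build_query( array(
          'client_id'     => $clientId,
          'client_secret' => $clientSecret,
          'code'          => $_GET['code'],
      ) ),
  ) ) );
  $response = file_get_contents(
      'https://github.com/login/oauth/access_token', false, $context );
  parse_str( $response, $params );
  $token = $params['access_token']; // usable for API calls on the user's behalf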

 I mentioned above that GitHub seems more meant for single FLOSS projects
 than for confederations of related repositories. GitHub does not have
 the concept of groups, so granting access to collections of repos
 would be a time-consuming process. GitHub does not support branch-level
 permissions, either (it encourages forking and then merging back to
 master), and that does not seem as suitable for long-term collaborative
 branches.

This isn't quite true. GitHub does have the concept of groups (you can 
create as many as you want and control 

Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Terry Chay
I think Ori's comment that touched this off was tongue-in-cheek. :-)

On Jul 25, 2012, at 12:40 AM, Roan Kattouw wrote:

 On Tue, Jul 24, 2012 at 10:46 PM, Alolita Sharma
 alolita.sha...@gmail.com wrote:
 Cool, Priestley is awesome. If he comes to visit we should prevent him from
 leaving :)
 
 +1. We should definitely think about adopting Phabricator as a project if
 we're going to invest in its core developer.
 
 I think this should be reversed. If we're gonna use Phabricator, we
 should attract its core developer. We shouldn't switch code review
 systems just because we hired a famous person.
 
 I personally think that something like GitLab would be superior to
 Phabricator, but that's another discussion.
 
 Roan
 
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Terry Chay

On Jul 25, 2012, at 12:39 AM, Ryan Lane wrote:

 Many claims that Gerrit sucks tend to be due with misunderstandings
 of how Gerrit works. Many other claims are due to our workflow or our
 restrictions with access currently. Of course, many claims are
 legitimate and we are reporting the issues, tracking them upstream,
 and in some cases pushing in fixes.

In their defense, I think a lot of it has to do with terrible UI/UX in 
Gerrit. The basics can be modified via CSS and templates (I believe we've 
done some), but that only goes so far. How do I modify JavaScript in Gerrit? I 
think it starts (and ends) somewhere in the hell that is GWT… and when GWT's 
pitch begins with something about how awesome it is to be able to write Ajax 
stuff using Java, I stop reading: http://www.flickr.com/photos/tychay/1388234558/

I've not added this complaint because David already put 90% of it in 
the #1 reason on his list. ;-)

Related is the fact that we seem to have a lot of PHP web dev expertise 
(for some reason) and Gerrit went from Python (serviceable) to Java (totally 
opaque). Apologies to those of you at the WMF who lurv themselves some Java… 
all two of you… and one of you is probably the guy who wrote the case against

…

But it is true that a lot of the griping was related to people from an 
SVN/CVS model not understanding the Git model at all (in my more cynical 
moments, I feel that neither do Gerrit's developers :-D ).


Take care,

terry
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Chad
On Wed, Jul 25, 2012 at 7:29 AM, Terry Chay tc...@wikimedia.org wrote:
 Related is the fact that we seem to have a lot of PHP web dev 
 expertise (for some reason) and Gerrit went from Python (serviceable) to Java 
 (totally opaque). Apologies to those of you at the WMF who lurv themselves 
 some Java… all two of you… and one of you is probably the guy who wrote the 
 case against


The more I've thought about it, the less I feel the language it's written in
really matters at all. The number of people contributing upstream is always
going to be relatively small, and as long as those who /want/ to contribute
upstream are comfortable with it, it could be written in Cobol for all I care.

It kinda struck me the other day when the subject of bug-tracking tools
came up again. Had we been using $SOME_OTHER_PRODUCT and
people were advocating switching to Bugzilla, I'm sure people would
complain "omg, it's Perl--we can't contribute upstream." But in reality,
how many people *have* contributed upstream to Bugzilla? Most
people file bugs in our tracker and they get re-filed upstream, which is
perfectly fine as long as there's an upstream who responds, which in
this case there is.

I think the choice of platform matters when we're talking about ease of
installation/upgrading to some degree so we don't make the ops angry,
but that's a total non-issue with Gerrit because installation/upgrades
are very very easy :)

-Chad

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Creating a centralized access point for proprietary databases/resources

2012-07-25 Thread Ocaasi Ocaasi
Thanks for the tip!  

I'm trying to understand the differences between:

*phpMyAdmin
*SAML
*OpenID
*OpenVPN

Could you give me a quick insight into how they differ, strengths/weaknesses, 
etc.?

More details for The Wikipedia Library concept are at http://enwp.org/WP:TWL

Cheers!
 
Jake Orlowitz
Wikipedia editor: Ocaasi
http://enwp.org/User:Ocaasi
wikioca...@yahoo.com
484-380-3940



 From: Derk-Jan Hartman d.j.hartman+wmf...@gmail.com
To: Ocaasi Ocaasi wikioca...@yahoo.com; Wikimedia developers 
wikitech-l@lists.wikimedia.org 
Sent: Wednesday, July 25, 2012 4:26 AM
Subject: Re: [Wikitech-l] Creating a centralized access point for proprietary 
databases/resources
 

You could always create an OpenVPN gateway that provides access.  Many edu 
institutions have the same setup to access those resources.


DJ


On Mon, Jul 23, 2012 at 6:21 PM, Ocaasi Ocaasi wikioca...@yahoo.com wrote:

Hi Folks!
The problem: Many proprietary research databases have donated free access to 
select Wikipedia editors (Credo Reference, HighBeam Research, JSTOR). Managing 
separate account distribution for each service doesn't scale well.
The idea: Centralize access to these separate resources behind a single secure 
(firewalled) gateway, to which accounts would be given to a limited number of 
approved users. After logging in to this single gateway, users would be able 
to enter any of the multiple participating research databases without needing 
to log in to each one separately.
The question: What are the basic technical specifications for setting up such 
a system. What are open source options, ideally? What language would be ideal? 
What is required to host such a system? Can you suggest a sketch of the basic 
steps necessary to implement such an idea?
Any advice, from basics to details would be greatly appreciated.  Thanks so 
much!
Ocaasi
http://enwp.org/User:Ocaasi
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Creating a centralized access point for proprietary databases/resources

2012-07-25 Thread Thomas Morton
I can cover some of these:



 *phpMyAdmin

This is an open source database manager for MySQL databases - it won't work
for what you want.


 *SAML
 *OpenID


From the page you link it looks like you know about these two; i.e. they
act as sign-in gateways.

OpenID is more indie, SAML is more enterprise - otherwise there are no
major differences in what they can achieve.

The major bar to entry is getting the providers to add the ability to sign
in using one of these methods.

I'd personally recommend selecting OpenID as it could then be used for a
wider variety of logins around the web.

AFAIK resources like Athens (i.e. similar to what you appear to want) tend
to use SAML.


 *OpenVPN


VPN means setting up access to a pre-authorised network - which then means
you can access the restricted resource. I don't think it fits your use case.


Tom
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Creating a centralized access point for proprietary databases/resources

2012-07-25 Thread Oren Bochman
Hi

This looks similar to something I have been thinking about recently.

However, I would go about it using OpenID. But it would require all the
database sites to support OpenID. I think the extensions exist to do
this using MediaWiki, but 
WMF projects do not trust/support this method of authentication.

If all parties were to support this standard it would be possible to develop
a gadget which could log users into all the sites at once.

Do you know how many users have been granted access to each database? This
would be useful for estimating the importance/impact of this project.

Oren Bochman

-Original Message-
From: wikitech-l-boun...@lists.wikimedia.org
[mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Ocaasi Ocaasi
Sent: Monday, July 23, 2012 6:22 PM
To: wikitech-l@lists.wikimedia.org
Subject: [Wikitech-l] Creating a centralized access point for proprietary
databases/resources

Hi Folks!
The problem: Many proprietary research databases have donated free access to
select Wikipedia editors (Credo Reference, HighBeam Research, JSTOR).
Managing separate account distribution for each service doesn't scale well.
The idea: Centralize access to these separate resources behind a single
secure (firewalled) gateway, to which accounts would be given to a limited
number of approved users. After logging in to this single gateway, users
would be able to enter any of the multiple participating research databases
without needing to log in to each one separately.
The question: What are the basic technical specifications for setting up
such a system. What are open source options, ideally? What language would be
ideal? What is required to host such a system? Can you suggest a sketch of
the basic steps necessary to implement such an idea?
Any advice, from basics to details would be greatly appreciated.  Thanks so
much!
Ocaasi
http://enwp.org/User:Ocaasi
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Creating a centralized access point for proprietary databases/resources

2012-07-25 Thread Ocaasi Ocaasi
We currently have relationships with three separate resource databases.

*HighBeam, 1000 authorized accounts, 700 active (http://enwp.org/WP:HighBeam)
*JSTOR, 100 accounts, all active (http://enwp.org/WP:JSTOR)
*Credo, 400 accounts, all active (http://enwp.org/WP:CREDO)

No parties have agreed to participate in The Wikipedia Library *yet*, as it's 
still in the concept stage, but my initial projection is that 1000 editors 
would have access to it, and 100 additional users per year would be granted access.  
One of the challenges will be getting all the resource providers to agree on 
that number, but the hope is that once some do, it will create a cascade of 
adoption.  

So we're not looking at *thousands* of users, but more likely several hundred. 
 Still, given the impact of our most active editors, 1000 of them with access 
to the library would have significant impact.  After all, we can't cannibalize 
these databases' subscription business by opening the library to ''all'' 
editors.  It must be a carefully selected and limited group.


-Original Message-
From: wikitech-l-boun...@lists.wikimedia.org
[mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Ocaasi Ocaasi
Sent: Monday, July 23, 2012 6:22 PM
To: wikitech-l@lists.wikimedia.org
Subject: [Wikitech-l] Creating a centralized access point for proprietary
databases/resources

Hi Folks!
The problem: Many proprietary research databases have donated free access to
select Wikipedia editors (Credo Reference, HighBeam Research, JSTOR).
Managing separate account distribution for each service doesn't scale well.
The idea: Centralize access to these separate resources behind a single
secure (firewalled) gateway, to which accounts would be given to a limited
number of approved users. After logging in to this single gateway, users
would be able to enter any of the multiple participating research databases
without needing to log in to each one separately.
The question: What are the basic technical specifications for setting up
such a system. What are open source options, ideally? What language would be
ideal? What is required to host such a system? Can you suggest a sketch of
the basic steps necessary to implement such an idea?
Any advice, from basics to details would be greatly appreciated.  Thanks so
much!
Ocaasi
http://enwp.org/User:Ocaasi
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Creating a centralized access point for proprietary databases/resources

2012-07-25 Thread Sumana Harihareswara
Ocaasi, please centralize your notes, ideas, and plans regarding this here:

https://www.mediawiki.org/wiki/AcademicAccess

I know Chad Horohoe, Ryan Lane, and Chris Steipp might have things to
say about this; per
https://www.mediawiki.org/wiki/Wikimedia_Engineering/2012-13_Goals#Activities_12
their team aims to work on OAuth and OpenID within the next 11 months,
and AcademicAccess is a possible beneficiary of that.

Thanks!
-- 
Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation

On 07/25/2012 10:03 AM, Ocaasi Ocaasi wrote:
 We currently have relationships with three separate resource databases.
 
 *HighBeam, 1000 authorized accounts, 700 active (http://enwp.org/WP:HighBeam)
 *JSTOR, 100 accounts, all active (http://enwp.org/WP:JSTOR)
 *Credo, 400 accounts, all active (http://enwp.org/WP:CREDO)
 
 No parties have agreed to participate in The Wikipedia Library *yet*, as it's 
 still in the concept stage, but my initial projection is that 1000 editors 
 would have access to it, and 100 additional users per year would be granted.  
 One of the challenges will be getting all the resource providers to agree on 
 that number, but the hope is that once some do, it will create a cascade of 
 adoption.  
 
 So we're not looking at *thousands* of users, but more likely several 
 hundreds.  Still, given the impact of our most active editors, 1000 of them 
 with access to the library would have significant impact.  After all, we 
 can't cannibalize these databases' subscription business by opening the 
 library to ''all'' editors.  It must be a carefully selected and limited 
 group.
 
 
 -Original Message-
 From: wikitech-l-boun...@lists.wikimedia.org
 [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Ocaasi Ocaasi
 Sent: Monday, July 23, 2012 6:22 PM
 To: wikitech-l@lists.wikimedia.org
 Subject: [Wikitech-l] Creating a centralized access point for proprietary
 databases/resources
 
 Hi Folks!
 The problem: Many proprietary research databases have donated free access to
 select Wikipedia editors (Credo Reference, HighBeam Research, JSTOR).
 Managing separate account distribution for each service doesn't scale well.
 The idea: Centralize access to these separate resources behind a single
 secure (firewalled) gateway, to which accounts would be given to a limited
 number of approved users. After logging in to this single gateway, users
 would be able to enter any of the multiple participating research databases
 without needing to log in to each one separately.
 The question: What are the basic technical specifications for setting up
 such a system. What are open source options, ideally? What language would be
 ideal? What is required to host such a system? Can you suggest a sketch of
 the basic steps necessary to implement such an idea?
 Any advice, from basics to details would be greatly appreciated.  Thanks so
 much!
 Ocaasi
 http://enwp.org/User:Ocaasi
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] [Wikimedia-l] Questions about browser geolocalization improvement proposal

2012-07-25 Thread Cristian Consonni
Hi all,

I have started a thread on the Wikimedia-l mailing list[1a][1b] about a
proposal for improvement[2] of the current geolocation system used by
Geonotice[3].

In the e-mail exchange below we had some questions about how
browser-based geolocation works.
In particular, with geolocation via the browser, can one obtain
information about who received a geotargeted message or not? Is it possible
to store the geolocation data?

Summing up, we are wondering if this system respects the privacy of
users. Can you help us answer those concerns?

-- Forwarded message --
From: Cristian Consonni kikkocrist...@gmail.com
Date: 2012/7/25
Subject: Re: [Wikimedia-l] Geolocalization improvement proposal
To: Wikimedia Mailing List wikimedi...@lists.wikimedia.org


2012/7/24  birgitte...@yahoo.com:
 On Jul 24, 2012, at 11:14 AM, Cristian Consonni kikkocrist...@gmail.com 
 wrote:

 2012/7/24  birgitte...@yahoo.com:
 The main question is whether the benefit from being able to connect people 
 with local events is worth the risk of collecting more personalized of 
 their data than we are accustomed to handling.

 I could be wrong but I don't think we would handle more data than
 what we are doing now. We are not going to use that data and as far as
 I know that data dies in the moment the system has output the
 message.

 Maybe this is the area that needs more study. And I am probably the wrong 
 person to try and even formulate technical questions, but is there a way to 
 make use of this data without storing it? Without even knowing who received 
 what personalized messages, unless, of course, they choose to respond?

Thanks in advance,

Cristian

[1a] http://lists.wikimedia.org/pipermail/wikimedia-l/2012-July/121236.html
[1b] http://lists.wikimedia.org/pipermail/wikimedia-l/2012-July/121281.html
[2] http://meta.wikimedia.org/wiki/Geonotice
[3] http://en.wikipedia.org/wiki/Wikipedia:Geonotice

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] ANN: Multilingual Linked Open Data for Enterprise (MLODE) Leipzig, September 23-24-25, 2012

2012-07-25 Thread Denny Vrandečić
Hi Sebastian,

I think we can contribute something to the meeting. What do you think?
What form would make the most sense?

Regards,
Denny

2012/7/24 Sebastian Hellmann hellm...@informatik.uni-leipzig.de:
 *Save the date: Leipzig, Germany 23-24-25 September 2012
 http://sabre2012.infai.org/mlode
 Co-located with the Leipziger Semantic Web Day: http://aksw.org/lswt

 == Multilingual Linked Open Data for Enterprises ==

 MLODE will bring together developers, data producers, academia and
 enterprises and connect people, communities, data and industrial use cases.
 The workshop will be very interactive and you are expected to help us
 achieve common goals:
 * bootstrap and build a Linguistic Linked Open Data Cloud (LLOD):
 http://linguistics.okfn.org/resources/llod/
 * establish best practices for multilingual linked open data
 * create incentives for businesses and lower the barrier for participation
 in LOD for natural language processing and internationalisation and
 localisation enterprises.

 We are expecting intensive participation by members of the following
 communities (these are teasers, see the **detailed descriptions for each
 community** further below):
 * DBpedia (http://dbpedia.org): DBpedia International now has
 over 10 language-specific chapters (such as http://el.dbpedia.org).
 At the MLODE workshop there will be a DBpedia
 Developers meetup. We will discuss the “Future of DBpedia” and create a
 common Road Map. If you want to get more involved in DBpedia, the workshop
 will be a good opportunity to meet the team.
 * Working Group for Open Data in Linguistics (OWLG,
 http://linguistics.okfn.org): Now is the
 time to get your data into the LLOD cloud! We have created a development
 team that will convert your data to RDF and help establish links:
 http://code.google.com/p/mlode/. Please submit your data sets soon!
 (Furthermore we will have a legal session to discuss licensing issues.)
 * Multilingual Web (http://www.multilingualweb.eu): Free, open data and
 lexica; we will have
 a session discussing best practices for multilingual linked open data
 (http://mlode.okfnpad.org/best-practices-multilingual-lod) and compatibility
 with the RDF world with ITS 2.0.
 * Apache Stanbol (http://incubator.apache.org/stanbol/): Enterprises will
 have the chance to present their use cases during lightning talks and we
 will have an Apache Stanbol booth and an install fest to show hands-on how
 combined usage of public and closed data can be achieved and what benefits
 firms can gain from using these rapidly increasing data pools.
 * Ontolex W3C Community Group (http://www.w3.org/community/ontolex/):
 Monnet Challenge will provide a data bounty for developers who convert data
 sets using lemon.
 * Also: NLP2RDF (http://nlp2rdf.org) - the NIF project, DBpedia Spotlight
 (http://spotlight.dbpedia.org), Wiktionary2RDF
 (http://dbpedia.org/Wiktionary)


 How you can contribute:

 * Contact us if you are an enterprise and want to prepare a small
 presentation/lightning talk about your business use cases (using LOD) or
 problems you have (please see below for details)
 * Contact us if you want to give a short presentation on a relevant topic
 * We are looking for a sponsor for a DBpedia Booth
 * Submit your data sets for the LLOD: http://code.google.com/p/mlode/
 * Become a sponsor of the workshop:
 http://sabre2012.infai.org/mlode/funding?#sponsorship
 * Or donate money and help the individual communities:
 http://sabre2012.infai.org/mlode/Funding

 DBpedia is a good example of a freely available and open data set that was
 generated by crowd-sourcing and academia, but it has provided an immense
 value to businesses and industry. We want to build on and continue this
 success for the areas of natural language processing enterprises and the
 internationalisation and localisation industries.

 The goal of the workshop is to bootstrap a Multilingual Linked Open Data
 cloud by bringing together many different linked open data sets and by
 creating synergy among different research and business communities. This
 workshop is aimed at researchers and industry and commercial consumers of
 data produced by research. We hope for mutual benefits between (potentially
 non-commercial) data providers and enterprises: Open-source and
 open-licences for software have shown that they can be successful in a
 commercial environment. How can we transfer these models to Multilingual
 Linked Open Data? And how can the transformation of currently monolingual
 Linked Open Data sources into a Multilingual Web of Open Data spur
 cross-linguistic research, and commercial applications in
 internationalisation and localisation enterprises?

 = Sponsors =
 We would like to thank our sponsors for supporting the workshop:
 * The **MultilingualWeb-LT Working Group** -
 

Re: [Wikitech-l] Creating a centralized access point for proprietary databases/resources

2012-07-25 Thread Oren Bochman
Hi Ocaasi

I agree that working more closely with the database providers is in order. 1000+
accounts for top contributors can make a significant impact on Wikipedia
fact-checking.

Based on my experience at university (where I taught a lab class on
reference database usage), there are many more options for how to do
this. Most users in universities do not need to log in at all (they work
within an IP range that is enabled for the databases). Research libraries
also implement floating licenses for databases that have limited access
options.

However, to implement this it is often necessary to work with a large
database aggregator (which solves the tech issues); the rest is
implemented by the operations staff of a university.

Oren Bochman

-Original Message-
From: wikitech-l-boun...@lists.wikimedia.org
[mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Sumana
Harihareswara
Sent: Wednesday, July 25, 2012 4:16 PM
To: Ocaasi Ocaasi; Wikimedia developers
Subject: Re: [Wikitech-l] Creating a centralized access point for proprietary
databases/resources

Ocaasi, please centralize your notes, ideas, and plans regarding this here:

https://www.mediawiki.org/wiki/AcademicAccess

I know Chad Horohoe, Ryan Lane, and Chris Steipp might have things to say
about this; per
https://www.mediawiki.org/wiki/Wikimedia_Engineering/2012-13_Goals#Activitie
s_12
their team aims to work on OAuth and OpenID within the next 11 months, and
AcademicAccess is a possible beneficiary of that.

Thanks!
--
Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation

On 07/25/2012 10:03 AM, Ocaasi Ocaasi wrote:
 We currently have relationships with three separate resource databases.
 
 *HighBeam, 1000 authorized accounts, 700 active 
 (http://enwp.org/WP:HighBeam) *JSTOR, 100 accounts, all active 
 (http://enwp.org/WP:JSTOR) *Credo, 400 accounts, all active 
 (http://enwp.org/WP:CREDO)
 
 No parties have agreed to participate in The Wikipedia Library *yet*, as
it's still in the concept stage, but my initial projection is that 1000
editors would have access to it, and 100 additional users per year would be
granted.  One of the challenges will be getting all the resource providers
to agree on that number, but the hope is that once some do, it will create a
cascade of adoption.  
 
 So we're not looking at *thousands* of users, but more likely several
hundreds.  Still, given the impact of our most active editors, 1000 of them
with access to the library would have significant impact.  After all, we
can't cannibalize these databases' subscription business by opening the
library to ''all'' editors.  It must be a carefully selected and limited
group.
 
 
 -Original Message-
 From: wikitech-l-boun...@lists.wikimedia.org
 [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Ocaasi 
 Ocaasi
 Sent: Monday, July 23, 2012 6:22 PM
 To: wikitech-l@lists.wikimedia.org
 Subject: [Wikitech-l] Creating a centralized access point for 
 proprietary databases/resources
 
 Hi Folks!
 The problem: Many proprietary research databases have donated free 
 access to select Wikipedia editors (Credo Reference, HighBeam Research,
JSTOR).
 Managing separate account distribution for each service doesn't scale
well.
 The idea: Centralize access to these separate resources behind a 
 single secure (firewalled) gateway, to which accounts would be given 
 to a limited number of approved users. After logging in to this single 
 gateway, users would be able to enter any of the multiple 
 participating research databases without needing to log in to each one
separately.
 The question: What are the basic technical specifications for setting 
 up such a system. What are open source options, ideally? What language 
 would be ideal? What is required to host such a system? Can you 
 suggest a sketch of the basic steps necessary to implement such an idea?
 Any advice, from basics to details would be greatly appreciated.  
 Thanks so much!
 Ocaasi
 http://enwp.org/User:Ocaasi
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Erik Moeller
On Wed, Jul 25, 2012 at 4:17 AM, Terry Chay tc...@wikimedia.org wrote:
 I think Ori's comment that touched this off was tongue-in-cheek. :-)

Indeed, we're not hiring anyone at this point :). We're meeting folks
from different code review projects as part of the process. We're also
trying to connect in person with the Gerrit folks, probably in the
same week.

-- 
Erik Möller
VP of Engineering and Product Development, Wikimedia Foundation

Support Free Knowledge: https://wikimediafoundation.org/wiki/Donate

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] branched article history extension [Re: Article revision numbers]

2012-07-25 Thread Adam Wight

Hi,
I've started working on an extension to manage branching history, 
calling it Nonlinear.  Here's the crude code, 
https://github.com/adamwight/Nonlinear


Screenshot of the effect on revision history:
[attached: MediaWiki revision history screenshot]

On 07/16/2012 04:10 PM, Platonides wrote:

On 17/07/12 00:22, Adam Wight wrote:

Hello comrades,
I've run into a challenge too interesting to keep to myself ;)  My
immediate goal is to prototype an offline wikipedia, similar to Kiwix,
which allows the end-user to make edits and synchronize them back to a
central repository like enwiki.

The catch is, how to insert these changes without edit conflicts? With
linear revision numbering, I can't imagine a natural representation of
the data, only some kind of ad-hoc sandbox solution.

Extending the article revision numbering to represent a branching
history would be the natural way to handle optimistic replication.

Non-linear revisioning might also facilitate simpler models for page
protection, and would allow the formation of multiple, independent
consensuses.

-Adam Wight

Actually, the revision table allows for non-linear development (it
stores which version you edited the article from). You could even make
a version other than the one with the latest timestamp win (by
changing page_rev).
You will need to change the way of viewing history, however, and add a
system to keep track of heads and merges.
There may be some assumptions across the codebase about the latest
revision being the active one, too.
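
(A rough sketch of what such head-tracking could query, using the core
database wrapper and rev_parent_id; untested, and assuming the standard
revision table:)

  <?php
  // Sketch: find the "heads" of a page's revision graph. rev_parent_id
  // stores the revision an edit was based on, so the graph already exists.
  $dbr = wfGetDB( DB_SLAVE );
  $res = $dbr->select(
      'revision',
      array( 'rev_id', 'rev_parent_id' ),
      array( 'rev_page' => $pageId ),
      __METHOD__
  );
  $parents = array();
  $ids = array();
  foreach ( $res as $row ) {
      $ids[] = $row->rev_id;
      $parents[$row->rev_parent_id] = true;
  }
  // Heads are revisions that no other revision of the page claims as parent.
  $heads = array_filter( $ids, function ( $id ) use ( $parents ) {
      return !isset( $parents[$id] );
  } );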


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Ryan Lane
 * What does GitHub Enterprise buy us?  Which of these issues would that fix?

 It's a self-hosted GitHub. It would allow us to have private 
 repositories (good for deploys, ops, etc.) and manage our own user database 
 (we could integrate with our own auth system), and it probably waives the 
 13-and-under rule above.

 The price is too steep since it's a per-seat license. A nonstarter if 
 the WMF is going to have to pay for every potential developer who wants to 
 attach.


As mentioned before, we can't use github enterprise at all, since it
doesn't allow for hosting public repos. Let's ignore that it even
exists.

 We do need a GitHub strategy -- to make our projects more discoverable,
 make use of more contributions, and participate in the GitHub
 reputational economy.  So we must figure out the right ways to mirror
 and sync.  But I doubt our own long-term needs would work well with
 using GitHub as our main platform.

 I'm 1000% with you on this.

 We should definitely at some point mirror our code in GitHub like the 
 PHP project does http://www.php.net/git.php. Being able to publish and 
 handle pull requests coming from GitHub would be a nice feature in Gerrit or 
 any replacement.  It'd be nice if others can have their own MW extensions or 
 versions of extensions and core on GitHub and pull from us (and us from them) 
 esp. for extensions that may need some love or have changes that don't 
 satisfy the WMF code quality bar.


Well, we can enable replication from Gerrit to GitHub. We haven't done
so yet, but it's a feature that's available.
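
(For reference, with Gerrit's replication support this is roughly a few
lines of replication.config; the GitHub organization here is a placeholder:)

  [remote "github"]
    url = git@github.com:our-github-org/${name}.git
    push = +refs/heads/*:refs/heads/*
    push = +refs/tags/*:refs/tags/*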

- Ryan

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Creating a centralized access point for proprietary databases/resources

2012-07-25 Thread Ryan Lane
 I'm trying to understand the differences between:

 *phpMyAdmin
 *SAML
 *OpenID
 *OpenVPN


You should only consider SAML and OpenID. More exactly, you should
really only consider SAML, since the resources you are trying to
connect to only support SAML, and not OpenID. We can use OpenID for
proxied access to resources that don't support SAML, but it's very
likely nearly all of the resources we're trying to access support
SAML.

Ideally we'd integrate central auth with something that supports
multiple protocols. SimpleSAMLPHP supports SAML, OpenID, OAuth and a
few other protocols. It also can handle the circles of trust that we'd
need to create with the libraries/universities.
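
(For a sense of what the service-provider side looks like, a minimal
SimpleSAMLphp sketch using its standard API; the install path and the
'default-sp' authentication source name are assumptions from a default
setup:)

  <?php
  // Minimal SimpleSAMLphp service-provider sketch (assumed default install).
  require_once( '/var/simplesamlphp/lib/_autoload.php' );

  // 'default-sp' would be configured in config/authsources.php.
  $as = new SimpleSAML_Auth_Simple( 'default-sp' );

  // Redirects the user to the identity provider if not yet authenticated.
  $as->requireAuth();

  // Attributes released by the identity provider, e.g. the wiki username.
  $attributes = $as->getAttributes();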

- Ryan

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Derric Atzrott
As mentioned before, we can't use github enterprise at all, since it
doesn't allow for hosting public repos. Let's ignore that it even exists.

I feel like, as Wikipedia is one of the top 10 most visited sites on the
Internet, we might be able to work out something special with them, right?
I'm not saying we have to go down that route, nor have I even
examined all the advantages and disadvantages of the idea.  I feel, though,
that the possibility exists and should be looked into.

If I was GitHub and the WMF approached me about potentially using GitHub
Enterprise for MediaWiki and MediaWiki extensions and NOT for creating a
competition service to GitHub, then I would likely entertain the idea of
crafting a special set of terms for them.  Furthermore, I personally might
even charge them differently given that that charging per developer would be
crippling to an open-source project.  Of course, I am not GitHub, nor can I
anticipate what they might do, nor their internal policies, but I can speak
for myself and how I would run a FOSS focused company.

Thank you,
Derric Atzrott


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Creating a centralized access point for proprietary databases/resources

2012-07-25 Thread Ocaasi Ocaasi
@ Ryan, If you say SAML is the best approach, then that's what we'll use.  
OpenID can be a backup for those that are not SAML-compatible for some reason.

@ Oren, we want to make it so that the vast majority of the work is done on our 
end if possible.  Ideally, participating resource donors wouldn't have to do 
anything to their websites at all.  That may not be realistic, but it's the 
direction I'd like to lean.
 
Jake Orlowitz
Wikipedia editor: Ocaasi
http://enwp.org/User:Ocaasi
wikioca...@yahoo.com




 From: Ryan Lane rlan...@gmail.com
To: Ocaasi Ocaasi wikioca...@yahoo.com; Wikimedia developers 
wikitech-l@lists.wikimedia.org 
Cc: Derk-Jan Hartman d.j.hartman+wmf...@gmail.com 
Sent: Wednesday, July 25, 2012 2:04 PM
Subject: Re: [Wikitech-l] Creating a centralized access point for proprietary 
databases/resources
 
 I'm trying to understand the differences between:

 *phpMyAdmin
 *SAML
 *OpenID
 *OpenVPN


You should only consider SAML and OpenID. More exactly, you should
really only consider SAML, since the resources you are trying to
connect to only support SAML, and not OpenID. We can use OpenID for
proxied access to resources that don't support SAML, but it's very
likely nearly all of the resources we're trying to access support
SAML.

Ideally we'd integrate central auth with something that supports
multiple protocols. SimpleSAMLPHP supports SAML, OpenID, OAuth and a
few other protocols. It also can handle the circles of trust that we'd
need to create with the libraries/universities.

- Ryan
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Steven Walling
On Tue, Jul 24, 2012 at 10:26 PM, Erik Moeller e...@wikimedia.org wrote:

 As one quick update, we're also in touch with Evan Priestley, who's no
 longer at Facebook and now running Phabricator as a dedicated open
 source project and potential business. If all goes well, Evan's going
 to come visit WMF sometime soon, which will be an opportunity to
 seriously explore whether Phabricator could be a viable long term
 alternative (it's probably not a near term one). Will post more
 details if this meeting materializes.

 Evan pointed me to the following pages which give some hints at
 Phabricator's longer term direction (requires logging in):

 Project roadmap: https://secure.phabricator.com/w/roadmap/
 Plans for repo/project level permission management:
 https://secure.phabricator.com/T603


In prep for this, I started a section about Phabricator as an alternative.

https://www.mediawiki.org/w/index.php?title=Git%2FGerrit_evaluation&diff=565357&oldid=565266

Steven
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Faidon Liambotis
On Wed, Jul 25, 2012 at 02:21:03PM -0400, Derric Atzrott wrote:
 As mentioned before, we can't use github enterprise at all, since it
 doesn't allow for hosting public repos. Let's ignore that it even exists.
 
  I feel like, as Wikipedia is one of the top 10 most visited sites on the
  Internet, we might be able to work out something special with them, right?
  I'm not saying we have to go down that route, nor have I even
  examined all the advantages and disadvantages of the idea.  I feel, though,
  that the possibility exists and should be looked into.
 
 If I was GitHub and the WMF approached me about potentially using GitHub
 Enterprise for MediaWiki and MediaWiki extensions and NOT for creating a
 competition service to GitHub, then I would likely entertain the idea of
 crafting a special set of terms for them.  Furthermore, I personally might
  even charge them differently, given that charging per developer would be
 crippling to an open-source project.  Of course, I am not GitHub, nor can I
 anticipate what they might do, nor their internal policies, but I can speak
 for myself and how I would run a FOSS focused company.

I think the BitKeeper story is relevant here: BitKeeper was one of the
first DVCSes. It was (is?) a proprietary for-profit tool that gave
special free licenses to certain free/open-source software projects,
like Linux. Linux was using it for years, due to it having some features
that were unique at the time (and Linus liking it), although it was a controversial
choice in the community.

At one point, due to some reverse engineering (basically typing "help"
at its server and showing that in a talk) by some community members, the
company behind BitKeeper decided to revoke this free (as in beer)
license from the community members, effectively halting Linux
development. Git was first published a week after that.

Now, the situation is a bit different here. But I can certainly imagine
getting this special license exception revoked, GitHub Enterprise
discontinued as a product or whatever else. Can you imagine the
disruption that would cause to our development and operations?

Regards,
Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] suggestion: replace CAPTCHA with better approaches

2012-07-25 Thread Oren Bochman
Hi

Wikipedia's captcha is a great opportunity for getting '''useful''' work
done by humans.
This is now called a [[game with a purpose]].

I think we can ideally use it to help:
* OCR Wikisource text, like reCAPTCHA does
* translate article fragments, using geo-location of editors:
  Translate [xyz-known] [...]
  Translate [xyz-new] [...]
  then check using the BLEU metric, etc.
* get more opinions on spam edits:
  Is this diff [spam] [good faith edit] [ok]?
* collect linguistic information on different language editions:
  Is XYZ a [verb] / [noun] / [adjective] ... [other]?
* disambiguate:
  Is [xyz-known] [xyz] ... [xyz] ... [xyz] ...
  Is [yzx-unknown] [yzx1] ... [yzx1] ... [yzx1] ...
Etc.

This way, if people feel motivated to cheat at the captcha they will end up
helping Wikipedia anyway.
It is up to us to try to balance things out.

I'm pretty sure users will be less annoyed at solving captchas that actually
contribute some value.
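
(A rough sketch of the pairing logic this implies, reCAPTCHA-style; the
storage helper is hypothetical:)

  <?php
  // Sketch: gate on the task with a known answer, harvest the unknown one.
  function checkGwapCaptcha( $knownAnswer, $expectedAnswer, $unknownAnswer ) {
      // Only the known task decides pass/fail, like reCAPTCHA's control word.
      if ( strtolower( trim( $knownAnswer ) ) !== strtolower( $expectedAnswer ) ) {
          return false;
      }
      // The unknown task's answer is merely recorded; once enough users
      // agree on it, it can be promoted to a known task itself.
      recordCandidateAnswer( $unknownAnswer ); // hypothetical storage helper
      return true;
  }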



-Original Message-
From: wikitech-l-boun...@lists.wikimedia.org
[mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of matanya
Sent: Tuesday, July 24, 2012 4:12 PM
To: wikitech-l@lists.wikimedia.org
Subject: [Wikitech-l] suggestion: replace CAPTCHA with better approaches

Over the last few months, the spam rate stewards deal with has been rising.
I suggest we implement a new mechanism:

Instead of giving the user a CAPTCHA to solve, give him an image from Commons
and ask him to add a brief description in his own language.

We can give him two images, one with a known description and the other
unknown; after enough users describe the unknown one in the same way, we can
use that as a verified translation. We rely on the known image's description
to decide whether to allow the user to create the account.

Is it possible to embed a file from Commons in the login page? Is it
possible to parse the entered text and store it?

benefits:

A) it would be harder for bots to create automated accounts.

B) We will get translations to many languages with little effort from the
users signing up.

What do you think?



___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Category Internacionalization.

2012-07-25 Thread Mauricio Etchevest
I can get all the categories with the ParserOutput, but I can't add or
delete a category.
I'm using updateCategoryCounts() in LinksUpdate but it doesn't work.
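
(For reference, a rough sketch of detecting categorizations via the
LinksUpdate hook, assuming the circa-2012 hook signature and the public
mCategories member; details may differ between versions:)

  <?php
  $wgHooks['LinksUpdate'][] = 'MyExtension::onLinksUpdate';

  class MyExtension {
      public static function onLinksUpdate( $linksUpdate ) {
          // mCategories maps category DB keys to sort keys; the keys are
          // already language-independent once parsed, so no keyword
          // translation is needed.
          foreach ( $linksUpdate->mCategories as $name => $sortkey ) {
              wfDebugLog( 'myext', "Article categorized into: $name" );
          }
          return true; // let the links update proceed normally
      }
  }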


On Tue, Jul 24, 2012 at 3:23 PM, Platonides platoni...@gmail.com wrote:

 El 24/07/12 19:45, Mauricio Etchevest escribió:
  Hi !
 
  I'm working on a extension for Media Wiki. And I need to detect when a
  categorization is made on an article.
 
  So I search for the annotation with the keyword category, but then I need
  to detect categorizations in other languages. How can I get the translation
  of the keyword category?
 
   Which is the best way to add/remove an article from a category?
 
 
  Thanks!

 You're doing it the wrong way. You want to call getCategoryLinks() on
 the parser output. Take a look at LinksUpdate.php



 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] suggestion: replace CAPTCHA with better approaches

2012-07-25 Thread Derric Atzrott
This way, if people feel motivated to cheat at the captcha they will end up
helping Wikipedia anyway. It is up to us to try to balance things out.

I'm pretty sure users will be less annoyed at solving captchas that
actually contribute some value.

Obligatory XKCD: https://xkcd.com/810/

The best CAPTCHAs are the kind that do this.  Look at how hard it is to beat
reCAPTCHA because they have taken this approach.  One must be careful,
though, that the CAPTCHA is constructed such that it won't be as simple as a
lookup, and will actually require some thought (so that probably eliminates
the noun/verb/adjective idea).

This idea has my support.

Thank you,
Derric Atzrott


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Terry Chay

On Jul 25, 2012, at 11:36 AM, Faidon Liambotis wrote:

  I think the BitKeeper story is relevant here

Yes, good point. Honestly, before we talk GitHub or GitLab, we should
consider whether we are willing to rethink our model of handling code submissions to
be more pull-request-y. These two systems don't really have pre-commit code
review in the traditional sense (correct me if I'm wrong), and I don't think
there is a way to bolt this on.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Alolita Sharma
On Wed, Jul 25, 2012 at 3:18 PM, Terry Chay tc...@wikimedia.org wrote:


 On Jul 25, 2012, at 11:36 AM, Faidon Liambotis wrote:

   I think the BitKeeper story is relevant here

  Yes, good point. Honestly, before we talk GitHub or GitLab, we
  should consider whether we are willing to rethink our model of handling code
  submissions to be more pull-request-y. These two systems don't really have
  pre-commit code review in the traditional sense (correct me if I'm wrong),
  and I don't think there is a way to bolt this on.


Yup. A better understanding of our overall code submission workflow would
be very useful in taking the next big step (GitHub or git/Phabricator or
whatever).

Alolita
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Terry Chay
Well, there's more to this than pushing changes upstream. Modifying/customizing 
Gerrit's UI is supposed to be possible without pushing back upstream. As 
a relative novice with GWT, modifying/customizing the UI in Gerrit seems rather 
opaque. But before I go whole hog on Phabricator, I'd have to look more into 
how it handles templating and customization… I think there is a large class of 
changes we might want to make that aren't so programmy as to involve using 
Conduit or Arcanist to get it done.

In any case, I think at the moment Gerrit does what we want, and it's 
finally usably fast. Off the top of my head, I'm sure I can think of a number of 
things that keep something written in PHP (Phabricator) from matching Gerrit's 
feature set (LDAP? ACLs?), and pushing upstream patches isn't high on my list 
of pros/cons for any of these systems.

But things are changing here at a rapid pace, so I'm not going to say 
"Gerrit now, Gerrit forever". ;-)

On Jul 25, 2012, at 5:01 AM, Chad wrote:

 On Wed, Jul 25, 2012 at 7:29 AM, Terry Chay tc...@wikimedia.org wrote:
  Related is the fact that we seem to have a lot of PHP web dev 
 expertise (for some reason) and Gerrit went from Python (serviceable) to 
 Java (totally opaque). Apologies to those of you at the WMF who lurv 
 themselves some Java… all two of you… and one of you is probably the guy who 
 wrote the case against
 
 
  The more I've thought about it, the less I feel the language it's written in
  really matters at all. The number of people contributing upstream is always
 going to be relatively small, and as long as those who /want/ to contribute
 upstream are comfortable with it, it could be written in Cobol for all I care.
 
 It kinda struck me the other day when the subject of bug-tracking tools
 came up again. Had we been using $SOME_OTHER_PRODUCT and
  people were advocating switching to Bugzilla, I'm sure people would
  complain "omg, it's Perl--we can't contribute upstream". But in reality,
 how many people *have* contributed upstream to Bugzilla? Most
 people file bugs in our tracker and they get re-filed upstream, which is
 perfectly fine as long as there's an upstream who responds, which in
 this case there is.
 
 I think the choice of platform matters when we're talking about ease of
 installation/upgrading to some degree so we don't make the ops angry,
  but that's a total non-issue with Gerrit because installation/upgrades
 are very very easy :)
 
 -Chad
 
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Chad
On Wed, Jul 25, 2012 at 7:29 AM, Terry Chay tc...@wikimedia.org wrote:
 In their defense, I think a lot of it has to do with terrible UI/UX 
 in Gerrit. The basics can be modified by CSS and templates (I believe 
 we've done some), but that only goes so far. How do I modify Javascript in 
 Gerrit? I think it starts (and ends) somewhere in the hell that is GWT… and 
 when GWT's intro begins with something about how awesome it is to be able to write 
 Ajax stuff using Java, I stop reading 
 http://www.flickr.com/photos/tychay/1388234558/


This seems to be a common misconception about skinning Gerrit, so
please allow me to take a moment to clear it up.

We can deliver custom CSS (or HTML, or Javascript) with
stock Gerrit /right now/. That said, two big issues stand in the way
of *really* skinning Gerrit the way we'd like:

1) GWT's CSS is included last, after your custom site CSS. This is
stupid, but fixable (and actually, you can work around it by slapping
!important on everything, but that's silly and I want to fix it for real).

2) Right now, most classes aren't considered public facing, so the
names are randomly reassigned when Gerrit is recompiled. This is
easily fixable, as classes where we want the name to remain
stable (and there are some already) can be marked with an
annotation and therefore made public. This is a one-line fix per
class.

I haven't actually tried doing custom Javascript yet, but it should be
completely doable via the GerritSite.html header that you can
customize (in fact, I've got some other non-JS customizations I
want to roll out there soon).

Gerrit skinning isn't nearly as scary as playing with GWT (which you
only really need to know if you're trying to actually modify the forms/
etc that are being delivered). Once we get the labs installation of
Gerrit back up (working on it!), I'd love to grant access to some CSS
gurus amongst us who'd be willing to try coming up with a prettier
skin for Gerrit.

-Chad

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread aude
On Wed, Jul 25, 2012 at 6:09 AM, Sumana Harihareswara suma...@wikimedia.org
 wrote:

 On 07/17/2012 08:41 PM, Rob Lanphier wrote:
  It would appear from reading this page that the only alternative to
  Gerrit that has a serious following is GitHub.  Is that the case?

 We definitely need a GitHub *strategy*.  GitHub draws together tons of
 open source contributors.  So we ought to address:

 * pull requests.  People *will* clone our projects onto GitHub and end
 up submitting pull requests there; we have to find or make tools to sync
 those, or at least get notified about them and make it easy to pull them
 into whatever we use. [0] [1]
 * discoverability.  Having a presence on GitHub gets us publicity to a
 lot of potential contributors.
 * reputation.  People on GitHub want credit, in their system, for their
 commits.  It'd help us to give them that somehow.


I'm sure the approach doesn't scale the best, but OpenStreetMap has a mirror
of its code on github, while the official repository is self-hosted within
OSM.

http://git.openstreetmap.org/rails.git/shortlog

People can and do regularly submit pull requests from Github and they get
merged in.

https://github.com/openstreetmap/openstreetmap-website

I thought we used to have a mirror of MediaWiki on github, but maybe that
was when we were using SVN.  There's also the Wikimedia mobile stuff on
github, and I'm curious how that's working in terms of incorporating volunteer
contributions.

Cheers,
Katie


[snip]




 (Thanks to Chad and RobLa for talking through much of this with me.)

 --
 Sumana Harihareswara
 Engineering Community Manager
 Wikimedia Foundation

 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l




-- 
Board member, Wikimedia District of Columbia
http://wikimediadc.org
@wikimediadc / @wikimania2012
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Platonides
On 25/07/12 19:49, Ryan Lane wrote:
 None of these issues are Gerrit specific. You are complaining about
 the switch from svn to git. Yes, we know there was productivity lost
 in the switchover. We're discussing alternatives to git, not the
 switchover, though.
 
 - Ryan


We're discussing alternatives to *gerrit*, not to git.


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Platonides
On 25/07/12 06:09, Sumana Harihareswara wrote:
 == A couple open questions ==
 * What's the FLOSS project on GitHub that's most like us, in terms of
 size, number of unique repositories, privacy concerns, robustness needs,
 and so on?  How are they dealing with these issues?

I think the PHP project. They have lots of repositories.
It is a bit mixed, since on the one hand their core repos are at
http://git.php.net/, with github (https://github.com/php) being a
mirror. Pull notifications go to a mailing list and there is a tool for
closing them: http://qa.php.net/pulls; we could ask David Soria Parra if
we want more info or want to copy something from their setup.
On the other hand, many PECL extensions have moved to their own repos on
github, but I don't really know how to find them. They seem to be
scattered under their own users, like https://github.com/php-memcached-dev
PEAR is more consistent, with all the extensions grouped under the same
user: https://github.com/pear


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Roan Kattouw
On Wed, Jul 25, 2012 at 3:31 PM, Chad innocentkil...@gmail.com wrote:
 I haven't actually tried doing custom Javascript yet, but it should be
 completely doable via the GerritSite.html header that you can
 customize (in fact, I've got some other non-JS customizations I
 want to roll out there soon).
I've done this, stealing some code from the OpenStack-CI people. It's
in an abandoned change in the puppet repo:
https://gerrit.wikimedia.org/r/#/c/3285/2/files/gerrit/skin/GerritSiteHeader.html

Roan

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] suggestion: replace CAPTCHA with better approaches

2012-07-25 Thread John Vandenberg
On Thu, Jul 26, 2012 at 5:00 AM, Derric Atzrott
datzr...@alizeepathology.com wrote:
This way, if people feel motivated to cheat at the captcha, they will end up
 helping Wikipedia. It is up to us to try to balance things out.

I'm pretty sure users will be less annoyed at solving captchas that
 actually contribute some value.

 Obligatory XKCD: https://xkcd.com/810/

;-)

 The best CAPTCHAs are the kind that do this.  Look at how hard it is to beat
 reCAPTCHA because they have taken this approach.  One must be careful, though,
 that the CAPTCHA is constructed such that it won't be as simple as a lookup,
 and will actually require some thought (so that probably eliminates
 the noun, verb, adjective idea).

 This idea has my support.

We should use fewer CAPTCHAs.

If the problem is spam, we should build better new URL review
systems.  There are externally managed spam lists that we could use to
identify spammers.

'New URLs' could be defined as domain names that have not been in the
external links table for more than 24 hrs.

Addition of these new URLs could be smartly throttled.

Un-autoconfirmed edits which include 'new URLs' could be throttled so
that they can only be added to a single article for the first 24
hours.  That allows a new user to make use of a new domain name
unimpeded; however, they can only use it on one page for the first 24
hrs.  If the new URL was spam, it will hopefully be removed within 24
hrs, which resets the clock for the spammer, i.e. they can only add
the spam to one page every 24 hrs.
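
A rough sketch of that throttle in plain PHP, against a hypothetical schema
(externallinks_seen, newurl_additions) and MySQL-style date arithmetic; none
of this is existing MediaWiki code:

<?php
// Illustrative sketch only, not MediaWiki code. Assumed schema:
// externallinks_seen(domain, first_seen) and
// newurl_additions(domain, page_id, added_at).

function isNewDomain(PDO $db, string $domain): bool {
    // A domain counts as new unless it has been in the external links
    // table for more than 24 hours.
    $stmt = $db->prepare('SELECT 1 FROM externallinks_seen
                           WHERE domain = ?
                             AND first_seen <= NOW() - INTERVAL 24 HOUR');
    $stmt->execute([$domain]);
    return $stmt->fetchColumn() === false;
}

function mayAddUrl(PDO $db, string $domain, int $pageId,
                   bool $isAutoconfirmed): bool {
    // The throttle only applies to new domains added by
    // un-autoconfirmed users.
    if ($isAutoconfirmed || !isNewDomain($db, $domain)) {
        return true;
    }
    // Allow the edit only if, within the last 24 hours, this domain has
    // not been added to any page other than the current one.
    $stmt = $db->prepare('SELECT COUNT(DISTINCT page_id)
                            FROM newurl_additions
                           WHERE domain = ? AND page_id <> ?
                             AND added_at > NOW() - INTERVAL 24 HOUR');
    $stmt->execute([$domain, $pageId]);
    return (int)$stmt->fetchColumn() === 0;
}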

Another idea is for the wiki to ask a user who adds new URLs to
review three recent edits that included new URLs and to indicate
whether or not each new URL was spam and should be removed.
This may be unworkable, because a spam-bot could use the linksearch
tool to check whether a link is good or not.

--
John Vandenberg

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] Problem with saving edits

2012-07-25 Thread James Heilman
I have been having problems saving some edits for the last week or so.
This is not associated with an edit conflict. Wondering if anyone else
has experienced this problem?

-- 
James Heilman
MD, CCFP-EM, Wikipedian

The Wikipedia Open Textbook of Medicine
www.opentextbookofmedicine.com

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Problem with saving edits

2012-07-25 Thread MZMcBride
James Heilman wrote:
 I have been having problems saving some edits for the last week or so.
 This is not associated with an edit conflict. Wondering if anyone else
 has experienced this problem?

You're being incredibly vague here.

On which wikis have you had this problem? Does it happen every time or just
sometimes? Do you get a specific error message? What happens when you try to
save the edits? Does this happen while logged in and while logged out? Does
this happen in a specific Web browser or in every Web browser? Does this
happen with a specific computer or any computer?

All these details and more will help you get a useful reply. :-)

MZMcBride



___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] IRC office hours with the Analytics team, Monday, July 30, 2012 at 19:00 UTC

2012-07-25 Thread Tilman Bayer
Hi all,

you are cordially invited to the first ever IRC office hours of the
Foundation's recently formed Analytics team, taking place in
#wikimedia-analytics on Freenode on Monday, July 30 at 19:00 UTC /
noon PT 
(http://www.timeanddate.com/worldclock/fixedtime.html?hour=19&day=30&month=07&year=2012
).

It is an opportunity to ask all your analytics and statistics related
questions about Wikipedia and the other Wikimedia projects, in
particular regarding the Wikimedia Report Card and the upcoming
Kraken analytics platform. See also the blog post that the team just
published: https://blog.wikimedia.org/2012/07/25/meet-the-analytics-team/
, as well as https://www.mediawiki.org/wiki/Analytics

General information about IRC office hours is available at
https://meta.wikimedia.org/wiki/IRC_office_hours .

Regards,
--
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Ryan Lane
 We're discussing alternatives to *gerrit*, not to git.


Heh. Sorry, yes. That's what I meant.

- Ryan

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] suggestion: replace CAPTCHA with better approaches

2012-07-25 Thread Daniel Friesen
On Wed, 25 Jul 2012 17:55:06 -0700, John Vandenberg jay...@gmail.com wrote:



On Thu, Jul 26, 2012 at 5:00 AM, Derric Atzrott
datzr...@alizeepathology.com wrote:
This way, if people feel motivated to cheat at the captcha, they will end up
helping Wikipedia. It is up to us to try to balance things out.

I'm pretty sure users will be less annoyed at solving captchas that
actually contribute some value.

Obligatory XKCD: https://xkcd.com/810/

;-)

The best CAPTCHAs are the kind that do this.  Look at how hard it is to beat
reCAPTCHA because they have taken this approach.  One must be careful, though,
that the CAPTCHA is constructed such that it won't be as simple as a lookup,
and will actually require some thought (so that probably eliminates
the noun, verb, adjective idea).

This idea has my support.

We should use fewer CAPTCHAs.

If the problem is spam, we should build better new URL review
systems.  There are externally managed spam lists that we could use to
identify spammers.

'New URLs' could be defined as domain names that have not been in the
external links table for more than 24 hrs.

Addition of these new URLs could be smartly throttled.

Un-autoconfirmed edits which include 'new URLs' could be throttled so
that they can only be added to a single article for the first 24
hours.  That allows a new user to make use of a new domain name
unimpeded; however, they can only use it on one page for the first 24
hrs.  If the new URL was spam, it will hopefully be removed within 24
hrs, which resets the clock for the spammer, i.e. they can only add
the spam to one page every 24 hrs.

Another idea is for the wiki to ask a user who adds new URLs to
review three recent edits that included new URLs and to indicate
whether or not each new URL was spam and should be removed.
This may be unworkable, because a spam-bot could use the linksearch
tool to check whether a link is good or not.

--
John Vandenberg


Your proposal fails to account for two important facts:
- A lot of spam may not even add links to the page.
- Don't underestimate bot programming. I've seen bots in the wild that
wait for autoconfirmed status and then spam. If there is some pattern that
can be followed to get access to spam the wiki, bots will be programmed to
use that pattern to bypass spam limits.


--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l