Re: [Wikitech-l] Extension:OpenID 3.00 - Security Release

2013-03-08 Thread Yuvi Panda
Was this the last blocker to getting the extension deployed?
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Extension:OpenID 3.00 - Security Release

2013-03-08 Thread Thomas Gries
On 08.03.2013 10:07, Yuvi Panda wrote:
 Was this the last blocker to getting the extension deployed?

One, two or three further non-security-related patches will follow in the
next few days
which improve the user interface, especially the preference tab for OpenID.

stay tuned...

Regards,
Tom




Re: [Wikitech-l] Extension:OpenID 3.00 - Security Release

2013-03-08 Thread Marc A. Pelletier

On 03/08/2013 01:34 AM, Petr Bena wrote:

this shouldn't be very
dangerous


Even if it isn't in practice in typical cases, it exposes a third 
party to a risk they are unable to assess if they use that OpenID.  (And 
it doesn't even require a 'crat going rogue -- renames are 
sometimes done without salting the former username, so an unrelated 
third party could create an account to reuse the username and then probe 
plausible consumers of the ID.)


-- Marc



Re: [Wikitech-l] Bug 1542 - Log spam blacklist hits

2013-03-08 Thread anubhav agarwal
Hey Guys,

Thanks for explaining it to me. Can I have your IRC handles? I still think
I have many doubts.

Is there a simpler bug related to the extension, so I can get an idea of how it
works?

On Fri, Mar 8, 2013 at 5:23 AM, Chris Steipp cste...@wikimedia.org wrote:

 On Thu, Mar 7, 2013 at 1:34 PM, Platonides platoni...@gmail.com wrote:
  On 07/03/13 21:03, anubhav agarwal wrote:
  Hey Chris
 
  I was exploring the SpamBlacklist extension. I have some doubts I hope you
 could
  clear them.
 
  Is there any place I can get documentation of
  Class SpamBlacklist in the file SpamBlacklist_body.php. ?

 There really isn't any documentation besides the code, but there are a
 couple more things you should look at. Notice that in SpamBlacklist.php,
 there is the line $wgHooks['EditFilterMerged'][] =
 'SpamBlacklistHooks::filterMerged';, which is the way that
 SpamBlacklist registers itself with MediaWiki core to filter edits. So
 when MediaWiki core runs the EditFilterMerged hooks (which it does in
 includes/EditPage.php, line 1287), all of the extensions that have
 registered a function for that hook are run with the passed-in
 arguments, so SpamBlacklistHooks::filterMerged is run. And
 SpamBlacklistHooks::filterMerged then just sets up and calls
 SpamBlacklist::filter. So that is where you can start tracing what is
 actually in the variables, in case Platonides' summary wasn't enough.
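
The registration-and-dispatch mechanism described above can be sketched in a few lines of Python (MediaWiki's actual implementation is in PHP; the hook name and handler mirror the example, but the dispatcher itself is a simplification, not core's real code):

```python
# Minimal sketch of the MediaWiki hook pattern: extensions append
# handlers to a named hook list, and core runs all of them with the
# same arguments. A handler returning False blocks the action.
from collections import defaultdict

wg_hooks = defaultdict(list)   # stands in for $wgHooks

def run_hooks(name, *args):
    """Core-side dispatch: run every handler registered for `name`."""
    for handler in wg_hooks[name]:
        if handler(*args) is False:
            return False       # a filter rejected the edit
    return True

# Extension side, analogous to SpamBlacklist.php doing
# $wgHooks['EditFilterMerged'][] = 'SpamBlacklistHooks::filterMerged';
def filter_merged(title, text):
    return "forbidden-link.example" not in text

wg_hooks["EditFilterMerged"].append(filter_merged)

print(run_hooks("EditFilterMerged", "Sandbox", "hello"))       # True
print(run_hooks("EditFilterMerged", "Sandbox",
                "see forbidden-link.example"))                 # False
```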


 
  In function filter what does the following variables represent ?
 
  $title
  Title object (includes/Title.php) This is the page where it tried to
 save.
 
  $text
  Text being saved in the page/section
 
  $section
  Name of the section or ''
 
  $editpage
  EditPage object if EditFilterMerged was called, null otherwise
 
  $out
 
  A ParserOutput object (actually, this variable name was a bad choice; it
  looks like an OutputPage), see includes/parser/ParserOutput.php
 
 
  I have understood the following things from the code, please correct me
 if
  I am wrong. It extracts the edited text, and parses it to find the links.
 
  Actually, it uses the fact that the parser will have processed the
  links, so in most cases just obtains that information.
 
 
  It then replaces the links which match the whitelist regex,
  This doesn't make sense as you explain it. It builds a list of links,
  and replaces whitelisted ones with '', i.e. removes whitelisted links
  from the list.
 
  and then checks if there are some links that match the blacklist regex.
  Yes
 
  If the check is greater you return the content matched.
 
  Right, $check will be non-0 if the links matched the blacklist.
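
The whitelist-then-blacklist flow discussed above can be sketched in Python (the regexes, link list, and function name are invented stand-ins for illustration, not SpamBlacklist's real patterns or API):

```python
# Sketch of the filtering logic: take the links added in an edit,
# drop whitelisted ones (the '' replacement discussed above), then
# report any remaining links that match the blacklist.
import re

whitelist = re.compile(r"https?://([a-z0-9.-]*\.)?wikipedia\.org", re.I)
blacklist = re.compile(r"https?://([a-z0-9.-]*\.)?spam-site\.example", re.I)

def check_edit(links):
    # remove whitelisted links from consideration
    candidates = [l for l in links if not whitelist.match(l)]
    # the returned list plays the role of $check: non-empty means blocked
    return [l for l in candidates if blacklist.match(l)]

links = ["https://en.wikipedia.org/wiki/Spam",
         "http://spam-site.example/buy-now"]
print(check_edit(links))  # ['http://spam-site.example/buy-now']
```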
 
  it already enters in the debuglog if it finds a match
 
  Yes, but that is a private log.
  Bug 1542 talks about making that accessible in the wiki.

 Yep. For example, see
 * https://en.wikipedia.org/wiki/Special:Log
 * https://en.wikipedia.org/wiki/Special:AbuseLog

 
 
  I guess the bug aims at creating an SQL table.
  I was thinking of the following fields to log.
  Title, Text, User, URLs, IP. I don't understand why you denied it.
 
  Because we don't like to publish the IPs *in the wiki*.

 The WMF privacy policy also discourages us from keeping IP addresses
 longer than 90 days, so if you do keep IPs, then you need a way to
 hide / purge them, and if they allow someone to see what IP address a
 particular username was using, then only users with checkuser
 permissions are allowed to see that. So it would be easier for you not
 to include it, but if it's desired, then you'll just have to build
 those protections out too.
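
A minimal sketch of that retention rule, assuming a hypothetical log layout (the row fields and function name here are invented for illustration, not an actual schema):

```python
# Sketch of the 90-day rule: if IP addresses are logged at all,
# entries older than the retention window must have the IP purged
# (here nulled out, keeping the rest of the log row).
from datetime import datetime, timedelta

RETENTION = timedelta(days=90)

def purge_old_ips(rows, now):
    """Null the 'ip' field of log rows older than the retention window."""
    for row in rows:
        if now - row["timestamp"] > RETENTION:
            row["ip"] = None
    return rows

rows = [{"ip": "192.0.2.1", "timestamp": datetime(2012, 12, 1)},
        {"ip": "192.0.2.2", "timestamp": datetime(2013, 3, 7)}]
purge_old_ips(rows, now=datetime(2013, 3, 8))
print([r["ip"] for r in rows])  # [None, '192.0.2.2']
```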

 
  I think the approach should be to log matches using abusefilter
  extension if that one is loaded.

 The abusefilter log format has a lot of data in it specific to
 AbuseFilter, and is used to re-test abuse filters, so adding these
 hits into that log might cause some issues. I think either the general
 log, or using a separate, new log table would be best. Just for some
 numbers, in the first 7 days of this month, we've had an average of
 27,000 hits each day. So if this goes into an existing log, it's going
 to generate a significant amount of data.





-- 
Cheers,
Anubhav


Anubhav Agarwal | 4th Year | Computer Science & Engineering | IIT Roorkee

Re: [Wikitech-l] Bug 1542 - Log spam blacklist hits

2013-03-08 Thread Chris Steipp
csteipp. Feel free to ping me whenever.
On Mar 8, 2013 6:23 AM, anubhav agarwal anubhav...@gmail.com wrote:

 Hey Guys,

 Thanks for explaining it to me. Can I have your IRC handles? I still think
 I have many doubts.

 Is there a simpler bug related to the extension, so I can get an idea of how it
 works?

 [rest of quoted message snipped]

Re: [Wikitech-l] Seemingly proprietary Javascript

2013-03-08 Thread Antoine Musso
Le 06/03/13 13:34, Chad wrote:
 Jack Phoenix wrote:
  we'll soon be debating about the very meaning of the word is.

 Jack is not alone.
   ^^

Care to elaborate on the meaning there?

-- 
Antoine hashar Musso
Sorry it had to be made



Re: [Wikitech-l] Github/Gerrit mirroring

2013-03-08 Thread Antoine Musso
On 05/03/13 11:27, Krinkle wrote:
 If all we do is immediately copy the PR, submit it to Gerrit and
 close the PR saying Please create a WMFLabs account, learn all of 
 fucking Gerrit, and then continue on Gerrit to finalise the patch,
 then we should just kill PR now.

That has always been my point. The code ultimately has to land in
Gerrit, so, to me, there is no point in using GitHub pull requests.

I guess the whole idea of using GitHub is for public relations and to
attract new people.  Then, if a developer is not willing to learn
Gerrit, their code is probably not worth the effort of integrating
GitHub/Gerrit.  That will just add more poor-quality code to our
review queues.

-- 
Antoine hashar Musso



Re: [Wikitech-l] Github/Gerrit mirroring

2013-03-08 Thread Dan Andreescu
 ... Then, if a developer is not willing to learn
 Gerrit, its code is probably not worth the effort of us integrating
 github/gerrit.  That will just add some more poor quality code to your
 review queues.


That seems like a pretty big assumption, and likely to be wrong.  The
simpler the code review process, the happier people will be to submit
patches.  Quality seems independent of that, and more likely linked to
the ease of validating patches (linting, unit test requirements, good style
guides, etc.).  But that's just a guess.  If deemed interesting, I would be
glad to help quantify patch quality and analyze what helps to improve it.

[Wikitech-l] Some Sort of Notice for Breaking Changes

2013-03-08 Thread Tyler Romeo
Is there any way that extension developers can get some sort of notice of
breaking changes, e.g., https://gerrit.wikimedia.org/r/50138? Luckily my
extension's JobQueue implementation hasn't been merged yet, but if it had been,
I would have had no idea that it had been broken by core.
*--*
*Tyler Romeo*
Stevens Institute of Technology, Class of 2015
Major in Computer Science
www.whizkidztech.com | tylerro...@gmail.com

Re: [Wikitech-l] Github/Gerrit mirroring

2013-03-08 Thread Quim Gil

On 03/08/2013 08:31 AM, Dan Andreescu wrote:

... Then, if a developer is not willing to learn
Gerrit, its code is probably not worth the effort of us integrating
github/gerrit.  That will just add some more poor quality code to your
review queues.


imho GitHub has the potential to get us a first patch from many 
contributors that won't arrive through gerrit.wikimedia.org first. It's 
just a lot simpler for GitHub users. Some of those patches will be good, 
some not so much, but that is probably also the case for first time 
contributors in Gerrit.


When a developer submits a second and a third pull request via GitHub 
then we can politely invite her to check 
http://www.mediawiki.org/wiki/Gerrit and join our actual development 
process.


--
Quim Gil
Technical Contributor Coordinator @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil


Re: [Wikitech-l] Github/Gerrit mirroring

2013-03-08 Thread Andrew Otto
I've been hosting my puppet-cdh4 (Hadoop) repository on GitHub for a while now. 
 I am planning on moving this into Gerrit.

I've been getting pretty high quality pull requests for the last month or so 
from a couple of different users. (Including CentOS support, supporting 
MapReduce v1 as well as YARN, etc.) 

  https://github.com/wikimedia/puppet-cdh4/issues?page=1&state=closed

I'm happy to host this in Gerrit, but I suspect that contribution to this 
project will drop once I do. :/

On Mar 8, 2013, at 11:47 AM, Quim Gil q...@wikimedia.org wrote:

 On 03/08/2013 08:31 AM, Dan Andreescu wrote:
 ... Then, if a developer is not willing to learn
 Gerrit, its code is probably not worth the effort of us integrating
 github/gerrit.  That will just add some more poor quality code to your
 review queues.
 
 imho GitHub has the potential to get us a first patch from many contributors 
 that won't arrive through gerrit.wikimedia.org first. It's just a lot simpler 
 for GitHub users. Some of those patches will be good, some not so much, but 
 that is probably also the case for first time contributors in Gerrit.
 
 [rest of quoted message snipped]



[Wikitech-l] JobQueue changes (Re: Some Sort of Notice for Breaking Changes)

2013-03-08 Thread Rob Lanphier
On Fri, Mar 8, 2013 at 8:35 AM, Tyler Romeo tylerro...@gmail.com wrote:
 Is there any way that extension developers can get some sort of notice for
 breaking changes, e.g., https://gerrit.wikimedia.org/r/50138? Luckily my
 extension's JobQueue implementation hasn't been merged yet, but if it had I
 would have no idea that it had been broken by the core.

Hi Tyler,

Sorry to hear that there might be a problem here.  It's been a pet
peeve of mine that we seem to be a little too eager to break backwards
compatibility in places where it may not be necessary.  That said,
let's try to avoid a meta-process discussion before we collectively
understand the example you are bringing up, and focus on the JobQueue.

As near as I can tell from a quick skim of the changeset you're
referencing, Aaron's changes here are purely additive.  Am I reading
this wrong?  Is there some other changeset that changes/removes
existing interfaces that you meant to reference instead?

Rob


Re: [Wikitech-l] Github/Gerrit mirroring

2013-03-08 Thread Jon Robson
On 8 Mar 2013 10:47, Quim Gil q...@wikimedia.org wrote:

 On 03/08/2013 08:31 AM, Dan Andreescu wrote:

 ... Then, if a developer is not willing to learn
 Gerrit, its code is probably not worth the effort of us integrating
 github/gerrit.  That will just add some more poor quality code to your
 review queues.


 imho GitHub has the potential to get us a first patch from many
contributors that won't arrive through gerrit.wikimedia.org first. It's
just a lot simpler for GitHub users. Some of those patches will be good,
some not so much, but that is probably also the case for first time
contributors in Gerrit.

+1. To me, the need to create a Gerrit account is a huge barrier to entry. I
think we are missing out on attracting small but useful patches from
developers who are not heavily invested in the project and have no wish to
become regular core contributors...

 [rest of quoted message snipped]

Re: [Wikitech-l] JobQueue changes (Re: Some Sort of Notice for Breaking Changes)

2013-03-08 Thread Tyler Romeo
On Fri, Mar 8, 2013 at 12:18 PM, Rob Lanphier ro...@wikimedia.org wrote:

 As near as I can tell from a quick skim of the changeset you're
 referencing, Aaron's changes here are purely additive.  Am I reading
 this wrong?  Is there some other changeset that changes/removes
 existing interfaces that you meant to reference instead?


At first glance it seems additive, but the change adds a new abstract
method to the JobQueue class, meaning any child class of JobQueue that
doesn't implement the new method will trigger a fatal error.

To make it non-breaking, the function would have to have a default
implementation in the main JobQueue class.
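
The failure mode can be reproduced in miniature (Python's abc module stands in for PHP abstract classes here; in PHP the fatal error occurs when the subclass is loaded, in Python at instantiation; the class and method names are made up, not the real JobQueue API):

```python
# Adding a new abstract method to a base class breaks every existing
# subclass that doesn't implement it -- the breakage Tyler describes.
from abc import ABC, abstractmethod

class JobQueue(ABC):
    @abstractmethod
    def push(self, job): ...

    # Newly added in "core" -- every third-party subclass now breaks.
    # A non-breaking alternative would be a default implementation, e.g.
    #     def get_size(self): raise NotImplementedError
    @abstractmethod
    def get_size(self): ...

class MyJobQueue(JobQueue):   # an extension's pre-existing subclass
    def push(self, job):
        pass                  # never implemented get_size()

try:
    MyJobQueue()
except TypeError as e:
    print("broken:", e)
```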

*--*
*Tyler Romeo*
Stevens Institute of Technology, Class of 2015
Major in Computer Science
www.whizkidztech.com | tylerro...@gmail.com

[Wikitech-l] Wikimedia Hackathon Amsterdam 2013: Registration opened

2013-03-08 Thread Maarten Dammers

Hi everyone,

Wikimedia Nederland invites all developers to the Wikimedia Hackathon. 
The Wikimedia Hackathon will take place 24-26 May 2013. Registration 
is now open and also includes the possibility to apply for a travel, 
accommodation or full scholarship. You can find the form at 
https://docs.google.com/spreadsheet/viewform?formkey=dFg2SmRRbkpxNmxCcFNFdlduVlJuTUE6MQ#gid=0


The hackathon is an opportunity for all Wikimedia community developers 
and sysadmins to come together, squash bugs and write great new features 
& tools. Unlike previous years (2012, 2011, etc.), this Hackathon 
won't be in Berlin, but in Amsterdam.


The event is open to a wide range of developers. We welcome both 
seasoned and new developers as well as people working on MediaWiki, 
tools, pywikipedia, Wikidata, gadgets, extensions, templates, and more. Please 
suggest and discuss topics at 
https://www.mediawiki.org/wiki/Amsterdam_Hackathon_2013/Topics .


You can indicate that you're coming at 
https://www.mediawiki.org/wiki/Amsterdam_Hackathon_2013/Attendees and/or 
https://www.facebook.com/events/167285526755104/ . This doesn't replace 
registration, it's just to let others know what you're up to.


Keep an eye on https://www.mediawiki.org/wiki/Amsterdam_Hackathon_2013 
for updates!


Maarten



Re: [Wikitech-l] Indexing structures for Wikidata

2013-03-08 Thread bawolff
On Thu, Mar 7, 2013 at 12:50 PM, Denny Vrandečić
denny.vrande...@wikimedia.de wrote:
 As you probably know, the search in Wikidata sucks big time.

 Until we have created a proper Solr-based search and deployed on that
 infrastructure, we would like to implement and set up a reasonable stopgap
 solution.

 The simplest and most obvious signal for sorting the items would be to
 1) make a prefix search
 2) weight all results by the number of Wikipedias it links to

 This should usually provide the item you are looking for. Currently, the
 search order is random. Good luck with finding items like California,
 Wellington, or Berlin.

 Now, what I want to ask is, what would be the appropriate index structure
 for that table. The data is saved in the wb_terms table, which would need
 to be extended by a weight field. There is already a suggestion (based on
 discussions between Tim and Daniel K if I understood correctly) to change
 the wb_terms table index structure (see here 
 https://bugzilla.wikimedia.org/show_bug.cgi?id=45529 ), but since we are
 changing the index structure anyway it would be great to get it right this
 time.

 Anyone who can jump in? (Looking especially at Asher and Tim)

 Any help would be appreciated.

 Cheers,
 Denny

 --
 Project director Wikidata
 Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
 Tel. +49-30-219 158 26-0 | http://wikimedia.de

 Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
 Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
 der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für
 Körperschaften I Berlin, Steuernummer 27/681/51985.

AFAIK SQL isn't particularly good at indexing that type of query.

You could maybe have a bunch of indexes for the first couple letters
of a term, and then after some point hope that things are narrowed
down enough that just doing a prefix search is acceptable. For
example, you might have indexes on (wb_term(1), wb_weight),
(wb_term(2), wb_weight), ..., (wb_term(7), wb_weight) and one on just
wb_term. That way (I believe) you would be able to do efficient
searches for a prefix ordered by weight, provided the prefix is less
than 7 characters. (7 was chosen arbitrarily out of a hat. Performance
goes down as you add more indexes, from what I understand. I'm not sure
how far you could take this scheme before that becomes an
issue. You could maybe enhance this by only updating search suggestions
for every 2 characters the user enters, or something.)

--bawolff

p.s. Have not tested this, and talking a bit outside my knowledge area, so ymmv
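
The scheme above can be simulated in plain Python, with one bucket per prefix length standing in for each proposed (wb_term(N), wb_weight) index. All names and the data structure are hypothetical; a real database index behaves differently, this only illustrates the lookup pattern:

```python
# Emulate per-prefix-length indexes (lengths 1..7) that return terms
# ordered by descending weight, with a linear scan as the fallback
# for prefixes longer than the indexed length.
from collections import defaultdict

MAX_PREFIX = 7

class PrefixWeightIndex:
    def __init__(self):
        # one "index" per prefix length: prefix -> list of (-weight, term)
        self.buckets = [defaultdict(list) for _ in range(MAX_PREFIX)]
        self.terms = []   # fallback "full" list

    def add(self, term, weight):
        self.terms.append((term, weight))
        for n in range(1, min(MAX_PREFIX, len(term)) + 1):
            self.buckets[n - 1][term[:n]].append((-weight, term))

    def search(self, prefix, limit=5):
        n = len(prefix)
        if 1 <= n <= MAX_PREFIX:
            hits = self.buckets[n - 1].get(prefix, [])
        else:
            # longer prefixes: scan, as the real index would degrade too
            hits = [(-w, t) for t, w in self.terms if t.startswith(prefix)]
        return [t for _, t in sorted(hits)[:limit]]

idx = PrefixWeightIndex()
for term, weight in [("Berlin", 80), ("Berlin Wall", 40), ("Bern", 30)]:
    idx.add(term, weight)
print(idx.search("Ber"))  # ['Berlin', 'Berlin Wall', 'Bern'] -- heaviest first
```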


Re: [Wikitech-l] [Labs-l] Wikimedia Hackathon Amsterdam 2013: Registration opened

2013-03-08 Thread Petr Bena
I have one question :) Why is the registration form asking me which
year and month I will depart? Are you afraid some attendees are
planning to stay for several years? :D

On Fri, Mar 8, 2013 at 6:54 PM, Maarten Dammers maar...@mdammers.nl wrote:
 Hi everyone,

 [rest of quoted message snipped]


Re: [Wikitech-l] Github/Gerrit mirroring

2013-03-08 Thread Bartosz Dziewoński

On Fri, 08 Mar 2013 17:07:18 +0100, Antoine Musso hashar+...@free.fr wrote:


I guess the whole idea of using GitHub is for public relation and to
attract new people.  Then, if a developer is not willing to learn
Gerrit, its code is probably not worth the effort of us integrating
github/gerrit.  That will just add some more poor quality code to your
review queues.


This, a hundred times. I manage a few (small) open-source projects on GitHub, 
and most of the patches I get are not even up to my standards (and those are 
significantly lower than WMF's).

Submitting a patch to gerrit and even fixing it after code review is not that 
hard. (Of course any more complicated operations like rebasing do suck, but you 
hopefully won't be doing that with your first patch.)

--
Matma Rex


Re: [Wikitech-l] JobQueue changes (Re: Some Sort of Notice for Breaking Changes)

2013-03-08 Thread Tyler Romeo
Also, after doing a git-blame, I found https://gerrit.wikimedia.org/r/51886,
which was also merged today. I could search through the core for other
changes like this but it'd require an immense amount of time.

*--*
*Tyler Romeo*
Stevens Institute of Technology, Class of 2015
Major in Computer Science
www.whizkidztech.com | tylerro...@gmail.com

[Wikitech-l] Deployment highlights - 2013-03-08

2013-03-08 Thread Greg Grossmeier
Hello!

This is your friendly weekly deployments highlight email.

For the week of March 11th (next week), here are some things to be aware
of:

* Scribunto (Lua) will be available on all wikis as of Wed the 13th
* HTTPS for all logged in users
  This is planned to happen next week, but the exact deployment window
  is still to be determined. I will inform wikitech-l and -ambassadors
  when it is scheduled.
  See this bug for more info:
  https://bugzilla.wikimedia.org/show_bug.cgi?id=39380

Best,

Greg

-- 
| Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
| identi.ca: @greg           A18D 1138 8E47 FAC8 1C7D |


Re: [Wikitech-l] Deployment highlights - 2013-03-08

2013-03-08 Thread Greg Grossmeier
On 2013-03-08 at 11:26:38 -0800, Greg Grossmeier wrote:
 Hello!
 
 This is your friendly weekly deployments highlight email.
 
 For the week of March 11th (next week), here are some things to be aware
 of:

Also, regarding the Mobile Uploads feature on Wed the 13th:
* we're releasing a call to action to log in or sign up from the article
  upload feature, as well as the ability to donate images to Commons, to
  the full mobile web


(sorry for the noise)

-- 
| Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
| identi.ca: @greg           A18D 1138 8E47 FAC8 1C7D |


Re: [Wikitech-l] Github/Gerrit mirroring

2013-03-08 Thread Antoine Musso
On 08/03/13 09:21, Jon Robson wrote:
 +1 to me the need to create a gerrit account is a huge barrier for entry. I
 think we are missing out on attracting small but useful patches from
 developers who are not heavily invested in the project and have no wish to
 become regular core contributors...

Maybe Gerrit can be made to let one authenticate with one's GitHub account?

-- 
Antoine hashar Musso



Re: [Wikitech-l] Github/Gerrit mirroring

2013-03-08 Thread Chad
On Fri, Mar 8, 2013 at 11:50 AM, Antoine Musso hashar+...@free.fr wrote:
 Le 08/03/13 09:21, Jon Robson a écrit :
 +1 to me the need to create a gerrit account is a huge barrier for entry. I
 think we are missing out on attracting small but useful patches from
 developers who are not heavily invested in the project and have no wish to
 become regular core contributors...

 Maybe Gerrit can be made to let one authenticate with its github account?


Nope. We use LDAP for auth with Gerrit, and it does not support having
multiple authentication methods at the same time (nor do I really see it
as worth the effort).

Getting GitHub PRs into the Gerrit ecosystem is on the Gerrit roadmap,
but we don't have a firm date just yet. I plan to announce this much
more widely when we're close to that.

-Chad


Re: [Wikitech-l] Identifying pages that are slow to render

2013-03-08 Thread MZMcBride
Federico Leva (Nemo) wrote:
There's slow-parse.log, but it's private unless a solution is found for
https://gerrit.wikimedia.org/r/#/c/49678/
https://wikitech.wikimedia.org/wiki/Logs

Separate slow-parse into public and private files
https://bugzilla.wikimedia.org/show_bug.cgi?id=45830

https://gerrit.wikimedia.org/r/49678 was abandoned; it looks like
https://gerrit.wikimedia.org/r/52608 is now the relevant Gerrit
changeset.

MZMcBride




Re: [Wikitech-l] Some Sort of Notice for Breaking Changes

2013-03-08 Thread Federico Leva (Nemo)
Partly related: to be fair, Aaron asked for comments about release notes and 
announcements some months ago (although in that case for schema changes), 
but there were none.

http://lists.wikimedia.org/pipermail/wikitech-l/2012-November/064630.html

Nemo


Re: [Wikitech-l] Extension:OpenID 3.00 - Security Release

2013-03-08 Thread Ryan Lane
On Fri, Mar 8, 2013 at 1:07 AM, Yuvi Panda yuvipa...@gmail.com wrote:

 Was this the last blocker to getting the extension deployed?


On wikitech the blockers were the switch of the wiki name (from labsconsole
to wikitech) and this. There are still some issues that need to be worked out
for deployment on the main projects. Also, it needs a full review before
deployment to the projects, and we need to work out how this will affect
the OAuth plans. We have a kickoff meeting for this coming up soon. I'll
send updates when that occurs.

For deployment on wikitech I think I'd like to wait for a full security
review, so it may be a little while.

- Ryan

Re: [Wikitech-l] Some Sort of Notice for Breaking Changes

2013-03-08 Thread Tyler Romeo
True, but schema changes are not as bad because they won't cause fatal
errors in PHP. At the very least if a schema change occurs your wiki will
still be operational.

--Tyler Romeo
On Mar 8, 2013 4:26 PM, Federico Leva (Nemo) nemow...@gmail.com wrote:

 Partly related: to be fair, Aaron asked for comments about release notes and
 announcements some months ago (although in that case for schema changes),
 but there were none.
 http://lists.wikimedia.org/pipermail/wikitech-l/2012-November/064630.html

 Nemo


Re: [Wikitech-l] Some Sort of Notice for Breaking Changes

2013-03-08 Thread Thomas Gries
https://bugzilla.wikimedia.org/show_bug.cgi?id=45915

I think we should use Twitter in addition to the mailing list.

I am not a fan of all new tools, but many OSS projects (ownCloud,
Mailvelope) post their breaking news there.
We do not (yet).





Re: [Wikitech-l] Some Sort of Notice for Breaking Changes

2013-03-08 Thread Chad
On Fri, Mar 8, 2013 at 3:02 PM, Thomas Gries m...@tgries.de wrote:
 https://bugzilla.wikimedia.org/show_bug.cgi?id=45915

 I think we should use Twitter in addition to the mailing list.

 I am not a fan of all new tools, but many OSS projects (ownCloud,
 Mailvelope) post their breaking news there.
 We do not (yet).

You're joking, right?

-Chad


Re: [Wikitech-l] Some Sort of Notice for Breaking Changes

2013-03-08 Thread Thomas Gries
On 09.03.2013 00:04, Chad wrote:
 I think, we should use Twitter in addition to the mailinglist.


 You're joking, right?

 -Chad
Why do you think I am joking?
Major changes can be signalled there - or did I miss something?


Re: [Wikitech-l] Identifying pages that are slow to render

2013-03-08 Thread Marcin Cieslak
 Antoine Musso hashar+...@free.fr wrote:
 On 06/03/13 22:05, Robert Rohde wrote:
 On enwiki we've already made Lua conversions with most of the string
 templates, several formatting templates (e.g. {{rnd}}, {{precision}}),
 {{coord}}, and a number of others.  And there is work underway on a
 number of the more complex overhauls (e.g. {{cite}}, {{convert}}).
 However, it would be nice to identify problematic templates that may
 be less obvious.

 You can get in touch with Brad Jorsch and Tim Starling. They most
 probably have a list of templates that should quickly be converted to Lua
 modules.

 If we got {{cite}} out, that will be already a nice improvement :-]

Not really, given https://bugzilla.wikimedia.org/show_bug.cgi?id=45861

//Saper



Re: [Wikitech-l] Indexing non-text content in LuceneSearch

2013-03-08 Thread oren bochman
-Original Message-
From: wikitech-l-boun...@lists.wikimedia.org 
[mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Brion Vibber
Sent: Thursday, March 7, 2013 9:59 PM
To: Wikimedia developers
Subject: Re: [Wikitech-l] Indexing non-text content in LuceneSearch

On Thu, Mar 7, 2013 at 11:45 AM, Daniel Kinzler dan...@brightbyte.de wrote:
 1) create a specialized XML dump that contains the text generated by
 getTextForSearchIndex() instead of actual page content.

That probably makes the most sense; alternately, make a dump that includes both 
raw data and text for search. This also allows for indexing extra stuff for 
files -- such as extracted text from a PDF or DjVu, or metadata from a JPEG --
if the dump process etc can produce appropriate indexable data.

 However, that only works
 if the dump is created using the PHP dumper. How are the regular dumps 
 currently generated on WMF infrastructure? Also, would it be feasible 
 to make an extra dump just for LuceneSearch (at least for wikidata.org)?

The dumps are indeed created via MediaWiki. I think Ariel or someone can 
comment with more detail on how it currently runs; it's been a while since I 
was in the thick of it.

 2) We could re-implement the ContentHandler facility in Java, and 
 require extensions that define their own content types to provide a 
 Java based handler in addition to the PHP one. That seems like a 
 pretty massive undertaking of dubious value. But it would allow maximum 
 control over what is indexed how.

No don't do it :)

 3) The indexer code (without plugins) should not know about Wikibase, 
 but it may have hard coded knowledge about JSON. It could have a 
 special indexing mode for JSON, in which the structure is deserialized 
 and traversed, and any values are added to the index (while the keys 
 used in the structure would be ignored). We may still be indexing 
 useless internals from the JSON, but at least there would be a lot fewer false 
 negatives.

Indexing structured data could be awesome -- again I think of file metadata as 
well as wikidata-style stuff. But I'm not sure how easy that'll be. Should 
probably be in addition to the text indexing, rather than replacing.
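The JSON indexing mode described in (3) (deserialize, walk the structure, and index the values while ignoring the keys) can be sketched in a few lines. Python is used here purely for illustration, since the real indexer would be Java, and the sample entity structure is illustrative rather than the actual Wikidata schema:

```python
import json

def extract_values(node, out=None):
    # Collect leaf string values from a parsed JSON structure,
    # ignoring the keys, as proposed for the JSON indexing mode.
    if out is None:
        out = []
    if isinstance(node, dict):
        for value in node.values():
            extract_values(value, out)
    elif isinstance(node, list):
        for value in node:
            extract_values(value, out)
    elif isinstance(node, str):
        out.append(node)
    return out

# A Wikidata-like snippet (structure is illustrative, not the real schema):
entity = json.loads('{"labels": {"en": {"language": "en", "value": "Berlin"}}}')
print(" ".join(extract_values(entity)))  # -> en Berlin
```

Note that the language code "en" still ends up in the index (it is a value, not a key), which is exactly the "useless internals" caveat above; the structural keys themselves are skipped.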

-- brion

I agree with Brion.

Here are my five shekels' worth.

To index non-MW dumps with LuceneSearch I would:
1. modify the daemon to read the custom dump format, or update the XML dump to
support a JSON dump (it uses the MWDumper codebase to do this now).
2. add a Lucene analyzer to handle the new data type, say a JSON analyzer.
3. add a Lucene document type per JSON-based Wikidata schema.
4. update the query parser to handle the new queries and the modified Lucene
documents.
5. for bonus points, modify spelling correction and write a Wikidata ranking
algorithm.
But this would only solve reading the static dumps used to bootstrap the
index; I would then have to change how MWSearch periodically polls Brion's
OAIRepository to pull in updated pages.

Having coded some analytics from MW dumps of WMF/Wikia wikis for a research
project, I can say this:
1. Most big dumps (e.g. historic ones) inherit the issues of wikitext, namely
unescaped tags and entities which crash modern Java XML libraries - so escape
your data and validate the XML!
2. The good old SAX code in MWDumper still works fine - so use it.
3. Use Lucene 2.4 with the deprecated old APIs.
4. Ariel is doing a great job (e.g. the 7z compression and the splitting of
the dumps), but these are things MWDumper does not handle yet.
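The "escape your data and validate the XML" advice in (1) can be demonstrated with Python's standard library (a minimal sketch; MWDumper itself is Java, but the failure mode is the same):

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Raw wikitext is full of bare '&', '<' and '>' that are not valid XML.
raw = 'AT&T said <ref>see [[Foo]]</ref>'

# Unescaped, a strict parser rejects it:
try:
    ET.fromstring('<text>%s</text>' % raw)
except ET.ParseError:
    print('unescaped wikitext is not well-formed XML')

# Escaped, it parses and round-trips cleanly:
elem = ET.fromstring('<text>%s</text>' % escape(raw))
assert elem.text == raw
```

Escaping on the way in and re-parsing on the way out is the cheapest validation step a dump consumer can add.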

Finally, based on my work on TranslateWiki search with the i18n team, indexing
JSON data with Solr + Solarium requires no search-engine coding at all.
You define the document schema, then use Solarium to push JSON and get results
back. I could do a demo of this at a coming Hackathon if there is any
interest; however, when I offered to replace LuceneSearch like this last
October, the idea was rejected out of hand.
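A Solr document schema of the kind described above amounts to a handful of field declarations in schema.xml. This fragment is purely illustrative (hypothetical field names from the Solr 3.x era, not the actual TranslateWiki schema):

```xml
<!-- Hypothetical schema.xml fragment; field names are illustrative only. -->
<fields>
  <field name="id"       type="string" indexed="true" stored="true" required="true"/>
  <field name="label"    type="text"   indexed="true" stored="true"/>
  <field name="language" type="string" indexed="true" stored="true"/>
</fields>
```

Solarium then maps keys of the incoming JSON documents onto these fields when pushing them to Solr.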

-- oren


Re: [Wikitech-l] Mediawiki's access points and mw-config

2013-03-08 Thread Waldir Pimenta
On Wed, Feb 27, 2013 at 9:13 PM, Daniel Friesen
dan...@nadir-seen-fire.comwrote:


 index.php, api.php, etc... provide entrypoints into the configured wiki.

 mw-config/ installs and upgrades the wiki. With much of itself
 disconnected from core code that requires a configured wiki. And after
 installation it can even be eliminated completely without issue.


I think this clarifies the issue for me. Correct me if I'm wrong, but
basically the entry points are for continued, repeated use, for indeed
*accessing* wiki resources (hence I suggest normalizing the name
of these scripts to "access points" everywhere in the docs, because "entry"
is a little more generic), while mw-config/index.php is a one-off script
that has no use once the wiki installation is done. I'll update the docs in
mw.org accordingly, to make this clear.


 I wouldn't even include mw-config in entrypoint modifications that would
 be applied to other entrypoint code.


You mean like this one https://gerrit.wikimedia.org/r/#/c/49208/? I can
understand, in the sense that it gives people the wrong idea regarding its
relationship with the other access points, but if the documentation is
clear, I see no reason not to have mw-config/index.php benefit from changes
when the touched code is the part common to all *entry* points (in the
strict meaning of files that can be used to enter the wiki from a web
browser).

That said, and considering what Platonides mentioned:

It was originally named config. It came from the link that sent you
 there: You need to configure your wiki first. Then someone had
 problems with another program that was installed sitewide on his host
 appropriating the /config/ folder, so it was renamed to mw-config.


...I would suggest the mw-config directory to be renamed to something that
more clearly identifies its purpose. I'm thinking "first-run" or something
to that effect. I'll submit a patchset proposing this.

--Waldir