Re: [Wikitech-l] RfC update: LESS stylesheet support in core

2013-09-19 Thread Nikolas Everett
On Thu, Sep 19, 2013 at 4:04 PM, Dan Andreescu dandree...@wikimedia.org wrote:
 - Has http://learnboost.github.io/stylus/ been considered? I've heard that
 it's a good compromise between sass and less (but I haven't played with it
 myself to see if it really lets you do more compass-like things).


 *Popularity* - does matter; one of the long comment threads on the RFC is
 from a potential contributor who is concerned that LESS makes it harder to
 contribute.  I mostly agree with Jon's and Steven's arguments that LESS is
 pretty easy to learn.  However, I have also heard about a year's worth of
 complaints about Limn being written in Coco instead of pure Javascript.  I
 personally think CSS -> LESS is just as mentally taxing as Javascript ->
 Coco, but I'm objectively in the minority based on the feedback I've
 received.  I'd be cautious here.  You can upcompile CSS into LESS, sure,
 but if a contributor has to understand a complex LESS codebase full of
 mixins and abstractions while debugging the generated CSS in the browser,
 they're right to point out that this requires effort.  And this effort
 is only increased for more elegant languages like Stylus.


I'm for any compiled-to-css language because I feel they fill a big
gaping hole in css's ability to share code.  That is really compelling
to me.  I haven't been convinced the compiled-to-js languages offer
quite as compelling a value proposition so the analogy to Limn and
Coco is less relevant to me.  I admit I could be wrong about the value
proposition thing but that is how I feel.  I really don't want to
start a language war though.

I'm a Sass fan but I'll take whatever I can get.

I will point out that CSS is valid LESS which could assuage some fears.
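To make that concrete (a snippet invented for illustration, not from any MediaWiki file): an existing stylesheet compiles through LESS unchanged, and LESS features can then be adopted incrementally:

```less
/* Plain CSS is already valid LESS and passes through unchanged: */
#content h1 { color: #333; margin-bottom: 0.5em; }

/* ...so a file can adopt LESS features incrementally, e.g. a variable: */
@heading-color: #333;
#content h2 { color: @heading-color; }
```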

Nik Everett

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] RfC update: LESS stylesheet support in core

2013-09-19 Thread Brion Vibber
I just want to check in with folks to see if there are any more comments or
issues with this RfC:
https://www.mediawiki.org/wiki/Requests_for_comment/LESS

Basically, this adds a stylesheet preprocessor for ResourceLoader styles
specified as '.less' files; currently no on-wiki or gadget handling is
included, so there are no security issues with LESS @import rules.

LESS http://lesscss.org/ is pretty handy and is used by a number of our
extensions to make styles more maintainable (set constants, do math, make
combined rules for things like -webkit-blah). Direct LESS support in core
will do away with the precompilation step during development.
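As a sketch of the three features mentioned above (names invented, not taken from the patch):

```less
@brand-color: #347bff;            // set constants

.border-radius(@radius: 4px) {    // combined rule for vendor prefixes
  -webkit-border-radius: @radius;
  -moz-border-radius: @radius;
  border-radius: @radius;
}

.infobox {
  .border-radius(6px);
  width: (960px / 3);                       // do math
  border-color: darken(@brand-color, 20%);  // color functions
}
```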

There's a patch implementing it in core:
https://gerrit.wikimedia.org/r/#/c/78669/

and a sample patch updating MobileFrontend to use it:
https://gerrit.wikimedia.org/r/#/c/84139/


Open questions:
* Are there any remaining problems with the caching and dependency checks?
* What's the best way to handle image embedding? (/* @embed */ rules get
messed up but we can use an alternate function...) -- see notes on gerrit
* ...any other concerns with performance, security, or basic functionality?

-- brion

Re: [Wikitech-l] CentralNotice -- Caching and Proxies

2013-09-19 Thread Matthew Walker
Node is a good choice for this kind of task. If the total size of all
unique banners is relatively small you might even be able to cache the
banners in-memory instead of doing backend cache requests.

Though not explicitly proposed before, I was thinking that the best plan
would be to have both node and varnish on the proxy box. I'd rather not
write a caching layer in node when varnish does a fine job at it; but I
also think it's somewhat silly to have symmetric traffic on the proxy when
I can avoid it by having varnish on the box. And the amount of data is
small enough that we can easily fit it in under 16GB of RAM. (Probably
less than 8, but I don't know how it's all going to work itself out in
production.)


~Matt Walker
Wikimedia Foundation
Fundraising Technology Team


On Thu, Sep 19, 2013 at 8:48 AM, Gabriel Wicke gwi...@wikimedia.org wrote:

 On 09/18/2013 06:06 PM, Matthew Walker wrote:
  Hey all,
 
  I've been scheming for a while on how to reduce the number of calls up to
  the server for CentralNotice. At the same time I want to greatly reduce
 the
  number of objects I have in cache.
 
  To do this I propose to change the architecture to having an intermediate
  proxy server with a static head JS section in mediawiki page head. The
  proxy would map down all the variables to only what is required at the
 time.

 +1 for limiting the application logic in regular text Varnishes, both
 from a performance and risk management perspective. Having your own
 banner proxies should make it easier to tweak their behavior to your needs
 without the risk of taking down the entire site.

 Node is a good choice for this kind of task. If the total size of all
 unique banners is relatively small you might even be able to cache the
 banners in-memory instead of doing backend cache requests.

 Gabriel


Re: [Wikitech-l] RfC update: LESS stylesheet support in core

2013-09-19 Thread Dan Andreescu
 - Has http://learnboost.github.io/stylus/ been considered? I've heard that
 it's a good compromise between sass and less (but I haven't played with it
 myself to see if it really lets you do more compass-like things).


I was just writing a message about Stylus [0] so I'm glad you brought it
up.  Limn [1] uses Stylus and we've been pretty happy with it.  I read the
RFC carefully and it seems the two big reasons to pick LESS over
Stylus/SASS are popularity and support in PHP.  The reason to pick
Stylus/SASS over LESS is a more elegant syntax and a slight edge in
features.

*PHP support* - Stylus does have PHP support [2] but it's not even close to
as mature as the LESS support.

*Popularity* - does matter; one of the long comment threads on the RFC is
from a potential contributor who is concerned that LESS makes it harder to
contribute.  I mostly agree with Jon's and Steven's arguments that LESS is
pretty easy to learn.  However, I have also heard about a year's worth of
complaints about Limn being written in Coco instead of pure Javascript.  I
personally think CSS -> LESS is just as mentally taxing as Javascript ->
Coco, but I'm objectively in the minority based on the feedback I've
received.  I'd be cautious here.  You can upcompile CSS into LESS, sure,
but if a contributor has to understand a complex LESS codebase full of
mixins and abstractions while debugging the generated CSS in the browser,
they're right to point out that this requires effort.  And this effort
is only increased for more elegant languages like Stylus.

*Syntax* - Stylus and SASS definitely have cleaner, simpler syntax.  Stylus
aims to be the cleanest of the three, but it definitely smells like that SNL
skit about the number of razor blades: "They have 4 blades?!  Fine, we'll
make one with *5* BLADES!!!"  What I'm referring to here is that Stylus has
optional colons and tries to be as much like Python as possible.

*Features* - The interesting thing about the features comparisons out there
is that all of them seem to be outdated.  For example this write-up [3]
highlights that @media queries can be nested in SASS (same is true for
Stylus).  But the LESS people implemented that as well (Feb 2013).  That
said, it does seem that Stylus and SASS are leading the pack in terms of
new features.  Introspection [4] is a very cool one in Stylus that I'm not
sure you can do in LESS.
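For reference, the nested @media support mentioned above looks like this in LESS (selector invented for illustration); the compiler bubbles the query to the top level of the generated CSS:

```less
.sidebar {
  width: 300px;
  @media (max-width: 768px) {
    width: 100%;
  }
}
```

This compiles to a top-level `@media (max-width: 768px) { .sidebar { width: 100%; } }` rule in the output CSS.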


I think the decision's pretty much been made to go with LESS, and I agree
with it.  I think it strikes the better balance between making it easy for
people to contribute and DRY-ing up our codebase.  But in the future, if we
love the migration to LESS but just wish it had more features and more
DRY-ness, we should revisit Stylus.


[0] - http://learnboost.github.io/stylus/
[1] - https://github.com/wikimedia/limn/tree/develop/css
[2] - https://github.com/AustP/Stylus.php
[3] - http://css-tricks.com/sass-vs-less/
[4] - http://learnboost.github.io/stylus/docs/introspection.html

Re: [Wikitech-l] RfC update: LESS stylesheet support in core

2013-09-19 Thread C. Scott Ananian
Some more questions for discussion:

- I'm concerned that some of the useful things people do with sass (i.e.,
robust cross-browser support with compass) are impossible with less.

- Has http://learnboost.github.io/stylus/ been considered? I've heard that
it's a good compromise between sass and less (but I haven't played with it
myself to see if it really lets you do more compass-like things).
- The interaction between ResourceLoader and @import seems a bit
under-defined.  Although less has not really documented it yet, less added
a slew of new @import options in 1.4.1/1.5.0 (see
https://github.com/less/less.js/blob/master/CHANGELOG.md ;
https://github.com/less/less.js/issues/1185 ;
https://github.com/less/less.js/issues/1209 ;
https://github.com/less/less.js/issues/1210 ).  It would be nice to have a
concrete written guideline for how MW authors are expected to use @import
and/or better integrate @import with ResourceLoader.

- @import processes referenced URLs asynchronously, IIRC, which might cause
issues w/ integration.  I haven't done a code review to see how the
existing patches handle this (or not).
  --scott

Re: [Wikitech-l] [WikimediaMobile] [Analytics] Mobile stats

2013-09-19 Thread Adam Baso
+Analytics


On Thu, Sep 19, 2013 at 1:57 PM, Adam Baso ab...@wikimedia.org wrote:

 A run on yesterday's valid Wikipedia Zero hits showed that the share of
 user agents NOT supporting HTML (i.e., only supporting WAP) is only
 0.098 - 0.108 *percent*.

 Assuming a bunch of complaints don't come in (e.g., "I'm getting tag
 soup!", as Max might say), I think we could make a reasonable case to stop
 supporting WAP through the formal channels (blog, mailing list(s), etc.).

 -Adam


 On Tue, Sep 17, 2013 at 1:11 PM, Arthur Richards 
 aricha...@wikimedia.org wrote:

 That's awesome - thanks Max and Adam; it's great to see the last vestiges
 of X-Device finally disappear!


 On Tue, Sep 17, 2013 at 1:07 PM, Max Semenik maxsem.w...@gmail.com wrote:

 After looking at Varnish VCL with Adam, we discovered a bug in regex
 resulting in many phones being detected as WAP when they shouldn't be.
 Since the older change[1] simplifying detection had also fixed this bug,
 Brandon Black deployed it and since today the usage share of WAP should
 seriously drop. We will be monitoring the situation and revisit the issue
 of WAP popularity once we have enough data.

 [1] https://gerrit.wikimedia.org/r/83919

 On Tue, Sep 10, 2013 at 4:39 PM, Adam Baso ab...@wikimedia.org wrote:

 Thanks. 7-9% of responses on Wikipedia Zero being WAP is pretty
 substantial.


 On Tue, Sep 10, 2013 at 2:01 PM, Andrew Otto o...@wikimedia.org wrote:

  These
  zero.tsv.log*
  files to which I refer seem to be, basically Varnish log lines that
  correspond to Wikipedia Zero-targeted traffic.
 Yup!  Correct.  zero.tsv.log* files are captured unsampled and based
 on the presence of a zero= tag in the X-Analytics header:


 http://git.wikimedia.org/blob/operations%2Fpuppet.git/37ffb0ccc1cd7d3f5612df8779e9a3bdb69066b2/templates%2Fudp2log%2Ffilters.oxygen.erb#L10

  Do I understand correctly that field as Content-Type?
 Yup again!  The varnishncsa format string that is currently being
 beamed at udp2log is here:


 http://git.wikimedia.org/blob/operations%2Fpuppet.git/37ffb0ccc1cd7d3f5612df8779e9a3bdb69066b2/modules%2Fvarnish%2Ffiles%2Fvarnishncsa.default



 --
 Best regards,
 Max Semenik ([[User:MaxSem]])

 ___
 Mobile-l mailing list
 mobil...@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/mobile-l




 --
 Arthur Richards
 Software Engineer, Mobile
 [[User:Awjrichards]]
 IRC: awjr
 +1-415-839-6885 x6687




Re: [Wikitech-l] RfC update: LESS stylesheet support in core

2013-09-19 Thread C. Scott Ananian
@ori: You might want to look into the different @import options before
being so dogmatic.  In particular, the media-query restrictions are
probably very useful to MW.  The (less) option also allows overriding CSS
files, which can help prevent the "everything must be LESS!" problem.  And
the (reference) option would let you use ResourceLoader to bundle files as
usual while *also* allowing less overrides.  This could be important when
we're trying to override styles defined in a different resource loader
bundle.
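A sketch of the two override-related options mentioned above (file and mixin names are hypothetical; per the less.js changelog, the (less) and (reference) options arrived around 1.5.0):

```less
// (less): treat a plain CSS file as LESS so its rules can be
// overridden and its selectors reused as mixins:
@import (less) "legacyStyles.css";

// (reference): pull in another bundle's mixins and variables without
// re-emitting that file's rules into this stylesheet's output:
@import (reference) "skinMixins.less";

.mw-notification {
  .rounded-panel();  // mixin assumed to be defined in skinMixins.less
}
```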

@dan: the particular "less isn't very powerful" issues I'm concerned about
are the ones solved by compass.  As is well-known, there is no equivalent
to compass for less, and there is not likely ever to be, since less can not
express the transformations required.  Compass uses ruby code to do this w/
sass.  For example,
https://github.com/chriseppstein/compass/blob/stable/lib/compass/sass_extensions/functions/gradient_support.rb
is the code in compass used to generate clean gradient specifications
that work with all major browsers (including synthesizing SVG background
images where required).  (Spec in
http://compass-style.org/reference/compass/css3/images/ ).  Now, maybe we
don't actually need all that power.  But the automatic cross-browser
compatibility it allows sure is nice...
  --scott

Re: [Wikitech-l] [WikimediaMobile] [Analytics] Mobile stats

2013-09-19 Thread Adam Baso
A run on yesterday's valid Wikipedia Zero hits showed that the share of
user agents NOT supporting HTML (i.e., only supporting WAP) is only
0.098 - 0.108 *percent*.

Assuming a bunch of complaints don't come in (e.g., "I'm getting tag
soup!", as Max might say), I think we could make a reasonable case to stop
supporting WAP through the formal channels (blog, mailing list(s), etc.).

-Adam


On Tue, Sep 17, 2013 at 1:11 PM, Arthur Richards aricha...@wikimedia.org wrote:

 That's awesome - thanks Max and Adam; it's great to see the last vestiges
 of X-Device finally disappear!


 On Tue, Sep 17, 2013 at 1:07 PM, Max Semenik maxsem.w...@gmail.com wrote:

 After looking at Varnish VCL with Adam, we discovered a bug in regex
 resulting in many phones being detected as WAP when they shouldn't be.
 Since the older change[1] simplifying detection had also fixed this bug,
 Brandon Black deployed it and since today the usage share of WAP should
 seriously drop. We will be monitoring the situation and revisit the issue
 of WAP popularity once we have enough data.

 [1] https://gerrit.wikimedia.org/r/83919

 On Tue, Sep 10, 2013 at 4:39 PM, Adam Baso ab...@wikimedia.org wrote:

 Thanks. 7-9% of responses on Wikipedia Zero being WAP is pretty
 substantial.


 On Tue, Sep 10, 2013 at 2:01 PM, Andrew Otto o...@wikimedia.org wrote:

  These
  zero.tsv.log*
  files to which I refer seem to be, basically Varnish log lines that
  correspond to Wikipedia Zero-targeted traffic.
 Yup!  Correct.  zero.tsv.log* files are captured unsampled and based on
 the presence of a zero= tag in the X-Analytics header:


 http://git.wikimedia.org/blob/operations%2Fpuppet.git/37ffb0ccc1cd7d3f5612df8779e9a3bdb69066b2/templates%2Fudp2log%2Ffilters.oxygen.erb#L10

  Do I understand correctly that field as Content-Type?
 Yup again!  The varnishncsa format string that is currently being
 beamed at udp2log is here:


 http://git.wikimedia.org/blob/operations%2Fpuppet.git/37ffb0ccc1cd7d3f5612df8779e9a3bdb69066b2/modules%2Fvarnish%2Ffiles%2Fvarnishncsa.default



 --
 Best regards,
 Max Semenik ([[User:MaxSem]])





 --
 Arthur Richards
 Software Engineer, Mobile
 [[User:Awjrichards]]
 IRC: awjr
 +1-415-839-6885 x6687


Re: [Wikitech-l] RfC update: LESS stylesheet support in core

2013-09-19 Thread Ori Livneh
On Thu, Sep 19, 2013 at 12:24 PM, C. Scott Ananian
canan...@wikimedia.org wrote:


 - The interaction between ResourceLoader and @import seems a bit
 under-defined.  [...] It would be nice to have a
 concrete written guideline for how MW authors are expected to use @import
 and/or better integrate @import with ResourceLoader.

 - @import processes referenced URLs asynchronously, IIRC, which might cause
 issues w/ integration.  I haven't done a code review to see how the
 existing patches handle this (or not).


@import directives in LESS files pointing at other LESS files are processed
synchronously by phpless and are not present in the generated CSS output,
and that's the only use of '@import' we encourage / allow.

@import is reserved for loading mix-ins and variables so that they may be
used by the current LESS stylesheet. It is not intended to be used for
concatenating / bundling stylesheets that are related to one another only
conceptually; that's what the ResourceLoader module definition is for.


/* - Valid use of @import: - */

// myExtension.less
@import "extensionColors.less";
body { background-color: @bgColor; }

// extensionColors.less
@bgColor: #ccc;


/* - Invalid use of @import: - */

// myExtension.less
@import "headerStyles.less";
body { background-color: #ccc; }

// headerStyles.less
h1 { font-family: serif; }

The relatedness of myExtension.less / headerStyles.less in the second
example should be expressed by referencing these files in the 'styles'
array of the same ResourceLoader module.

I can commit to documenting this on mw.org if / when the proposal is
accepted and the patch is merged.

Re: [Wikitech-l] RecentChanges types (RC_* constants)

2013-09-19 Thread Federico Leva (Nemo)
Does any of the 3 options avoid the same problem as
https://bugzilla.wikimedia.org/show_bug.cgi?id=44874 from hitting us?
Users can ignore Wikidata changes at the cost of efficiency (enhanced RC),
but I understand you don't want them to ignore Flow.


Nemo


Re: [Wikitech-l] RecentChanges types (RC_* constants)

2013-09-19 Thread Erik Bernhardson
Unfortunately no, none of this has anything to do specifically with fixing
the spaghetti that is the enhanced changes format.  I have not looked
deeply into the problem, but the comments from the WikiData developers who
have looked into it suggest it is a non-trivial change.  The change
proposed above is very trivial from an implementation perspective, but it
affects one of the most used tables in MediaWiki, and the developers I've
spoken with have different opinions on which way is the best way to go.  I
wanted to give those I have not talked to directly an opportunity to be
heard before we change anything.

Erik Bernhardson


On Thu, Sep 19, 2013 at 4:11 PM, Federico Leva (Nemo) nemow...@gmail.com wrote:

 Does any of the 3 options avoid the same problem as
 https://bugzilla.wikimedia.org/show_bug.cgi?id=44874 from hitting us?
 Users can ignore Wikidata changes at the cost of efficiency (enhanced RC),
 but I understand you don't want them to ignore Flow.

 Nemo


Re: [Wikitech-l] RfC update: LESS stylesheet support in core

2013-09-19 Thread Matthew Flaschen

On 09/19/2013 04:53 PM, Ori Livneh wrote:

I can commit to documenting this on mw.org if / when the proposal is
accepted and the patch is merged.


This is a good example.  I recommend adding it to 
https://www.mediawiki.org/wiki/Requests_for_comments/LESS/Conventions.


Matt Flaschen


Re: [Wikitech-l] RecentChanges types (RC_* constants)

2013-09-19 Thread Chad
On Thu, Sep 19, 2013 at 11:45 AM, Erik Bernhardson 
ebernhard...@wikimedia.org wrote:

3. Replace RC_EXTERNAL with RC_WIKIDATA and RC_FLOW constants in their
respective extensions.  This is also straightforward, but adds development
overhead to ensure future creators of RC_* constants do not conflict with
each other.  It would be handled similarly to NS_* constants with an
on-wiki list.  I have heard some mention that naming conflicts have
occurred in the past with this solution.  This would force queries looking
for only core sources of change to provide an inclusive list of RC_* values
to find, rather than using rc_type != RC_EXTERNAL.


Please don't repeat the mistake of having extension authors actually caring
what their namespace number is. Everyone just goes "Oh, nobody's probably
using 200 so I'll just do that."

-Chad

Re: [Wikitech-l] RecentChanges types (RC_* constants)

2013-09-19 Thread Daniel Friesen

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]

On 2013-09-19 4:44 PM, Chad wrote:
 On Thu, Sep 19, 2013 at 11:45 AM, Erik Bernhardson 
 ebernhard...@wikimedia.org wrote:

3. Replace RC_EXTERNAL with RC_WIKIDATA and RC_FLOW constants in their
respective extensions.  This is also straightforward, but adds development
overhead to ensure future creators of RC_* constants do not conflict with
each other.  It would be handled similarly to NS_* constants with an
on-wiki list.  I have heard some mention that naming conflicts have
occurred in the past with this solution.  This would force queries looking
for only core sources of change to provide an inclusive list of RC_* values
to find, rather than using rc_type != RC_EXTERNAL.


 Please don't repeat the mistake of having extension authors actually caring
 what their namespace number is. Everyone just goes "Oh, nobody's probably
 using 200 so I'll just do that."

 -Chad
+1

@Erik
The on-wiki list you talk about is here:
https://www.mediawiki.org/wiki/Extension_default_namespaces

"I have heard some mention that naming conflicts have occurred in the
past with this solution."
Yes there are plenty.
120-121 is used by both RefHelper and Rich Media
200-203 is used by SocialProfile and Data Import
300-301 is used by PollNY and Access Control List
Wikia also uses 300-399 when writing its own extensions and doesn't
bother co-operating by at least adding the defaults they use to that
list to avoid conflicts.
500-501 is used by BlogPage and Linked Data
700-701 is used by LinkFilter and Collaboration
BlueSpice and BlogPage have a different type of conflict too. They BOTH
use the constant NS_BLOG and define different namespace defaults for it.

This on-wiki page is ONLY a registry of defaults. The standard practice
for these is that the starting number should be configurable so that
namespace ids other than the default can be used to avoid conflicts. I'm
not so sure you'll be able to do that very well for RC external ids.

Anyway, this whole extension namespace id setup is considered a bug.
You don't want to get into this situation again.
We have an open bug on dropping this default namespace nonsense and
using dynamic registration of namespace IDs
https://bugzilla.wikimedia.org/show_bug.cgi?id=31063



Re: [Wikitech-l] [RFC]: Clean URLs- dropping /wiki/ and /w/index.php?title=..

2013-09-19 Thread Matthew Flaschen

On 09/17/2013 05:59 AM, Daniel Friesen wrote:

Side topic: https://en.wiktionary.org/w/r/t is messed up: "To check for
r/t on Wikipedia, see: //en.wikipedia.org/wiki/r/t"


Good catch, filed: https://bugzilla.wikimedia.org/show_bug.cgi?id=54357

Matt Flaschen


Re: [Wikitech-l] RecentChanges types (RC_* constants)

2013-09-19 Thread Erik Bernhardson
I will take a look over the bug; quite a long conversation.  It will most
likely take me the night to digest the suggestions included.  I suppose my
first worry is that I was targeting simple changes which can be agreed on
and implemented in a few lines, whereas the linked bug report seems to
suggest a system that I know will require many iterations and weeks of
on/off work before being +2'd into core.

Erik Bernhardson


On Thu, Sep 19, 2013 at 5:07 PM, Daniel Friesen
dan...@nadir-seen-fire.com wrote:


 ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]

 On 2013-09-19 4:44 PM, Chad wrote:
  On Thu, Sep 19, 2013 at 11:45 AM, Erik Bernhardson 
  ebernhard...@wikimedia.org wrote:
 
 3. Replace RC_EXTERNAL with RC_WIKIDATA and RC_FLOW constants in
 their
 respective extensions.  This is also straightforward,  but adds
  development
 overhead to ensure future creators of RC_* constants do not conflict
  with
 each other.  It would be handled similarly to NS_* constants with an
 on-wiki list.  I have heard some mention that naming conflicts have
 occurred in the past with this solution. This would force queries
  looking
 for only core sources of change to provide an inclusive list of RC_*
  values
 to find, rather than using rc_type != RC_EXTERNAL.
 
 
  Please don't repeat the mistake of having extension authors actually
 caring
  what their namespace number is. Everyone just goes Oh, nobody's probably
  using 200 so I'll just do that.
 
  -Chad
 +1

 @Erik
 The on-wiki list you talk about is here:
 https://www.mediawiki.org/wiki/Extension_default_namespaces

 "I have heard some mention that naming conflicts have occurred in the
 past with this solution."
 Yes there are plenty.
 120-121 is used by both RefHelper and Rich Media
 200-203 is used by SocialProfile and Data Import
 300-301 is used by PollNY and Access Control List
 Wikia also uses 300-399 when writing its own extensions and doesn't
 bother co-operating by at least adding the defaults they use to that
 list to avoid conflicts.
 500-501 is used by BlogPage and Linked Data
 700-701 is used by LinkFilter and Collaboration
 BlueSpice and BlogPage have a different type of conflict too. They BOTH
 use the constant NS_BLOG and define different namespace defaults for it.

 This on-wiki page is ONLY a registry of defaults. The standard practice
 for these is that the starting number should be configurable so
 namespace ids other than the default can be used to avoid conflicts. I'm
 not so sure you'll be able to do that very well for RC external ids.

 Anyways, this whole extension namespace id setup is considered a bug.
 You don't want to get into this situation again.
 We have an open bug on dropping this default namespace nonsense and
 using dynamic registration of namespace IDs
 https://bugzilla.wikimedia.org/show_bug.cgi?id=31063




Re: [Wikitech-l] RecentChanges types (RC_* constants)

2013-09-19 Thread Daniel Friesen
You can trivially avoid the need to do anything as complex as dynamic
namespace registration by simply using one of your other options like
using the string 'wikidata' or 'flow' rather than a constant and integer id.
If you want integer ids that badly you could always create a new
rc_external_types (or whatever you want to call it) mapping an
auto_increment id to keys like 'wikidata' and 'flow' and use the primary
key there as the rc_external_type.

Long story short. Hardcoding integer numbers into extensions hoping
you're not going to conflict with other extensions is never a good idea.
You're just subjecting yourself to future pain you could have avoided at
the start with a simple solution.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]

On 2013-09-19 5:41 PM, Erik Bernhardson wrote:
 I will take a look over the bug; quite a long conversation.  It will most
 likely take me the night to digest the suggestions included.  I suppose my
 first worry is that I was targeting simple changes which can be agreed on
 and implemented in a few lines, whereas the linked bug report seems to
 suggest a system that I know will require many iterations and weeks of
 on/off work before being +2'd into core.

 Erik Bernhardson


Re: [Wikitech-l] [RFC]: Clean URLs- dropping /wiki/ and /w/index.php?title=..

2013-09-19 Thread Tim Starling
On 20/09/13 03:04, Jon Robson wrote:
 Thanks Tim for running those data. That seems to suggest the URL
 structure works for the most part.

I think the request rate for actual articles in the root is very, very
low. And if you look at the paste I gave earlier:

http://paste.tstarling.com/p/uhtFqg.html

there's reason to think that the amount of traffic that comes from
naive readers typing URLs and expecting an article is much smaller
than even 149k per week. A naive user would be more likely to type a
URL starting with a lower-case letter, and if you take those entries,
and filter out the obvious client bugs and typos, that leaves only 39
log entries. If we filter out some more log entries that are unlikely
search terms for Wikipedia articles ("enregistrement-audio-musique",
"is", "unlimited_data_plan", etc.), that leaves maybe 30.
http://paste.tstarling.com/p/KWuHif.html

Of these, only 12 actually correspond to Wikipedia articles or redirects:

abolition
addicting_games
apple_inc
carnaval
dreamshade
facade
girls
insidious
karthik
online_coupons
snam
walkabout

So the number of naive readers actually helped by our 404 Refresh to
/wiki/ is probably closer to 12k per week than 149k per week.

Personally, I think the refresh is annoying, since it makes it much
more difficult to correct typos in manually-typed URLs. If you
actually meant to type some non-article URL like a CSS resource, and
make a typo which causes it to hit the refresh, the URL you typed is
erased from your browser's address bar and history, making correction
of the typo much more difficult. Maybe we should just include a link
to the search page, rather than redirect or refresh.

-- Tim Starling



Re: [Wikitech-l] [RFC]: Clean URLs- dropping /wiki/ and /w/index.php?title=..

2013-09-19 Thread MZMcBride
Tim Starling wrote:
Personally, I think the refresh is annoying, since it makes it much
more difficult to correct typos in manually-typed URLs. If you
actually meant to type some non-article URL like a CSS resource, and
make a typo which causes it to hit the refresh, the URL you typed is
erased from your browser's address bar and history, making correction
of the typo much more difficult. Maybe we should just include a link
to the search page, rather than redirect or refresh.

Mark Ryan redesigned the 404 page in 2009 and specifically removed the
meta refresh tag (cf. https://bugs.wikimedia.org/17316#c0).

The redesigned page eventually got deployed, but the client-side refresh
very sneakily moved from the HTML output to a Refresh header (cf.
https://bugs.wikimedia.org/35052#c0).

Neither bug is resolved, if anyone is interested in helping out. :-)

MZMcBride




Re: [Wikitech-l] [RFC]: Clean URLs- dropping /w/index.php?title=..

2013-09-19 Thread Gabriel Wicke
On 09/19/2013 10:04 AM, Jon Robson wrote:
 Thanks, Tim, for running those numbers. That seems to suggest the URL
 structure works in most cases.

It certainly confirms that search engines link to working URLs, and
that users typing URLs manually are rare and (eventually) learn to
prefix /wiki/. I am not that convinced that the current number of 404s
says that much about the user-friendliness or aesthetics of different
URL schemes, but that is beside the point (and subjective).

I see /w/index.php?title=.. as the more important clean-up, which is why
the RFC is only about that aspect.

Gabriel


Re: [Wikitech-l] [RFC]: Clean URLs- dropping /wiki/ and /w/index.php?title=..

2013-09-19 Thread Jon Robson
On 19 Sep 2013 18:23, Tim Starling tstarl...@wikimedia.org wrote:

 On 20/09/13 03:04, Jon Robson wrote:
  Thanks, Tim, for running those numbers. That seems to suggest the URL
  structure works in most cases.

 I think the request rate for actual articles in the root is very, very
 low.

I agree... Sorry, I guess my message wasn't so clear. I meant the
existing URL structure :)

And if you look at the paste I gave earlier:

 http://paste.tstarling.com/p/uhtFqg.html

 there's reason to think that the amount of traffic that comes from
 naive readers typing URLs and expecting an article is much smaller
 than even 149k per week. A naive user would be more likely to type a
 URL starting with a lower-case letter, and if you take those entries,
 and filter out the obvious client bugs and typos, that leaves only 39
 log entries. If we filter out some more log entries that are unlikely
 search terms for Wikipedia articles (enregistrement-audio-musique,
 is, unlimited_data_plan, etc.), that leaves maybe 30.
 http://paste.tstarling.com/p/KWuHif.html

 Of these, only 12 actually correspond to Wikipedia articles or redirects:

 abolition
 addicting_games
 apple_inc
 carnaval
 dreamshade
 facade
 girls
 insidious
 karthik
 online_coupons
 snam
 walkabout

 So the number of naive readers actually helped by our 404 Refresh to
 /wiki/ is probably closer to 12k per week than 149k per week.

 Personally, I think the refresh is annoying, since it makes it much
 more difficult to correct typos in manually-typed URLs. If you
 actually meant to type some non-article URL like a CSS resource, and
 make a typo which causes it to hit the refresh, the URL you typed is
 erased from your browser's address bar and history, making correction
 of the typo much more difficult. Maybe we should just include a link
 to the search page, rather than redirect or refresh.

 -- Tim Starling



[Wikitech-l] People with knowledge of English swear words needed :o

2013-09-19 Thread Petr Bena
Are you good at swearing? WE NEED YOU!

Huggle 3 comes with vandalism prediction, as it precaches the diffs,
including their contents, even before they are enqueued. Each edit has
a so-called score, a numerical value; the higher it is, the more
likely the edit is vandalism.

If you want to help us improve this feature, we need to define a
score-words list for every wiki where Huggle is going to be used, for
example the English wiki.

Each list has the following syntax:

(see 
https://en.wikipedia.org/w/index.php?title=Wikipedia:Huggle/Config&diff=573615259&oldid=573615075)


score-words(score):
list of words separated by comma, can contain newlines but comma
must be present

example

score-words(200):
these, are, some, words, which, presence, of, increases, the, score,
each, word, by, 200,

So if you know English better than I do, which you likely do, go
ahead and improve the configuration file there. No worries, Huggle's
config parser is very tolerant of syntax errors.

If you have any other suggestions for improving Huggle's prediction,
go ahead and tell us!
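To make the format concrete, here is a rough sketch of how such a list
could be parsed and applied. This is purely illustrative Python, not
Huggle's actual implementation; `parse_score_words` and `score_edit`
are made-up names:

```python
import re

def parse_score_words(text):
    # Illustrative parser for the format described above: a
    # "score-words(N):" header followed by comma-separated words
    # (newlines allowed, commas required).
    scores = {}
    current_score = None
    for line in text.splitlines():
        line = line.strip()
        header = re.match(r"score-words\((\d+)\):", line)
        if header:
            current_score = int(header.group(1))
        elif current_score is not None and line:
            for word in line.split(","):
                word = word.strip()
                if word:
                    scores[word] = current_score
    return scores

def score_edit(diff_text, scores):
    # Sum the scores of every known word occurring in an edit's text.
    return sum(scores.get(w.lower(), 0) for w in re.findall(r"\w+", diff_text))

config = """
score-words(200):
badword, otherword,
"""
scores = parse_score_words(config)
print(score_edit("This edit adds badword twice: badword!", scores))  # 400
```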


Re: [Wikitech-l] People with knowledge of English swear words needed :o

2013-09-19 Thread C. Scott Ananian
Perhaps we could use some Math here?  Can we grab a list of the last, say,
100,000 edits reverted for vandalism, look at the diff, and compute a
frequency score based on that?
 --scott
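Scott's idea could be sketched roughly like this. The thresholds,
add-one smoothing, and function names here are assumptions for
illustration, not an existing implementation:

```python
import re
from collections import Counter

def word_frequency_scores(reverted_diffs, normal_diffs, scale=200):
    # Count word frequencies in diffs of edits reverted as vandalism
    # versus ordinary diffs, and assign a score to words that are
    # over-represented in the vandalism set.
    def counts(diffs):
        c = Counter()
        for d in diffs:
            c.update(w.lower() for w in re.findall(r"\w+", d))
        return c

    vandal, normal = counts(reverted_diffs), counts(normal_diffs)
    scores = {}
    for word, n in vandal.items():
        ratio = (n + 1) / (normal[word] + 1)  # add-one smoothing
        if ratio > 2:  # word appears disproportionately in vandalism
            scores[word] = int(min(ratio, 5) * scale / 5)
    return scores

scores = word_frequency_scores(
    reverted_diffs=["u r dumb lol", "dumb dumb page"],
    normal_diffs=["added infobox", "fixed typo in infobox"],
)
print(scores)  # {'dumb': 160}
```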

Re: [Wikitech-l] People with knowledge of English swear words needed :o

2013-09-19 Thread Antoine Musso
On 19/09/13 11:35, Petr Bena wrote:
snip
 Huggle 3 comes with vandalism prediction, as it precaches the diffs,
 including their contents, even before they are enqueued. Each edit
 has a so-called score, a numerical value; the higher it is, the more
 likely the edit is vandalism.
 
 If you want to help us improve this feature, we need to define a
 score-words list for every wiki where Huggle is going to be used,
 for example the English wiki.
 
 Each list has the following syntax:
 
 (see 
 https://en.wikipedia.org/w/index.php?title=Wikipedia:Huggle/Config&diff=573615259&oldid=573615075)

The good thing about reinventing the wheel is that you can reuse
existing material :-]

ClueBot NG has such a list: http://review.cluebot.cluenet.org and it's
quite an active one:
 http://en.wikipedia.org/wiki/Special:Contributions/ClueBot_NG


It uses a variety of algorithms to determine the score of an edit:
 http://en.wikipedia.org/wiki/User:ClueBot_NG#Vandalism_Detection_Algorithm


Maybe get in touch with them and reuse their engine?


-- 
Antoine hashar Musso



Re: [Wikitech-l] People with knowledge of English swear words needed :o

2013-09-19 Thread Chad
On Thu, Sep 19, 2013 at 2:35 AM, Petr Bena benap...@gmail.com wrote:

 Are you good at swearing? WE NEED YOU!


I know 7 words you can add ;-)

[[w:Seven dirty words]]

-Chad

Re: [Wikitech-l] CentralNotice -- Caching and Proxies

2013-09-19 Thread Gabriel Wicke
On 09/18/2013 06:06 PM, Matthew Walker wrote:
 Hey all,
 
 I've been scheming for a while on how to reduce the number of calls up to
 the server for CentralNotice. At the same time I want to greatly reduce the
 number of objects I have in cache.
 
 To do this I propose to change the architecture to having an intermediate
 proxy server with a static head JS section in mediawiki page head. The
 proxy would map down all the variables to only what is required at the time.

+1 for limiting the application logic in regular text Varnishes, both
from a performance and risk management perspective. Having your own
banner proxies should make it easier to tweak their behavior to your
without the risk of taking down the entire site.

Node is a good choice for this kind of task. If the total size of all
unique banners is relatively small you might even be able to cache the
banners in-memory instead of doing backend cache requests.
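The in-memory idea could look something like the following minimal
sketch. It is in Python just to illustrate the logic (the thread
suggests Node for the real proxy), and all names here are made up:

```python
import time

class BannerCache:
    # Keep the (small) set of unique banners in the proxy process and
    # refetch an entry from the backend only after its TTL expires.
    def __init__(self, fetch_banner, ttl=300):
        self.fetch_banner = fetch_banner  # backend fetch, e.g. an HTTP call
        self.ttl = ttl
        self._cache = {}  # banner_id -> (expires_at, body)

    def get(self, banner_id):
        now = time.monotonic()
        entry = self._cache.get(banner_id)
        if entry and entry[0] > now:
            return entry[1]                  # cache hit: served from memory
        body = self.fetch_banner(banner_id)  # miss or expired: refetch
        self._cache[banner_id] = (now + self.ttl, body)
        return body

calls = []
cache = BannerCache(lambda bid: calls.append(bid) or f"<div>banner {bid}</div>")
cache.get("fundraising2013")
cache.get("fundraising2013")  # second call hits memory; backend hit only once
```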

Gabriel


Re: [Wikitech-l] People with knowledge of English swear words needed :o

2013-09-19 Thread Petr Bena
Hi, cool, I was actually expecting someone to come up with
suggestions like this. Indeed, I didn't know that, and now I do. In
fact, closer cooperation with ClueBot is on the TO-DO list :-) Any
good algorithm for detecting vandalism is appreciated; in fact, this
might be the first thing we should create hooks for, so that people
can implement their own algorithms as either C++ or Python plugins
which compute the score just as they like... (unfortunately I haven't
managed to get the Python engine working for the Windows build yet)

On Thu, Sep 19, 2013 at 4:47 PM, Antoine Musso hashar+...@free.fr wrote:
 On 19/09/13 11:35, Petr Bena wrote:
 snip
 Huggle 3 comes with vandalism prediction, as it precaches the diffs,
 including their contents, even before they are enqueued. Each edit
 has a so-called score, a numerical value; the higher it is, the more
 likely the edit is vandalism.

 If you want to help us improve this feature, we need to define a
 score-words list for every wiki where Huggle is going to be used,
 for example the English wiki.

 Each list has the following syntax:

 (see 
 https://en.wikipedia.org/w/index.php?title=Wikipedia:Huggle/Config&diff=573615259&oldid=573615075)

 The good thing about reinventing the wheel is that you can reuse
 existing material :-]

 ClueBot NG has such a list: http://review.cluebot.cluenet.org and it's
 quite an active one:
  http://en.wikipedia.org/wiki/Special:Contributions/ClueBot_NG


 It uses a variety of algorithms to determine the score of an edit:
  http://en.wikipedia.org/wiki/User:ClueBot_NG#Vandalism_Detection_Algorithm


 Maybe get in touch with them and reuse their engine?


 --
 Antoine hashar Musso




Re: [Wikitech-l] People with knowledge of English swear words needed :o

2013-09-19 Thread Chris Steipp
On Thu, Sep 19, 2013 at 7:19 AM, C. Scott Ananian canan...@wikimedia.org wrote:

 Perhaps we could use some Math here?  Can we grab a list of the last, say,
 100,000 edits reverted for vandalism, look at the diff, and compute a
 frequency score based on that?
  --scott


This is pretty much what my GSoC student implemented in the Bayesian
filter extension. If that gets some use, then those lists could easily
be fed back.





Re: [Wikitech-l] Meet Bingle and Bugello

2013-09-19 Thread Arthur Richards
Ergh, no, not yet, although during the hackathon at Wikimania,
Diederik and I made some big changes in Bingle to use Bugzilla's
JSON-RPC API, which opens a lot of doors for cool new things. I'm
planning to write a blog post about it in the coming weeks - I'll work
with Diederik to come up with a potential roadmap beforehand.


On Thu, Sep 19, 2013 at 7:56 AM, Sumana Harihareswara suma...@wikimedia.org
 wrote:

 On 06/20/2013 09:37 AM, Brion Vibber wrote:
  On Jun 20, 2013 9:26 AM, Arthur Richards aricha...@wikimedia.org
 wrote:
 
  On Wed, Jun 19, 2013 at 4:39 PM, Andre Klapper aklap...@wikimedia.org
  wrote:
 
 
  Is there any kind of Roadmap file that lists stuff that you think would
  be great to get fixed next or other random ideas, for potential
 drive-by
  contributors on GitHub?
 
 
  Not yet, but good idea! I'll get something up.
 
  Another thing to put on that list is Yuvi's bot for  GitHub pull requests
  to gerrit. We're starting to use this on the android Commons app, and
 it's
  pretty sweet!
 
  -- brion
 
  --
  Arthur Richards
  Software Engineer, Mobile
  [[User:Awjrichards]]
  IRC: awjr
  +1-415-839-6885 x6687
 

 Hey Arthur, did you end up putting together a roadmap? :-)

 --
 Sumana Harihareswara
 Engineering Community Manager
 Wikimedia Foundation




-- 
Arthur Richards
Software Engineer, Mobile
[[User:Awjrichards]]
IRC: awjr
+1-415-839-6885 x6687

Re: [Wikitech-l] People with knowledge of English swear words needed :o

2013-09-19 Thread Neil Harris

On 19/09/13 10:35, Petr Bena wrote:

Are you good at swearing? WE NEED YOU!

Huggle 3 comes with vandalism prediction, as it precaches the diffs,
including their contents, even before they are enqueued. Each edit
has a so-called score, a numerical value; the higher it is, the more
likely the edit is vandalism.

If you want to help us improve this feature, we need to define a
score-words list for every wiki where Huggle is going to be used,
for example the English wiki.

Each list has the following syntax:

(see 
https://en.wikipedia.org/w/index.php?title=Wikipedia:Huggle/Config&diff=573615259&oldid=573615075)


score-words(score):
 list of words separated by comma, can contain newlines but comma
must be present

example

score-words(200):
 these, are, some, words, which, presence, of, increases, the, score,
 each, word, by, 200,



[[en:User:/DeltaQuad/UAA/Blacklist]] contains a fairly comprehensive 
overview of English-language profanity and general trash-talk formatted 
as regexps, mixed in with other non-sweary blocking patterns that are 
specific to that blacklist's needs.


Neil



Re: [Wikitech-l] [RFC]: Clean URLs- dropping /wiki/ and /w/index.php?title=..

2013-09-19 Thread Jon Robson
Thanks, Tim, for running those numbers. That seems to suggest the URL
structure works in most cases.

On Wed, Sep 18, 2013 at 12:07 AM, Tim Starling tstarl...@wikimedia.org wrote:
 On 17/09/13 13:59, Jon Robson wrote:
 I would suggest taking a look at the number of 404s caused by people trying
 to access pages without the wiki prefix. This would be interesting data
 to go alongside this interesting proposal...

 There are lots of different sorts of 404s, so it's necessary to do
 some filtering. For example:

 * double-slashes, due to bug 52253
 * sitemap.xml
 * Apple touch icons
 * bullet.gif in various directories
 * vulnerability scanning, e.g. xmlrpc.php
 * BlueCoat verify/notify, as described in
 http://www.webmasterworld.com/search_engine_spiders/3859463.htm
 * Serial numbers like http://en.wikipedia.org/B008NAYASM .

 I filtered out everything with a dot or slash in the prospective
 article title, as well as the BlueCoat URLs and the UAs responsible
 for serial number URLs. To simplify analysis, I took log lines from
 the English Wikipedia only.

 Most of the remaining log entries were search engine crawlers, so I
 took those out too.

 The result was 149 log entries at a 1/1000 sample rate, for the week
 of September 8-14, implying a request rate of about 639,000 per month.
 This is about 0.006% of the English Wikipedia's page view rate.

 The 149 URLs are at http://paste.tstarling.com/p/uhtFqg.html

 -- Tim Starling





-- 
Jon Robson
http://jonrobson.me.uk
@rakugojon
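The filtering steps Tim describes above could be sketched roughly as
follows. The patterns here are illustrative guesses at the kinds of
filters involved, not the actual ones used:

```python
import re

# Drop 404'd paths that cannot be prospective article titles (dots or
# slashes in the title), plus serial-number-style noise such as
# http://en.wikipedia.org/B008NAYASM mentioned in the thread.
NOISE = re.compile(r"^B0[0-9A-Z]{8}$")  # Amazon-style serial numbers

def prospective_article_titles(paths):
    titles = []
    for path in paths:
        title = path.lstrip("/")
        if "/" in title or "." in title:   # CSS, icons, sitemap.xml, etc.
            continue
        if NOISE.match(title):
            continue
        titles.append(title)
    return titles

hits = prospective_article_titles(
    ["/bullet.gif", "/xmlrpc.php", "/B008NAYASM", "/apple_inc", "/carnaval"]
)
print(hits)  # ['apple_inc', 'carnaval']
```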


[Wikitech-l] RecentChanges types (RC_* constants)

2013-09-19 Thread Erik Bernhardson
Within the Flow extension we have a need to insert our own special
changes into the recentchanges table, so that watchlists continue to
inform users of changes in the ways they are used to. Within
MediaWiki, the Wikidata extension has similar requirements and has
implemented a solution that works for its use case. Flow is looking to
extend this to handle multiple types of external change sources. The
solution taken by Wikidata to render the lines works well and will be
used by Flow, but we have some concerns regarding how different types
of external changes will be filtered by the queries that generate the
Special:RecentChanges and Special:Watchlist pages.

How does the current solution work?

There is a field in the recentchanges table, rc_type. All Wikidata
entries use the value RC_EXTERNAL (= 5) for this field. Queries are
generated with either (rc_type = 5) or (rc_type != 5) when filtering
is required.

Requirements:

   - Currently Wikidata entries in recentchanges are filtered from
   Special:RecentChanges and Special:Watchlist. This is toggleable. By
   default we will not want to filter Flow entries, but will want to
   offer a toggle much like Wikidata does.
   - More types of external change sources should be able to add themselves
   in the future without core changes
   - We should play nice with the db slaves serving up watchlists.


There are a couple options, each with their own tradeoffs.


   1. Use rc_type = RC_EXTERNAL and add a new field to the recentchanges
   table, rc_external_type.  This would be a varchar(16) field. Wikidata and
   Flow would put their respective names in the field to distinguish between
   each other.  This is conceptually simple,  but makes the queries look even
   odder.  (rc_type != 5) becomes (rc_type != 5 AND rc_external_type !=
   'wikidata').
   2. Similar to 1, but instead of creating a new field reuse rc_log_type
   field which is only used when rc_type = RC_LOG.  This seems a bit hacky,
   but would only need a field rename to not feel so hacky.  I'm not proposing
   to rename the field though as there are a variety of extensions depending
   on the current field name and we are not going to coordinate getting them
   all updated at the exact same time.  The fact that this field is used by
   various extensions may be a hint that we shouldn't reuse it.
   3. Replace RC_EXTERNAL with RC_WIKIDATA and RC_FLOW constants in their
   respective extensions.  This is also straightforward,  but adds development
   overhead to ensure future creators of RC_* constants do not conflict with
   each other.  It would be handled similarly to NS_* constants with an
   on-wiki list.  I have heard some mention that naming conflicts have
   occurred in the past with this solution. This would force queries looking
   for only core sources of change to provide an inclusive list of RC_* values
   to find, rather than using rc_type != RC_EXTERNAL.
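To make the trade-off concrete, here is a toy sketch of how the
exclusion clause might be composed under options 1 and 3. The constant
values and helper names are assumptions, not the extensions' actual
code; note also that with a nullable rc_external_type column, an
OR-form clause is needed so ordinary (non-external) rows are not
dropped:

```python
RC_EXTERNAL = 5

def exclude_externals_option1(excluded_sources):
    # Option 1: shared RC_EXTERNAL value plus an rc_external_type
    # discriminator column. The OR keeps rows whose rc_external_type
    # is NULL (i.e. ordinary, non-external changes).
    conds = [f"rc_external_type != '{s}'" for s in sorted(excluded_sources)]
    return f"(rc_type != {RC_EXTERNAL} OR ({' AND '.join(conds)}))"

def exclude_externals_option3(excluded_types):
    # Option 3: one RC_* constant per external source; exclude by value.
    values = ", ".join(str(t) for t in sorted(excluded_types))
    return f"rc_type NOT IN ({values})"

# Assumed constant values, registered like NS_* constants on-wiki.
RC_WIKIDATA, RC_FLOW = 5, 6

print(exclude_externals_option1({"wikidata"}))
# (rc_type != 5 OR (rc_external_type != 'wikidata'))
print(exclude_externals_option3({RC_WIKIDATA}))
# rc_type NOT IN (5)
```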


Things to consider:
On smaller wikis, Wikidata changes can account for over 50% of the
changes. Talk namespace edits, which we expect to eventually replace
with Flow edits, account for ~20% of enwiki recentchanges rows.

The standard query issued by Special:RecentChanges is

SELECT /* lots of fields */
FROM `recentchanges`
FORCE INDEX (rc_timestamp)
LEFT JOIN `watchlist` ON (wl_user = '2' AND (wl_title=rc_title) AND
(wl_namespace=rc_namespace))
LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))
WHERE (rc_timestamp >= '2013091200') AND rc_bot = '0' AND (rc_type != 5)
ORDER BY rc_timestamp DESC LIMIT 50

The standard query issued by Special:Watchlist is

SELECT /* lots of fields */
FROM `recentchanges`
INNER JOIN `watchlist` ON (wl_user = '2' AND (wl_namespace=rc_namespace)
AND (wl_title=rc_title))
LEFT JOIN `page` ON ((rc_cur_id=page_id))
LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))
WHERE (rc_timestamp > '20130916175626') AND (rc_this_oldid=page_latest OR
rc_type=3) AND (rc_type != 5)
ORDER BY rc_timestamp DESC



Without further input I will be implementing option 3 from above. I
welcome any input on better solutions, or potential pitfalls with this
solution.

Erik Bernhardson