[Wikitech-l] The Revision Scoring weekly update

2016-08-29 Thread Aaron Halfaker
Hey,

This is the 19th weekly update from the Revision Scoring team that we have
sent to this mailing list.

Deployments:

   - We deployed a set of new models to ORES that reduce our memory usage
   and slightly increase fitness. [1]  These models were discussed in an email
   to the "ai" mailing list. [2]


   - We also completed a major quarterly goal.  The ORES review tool is now
   deployed as a beta feature on 8 wikis! [3]  This came with some quick fixes
   for confusion and usability issues. [4]  The beta feature is now
   available on English, Polish, Portuguese, Russian, Dutch, Persian and
   Turkish Wikipedias as well as Wikidata.


New development:

   - We discussed and came to a rough consensus about how to integrate ORES
   into api.php. [5]


   - We deployed a new edit quality campaign on English Wikipedia to gather
   more data for training ORES. [6, 7]


   - We added a specific set of user groups to the ORES models for Turkish
   Wikipedia and saw an increase in model fitness. [8]


Maintenance and robustness:

   - We fixed bugs in our maintenance scripts for purging old model
   versions [9, 10]


   - We switched to using our production models on the beta labs cluster so
   now we can catch vandalism there too (and know that the models actually
   work) [11]


   - We improved the error messages reported from Wiki Labels so that the
   actual error appears when the API responds with a non-200 HTTP status [12]


1. https://phabricator.wikimedia.org/T144101 -- Deploy ORES at 2016-08-29
2. https://lists.wikimedia.org/pipermail/ai/2016-August/68.html
3. https://phabricator.wikimedia.org/T140002 -- [Epic] Deploy ORES review
tool
4. https://phabricator.wikimedia.org/T143988 -- $wgOresModels set all
models true
5. https://phabricator.wikimedia.org/T122689 -- [Discuss] api.php
integration with ORES
6. https://phabricator.wikimedia.org/T143745 -- Deploy 2016 edit quality
campaign to English Wikipedia
7. https://en.wikipedia.org/wiki/Wikipedia:Labels/Edit_quality
8. https://phabricator.wikimedia.org/T140474 -- Include specific user
groups in the trwiki edit quality model
9. https://phabricator.wikimedia.org/T144216 -- Purge model score should
clean when there is no row is ores_model too
10. https://phabricator.wikimedia.org/T143798 -- Update model versions is
badly broken in ORES extension
11. https://phabricator.wikimedia.org/T143567 -- Switch beta to use the
proper wiki models for scoring (rather than "testwiki")
12. https://phabricator.wikimedia.org/T138255 -- Wikilabels UI reports
non-200 status errors badly

Sincerely,
Aaron from the Revision Scoring team
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] git release branches

2016-08-29 Thread Legoktm
Hi,

On 08/29/2016 10:33 AM, Brad Jorsch (Anomie) wrote:
> On Mon, Aug 29, 2016 at 11:04 AM, John wrote:
>
>> I am running into issues where I can't set a branch on the
>> extensions/ repo for REL1_27 and get all extensions as of that branching.
>> Instead I am forced to go thru one by one and manually set it, if and only
>> if a given extension has said branch set; otherwise I am out of luck.
>>
> 
> I see the mediawiki/extensions and mediawiki/skins repos both have a
> REL1_27 branch. Do they somehow not work correctly to check out all
> submodules at the appropriate revisions?

That should work; you'd just need to remember to run "git submodule
update" after switching branches, which should be pretty easy. At that
point you still need to manage 4 different git repos though (core,
vendor, skins, extensions).
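
As a rough illustration of that workflow, here is a minimal Python sketch that
switches one checkout to a release branch and then syncs its submodules. The
function names, the dry_run flag, and the --init flag (added so that
submodules not yet set up are checked out too) are assumptions made for this
example, not something stated in this thread.

```python
import subprocess


def branch_switch_commands(branch):
    """Return the git commands to run, in order, inside one checkout."""
    return [
        ["git", "checkout", branch],
        # --init is an assumption: it also initializes submodules that
        # have not been set up in this checkout yet.
        ["git", "submodule", "update", "--init"],
    ]


def switch_release_branch(repo_dir, branch, dry_run=True):
    """Switch repo_dir to branch and update its submodules.

    With dry_run=True the commands are only printed, not executed.
    """
    commands = branch_switch_commands(branch)
    for command in commands:
        if dry_run:
            print(f"[{repo_dir}] {' '.join(command)}")
        else:
            # check=True raises CalledProcessError if git exits non-zero.
            subprocess.run(command, cwd=repo_dir, check=True)
    return commands
```

Calling switch_release_branch("extensions", "REL1_27", dry_run=False) and the
same for "skins" (both hypothetical local directory names) would cover the two
meta-repos; core and vendor would still need their own branch switch.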

-- Legoktm


Re: [Wikitech-l] [AI] New models coming to ORES & notes

2016-08-29 Thread Aaron Halfaker
FYI, these new models are now live.  We'll be running some maintenance
scripts for the ORES review tool to update the recent historic scores.
Otherwise, you should expect to see ORES producing scores with updated
model version numbers.
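
For tools that cache scores locally, the version change can be detected
mechanically. The following is a minimal sketch, assuming the v2 response
shape quoted in the announcement in this thread; the ScoreCache class and the
extract_score helper are hypothetical names for illustration, not part of
ORES.

```python
def extract_score(response, wiki, model, rev_id):
    """Pull (version, score) out of a parsed v2 scores response."""
    model_doc = response["scores"][wiki][model]
    return model_doc["version"], model_doc["scores"][str(rev_id)]


class ScoreCache:
    """Local score cache that drops entries when the model version changes."""

    def __init__(self):
        self._versions = {}  # (wiki, model) -> version the entries belong to
        self._scores = {}    # (wiki, model) -> {rev_id: score}

    def store(self, wiki, model, version, rev_id, score):
        key = (wiki, model)
        if self._versions.get(key) != version:
            # New model version: everything cached for this model is stale.
            self._scores[key] = {}
            self._versions[key] = version
        self._scores[key][str(rev_id)] = score

    def lookup(self, wiki, model, version, rev_id):
        """Return the cached score, or None if absent or from another version."""
        key = (wiki, model)
        if self._versions.get(key) != version:
            return None
        return self._scores.get(key, {}).get(str(rev_id))


# Sample response, copied from the one quoted in this thread.
old_response = {
    "scores": {"enwiki": {"damaging": {
        "scores": {"12345678": {
            "prediction": False,
            "probability": {"false": 0.7141333465390294,
                            "true": 0.28586665346097057},
        }},
        "version": "0.1.1",
    }}}
}

cache = ScoreCache()
version, score = extract_score(old_response, "enwiki", "damaging", 12345678)
cache.store("enwiki", "damaging", version, 12345678, score)
```

Once the service reports version "0.1.2", cache.lookup("enwiki", "damaging",
"0.1.2", 12345678) returns None, which signals that the score has to be
fetched again.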

On Sat, Aug 20, 2016 at 1:39 AM, Amir Ladsgroup wrote:

> One small note: This will cause the ORES review tool to invalidate its db
> cache. So we will probably need to run some maintenance scripts here and
> there. You might feel a few bumps in the tool in Wikipedia. We will let you
> know beforehand :)
>
> Best
>
> On Sat, Aug 20, 2016 at 3:10 AM Aaron Halfaker wrote:
>
>> Hey folks,
>>
>> We've been working on generating some updated models for ORES.  These
>> models will behave slightly differently from the models that we currently
>> have deployed.  This is a natural artifact of retraining the models on the
>> *exact same data* again because of some random properties of the learning
>> algorithms.  So, for the most part, this should be a non-issue for any
>> tools that use ORES.  However, I wanted to take this opportunity to
>> highlight some of the facilities ORES provides to help automatically detect
>> and adjust for these types of changes.
>>
>> *== Versions ==*
>> ORES provides information about all of the models.  This information
>> includes a model version number.  If you are caching ORES scores locally,
>> we recommend invalidating old scores whenever this model number changes.
>> For example,
>> https://ores.wikimedia.org/v2/scores/enwiki/damaging/12345678
>> currently returns
>>
>> {
>>   "scores": {
>>     "enwiki": {
>>       "damaging": {
>>         "scores": {
>>           "12345678": {
>>             "prediction": false,
>>             "probability": {
>>               "false": 0.7141333465390294,
>>               "true": 0.28586665346097057
>>             }
>>           }
>>         },
>>         "version": "0.1.1"
>>       }
>>     }
>>   }
>> }
>>
>> This score was generated with the "0.1.1" version of the model.  But once
>> we deploy the new models, the same request will return:
>> {
>>   "scores": {
>>     "enwiki": {
>>       "damaging": {
>>         "scores": {
>>           "12345678": {
>>             "prediction": false,
>>             "probability": {
>>               "false": 0.8204647324045306,
>>               "true": 0.17953526759546945
>>             }
>>           }
>>         },
>>         "version": "0.1.2"
>>       }
>>     }
>>   }
>> }
>>
>> Note that the version number changes to "0.1.2" and the probabilities
>> change slightly.  In this case, we're essentially re-training the same
>> model in a similar way, so we increment the "patch" number.
>>
>> However, we're switching modeling strategies for the article quality
>> models (enwiki-wp10, frwiki-wp10 & ruwiki-wp10), so those versions
>> increment the minor version from "0.3.2" to "0.4.0".  You may see more
>> substantial changes in prediction probabilities with those models, but a
>> quick spot-check suggests that the changes are not substantial.
>>
>> *== Test statistics and thresholding ==*
>> Many tools that use our edit quality models (reverted, damaging and
>> goodfaith) will set thresholds for flagging edits for review.  In order to
>> support these tools, we produce test statistics that suggest useful
>> thresholds.
>>
>> https://ores.wmflabs.org/v2/scores/enwiki/damaging/?model_info=test_stats
>> produces:
>>
>>   ...
>>   "filter_rate_at_recall(min_recall=0.75)": {
>>     "filter_rate": 0.869,
>>     "recall": 0.752,
>>     "threshold": 0.492
>>   },
>>   "filter_rate_at_recall(min_recall=0.9)": {
>>     "filter_rate": 0.753,
>>     "recall": 0.902,
>>     "threshold": 0.173
>>   },
>>   ...
>>
>> These two statistics show useful thresholds for detecting damaging
>> edits.  E.g. if you want to be sure that you catch nearly all vandalism
>> (and are OK with a higher false-positive rate), set the threshold at 0.173
>> (recall 0.902, filter rate 0.753), but if you'd like to catch most
>> vandalism with almost no false positives, set the threshold at 0.492
>> (recall 0.752, filter rate 0.869).  These fields can be read automatically
>> by tools so that they do not need to be manually updated every time that
>> we deploy a new model.
>>
>> Let me know if you have any questions and happy hacking!
>>
>> -Aaron
>> ___
>> AI mailing list
>> a...@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/ai
>>
>
>
>
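
The test statistics quoted above can be read by a tool instead of hard-coding
a threshold. A minimal sketch, assuming the fragment shown in the announcement
has already been parsed into a dict (where it sits inside the full
model_info=test_stats response is elided in the message, so that lookup is
left out). The function names and the ">=" comparison are assumptions for
illustration.

```python
def threshold_for_recall(test_stats, min_recall):
    """Look up the suggested threshold for a minimum recall level."""
    key = f"filter_rate_at_recall(min_recall={min_recall})"
    return test_stats[key]["threshold"]


def needs_review(score, threshold):
    """Flag an edit when its 'true' (damaging) probability reaches the threshold."""
    return score["probability"]["true"] >= threshold


# The two statistics quoted in the announcement.
test_stats = {
    "filter_rate_at_recall(min_recall=0.75)": {
        "filter_rate": 0.869, "recall": 0.752, "threshold": 0.492,
    },
    "filter_rate_at_recall(min_recall=0.9)": {
        "filter_rate": 0.753, "recall": 0.902, "threshold": 0.173,
    },
}

# The "0.1.2" score for revision 12345678 from the same message.
score = {
    "prediction": False,
    "probability": {"false": 0.8204647324045306, "true": 0.17953526759546945},
}

high_recall = threshold_for_recall(test_stats, 0.9)
lower_recall = threshold_for_recall(test_stats, 0.75)
print(needs_review(score, high_recall), needs_review(score, lower_recall))
```

With the quoted numbers this prints "True False": the example edit is flagged
at the high-recall threshold (0.173) but not at the stricter one (0.492).
Because the thresholds are looked up by key, a new model deployment only
requires fetching the statistics again.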

Re: [Wikitech-l] git release branches

2016-08-29 Thread Brad Jorsch (Anomie)
On Mon, Aug 29, 2016 at 11:04 AM, John wrote:

> I am running into issues where I can't set a branch on the
> extensions/ repo for REL1_27 and get all extensions as of that branching.
> Instead I am forced to go thru one by one and manually set it, if and only
> if a given extension has said branch set; otherwise I am out of luck.
>

I see the mediawiki/extensions and mediawiki/skins repos both have a
REL1_27 branch. Do they somehow not work correctly to check out all
submodules at the appropriate revisions?


-- 
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation

Re: [Wikitech-l] Grouping phabricator notifications by task

2016-08-29 Thread Greg Grossmeier

> Hi,
> 
> I rely quite a lot on phabricator notifications for checking new activity
> on my subscribed tasks (I've found it suits me better than email).
>
> After using the unread view in phab quite a lot I noticed that I really
> wanted to see the notifications grouped by the task, since that lets me
> see the new activity on the task at a glance and open it to read/act if I
> need to.
> 
> So I've made a user script that does exactly that:
>  https://gist.github.com/joakin/c0e5dffc23aaf05175a580d24a2adefe

That's great, thanks for sharing, Joaquin!

Greg

-- 
| Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
| Release Team Manager            A18D 1138 8E47 FAC8 1C7D |


Re: [Wikitech-l] git release branches

2016-08-29 Thread Chad
On Mon, Aug 29, 2016 at 8:34 AM John wrote:

> My thought would not only be for the bundled extensions but for
> *everything* under skins/ and extensions/ to get a given REL branch when a
> release is set. It might mean that some extensions get a REL for a
> non-compatible branch but it would cover those extensions where devs forget
> to cut branches.
>
>
Devs don't have to remember to cut branches; I cut a REL1_* for all
extensions & skins during the release cycle. They aren't created
retroactively for new extensions or skins, sure, but they should all be there.

I was talking more about putting the "bundled" ones (i.e., those in the
tarballs) as submodules of core release branches like we do for the
wmf/* branches.

-Chad

Re: [Wikitech-l] git release branches

2016-08-29 Thread John
My thought would not only be for the bundled extensions but for *everything*
under skins/ and extensions/ to get a given REL branch when a release is
set. It might mean that some extensions get a REL for a non-compatible
branch but it would cover those extensions where devs forget to cut
branches.

On Mon, Aug 29, 2016 at 11:25 AM, Chad wrote:

> On Mon, Aug 29, 2016 at 8:04 AM John wrote:
>
> > Can we get a repo for each major release that contains core, skins, and
> > all extensions in a single checkout? Right now I am updating, and with
> > all of the submodules I am running into issues where I can't set a
> > branch on the extensions/ repo for REL1_27 and get all extensions as of
> > that branching. Instead I am forced to go thru one by one and manually
> > set it, if and only if a given extension has said branch set; otherwise
> > I am out of luck.
> >
> > Getting everything together would cause a larger checkout, but keeps
> > things together for those who want to pick and choose.
> >
> >
> It's on my todo list to add all the extensions and skins that are bundled
> in a release to the core release branch as submodules, just haven't gotten
> around to it yet.
>
> In addition to the problems/benefits you point out, it also means that the
> extensions and skins would be included in the signed tags for each
> release, aiding in reproducibility of a release.
>
> -Chad

Re: [Wikitech-l] git release branches

2016-08-29 Thread Chad
On Mon, Aug 29, 2016 at 8:04 AM John wrote:

> Can we get a repo for each major release that contains core, skins, and all
> extensions in a single checkout? Right now I am updating, and with all of
> the submodules I am running into issues where I can't set a branch on the
> extensions/ repo for REL1_27 and get all extensions as of that branching.
> Instead I am forced to go thru one by one and manually set it, if and only
> if a given extension has said branch set; otherwise I am out of luck.
>
> Getting everything together would cause a larger checkout, but keeps things
> together for those who want to pick and choose.
>
>
It's on my todo list to add all the extensions and skins that are bundled
in a release to the core release branch as submodules; I just haven't gotten
around to it yet.

In addition to the problems/benefits you point out, it also means that the
extensions and skins would be included in the signed tags for each
release, aiding in reproducibility of a release.

-Chad

[Wikitech-l] git release branches

2016-08-29 Thread John
Can we get a repo for each major release that contains core, skins, and all
extensions in a single checkout? Right now I am updating, and with all of
the submodules I am running into issues where I can't set a branch on the
extensions/ repo for REL1_27 and get all extensions as of that branching.
Instead I am forced to go thru one by one and manually set it, if and only
if a given extension has said branch set; otherwise I am out of luck.

Getting everything together would cause a larger checkout, but keeps things
together for those who want to pick and choose.

[Wikitech-l] Grouping phabricator notifications by task

2016-08-29 Thread Joaquin Oltra Hernandez
Hi,

I rely quite a lot on phabricator notifications for checking new activity
on my subscribed tasks (I've found it suits me better than email).

After using the unread view in phab quite a lot I noticed that I really
wanted to see the notifications grouped by the task, since that lets me
see the new activity on the task at a glance and open it to read/act if I
need to.

So I've made a user script that does exactly that:
 https://gist.github.com/joakin/c0e5dffc23aaf05175a580d24a2adefe

There's a gif of what this does in the README and the code is really short.

I hope this is useful to some of you; it certainly has been making my life
easier.

I apply it to https://phabricator.wikimedia.org/notification/query/unread/
only, but I guess there's no reason why it wouldn't work in
https://phabricator.wikimedia.org/notification/*
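
The grouping idea itself is small. As an illustration only (the actual user
script is the one in the gist and works on the page; this is not its code),
here is a minimal Python sketch that groups notification lines by the task ID
they mention. The sample notification strings and the function name are made
up for the example; the only assumption about Phabricator is that task IDs
look like "T" followed by digits.

```python
import re
from collections import defaultdict

# Task IDs are assumed to look like T12345.
TASK_ID = re.compile(r"\bT\d+\b")


def group_by_task(notifications):
    """Group notification strings by the first task ID each one mentions."""
    groups = defaultdict(list)
    for text in notifications:
        match = TASK_ID.search(text)
        # Notifications without a task ID go into a catch-all bucket.
        groups[match.group(0) if match else "(no task)"].append(text)
    return dict(groups)


# Hypothetical notification texts, for illustration only.
sample = [
    "alice added a comment to T100: Example task.",
    "bob changed the status of T200: Another task.",
    "carol added a comment to T100: Example task.",
]
grouped = group_by_task(sample)
```

Here grouped has two keys, "T100" with two entries and "T200" with one, so
all activity on a task can be read together.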

Cheers.