[Wikitech-l] The Revision Scoring weekly update
Hey,

This is the 19th weekly update from the Revision Scoring team that we have sent to this mailing list.

Deployments:
- We deployed a set of new models to ORES that reduce our memory usage and slightly increase fitness. [1] These models were discussed in an email to the "ai" mailing list. [2]
- We also completed a major quarterly goal. The ORES review tool is now deployed as a beta feature on 8 wikis! [3] This came with some quick fixes for confusion and usability issues. [4] The beta feature is now available on English, Polish, Portuguese, Russian, Dutch, Persian and Turkish Wikipedias as well as Wikidata.

New development:
- We discussed and came to a rough consensus about how to integrate ORES into api.php. [5]
- We deployed a new edit quality campaign on English Wikipedia to gather more data for training ORES. [6, 7]
- We added a specific set of user groups to the ORES models for Turkish Wikipedia and saw an increase in model fitness. [8]

Maintenance and robustness:
- We fixed bugs in our maintenance scripts for purging old model versions. [9, 10]
- We switched to using our production models on the beta labs cluster, so now we can catch vandalism there too (and know that the models actually work). [11]
- We improved the error messages reported from Wiki Labels so that the actual error appears when the API responds with a non-200 HTTP status. [12]

1. https://phabricator.wikimedia.org/T144101 -- Deploy ORES at 2016-08-29
2. https://lists.wikimedia.org/pipermail/ai/2016-August/68.html
3. https://phabricator.wikimedia.org/T140002 -- [Epic] Deploy ORES review tool
4. https://phabricator.wikimedia.org/T143988 -- $wgOresModels set all models true
5. https://phabricator.wikimedia.org/T122689 -- [Discuss] api.php integration with ORES
6. https://phabricator.wikimedia.org/T143745 -- Deploy 2016 edit quality campaign to English Wikipedia
7. https://en.wikipedia.org/wiki/Wikipedia:Labels/Edit_quality
8. https://phabricator.wikimedia.org/T140474 -- Include specific user groups in the trwiki edit quality model
9. https://phabricator.wikimedia.org/T144216 -- Purge model score should clean when there is no row is ores_model too
10. https://phabricator.wikimedia.org/T143798 -- Update model versions is badly broken in ORES extension
11. https://phabricator.wikimedia.org/T143567 -- Switch beta to use the proper wiki models for scoring (rather than "testwiki")
12. https://phabricator.wikimedia.org/T138255 -- Wikilabels UI reports non-200 status errors badly

Sincerely,
Aaron from the Revision Scoring team

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] git release branches
Hi,

On 08/29/2016 10:33 AM, Brad Jorsch (Anomie) wrote:
> On Mon, Aug 29, 2016 at 11:04 AM, John wrote:
>
>> I am running into issues where I can't set a branch on the
>> extensions/ repo for REL1_27 and get all extensions as of that branching.
>> Instead I am forced to go thru one by one and manually set it, if and only
>> if a given extension has said branch set, otherwise I am out of luck.
>
> I see the mediawiki/extensions and mediawiki/skins repos both have a
> REL1_27 branch. Do they somehow not work correctly to check out all
> submodules at the appropriate revisions?

That should work, you'd just need to remember to run "git submodule update" after switching branches, which should be pretty easy. At that point you still need to manage 4 different git repos though (core, vendor, skins, extensions).

-- Legoktm
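The branch switch described in this reply could be scripted along the following lines. This is only a sketch: the checkout paths are hypothetical, and it assumes that each of the four repositories already has the release branch available locally. It builds the git command lines without running them, so they can be reviewed first (and then passed to something like subprocess.run).

```python
# Hypothetical checkout layout: path -> whether the repo tracks submodules.
REPOS = {
    "mediawiki": False,             # core
    "mediawiki/vendor": False,
    "mediawiki/skins": True,
    "mediawiki/extensions": True,
}


def branch_switch_commands(branch, repos=REPOS):
    """Build the git commands needed to move every repo to `branch`."""
    commands = []
    for path, has_submodules in repos.items():
        commands.append(["git", "-C", path, "checkout", branch])
        if has_submodules:
            # Bring the submodules to the revisions recorded on that branch.
            commands.append(["git", "-C", path, "submodule", "update"])
    return commands


for command in branch_switch_commands("REL1_27"):
    print(" ".join(command))
```

Keeping the repository list in one place means a new release branch only needs a different argument, though the four repositories still have to be kept in step by hand.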
Re: [Wikitech-l] [AI] New models coming to ORES & notes
FYI, these new models are now live. We'll be running some maintenance scripts for the ORES review tool to update the recent historic scores. Otherwise, you should expect to see ORES producing scores with updated model version numbers.

On Sat, Aug 20, 2016 at 1:39 AM, Amir Ladsgroup wrote:

> One small note: This will cause the ORES review tool to invalidate its db
> cache. So we will probably need to run some maintenance scripts here and
> there. You might feel a few bumps in the tool in Wikipedia. We will let you
> know beforehand :)
>
> Best
>
> On Sat, Aug 20, 2016 at 3:10 AM Aaron Halfaker wrote:
>
>> Hey folks,
>>
>> We've been working on generating some updated models for ORES. These
>> models will behave slightly differently from the models that we currently
>> have deployed. This is a natural artifact of retraining the models on the
>> *exact same data* again because of some random properties of the learning
>> algorithms. So, for the most part, this should be a non-issue for any
>> tools that use ORES. However, I wanted to take this opportunity to
>> highlight some of the facilities ORES provides to help automatically detect
>> and adjust for these types of changes.
>>
>> *== Versions ==*
>> ORES provides information about all of the models. This information
>> includes a model version number. If you are caching ORES scores locally,
>> we recommend invalidating old scores whenever this model number changes.
>> For example,
>> https://ores.wikimedia.org/v2/scores/enwiki/damaging/12345678
>> currently returns
>>
>> {
>>   "scores": {
>>     "enwiki": {
>>       "damaging": {
>>         "scores": {
>>           "12345678": {
>>             "prediction": false,
>>             "probability": {
>>               "false": 0.7141333465390294,
>>               "true": 0.28586665346097057
>>             }
>>           }
>>         },
>>         "version": "0.1.1"
>>       }
>>     }
>>   }
>> }
>>
>> This score was generated with the "0.1.1" version of the model. But once
>> we deploy the new models, the same request will return:
>>
>> {
>>   "scores": {
>>     "enwiki": {
>>       "damaging": {
>>         "scores": {
>>           "12345678": {
>>             "prediction": false,
>>             "probability": {
>>               "false": 0.8204647324045306,
>>               "true": 0.17953526759546945
>>             }
>>           }
>>         },
>>         "version": "0.1.2"
>>       }
>>     }
>>   }
>> }
>>
>> Note that the version number changes to "0.1.2" and the probabilities
>> change slightly. In this case, we're essentially re-training the same
>> model in a similar way, so we increment the "patch" number.
>>
>> However, we're switching modeling strategies for the article quality
>> models (enwiki-wp10, frwiki-wp10 & ruwiki-wp10), so those versions
>> increment the minor version from "0.3.2" to "0.4.0". You may see more
>> substantial changes in prediction probabilities with those models, but a
>> quick spot-check suggests that the changes are not substantial.
>>
>> *== Test statistics and thresholding ==*
>> Many tools that use our edit quality models (reverted, damaging and
>> goodfaith) will set thresholds for flagging edits for review. In order to
>> support these tools, we produce test statistics that suggest useful
>> thresholds.
>>
>> https://ores.wmflabs.org/v2/scores/enwiki/damaging/?model_info=test_stats
>> produces:
>>
>> ...
>> "filter_rate_at_recall(min_recall=0.75)": {
>>   "filter_rate": 0.869,
>>   "recall": 0.752,
>>   "threshold": 0.492
>> },
>> "filter_rate_at_recall(min_recall=0.9)": {
>>   "filter_rate": 0.753,
>>   "recall": 0.902,
>>   "threshold": 0.173
>> },
>> ...
>>
>> These two statistics show useful thresholds for detecting damaging
>> edits. E.g. if you want to be sure that you catch nearly all vandalism
>> (and are OK with a higher false-positive rate), set the threshold at 0.173,
>> but if you'd like to catch most vandalism with almost no false-positives,
>> set the threshold at 0.492. These fields can be read automatically by
>> tools so that they do not need to be manually updated every time that we
>> deploy a new model.
>>
>> Let me know if you have any questions and happy hacking!
>>
>> -Aaron
>> ___
>> AI mailing list
>> a...@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/ai
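The version check and the threshold lookup described in the quoted message could be sketched roughly as follows. This is a minimal illustration, not an ORES client API: the function names and the cache layout are hypothetical, the revision ID 12345679 is made up for the example, and the test statistics are assumed to arrive as a dict keyed exactly as in the excerpt. The code works on already-parsed JSON, so it needs no network access.

```python
# Hypothetical helpers; only the JSON shapes come from the quoted message.

def model_version(response, wiki, model):
    """Return the model version string from a parsed v2 scores response."""
    return response["scores"][wiki][model]["version"]


def refresh_cache(cache, response, wiki, model):
    """Drop cached scores if the model version changed, then store new ones.

    `cache` uses an assumed layout: {"version": str or None, "scores": dict}.
    """
    version = model_version(response, wiki, model)
    if cache.get("version") != version:
        # Scores from an older model version are no longer comparable.
        cache["scores"] = {}
        cache["version"] = version
    cache["scores"].update(response["scores"][wiki][model]["scores"])
    return cache


def threshold_for_recall(test_stats, min_recall):
    """Look up the suggested threshold for a given minimum recall."""
    key = "filter_rate_at_recall(min_recall={})".format(min_recall)
    return test_stats[key]["threshold"]


old = {"scores": {"enwiki": {"damaging": {
    "scores": {"12345678": {"prediction": False}},
    "version": "0.1.1"}}}}
new = {"scores": {"enwiki": {"damaging": {
    "scores": {"12345679": {"prediction": False}},  # made-up revision ID
    "version": "0.1.2"}}}}

cache = refresh_cache({"version": None, "scores": {}}, old, "enwiki", "damaging")
cache = refresh_cache(cache, new, "enwiki", "damaging")

stats = {
    "filter_rate_at_recall(min_recall=0.75)": {
        "filter_rate": 0.869, "recall": 0.752, "threshold": 0.492},
    "filter_rate_at_recall(min_recall=0.9)": {
        "filter_rate": 0.753, "recall": 0.902, "threshold": 0.173},
}
print(cache["version"], threshold_for_recall(stats, 0.9))
```

Reading the threshold from the statistics at run time, rather than hard-coding it, is what lets a tool keep working unchanged when a retrained model ships with different suggested values.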
Re: [Wikitech-l] git release branches
On Mon, Aug 29, 2016 at 11:04 AM, John wrote:

> I am running into issues where I can't set a branch on the
> extensions/ repo for REL1_27 and get all extensions as of that branching.
> Instead I am forced to go thru one by one and manually set it, if and only
> if a given extension has said branch set, otherwise I am out of luck.

I see the mediawiki/extensions and mediawiki/skins repos both have a REL1_27 branch. Do they somehow not work correctly to check out all submodules at the appropriate revisions?

--
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation
Re: [Wikitech-l] Grouping phabricator notifications by task
> Hi,
>
> I rely quite a lot on phabricator notifications for checking new activity
> on my subscribed tasks (I've found it suits me better than email).
>
> After using the unread view in phab quite a lot I noticed that I really
> wanted to see the notifications grouped by the task, since that lets me
> see the new activity on the task at a glance and open it to read/act if I
> need to.
>
> So I've made a user script that does exactly that:
> https://gist.github.com/joakin/c0e5dffc23aaf05175a580d24a2adefe

That's great, thanks for sharing, Joaquin!

Greg

--
| Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
| Release Team Manager            A18D 1138 8E47 FAC8 1C7D |
Re: [Wikitech-l] git release branches
On Mon, Aug 29, 2016 at 8:34 AM John wrote:

> My thought would not only be for the bundled extensions but for
> *everything* under skins/ and extensions/ to get a given REL branch when
> a release is set. It might mean that some extensions get a REL for a
> non-compatible branch but it would cover those extensions where devs
> forget to cut branches.

Devs don't have to remember to cut branches, I cut a REL1_* for all extensions & skins during the release cycle. They aren't retroactively done for new extensions or skins sure, but they should all be there.

I was talking more about putting the "bundled" ones (ie: those in the tarballs) as submodules of core release branches like we do for the wmf/* branches.

-Chad
Re: [Wikitech-l] git release branches
My thought would not only be for the bundled extensions but for *everything* under skins/ and extensions/ to get a given REL branch when a release is set. It might mean that some extensions get a REL for a non-compatible branch but it would cover those extensions where devs forget to cut branches.

On Mon, Aug 29, 2016 at 11:25 AM, Chad wrote:

> On Mon, Aug 29, 2016 at 8:04 AM John wrote:
>
> > Can we get a repo for each major release that contains core, skins, and
> > all extensions in a single checkout? Right now I am updating, and with
> > all of the submodules I am running into issues where I can't set a
> > branch on the extensions/ repo for REL1_27 and get all extensions as of
> > that branching. Instead I am forced to go thru one by one and manually
> > set it, if and only if a given extension has said branch set, otherwise
> > I am out of luck.
> >
> > Getting everything together would cause a larger checkout, but keeps
> > things together for those who want to pick and choose.
>
> It's on my todo list to add all the extensions and skins that are bundled
> in a release to the core release branch as submodules, just haven't gotten
> around to it yet.
>
> In addition to the problems/benefits you point out, it also means that the
> extensions and skins would be included in the signed tags for each
> release, aiding in reproducibility of a release.
>
> -Chad
Re: [Wikitech-l] git release branches
On Mon, Aug 29, 2016 at 8:04 AM John wrote:

> Can we get a repo for each major release that contains core, skins, and all
> extensions in a single checkout? Right now I am updating, and with all of
> the submodules I am running into issues where I can't set a branch on the
> extensions/ repo for REL1_27 and get all extensions as of that branching.
> Instead I am forced to go thru one by one and manually set it, if and only
> if a given extension has said branch set, otherwise I am out of luck.
>
> Getting everything together would cause a larger checkout, but keeps things
> together for those who want to pick and choose.

It's on my todo list to add all the extensions and skins that are bundled in a release to the core release branch as submodules, just haven't gotten around to it yet.

In addition to the problems/benefits you point out, it also means that the extensions and skins would be included in the signed tags for each release, aiding in reproducibility of a release.

-Chad
[Wikitech-l] git release branches
Can we get a repo for each major release that contains core, skins, and all extensions in a single checkout? Right now I am updating, and with all of the submodules I am running into issues where I can't set a branch on the extensions/ repo for REL1_27 and get all extensions as of that branching. Instead I am forced to go thru one by one and manually set it, if and only if a given extension has said branch set, otherwise I am out of luck.

Getting everything together would cause a larger checkout, but keeps things together for those who want to pick and choose.
[Wikitech-l] Grouping phabricator notifications by task
Hi,

I rely quite a lot on phabricator notifications for checking new activity on my subscribed tasks (I've found it suits me better than email).

After using the unread view in phab quite a lot I noticed that I really wanted to see the notifications grouped by the task, since that lets me see the new activity on the task at a glance and open it to read/act if I need to.

So I've made a user script that does exactly that:
https://gist.github.com/joakin/c0e5dffc23aaf05175a580d24a2adefe

There's a gif of what this does in the README and the code is really short.

I hope this is useful to some of you, it certainly has been making my life easier.

I apply it to https://phabricator.wikimedia.org/notification/query/unread/ only, but I guess there's no reason why it wouldn't work in https://phabricator.wikimedia.org/notification/*

Cheers.
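The grouping idea itself can be illustrated with a small sketch. This is not the code from the gist: it assumes, hypothetically, that each notification is available as a line of text that mentions a task ID such as T144101, and it simply buckets the lines by that ID in order of first appearance. The sample notifications and user names are invented.

```python
import re
from collections import OrderedDict

# Phabricator task IDs look like "T" followed by digits, e.g. T144101.
TASK_ID = re.compile(r"\bT\d+\b")


def group_by_task(notifications):
    """Group notification strings by the first task ID they mention."""
    groups = OrderedDict()
    for text in notifications:
        match = TASK_ID.search(text)
        key = match.group(0) if match else "(no task)"
        groups.setdefault(key, []).append(text)
    return groups


# Invented sample data for illustration only.
sample = [
    "alice commented on T144101",
    "bob closed T143988",
    "carol edited the description of T144101",
]
grouped = group_by_task(sample)
for task, items in grouped.items():
    print(task, len(items))
```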