[Wikitech-l] upgrade of irc.wikimedia.org

2016-05-02 Thread Daniel Zahn
Hello,

today the server that runs irc.wikimedia.org has been upgraded.

Or actually, we have installed a new, second server running on Debian
jessie and ported the puppet manifests and the IRC bot to work on
that, while the old server still exists.  MW appservers are sending RC
data to both of them.

If you are making a new connection to irc.wikimedia.org you should now
be served by the new host, kraz.wikimedia.org, but old connections
have not been broken.

The old server is still running unchanged and is reachable as
argon.wikimedia.org (as it was before too).

We have also made the change "URLs in the recent changes IRC feed will
no longer be rewritten to unencrypted HTTP." which was announced and
scheduled for today, May 2nd.
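
For bot authors, one way to stay robust against this change is to accept
both schemes when pulling URLs out of a feed line. The following is only a
minimal sketch: the helper name `extract_urls` and the sample line are made
up for illustration, and the real feed format (including its IRC colour
codes) is not reproduced here.

```python
import re
from urllib.parse import urlsplit

# Match http:// or https:// URLs, stopping at whitespace or control codes.
URL_RE = re.compile(r"https?://[^\s\x00-\x1f]+")


def extract_urls(line):
    """Return (scheme, host, full_url) for every URL found in a feed line."""
    results = []
    for match in URL_RE.finditer(line):
        url = match.group(0)
        parts = urlsplit(url)
        results.append((parts.scheme, parts.hostname, url))
    return results


# Hypothetical feed line, for illustration only.
sample = "[[Example]] https://en.wikipedia.org/w/index.php?diff=1&oldid=0 * User * edit"
```

A bot that compares or rewrites these URLs can then branch on the returned
scheme instead of assuming `http://`.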

No action is required if your bot automatically reconnects, but bot
owners should ensure no IP addresses are hardcoded (see T123729 for
details).

As DNS caches expire, new connections will use kraz.
Existing sessions on argon and clients that hardcoded the argon IP
won't be affected yet, but after a grace period we are going to shut
down the old server, argon.
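
A reconnect loop that always connects by host name, never by a stored IP,
picks up the new server automatically. Below is a rough sketch under stated
assumptions: the function name `connect_by_name`, the retry parameters and
the port number (6667, the conventional plain-text IRC port) are
illustrative choices, not taken from the announcement.

```python
import socket
import time

IRC_HOST = "irc.wikimedia.org"
IRC_PORT = 6667  # assumption: the conventional plain-text IRC port


def connect_by_name(host=IRC_HOST, port=IRC_PORT, retries=5, delay=5.0,
                    connect=socket.create_connection, sleep=time.sleep):
    """Open a TCP connection, resolving the host name on every attempt."""
    last_error = None
    for _ in range(retries):
        try:
            # Passing the name (not a cached IP) lets each attempt use
            # whatever address DNS currently returns.
            return connect((host, port), timeout=30)
        except OSError as exc:
            last_error = exc
            sleep(delay)
    raise ConnectionError(f"could not reach {host}:{port}") from last_error
```

The `connect` and `sleep` parameters exist only so the loop can be exercised
without touching the network.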

refs:  https://phabricator.wikimedia.org/T123729#2216681
  https://meta.wikimedia.org/wiki/Tech/News/2016/02

Best regards,

Daniel
-- 
Daniel Zahn 
Operations Engineer

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Getting rid of $wgWellFormedXml = false;

2016-05-02 Thread Max Semenik
On Mon, May 2, 2016 at 3:04 PM, Brian Wolff wrote:

>
> There are references to it breaking people's screen scraping bots last time
> it was turned on. That was like 5 years ago though.
>

At this point, I would say that everybody who screen-scrapes saw it coming,
and breaking them is a good thing, as sometimes lessons just have to be
learned.


Best regards,
Max Semenik ([[User:MaxSem]])

[Wikitech-l] Readership metrics for the timespan until April 24, 2016

2016-05-02 Thread Tilman Bayer
Hi all,

here is the usual look at our
most important readership metrics. This time, highlights include the impact
of the iOS app’s big release of a revamped version, a Google bug that
brought the Android app a windfall of about half a million new users, and a
first examination of the new unique devices dataset.

As laid out earlier,
the main purpose is to raise awareness about how these are developing, call
out the impact of any unusual events, and facilitate thinking about core
metrics in general. As always, feedback and discussion are welcome.
Week-over-week and month-over-month changes are now being recorded on the
Product page at MediaWiki.org. This edition of the report covers a timespan
of eleven weeks.

Some other recent items of interest, in case they didn’t already catch your
attention:

   - At the Foundation’s February metrics meeting, I gave an update on
   longer-term traffic trends since 2013. (TL;DR: Overall pageviews have been
   flat to slightly declining, mobile has been steadily rising but recently
   slowed down, desktop is declining.) See the slide deck here or the chart
   below (updated with March data):

[image: Wikimedia monthly pageviews (worldwide, mobile vs desktop),
2013-2016.png]

   - The WMF Reading team published its quarterly review presentation
   for Q3 2015-16 (January-March), with lots of traffic and usage data.

   - I recently gave a Tech Talk about “new readership data”, i.e. some data
   that, separately from the core metrics discussed here, gives insights on
   how Wikipedia is read.



Now to the usual data. (All numbers below are averages for February 8-April
24, 2016 unless otherwise noted.)

Our family of core metrics who are populating this report mourns a
deplorable casualty (iOS daily active users are no longer available due to
the app’s switch to opt-in data collection, T130432), rejoices about the
speedy convalescence of an accident victim thanks to expert therapy
administered by the Analytics and iOS teams (iOS pageviews, T131824), sends
well-wishes for recovery to two other patients (Android app DAUs and
pageviews, T132965), and lastly welcomes a new member: unique devices.

Pageviews

Total: 538 million/day (+1.5% from the previous report)

(caveat: likely undercounting by 1-2 million/day since April due to a bug
related to the Android app’s gradual switch to RESTBase)

Context (April 2015-April 2016):

[image: Wikimedia daily pageviews, all vs. mobile (April
2015-2016-04-24).png]

Overall, pageviews were up compared to the previous report (which had still
covered the slump around Christmas). There was a drop in early/mid March
though, which probably can’t be fully explained by a change in the pageview
definition on March 9 that improved the exclusion of bot pageviews, nor does
it look seasonal compared to March 2014 and March 2015.

See also the Vital Signs dashboard.

Desktop: 55.2% (previous report: 54.3%)

Mobile web: 43.6% (previous report: 44.4%)

Apps: 1.2% (previous report: 1.3%)

(caveat: app percentage likely too low due to the aforementioned
Android-related bug)
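
To put the percentages into absolute terms, they can be multiplied by the
daily total quoted above. This is only a rough back-of-the-envelope sketch:
the results are approximations derived from rounded figures in this report,
not separately measured values, and the function name is made up for
illustration.

```python
TOTAL_PER_DAY_MILLIONS = 538  # total pageviews per day, from the report

# Percentage shares by access method, from the report.
SHARES = {"desktop": 55.2, "mobile web": 43.6, "apps": 1.2}


def absolute_views(total_millions, shares_percent):
    """Convert percentage shares into approximate millions of views per day."""
    return {name: round(total_millions * pct / 100, 1)
            for name, pct in shares_percent.items()}


daily = absolute_views(TOTAL_PER_DAY_MILLIONS, SHARES)
for name, millions in daily.items():
    print(f"{name}: ~{millions} million/day")
```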

Context (April 2015-April 2016):

[image: Wikimedia daily pageviews, mobile percentage (April
2015..2016-04-24).png]

Mobile pageviews now regularly reach parity on weekends.

Global North ratio: 76.5% of total pageviews (previous report: 78.3%)

Context (April 2015-April 2016):

[image: Percentage of Wikimedia pageviews from the Global North (April
2015..2016-04-24).png]


In the previous report, we could already witness the ratio of Global North
pageviews falling (or conversely, the Global South ratio rising) from a
peak at the beginning of January. Afterwards, it saw a somewhat conspicuous
drop in mid-February, went back up a bit but is now back to the levels from
a year ago, before the HTTPS-only rollout in June.

Unique devices

Recently, the Analytics team made a new metric available: Daily and monthly
unique devices (see their announcement blog post for background and
details). These e

Re: [Wikitech-l] Getting rid of $wgWellFormedXml = false;

2016-05-02 Thread Brian Wolff
>
> The only benefit of $wgWellFormedXml was that you could toss your
> "well-formed" tag soup into an XML parser that didn't grok HTML. I have no
> idea if that worked reliably or was actually useful to anyone, but it's
> probably worth confirming that before actually removing the funky
> self-closing tags.
>

There are references to it breaking people's screen scraping bots last time
it was turned on. That was like 5 years ago though.

--bawolff

Re: [Wikitech-l] Getting rid of $wgWellFormedXml = false;

2016-05-02 Thread Brion Vibber
I'd say an HTML5 output mode *ought* to work like this:

*Don't try to be clever.*
* Consistency and predictability are key to both security review and data
consumability.

*Quote attributes consistently and predictably.*
* Always use double-quotes on attributes in output.

*Output specced empty tags in HTML style.*
* , ,  are fine and not ambiguous at all to an HTML parser.
There's no need to go adding a "/" in at the end!
* These are already whitelisted in the Html class so it's easy to not mess
this up.

*Don't do other silly things for old-school XHTML 1.*
* CDATA wrapping of 
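
The quoting and empty-tag rules listed in this message can be illustrated
with a small serializer. This is a hypothetical Python sketch, not
MediaWiki's Html class (which is PHP); the function name `element` and the
void-element list used here are assumptions made for the example.

```python
from html import escape

# Common HTML5 void elements; they never take a closing tag.
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "param", "source", "track", "wbr",
})


def element(tag, attrs=None, text=""):
    """Serialize one element with predictable quoting and HTML-style voids."""
    parts = [tag]
    for name, value in (attrs or {}).items():
        # Always double-quote; escape() with quote=True encodes quotes too.
        parts.append(f'{name}="{escape(str(value), quote=True)}"')
    open_tag = "<" + " ".join(parts) + ">"
    if tag in VOID_ELEMENTS:
        return open_tag  # HTML style: no trailing "/"
    return f"{open_tag}{escape(text, quote=False)}</{tag}>"
```

Having a single code path with no mode switch is the point: every attribute
is quoted the same way, which keeps the output easy to review.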

[Wikitech-l] How might interactive maps be used on Wikipedia?

2016-05-02 Thread Chris Koerner
Hello,

The Maps team at the Wikimedia Foundation is getting closer to making it
possible to add interactive maps to Wikipedia. If you've ever used services
like Google Maps or Mapquest you may be familiar with interactive maps. We’d
like to invite editors to have a conversation on how these maps might be used
within articles. We've put together information on how these maps and their
styling work from a technical perspective: where the data comes from, how
maps are styled, how to add an interactive map, and a few example use cases.


In particular we would like to focus the discussion around three key
questions (open discussion outside these questions is welcome too).


* What types of articles would use interactive maps?

* How do these articles differ in their requirements?

* Are there any classes of articles whose map styling requirement is
fundamentally in conflict with other article classes, thus requiring
multiple styles?

If you are interested, please visit
https://www.mediawiki.org/wiki/Maps/Conversation_about_interactive_map_use
to learn more and get involved.
-- 
Yours,
Chris Koerner
Community Liaison - Discovery
Wikimedia Foundation

[Wikitech-l] Getting rid of $wgWellFormedXml = false;

2016-05-02 Thread Brian Wolff
So currently, we have two ways of outputting HTML. $wgWellFormedXml =
true (the default) outputs HTML that happens to conform with the
rules of XML. $wgWellFormedXml = false, on the other hand, uses laxer
HTML5 rules to save a few bytes.
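
The practical difference between the two styles shows up when the output is
fed to a strict XML parser. A minimal sketch, using two illustrative
fragments (they are assumptions for the example, not actual MediaWiki
output) and a hypothetical helper `parses_as_xml`:

```python
import xml.etree.ElementTree as ET

# Illustrative fragments only; not actual MediaWiki output.
XML_STYLE = "<p>line one<br />line two</p>"
HTML5_STYLE = "<p>line one<br>line two</p>"


def parses_as_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


for label, markup in (("XML style", XML_STYLE), ("HTML5 style", HTML5_STYLE)):
    print(label, len(markup), "bytes, parses as XML:", parses_as_xml(markup))
```

Both fragments are fine for an HTML parser; only the self-closing form is
also accepted by the XML parser, at a cost of two extra bytes per tag here.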

Having two modes of output feels rather silly to me. Originally I
think this was meant as a feature flag while $wgWellFormedXml=false
stabilized, but it never got turned on, and here we are 7 years later.

Having $wgWellFormedXml=false increases the complexity of the code,
and not all that many people use it (a notable exception is
translatewiki). I think it's important that security-critical code be
as simple as possible. Furthermore, there seems to be very little
benefit to having the second mode (After you account for gzip, saving
a few bytes from writing  instead of  really doesn't
matter, imo)

With that in mind, I would like to propose killing $wgWellFormedXml =
false; I'm not so much attached to the true mode (Although I do feel
the true mode is significantly more sane), as I just simply want there
to be a single mode. Putting the default to false was vetoed in
T52040, so I think that true would be the best choice to go with going
forward if we are getting rid of one of the modes.

If there are aspects of the other mode that people really want, then I
think we should simply merge that in to the default behavior instead
of having two separate modes.

See gerrit patch https://gerrit.wikimedia.org/r/286495. I would
appreciate everyone's feedback.

Thanks,
Brian


[Wikitech-l] Migration of browsertests* Jenkins jobs to selenium* jobs

2016-05-02 Thread Željko Filipin
Hi,

We have a set of Jenkins jobs that run daily and execute Ruby+Selenium
tests. Recently, the way we run those jobs reached a point where we had to
do some serious refactoring[0]. The old jobs were named browsertests*[1];
the new ones are named selenium*[2].

Changes:

#1 The creation and deletion of jobs have been made simpler. Each repository
now has only a single job defined in Jenkins. It is a multi-configuration job
that spawns one or more child jobs based on the configuration in each
repository: `tests/browser/ci.yml`[3].

#2 Jobs execute the `selenium` Rake target (`bundle exec rake selenium`). It
is defined in the Rakefile of each repository and loads the Rake task from
the mediawiki_selenium Ruby gem, version 1.7.0[4].
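
As a rough illustration of how a multi-configuration job can fan out into
child jobs, the sketch below expands a set of configuration axes into one
entry per combination. It is written in Python purely for illustration; the
axis names and values are hypothetical stand-ins for whatever a repository's
`tests/browser/ci.yml` defines, and the expansion rule (every combination of
values) is an assumption about how the main job behaves.

```python
from itertools import product

# Hypothetical stand-in for a parsed tests/browser/ci.yml; the real keys
# and values are defined by each repository and may differ.
ci_config = {
    "BROWSER": ["chrome", "firefox"],
    "PLATFORM": ["Linux"],
}


def child_jobs(config):
    """Expand configuration axes into one dict per child job."""
    axes = sorted(config)  # stable ordering of axis names
    return [dict(zip(axes, combo))
            for combo in product(*(config[axis] for axis in axes))]


for job in child_jobs(ci_config):
    print(job)
```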

What does it mean for you?

At this point, no action is needed. All required changes have already been
made. Wherever the selenium* job passed for a repository, I have already
deleted the legacy browsertests* one. There are still a few repositories
(Flow, MobileFrontend, MultimediaViewer, Wikidata) that need to be moved,
but we are working on that.

If you have any questions, let me know.

Željko
--
0: https://phabricator.wikimedia.org/T128190
1: https://integration.wikimedia.org/ci/view/BrowserTests/view/-Dashboard/
2: https://integration.wikimedia.org/ci/view/Selenium/
3: https://www.mediawiki.org/wiki/Continuous_integration/Entry_points#ci.yml
4: https://www.mediawiki.org/wiki/Continuous_integration/Entry_points#Rake

[Wikitech-l] Weekly update #2

2016-05-02 Thread Amir Ladsgroup
The second weekly update on the Revision Scoring project.

*New developments*

   - ORES now has a graphite dashboard [1,2]
   - Deploying new campaigns and testing Wikilabels got easier [3,4]
   - Revscoring feature extraction got about 13% faster [5]
   - We deployed new versions of ORES and Wikilabels [6,7]
   - The Wikidata ScoredRevision gadget had a serious issue, which got
   fixed [8]


*Progress in supporting new languages*

   - Wikidata damaging and goodfaith models are built and deployed [9,10,11]
   - Dutch damaging and goodfaith models are built and deployed [12]
   - We are working on language utilities for Tamil [13]


*Active Labeling campaigns*

   - Edit quality (damaging and good faith)
      - Wikipedias: Arabic, Azerbaijani, German, French, Hebrew, Hungarian,
      Indonesian, Italian, Japanese, Norwegian, Persian (v2), Polish, Spanish,
      Ukrainian, Urdu, Vietnamese
   - Edit type
      - English Wikipedia



   1. https://grafana.wikimedia.org/dashboard/db/ores
   2. https://phabricator.wikimedia.org/T127594
   3. https://phabricator.wikimedia.org/T133557
   4. https://phabricator.wikimedia.org/T102336
   5. https://github.com/wiki-ai/revscoring/pull/268
   6. https://phabricator.wikimedia.org/T134032
   7. https://phabricator.wikimedia.org/T134174
   8. https://phabricator.wikimedia.org/T133903
   9. https://phabricator.wikimedia.org/T130274
   10. https://phabricator.wikimedia.org/T130301
   11. https://lists.wikimedia.org/pipermail/wikidata/2016-May/008641.html
   12. https://phabricator.wikimedia.org/T133563
   13. https://phabricator.wikimedia.org/T134105

Sincerely,
The Revision Scoring team.

Re: [Wikitech-l] [GSoC 2016] Community Bonding

2016-05-02 Thread Quim Gil
On Sun, May 1, 2016 at 6:49 PM, Alangi Derick wrote:

> Hi everyone
>
> I am Alangi Derick and also d3r1ck on the IRC. I was selected in the GSoC
> 2016 program and was opportuned to work on the project titled "Integration
> of IFTTT support for Wikidata" and I am very happy to be the first African
> GSoCer in Wikimedia Foundation. This is indeed a priviledge and I wish to
> thank everyone that guided me in this movement to attain this level.
>

Alangi, I was very happy to see that you made it. When you started
contributing half a year ago we told you that perseverance usually brings
fortune (maybe I didn't say it with these words)  ;)  and here you are,
with a well-deserved GSoC internship. And indeed, you seem to be the first
Wikimedia GSoC student from Africa (Cameroon).

Since your project focuses on Wikidata, please consider sending this
message to the wikidata@ mailing list as well. Good luck with your project!

-- 
Quim Gil
Engineering Community Manager @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil