[Wikitech-l] What % of WMF is en:wp?

2011-01-13 Thread David Gerard
http://www.readwriteweb.com/archives/what_will_wikipedia_look_like_in_another_10_years.php

"The most important thing," Wales told us on a press call today, "is
the increased diversity in languages." According to Wales, around 30%
of Wikipedia articles had been in English and already that number has
dropped to 20%. "We're going to see very, very large projects in
languages where we've never seen such things before," he explained.

- I'd thought en:wp was still about 30% of everything - ~1/3 the
edits, ~1/3 the articles, ~1/3 the page hits, etc.

What are the various numbers? Is there anywhere to look them up, or
something from which they can be derived?


- d.



[Wikitech-l] Ratio of interwiki bots' edits

2011-01-13 Thread Amir E. Aharoni
It started under the subject "What % of WMF is en:wp?".

2011/1/13 David Gerard dger...@gmail.com:
 - I'd thought en:wp was still about 30% of everything - ~1/3 the
 edits, ~1/3 the articles, ~1/3 the page hits, etc.

I'm sorry about the shameless plug, but I just had to mention that the
number of edits in all the Wikipedias will change quite significantly
when bug 15607 is closed (
https://bugzilla.wikimedia.org/show_bug.cgi?id=15607 ). Currently the
bulk of the edits in the minor-language Wikipedias is made by interwiki
bots, and I've got a hunch that en.wp is the de facto hub for adding
interlanguage links. This is the workflow, more or less:

1. A human editor creates the article [[Ira Cohen]] in, say, Slovenian
(sl), after it already exists in 20 other Wikipedias.

2. A human editor adds [[sl:Ira Cohen]] to the article [[Ira Cohen]] in en.wp.

3. A human editor waits for the interwiki bots to pick it up and
propagate it to the 20 other Wikipedias in which the article already exists.

That makes it:
* 1 human edit in sl.wp.
* 1 human edit in en.wp.
* 20 bot edits in other Wikipedias.

After the Interlanguage extension is enabled, it will be:
* 1 human edit in sl.wp.
* 1 human edit in en.wp.
* 0 bot edits (some behind-the-scenes magic pushes the changes to 20
wikis, but it's not seen in Recent Changes.)

This is a major reason to have the Interlanguage extension finally
enabled. Besides a MAJOR cleaning-up in Recent Changes in all
Wikipedias, it will give a somewhat clearer picture of the activity in
the smaller ones.
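
A quick way to get a feel for the current bot share on any one wiki is to
ask the API how many of the latest edits carry the bot flag. A rough
sketch (API parameters from memory, so double-check them; the wiki and
the limit are just examples):

// Count how many of the last 500 edits on sl.wp are flagged as bot edits.
$api = 'http://sl.wikipedia.org/w/api.php?action=query&list=recentchanges'
     . '&rctype=edit&rcprop=flags&rclimit=500&format=json';
$data = json_decode( file_get_contents( $api ), true );
$edits = $data['query']['recentchanges'];
$bots = 0;
foreach ( $edits as $edit ) {
    // the 'bot' flag is present only on bot edits
    if ( isset( $edit['bot'] ) ) {
        $bots++;
    }
}
printf( "%d of %d recent edits were made by bots\n", $bots, count( $edits ) );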

--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
We're living in pieces,
 I want to live in peace. - T. Moore


Re: [Wikitech-l] Inclusion request for the Validator extension

2011-01-13 Thread Jeroen De Dauw
Hey,

 It is entirely to do with your original e-mail, as it is a valid reason
 why your original request (inclusion of Validator in core) should not be
 carried out at this time, and discussion of it is important (at least if
 you intend to improve the extension to overcome this issue).

This is just one single issue, and it stands rather apart from the question
of whether the extension should be included or not. You will always be able
to find issues of some sort, so with this kind of focus on superficial bugs
(this really is one; it can be fixed in under a minute), and avoiding the
real question of inclusion, nothing will ever happen.

First reach an agreement on whether it's a good idea or not; only then take
care of the small changes that might need to be made before inclusion.

Cheers

--
Jeroen De Dauw
http://blog.bn2vs.com
Don't panic. Don't be evil.
--


Re: [Wikitech-l] Expensive parser function count

2011-01-13 Thread Alex Brollo
Just to give an example: I wrote a different algorithm for
[[en:s:Template:Loop]], naming it [[en:s:Template:Loop!]], and asked both
of them for 100 and 101 dots in an empty sandbox preview.

These are the results:

Sandbox, empty, preview:
Preprocessor node count: 35/1000000
Post-expand include size: 1858/2048000 bytes
Template argument size: 450/2048000 bytes
Expensive parser function count: 0/500

Sandbox, 2 calls to loop to print 100 and 101 dots, preview
Preprocessor node count: 1045/1000000
Post-expand include size: 2260/2048000 bytes
Template argument size: 1551/2048000 bytes
Expensive parser function count: 0/500

Sandbox, 2 calls to loop! to print the same dots, preview
Preprocessor node count: 193/1000000
Post-expand include size: 2300/2048000 bytes
Template argument size: 680/2048000 bytes
Expensive parser function count: 0/500

Is there really no useful feedback in these data? Is there really no
correlation with server load?

Alex


Re: [Wikitech-l] Expensive parser function count

2011-01-13 Thread Brion Vibber
On Wed, Jan 12, 2011 at 6:51 PM, Tim Starling tstarl...@wikimedia.org wrote:
[snip]

 When I optimise the parse time of particular pages, I don't even use
 my sysadmin access. The best way to do it is to download the page with
 all its templates using Special:Export, and then to load it into a
 local wiki. Parsing large pages is typically CPU-dominated, so you can
 get a very good approximation without simulating the whole network.
 Once the page is in your local wiki, you can use whatever profiling
 tools you like: the MW profiler with extra sections, xdebug, gprof,
 etc. And you can modify the test cases very easily.


Well, that's the entire point of WP:PERF, at least before it was elevated
to acronym apotheosis. One might reword it as "optimize through science,
not superstition."

You're exactly the sort of person who can and should worry about
performance: you have well-developed debugging skills and significant
knowledge of the system internals. By following well-understood logical
processes you can very effectively identify performance bottlenecks and find
either workarounds (do your template like THIS and it's faster) or fixes (if
we make THIS change to the parser or database lookup code, it goes faster).
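
As a concrete illustration of that workflow (pull the page plus its
templates in through Special:Export, import them with
maintenance/importDump.php, then profile), here is a bare-bones sketch of
the "MW profiler with extra sections" Tim mentions. The function and
section names are invented; the point is just the wfProfileIn/wfProfileOut
pair around whatever code you suspect is slow:

// Assumes profiling is already switched on for the local wiki (see
// StartProfiler.php and the profiling docs). The extra section shows up
// in the profiler report under this name.
function expandSuspectChunk( Parser $parser, $wikitext ) {
    wfProfileIn( __METHOD__ . '-expand' );
    $html = $parser->recursiveTagParse( $wikitext );
    wfProfileOut( __METHOD__ . '-expand' );
    return $html;
}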

I'm going to go out on a limb though and say that most people don't
themselves have the tools or skills to do that. It's not rocket science, but
these are not standard-issue skills. (Maybe they should be, but that's a
story for the educational system!)

The next step from thinking there's a problem is to investigate it
knowledgeably, which means either *having* those skills already,
*developing* them, or *finding* someone else who does.

-- brion


Re: [Wikitech-l] What % of WMF is en:wp?

2011-01-13 Thread Bence Damokos
On Thu, Jan 13, 2011 at 1:02 PM, David Gerard dger...@gmail.com wrote:


 http://www.readwriteweb.com/archives/what_will_wikipedia_look_like_in_another_10_years.php

 "The most important thing," Wales told us on a press call today, "is
 the increased diversity in languages." According to Wales, around 30%
 of Wikipedia articles had been in English and already that number has
 dropped to 20%. "We're going to see very, very large projects in
 languages where we've never seen such things before," he explained.

 - I'd thought en:wp was still about 30% of everything - ~1/3 the
 edits, ~1/3 the articles, ~1/3 the page hits, etc.

 What are the various numbers? Is there anywhere to look them up, or
 something from which they can be derived?

The article one should be easy: 15 million articles in total, 3 million of
those in English -- ~20%. (The 15 million is off the top of my head, but
should be right.)
For edits, I guess you can only use the cumulative totals, which are easily
available; for English it is 437,897,137, and the total is about
1,186,000,000 -- ~36%. (See Special:Statistics, and Emijrp's counter.)
If I read http://stats.wikimedia.org/reportcard right, the page views are 8
billion on the English out of 14 billion total -- ~57%.
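
Redoing the arithmetic quickly (figures as quoted above; a throwaway
snippet, not any official counter):

$enArticles = 3000000;    $allArticles = 15000000;
$enEdits    = 437897137;  $allEdits    = 1186000000;
$enViews    = 8000000000; $allViews    = 14000000000;
printf( "articles %.1f%%, edits %.1f%%, page views %.1f%%\n",
    100 * $enArticles / $allArticles,
    100 * $enEdits / $allEdits,
    100 * $enViews / $allViews );
// prints: articles 20.0%, edits 36.9%, page views 57.1%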

Best regards,
Bence


Re: [Wikitech-l] Making my $wgHooks['SkinTemplateNavigation'] more future-proof

2011-01-13 Thread jidanni
 DF == Daniel Friesen li...@nadir-seen-fire.com writes:
DF Using in_array and array_diff on an exploded array should work fine.
OK, but that would take several times as many lines as my current:

foreach ( array_keys( $links['namespaces'] ) as $ns ) {
 if ( strpos( $ns, 'talk' ) !== false ) {
  if ( 'new' == $links['namespaces'][$ns]['class'] ) {
   unset( $links['namespaces'][$ns] ); } } }

and force me to hardwire in more specific knowledge of your current structure, 
no?
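
For comparison, my rough guess at what the in_array/array_diff version
would look like (purely hypothetical, since I'm only going from your
one-line description, so correct me if this isn't what you meant):

$unwanted = array();
$kept = array();
foreach ( $links['namespaces'] as $ns => $tab ) {
  // collect the keys of talk tabs whose pages don't exist yet
  if ( strpos( $ns, 'talk' ) !== false
    && in_array( 'new', explode( ' ', $tab['class'] ) ) ) {
    $unwanted[] = $ns;
  }
}
foreach ( array_diff( array_keys( $links['namespaces'] ), $unwanted ) as $ns ) {
  $kept[$ns] = $links['namespaces'][$ns];
}
$links['namespaces'] = $kept;

Either way it ends up knowing just as much about the array layout, only
with more scaffolding.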



Re: [Wikitech-l] Ratio of interwiki bots' edits

2011-01-13 Thread Ashar Voultoiz
On 13/01/11 13:23, Amir E. Aharoni wrote:
snip
 3. A human editor waits for the interwiki bots to pick it up and
 propagate to 20 other Wikipedias in which this article already exists.
snip
 This is a major reason to have the Interlanguage extension finally
 enabled. Besides a MAJOR cleaning-up in Recent Changes in all
 Wikipedias, it will give a somewhat clearer picture of the activity in
 the smaller ones.

Another solution would be to extract Categories and Interwiki links out 
of text.  They are metadata after all.
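
The parser already pulls them out into the langlinks and categorylinks
tables on save; the idea would be to store and edit them there directly
instead of in the wikitext. A quick sketch of reading them back from the
table side (field names from memory; $title is the page in question):

$interwikis = array();
$dbr = wfGetDB( DB_SLAVE );
$res = $dbr->select(
    'langlinks',
    array( 'll_lang', 'll_title' ),
    array( 'll_from' => $title->getArticleID() ),
    __METHOD__
);
foreach ( $res as $row ) {
    // e.g. 'sl' => 'Ira Cohen'
    $interwikis[$row->ll_lang] = $row->ll_title;
}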

-- 
Ashar Voultoiz




Re: [Wikitech-l] Expensive parser function count

2011-01-13 Thread Alex Brollo
2011/1/14 Tim Starling tstarl...@wikimedia.org



 However, I'm not sure how you obtained that result, since
 {{loop!|100|x}} just expands to {{loop|100|x}}, as it hits the
 default case of the #switch. When I try it, I get a preprocessor node
 count of 1069, not 193.


:-)

The {{loop!|100|x}} call is deprecated; loop! only handles numbers between
1 and 10 by itself, and larger numbers such as 100 should be obtained by
nesting or appending, as suggested in the template's documentation. For
backward compatibility, {{loop!|100|x}} simply calls {{loop|100|x}} instead
of raising an error. So it's not surprising that the optimization only
shows up in the range 1..10.

The preprocessor node count of 193 comes from the suggested syntax for 100
(one nesting) plus 101 (one nesting and one appended call):

{{loop!|10|{{loop!|10|x}}}}

{{loop!|10|{{loop!|10|x}}}}{{loop!|1|x}}

These calls are running in en:s:Wikisource:Sandbox now, and I got the same
metrics from that page's HTML:

<!--
NewPP limit report
Preprocessor node count: 193/1000000
Post-expand include size: 2300/2048000 bytes
Template argument size: 680/2048000 bytes
Expensive parser function count: 0/500
-->

Nevertheless, loop is mainly an example and a test, not such a useful
template. Dealing with the problem of metadata consistency on Wikisource,
which is deeply undermined by redundancy, our way of fixing things really
does produce higher metrics than other projects' results, but now I have a
small toolbox of tricks to evaluate that difference (rendering time plus
the existing metrics).

Obviously it would be great to have better metrics for a good, consistent
performance comparison; but, as I said, I don't want to overload the
servers just to produce new metrics for evaluating server overload. ;-)

Alex