[Wikitech-l] What % of WMF is en:wp?
http://www.readwriteweb.com/archives/what_will_wikipedia_look_like_in_another_10_years.php

"The most important thing, Wales told us on a press call today, is the increased diversity in languages. According to Wales, around 30% of Wikipedia articles had been in English and already that number has dropped to 20%. 'We're going to see very, very large projects in languages where we've never seen such things before,' he explained."

I'd thought en:wp was still about 30% of everything - ~1/3 of the edits, ~1/3 of the articles, ~1/3 of the page hits, etc.

What are the various numbers? Is there anywhere to look them up, or some source from which they could be derived?

- d.
[Wikitech-l] Ratio of interwiki bots' edits
It started under the subject "What % of WMF is en:wp?".

2011/1/13 David Gerard dger...@gmail.com:
> I'd thought en:wp was still about 30% of everything - ~1/3 the edits,
> ~1/3 the articles, ~1/3 the page hits, etc.

I'm sorry about the shameless plug, but I just had to mention that the number of edits in all the Wikipedias will change quite significantly when bug 15607 is closed ( https://bugzilla.wikimedia.org/show_bug.cgi?id=15607 ).

Currently the bulk of the edits in the minor-language Wikipedias is made by interwiki bots, and I have a hunch that en.wp is the de facto hub for adding interlanguage links. The workflow is, more or less:

1. A human editor creates the article [[Ira Cohen]] in, say, Slovenian (sl), after it already exists in 20 other Wikipedias.
2. A human editor adds [[sl:Ira Cohen]] to the article [[Ira Cohen]] in en.wp.
3. A human editor waits for the interwiki bots to pick it up and propagate it to the 20 other Wikipedias in which the article already exists.

That makes it:
* 1 human edit in sl.wp.
* 1 human edit in en.wp.
* 20 bot edits in other Wikipedias.

After the Interlanguage extension is enabled, it will be:
* 1 human edit in sl.wp.
* 1 human edit in en.wp.
* 0 bot edits (some behind-the-scenes magic pushes the changes to 20 wikis, but it isn't seen in Recent Changes).

This is a major reason to have the Interlanguage extension finally enabled. Besides a MAJOR cleaning-up of Recent Changes in all Wikipedias, it will give a somewhat clearer picture of the activity in the smaller ones.

--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
"We're living in pieces, I want to live in peace." - T. Moore
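P.S. To make the edit accounting concrete, here is a toy sketch in plain PHP (not the actual bot code; the 20-wiki count is just the example above):

<?php
// Back-of-the-envelope model of the edit accounting described above.
// $otherWikis is a hypothetical figure; the real number varies per article.
function editsForNewLanguageLink( $otherWikis, $interlanguageExtension = false ) {
	$humanEdits = 2; // one edit creating the article, one adding the link on en.wp
	$botEdits = $interlanguageExtension ? 0 : $otherWikis; // today a bot touches every other wiki
	return array( 'human' => $humanEdits, 'bot' => $botEdits );
}

print_r( editsForNewLanguageLink( 20 ) );        // human: 2, bot: 20 (today)
print_r( editsForNewLanguageLink( 20, true ) );  // human: 2, bot: 0 (with the extension)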
Re: [Wikitech-l] Inclusion request for the Validator extension
Hey,

> It is entirely to do with your original e-mail, as it is a valid reason why
> your original request (inclusion of Validator in core) should not be carried
> out at this time, and discussion of it is important (at least if you intend
> to improve the extension to overcome this issue).

This is just one single issue, and it stands rather apart from the question of whether the extension should be included or not. You will always be able to find issues of some sort, so with this kind of focus on superficial bugs (this really is one; it can be fixed in under a minute), and avoiding the real question of inclusion, nothing will ever happen. First reach an agreement on whether it's a good idea or not; only then take care of the small changes that might need to be made before including it.

Cheers

--
Jeroen De Dauw
http://blog.bn2vs.com
Don't panic. Don't be evil.
--
Re: [Wikitech-l] Expensive parser function count
Just to give an example: I wrote a different algorithm for [[en:s:Template:Loop]], naming it [[en:s:Template:Loop!]], and I asked each of them for 100 and 101 dots in an empty sandbox preview. These are the results:

Sandbox, empty, preview:
Preprocessor node count: 35/1000000
Post-expand include size: 1858/2048000 bytes
Template argument size: 450/2048000 bytes
Expensive parser function count: 0/500

Sandbox, 2 calls to loop to print 100 and 101 dots, preview:
Preprocessor node count: 1045/1000000
Post-expand include size: 2260/2048000 bytes
Template argument size: 1551/2048000 bytes
Expensive parser function count: 0/500

Sandbox, 2 calls to loop! to print the same dots, preview:
Preprocessor node count: 193/1000000
Post-expand include size: 2300/2048000 bytes
Template argument size: 680/2048000 bytes
Expensive parser function count: 0/500

Is there really no useful feedback in these data? Is there really no correlation with server load?

Alex
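If anyone wants to grab these numbers for other pages without previewing by hand, a quick-and-dirty sketch (the URL is only an example, and the exact wording of the report comment may vary between MediaWiki versions):

<?php
// Pull the NewPP limit report out of a rendered page's HTML.
$html = file_get_contents( 'http://en.wikisource.org/wiki/Wikisource:Sandbox' );
if ( preg_match( '/<!--\s*NewPP limit report(.*?)-->/s', $html, $m ) ) {
	foreach ( explode( "\n", trim( $m[1] ) ) as $line ) {
		echo trim( $line ) . "\n"; // e.g. "Preprocessor node count: 193/1000000"
	}
}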
Re: [Wikitech-l] Expensive parser function count
On Wed, Jan 12, 2011 at 6:51 PM, Tim Starling tstarl...@wikimedia.org wrote:
[snip]
> When I optimise the parse time of particular pages, I don't even use my
> sysadmin access. The best way to do it is to download the page with all its
> templates using Special:Export, and then to load it into a local wiki.
> Parsing large pages is typically CPU-dominated, so you can get a very good
> approximation without simulating the whole network.
>
> Once the page is in your local wiki, you can use whatever profiling tools
> you like: the MW profiler with extra sections, xdebug, gprof, etc. And you
> can modify the test cases very easily.

Well, that's the entire point of WP:PERF, at least before it was elevated to acronym apotheosis. One might reword it as "optimize through science, not superstition".

You're exactly the sort of person who can and should worry about performance: you have well-developed debugging skills and significant knowledge of the system internals. By following well-understood logical processes you can very effectively identify performance bottlenecks and find either workarounds ("do your template like THIS and it's faster") or fixes ("if we make THIS change to the parser or database lookup code, it goes faster").

I'm going to go out on a limb, though, and say that most people don't themselves have the tools or skills to do that. It's not rocket science, but these are not standard-issue skills. (Maybe they should be, but that's a story for the educational system!) The next step from thinking there's a problem is to investigate it knowledgeably, which means either *having* those skills already, *developing* them, or *finding* someone else who does.

-- brion
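For anyone who hasn't done the export step before, it boils down to something like this (a sketch only; the page title and filename are just examples, and Special:Import works too if you don't want to run the maintenance script):

<?php
// Fetch a page plus every template it uses via Special:Export,
// ready to import into a local test wiki for profiling.
$params = http_build_query( array(
	'title'     => 'Special:Export',
	'pages'     => 'Barack Obama', // the page whose parse time you want to study
	'templates' => '1',            // also export the templates the page transcludes
	'curonly'   => '1',            // current revisions only
) );
$xml = file_get_contents( 'http://en.wikipedia.org/w/index.php?' . $params );
file_put_contents( 'export.xml', $xml );
// Then, on the local wiki: php maintenance/importDump.php export.xml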
Re: [Wikitech-l] What % of WMF is en:wp?
On Thu, Jan 13, 2011 at 1:02 PM, David Gerard dger...@gmail.com wrote:
> http://www.readwriteweb.com/archives/what_will_wikipedia_look_like_in_another_10_years.php
>
> "The most important thing, Wales told us on a press call today, is the increased diversity in languages. According to Wales, around 30% of Wikipedia articles had been in English and already that number has dropped to 20%. 'We're going to see very, very large projects in languages where we've never seen such things before,' he explained."
>
> I'd thought en:wp was still about 30% of everything - ~1/3 of the edits, ~1/3 of the articles, ~1/3 of the page hits, etc.
>
> What are the various numbers? Is there anywhere to look them up, or some source from which they could be derived?

The article one should be easy: 15 million articles in total, 3 million of them in English -- ~20%. (The 15 million is off the top of my head, but should be right.)

For edits, I guess you can only use the cumulative totals, which are easily available: for English it is 437,897,137, and the total is 1,186,000,000 -- ~37%. (See Special:Statistics, and Emijrp's counter.)

If I read http://stats.wikimedia.org/reportcard right, the page views are 8 billion on the English Wikipedia out of 14 billion total -- ~57%.

Best regards,
Bence
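The arithmetic, for anyone who wants to plug in fresher figures (the numbers below are just the ones quoted in this thread, so treat them as ballpark):

<?php
// en.wp's share of articles, edits and page views, per the figures above.
$shares = array(
	'articles'   => array( 3000000, 15000000 ),
	'edits'      => array( 437897137, 1186000000 ),
	'page views' => array( 8.0e9, 14.0e9 ),
);
foreach ( $shares as $label => $pair ) {
	printf( "%s: %.0f%%\n", $label, 100 * $pair[0] / $pair[1] );
}
// articles: 20%, edits: 37%, page views: 57%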
Re: [Wikitech-l] Making my $wgHooks['SkinTemplateNavigation'] more future-proof
DF == Daniel Friesen li...@nadir-seen-fire.com writes:

DF> Using in_array and array_diff on an exploded array should work fine.

OK, but that would take several times as many lines as my current:

foreach ( array_keys( $links['namespaces'] ) as $ns ) {
	if ( strpos( $ns, 'talk' ) !== false ) {
		if ( 'new' == $links['namespaces'][$ns]['class'] ) {
			unset( $links['namespaces'][$ns] );
		}
	}
}

and force me to hardwire in more specific knowledge of your current structure, no?
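If I've understood the suggestion correctly, it would look something like the following (my guess at the intent, so correct me if I'm wrong; the point presumably being that the class attribute may hold several space-separated classes):

// Same hook context as above: $links comes from the SkinTemplateNavigation hook.
foreach ( array_keys( $links['namespaces'] ) as $ns ) {
	if ( strpos( $ns, 'talk' ) === false ) {
		continue;
	}
	// Split the class attribute and test for 'new' among its parts,
	// instead of comparing against the whole string.
	$classes = explode( ' ', $links['namespaces'][$ns]['class'] );
	if ( in_array( 'new', $classes ) ) {
		unset( $links['namespaces'][$ns] );
	}
}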
Re: [Wikitech-l] Ratio of interwiki bots' edits
On 13/01/11 13:23, Amir E. Aharoni wrote:
[snip]
> 3. A human editor waits for the interwiki bots to pick it up and propagate
> it to the 20 other Wikipedias in which the article already exists.
[snip]
> This is a major reason to have the Interlanguage extension finally enabled.
> Besides a MAJOR cleaning-up of Recent Changes in all Wikipedias, it will
> give a somewhat clearer picture of the activity in the smaller ones.

Another solution would be to extract categories and interwiki links out of the wikitext. They are metadata, after all.

--
Ashar Voultoiz
Re: [Wikitech-l] Expensive parser function count
2011/1/14 Tim Starling tstarl...@wikimedia.org:
> However, I'm not sure how you obtained that result, since {{loop!|100|x}}
> just expands to {{loop|100|x}}, since it hits the default case of the
> #switch. When I try it, I get a preprocessor node count of 1069, not 193.

:-) The {{loop!|100|x}} form is deprecated; loop! by itself only handles numbers between 1 and 10, and larger numbers such as 100 should be obtained by nesting or appending, as suggested in the template's documentation. For backward compatibility, {{loop!|100|x}} simply calls {{loop|100|x}} instead of raising an error. So it's not surprising that the optimization only shows up in the 1..10 range.

The preprocessor node count of 193 comes from the suggested syntax for 100 dots (one nesting) plus 101 dots (one nesting and one appending):

{{loop!|10|{{loop!|10|x}}}}
{{loop!|10|{{loop!|10|x}}}}{{loop!|1|x}}

This call is running in en:s:Wikisource:Sandbox now, and I got the same metrics from that page's HTML:

<!-- NewPP limit report
Preprocessor node count: 193/1000000
Post-expand include size: 2300/2048000 bytes
Template argument size: 680/2048000 bytes
Expensive parser function count: 0/500
-->

Nevertheless, loop is mainly an example and a test, not so much a useful template. Dealing with the problem of metadata consistency on Wikisource, which is deeply undermined by redundancy, our way of fixing things really does produce higher metrics than other projects' results, but now I have a small toolbox of tricks to evaluate that difference (rendering time plus the existing metrics). Obviously it would be great to have better metrics for good, consistent performance comparison; but, as I said, I don't want to overload the servers just to produce new metrics for evaluating server overload. ;-)

Alex
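The nesting rule is mechanical enough that it can be generated; a small sketch (illustration only, not part of the template, and only meaningful for counts up to about 110, where the tens part still fits in loop!'s 1..10 range):

<?php
// Build the nested/appended {{loop!}} wikitext for a given count:
// tens via nesting, remainder via appending.
function loopWikitext( $count, $text ) {
	$tens = (int)( $count / 10 );
	$rest = $count % 10;
	$out = '';
	if ( $tens > 0 ) {
		$out .= '{{loop!|' . $tens . '|{{loop!|10|' . $text . '}}}}';
	}
	if ( $rest > 0 ) {
		$out .= '{{loop!|' . $rest . '|' . $text . '}}';
	}
	return $out;
}

echo loopWikitext( 100, 'x' ) . "\n"; // {{loop!|10|{{loop!|10|x}}}}
echo loopWikitext( 101, 'x' ) . "\n"; // {{loop!|10|{{loop!|10|x}}}}{{loop!|1|x}}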