[Wikimedia-l] article bytes more meaningful than users or revisions (was Re: Updates on VE data analysis)

2013-07-27 Thread James Salsman
MZMcBride wrote:
... the number of non-deleted revisions per day for the
 English Wikipedia. The results are here:
 https://en.wikipedia.org/wiki/Special:Permalink/565971356

So, that looks terrible: http://i.imgur.com/Z9lYCWj.png

It looks terrible in the same way as every other graph of active
users and several other related measures.

But it isn't. The graph doesn't account for the power law of practice,
which causes everyone who has ever edited Wikipedia to get better at it
with time. And since so many IP editors are obviously returning editors,
that effect counts for much more than it would under the false but very
common assumption that every IP editor is new.

Here's what really matters, articlespace size:  http://i.imgur.com/TfaD99V.png

The size of the article text in bytes has been marching on linearly
since the beginning of Wikipedia, with extremely low variation, just
like the short popular vital articles and every other measure of
quality content.

There is no legitimate basis to worry about anything until the total
article bytes break out of their 12-year linear trend.

(If you multiply columns 'E' and 'I' from
http://stats.wikimedia.org/EN/TablesWikipediaEN.htm the database size
shows a cusp at around 2006, corresponding to the growth modes, but
two separate linear trends fit both modes far better than any growth
model fits the entire curve.)

___
Wikimedia-l mailing list
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, 
mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe

Re: [Wikimedia-l] Feedback for the Wikimedia Foundation

2013-07-27 Thread Jane Darnell
I have tried and failed to use the Visual Editor several times in the
past few weeks, and as with all new technologies, I consider myself a
follower rather than a leader, so I was very interested to look
up the Dutch feedback that Romaine was reporting. One of the comments
was that it was impossible to create a simple blue link with the VE,
since the VE wraps any attempt to do this in nowiki tags. Since linking
is one of the most basic parts of wikimarkup that anyone will use, and
since that was my problem too, I decided to investigate.

I am happy to report that I just discovered what the problem is. I had
turned off the "Show edit toolbar" option in my preferences (probably
over a year ago), so I wasn't seeing the top part of the VE edit
toolbar, which includes the hyperlink icon, among other things. I was
only seeing the other, second, line of the VE toolbar icons for
including media, reference, references list, and transclusion.

I expect that many other experienced Wikipedians have the same
problem. This should help solve a lot of the ghost edits.

2013/7/26, David Gerard dger...@gmail.com:
 On 26 July 2013 03:12, Everton Zanella Alvarenga
 everton.alvare...@okfn.org wrote:

 Maybe a new community (less conservative?) to build a good
 encyclopedia can come up if a new platform is invented?


 Hence "power users" as a snarl word.

 After the uprising of the 17th of June
 The Secretary of the Writers’ Union
 Had leaflets distributed in the Stalinallee
 Stating that the people
 Had forfeited the confidence of the government
 And could win it back only
 By redoubled efforts. Would it not be easier
 In that case for the government
 To dissolve the people
 And elect another?


 - d.


Re: [Wikimedia-l] article bytes more meaningful than users or revisions (was Re: Updates on VE data analysis)

2013-07-27 Thread Denny Vrandečić
Thank you for the observation.

Is the graph http://i.imgur.com/TfaD99V.png based on actual data? Because
it looks just a tad too linear to me. (I do not disagree with the
finding; I am just wondering about the graph itself.)

I would still worry, though: our content is increasing linearly, as you
say, but the number of active contributors is not. If we take for granted
that active contributors are the ones who provide quality control for the
articles, this means that since 2006 or so the ratio of contributors to
content has been declining, which would mean that our quality would
suffer.

I see two effects to counter that:

1) as you already mentioned, contributors are getting increasingly more
experienced and more effective in fulfilling their tasks.

2) we continue to have a strong increase in readers and even stronger in
pageviews (i.e. more and more people consult Wikipedia more and more). They
probably also provide a layer of quality assurance, even though they might
not qualify to be counted as active contributors.

I have the gut feeling that 1) cannot be sufficient, and I would be curious
about the effects of 2), especially considering that much of the Foundation
development work can be seen as improving 2) further (visual editor,
article rating, mobile editing, etc.)





-- 
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/681/51985.

Re: [Wikimedia-l] About the concentration of resources in SF (it was: Communication plans for community engagement

2013-07-27 Thread Balázs Viczián
Well, both Hungary and Budapest aim to be the R&D center of the region.
There are multiple government and municipal funds and programmes, plus a
lot of favourable policies at both administrative levels, including a fully
dedicated neighbourhood on the bank of the Danube, named Infopark (since
1996 [1])

Setting up a formally for-profit company in Budapest whose only client
would be the WMF (and/or other chapters) could be funded well over 50% from
non-movement funds (or low/no-interest loans) during the first few years,
and would be much cheaper than in any part of Western Europe and most of
the CEE. Doing so through WMHU or a separate non-profit is probably also
doable.

However, having one such department for the sake of having one is a total
waste of time, money and effort anywhere in the world, so the main
question is: are there enough projects to make establishing such a
department/separate entity reasonable?

Balázs

[1] http://www.infopark.hu/lang/en/




 On Wed, Jul 24, 2013 at 2:39 PM, rupert THURNER
 rupert.thur...@gmail.com wrote:

  If WMF is serious about letting development activities grow in other
  countries this might be taken into account in FDCs allocation policy.

 For my part, I'm happy to offer feedback to the FDC on plans related
 to the development of engineering capacity in FDC-funded
 organizations. I'm sure Wikimedia Germany, too, would be happy to
 share its experiences growing the Wikidata development team. I'd love
 to find ways to bootstrap more engineering capacity across the
 movement, as so many of our shared challenges have a software
 engineering component. If any folks on-list want to touch base on
 these questions at Wikimania, drop me a note. :)

 Erik

 --
 Erik Möller
 VP of Engineering and Product Development, Wikimedia Foundation


Re: [Wikimedia-l] article bytes more meaningful than users or revisions (was Re: Updates on VE data analysis)

2013-07-27 Thread James Salsman
Denny Vrandečić wrote:
...
 Is the graph http://i.imgur.com/TfaD99V.png based on actual data?

Yes, the precise sizes for the
dumps.wikimedia.org/enwiki/MMDD/enwiki-MMDD-pages-articles-multistream.xml.bz2
files are:

2012-07-02 9524994664
2012-08-02 9824345489
2012-09-02 9929910893
2012-10-01 10015876877
2012-11-01 10124555675
2012-12-01 10220499338
2013-01-02 10315766966
2013-02-04 10425240648
2013-03-04 10430830645
2013-04-03 10433658645
2013-05-03 10525475953
2013-06-04 10617572833
2013-07-08 10721955835
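
As a rough check on how linear these thirteen figures are, an ordinary
least-squares line can be fitted to them. The following is only a minimal
sketch: the names DUMP_SIZES and fit_line are illustrative, it is not the
method used to draw the graph, and the figures are compressed .bz2 file
sizes, so they are at best a proxy for article text.

```python
from datetime import date

# Dump sizes in bytes, copied from the list above.
DUMP_SIZES = [
    ("2012-07-02", 9524994664),
    ("2012-08-02", 9824345489),
    ("2012-09-02", 9929910893),
    ("2012-10-01", 10015876877),
    ("2012-11-01", 10124555675),
    ("2012-12-01", 10220499338),
    ("2013-01-02", 10315766966),
    ("2013-02-04", 10425240648),
    ("2013-03-04", 10430830645),
    ("2013-04-03", 10433658645),
    ("2013-05-03", 10525475953),
    ("2013-06-04", 10617572833),
    ("2013-07-08", 10721955835),
]


def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Use days since the first dump as the x axis.
first_day = date.fromisoformat(DUMP_SIZES[0][0])
points = [
    ((date.fromisoformat(day) - first_day).days, size)
    for day, size in DUMP_SIZES
]
slope, intercept, r_squared = fit_line(points)
print(f"growth: {slope / 1e6:.2f} MB/day, R^2 = {r_squared:.3f}")
```

An R^2 close to 1 over a single year says little about a 12-year trend; the
same function could be run over a longer series if one is available.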

The byte count approximations from multiplying columns 'E' and 'I'
from http://stats.wikimedia.org/EN/TablesWikipediaEN.htm are at the
end of this message. Again, that data best fits two linear trends,
with a cusp around 2006.
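
One way to test a "two linear trends with a cusp" claim is to try every
candidate breakpoint, fit a separate least-squares line on each side, and
compare the combined squared error with that of a single line. The sketch
below uses a small synthetic series (not the figures from this thread), and
the function names are illustrative. Note that two lines can never fit worse
than one, so a lower error alone does not prove there are two growth modes.

```python
def sse_of_line(points):
    """Sum of squared residuals of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)


def best_breakpoint(points, min_segment=3):
    """Return (k, sse) for the split points[:k] / points[k:] with lowest error."""
    best = None
    for k in range(min_segment, len(points) - min_segment + 1):
        total = sse_of_line(points[:k]) + sse_of_line(points[k:])
        if best is None or total < best[1]:
            best = (k, total)
    return best


# Synthetic example: slope 1 up to x = 5, slope 3 afterwards.
synthetic = [(x, x if x <= 5 else 5 + 3 * (x - 5)) for x in range(11)]
single_line_sse = sse_of_line(synthetic)
break_index, two_line_sse = best_breakpoint(synthetic)
print(break_index, single_line_sse, two_line_sse)
```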

 our content is increasing... but the number of active
 contributors is not.

I'm becoming increasingly convinced that as contributors become more
experienced, they choose to do most of their work logged out. What are
the advantages of using a registered account? Theoretically you can
prove that you made contributions, but as far as I know only one
person so far has ever obtained professional credit for their
contributions (there is a recent thread on wiki-research-l about
this.) What are the disadvantages of using a registered account to
edit? Anyone who opposes an edit politically is likely to examine the
entirety of the editor's contribution history and will all too often
stalk, punish by reverting old edits, or dispute the contributor's
work. Anonymous IP editors rarely face such time wasting scrutiny and
hassles. For anyone whose primary goal is to build an encyclopedia as
opposed to socializing, amassing administrative power, or obtaining a
job with the Foundation, the choice is obvious.  Those who wish their
contributions to be remembered for posterity are more likely to become
serial puppeteers than registered editors, unless they want to spend
most of their time being hassled in article space.

John Vandenberg wrote:
...
 I would love to see stats about quality rather than quantity

It would be a mistake to rely on volunteer or Foundation assessments
of quality, because the likelihood that they would be biased is far too
great. We should rely only on third party assessments of article
quality, such as those in
http://en.wikipedia.org/wiki/Reliability_of_Wikipedia#Assessments
nearly all of which show continuous ongoing improvement.

Automatic measures of quality proposed so far have not really
impressed me, but I think http://arxiv.org/pdf/1206.2517.pdf has huge
potential and I am confident that the ideas it promotes will be easily
automated by bots after it is proven through peer review.

 Does anyone have stats for the number of blocked users per month

Yes, but it's almost meaningless because the vast majority of blocks
are for persistent vandalism, often at schools or libraries where we
really have no way to determine whether the editors involved ever
returned to do productive work.

---

Products of columns 'E' and 'I' from
http://stats.wikimedia.org/EN/TablesWikipediaEN.htm :

Jan-10 1133050
Dec-09 1126230
Nov-09 1120650
Oct-09 1078800
Sep-09 1072500
Aug-09 1065300
Jul-09 1026310
Jun-09 1021380
May-09 979160
Apr-09 971880
Mar-09 932850
Feb-09 930150
Jan-09 925020
Dec-08 885560
Nov-08 880620
Oct-08 841500
Sep-08 837500
Aug-08 831750
Jul-08 796080
Jun-08 794160
May-08 755780
Apr-08 749800
Mar-08 711260
Feb-08 706860
Jan-08 673890
Dec-07 669900
Nov-07 631800
Oct-07 625600
Sep-07 585960
Aug-07 582350
Jul-07 549900
Jun-07 518160
May-07 514080
Apr-07 479360
Mar-07 472480
Feb-07 466240
Jan-07 432000
Dec-06 425700
Nov-06 391720
Oct-06 387100
Sep-06 355160
Aug-06 351000
Jul-06 319560
Jun-06 289630
May-06 285670
Apr-06 255700
Mar-06 2476177000
Feb-06 2312907000
Jan-06 2170049000
Dec-05 201360
Nov-05 1869076000
Oct-05 174696
Sep-05 1627864000
Aug-05 1526784000
Jul-05 1407976000
Jun-05 1300334000
May-05 1209984000
Apr-05 1002925000
Mar-05 92463
Feb-05 87232
Jan-05 838272000
Dec-04 861724000
Nov-04 806195000
Oct-04 743904000
Sep-04 689924000
Aug-04 644502000
Jul-04 595665000
Jun-04 55290
May-04 511038000
Apr-04 47675
Mar-04 440286000
Feb-04 40301
Jan-04 375536000
Dec-03 350336000
Nov-03 329219000
Oct-03 310616000
Sep-03 294689000
Aug-03 27863
Jul-03 261555000
Jun-03 244454000
May-03 230328000
Apr-03 21720
Mar-03 20463
Feb-03 193475000
Jan-03 182936000
Dec-02 17101
Nov-02 16215
Oct-02 15048
Sep-02 80733000
Aug-02 6699
Jul-02 59755000
Jun-02 5542
May-02 49259000
Apr-02 4779
Mar-02 44968000
Feb-02 3935
Jan-02 30582000
Dec-01 26832000
Nov-01 21994000
Oct-01 17244000
Sep-01 10982000
Aug-01 710
Jul-01 4186000
Jun-01 324
May-01 2373600
Apr-01 1295800
Mar-01 596904
Feb-01 186636
Jan-01 33800


Re: [Wikimedia-l] About the concentration of resources in SF (it was: Communication plans for community engagement

2013-07-27 Thread aude
On Sat, Jul 27, 2013 at 1:57 AM, Erik Moeller e...@wikimedia.org wrote:

 On Wed, Jul 24, 2013 at 2:39 PM, rupert THURNER
 rupert.thur...@gmail.com wrote:

  If WMF is serious about letting development activities grow in other
  countries this might be taken into account in FDCs allocation policy.

 For my part, I'm happy to offer feedback to the FDC on plans related
 to the development of engineering capacity in FDC-funded
 organizations. I'm sure Wikimedia Germany, too, would be happy to
 share its experiences growing the Wikidata development team. I'd love
 to find ways to bootstrap more engineering capacity across the
 movement, as so many of our shared challenges have a software
 engineering component. If any folks on-list want to touch base on
 these questions at Wikimania, drop me a note. :)


Chapters undertaking technology work is definitely a good thing!

I can say personally, unofficially (as a member of the Wikidata team), that
I am definitely happier working in Berlin (with a lower salary that goes
pretty far) than in SF. I am not convinced I could afford the same
lifestyle in SF on the salary offered by WMF.

Could one afford to live on their own in a one-bedroom apartment in SOMA,
which has a median cost of $3,475 [1] a month, on a WMF salary? Or would I
need to have a flatmate or commute from farther away?

The rule of thumb is that one should not spend more than 30% of their
income (after tax!) on rent, and ideally a smaller percentage than that.
That requires a salary of roughly $11,600 (after tax) per month.
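
The arithmetic behind that figure can be sketched as follows: dividing the
monthly rent by the maximum share of income spent on rent gives the minimum
after-tax income. The function name is illustrative; the rent and the 30%
share are the ones quoted above.

```python
def required_net_income(monthly_rent, max_rent_share):
    """Minimum after-tax monthly income so that rent stays within the share."""
    return monthly_rent / max_rent_share


median_rent = 3475          # median one-bedroom rent from [1], in dollars
needed = required_net_income(median_rent, 0.30)
print(f"${needed:,.0f} per month after tax")
```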

I can very easily live on my own in the best parts of Berlin, near the WMDE
office, or wherever I want. Just sayin'  :)

I understand that lots of people like to live in SF anyway, even with
whatever sacrifices they must make to afford it.  And good that WMF offers
the remote work option.

[1] http://priceonomics.com/the-san-francisco-rent-explosion/

Cheers,
Katie



-- 
@wikimediadc / @wikidata

Re: [Wikimedia-l] Feedback for the Wikimedia Foundation

2013-07-27 Thread phoebe ayers
Right -- and if not that specifically, I'd imagine most experienced users
do have various hacks, scripts, gadgets etc installed that we've
accumulated over the years. I know many people who have been editing for a
long time have a custom skin as well.

I don't know how any of these might or might not affect VE performance, but
one thing about the VE being enabled for IPs too (at least on a few
wikipedias) is you can always log out and see if the same problem persists
:)

-- phoebe


-- 
* I use this address for lists; send personal messages to phoebe.ayers at
gmail.com *

Re: [Wikimedia-l] Feedback for the Wikimedia Foundation

2013-07-27 Thread Erik Moeller
On Sat, Jul 27, 2013 at 12:45 AM, Jane Darnell jane...@gmail.com wrote:

 I am happy to report that I just discovered what the problem is. I had
 turned off the Show edit toolbar option in my preferences (probably
 over a year ago), so I wasn't seeing the top part of the VE edit
 toolbar, which includes the hyperlink icon, among other things. I was
 only seeing the other, second, line of the VE toolbar icons for
 including media, reference, references list, and transclusion.

That's interesting, Jane; thanks for the report. I'm not able to
reproduce this - as far as I can tell, the preference is completely
ignored by VE. If you or someone can get a repro on the exact
circumstances under which this occurs, please drop me a note or
directly add it to Bugzilla.

Thanks,
Erik


Re: [Wikimedia-l] [Wikitech-l] Collaborative machine translation for Wikipedia -- proposed strategy

2013-07-27 Thread Samuel Klein
David - thanks for this proposal; it is something that deserves
attention, and our projects are already used as one of the raw sources
for machine translation efforts.

On Sat, Jul 27, 2013 at 10:18 AM, David Cuenca dacu...@gmail.com wrote:
 On Fri, Jul 26, 2013 at 11:30 PM, C. Scott Ananian
 canan...@wikimedia.org wrote:

 Step one of a machine
 translation effort should be to provide tools to annotate parallel texts in
 the various wikis, and to edit and maintain their parallelism.

I agree with most of Scott's input here.

 Scott, edit and maintain parallelism sounds wonderful on paper, until you
 want to implement it and then you realize that you have to freeze changes
 both in the source text and in the target language for it to happen, which
 is, IMHO against the very nature of wikis.

You don't need to freeze changes - you need permalinks to revisions,
the ability to track linkages between [sentences] in rev A.n in
language A and those in rev B.m in language B, and three-way diffs.
All are tractable problems.
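
A minimal sketch of the linkage idea, under stated assumptions: each link
records a sentence position in a pinned source revision and the matching
position in a pinned target revision, and when the source moves on, a diff
over its sentences flags which links need a translator's attention. The
SentenceLink type and both function names are hypothetical, and the
comparison here is a plain two-way diff from Python's difflib, standing in
for the three-way diff mentioned above.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class SentenceLink:
    src_rev: str    # permalink or revision id of the source, e.g. "A.n"
    src_index: int  # sentence position within that source revision
    dst_rev: str    # revision id of the translation, e.g. "B.m"
    dst_index: int  # sentence position within that target revision


def changed_source_indices(old_sentences, new_sentences):
    """Positions in the old revision whose sentences were altered or removed."""
    matcher = SequenceMatcher(a=old_sentences, b=new_sentences, autojunk=False)
    changed = set()
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            changed.update(range(i1, i2))
    return changed


def stale_links(links, old_sentences, new_sentences):
    """Links whose source sentence no longer matches the newer source revision."""
    changed = changed_source_indices(old_sentences, new_sentences)
    return [link for link in links if link.src_index in changed]


old = ["First sentence.", "Second sentence.", "Third sentence."]
new = ["First sentence.", "Second sentence, now revised.", "Third sentence."]
links = [SentenceLink("A.n", i, "B.m", i) for i in range(3)]
stale = stale_links(links, old, new)
print([link.src_index for link in stale])
```

Neither wiki is frozen in this scheme: edits continue on both sides, and
only the flagged links are queued for review.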

 Translate:Extension already does that in a way. I see it useful only for
 texts acting as a central hub for translations, like official
 communication. If that were to happen for all kind of content you would
 have to sacrifice the plurality of letting each wiki to do their own version.

Allowing for a plurality of versions is useful.  There's no special
reason to break this out by language (if anything, there should be one
version per major cultural group - groups with different definitions
of reliable sources, for instance - not per language).  We should
separate plurality of branches of a document from synchronizing
translations of a given branch where a single branch of a document
should be available in any language.

For instance, I may want to read a French translation of the Russian
WP version of articles related to the Sino-Soviet war, in English --
in addition to the Japanese WP version, and the native French WP
version.  We can reduce the difficulty of translating each branch by
noting their shared similarities -- especially if we track the
revision at which each branched from, or rebased to, a shared trunk.
Allowing translators to automatically capture the source-revision when
carrying out an update via translation, per-page or per-section, would
make this easier.

 The most popular statistical-based machine translation system has created
 its engine using texts extracted from *the whole internet*, it requires
 huge processing power, and that without mentioning the amount of resources

One can do better with less power with parallel corpora.   WP and
Wikisource provide some of the closest things to a collection of
parallel corpora -- anything we can do to further clarify how much
these documents are parallel, and to improve their parallelism, will
improve [free] machine translation tools greatly.

 Of course statistical-based approaches should also be used as well (point 8
 of the proposed workflow), however more as a supporting technology rather
 than the main one.

+1

 One single researcher can create working transfer rules for a language pair
 in 3 months or less if there is previous work (see these GsoC [1], [2],
 [3]). Whichever problem the translation has, it can be understood and
 corrected...  [and] lower the entry barrier for linguists and translators alike,

Right.  It's much easier to get a rules-based system that is close
enough to be useful to human translators, to speed up their work and
lower the entry barrier for someone to start translating, than to do a
complete job with rules.

 that there is no need to marry a technology, several can be developed in
 parallel and brought to a point of convergence where they work together

+10

Warmly,
SJ


Re: [Wikimedia-l] Feedback for the Wikimedia Foundation

2013-07-27 Thread Jane Darnell
I am using Chrome and a pretty old computer. I am having trouble
getting my editing toolbar to go away again - probably cache
problems. I will try to reproduce this properly tomorrow, and
otherwise chalk it up to RTFM.



Re: [Wikimedia-l] article bytes more meaningful than users or revisions (was Re: Updates on VE data analysis)

2013-07-27 Thread Mark

On 7/27/13 10:29 AM, Denny Vrandečić wrote:

I would still worry, though: our content is increasing linearly, as you
say, but the number of active contributors is not. If we take for granted
that active contributors are the ones who provide quality control for the
articles, this means that since 2006 or so the ratio of contributors to
content has been declining, which would mean that our quality would
suffer.



One useful bit of information is what *kind* of editors there are, not 
just the raw numbers.


For example, here is a hypothetical situation, which I think James and 
John are contemplating, which would result in a numerical decline in 
editors-per-article with no real change in actual editorial attention to 
the article:


* Article in 2007, with 19 editors: Initial content written by 1 person, 
moderate expansions from 3 people, copyediting from 5 people, 
vandalism-rollback from 10 people


* Similar article in 2013, with 12 editors: Initial content written by 1 
person, moderate expansions from 3 people, copyediting from 3 people and 
1 typo-fixing bot, vandalism-rollback from 2 people and 2 anti-vandal bots


Basically all that happened in this hypothetical is that two of the 
typo-fixers were replaced by a typo-fixing bot, and 8 rollbacks that 
would've once been done by recent-changes patrollers were instead done 
by a smaller number of anti-vandal bots. Maybe that's not what the 
change looks like, but I don't think the raw edit-count data can tell us 
either way.


I think this is also a potential issue with the definition of active 
users, which is set at 5 edits/month for "active" and 100 
edits/month for "very active". The latter in particular much more 
heavily favors people who make many smaller edits versus fewer large 
edits. And are there editors contributing substantial amounts of content 
to Wikipedia who don't even hit the lower threshold? One possible group 
are people whose main contribution is to write new articles, and do 
little to no other editing. Some people write offline and then 
contribute a new, well-referenced article in a single edit. If that's 
their only involvement in Wikipedia, they wouldn't be counted as active 
Wikipedians in the numbers, even if they're sending us a steady stream 
of 1-2 new articles/month.


I'm not sure how to best answer those questions automatically. Bytes, as 
James suggests, could be one possible proxy, but in addition to total 
bytes, we could look at the editor level. Has there been a decline in 
active editors if we define active editing as changing more than N 
bytes in the encyclopedia in a month, not counting rollbacks? That would 
count people who wrote substantial new articles as active, even if they 
did it in only 1 or 2 edits/month (although on the other hand, it 
wouldn't count people who made 100 rollbacks and no other edits).
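
A minimal sketch of that byte-based definition, assuming a hypothetical
record format (one dict per edit with editor, month, bytes_changed and an
is_rollback flag; no real dump or API provides exactly this shape). It sums
the absolute bytes changed per editor per month, skips rollbacks, and keeps
editors above the threshold N, here called min_bytes.

```python
from collections import defaultdict


def active_editors_by_bytes(edits, min_bytes):
    """Map each month to the set of editors who changed more than min_bytes,
    not counting rollbacks."""
    totals = defaultdict(int)
    for edit in edits:
        if edit["is_rollback"]:
            continue  # rollbacks do not count toward the byte total
        totals[(edit["month"], edit["editor"])] += abs(edit["bytes_changed"])

    active = defaultdict(set)
    for (month, editor), total in totals.items():
        if total > min_bytes:
            active[month].add(editor)
    return dict(active)


# Hypothetical sample: one offline writer, one patroller, one typo fixer.
sample_edits = [
    {"editor": "writer", "month": "2013-06", "bytes_changed": 6000,
     "is_rollback": False},
    {"editor": "patroller", "month": "2013-06", "bytes_changed": -4000,
     "is_rollback": True},
    {"editor": "patroller", "month": "2013-06", "bytes_changed": 4000,
     "is_rollback": True},
    {"editor": "typo_fixer", "month": "2013-06", "bytes_changed": 3,
     "is_rollback": False},
]
active = active_editors_by_bytes(sample_edits, min_bytes=1000)
print(active)
```

With this sample, the writer who made a single large edit counts as active,
while the patroller who only made rollbacks does not, which is the trade-off
described above.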


Another possibility could be to sample a subset of either articles, or 
of editors, and manually annotate what kind of editing is going on. More 
tedious and would of necessity be on a small subset of the encyclopedia, 
but might avoid papering over things that are obvious when you look at 
them but tend to get lost in big-data analyses.


-Mark


Re: [Wikimedia-l] Collaborative machine translation for Wikipedia -- proposed strategy

2013-07-27 Thread Laura Hale
On Saturday, July 27, 2013, David Cuenca wrote:

 On Fri, Jul 26, 2013 at 11:30 PM, C. Scott Ananian
 canan...@wikimedia.org wrote:

  This statement seems rather defeatist to me.  Step one of a machine
  translation effort should be to provide tools to annotate parallel texts
  in the various wikis, and to edit and maintain their parallelism.


 Scott, edit and maintain parallelism sounds wonderful on paper, until you
 want to implement it and then you realize that you have to freeze changes
 both in the source text and in the target language for it to happen, which
 is, IMHO against the very nature of wikis.
 Translate:Extension already does that in a way. I see it useful only for
 texts acting as a central hub for translations, like official
 communication. If that were to happen for all kind of content you would
 have to sacrifice the plurality of letting each wiki to do their own version.


Actually, this sort of translation service might be extremely useful for us
on Wikinews.  We have a fair amount of direct cross translation work from
one language to the other.  Our articles generally become non-editable
after a short period of time because of the nature of news reporting.
There are issues with original reporting: for example, original
Czech-language reporting on anything outside the major stories that
international media can easily sell for syndication does not get reported
more widely.
 Thus more local news from minority languages being shared... yeah, big
benefit for us. :)  There might be a few Wikinews language projects that
would be willing to sign on as beta testers for a collaborative translating
tool. :)  I think one of our regulars, Gryllida, has been trying to develop
a tool to make translating easier so it would fit really well with existing
project goals.

Sincerely,
Laura Hale


-- 

-- 
mobile:   635209416
twitter: purplepopple
blog: ozziesport.com

Re: [Wikimedia-l] About the concentration of resources in SF (it was: Communication plans for community engagement

2013-07-27 Thread Craig Franklin
Hi Erik (and whomever from WMDE),

For the benefit of chapters that are interested in this space, can you
offer any examples of projects that are of an appropriate size and type for
a chapter to take on?  I think that most chapters* would be willing to help
out in the software development space if we got a bit of direction on how
we could be the most useful.

Cheers,
Craig Franklin

* Keeping in mind that my chapter probably wouldn't have the capacity to
start anything in this space for at least another twelve months.


