Hoi,
Apparently you are not aware that the Bengali Wikipedia is the biggest
resource in Bengali on the Internet. As a consequence it is a big success !!
Sure there should be more articles and we would absolutely welcome more
articles, more readers, and more positive attention for the Bengali Wikipedia.
On Thu, Aug 20, 2009 at 9:03 AM, Gerard Meijssen gerard.meijs...@gmail.com wrote:
Apparently you are not aware that the Bengali Wikipedia is the biggest
resource in Bengali on the Internet. As a consequence it is a big success !!
Sure there should be more articles and we would absolutely welcome
On Thu, Aug 20, 2009 at 5:22 AM, Lars Aronsson l...@aronsson.se wrote:
Of these 270 languages of Wikipedia, only 41 have more than 50,000
articles and only 69 had more than 1 million page views in July of
2009. The 69th most used Wikipedia is Swahili. This East African
language has 50 million
I am supposed to be taking a wiki-vacation to finish my PhD thesis and
find a job for next year. However, this afternoon I decided to take a
break and consider an interesting question recently suggested to me by
someone else:
When one downloads a dump file, what percentage of the pages are
Andre Engels wrote:
On Thu, Aug 20, 2009 at 5:22 AM, Lars Aronsson l...@aronsson.se wrote:
Of these 270 languages of Wikipedia, only 41 have more than 50,000
articles and only 69 had more than 1 million page views in July of
2009. The 69th most used Wikipedia is Swahili. This East African
Andre Engels wrote:
On Thu, Aug 20, 2009 at 5:22 AM, Lars Aronsson l...@aronsson.se wrote:
Of these 270 languages of Wikipedia, only 41 have more than 50,000
articles and only 69 had more than 1 million page views in July of
2009. The 69th most used Wikipedia is Swahili. This East African
Andre Engels wrote:
On Thu, Aug 20, 2009 at 5:22 AM, Lars Aronsson l...@aronsson.se wrote:
Of these 270 languages of Wikipedia, only 41 have more than 50,000
articles and only 69 had more than 1 million page views in July of
2009. The 69th most used Wikipedia is Swahili. This East
Robert, thanks for this. I have long wanted that number: it is really
interesting.
-----Original Message-----
From: Robert Rohde raro...@gmail.com
Date: Thu, 20 Aug 2009 03:06:06
To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org;
English
On Thu, Aug 20, 2009 at 6:06 AM, Robert Rohde raro...@gmail.com wrote:
[snip]
When one downloads a dump file, what percentage of the pages are
actually in a vandalized state?
Although you don't actually answer that question, you answer a
different question:
[snip]
approximations: I considered
On Thu, Aug 20, 2009 at 12:06 PM, Robert Rohde raro...@gmail.com wrote:
Given the nature of the approximations I made in doing this analysis I
suspect it is more likely that I have somewhat underestimated the
vandalism problem rather than overestimated it, but as I said in the
beginning I'd
Robert Rohde wrote:
When one downloads a dump file, what percentage of the pages are
actually in a vandalized state?
This is equivalent to asking, if one chooses a random page from
Wikipedia right now, what is the probability of receiving a vandalized
revision?
Is there a possibility of
Gregory Maxwell wrote:
If you were using "is gay" as a measure of vandalism
over time you might conclude that vandalism is decreasing when in
reality cluebot is performing the same kind of analysis for its
automatic vandalism suppression and the vandals have responded by
vandalizing in forms
David Gerard wrote:
Yes, completely. Do other Wikipedias show the same S-curve of growth?
I don't think it's an S-curve. I think we are seeing linear
growth, with a few exceptions in the very early days (years).
But hey, that's growth in the number of articles. We shouldn't
focus on the
While the time and effort that went into Robert Rohde's analysis are
certainly extensive, the outcomes are based on so many flawed assumptions
about the nature of vandalism and vandalism reversion that one publicizes
the key finding of a 0.4% vandalism rate at one's peril.
On Thu, Aug 20, 2009 at 12:23 PM, Lars Aronsson l...@aronsson.se wrote:
David Gerard wrote:
Yes, completely. Do other Wikipedias show the same S-curve of growth?
I don't think it's an S-curve. I think we are seeing linear
growth, with a few exceptions in the very early days (years).
But hey,
On Thu, Aug 20, 2009 at 12:59 PM, Gregory Kohs thekoh...@gmail.com wrote:
While the time and effort that went into Robert Rohde's analysis is
certainly extensive, the outcomes are based on so many flawed assumptions
about the nature of vandalism and vandalism reversion, publicize at one's
On Thu, Aug 20, 2009 at 12:46 PM, Jimmy Wales jwa...@wikia-inc.com wrote:
[snip]
Greg, I think your email sounded a little negative at the start, but not
so much further down. I think you would join me heartily in being super
grateful for people doing this kind of analysis. Yes, some of it
There is another way to detect 100% reverts. It won't catch manual reverts
that are not 100% accurate, but most vandal patrollers will use undo and the
like.
For every revision calculate md5 checksum of content. Then you can easily
look back say 100 revisions to see whether this checksum
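The checksum idea described in this message can be sketched roughly as follows. This is only an illustration, not Erik's actual script: the function name, the plain list-of-strings input and the rule of skipping a revision identical to its immediate predecessor are all assumptions made for the example.

```python
import hashlib


def find_exact_reverts(revisions, window=100):
    """Return (index, restored_index) pairs for exact reverts.

    `revisions` is a hypothetical list of revision texts for one page,
    oldest first. A revision counts as an exact revert when its MD5
    checksum equals that of an earlier revision within `window`
    revisions, other than the one immediately before it.
    """
    digests = [hashlib.md5(text.encode("utf-8")).hexdigest() for text in revisions]
    reverts = []
    for i, digest in enumerate(digests):
        # A revision identical to its predecessor changed nothing,
        # so it is not treated as a revert (assumption of this sketch).
        if i > 0 and digests[i - 1] == digest:
            continue
        # Walk backwards from i-2 down to i-window, most recent first.
        for j in range(i - 2, max(-1, i - window - 1), -1):
            if digests[j] == digest:
                reverts.append((i, j))
                break
    return reverts


history = [
    "Apple is a fruit.",
    "Apple is a fruit. It grows on trees.",
    "APPLES ARE DUMB",
    "Apple is a fruit. It grows on trees.",
]
print(find_exact_reverts(history))
```

As the message says, this only finds reverts that restore an earlier text exactly; a manual cleanup that differs by even one character produces a different checksum and is missed.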
Nathan said:
...but certainly its (sic) more informative than a Wikipedia Review
analysis of a relatively small group of articles in a specific topic area.
And you are certainly entitled to a flawed opinion based on incorrect
assumptions, such as ours being a Wikipedia Review analysis. But,
Hello,
I noticed that there are still a lot of open requests for closure on Meta
so I decided to contact a LangCom member (Robin) asking him about how
and when the projects will be closed or when the requests will be
closed, but I received an answer I didn't expect.
Robin told me there was no
2009/8/20 Erik Zachte erikzac...@infodisiac.com:
There is another way to detect 100% reverts. It won't catch manual reverts
that are not 100% accurate, but most vandal patrollers will use undo and the
like.
For every revision calculate md5 checksum of content. Then you can easily
look back
On Thu, Aug 20, 2009 at 11:23 AM, Erik Zachte erikzac...@infodisiac.com wrote:
There is another way to detect 100% reverts. It won't catch manual reverts
that are not 100% accurate, but most vandal patrollers will use undo and the
like.
For every revision calculate md5 checksum of content.
On Thu, Aug 20, 2009 at 1:30 PM, Gregory Kohs thekoh...@gmail.com wrote:
Nathan said:
...but certainly its (sic) more informative than a Wikipedia Review
analysis of a relatively small group of articles in a specific topic area.
And you are certainly entitled to a flawed opinion based on
Hoi,
There is no procedure because what comes closest to a consensus amounts to a
lot of work. Work that does not forward our mission one iota. The fact that
people vote and comment is not that special, people do ... if they vote that
I will wear a tutu at Wikimania and a consensus says that I
Lars Aronsson wrote:
Day 1: Create article "Apple is a fruit."
Day 2: Create article "Pear is a fruit."
Day 3: Extend article about apples. Add photos. Cite sources.
Day 3: Zero growth in the number of articles. Panic!!!
I concur wholeheartedly. Focusing on rising article counts gave us a thrill
Also, a share of, say, 30% bot edits on some Wikipedia does not mean 30% of
articles have been created by bots. My guess is that share is higher.
That was too rash. I simply don't know the actual amount, but there is no
linear relation for sure.
Let me rephrase that more safely:
If say
I couldn't agree more, Erik. Not paying attention to milestones is
the first and best step; Wikipedia:Signpost should start with it.
Ziko
2009/8/20 Erik Zachte erikzac...@infodisiac.com:
I concur wholeheartedly. Focusing on rising article counts gave us a thrill
for many years, and now it is
Apologies to Nathan regarding the Wikipedia Review description. The
analysis team was, indeed, recruited via Wikipedia Review; however, almost
all of the participants in the research have now departed or reduced their
participation in Wikipedia Review to such a degree, I don't personally
consider
Marcus Buck wrote:
What I want to say: please everybody get away from calling
projects "failure", "worse", "weak" or whatever. It's all
subjective. And it's entirely meaningless,
I disagree, it's neither subjective nor meaningless. Wikipedia
has a mission to disseminate free knowledge. It's an
On Thu, Aug 20, 2009 at 2:35 PM, Lars Aronsson l...@aronsson.se wrote:
Marcus Buck wrote:
What I want to say: please everybody get away from calling
projects "failure", "worse", "weak" or whatever. It's all
subjective. And it's entirely meaningless,
I disagree, it's neither subjective nor
Hoi,
Lars I completely agree that the failure of a Wikipedia IS meaningful. But
it is only meaningful if we are interested in learning what causes these
failures, what we can do to remedy these situations and when we are willing
to act upon our findings.
I mentioned earlier that the Danish
Hoi,
For some of our smaller projects, article counts are the only
milestones available. It is necessary to celebrate progress. It is
meaningful when the Swahili Wikipedia becomes the biggest African language
Wikipedia. It is meaningful when you compare it with most of the other
African
2009/8/20 Lars Aronsson l...@aronsson.se:
David Gerard wrote:
Yes, completely. Do other Wikipedias show the same S-curve of growth?
I don't think it's an S-curve. I think we are seeing linear
growth, with a few exceptions in the very early days (years).
But hey, that's growth in the number
Chad hett schreven:
I agree wholeheartedly. We need to get away from this idea that more
projects in more languages is better. It's not. It's led to the issue we
see now: dead projects lying around until somebody bothers to clean it
up or close it.
More projects in more languages _is_
Marcus Buck wrote:
I don't think that there are generally too few people interested in
those languages. It's just hard to make the start. It's immensely
frustrating to work on a wiki all alone, writing article after article,
and after a year, you maybe have 100 or 200 articles and your
On Thu, Aug 20, 2009 at 1:55 PM, Nathan nawr...@gmail.com wrote:
My point (which might still be incorrect, of course) was that an analysis
based on 30,000 randomly selected pages was more informative about the
English Wikipedia than 100 articles about serving United States Senators.
Any
Hi Gerard,
Indeed, people need news. But news can also be produced, with more
sense, from accomplishments: all mayors of our capital have an
article, as do the 50 most important folk singers, there are great
illustrated articles on the fauna and flora of our region...
Kind regards
Ziko
2009/8/20 Gerard Meijssen
On Thu, 20 Aug 2009 09:14:14 +0200
Gerard Meijssen gerard.meijs...@gmail.com wrote:
One of the reasons why Danish has been sluggish may be that the
localisation of Danish was not optimal; in February, 83.66% of the
MediaWiki messages and 14.11% of the extensions used by the WMF were
localised. This
2009/8/20 Gregory Maxwell gmaxw...@gmail.com:
Going back to your simple study now: The analysis of vandalism
duration and its impact on readers makes an assumption about
readership which we know to be invalid. You're assuming a uniform
distribution of readership: That readers are just as
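The objection about uniform readership can be made concrete with a small sketch. The page-view figures below are invented purely for illustration and are not taken from any traffic data; the function name and input shape are likewise assumptions.

```python
def vandalized_share(pages):
    """Compare two ways of measuring how much vandalism readers see.

    `pages` is a hypothetical list of (page_views, is_vandalized) pairs.
    Returns (share_of_pages, share_of_views): the first treats every
    page as equally likely to be read, the second weights each page by
    its traffic.
    """
    total_pages = len(pages)
    total_views = sum(views for views, _ in pages)
    share_of_pages = sum(1 for _, bad in pages if bad) / total_pages
    share_of_views = sum(views for views, bad in pages if bad) / total_views
    return share_of_pages, share_of_views


# Made-up numbers: one busy page and three quiet ones, one of them vandalized.
sample = [(9000, False), (500, True), (300, False), (200, False)]
by_page, by_view = vandalized_share(sample)
print(f"by page: {by_page:.2%}, by view: {by_view:.2%}")
```

With these invented figures, a quarter of the pages are vandalized but only a twentieth of the page views land on a vandalized page; the gap could just as easily run the other way if the vandalized page were the busy one.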
Gerard Meijssen wrote:
Hoi,
Lars I completely agree that the failure of a Wikipedia IS meaningful. But
it is only meaningful if we are interested in learning what causes these
failures, what we can do to remedy these situations and when we are willing
to act upon our findings.
On Thu, Aug 20, 2009 at 2:10 PM, Anthony wikim...@inbox.org wrote:
On Thu, Aug 20, 2009 at 1:55 PM, Nathan nawr...@gmail.com wrote:
My point (which might still be incorrect, of course) was that an analysis
based on 30,000 randomly selected pages was more informative about the
English Wikipedia
2009/8/20 Jimmy Wales jwa...@wikia-inc.com:
Robert Rohde wrote:
When one downloads a dump file, what percentage of the pages are
actually in a vandalized state?
This is equivalent to asking, if one chooses a random page from
Wikipedia right now, what is the probability of receiving a
Robert Rohde wrote:
Does anyone have a nice comprehensive set of page traffic aggregated
at say a month level? The raw data used by stats.grok.se, etc. is
binned hourly which opens one up to issues of short-term fluctuations,
but I'm not at all interested in downloading 35 GB of hourly
On Thu, Aug 20, 2009 at 6:36 PM, Robert Rohde raro...@gmail.com wrote:
On Thu, Aug 20, 2009 at 2:10 PM, Anthony wikim...@inbox.org wrote:
"if one chooses a random page from Wikipedia right now, what is the
probability of receiving a vandalized revision" The best way to answer
that question
2009/8/20 Anthony wikim...@inbox.org:
I wouldn't suggest looking at the edit history at all, just the most recent
revision as of whatever moment in time is chosen. If vandalism is found,
then and only then would one look through the edit history to find out when
it was added.
That only works
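For the sampling approach discussed in this exchange (inspect the current revision of randomly chosen pages and count how many are vandalized), the uncertainty of the resulting percentage can be estimated with a standard Wilson score interval. The sketch below is an illustration only; the counts in the usage line are hypothetical and do not come from Robert's analysis or anyone else's.

```python
import math


def wilson_interval(vandalized, sampled, z=1.96):
    """Wilson score interval for a sampled proportion.

    `vandalized` is the number of sampled pages judged vandalized,
    `sampled` the sample size, and z=1.96 corresponds to roughly 95%
    confidence. Returns (lower, upper) bounds on the true proportion.
    """
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    p = vandalized / sampled
    denom = 1 + z * z / sampled
    centre = (p + z * z / (2 * sampled)) / denom
    half = z * math.sqrt(p * (1 - p) / sampled + z * z / (4 * sampled * sampled)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts: 12 vandalized pages found in a sample of 3000.
low, high = wilson_interval(12, 3000)
print(f"estimated rate {12 / 3000:.2%}, interval {low:.2%} to {high:.2%}")
```

The interval narrows as the sample grows, which is one way to judge whether a sample of a given size says anything useful about a rate as small as a fraction of a percent.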
On Thu, Aug 20, 2009 at 6:57 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/20 Anthony wikim...@inbox.org:
I wouldn't suggest looking at the edit history at all, just the most
recent
revision as of whatever moment in time is chosen. If vandalism is found,
then and only then would
On Thu, Aug 20, 2009 at 3:57 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/20 Anthony wikim...@inbox.org:
I wouldn't suggest looking at the edit history at all, just the most recent
revision as of whatever moment in time is chosen. If vandalism is found,
then and only then would one
Marcus Buck wrote:
Languages of societies
with much leisure time easily gained enough momentum by themselves. But
other language versions from societies with educational and social
hardships don't gain momentum by themselves. They don't reach the
critical mass to sustain active wiki
2009/8/21 Anthony wikim...@inbox.org:
On Thu, Aug 20, 2009 at 6:57 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/20 Anthony wikim...@inbox.org:
I wouldn't suggest looking at the edit history at all, just the most
recent
revision as of whatever moment in time is chosen. If
On Thu, Aug 20, 2009 at 7:13 PM, Robert Rohde raro...@gmail.com wrote:
On Thu, Aug 20, 2009 at 3:57 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/20 Anthony wikim...@inbox.org:
I wouldn't suggest looking at the edit history at all, just the most
recent
revision as of whatever
On Thu, Aug 20, 2009 at 7:20 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/21 Anthony wikim...@inbox.org:
On Thu, Aug 20, 2009 at 6:57 PM, Thomas Dalton thomas.dal...@gmail.com
wrote:
2009/8/20 Anthony wikim...@inbox.org:
I wouldn't suggest looking at the edit history at all,
2009/8/21 Anthony wikim...@inbox.org:
My God. If a few dozen people couldn't easily determine to a relatively
high degree of certainty what portion of a mere 0.03% of Wikipedia's
articles are *vandalized*, how useless is Wikipedia?
I never said they couldn't. I said they couldn't do it by
2009/8/21 Anthony wikim...@inbox.org:
"Is this article vandalized?" is a yes/no question...
True, but that isn't actually the question that this research tried to
answer. It tried to answer "How much time has this article spent in a
vandalised state?". If we are only interested in whether the most
On Thu, Aug 20, 2009 at 4:37 PM, Anthony wikim...@inbox.org wrote:
On Thu, Aug 20, 2009 at 7:13 PM, Robert Rohde raro...@gmail.com wrote:
On Thu, Aug 20, 2009 at 3:57 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/20 Anthony wikim...@inbox.org:
I wouldn't suggest looking at the edit
On Thu, Aug 20, 2009 at 7:54 PM, Thomas Dalton thomas.dal...@gmail.com wrote:
2009/8/21 Anthony wikim...@inbox.org:
"Is this article vandalized?" is a yes/no question...
True, but that isn't actually the question that this research tried to
answer. It tried to answer How much time has this
On Thu, Aug 20, 2009 at 7:58 PM, Robert Rohde raro...@gmail.com wrote:
You seem to be identifying all errors with vandalism.
How so?
Sometimes factual errors are simply unintentional mistakes.
Obviously we can't know the intent of the person for sure, but after a
mistake is found it's
Riddle me this...
Is the edit below vandalism?
http://en.wikipedia.org/w/index.php?title=Arch_Coal&diff=255482597&oldid=255480884
Did the edit take a page and make it worse? Or, did it make the page a
better available revision than the version immediately prior to it?
Methinks the Wikipedia
On Thu, Aug 20, 2009 at 14:10, Anthony wikim...@inbox.org wrote:
On Thu, Aug 20, 2009 at 1:55 PM, Nathan nawr...@gmail.com wrote:
My point (which might still be incorrect, of course) was that an analysis
based on 30,000 randomly selected pages was more informative about the
English Wikipedia
Gregory Kohs wrote:
Riddle me this...
Is the edit below vandalism?
http://en.wikipedia.org/w/index.php?title=Arch_Coal&diff=255482597&oldid=255480884
Did the edit take a page and make it worse? Or, did it make the
page a better available revision than the version immediately
prior to it?
On Thu, Aug 20, 2009 at 9:30 PM, Mark Wagner carni...@gmail.com wrote:
On Thu, Aug 20, 2009 at 14:10, Anthony wikim...@inbox.org wrote:
"if one chooses a random page from Wikipedia right now, what is the
probability of receiving a vandalized revision" The best way to answer
that question
Yann Forget wrote:
As I already said, the first steps would be to import existing
databases, and Wikimedians are very good at this job.
Do you have a bibliographic database (library catalog) of French
literature that you can upload? How many records? Convincing
libraries to donate copies
Phil Nash wrote:
Many editors undo and revert on the basis of felicity of language and
emphasis, and unless it becomes an issue, that is an epiphenomenon of the
encyclopedia that anyone can edit, so I can't see how this is a good
example of anything in particular.
And, with point proven, I rest my
---------- Forwarded message ----------
From: Reid Priedhorsky r...@umn.edu
Date: Thu, Aug 20, 2009 at 9:58 AM
Subject: Re: [Wiki-research-l] [Foundation-l] How much of Wikipedia
is vandalized? 0.4% of Articles
To: wiki-researc...@lists.wikimedia.org
On 08/20/2009 11:34 AM, Gregory Maxwell
And here is where many of the flaws of the University of Minnesota study
were exposed:
http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_31#The_Unbreakable_Wikipedia.3F
Their methodology of tracking the persistence of words was questionable, to
say the least.
And here was my favorite
On Thu, Aug 20, 2009 at 11:02 PM, Gregory Kohs thekoh...@gmail.com wrote:
And here was my favorite part:
*We exclude anonymous editors from some analyses, because IPs are not
stable: multiple edits by the same human might be recorded under different
IPs, and multiple humans can share an IP.*
Hoi,
Given that on February 1st 96.07% of the most used messages were
localised, it is clear that some of the most used messages were not even
localised. Consequently your pooh-pooh reaction that only the rare messages
are affected is not correct.
Thanks,
GerardM
2009/8/20 Kaare Olsen
Gerard Meijssen wrote:
Hoi,
Given that on February 1st 96.07% of the most used messages were
localised, it is clear that some of the most used messages were not even
localised. Consequently your pooh-pooh reaction that only the rare messages
are affected is not correct.
Not all of the
On Thu, Aug 20, 2009 at 9:22 PM, Lars Aronsson l...@aronsson.se wrote:
Kaare Olsen wrote:
What I think is the primary reason for the Danish Wikipedia
being much smaller than the neighbouring languages is that
Danes generally are internationally minded and pride themselves
on being good at
2009/8/21 Jussi-Ville Heiskanen cimonav...@gmail.com:
Gerard Meijssen wrote:
Hoi,
Given that on February 1st 96.07% of the most used messages were
localised, it is clear that some of the most used messages were not even
localised. Consequently your pooh-pooh reaction that only the rare
Hoi,
We are not talking about bootstrap usage. The Danish Wikipedia is obviously
way past that point. We are talking about usability and the acceptance of
MediaWiki as a proper platform for a language. Basically usage is not the
same as being accepted as an environment that provides proper
Just to clarify, are you saying that in your view, too
few messages are translated to Danish, or are you
saying that too many messages are translated to the
Danish language?
Yours,
Jussi-Ville Heiskanen
___
foundation-l mailing list
On Fri, Aug 21, 2009 at 1:36 AM, Svip svi...@gmail.com wrote:
But that's without mentioning the horrible state of the localisation
in general: Wrong context translations, just wrong translations and
many spelling errors.
Contextual errors I can understand, figuring out all the right
contexts
Hoi,
At translatewiki.net many of the messages include information about the
context. The coverage of this information has been improving steadily. This
information is not available when messages are localised on the local wiki.
So there are two places where localisations can originate; local and
73 matches