Re: [Wikimedia-l] The reader, who doesn't exist

2014-08-25 Thread Delirium

On 8/25/14, 3:06 AM, MZMcBride wrote:

As a metric, pageviews are probably not very meaningful. One way we can
observe whether we're fulfilling our mission is to see how ubiquitous
our content has become. An even better metric might be the quality of the
articles we have. Anecdotal evidence suggests that higher article quality
is not really tied to the readership rate, though perhaps article size is.

Yes, I'd ideally like some better measure of how much people get out of 
articles. Some types of analytics do track page view duration, although 
that can be considered intrusive.

I've done a little spot-checking within specific areas (e.g. 
archaeological sites) of our view counts, and they are largely dominated 
by spikes around transient news events: something is in the news and 
5,000 or 50,000 people load an article that normally gets 50 or 100 hits 
a day. Providing that kind of quick background knowledge to people 
googling for an item they saw on the news is a valuable service, to be 
sure. But I'm not sure it's *as* big a proportion of the value Wikipedia 
provides as the raw pageload numbers would say.
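
To make the point concrete, here is a minimal Python sketch (not a real analytics tool) of how one might measure the share of an article's views that arrive on spike days. The function name, the 10x-median threshold, and the sample month are assumptions for illustration; only the rough magnitudes (50 to 100 hits on a normal day, up to 50,000 on a news day) come from the spot-check above.

```python
from statistics import median

def spike_share(daily_views, spike_factor=10):
    """Return the fraction of total views that fall on spike days.

    A day counts as a spike when its views exceed spike_factor times
    the median day. The factor of 10 is an arbitrary, assumed threshold.
    """
    if not daily_views:
        return 0.0
    threshold = median(daily_views) * spike_factor
    spike_total = sum(v for v in daily_views if v > threshold)
    total = sum(daily_views)
    return spike_total / total if total else 0.0

# Hypothetical month: 29 ordinary days at 75 views, one news day at 50,000.
views = [75] * 29 + [50_000]
share = spike_share(views)
print(f"{share:.1%} of the month's views came from spike days")
```

In this made-up month a single news day accounts for roughly 96% of the pageloads, which is why raw totals say little about everyday use.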


Wikimedia-l mailing list, guidelines at:

Re: [Wikimedia-l] Wikipedia mobile apps

2014-06-17 Thread Delirium

On 6/16/14, 4:27 PM, Brion Vibber wrote:

As Sage notes, the functionality of the new apps is about the same on both
Android and iOS, with some differences in the UI.

Is there something written on the intended relationship between the apps 
and the mobile website? I've long been mildly confused about how the 
goals for each relate, and what I should use. On my Android phone 
right now, I've got both the old app, and a bookmark to Wikipedia that 
opens in Firefox, and I seem to alternate between which I prefer, 
because neither is strictly a superset of the other's functionality.



Re: [Wikimedia-l] The tragedy of Commons

2014-06-17 Thread Delirium

On 6/17/14, 5:52 PM, George William Herbert wrote:

On Jun 17, 2014, at 8:37 AM, Emmanuel Engelhart wrote:

On 17.06.2014 17:26, George William Herbert wrote:

We need an Uncommons, where the strict open license / PD rules are abandoned 
and we accept images as long as their fair use can be established.  And don't 
delete unless that fair use is credibly questioned.

Conflating and commingling our educational role with open content advocacy was 
always risky and is proving impossible.  Without devaluing open content, we 
need to separately support fair use for educational purposes, and stop letting 
cross-project advocacy games screw with our educational mission.

Third parties may or may not be able to redistribute, but we simply put it up
with an explicit "reuse at your own risk".

reuse at your own risk = risky = no reuse for most actors
Well done!

Not my problem.

Educational role.

The whole mission of the movement, including its educational mission, is 
to *produce freely reusable content*, not just to run a website. Wikipedia 
in particular is an open-content encyclopedia, which can be adapted to 
many educational and other uses, by Wikimedians and third parties. If 
it's not an open-content encyclopedia, for example if Wikipedia articles 
make use of provincial American copyright loopholes that render them 
illegal to redistribute here in Denmark, imo it has failed in its 
educational mission. In my view, the fact that I (an educator not in the 
United States) should be able to legally reproduce and distribute 
Wikipedia articles, is part of the whole point of an open-content 
educational project.



Re: [Wikimedia-l] Open letter from Wikimedia Argentina regarding URAA

2014-02-28 Thread Delirium

On 2/28/14, 9:18 AM, David Gerard wrote:

On 28 February 2014 01:23, geni wrote:

On 27 February 2014 22:03, Galileo Vidoni wrote:

And we remain convinced that there is space for a way more prudent
implementation of URAA that prevents deleting educational resources until
there is complete copyright information and no legal alternative, which to
our understanding (and to our interpretation of WMF's communications) can
mean waiting for DMCA takedown notices.

We could do that but it pretty much removes Commons' only advantage over say
imgur or flickr. We want the images on Commons to be free. Not simply stuff
no one has got around to complaining about yet.

This supports what I noted: Commons increasingly just can't be relied
upon as a repository for the other Wikimedia projects.

This implies no bad faith or bad actions on the part of the Commons
community. (But that that's a distinct thing from the Wikimedia
community is a lot of the problem.) Nor that what Commons *is* is
inherently problematic; but what it is is less and less useful inside

But the other Wikimedia projects are *also* supposed to share that goal: 
of producing a Free-as-in-freedom encyclopedia whose contents can be 
safely reused and adapted by a wide range of other people and 
organizations, who should be able to assume that it is legal to do so 
without exhaustive case-by-case investigation. The movement's main job 
is not merely hosting the websites *, putting up whatever 
we find useful to put up, and taking down things when we get complaints 
or lawsuits.

What level of scrutiny we want to apply is indeed a judgment call, and 
I don't know if the current URAA policy falls on the right or wrong 
side of that (I haven't investigated it). But I don't think the 
fundamental goals are different. And if they are, it's the other 
projects that are in the wrong: *not* having a free, reusable body of 
content as the project goal is fundamentally incompatible with the 
Wikimedia Movement. We want the content on all Wikimedia wikis to be 
free-as-in-freedom and reusable by anyone. That's the point.



Re: [Wikimedia-l] The Wikipedia Gap

2013-12-10 Thread Delirium
In terms of specific articles to create, there is also

That project collects articles that exist in a wide range of other 
encyclopedias, but don't yet exist on Wikipedia. However that's not 
covering quite the same concerns as the systemic-bias discussion, since 
many of those encyclopedias themselves have similar biases. Nonetheless 
this kind of comparison can be useful to find specific gaps in coverage 
that, equally importantly, are actionable in the sense that at least 
one source to base an article on exists.


On 12/9/13, 9:07 PM, Peter Coombe wrote:

The English Wikipedia has attempted a (non-exhaustive) list at


On 9 December 2013 07:35, Romaine Wiki wrote:

Various research and media articles report that Wikipedia is missing a
lot of articles in several subject groups, and that those groups are
relatively underrepresented.

How can we as Wikipedia get clear which subject groups are missing?

How can we get lists of less represented subject groups and the articles
in those groups?

Let us get practical: how can we fill the gap?



Re: [Wikimedia-l] Which Wikipedias have had large scale bot creation of articles this year?

2013-11-28 Thread Delirium

On 11/27/13 2:01 PM, Fæ wrote:

As well as finding out where this has happened, it would be good to
have some cases of where bots went bad explained. My main concern
would be leaving a bot to create thousands of articles but in the
process creating a headache for limited numbers of maintainers, such
as article copy-editors, categorizers, illustrators, inter-linkers or
gnomic contributors.

One example I recently ran across, while using the georeference data 
from Wikipedia World, is that bot-imports of villages on the Hindi 
Wikipedia appear to be creating *thousands* of articles with identical 
coordinates. There are about 1300 articles georeferenced to the 
coordinates (25.611, 85.144), for example. I'm not sure if this is an 
error (default value left in a template?), or has some other 
explanation. I could imagine it also being a deliberate imprecision, for 
example using the coordinate for the center of a district for villages 
where the precise coordinate of the village itself isn't known. In any 
case, it produces a bit of a mess; these could all be fixed up pretty 
easily by volunteers checking on OpenStreetMap and the like, but nobody 
has done so, because there are so many of these stubs.

This particular example:,+85.144
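
A check like the one described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the input format (title, latitude, longitude tuples), the function name, and the sample village titles are made up, and the Wikipedia World data would have to be loaded into that shape first.

```python
from collections import defaultdict

def coordinate_clusters(articles, min_size=2, precision=3):
    """Group article titles by rounded coordinates.

    articles is an iterable of (title, lat, lon) tuples (assumed format).
    Returns only the coordinate pairs shared by at least min_size articles.
    """
    groups = defaultdict(list)
    for title, lat, lon in articles:
        key = (round(lat, precision), round(lon, precision))
        groups[key].append(title)
    return {key: titles for key, titles in groups.items()
            if len(titles) >= min_size}

# Hypothetical sample: two stubs share the coordinates mentioned above.
sample = [
    ("Village A", 25.611, 85.144),
    ("Village B", 25.611, 85.144),
    ("Village C", 26.120, 85.390),
]
clusters = coordinate_clusters(sample)
for coords, titles in clusters.items():
    print(coords, len(titles), "articles")
```

Sorting the result by cluster size would put the largest piles of identically georeferenced stubs at the top of a worklist for volunteers.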



Re: [Wikimedia-l] letter from the FDC to the WMF

2013-10-23 Thread Delirium

On 10/23/13 2:08 AM, Federico Leva (Nemo) wrote:

Theo10011, 23/10/2013 00:21:

I'm quite surprised to constantly read FDC is somehow representative of the
larger community and accountable to them. Almost all the current members
were part of chapter leadership and have been quite active within that
circle. I suppose this is the same fiction as chapters inherently being
representatives of the larger community. The FDC is sort of a UN-like
gathering that yet somehow overlooks the largest and most active of all.

Perhaps you might want to take a look at the dismal rate of actual
community participation in FDC discussions. A year or so into its
formation, there isn't exactly a stellar record and high opinions to go
around. I hope I don't need to point to the recent news articles and
comments about the FDC and possible issues of corruption, which might
even have played a part in...whatever this is.

I'm not sure how this matters for this proposal/request by the FDC: do 
such defects exist or apply only to evaluating the WMF budget? If not, 
how do they bring water to the idea of letting WMF be special compared 
to the other entities' funding?

From my perspective as someone not really involved in either the WMF or 
chapters (or other committees), but just an editor and a community 
member, I tend to see the WMF as special in this sense because it 
already has a Board of Trustees that in a fairly reasonable way 
represents the community/movement, who I trust to make decisions on 
funding priorities. Therefore it's not clear to me why *another* 
advisory board should be a second layer of bureaucracy evaluating its 
budget proposals. They are already evaluated by the Trustees primarily, 
and by the community as a whole secondarily, which seems like enough 
oversight. If the community disagrees with the WMF's direction or 
priorities, they can vote for different trustees in the next election, 
or otherwise suggest changes in its structure or membership. But in 
general I trust their judgment on how to allocate the Foundation's money 
in accordance with the mission.



Re: [Wikimedia-l] Fwd: [WikiEN-l] access to journals

2013-09-25 Thread Delirium

On 9/24/13 10:13 PM, Andy Mabbett wrote:

On 24 September 2013 14:06, Liam Wyatt wrote:

I'm now working for the National Library of Australia and we offer free, at
home, access to JSTOR and MANY other restricted access databases to any
Australian, if they get a free library card.
Is this unique to Australia?

My free library subscription in Birmingham, England, gets me access -
from home or indeed anywhere else - to a number of otherwise-paywalled
online databases and services

In Denmark, and I believe most of the USA, the norm is only on-site 
access to subscriptions, for the general public. University-affiliated 
researchers do have the option to log in remotely, or VPN in to get an 
institutional IP address offsite. But the general public has to use 
library computers to access the subscriptions, or (in some cases) their 
own computers on the library WiFi.



Re: [Wikimedia-l] Wikimedia and the politics of encryption

2013-09-03 Thread Delirium

On 9/3/13 4:28 PM, Marc A. Pelletier wrote:

On 09/03/2013 09:45 AM, Fred Bauder wrote:

Abusive nonsense does not make that fact go away. Someone,
actually, many someones, need to be trusted.

Доверяй, но проверяй.

I agree with your assessment of the risks of working with the PRC, I
simply think that if you think that those risks do not exist in our
Western countries, you are ignoring history.

I certainly agree with learning from history, but when it comes to 
censoring encyclopedias or similar reference works, are there good 
examples that might more concretely narrow down the specific type of 
thing we ought to be learning from history?

The best example of which I'm aware is the 1979 attempt by the U.S. 
Department of Energy to stop the publication of a reconstruction of the 
Teller-Ulam hydrogen bomb design. But that attempt ended up being 
unsuccessful, and encyclopedias (including Wikipedia) include that 
information. Are there more successful attempts?



Re: [Wikimedia-l] Wikidata Stubs: Threat or Menace?

2013-04-26 Thread Delirium
This is a very interesting proposal. I think how well it will work may 
vary considerably based on the language.

The strongest case in favor of machine-generating stubs, imo, is in 
languages where there are many monolingual speakers and the Wikipedia is 
already quite large and active. In that case, machine-generated stubs 
can help promote expansion into not-yet-covered areas, plus provide 
monolingual speakers with information they would otherwise either not 
get, or have to get in worse form via a machine-translated article.

At the other end of the spectrum you have quite small Wikipedias, and 
Wikipedias which are both small and read/written mostly/entirely by 
bilingual readers. In these Wikipedias, article-writing tends to focus 
on things more specifically relevant to a certain culture and history. 
Suddenly creating tens or hundreds of thousands of stubs in such 
languages might serve to dilute a small Wikipedia more than strengthen 
it: if you take a Wikipedia with 10,000 articles, and it gains 500,000 
machine-generated stubs, *almost every* article that comes up in search 
engines will be machine-generated, making it much less obvious what 
parts of the encyclopedia are actually active and human-written amidst 
the sea of auto-generated content.
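
The arithmetic behind that example, as a quick Python check. The two counts are the ones given above; the variable names are just for illustration.

```python
human_written = 10_000    # existing articles in the small Wikipedia
machine_stubs = 500_000   # stubs added by machine generation

# Share of all articles that would be machine-generated after the import.
stub_share = machine_stubs / (human_written + machine_stubs)
print(f"{stub_share:.1%} of articles would be machine-generated")
```

That comes to about 98%, so only around one article in fifty would still be human-written.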

Plus, from a reader's perspective, it may not even improve the 
availability of information. For example, I doubt there are many 
speakers of Bavarian who would prefer to read a machine-generated 
Bavarian article over a human-written German one. That may even be 
true for some less-related languages: most Danes I know would prefer a 
human-written English article over a machine-generated Danish one.


On 4/25/13 8:16 PM, Erik Moeller wrote:

Millions of Wikidata stubs invade small Wikipedias .. Volapük
Wikipedia now best curated source on asteroids .. new editors flood
small wikis .. Google spokesperson: This is out of control. We will
shut it down.

Denny suggested:

II ) develop a feature that blends into Wikipedia's search if an article
about a topic does not exist yet, but we  have data on Wikidata about that

Andrew Gray responded:

I think this would be amazing. A software hook that says we know X
article does not exist yet, but it is matched to Y topic on Wikidata
and pulls out core information, along with a set of localised
descriptions... we gain all the benefit of having stub articles
(scope, coverage) without the problems of a small community having to
curate a million pages. It's not the same as hand-written content, but
it's immeasurably better than no content, or even an attempt at
machine-translating free text.

XXX is [a species of: fish] [in the: Y family]. It [is found in: Laos,
Vietnam]. It [grows to: 20 cm]. (pictures)
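
The sentence pattern above can be sketched as a small rendering function. This is a rough Python illustration under assumptions, not how MediaWiki or Wikidata actually work: the fact keys (`kind`, `family`, `found_in`, `max_length_cm`), the sentence patterns, and the function name are all hypothetical, and real templates would need per-language grammar rules.

```python
# Each entry pairs a sentence pattern with the facts it needs.
# A sentence is emitted only when all of its facts are available.
SENTENCES = [
    ("{name} is a species of {kind} in the {family} family.", ("kind", "family")),
    ("It is found in {found_in}.", ("found_in",)),
    ("It grows to {max_length_cm} cm.", ("max_length_cm",)),
]

def render_stub(name, facts):
    """Render a pseudo-stub from a dict of structured facts (assumed shape)."""
    values = {
        key: ", ".join(value) if isinstance(value, list) else value
        for key, value in facts.items()
    }
    values["name"] = name
    sentences = [
        pattern.format(**values)
        for pattern, required in SENTENCES
        if all(key in facts for key in required)
    ]
    return " ".join(sentences)

stub = render_stub("XXX", {
    "kind": "fish",
    "family": "Y",
    "found_in": ["Laos", "Vietnam"],
    "max_length_cm": 20,
})
print(stub)
```

Skipping a sentence when its facts are missing keeps the output readable for sparsely described items, at the cost of very short stubs.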

This seems very doable. Is it desirable?

For many languages, it would allow hundreds of thousands of
pseudo-stubs (not real articles stored in the DB, but generated from
Wikidata) to be served to readers and crawlers that would otherwise
not exist in that language.

Looking back 10 years, User:Ram-Man was one of the first to generate
thousands of en.wp articles from, in this case, US census data. It was
controversial at the time and it stuck. Other Wikipedias have since
then either allowed or prohibited bot-creation of articles on a
project-by-project basis. It tends to lead to frustration when folks
compare article counts and see artificial inflation by bot-created

Does anyone know if the impact of bot-creation on (new) editor
behavior has been studied? I do know that many of the Rambot articles
were expanded over time, and I suspect many wouldn't have been if they
hadn't turned up in search engines in the first place. On the flip
side, a large surface area of content being indexed by search
engines will likely also attract a fair bit of drive-by vandalism that
may not be detected because those pages aren't watched.

A model like the proposed one might offer a solution to a lot of these
challenges. How I imagine it could work:

* Templates could be defined for different Wikidata entities. We could
make it possible to let users add links from items in Wikidata to
Wikipedia articles that don't exist yet. (Currently this is
prohibited.) If such a link is added, _and_ a relevant template is
defined for the Wikidata entity type (perhaps through an entity
type-template mapping), WP will render an article using that
template, pulling structured info from Wikidata.

* A lot of the grammatical rules would be defined in the template
using checks against the Wikidata result. Depending on the complexity
of grammatical variations beyond basics such as singular/plural this
might require Lua scripting.

* The article is served as a normal HTTP 200 result, cached, and
indexed by search engines. In WP itself, links to the article might
have some special affordance that suggests that they're neither
ordinary red links nor existing articles.

* When a user tries to edit the article, wikitext (or 

Re: [Wikimedia-l] Transparency about Wikimania costs

2012-10-12 Thread Delirium

On 10/12/12 12:40 AM, Itzik Edri wrote:

Just want to inform that WMIL published Wikimania 2011 budget breakdown:

Thanks for the information; it's quite useful to see these kinds of things.

Two minor questions about the numbers. I don't see an item for 
conference venue rental/fees, which is often a major cost. Was Wikimania 
given free use of the venue? Or is that part of another category, such 
as Logistics? And, I see support by chapters listed as 0, but lists 5 chapters as sponsors.



Re: [Wikimedia-l] Wikipedia redefined -- typography and UX and such

2012-08-17 Thread Delirium

On 8/17/12 12:02 PM, Magnus Manske wrote:

This is quite nice, especially on a larger screen! Our current layout, 
which uses the full browser width for text, makes articles hard to read 
and cluttered-looking on larger screens. The text column with images and 
ToC in the sidebar is a nice change. Though on the other hand, I do like 
flowing text around images below some width threshold. When reading on a 
smaller screen, with this layout you can end up with a very narrow text 
column down the middle. But overall I like it. The only thing I'd really 
want is some way to get to more of the functionality. For example, I 
can't find how to view edit history.



Re: [Wikimedia-l] This afternoon's system outage

2012-08-06 Thread Delirium

On 8/6/12 4:52 PM, WereSpielChequers wrote:

Hi, after crashing an hour or so ago EN Wikipedia has started to come back
but with a really strange appearance - less usable than Vector.

It's back to normal for me now. Afaict, the servers hosting the static 
CSS/JS came back up later than the servers hosting the wiki content, so 
for a period you would be seeing the raw HTML without any CSS styling.

Vaguely impressed that it was readable at all, actually. Good HTML/CSS 
practice is supposed to result in the HTML being readable even without 
CSS applied, but it's common at major sites for that to not really be 
the case.



Re: [Wikimedia-l] Is there an agreement between GoldenMap and the Wikipedia for the use of Wikipedia content?

2012-08-03 Thread Delirium
It looks like a direct scrape, even to the extent of having some 
internal links being broken because they didn't update them (e.g. the 
link to Wikimedia Commons at the end of the article). I believe it's 
just one of the (many) unauthorized mirrors that don't properly credit 
the source of their content.

The English Wikipedia keeps track of such sites here:


On 8/3/12 1:44 PM, Rui Correia wrote:

Dear All

I came across a site called Golden Map, which has an encyclopaedic
collection of articles that are the same as in the Wikipedia, but I don't
see anywhere any information explaining what the association/permission
is. Is there an agreement in place for this?

Look - for example - at this page on the Wildebeest

Best regards,



Re: [Wikimedia-l] COI+ certification proposal

2012-08-01 Thread Delirium

On 8/1/12 1:51 PM, Federico Leva (Nemo) wrote:

Yann Forget, 01/08/2012 13:13:

I have suggested some basic rules about this on the French WP, but not
only were they flatly rejected, I was also barred from mentioning the
whole subject. The first step against CoI is making the editors
conscious that, because of their profession, background, culture,
etc., they may have a bias on a subject.

We regularly discuss this in the Italian community with regard to the 
so-called subject matter experts, regularly coming to the conclusion that 
we surely make it clear that their opinions/original research are not 
welcome (they regularly get deleted); the question is whether they can 
find a way to contribute or they're unrecoverable.

I think it can work well, if the Wikipedia manages to develop a core of 
expert editors in an area who also understand Wikipedia norms, and can 
spread that culture. On the English Wikipedia this works reasonably well 
in some of the scientific and mathematical areas, where most of the 
involved academics understand that they need to cite third-party 
reliable sources for statements (increasingly true in the history 
editing as well, I believe). On the other hand, there is still some CoI 
that goes on now and then, with people writing articles to promote their 
own findings (or even an article on their lab, university, or themselves).



Re: [Wikimedia-l] Apparently, Wikipedia is ugly

2012-07-14 Thread Delirium

On 7/14/12 7:05 PM, Audrey Abeyta wrote:

Appearance does affect perceptions of credibility, which should be of
interest to Wikipedia. Recently, I was talking to someone who doubted
Wikipedia's validity. When I asked her if it was because the content can be
edited by anyone, she replied, No, it's the way the site looks.

I've run into this also, but I suspect part of it is self-referential: 
Wikipedia looks like a default install of MediaWiki, and therefore looks 
like many half-assed/uncustomized MediaWiki installs out there. But 
that's because we are (close to) a default install of MediaWiki! Or 
rather, the reverse: the default MediaWiki skin was borrowed from the 
one designed for Wikimedia sites.

I wonder if we'd gain a modest boost in perceptions of our design if we 
just made sure the skin used on Wikimedia sites, and the default skin 
shipped with MediaWiki, were fairly dissimilar in style.



Re: [Wikimedia-l] crazy deletionists!

2012-07-04 Thread Delirium

On 7/4/12 1:04 AM, Andreas Kolbe wrote:

What would a Wikipedia look like that did not make use of press sources? It
would look a hell of a lot more like an encyclopedia. Thousands of silly
arguments would never arise. Thousands of apposite criticisms of Wikipedia
would never arise. These are good things.

Unfortunately, such a Wikipedia would also have vastly impoverished
coverage of popular culture and current affairs. The articles on Lady Gaga
and Barack Obama would be years behind events; the articles on the Japan
earthquakes, which I believe Wikipedia was widely praised for, would only
now begin to be written, articles on many towns and villages would lack
colour and detail.

It's an intriguing idea, and I agree with the general principle of 
reducing reliance on sources with less gestation time, of which 
newspapers are the biggest offender. I do tend to apply it in an 
as-alternatives-are-available fashion, and to many kinds of sources. For 
example, citing a recent academic conference paper may be justified if 
no synthesizing source is available, but there are dangers to cobbling 
together a new synthesis out of a dozen conference papers that may or 
may not be representative of majority views in a field, that may now be 
obsolete in ways unbeknownst to the reader, etc. Better to cite a proper 
book or survey article, if one is available.

A problem with avoiding newspapers entirely, added to those you mention, 
is that we'd even lose many things that aren't that recent. Especially 
in their more summary pieces such as obituaries and biopics, 
newspapers (and newsmagazines) fill in a lot of fairly uncontroversial 
information on more minor, but potentially still important, people and 
events. For the ancient world, that information is compiled fairly 
exhaustively in academic sources; you can find at least a three-sentence 
biography of every attested figure in some kind of specialist 
encyclopedia, e.g. the impressively comprehensive _Prosopography of the 
Later Roman Empire_. But for 20th-century figures that's often not the 
case. For example, I've written a number of articles on minor political 
figures (a mayor of Houston, say) primarily sourced from obituaries in 
major newspapers, e.g. the NYT's obituary section. For what they are, 
they are usually reliable enough: they provide some dates, a summary of 
offices held, and a brief mention of why the person is known. For famous 
figures, there are usually better sources, but for minor figures the 
alternatives are often more like primary sources, e.g. the state or 
municipal archives, or not including an article at all.



Re: [Wikimedia-l] Language links and double language links on the Wikipedias

2012-06-25 Thread Delirium
Thanks for this list. For the languages I know, I've started going 
through and fixing ones that are clearly wrong. If a number of people do 
that, that should improve the general quality/consistency of interwiki 
links. I second the other comment that it'd be nice if the parsing could 
be re-run to exclude commented-out links, but the list is still useful 
as is.

There are some difficult cases, though, when languages make different 
choices on how to group subjects, so the articles aren't actually in 
1-to-1 correspondence. For example, the English article [[en: Móði and 
Magni]] unsurprisingly has two outgoing interwiki links, when linking to 
languages that split them, such as [[da:Magni]] and [[da:Modi]]. It's 
not clear what to do about these cases.


On 6/25/12 12:29 PM, Denny Vrandečić wrote:

Hi all,

I ran some analysis last week, to get some numbers out of the
Wikipedia language links. One type of report that was generated was
the list of all articles in the main namespaces of the Wikipedias that
link to more than one article in another language edition of Wikipedia
(so called double language links). There are not that many of them
(about 19,000 in total), split by language, all available here:

Double language links are not errors per se, but they come with a few nuisances:
* they lead to two links in the language links list that just look the
same (you have to hover over them to see that they link to different
languages), which is not really optimal from the user experience side
* they are not saved in the langlinks table and thus are ignored in
certain reports and also in the respective export
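
A report like the one described above could be sketched as follows. This is a minimal Python illustration, not the analysis that was actually run: the input format (source title, target language, target title) and the function name are assumptions, and the "Example"/"Beispiel" row is a made-up placeholder. The Móði and Magni rows mirror the case mentioned earlier in this thread.

```python
from collections import defaultdict

def double_language_links(langlinks):
    """Find articles that link to more than one article in the same language.

    langlinks is an iterable of (source_title, lang, target_title) tuples
    (an assumed format). Returns {(source_title, lang): [targets, ...]}.
    """
    targets = defaultdict(set)
    for source, lang, target in langlinks:
        targets[(source, lang)].add(target)
    return {key: sorted(found) for key, found in targets.items()
            if len(found) > 1}

links = [
    ("Móði and Magni", "da", "Magni"),
    ("Móði and Magni", "da", "Modi"),
    ("Example", "de", "Beispiel"),  # hypothetical single link, not reported
]
doubles = double_language_links(links)
for (source, lang), found in doubles.items():
    print(f"{source} -> {lang}: {', '.join(found)}")
```

Using a set per (article, language) pair means a link repeated verbatim is not counted as a double link; only distinct targets are.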

I am not sure how to reach out to the respective Wikipedia
communities, or if I should at all. Should I post to their respective
version of the village pump? Remembering from the time I was active on
the Croatian Wikipedia, I would have appreciated that list to check
the entries. I reckoned the wikipedia-l list would be the right place,
but that list looks rather dead.

