Re: [Wikimedia-l] knowing English is a privilege (was Re: Paid translation)

2018-03-02 Thread James Salsman
> Wikidata's Lexeme project is progressing slowly, but its direction is right.
> It will finally build a technical platform that is actually good for a 
> dictionary.

A wiki article is very similar in type to a dictionary's presentation of lexemes.
The best dictionaries also cover morphemes, e.g., "grammar" on the top
line of https://dictionary.cambridge.org

http://www.englishprofile.org/english-grammar-profile/egp-online

Those are the intermediate English morphemes, and these are their lexemes:

http://www.englishprofile.org/wordlists

I wonder if there is a Wikidata word number mapping for all ~6,500 of
those (level A1-C2) words.

Thanks all!

Best regards,
Jim


On Fri, Mar 2, 2018 at 2:10 AM, mathieu stumpf guntz wrote:
> On 02/03/2018 at 00:46, Jean-Philippe Béland wrote:
>>
>> I think this is à propos in this discussion about how authoritative
>> Wiktionary can be... here a scientific article starts by using a definition
>> from Wiktionary:
>>
>> http://theconversation.com/de-facebook-au-developpement-des-plantes-quand-les-reseaux-sen-melent-90891
>>
>> JP
>
> Actually, one point that hasn't been raised so far is that the Wiktionaries
> do not have equal quality across every article, but where the quality is
> there, they easily outshine any other single dictionary out there. Also, there
> is a growing number of words for which no definition is given outside
> Wiktionary. I think that, combined, these might easily accustom people to
> go straight to Wiktionary when they need a definition, whatever its
> level of authority might be.
>
> ___
> Wikimedia-l mailing list, guidelines at:
> https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and
> https://meta.wikimedia.org/wiki/Wikimedia-l
> New messages to: Wikimedia-l@lists.wikimedia.org
> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> 



Re: [Wikimedia-l] knowing English is a privilege (was Re: Paid translation)

2018-03-02 Thread mathieu stumpf guntz

On 02/03/2018 at 00:46, Jean-Philippe Béland wrote:

> I think this is à propos in this discussion about how authoritative
> Wiktionary can be... here a scientific article starts by using a definition
> from Wiktionary:
> http://theconversation.com/de-facebook-au-developpement-des-plantes-quand-les-reseaux-sen-melent-90891
>
> JP

Actually, one point that hasn't been raised so far is that the Wiktionaries
do not have equal quality across every article, but where the quality is
there, they easily outshine any other single dictionary out there. Also,
there is a growing number of words for which no definition is given
outside Wiktionary. I think that, combined, these might easily accustom
people to go straight to Wiktionary when they need a definition,
whatever its level of authority might be.



Re: [Wikimedia-l] knowing English is a privilege (was Re: Paid translation)

2018-03-01 Thread Jean-Philippe Béland
I think this is à propos in this discussion about how authoritative
Wiktionary can be... here a scientific article starts by using a definition
from Wiktionary:
http://theconversation.com/de-facebook-au-developpement-des-plantes-quand-les-reseaux-sen-melent-90891

JP


On Thu, Mar 1, 2018 at 9:49 AM Amir E. Aharoni wrote:

> 2018-02-28 23:09 GMT+02:00 James Salsman:
> >
> > > building an authoritative dictionary is considerably
> > > harder than building a (de facto) authoritative encyclopedia.
> >
> > What reason is there to think that? By any measure of editor hours, or
> > the amount of money it would take to replicate the effort, or the
> > maintenance load going forward, I'm sure that even a three-shelf-foot
> > encyclopedia is harder than a 100,000-word dictionary.
>
> A couple of reasons:
> * For the particular case of Wikimedia, we are using the same software for
> Wiktionary as we do for Wikipedia. It's insane. MediaWiki wasn't made for
> that. It was made for Wikipedia.
> * An *authoritative* dictionary needs authority. It must be built by a team
> of trained and certified linguists. It needs a large and systematized
> collection of citations. It's just harder to do this for a dictionary than
> for an encyclopedia. Citations for an encyclopedia these days are often
> easily googlable, and the form of an encyclopedia article is freer than the
> form of a dictionary entry, which must be super-strict.
>
> The English Wiktionary community is overcoming both of these problems
> valiantly.
>
> It is overcoming the first problem by using lots of templates and gadgets,
> which kinda work in practice, but which are hard to learn and to replicate
> for other languages, and hard for software to process.
>
> It is overcoming the second problem by being more practically useful than
> authoritative, similarly to Wikipedia. Lexicographic citations in English
> are particularly easy to google up, given that:
> * English is the #1 language on the web
> * Google is a company based in an English-speaking country and (probably)
> getting most of its revenue from English-speaking customers
> * English has a simple morphology, for which it is particularly easy to
> build a well-working search engine
>
> However, while it's easy to google up examples for English word usage, I
> strongly suspect that googling won't produce results that will be as
> systematized as a citation database of Merriam-Webster is.
>
> Wikipedia proved long ago that it can compete with the authority of
> Britannica, even if not necessarily win, but Wiktionary hasn't yet proven
> that it can compete with the authority of Merriam-Webster, Oxford, Houaiss,
> Duden, etc.
>
> (The English Wiktionary is not necessarily special; I also got to use the
> French, German, and Dutch Wiktionaries a bit, and they all do it at a level
> of quality that is comparable to the English one.)
>
> Is it desirable for Wiktionary to get better? Of course it is. Can
> Wiktionary get better? Yes, and the path is quite clear. Wikidata's Lexeme
> project is progressing slowly, but its direction is right. It will finally
> build a technical platform that is actually good for a dictionary.
>
> At https://phabricator.wikimedia.org/T186421 I've been writing my ideas
> about how Lexical Wikidata can actually be used by editors and readers. So
> I'm very much on board with the idea of better Wiktionary. (Before you jump
> to conclusions: These ideas were not solicited by Wikidata developers. They
> are totally mine, and they are not in any way "official". I'm just writing
> them down as a brain dump, in my personal volunteering capacity, hoping
> that they will be useful to Wikidata developers.)
>
> > > We are not *teaching* encyclopedia articles.
> >
> > What is the difference between delivering the text of an encyclopedia
> > article and teaching it? Encyclopedias are not written to be
> > accompanied by a lecturer, tutor, or teacher. We even teach how to
> > write them, to students, in schools, and the students often if not
> > almost always get academic credit for their work:
> > https://en.wikipedia.org/wiki/Wikipedia:Education_program/Educators
>
> Exactly: As Wikimedians, we are actively teaching people to write in
> Wikipedia (and in other Wikimedia projects), but we are not teaching the
> *subjects* of the articles. Not as Wikimedians. Some Wikimedians are also
> teachers, and they use Wikipedia articles as handouts, but this is not
> really a Wikimedia activity.
>
> As Wikimedians we just make materials available, and we teach others *to
> make them available*.
>
> > > Wikimedia should be busy getting even better at its main thing: wiki
> > > articles.
> >
> > Why? We are already the best at that.
>
> We may be the best, and we are definitely the most popular, but we could be
> so, so much better. And we should be.
>
> As a simple high-level example, it's still not NEARLY as easy to become a
> Wikipedia editor as it should be.
>
> I often wish that Wikipedia ha

Re: [Wikimedia-l] knowing English is a privilege (was Re: Paid translation)

2018-03-01 Thread Amir E. Aharoni
2018-02-28 23:09 GMT+02:00 James Salsman:
>
> > building an authoritative dictionary is considerably
> > harder than building a (de facto) authoritative encyclopedia.
>
> What reason is there to think that? By any measure of editor hours, or
> the amount of money it would take to replicate the effort, or the
> maintenance load going forward, I'm sure that even a three-shelf-foot
> encyclopedia is harder than a 100,000-word dictionary.

A couple of reasons:
* For the particular case of Wikimedia, we are using the same software for
Wiktionary as we do for Wikipedia. It's insane. MediaWiki wasn't made for
that. It was made for Wikipedia.
* An *authoritative* dictionary needs authority. It must be built by a team
of trained and certified linguists. It needs a large and systematized
collection of citations. It's just harder to do this for a dictionary than
for an encyclopedia. Citations for an encyclopedia these days are often
easily googlable, and the form of an encyclopedia article is freer than the
form of a dictionary entry, which must be super-strict.

The English Wiktionary community is overcoming both of these problems
valiantly.

It is overcoming the first problem by using lots of templates and gadgets,
which kinda work in practice, but which are hard to learn and to replicate
for other languages, and hard for software to process.

It is overcoming the second problem by being more practically useful than
authoritative, similarly to Wikipedia. Lexicographic citations in English
are particularly easy to google up, given that:
* English is the #1 language on the web
* Google is a company based in an English-speaking country and (probably)
getting most of its revenue from English-speaking customers
* English has a simple morphology, for which it is particularly easy to
build a well-working search engine

However, while it's easy to google up examples for English word usage, I
strongly suspect that googling won't produce results that will be as
systematized as a citation database of Merriam-Webster is.

Wikipedia proved long ago that it can compete with the authority of
Britannica, even if not necessarily win, but Wiktionary hasn't yet proven
that it can compete with the authority of Merriam-Webster, Oxford, Houaiss,
Duden, etc.

(The English Wiktionary is not necessarily special; I also got to use the
French, German, and Dutch Wiktionaries a bit, and they all do it at a level
of quality that is comparable to the English one.)

Is it desirable for Wiktionary to get better? Of course it is. Can
Wiktionary get better? Yes, and the path is quite clear. Wikidata's Lexeme
project is progressing slowly, but its direction is right. It will finally
build a technical platform that is actually good for a dictionary.

At https://phabricator.wikimedia.org/T186421 I've been writing my ideas
about how Lexical Wikidata can actually be used by editors and readers. So
I'm very much on board with the idea of better Wiktionary. (Before you jump
to conclusions: These ideas were not solicited by Wikidata developers. They
are totally mine, and they are not in any way "official". I'm just writing
them down as a brain dump, in my personal volunteering capacity, hoping
that they will be useful to Wikidata developers.)

> > We are not *teaching* encyclopedia articles.
>
> What is the difference between delivering the text of an encyclopedia
> article and teaching it? Encyclopedias are not written to be
> accompanied by a lecturer, tutor, or teacher. We even teach how to
> write them, to students, in schools, and the students often if not
> almost always get academic credit for their work:
> https://en.wikipedia.org/wiki/Wikipedia:Education_program/Educators

Exactly: As Wikimedians, we are actively teaching people to write in
Wikipedia (and in other Wikimedia projects), but we are not teaching the
*subjects* of the articles. Not as Wikimedians. Some Wikimedians are also
teachers, and they use Wikipedia articles as handouts, but this is not
really a Wikimedia activity.

As Wikimedians we just make materials available, and we teach others *to
make them available*.

> > Wikimedia should be busy getting even better at its main thing: wiki
> > articles.
>
> Why? We are already the best at that.

We may be the best, and we are definitely the most popular, but we could be
so, so much better. And we should be.

As a simple high-level example, it's still not NEARLY as easy to become a
Wikipedia editor as it should be.

I often wish that Wikipedia had more substantial competitors, so it would
drive us to be faster at improving ourselves. Medium.com, Quora.com,
Genius.com, and some other web properties are occasionally mentioned as
Wikimedia's competitors, but none of them is doing quite the same thing as
Wikimedia does, and though each of them is quite popular, none is as
popular as Wikipedia is.

(I will readily admit, however, that Google is a competitor for providing
quick facts, and Facebook and Instagram are competitors for people's spare

Re: [Wikimedia-l] knowing English is a privilege (was Re: Paid translation)

2018-02-28 Thread Asaf Bartov
On Wed, Feb 28, 2018 at 1:09 PM James Salsman wrote:

> > We are not *teaching* encyclopedia articles.
>
> What is the difference between delivering the text of an encyclopedia
> article and teaching it?


Depending on one's understanding of "teaching", and its expected outcomes,
the difference can be significant. It can, for example, imply a different
exposition of information, or include comprehension questions and
drilling.  It can include student assessment (grading), statistics, etc.


> Encyclopedias are not written to be
> accompanied by a lecturer, tutor, or teacher.


That's true.  They're designed for self-study. But that does not mean an
encyclopedic article is the best vessel for teaching a topic.  Certainly,
most people cannot effectively pick up a language from reading the
Wikipedia articles about its morphology and syntax.


> Knowing any language is a privilege, and suggesting that there is any
> reason to narrow the Foundation's focus away from language instruction
> seems completely absurd to me.
>

I think there can be no doubt that languages are knowledge, and that
therefore language instruction falls squarely within the Wikimedia mission
"to empower and engage people around the world to collect and develop
educational content under a free license or in the public domain, and to
disseminate it effectively and globally."

But being within our mission is not enough to be actively pursued, and I
assume that by "the Foundation's focus", you're referring not just to the
mission, but to the set of things the Foundation is actively pursuing and
investing in, i.e. to its current strategy and goals.

It seems to me that the main argument for adding *language instruction* to
the set of goals the Foundation (or wider movement) actively pursues would
be some demonstration that the Wikimedia movement is *well-equipped to do
it very well*, especially compared to the relatively rich set of products
and solutions in the language instruction space.

I acknowledge that very few of those are free/libre.  But most people don't
use Wikipedia because it's free/libre; they use it because it's free of
cost and because it's really, really useful.  Likewise, we'd need to
convince ourselves we can do significantly better than Duolingo, Babbel,
Mondly, Mango, Busuu, etc., to make it a worthwhile pursuit for the
Wikimedia movement at this time, being as it would be (as always) at the
cost of pursuing other goals.

> > Wikimedia should be busy getting even better at its main thing: wiki
> > articles.
>
> Why? We are already the best at that. Why not make the wiki articles
> in Wiktionary better by not just playing audio recordings of words,
> which volunteers (not the Foundation) already provide, but meeting
> that initiative by recording utterances and predicting whether they
> are intelligible pronunciations, and doing the same with recording
> gadgets in Wikipedia's pronunciation articles? http://j.mp/irslides


I think that's a very interesting direction.  I would suggest that it would
only make sense to invest in it, if at all, once we have Lexical data
entities in Wikidata. (soon!)

A.
   (personal capacity)


[Wikimedia-l] knowing English is a privilege (was Re: Paid translation)

2018-02-28 Thread James Salsman
> building an authoritative dictionary is considerably
> harder than building a (de facto) authoritative encyclopedia.

What reason is there to think that? By any measure of editor hours, or
the amount of money it would take to replicate the effort, or the
maintenance load going forward, I'm sure that even a three-shelf-foot
encyclopedia is harder than a 100,000-word dictionary.

> We are not *teaching* encyclopedia articles.

What is the difference between delivering the text of an encyclopedia
article and teaching it? Encyclopedias are not written to be
accompanied by a lecturer, tutor, or teacher. We even teach how to
write them, to students, in schools, and the students often if not
almost always get academic credit for their work:
https://en.wikipedia.org/wiki/Wikipedia:Education_program/Educators

Knowing any language is a privilege, and suggesting that there is any
reason to narrow the Foundation's focus away from language instruction
seems completely absurd to me.

> Wikimedia should be busy getting even better at its main thing: wiki articles.

Why? We are already the best at that. Why not make the wiki articles
in Wiktionary better by not just playing audio recordings of words,
which volunteers (not the Foundation) already provide, but meeting
that initiative by recording utterances and predicting whether they
are intelligible pronunciations, and doing the same with recording
gadgets in Wikipedia's pronunciation articles? http://j.mp/irslides

I'm serious that I think the Foundation should hire all my Google
Summer of Code students to support doing that, because it will take
about that many people to set it up so that volunteers can complete
the work for all languages, not just English.

There is no reason that the Foundation can't both pay to translate
Wikipedia articles and pay to up Wiktionary's language instruction
game at the same time. That would have made sense ten years ago, and
the budget is much larger now. We are at a juncture in aligning our
long term strategy to the mission, so I hope both projects get funded.
If it has to be proposed budget-neutral to be compelling, then get rid
of the mobile app and mobile web versions except on platforms where
they are genuinely easier for editors, not just readers, to use.

Best regards,
Jim
