Erik Moeller, 24/04/2013 08:29:
[...]
Could open source MT be such a strategic investment? I don't know, but
I'd like to at least raise the question. I think the alternative will
be, for the foreseeable future, to accept that this piece of
technology will be proprietary, and to rely on goodwill
2013/4/29 Mathieu Stumpf psychosl...@culture-libre.org
On 2013-04-26 20:27, Milos Rancic wrote:
OmegaWiki is a masterpiece from the perspective of one [computational]
linguist. Erik made the structure so well, that it's the best starting
point to create a contemporary multilingual
On 26/04/13 19:38, Bjoern Hoehrmann wrote:
* Andrea Zanni wrote:
At the moment, Wikisource could be an interesting corpus and laboratory for
improving and enhancing OCR,
as the OCR-generated text is always proofread and corrected by humans.
Try also Distributed Proofreaders. It is my
On 2013-04-26 17:00, Gerard Meijssen wrote:
Hoi,
When we invest in MT it is to convey knowledge, information and primarily
Wikipedia articles. They do not have the same problems poetry has. With
explanatory articles on a subject there is a web of associated concepts.
These concepts are
On 2013-04-26 19:57, Samuel Klein wrote:
On Fri, Apr 26, 2013 at 1:24 PM, Bjoern Hoehrmann derhoe...@gmx.net wrote:
* Erik Moeller wrote:
Are there open source MT efforts that are close enough to merit scrutiny?
Wiktionary. If you want to help free software efforts in the area of
machine
On 2013-04-26 20:27, Milos Rancic wrote:
On Fri, Apr 26, 2013 at 7:57 PM, Samuel Klein meta...@gmail.com wrote:
Yes. Finding a way to capture and integrate the work OmegaWiki has
done into a new Wikidata-powered Wiktionary would be a useful start.
And we've already sort of claimed the space
Thanks to Jane for introducing CoSyne. But I feel all the wikis do not want to
be synchronized to certain wikis. Rather than having identical articles, I hope
they would have their own articles. I hope I could have two more tabs at right
of the 'Article' and 'Talk' on English Wikipedia for
Hi all,
On Wed, 24 Apr 2013 08:39:55 +0200
Ting Chen wing.phil...@gmx.de wrote:
Oh yes, this would really be great. Just think about the money the
Foundation gives out meanwhile for translation, plus the many many
volunteers' work invested into translation. A free and open translation
On 2013-04-25 20:56, Theo10011 wrote:
As far as linguistic typology goes, it's far too unique and too varied to
have a language independent form develop as easily. Perhaps it also depends
on the perspective. For example, the majority of people commenting here
(Americans, Europeans) might
We already have the translation options on the left side of the screen
in any Wikipedia article.
This choice is generally a smattering of languages, and a long term
goal for many small-language Wikipedias is to be able to translate an
article from related languages (say from Dutch into Frisian,
* Andrea Zanni wrote:
At the moment, Wikisource could be an interesting corpus and laboratory for
improving and enhancing OCR,
as the OCR-generated text is always proofread and corrected by humans.
As part of our project (
http://wikisource.org/wiki/Wikisource_vision_development), Micru was
On Fri, Apr 26, 2013 at 1:24 PM, Bjoern Hoehrmann derhoe...@gmx.net wrote:
* Erik Moeller wrote:
Are there open source MT efforts that are close enough to merit
scrutiny?
Wiktionary. If you want to help free software efforts in the area of
machine translation, then what they seem to need most
On Thu, Apr 25, 2013 at 4:26 PM, Denny Vrandečić
denny.vrande...@wikimedia.de wrote:
Not just bootstrapping the content. By having the primary content be saved
in a language independent form, and always translating it on the fly, it
would not merely bootstrap content in different languages, but
On Fri, Apr 26, 2013 at 7:57 PM, Samuel Klein meta...@gmail.com wrote:
Yes. Finding a way to capture and integrate the work OmegaWiki has
done into a new Wikidata-powered Wiktionary would be a useful start.
And we've already sort of claimed the space (though we are neglecting
it) -- it's
On 24/04/13 12:35, Denny Vrandečić wrote:
Current machine translation research aims at using massive machine learning
supported systems. They usually require big parallel corpora. We do not
have big parallel corpora (Wikipedia articles are not translations of each
other, in general), especially
On 24/04/13 12:35, Denny Vrandečić wrote:
In summary, I see four calls for action right now (and for all of them this
means to first actually think more and write down a project plan and gather
input on that), that could and should be tackled in parallel if possible:
I ) develop a structured
Denny,
very good and compelling reasoning as always. I think the argument
that we can potentially do a lot for the MT space (including open
source efforts) in part by getting our own house in order on the
dictionary side of things makes a lot of sense. I don't think it
necessarily excludes
On 2013-04-25 04:49, George Herbert wrote:
We can't usefully help with internet access (and that's proceeding at a good
pace even in the third world), but language will remain a barrier when
people get access. In a few situations politics / firewalling is as well
(China, primarily), which is
On Thu, Apr 25, 2013 at 7:26 AM, Denny Vrandečić
denny.vrande...@wikimedia.de wrote:
Not just bootstrapping the content. By having the primary content be saved
in a language independent form, and always translating it on the fly, it
would not merely bootstrap content in different languages,
2013/4/25 Brion Vibber bvib...@wikimedia.org
You are blowing my mind, dude. :)
Glad to hear :)
I suspect this approach won't serve for everything, but it sounds
*awesome*. If we can tie natural-language statements directly to data nodes
(rather than merely annotating vague references
On Thu, Apr 25, 2013 at 7:56 PM, Denny Vrandečić
denny.vrande...@wikimedia.de wrote:
Not just bootstrapping the content. By having the primary content be saved
in a language independent form, and always translating it on the fly, it
would not merely bootstrap content in different languages,
This subthread seems headed out into practical / applied epistemology, if
there is such a thing.
I am not sure if we can get from here to there; that said, a new structure
with language independent facts / information points that then got
machine-explained or described in a local language would
Oh yes, this would really be great. Just think about the money the
Foundation gives out meanwhile for translation, plus the many many
volunteers' work invested into translation. A free and open translation
software is long overdue indeed. Great idea Erik.
Greetings
Ting
On 4/24/2013 8:29 AM,
Erik Moeller wrote:
Could open source MT be such a strategic investment? I don't know, but
I'd like to at least raise the question. I think the alternative will
be, for the foreseeable future, to accept that this piece of
technology will be proprietary, and to rely on goodwill for any
integration
A few links:
* 2010 discussion:
https://strategy.wikimedia.org/wiki/Proposal:Free_Translation_Memory as
one of the
https://strategy.wikimedia.org/wiki/List_of_things_that_need_to_be_free
(follow links, including)
* http://www.apertium.org : was used by translatewiki.net but isn't any
longer
On Wed, Apr 24, 2013 at 12:06 AM, MZMcBride z...@mzmcbride.com wrote:
Though the Wikimedia community seems eager to add new projects (Wikidata,
Wikivoyage), I wonder how it can be sensible or reasonable to focus on yet
another project when the current projects are largely neglected (Wikinews,
On Wed, Apr 24, 2013 at 8:29 AM, Erik Moeller e...@wikimedia.org wrote:
Could open source MT be such a strategic investment? I don't know, but
I'd like to at least raise the question. I think the alternative will
be, for the foreseeable future, to accept that this piece of
technology will be
Erik Moeller, 24/04/2013 10:06:
[...] Moreover, the lens of project/domain name is a very arbitrary one to
define vertically focused efforts.
A good and interesting reasoning here. Indeed something to keep in mind,
but which adds problems.
There are specialized efforts
within Wikipedia
On 4/24/13 8:29 AM, Erik Moeller wrote:
Are there open source MT efforts that are close enough to merit
scrutiny? In order to be able to provide high quality result, you
would need not only a motivated, well-intentioned group of people, but
some of the smartest people in the field working on it.
Erik, all,
sorry for the long mail.
Incidentally, I have been thinking in this direction myself for a while,
and I have come to a number of conclusions:
1) the Wikimedia movement can not, in its current state, tackle the problem
of machine translation of arbitrary text from and to all of our
A brief addendum,
On 4/24/13 12:25 PM, Mark wrote:
From 2006 through 2012 [the ERC] allocated about $10m to kickstart
open-source MT, though focused primarily on European languages, via
the EuroMatrix (2006-09) and EuroMatrixPlus (2009-12) research projects.
Missed some projects. Seems the
This is closely tied to software which is being developed, some of it
secretly, to enable machines to understand and use language. As of now
this will be government and corporate owned and controlled. I say closely
tied because that is how translation works; only someone or something
that
On 24 April 2013 11:35, Denny Vrandečić denny.vrande...@wikimedia.de wrote:
If we constrain b) a lot, we could just go and develop pages to display
for pages that do not exist yet based on Wikidata in the smaller
languages. That's a far cry from machine translating the articles, but it
would
only someone or something
that understands language can translate perfectly
Precisely
crude translations into little used languages are nearly
worthless due to syntax issues. Useful work requires at least one person
fluent in the language
It's very true!
Current Google MT tools are reasonably
On 24/04/13 08:29, Erik Moeller wrote:
Could open source MT be such a strategic investment? I don't know, but
I'd like to at least raise the question. I think the alternative will
be, for the foreseeable future, to accept that this piece of
technology will be proprietary, and to rely on goodwill
On 2013-04-24 08:29, Erik Moeller wrote:
Are there open source MT efforts that are close enough to merit
scrutiny? In order to be able to provide high quality result, you
would need not only a motivated, well-intentioned group of people, but
some of the smartest people in the field working
On Wed, Apr 24, 2013 at 2:04 PM, Mathieu Stumpf
psychosl...@culture-libre.org wrote:
I would like to add that (I'm no specialist of this subject) translating
natural language probably needs at least a large set of existing
translations, at least to get rid of obvious well-known idioms like
On 2013-04-24 12:35, Denny Vrandečić wrote:
3) Wiktionary could be an even more amazing resource if we would finally
tackle the issue of structuring its content more appropriately. I think
Wikidata opened a few avenues to structure planning in this direction and
provide some software, but
I really like Erik's original suggestion, and these ideas, Denny.
Since there are many different possible goals, it's worth having a
page just to list all of the possible different goals and compare them
- both how they fit with one another and how they fit with existing
active projects elsewhere
(FYI this is me speaking with my personal hat on, none of these
opinions are official in any way or the opinions of the foundation as
an organization)
personal_hat
While Wikimedia is still only a medium-sized organization, it is not
poor. With more than 1M donors supporting our mission and a
Leslie Carr wrote (personally, not officially):
I think that while supporting open source machine translation is an
awesome goal, it is out of scope of our budget and the engineering
budget could be better spent elsewhere, such as with completing
existing tools that are in development, but not