Re: [Mt-list] Travel from LREC to EAMT?
Thanks. The LREC website includes an Air Malta timetable, which is helpful in planning. Note that they only fly to certain cities on certain days! (Indicated by numbers 1-7 in the timetable.)

Bob

Francisco Guzman wrote:

A flight from Malta (Luqa) to Nice seems to be the shortest way.

Paco Guzman

On Fri, Mar 5, 2010 at 12:44 PM, Robert Frederking r...@cs.cmu.edu wrote:

I'm looking for the best way to go from Malta (LREC) to St. Raphael, France (EAMT) in May, and I suspect I'm not the only one. If someone knows a particularly good plan, of the many options, it would probably be helpful to others to send it to this list. Thanks/merci/whatever you say in Maltese.

Bob

___
Mt-list mailing list
Re: [Mt-list] Public release of Haitian Creole language data by Carnegie Mellon
Well, my understanding is that, unfortunately, most companies won't touch anything that's under GPL, so I don't think that's a solution. We don't want to exclude commercial entities.

Bob

Francis Tyers wrote:

First of all, thanks to CMU for releasing the data. I've no doubt it will be valuable to people working in the field.

I don't particularly like terms like "lawyerbomb" and "obnoxious advertising clause", but this merits a response. People who don't get paid to work on the software they develop, and who aren't employed by big universities or companies, are understandably concerned about getting sued. You can say "but they've never been sued before, so why should they worry", but this isn't really convincing.

They can get frustrated that people make more work for themselves and others.

* Making up your own 'free/open-source' licence: More work for you, more work for them.
* Choosing an existing tried and tested 'free/open-source' licence: Less work for you, less work for them.

Furthermore, they can also find it frustrating that a non-profit organisation would release their work under a licence that is incompatible with that of over 60% of free software.[1]

Fran

PS. Some of these same issues are reviewed in Ted Pedersen's excellent 2008 article: http://www.d.umn.edu/~tpederse/Pubs/pedersen-last-word-2008.pdf

=Notes=
1. http://www.blackducksoftware.com/oss/licenses#top20

On Fri, 22 Jan 2010 at 18:29 -0500, Job M. van Zuijlen wrote:

Some of the verbiage used in this discussion (lawyer bomb...) doesn't particularly encourage people to make their data freely available. What happened to common sense? I think CMU's initiative should be commended.

Job van Zuijlen

From: Robert Frederking
Sent: Friday, January 22, 2010 16:32
To: Francis Tyers
Cc: mt-list@eamt.org
Subject: Re: [Mt-list] Public release of Haitian Creole language data by Carnegie Mellon

I'm not a lawyer, but let me start by stating that our intent was simply that re-use included acknowledgement.
This was not intended to be a splash-screen on every start-up, or making the software pronounce our names at the start of every sentence. :-) It only has to be clearly visible in anyone's source files.

We aren't interested in suing people; we are a non-profit research organization. But like the Regents in California, we have a responsibility to our sponsors that appropriate credit is given for our work. So this is intended to be like the old BSD advertising clause, which is generally considered to be clear from a legal point of view.

Please use the data however you want; just don't say you originally collected it.

Bob

Francis Tyers wrote:

[ Sorry in advance for cross posting ]

I'm going over this on the debian-legal mailing list (a good place to ask about issues in free/open-source software licensing). There is a question about clause 5 of the licence:

## 5. Any commercial, public or published work that uses this data ##
## must contain a clearly visible acknowledgment as to the ##
## provenance of the data. ##

From debian-legal:

My concern is whether, contrary to the favourable interpretation you give, this is intended to act like an obnoxious advertising clause. In other words, what will satisfy “contain” in “contain a clearly visible acknowledgement”? Is it sufficient for the acknowledgement to be “clearly visible” only after inspecting various files in the source code? Or is the copyright holder's intent that the acknowledgement be clearly visible to every recipient, even those who receive a non-source form of the work? The latter would be a non-free restriction, like the obnoxious advertising clause in the older BSD licenses.

This looks, as it is currently worded, more like a lawyerbomb now that I consider it. I would appreciate input on this from legally-trained minds.

Could you confirm if that clause means that the acknowledgement should be _clearly visible_ to _every recipient_ or would it suffice to be visible after inspecting the source code?
Thanks for your help in this and best regards,

Francis Tyers

On Thu, 21 Jan 2010 at 22:59 -0500, Alon Lavie wrote:

Hi Francis,

Thanks for the suggestion, but we were advised to leave the licensing language as is. Our licensing language is effectively equivalent to the MIT license and is unambiguous with respect to releasing the data for any use (commercial or non-commercial).

Best regards,
- *Alon*

Francis Tyers wrote:

On Thu, 21 Jan 2010 at 14:49 -0500, Robert Frederking wrote:

The Language Technologies Institute (LTI
Re: [Mt-list] Public release of Haitian Creole language data by Carnegie Mellon
I'm not a lawyer, but let me start by stating that our intent was simply that re-use included acknowledgement. This was not intended to be a splash-screen on every start-up, or making the software pronounce our names at the start of every sentence. :-) It only has to be clearly visible in anyone's source files.

We aren't interested in suing people; we are a non-profit research organization. But like the Regents in California, we have a responsibility to our sponsors that appropriate credit is given for our work. So this is intended to be like the old BSD advertising clause, which is generally considered to be clear from a legal point of view.

Please use the data however you want; just don't say you originally collected it.

Bob

Francis Tyers wrote:

[ Sorry in advance for cross posting ]

I'm going over this on the debian-legal mailing list (a good place to ask about issues in free/open-source software licensing). There is a question about clause 5 of the licence:

## 5. Any commercial, public or published work that uses this data ##
## must contain a clearly visible acknowledgment as to the ##
## provenance of the data. ##

From debian-legal:

My concern is whether, contrary to the favourable interpretation you give, this is intended to act like an obnoxious advertising clause. In other words, what will satisfy “contain” in “contain a clearly visible acknowledgement”? Is it sufficient for the acknowledgement to be “clearly visible” only after inspecting various files in the source code? Or is the copyright holder's intent that the acknowledgement be clearly visible to every recipient, even those who receive a non-source form of the work? The latter would be a non-free restriction, like the obnoxious advertising clause in the older BSD licenses.

This looks, as it is currently worded, more like a lawyerbomb now that I consider it. I would appreciate input on this from legally-trained minds.
Could you confirm if that clause means that the acknowledgement should be _clearly visible_ to _every recipient_ or would it suffice to be visible after inspecting the source code?

Thanks for your help in this and best regards,

Francis Tyers

On Thu, 21 Jan 2010 at 22:59 -0500, Alon Lavie wrote:

Hi Francis,

Thanks for the suggestion, but we were advised to leave the licensing language as is. Our licensing language is effectively equivalent to the MIT license and is unambiguous with respect to releasing the data for any use (commercial or non-commercial).

Best regards,
- *Alon*

Francis Tyers wrote:

On Thu, 21 Jan 2010 at 14:49 -0500, Robert Frederking wrote:

The Language Technologies Institute (LTI) of Carnegie Mellon University's School of Computer Science (CMU SCS) is making publicly available the Haitian Creole spoken and text data that we have collected or produced. We are providing this data with minimal restrictions in order to allow others to develop language technology for Haiti, in parallel with our own efforts to help with this crisis. Since organizing the data in a useful fashion is not instantaneous, and more text data is currently being produced by collaborators, we will be publishing the data incrementally on the web, as it becomes available. To access the currently available data, please visit the website at http://www.speech.cs.cmu.edu/haitian/

Would you consider also dual/triple licensing the data under an existing free software licence, such as the MIT licence[1] or the GNU GPL[2]? This way it could be combined with existing data under these licences (e.g. the majority of free/open-source software) and researchers and developers don't need to hire legal advice to determine if they can combine their work with yours.

Best regards,

Fran

1. http://en.wikipedia.org/wiki/MIT_Licence#License_terms
2. http://www.gnu.org/licenses/gpl.html
[Mt-list] Public release of Haitian Creole language data by Carnegie Mellon
The Language Technologies Institute (LTI) of Carnegie Mellon University's School of Computer Science (CMU SCS) is making publicly available the Haitian Creole spoken and text data that we have collected or produced. We are providing this data with minimal restrictions in order to allow others to develop language technology for Haiti, in parallel with our own efforts to help with this crisis. Since organizing the data in a useful fashion is not instantaneous, and more text data is currently being produced by collaborators, we will be publishing the data incrementally on the web, as it becomes available. To access the currently available data, please visit the website at http://www.speech.cs.cmu.edu/haitian/
Re: [Mt-list] Public release of Haitian Creole language data by Carnegie Mellon
Hi Vadim,

Yes, French is the principal written language, but most of the population only speaks Creole (and is illiterate). We ourselves are indeed looking at making speech-based systems (and the rarest part of the data may be the speech data).

There may also be unforeseen benefits to the data being available. For example, it appears that Doctors Without Borders (Médecins Sans Frontières) may make use of the bilingual medical phrases as-is, through Translators Without Borders (Traducteurs sans Frontières). So who knows how this may help.

Cheers.

Bob

Vadim Berman wrote:

Hi Robert,

These are commendable efforts, but isn't French the principal written language in Haiti? Or are you talking about a speech-to-speech system?

Best regards,
Vadim

- Original Message -
From: Robert Frederking r...@cs.cmu.edu
To: mt_l...@nist.gov; mt-list@eamt.org
Sent: Friday, January 22, 2010 6:49 AM
Subject: [Mt-list] Public release of Haitian Creole language data by Carnegie Mellon

The Language Technologies Institute (LTI) of Carnegie Mellon University's School of Computer Science (CMU SCS) is making publicly available the Haitian Creole spoken and text data that we have collected or produced. We are providing this data with minimal restrictions in order to allow others to develop language technology for Haiti, in parallel with our own efforts to help with this crisis. Since organizing the data in a useful fashion is not instantaneous, and more text data is currently being produced by collaborators, we will be publishing the data incrementally on the web, as it becomes available. To access the currently available data, please visit the website at http://www.speech.cs.cmu.edu/haitian/
Re: [Mt-list] Computers translation in Africa / involving African languages
If Arabic counts, there's much work in the US on Arabic these days.

Don Osborn wrote:

I have updated a very modest presentation of some info relevant to MT in Africa at http://www.bisharat.net/Trans/ . There is not much there, so I would like to request information/recommendations for other links relating to MT in Africa and in African languages wherever. (The page also needs some reworking, but I'm mainly concerned now with content.)

TIA

Don Osborn
Bisharat.net
PanAfrican Localisation project
[MT-List] AMTA-2002: Call for Tutorial and Workshop Proposals
Please circulate as widely as possible:

--- CALL FOR TUTORIAL AND WORKSHOP PROPOSALS ---

The Association for Machine Translation in the Americas
AMTA-2002 Conference
Tiburon, California (near San Francisco)
October 8-12, 2002

Conference theme: FROM RESEARCH TO REAL USERS

Ever since the showdown between Empiricists and Rationalists a decade ago at TMI-92, MT researchers have hotly pursued promising paradigms for MT, including data-driven approaches and hybrids that integrate these with more traditional rule-based components. During the same period, commercial MT systems with standard transfer architectures have evolved along a parallel and almost unrelated track, increasing their coverage and achieving much broader acceptance and usage. This raises a number of interesting questions (see the main conference Call For Participation), primarily concerned with why this disconnect exists, and whether it is going to change.

TUTORIAL AND WORKSHOP PROPOSAL SUBMISSIONS

Proposals for tutorials and workshops are now being solicited on these and other topics of direct interest and impact for MT researchers, developers, vendors or users of MT technologies. We welcome and encourage participation by members of AMTA's sister organizations, AAMT in Asia and EAMT in Europe, as well.

Workshops will be held on Tuesday October 8th. Approximately 7 hours may be allocated per workshop. Tutorials will be held on Wednesday October 9th. Tutorials would typically last 3 hours, although other arrangements might be possible.

Proposals should state the topic(s) to be addressed, the rationale for addressing it and the structure of the activities. Proposals should be in English and not longer than 4 pages. Please submit proposals as soon as possible to Bob Frederking at [EMAIL PROTECTED]. Proposals must be submitted on or before Friday, April 12, 2002.
For general conference information and further details as they become available, visit: http://www.amtaweb.org/AMTA2002/

CONFERENCE ORGANIZERS

Elliott Macklovitch, General Chair
Stephen D. Richardson, Program Chair
Violetta Cavalli-Sforza, Local Arrangements Chair
Bob Frederking, Workshops and Tutorials
Laurie Gerber, Exhibits Coordinator

--
Robert E. Frederking              Email: [EMAIL PROTECTED]
Language Technologies Institute   Telephone: +1-412-268-6656
Carnegie Mellon University        FAX: +1-412-268-6298
5000 Forbes Avenue
Pittsburgh, PA 15213 USA          http://www.cs.cmu.edu/~ref/

--
For MT-List info, see http://www.eamt.org/mt-list.html
Re: [MT-List] Big software companies and MT
The ISLE framework is interesting, but what I need/want is not just a list of criteria and considerations, but the facts as they pertain to the different vendors' products. I think -that- information is a long way from being made readily available.

This is an interesting point. I have run into this problem in trying to teach our graduate MT course's lecture on "Commercial MT systems". It's very difficult to get concrete, useful information from MT vendors (other than that "ours is the best, you should buy it"). Since (it seems to me) any such information will necessarily be all wrapped up in marketing issues, I wonder how it would be possible to get straight, reliable information on a range of companies. (That is, every company clearly wants to look like they are the best.)

I know there has been some discussion of some kind of Consumer Reports for MT, but this is bound to be expensive to do (independently) in a serious way. I find myself wondering whether anything like this exists for other commercial software fields. I suspect not, actually. After all, any serious quality comparison would be damaging to Microsoft. :-)

So, is there any reliable, independent assessment of commercial software in other fields? If so, how do they manage to do it? (Or is it perhaps easier to evaluate other software compared to MT systems?)

Bob

--
Robert E. Frederking       Senior Systems Scientist
Language Technologies Institute/Center for Machine Translation
Carnegie Mellon University
5000 Forbes Avenue         Telephone: +1-412-268-6656
Pittsburgh, PA 15213 USA   FAX: +1-412-268-6298
Email: [EMAIL PROTECTED]   WWW: http://www.cs.cmu.edu/~ref/