Re: [Mt-list] Travel from LREC to EAMT?

2010-03-05 Thread Robert Frederking
Thanks.  The LREC website includes an Air Malta timetable, which is 
helpful in planning.  Note that they only fly to certain cities on 
certain days!  (Indicated by numbers 1-7 in the timetable.)


   Bob

Francisco Guzman wrote:

A flight from Malta (Luqa) to Nice seems to be the shortest way.


Paco Guzman


On Fri, Mar 5, 2010 at 12:44 PM, Robert Frederking r...@cs.cmu.edu wrote:


I'm looking for the best way to go from Malta (LREC) to St.
Raphael, France (EAMT) in May, and I suspect I'm not the only one.
 If someone knows a particularly good plan, of the many options,
it would probably be helpful to others to send it to this list.
 Thanks/merci/whatever you say in Maltese.

  Bob

___
Mt-list mailing list




Re: [Mt-list] Public release of Haitian Creole language data by Carnegie Mellon

2010-01-23 Thread Robert Frederking
Well, my understanding is that, unfortunately, most companies won't 
touch anything that's under GPL, so I don't think that's a solution.  We 
don't want to exclude commercial entities.


   Bob

Francis Tyers wrote:

First of all, thanks to CMU for releasing the data. I've no doubt it
will be valuable to people working in the field.

I don't particularly like terms like "lawyerbomb" and "obnoxious
advertising clause", but this merits a response.

People who aren't paid to work on the software they develop, and who
aren't employed by big universities or companies, are understandably
concerned about getting sued. You can say "but they've never been sued
before, so why should they worry?", but this isn't really convincing.
They can also get frustrated when people make more work for themselves
and others.

* Making up your own 'free/open-source' licence: 
More work for you, more work for them.


* Choosing an existing tried and tested 'free/open-source' licence: 
Less work for you, less work for them.


Furthermore, they can also find it frustrating that a non-profit
organisation would release their work under a licence that is
incompatible with that of over 60% of free software.[1]

Fran

PS. Some of these same issues are reviewed in Ted Pedersen's excellent
2008 article:
http://www.d.umn.edu/~tpederse/Pubs/pedersen-last-word-2008.pdf

=Notes=

1. http://www.blackducksoftware.com/oss/licenses#top20

On Fri, 22 Jan 2010 at 18:29 -0500, Job M. van Zuijlen wrote:
  

Some of the verbiage used in this discussion (lawyer bomb...) doesn't
particularly encourage people to make their data freely available.
What happened to common sense?  I think CMU's initiative should be
commended.
 
Job van Zuijlen



From: Robert Frederking 
Sent: Friday, January 22, 2010 16:32
To: Francis Tyers 
Cc: mt-list@eamt.org 
Subject: Re: [Mt-list] Public release of Haitian Creole language data by Carnegie Mellon


I'm not a lawyer, but let me start by stating that our intent was
simply that re-use included acknowledgement.  This was not intended to
be a splash-screen on every start-up, or making the software pronounce
our names at the start of every sentence.  :-)  It only has to be
clearly visible in anyone's source files.

We aren't interested in suing people; we are a non-profit research
organization.  But like the Regents in California, we have a
responsibility to our sponsors that appropriate credit is given for
our work.  So this is intended to be like the old BSD advertising
clause, which is generally considered to be clear from a legal point
of view. 


Please use the data however you want; just don't say you originally
collected it.

Bob

Francis Tyers wrote: 


[ Sorry in advance for cross posting ]

I'm going over this on the debian-legal mailing list (a good place to
ask about issues in free/open-source software licensing).

There is a question about clause 5 of the licence:



##  5. Any commercial, public or published work that uses this data  ##
##     must contain a clearly visible acknowledgment as to the       ##
##     provenance of the data.                                       ##



From debian-legal:

 My concern is whether, contrary to the favourable interpretation you
 give, this is intended to act like an obnoxious advertising clause.

 In other words, what will satisfy “contain” in “contain a clearly
 visible acknowledgement”? Is it sufficient for the acknowledgement to  
 be “clearly visible” only after inspecting various files in the source
 code?

 Or is the copyright holder's intent that the acknowledgement be clearly
 visible to every recipient, even those who receive a non-source form of
 the work? The latter would be a non-free restriction, like the  
 obnoxious advertising clause in the older BSD licenses.


 This looks, as it is currently worded, more like a lawyerbomb now that 
 I consider it. I would appreciate input on this from legally-trained  
 minds.




Could you confirm if that clause means that the acknowledgement should
be _clearly visible_ to _every recipient_ or would it suffice to be
visible after inspecting the source code?

Thanks for your help in this and best regards,

Francis Tyers


On Thu, 21 Jan 2010 at 22:59 -0500, Alon Lavie wrote:
  
  

Hi Francis,

Thanks for the suggestion, but we were advised to leave the licensing 
language as is.  Our licensing language is effectively equivalent to the 
MIT license and is unambiguous with respect to releasing the data for 
any use (commercial or non-commercial).


Best regards,

- *Alon*

Francis Tyers wrote:



On Thu, 21 Jan 2010 at 14:49 -0500, Robert Frederking wrote:
  
  
  

The Language Technologies Institute (LTI

Re: [Mt-list] Public release of Haitian Creole language data by Carnegie Mellon

2010-01-22 Thread Robert Frederking
I'm not a lawyer, but let me start by stating that our intent was simply 
that re-use included acknowledgement.  This was not intended to be a 
splash-screen on every start-up, or making the software pronounce our 
names at the start of every sentence.  :-)  It only has to be clearly 
visible in anyone's source files.


We aren't interested in suing people; we are a non-profit research 
organization.  But like the Regents in California, we have a 
responsibility to our sponsors that appropriate credit is given for our 
work.  So this is intended to be like the old BSD advertising clause, 
which is generally considered to be clear from a legal point of view.


Please use the data however you want; just don't say you originally 
collected it.


   Bob

Francis Tyers wrote:

[ Sorry in advance for cross posting ]

I'm going over this on the debian-legal mailing list (a good place to
ask about issues in free/open-source software licensing).

There is a question about clause 5 of the licence:



##  5. Any commercial, public or published work that uses this data  ##
##     must contain a clearly visible acknowledgment as to the       ##
##     provenance of the data.                                       ##



From debian-legal:

 My concern is whether, contrary to the favourable interpretation you
 give, this is intended to act like an obnoxious advertising clause.

 In other words, what will satisfy “contain” in “contain a clearly
 visible acknowledgement”? Is it sufficient for the acknowledgement to  
 be “clearly visible” only after inspecting various files in the source
 code?

 Or is the copyright holder's intent that the acknowledgement be clearly
 visible to every recipient, even those who receive a non-source form of
 the work? The latter would be a non-free restriction, like the  
 obnoxious advertising clause in the older BSD licenses.


 This looks, as it is currently worded, more like a lawyerbomb now that 
 I consider it. I would appreciate input on this from legally-trained  
 minds.




Could you confirm if that clause means that the acknowledgement should
be _clearly visible_ to _every recipient_ or would it suffice to be
visible after inspecting the source code?

Thanks for your help in this and best regards,

Francis Tyers


On Thu, 21 Jan 2010 at 22:59 -0500, Alon Lavie wrote:
  

Hi Francis,

Thanks for the suggestion, but we were advised to leave the licensing 
language as is.  Our licensing language is effectively equivalent to the 
MIT license and is unambiguous with respect to releasing the data for 
any use (commercial or non-commercial).


Best regards,

- *Alon*

Francis Tyers wrote:


On Thu, 21 Jan 2010 at 14:49 -0500, Robert Frederking wrote:
  
  

The Language Technologies Institute (LTI) of Carnegie Mellon University's
School of Computer Science (CMU SCS) is making publicly available the
Haitian Creole spoken and text data that we have collected or produced. We
are providing this data with minimal restrictions in order to
allow others to develop language technology for Haiti, in parallel with our
own efforts to help with this crisis. Since organizing the data in a useful
fashion is not instantaneous, and more text data is currently being produced
by collaborators, we will be publishing the data incrementally on the web,
as it becomes available.  To access the currently available data, please
visit the website at  http://www.speech.cs.cmu.edu/haitian/



Would you consider also dual/triple licensing the data under an existing
free software licence, such as the MIT licence[1] or the GNU GPL[2] ?
This way it could be combined with existing data under these licences
(e.g. the majority of free/open-source software) and researchers and
developers don't need to hire legal advice to determine if they can
combine their work with yours.

Best regards, 


Fran

1. http://en.wikipedia.org/wiki/MIT_Licence#License_terms
2. http://www.gnu.org/licenses/gpl.html


[Mt-list] Public release of Haitian Creole language data by Carnegie Mellon

2010-01-21 Thread Robert Frederking

The Language Technologies Institute (LTI) of Carnegie Mellon University's
School of Computer Science (CMU SCS) is making publicly available the
Haitian Creole spoken and text data that we have collected or produced. We
are providing this data with minimal restrictions in order to
allow others to develop language technology for Haiti, in parallel with our
own efforts to help with this crisis. Since organizing the data in a useful
fashion is not instantaneous, and more text data is currently being produced
by collaborators, we will be publishing the data incrementally on the web,
as it becomes available.  To access the currently available data, please
visit the website at  http://www.speech.cs.cmu.edu/haitian/



Re: [Mt-list] Public release of Haitian Creole language data by Carnegie Mellon

2010-01-21 Thread Robert Frederking

Hi Vadim,
   Yes, French is the principal written language, but most of the 
population only speaks Creole (and is illiterate).  We ourselves are 
indeed looking at making speech-based systems (and the rarest part of 
the data may be the speech data).  There may also be unforeseen benefits 
to the data being available.  For example, it appears that Doctors 
Without Borders (Médecins Sans Frontières) may make use of the bilingual 
medical phrases as-is, through Translators Without Borders (Traducteurs 
sans Frontières).  So who knows how this may help.  Cheers.


   Bob
Vadim Berman wrote:

Hi Robert,

These are commendable efforts, but isn't French the principal written 
language in Haiti? Or are you talking about a speech-to-speech system?


Best regards,
Vadim

- Original Message -
From: Robert Frederking r...@cs.cmu.edu
To: mt_l...@nist.gov; mt-list@eamt.org
Sent: Friday, January 22, 2010 6:49 AM
Subject: [Mt-list] Public release of Haitian Creole language data by Carnegie Mellon



The Language Technologies Institute (LTI) of Carnegie Mellon University's
School of Computer Science (CMU SCS) is making publicly available the
Haitian Creole spoken and text data that we have collected or produced. We
are providing this data with minimal restrictions in order to
allow others to develop language technology for Haiti, in parallel with our
own efforts to help with this crisis. Since organizing the data in a useful
fashion is not instantaneous, and more text data is currently being produced
by collaborators, we will be publishing the data incrementally on the web,
as it becomes available.  To access the currently available data, please
visit the website at  http://www.speech.cs.cmu.edu/haitian/



Re: [Mt-list] Computers translation in Africa / involving African languages

2006-08-24 Thread Robert Frederking

If Arabic counts, there's much work in the US on Arabic these days.

Don Osborn wrote:

I have updated a very modest presentation of some info relevant to MT in
Africa at http://www.bisharat.net/Trans/ . There is not much there, so I
would like to request information/recommendations for other links relating
to MT in Africa and in African languages wherever. (The page also needs some
reworking, but I'm mainly concerned now with content.)

TIA

Don Osborn
Bisharat.net
PanAfrican Localisation project




[MT-List] AMTA-2002: Call for Tutorial and Workshop Proposals

2002-02-23 Thread Robert Frederking

Please circulate as widely as possible:


        --- CALL FOR TUTORIAL AND WORKSHOP PROPOSALS ---

    The Association for Machine Translation in the Americas
                     AMTA-2002 Conference

                      Tiburon, California
                      (near San Francisco)
                       October 8-12, 2002


Conference theme: FROM RESEARCH TO REAL USERS

Ever since the showdown between Empiricists and Rationalists a decade
ago at TMI-92, MT researchers have hotly pursued promising paradigms
for MT, including data-driven approaches and hybrids that integrate
these with more traditional rule-based components.  During the same
period, commercial MT systems with standard transfer architectures
have evolved along a parallel and almost unrelated track, increasing
their coverage and achieving much broader acceptance and usage.  This
raises a number of interesting questions (see the main conference Call
For Participation), primarily concerned with why this disconnect
exists, and whether it is going to change.


TUTORIAL AND WORKSHOP PROPOSAL SUBMISSIONS

Proposals for tutorials and workshops are now being solicited on these
and other topics of direct interest and impact for MT researchers,
developers, vendors or users of MT technologies.  We welcome and
encourage participation by members of AMTA's sister organizations,
AAMT in Asia and EAMT in Europe, as well.

Workshops will be held on Tuesday October 8th.  Approximately 7 hours
may be allocated per workshop.  
Tutorials will be held on Wednesday October 9th.  Tutorials would
typically last 3 hours, although other arrangements might be possible.

Proposals should state the topic(s) to be addressed, the rationale for
addressing them, and the structure of the activities.  Proposals should
be in English and not longer than 4 pages.

Please submit proposals as soon as possible to Bob Frederking at
[EMAIL PROTECTED].  Proposals must be submitted on or before Friday,
April 12, 2002.

For general conference information and further details
as they become available, visit:
http://www.amtaweb.org/AMTA2002/


CONFERENCE ORGANIZERS

Elliott Macklovitch, General Chair

Stephen D. Richardson, Program Chair

Violetta Cavalli-Sforza, Local Arrangements Chair

Bob Frederking, Workshops and Tutorials

Laurie Gerber, Exhibits Coordinator


--
Robert E. Frederking              Email: [EMAIL PROTECTED]
Language Technologies Institute   Telephone: +1-412-268-6656
Carnegie Mellon University        FAX: +1-412-268-6298
5000 Forbes Avenue
Pittsburgh, PA 15213  USA         http://www.cs.cmu.edu/~ref/



-- 
  For MT-List info, see http://www.eamt.org/mt-list.html



Re: [MT-List] Big software companies and MT

2001-02-19 Thread Robert Frederking

 The ISLE framework is interesting, but what I need/want is not just a
 list of criteria and considerations, but the facts as they pertain to
 the different vendors' products. I think -that- information is a long
 way from being made readily available.

This is an interesting point.  I have run into this problem in trying
to teach our graduate MT course's lecture on "Commercial MT systems".
It's very difficult to get concrete, useful information from MT
vendors (other than that "ours is the best, you should buy it").

Since (it seems to me) any such information will necessarily be all
wrapped up in marketing issues, I wonder how it would be possible to
get straight, reliable information on a range of companies.  (That is,
every company clearly wants to look like they are the best.)  I know
there has been some discussion of some kind of Consumer Reports for
MT, but this is bound to be expensive to do (independently) in a
serious way.

I find myself wondering whether anything like this exists for other
commercial software fields.  I suspect not, actually.  After all, any
serious quality comparison would be damaging to Microsoft.  :-)

So, is there any reliable, independent assessment of commercial
software in other fields?  If so, how do they manage to do it?  (Or is
it perhaps easier to evaluate other software compared to MT systems?)

Bob
--
Robert E. Frederking                Senior Systems Scientist
Language Technologies Institute/Center for Machine Translation
Carnegie Mellon University
5000 Forbes Avenue                  Telephone: +1-412-268-6656
Pittsburgh, PA 15213  USA           FAX: +1-412-268-6298
Email: [EMAIL PROTECTED]            WWW: http://www.cs.cmu.edu/~ref/
