Re: [Mt-list] Computers translation in Africa / involving African languages

2006-08-25 Thread timo . honkela
Professor Arvi Hurskainen at University of Helsinki has conducted 
a lot of research on language technology for Bantu languages
and especially Swahili. Here are some links related to that research:

- Helsinki Corpus of Swahili:
  http://www.aakkl.helsinki.fi/cameel/corpus/intro.htm

- SALAMA - Swahili Language Manager
  http://www.njas.helsinki.fi/salama/
  (includes a project on Swahili-to-English Machine Translation)

- PhD thesis by Wanjiku Ng'ang'a:
  Word Sense Disambiguation of Swahili
  http://ethesis.helsinki.fi/julkaisut/hum/aasia/vk/nganga/

Best regards,
Timo Honkela


On Thu, 24 Aug 2006, Don Osborn wrote:

 I have updated a very modest presentation of some info relevant to MT in
 Africa at http://www.bisharat.net/Trans/ . There is not much there, so I
 would like to request information/recommendations for other links relating
 to MT in Africa and in African languages wherever. (The page also needs some
 reworking, but I'm mainly concerned now with content.)
 
 TIA
 
 Don Osborn
 Bisharat.net
 PanAfrican Localisation project
 ___
 Mt-list mailing list


--
Timo Honkela, Chief Research Scientist, PhD, Docent
Adaptive Informatics Research Center
Laboratory of Computer and Information Science
Helsinki University of Technology
P.O.Box 5400, FI-02015 TKK

timo.honkela at tkk.fi,  http://www.cis.hut.fi/tho/



Re: [Mt-list] Computers translation in Africa / involving African languages

2006-08-25 Thread Steve Helmreich
There are also some resources being developed for Amharic (in 
conjunction with Daniel Yacob) at New Mexico State University:


http://crl.nmsu.edu/say

Steve Helmreich
Computing Research Laboratory
New Mexico State University
Las Cruces, NM 88003 USA


On Thu, 24 Aug 2006, Don Osborn wrote:

 


I have updated a very modest presentation of some info relevant to MT in
Africa at http://www.bisharat.net/Trans/ . There is not much there, so I
would like to request information/recommendations for other links relating
to MT in Africa and in African languages wherever. (The page also needs some
reworking, but I'm mainly concerned now with content.)

TIA

Don Osborn
Bisharat.net
PanAfrican Localisation project


[OT] Terminology (RE: [Mt-list] NLP for (African) pi-languages, not minority languages)

2006-08-25 Thread Don Osborn
Thanks to all who have responded in this thread. I will follow up offline.

Re the question of terminology, and minority languages in particular, here
are a few quick thoughts (with apologies for taking this off on a tangent):

1. I hadn't thought of minority being offensive, but I guess we need to be
attentive to such matters. The main problem with the term I saw was its
imprecision. There was not long ago a project to compile information on
minority languages. To the surprise of a few people asked about it,
including me, Hausa was one of them (next to Swahili it supposedly has the
highest speakership of all African languages). But when we discussed it
further, the criteria indeed seemed to admit it: In Hausaland across much
of Niger and Nigeria it is the main language, but Hausaphones are minorities
elsewhere and it is spoken as a trade language by some people further away.
However, by extension, then, just about every other language in Africa is
minority as well. What capped it was discovering that Chinese also
qualified as a minority language - which it is in fact in many countries,
though we wouldn't think to call it, or Spanish or English, etc., one. As Francis
puts it, situational minority languages. But that just shows how dependent
the term is on context.

2. So people grope for an appropriate term. For more widely spoken
languages, LWC for language of wider communication emerged at some point
(rather like lingua francas, but let's not try to sort out the difference
between those two here). And at the other extreme there are endangered
languages about which, although definitions can vary, there is a generally
accepted sense of what it means (though even on that I've read references to
Igbo, a language spoken by somewhere on the order of 20 million people,
being described as endangered - but let's not delve into the issues there
either). But in between those two what do you say? Small languages as
shorthand for less widely spoken languages are more appropriately spoken
of as the latter - but that's too cumbersome. In Europe there was the term
lesser-used languages but with uncertain implications - fewer people speak
them, or those that do use them less, or both? Local languages is one that
I've tried to avoid lately because it seems to me to be used in a way that
reduces the languages' status, and is applied only in some parts of the world
(and what of local when you have, say, Wolof-speaking merchants in New
York and Paris, for instance?). In Francophone countries the term langue
partenaire has been coined, but that raises questions of what kind of
partnership, and who's partner with whom and why, and so on.

3. A lot depends of course on context. Under-resourced languages is very
descriptive for ICT contexts and even some traditional technologies (e.g.,
no textbooks in so many less-widely-spoken-languages for the better part of
the past century - now that's under-resourced). But maybe not in demographic
or sociolinguistic contexts. Just for an example, Fula definitely is
under-resourced in the technical and monetary sense, but definitely not
linguistically (e.g., its lexicon is staggering - there's a large dictionary
of the roots alone). Less commonly taught languages (LCTLs) is purely an
academic reference. Pi-language is a new one on me but seems to be mainly
a technical reference (pi=poorly informatisées or what?).

4. I ran into this problem personally when I wanted a way to refer to a very
wide class of languages not counting the LWCs as LWCs, and came up with an
acronym that I think covers the intended field and is in itself
constructively ambiguous: MINEL - where M is maternal (which is every
language, but here the emphasis is on this role as opposed to the 2nd
language role) or minority (sorry!); I is indigenous (which also can mean
anything, but here meant in the sense of languages of indigenous peoples);
N is national which is an appellation more common in Francophone countries
especially in Africa and is *not* the same as official; E is endangered,
or ethnic which one will hear with regard to languages in some parts of
the world (funny that a language might be referred to as ethnic and not
indigenous or vice versa, but the criteria for the distinction are arguable);
and L could be less-widely-spoken or even local or, well, language.

That about runs the gamut, from what I have. Hope all have a good weekend
(some of you are in the midst of it and others just starting, and some of us
will work through it either way!).

Don Osborn








Re: [OT] Terminology (RE: [Mt-list] NLP for (African) pi-languages, not minority languages)

2006-08-25 Thread Francis Tyers

 though we wouldn't think to call it, or Spanish or English, etc. As Francis
 puts it, situational minority languages. But that just shows how dependent
 the term is on context.

To avoid taking credit for this, I took the term from Peter Trudgill's
paper, "Ausbau sociolinguistics and the perception of language status in
contemporary Europe", Int. J. App. Ling. 1992.

:)

Fran



[Mt-list] Open positions at SYSTRAN

2006-08-25 Thread Jeff Allen
Dear listers,

SYSTRAN is seeking to fill several open positions in Paris, France and San
Diego, California, USA.

I am involved in seeking candidates to fill the following 2 Paris-based
positions as soon as possible:
* Quality Engineering Manager
* Technical Customer Support Engineer

A few other Paris-based Software Engineer positions are also open.

Several Linguist/Lexicographer positions for various languages are also
available at both locations.

A full set of job descriptions and relevant information is available at:
http://www.systran.fr/company/careers/

http://www.systransoft.com/company/careers/

Interested candidates should fill out the form on the web site (click on the
relevant position and fill in all fields, INCLUDING the message field -- please
note the attachment format restrictions and size limit). You can send me a copy,
but please also apply through the web site.

Regards,

Jeff Allen, PhD, certified ISO 9001:2000 Quality Auditor
EMEA Director of Support & Professional Services
SYSTRAN, Paris, France
http://www.systran.fr/ or http://www.systransoft.com/