I have built a few systems using the toolkits you mention, and my
language pairs are quite "unusual".
My first system was a Slovenian to English translator based on 30,000
parallel sentences.

I produced a few systems based on this language pair using different
corpora or selected parts of the main corpus.

My current area of interest is translation between similar languages. I
produced three new systems using Orwell's novel 1984; the language pairs
are:
Slovenian - Czech
Slovenian - Serbian
Slovenian - English

All three systems are based on one multilingual corpus of 6000+ aligned
sentences, so I can compare the results directly.

Most of the systems are available online at the address:

http://www.pef.upr.si/~jernej/
Click on each Menola word to open the interface of the corresponding
system.

In my opinion, your corpus is somewhat too small, but one can always
try...

On Tue, 30 Mar 2004, paul johnston wrote:

> Just wondering how many people have built SMT systems using the CMU-Toolkit,
> Giza and the ISI Decoder and what were the sizes of their language and
> translation models.
> I've put together an Estonian to English system using the BNC as the
> language model and so far 1500 pairs of parallel sentences.
> I would be especially interested to hear from people using more unusual
> language pairs.
> Thanks in advance
> Paul Johnston (UMIST)
> 
> 
> _______________________________________________
> MT-List mailing list
> [EMAIL PROTECTED]
> http://www.computing.dcu.ie/mailman/listinfo/mt-list
> 

