On Wed, Apr 04, 2012 at 03:35:39PM +0200, Per Tunedal wrote:
> Hi,
> then I hope for some development on point # 2. I don't like redoing a
> job someone else has done already.
> 
> Point # 3:
> 
> I realise the @, / or # symbols are codes for different kinds of errors.
> Such info is valuable as you can concentrate your work on those entries
> after a test-drive with some representative texts, before entering
> entirely new words in the dictionaries.
> 
> I have noticed that it is possible to run Apertium in different modes,
> one "debug" mode showing codes for missing entries etc. and one "user"
> mode suppressing such codes. It would be interesting to let real users
> test a new pair early, letting them choose the mode of operation and
> report blatant errors. That would be encouraging for the developer.
> 
> Further:
> 
> 4. The similarity of Swedish and Norwegian leads me to believe that the
> transfer rules could be somewhat similar for the two languages, although
> Swedish and Danish are considered more similar. (All the same, a Swede
> has sometimes some difficulties when speaking with a Dane, as the Danish
> pronunciation appears "mushy" to a Swede. Danes often switch to
> "Scandinavian" to make themselves understood. But in writing, Swedish
> and Danish are very similar.)

"Scandinavian" means Danes trying their best at speaking Swedish:-)

> I suppose some kind of cooperation would be beneficial if I start to
> work on the pair SE - EN. Maybe splitting up the list of words needing
> transfer rules between us. Maybe I can reuse your bilingual dictionary
> (translating Norwegian to Swedish) and you can reuse entries I add to
> mine (translating Swedish to Norwegian).

We do have a sv-da module with quite some grammar encoded.

What data are you working on for Swedish? I have been looking at some data from 
Gothenburg.
SALDO: http://spraakbanken.gu.se/resurs/saldo
I have some plans/hopes for a da-en module.

I also like the idea of reusing sv and nb/nn resources and grammar.


> I might as well cooperate with the developers of the DA - EN pair, for
> the same reasons.

I am not sure there is a da-en pair.

> Maybe it would be easier to start with the SE - NO pair, building on the
> SE - DA? There are great similarities between the Norwegian bokmål
> (written language variant) and Danish. But I have no real knowledge
> of either Danish or Norwegian :-(
> 
> 5. English is quite distant from Swedish and Norwegian. Would it be more
> fruitful to use Matxin instead of Apertium? What's the difference?
> 
> Yours,
> Per Tunedal
> 
> PS As a Swede, I believe I understand Norwegian. In reality the
> similarity is sometimes deceiving and some words are completely
> different. Anyhow, Swedes and Norwegians usually understand each other.
> I found your translation excellent. My interpretation of the Norwegian
> word "bekkekanter" is "banks of small streams of water/brooks", as I
> believe the Norwegian word "bekk" is the equivalent of the Swedish
> "bäck". BTW The Norwegian intonation in spoken language gives the
> impression to a Swede that the speaker is very happy. This makes a
> somewhat comical impression when a Norwegian is communicating bad news.
> All the same, I get along perfectly well with my Norwegian neighbours.

Norwegians also appear very happy to Danes, due to their language intonation.
There is a sketch about a very depressed Norwegian visiting a Danish 
psychiatrist.
The Dane dies of laughter on hearing all the troubles from the Norwegian.

There is a list of false friends for da/sv/nb

best regards
keld

> 
> On Wed, Apr 4, 2012, at 09:16, Francis Tyers wrote:
> > On Wed, 04 Apr 2012 at 08:53 +0200, Per Tunedal wrote:
> > > Hi,
> > > I have just rapidly scanned through the documentation and have some very
> > > basic questions about building a new language pair:
> > 
> > I'll give some very brief answers below.
> > 
> > > 1. Can I reuse the already developed monolingual dictionaries for the
> > > two languages?
> > 
> > Yes
> > 
> > > 2. Is someone maintaining an updated version (i.e. joining all
> > > additions) of the most complete monolingual dictionary for each
> > > language?
> > 
> > No. This is something we would like to do, there is a GSOC idea for it:
> > 
> > http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code/Monolingual_and_bilingual_data_decoupling
> > 
> > > 3. Can the two monolingual dictionaries for a new language pair have
> > > different lengths (i.e. contain a different number of words) - relative
> > > to each other and relative to the bilingual dictionary?
> > 
> > Not currently. This would cause errors in the translation. Here is an
> > example translation from Nursery.
> > 
> > Original:
> > 
> > Planten finnes naturlig i Lilleasia og i deler av Midtøsten (Iran,
> > Irak). Den står oppført på norsk svarteliste som uønsket og er utbredt
> > langs veikanter, bekkekanter, og bakgårder i Nord-Norge, fra Finnmark og
> > sørover til Nord-Trøndelag, men den er også funnet i Sør-Norge, særlig i
> > Osloområdet. finns også i Østfold og Vestfold.
> > 
> > MT output from no-en in nursery:
> > 
> > Plant #be natural #in #Asia Minor and #in parts #of #Middle East (Iran,
> > Iraq). He stands staged on Norwegian #blacklist as undesirable and are
> > expanded #along #roadside, *bekkekanter, and @bakgård #in #Northern
> > Norway, from Finnmark and @sørover #to #Nord-Trøndelag, but he is also
> > found #in #South Norway, especially #in *Osloområdet. Finn's also #in
> > #Østfold and #Vestfold.
> > 
> > My "translation" (I don't know Norwegian):
> > 
> > The plant is found naturally in Asia Minor and in parts of the Middle
> > East (Iran, Iraq). It is currently on the Norwegian blacklist as
> > undesirable and is alongside roadsides, _?_ and back gardens in Northern
> > Norway, from Finnmark southwards to Nord-Trøndelag, but it is also found
> > in South Norway, especially in Osloområdet. It is also found in Østfold
> > and Vestfold.
> > 
> > This pair has basically been put together by myself and Unhammer in a
> > few days, hence the lack of transfer rules (#be) and bilingual
> > dictionary errors (@sørover).
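
The marker codes in the MT output above can be tallied mechanically, to see which kind of fix a test text calls for. Below is a minimal Python sketch, assuming the readings given in this thread (# for missing transfer rules, @ for bilingual dictionary errors). The reading of * as a word unknown to the source dictionary is an assumption, and `tally_markers` is a hypothetical helper, not part of Apertium.

```python
import re

# Marker readings as used in this thread; the "*" reading is an assumption.
MARKERS = {
    "*": "unknown word (assumed: missing from the source dictionary)",
    "@": "bilingual dictionary error",
    "#": "missing transfer rule",
}

# A marker only counts at the start of a token: at the start of the text,
# or right after whitespace or an opening parenthesis.
TOKEN = re.compile(r"(?:^|(?<=[\s(]))([*@#])([^\s,.;:()]+)")


def tally_markers(text):
    """Group the marked words in debug-mode MT output by marker symbol."""
    found = {marker: [] for marker in MARKERS}
    for marker, word in TOKEN.findall(text):
        found[marker].append(word)
    return found


if __name__ == "__main__":
    sample = ("expanded #along #roadside, *bekkekanter, and @bakgård "
              "#in #Northern Norway")
    for marker, words in tally_markers(sample).items():
        print(marker, MARKERS[marker], len(words), words)
```

Sorting the resulting lists by frequency over a larger test text would show which entries to work on first.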
> > 
> > > In case of a positive answer to my questions:
> > > 
> > > Does the job of creating a new language pair basically consist of "just"
> > > creating a bilingual dictionary? In that case it would be easy to start
> > > with some frequent words and let it grow over time.
> > 
> > In many cases it consists of "just" creating a bilingual dictionary, the
> > transfer rules, and performing the "vocabulary test"[1]. This can take
> > between 10 days and three months. 
> > 
> > Fran
> > 
> > 1. http://wiki.apertium.org/wiki/Testvoc
> > 
> > > Yours,
> > > Per Tunedal
> > > 
> > > PS Why is the pair sv-en presented as being in "nursery" if there are
> > > neither any files left nor any work going on?
> > 
> > No idea, it should probably be in incubator. It was probably created
> > along with the Luxembourg Workshop because we had to make skeletons for
> > all language pairs in the matrix. 
> > 
> > 
> > ------------------------------------------------------------------------------
> > Better than sec? Nothing is better than sec when it comes to
> > monitoring Big Data applications. Try Boundary one-second 
> > resolution app monitoring today. Free.
> > http://p.sf.net/sfu/Boundary-dev2dev
> > _______________________________________________
> > Apertium-stuff mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> 

