On Wed, Apr 04, 2012 at 03:35:39PM +0200, Per Tunedal wrote:
> Hi,
> then I hope for some development on point # 2. I don't like redoing a
> job someone else has done already.
>
> Point # 3:
>
> I realise the @, / or # symbols are codes for different kinds of errors.
> Such info is valuable, as you can concentrate your work on those entries
> after a test-drive with some representative texts, before entering
> entirely new words in the dictionaries.
>
> I have noticed that it is possible to run Apertium in different modes,
> one "debug" mode showing codes for missing entries etc. and one "user"
> mode suppressing such codes. It would be interesting to let real users
> test a new pair early, letting them choose the mode of operation and
> report blatant errors. That would be encouraging for the developer.
>
> Further:
>
> 4. The similarity of Swedish and Norwegian leads me to believe that the
> transfer rules could be somewhat similar for the two languages, although
> Swedish and Danish are considered more similar. (All the same, a Swede
> sometimes has some difficulty when speaking with a Dane, as the Danish
> pronunciation appears "mushy" to a Swede. Danes often switch to
> "Scandinavian" to make themselves understood. But in writing, Swedish
> and Danish are very similar.)
"Scandinavian" means Danes trying their best at speaking Swedish:-) > I suppose some kind of cooperation would be beneficial if I start to > work on the pair SE - EN. Maybe splitting up the list of words needing > transfer rules between us. Maybe I can reuse your bilingual dictionary > (translating Norwegian to Swedish) and you can reuse entries I add to > mine (translating Swedish to Norwegian). We do have a sv-da module with quite some grammar encoded. What data are you working on for Swedish? I have been looking at some data from Gothenburg. SALDO: http://spraakbanken.gu.se/resurs/saldo I have some plans/hopes for a da-en module. I also like the idea of reusing sv and nb/nn resources and grammar. > I might as well cooperate with the developers of the DA - EN pair, by > the same reasons. I am not sure there is a da-en pair. > Maybe it would be easier to start with the SE - NO pair, building on the > SE - DA? There are great similarities between the Norwegian bokm??l > (written language variant) and Danish. But I haven't any real knowledge > of neither Danish, nor of Norwegian :-( > > 5. English is quite distant from Swedish and Norwegian. Would it be more > fruitful to use Matxin instead of Apertium? What's the difference? > > Yours, > Per Tunedal > > PS As a Swede, I believe I understand Norwegian. In reality the > similarity is sometimes deceiving and some words are completely > different. Anyhow, Swedes and Norwegians usually understand each-other. > I found your translation excellent. My interpretation of the Norwegian > word "bekkekanter" is "banks of small streams of water/brooks", as I > believe the Norwegian word "bekk" is the equivalent of the Swedish > "b??ck". BTW The Norwegian intonation in spoken language gives the > impression to a Swede that the speaker is very happy. This makes a > somewhat comical impression when a Norwegian is communicating bad news. > All the same, I get along perfectly well with my Norwegian neighbours. 

Norwegians also appear very happy to Danes, due to their intonation.
There is a sketch about a very depressed Norwegian visiting a Danish
psychiatrist. The Dane dies of laughter on hearing all the Norwegian's
troubles.

There is a list of false friends for da/sv/nb

best regards
keld

> On Wed, Apr 4, 2012, at 09:16, Francis Tyers wrote:
> > On Wed, 04 Apr 2012 at 08:53 +0200, Per Tunedal wrote:
> > > Hi,
> > > I have just rapidly scanned through the documentation and have some very
> > > basic questions about building a new language pair:
> >
> > I'll give some very brief answers below.
> >
> > > 1. Can I reuse the already developed monolingual dictionaries for the
> > > two languages?
> >
> > Yes
> >
> > > 2. Is someone maintaining an updated version (i.e. joining all
> > > additions) of the most complete monolingual dictionary for each
> > > language?
> >
> > No. This is something we would like to do, there is a GSOC idea for it:
> >
> > http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code/Monolingual_and_bilingual_data_decoupling
> >
> > > 3. Can the two monolingual dictionaries for a new language pair have
> > > different lengths (i.e. contain a different number of words) - relative
> > > to each other and to the bilingual dictionary?
> >
> > Not currently. This would cause errors in the translation. Here is an
> > example translation from Nursery.
> >
> > Original:
> >
> > Planten finnes naturlig i Lilleasia og i deler av Midtøsten (Iran,
> > Irak). Den står oppført på norsk svarteliste som uønsket og er utbredt
> > langs veikanter, bekkekanter, og bakgårder i Nord-Norge, fra Finnmark og
> > sørover til Nord-Trøndelag, men den er også funnet i Sør-Norge, særlig
> > i Osloområdet. finns også i Østfold og Vestfold.
> >
> > MT output from no-en in nursery:
> >
> > Plant #be natural #in #Asia Minor and #in parts #of #Middle East (Iran,
> > Iraq).
> > He stands staged on Norwegian #blacklist as undesirable and are
> > expanded #along #roadside, *bekkekanter, and @bakgård #in #Northern
> > Norway, from Finnmark and @sørover #to #Nord-Trøndelag, but he is also
> > found #in #South Norway, especially #in *Osloområdet. Finn's also #in
> > #Østfold and #Vestfold.
> >
> > My "translation" (I don't know Norwegian):
> >
> > The plant is found naturally in Asia Minor and in parts of the Middle
> > East (Iran, Iraq). It is currently on the Norwegian blacklist as
> > undesirable and is alongside roadsides, _?_ and back gardens in Northern
> > Norway, from Finnmark southwards to Nord-Trøndelag, but it is also found
> > in South Norway, especially in Osloområdet. It is also found in Østfold
> > and Vestfold.
> >
> > This pair has basically been put together by myself and Unhammer in a
> > few days, hence the lack of transfer rules (#be) and bilingual
> > dictionary errors (@sørover).
> >
> > > In case of a positive answer to my questions:
> > >
> > > Does the job of creating a new language pair basically consist of "just"
> > > creating a bilingual dictionary? In that case it would be easy to start
> > > with some frequent words and let it grow over time.
> >
> > In many cases it consists of "just" creating a bilingual dictionary, the
> > transfer rules, and performing the "vocabulary test"[1]. This can take
> > between 10 days and three months.
> >
> > Fran
> >
> > 1. http://wiki.apertium.org/wiki/Testvoc
> >
> > > Yours,
> > > Per Tunedal
> > >
> > > PS Why is the pair sv-en presented as being in "nursery" if there are
> > > neither any files left nor any work going on?
> >
> > No idea, it should probably be in incubator. It was probably created
> > along with the Luxembourg Workshop because we had to make skeletons for
> > all language pairs in the matrix.
> >
> >
> > ------------------------------------------------------------------------------
> > Better than sec?
> > Nothing is better than sec when it comes to
> > monitoring Big Data applications. Try Boundary one-second
> > resolution app monitoring today. Free.
> > http://p.sf.net/sfu/Boundary-dev2dev
> > _______________________________________________
> > Apertium-stuff mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/apertium-stuff
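
Regarding point 3 in the thread (using the debug marks to decide where to work first), here is a minimal Python sketch that tallies marked words in a piece of MT output. It is not part of Apertium; the function name `tally_marks` and the sample string are made up for illustration. The meanings given for `*`, `@` and `#` are the usual reading of Apertium's debug marks and should be treated as an assumption here. Note that Fran's message attributes the `#be` case to missing transfer rules.

```python
import re
from collections import Counter

# Assumed meanings of the debug marks (illustrative, not taken from the thread):
MARKS = {
    "*": "unknown word (not in the source monolingual dictionary)",
    "@": "no usable entry in the bilingual dictionary",
    "#": "target-language form could not be generated",
}

# A mark counts only at the start of a whitespace-delimited token.
TOKEN = re.compile(r"(?<!\S)([*@#])(\S+)")


def tally_marks(text):
    """Return (counts per mark, words per mark) for one chunk of MT output."""
    counts = Counter()
    words = {mark: [] for mark in MARKS}
    for mark, word in TOKEN.findall(text):
        word = word.rstrip(".,;:)")  # drop trailing punctuation
        counts[mark] += 1
        words[mark].append(word)
    return counts, words


# Hypothetical sample, adapted from the no-en output quoted in the thread.
sample = (
    "expanded #along #roadside, *bekkekanter, and @bakgård #in "
    "#Northern Norway, from Finnmark and @sørover #to #Nord-Trøndelag"
)

counts, words = tally_marks(sample)
for mark, n in counts.most_common():
    print(mark, n, MARKS[mark], words[mark])
```

Run over a representative text, a tally like this gives a rough priority list: the `@` words are candidates for the bilingual dictionary, the `*` words for the monolingual one.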
