Re: [Apertium-stuff] Automatically change first-person to third-person

2022-02-15 Thread Per Tunedal
Hi Kevin,
your new solution with rtx would be of interest. I've attached your old Python 
script.

Your answer from 2018:
$ curl http://termbin.com/sah4 >pres2pret.py

$ curl https://raw.githubusercontent.com/goavki/streamparser/master/streamparser.py >streamparser.py

$ sudo apt install apertium-nno

$ echo 'ååh hald no opp, eg gjer det etter kvart.'  |\
apertium-deshtml -n |\
lt-proc /usr/share/apertium/apertium-nno/nno.automorf.bin   |\
cg-proc -1 /usr/share/apertium/apertium-nno/nno.rlx.bin |\
python3 pres2pret.py|\
lt-proc -g /usr/share/apertium/apertium-nno/nno.autogen.bin |\
apertium-rehtml-noent 


which gives the (rather ungrammatical) answer "ååh hald no opp, eg
gjorde det etter kvart".
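
The tag-rewriting step in the middle of that pipeline can also be sketched without streamparser. This is only a minimal sketch under assumptions: the helper name swap_tag and the sample analyses are made up for illustration, and a real rewrite from first person to third person would also have to change the pronoun lemma and gender, which is not attempted here.

```python
import re

# One lexical unit in the Apertium stream format is written ^...$.
LEXICAL_UNIT = re.compile(r"\^([^$]*)\$")


def swap_tag(stream, required, old, new):
    """Replace tag `old` with `new` in every unit that also carries `required`."""
    def rewrite(match):
        unit = match.group(1)
        if f"<{required}>" in unit and f"<{old}>" in unit:
            unit = unit.replace(f"<{old}>", f"<{new}>")
        return "^" + unit + "$"
    return LEXICAL_UNIT.sub(rewrite, stream)


# Hypothetical disambiguated input; real tag sequences may differ.
analysed = "^eg<prn><p1><sg>$ ^gjere<vblex><pres>$ ^det<prn><p3><sg>$"
print(swap_tag(analysed, "vblex", "pres", "pret"))
print(swap_tag(analysed, "prn", "p1", "p3"))
```

Text outside the ^...$ units is left untouched, so blanks and punctuation survive the rewrite.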

The link once again:
https://sourceforge.net/p/apertium/mailman/apertium-stuff/thread/1519736195.3991384.1284992528.191E054E%40webmail.messagingengine.com/#msg36238830

Or search for "change of tense" in the list archives. The original subject was:
Automatic change of tense
-- 
  Vänligen
  Per Tunedal

On Mon, Feb 14, 2022, at 20:10, Kevin Brubeck Unhammer wrote:
>> The link in that earlier email is dead, so I can't see what the original
>> script was doing, but based on the name it might have just been replacing
>>  with , in which case, if you still have that script, you could
>> just edit it to replace  with .
>
> Wops, I should've attached it …
>
> These days I think I'd use rtx for this, probably would be an even
> shorter file =D
>
> ___
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>
> Attachments:
> * signature.asc

#!/usr/bin/env python3

import sys
from streamparser import parse_file, readingToString, known, SReading

for blank, lu in parse_file(sys.stdin, withText=True):
    # Collect the sub-readings analysed as present-tense verbs.
    pres = [s for r in lu.readings for s in r if s.tags == ['vblex', 'pres']]
    if pres != []:
        # Rebuild them with the preterite tag so the generator emits past tense.
        pret = [SReading(baseform=s.baseform, tags=['vblex', 'pret'])
                for s in pres]
        print(blank+" ".join("^{}$".format(readingToString(pret))
                             for r in lu.readings),
              end="")
    else:
        # Everything else is wrapped in [ ] so the generator leaves it alone.
        print(blank+"["+lu.wordform+"]",
              end="")


Re: [Apertium-stuff] Automatically change first-person to third-person

2022-02-14 Thread Per Tunedal
Hi again,
Oops, I happened to take the wrong book from the bookshelf. I should have 
taken one of those a bit further to the left :-(
The examples are from the works of another author, Sun Axelsson, not Majgull 
Axelsson.
But the question is still valid. And you could still make Majgull happy on her 
birthday.
Yours,
Per Tunedal

-- 
  Vänligen
  Per Tunedal



[Apertium-stuff] Automatically change first-person to third-person

2022-02-14 Thread Per Tunedal
Hi,
Today is the birthday of the Swedish author Majgull Axelsson. I've just read 
an interview with her in my morning paper, Dagens Nyheter. She talks about her 
work on a new novel. She has just decided to change the narrative perspective 
from first person to third person (singular) and wishes for some help:
"Jag önskar mig innerligt i födelsedagspresent, ett datorprogram som kan ändra 
tempus och berättarperspektiv väldigt enkelt, skämtar hon."

My translation:
"What I dearly wish for as a birthday present is a computer program that can 
change the tense and the narrative perspective very easily, she jokes."

I have already asked about changing the tense in the past, so that part is doable.
https://sourceforge.net/p/apertium/mailman/apertium-stuff/thread/1519736195.3991384.1284992528.191E054E%40webmail.messagingengine.com/#msg36238830

She would like to change the perspective as well, something like:

Example 1: 
"Jag länktade hem och lyfte glaset med en sliskig mintlikör till munnen. Jag 
svalde klunk efter klunk och tog klumparna jag svalde för sockeravlagringar. 
Tills jag såg att det var flugor."
(Her book "Honungsvargar", page 19)

To:
"Hon länktade hem och lyfte glaset med en sliskig mintlikör till munnen. Hon 
svalde klunk efter klunk och tog klumparna hon svalde för sockeravlagringar. 
Tills hon såg att det var flugor. ..." (Narrator changed from first person to 
third person)

or:

"Hon länktar hem och lyfter glaset med en sliskig mintlikör till munnen. Hon 
sväljer klunk efter klunk och tar klumparna hon sväljer för sockeravlagringar. 
Tills hon ser att det är flugor. ..." (Narrator changed from first person to 
third person and from past tense to present tense)


Example 2:
"Mig, skrev han, hade han aldrig älskat."
("Honungsvargar", page 23)

To:
"Henne, skrev han, hade han aldrig älskat." (Narrator changed from first person 
to third person)

Maybe the Apertium community could make her wish come true?

Yours,
Per Tunedal

BTW, a more fundamental change of narrative perspective would be harder, like 
changing the narrator from the woman to the man in this text. I don't think 
it's possible.

The first example would become something like:
"Han såg henne lyfta glaset med den sliskiga mintlikören till munnen. Hon svalde 
klunk efter klunk. Plötsligt stelnade hon till och stirrade ned i glaset. 
'Flugor', skrek hon."




Re: [Apertium-stuff] Semantics in Apertium (was Apertium's Wider Use & Secondary Tags)

2020-06-16 Thread Per Tunedal
Hi all,
I liked your examples Hector.

1. Synonyms might help with a problem in Swedish. As I've mentioned in the 
past, nouns of t-gender (neuter) in the singular indefinite form cannot be 
combined with adjectives that end in the letter "t" or the letter "d". That 
form is not used because it can neither be pronounced nor written. It is usually 
marked as "nonexistent" or "not used" in Swedish dictionaries and grammars.

Some examples:
A lion cannot be afraid! "ett (impossible form of rädd) lejon"
But two lions can: "två rädda lejon"

The same applies to e.g. ghosts (spöken) and children (barn). Normally n-gender 
(utrum) is used for animate beings in Swedish, but some words have for 
some reason got the "wrong" gender. When encountering such words you have to 
replace the adjective with a synonym (or rephrase).

2. Genre might be useful for word selection in some cases. In the past I began 
adding info on genre in the Swedish wordlist, for future use.

When choosing a good synonym for "rädd" (afraid) as above, you don't have any 
exact match. Synonyms are e.g. "skrajsen" (have got the wind up), genre fam = 
colloquial/casual/informal (familier), or "skräckslagen" 
(terrified/terror-struck), genre neu = neutral (neutre) or maybe a bit formal; 
"rädd" is much more common.
(BTW the connotations differ between "rädd" and "skräckslagen"; the latter is 
stronger ...)

I used the following genres, inspired by Le Petit Robert, the Oxford Advanced 
Learner's Dictionary and Bonniers svenska ordbok:

neu = neutral (neutre)
sol = solemn (solennel)
fam = colloquial/casual/informal (familier)
pej = depreciatory/pejorative (dénigrant/péjoratif)
vulg = vulgar (vulgaire)
old = old-fashioned (vieilli/archaïque)
dial = dialectal (dialectal)

It might be a good idea to agree on what genres to use, and apply it for all 
languages.

3. In the past I began adding domain info as well in the Swedish wordlist. I 
hoped it might be useful for word selection.
I used e.g. c="domain:general style:fam" in the -tag, as proposed by 
Francis. I haven't got any opinion on the best way to add the info, I'm just 
eager to have the possibility. And a possibility to use the info.
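
The comment string quoted above is a space-separated list of key:value pairs, so it is easy to read back into a structure. A minimal sketch, assuming exactly that layout; parse_labels is a hypothetical helper, not something that exists in Apertium:

```python
def parse_labels(comment):
    """Split a comment such as 'domain:general style:fam' into a dict."""
    labels = {}
    for item in comment.split():
        if ":" not in item:
            continue  # skip free-text words that are not key:value pairs
        key, value = item.split(":", 1)
        labels[key] = value
    return labels


labels = parse_labels("domain:general style:fam")
print(labels["domain"], labels["style"])
```

A word selection step could then compare labels["style"] with the genre wanted for the output text.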

It might be a good idea to agree on domains, as well.

Yours,
Per Tunedal

On Mon, Jun 15, 2020, at 18:38, Hèctor Alòs i Font wrote:
> Here come several practical examples. I tried to select them for their 
> variety. The result is more a wish list than something structured.
> 
> Let's begin with "je la baise". Depending on the context this may be "I kiss 
> her" or "I fuck her". The context can tell us if we are in a formal or 
> colloquial type of language. Another issue is that in this case the anaphora 
> resolution can also help us: if the pronoun reference is "hand", it can only 
> be "kiss"; if it is a person, the doubt persists.
> 
> Another kind of problem is the Arpitan words "chamô" ("camel"; plural 
> "camels") and "chamôs" ("chamois"; unchanged in plural). So, translating into 
> French, yesterday I got chamois in a Bible text of Exodus xD I solved it by 
> deciding in a CG rule that all "chamôs" (without anything around in singular) 
> are camels. (Similar cases in French: fil/fils, foi/fois, cour/cours)
> 
> In French there are plenty of words with different meanings, depending on the 
> gender: livre, page, tour, etc. The problem is that often the immediate 
> surrounding context does not disambiguate: des livres, les pages, de tour, 
> etc. A similar but slightly different case is the word pairs homicide 
> mf/homicide m, féminicide mf/féminicide m, parricide mf/parricide, etc.: the 
> one with the gender "mf" is a person and the other is the action.
> 
> Other problems come in lexical selection. For instance, as a rule, Catalan 
> preposition "de" is translated as "de" in French, but if the following word 
> is a material, "en" must be selected (de fusta > en bois). So in the 
> Catalan2French lrx file we have a list of materials, as we have a list of 
> countries, a list of musical instruments, a list of animals, etc. I dream 
> about a monolingual dictionary where we could get this kind of information. 
> It is not useful to have these lists for many language pairs using Catalan. 
> This information should be in apertium-cat and not in every apertium-cat-xxx 
> lrx file.
> 
> Moreover, If we had words not only with different kind of semantic labels, 
> but also marked as synonyms, maybe it'd be possible to give a translation 
> using a word labeled as synonym (if it has a translation) instead of 
> "unknown".
> 
> Hèctor
> 
>

[Apertium-stuff] OT: Diceware and Dicelist

2020-05-20 Thread Per Tunedal
Hi all, 
thank you very much for the help with extracting words from the 
apertium-swe.swe.dix
See the threads:
How do I get a list of lemmas for nouns 
List of verbs

Thanks to you, I have managed to build Swedish word lists for creating secure 
passwords with the help of dice. A passphrase of random words is far easier 
to remember than a password of random characters. You can find my first 
dicelist here:
https://github.com/havet/Dicelist

The original idea comes from Arnold G. Reinhold and Diceware is his registered 
trademark. His list is for 5 dice and consists of 6^5 = 7776 English words. 
Later the Electronic Frontier Foundation published three alternative wordlists, 
two of them for 4 dice (6^4 = 1296 English words).

For a start, I've published a Swedish wordlist for 4 dice (1296 words). Use 4 
dice to randomly get a combination of digits between 1111 and 6666, which 
corresponds to a word in the list. The combination 1234 corresponds e.g. to the 
word "avog" and the combination 5316 corresponds to the word "roa". You need 
to get at least 8 words to form a secure password. It will be slightly stronger 
than a password consisting of 12 random characters chosen from a set consisting 
of upper and lower case characters (a-z), numbers and symbols.
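
The strength comparison and the dice lookup can be checked with a few lines of Python. This is a rough sketch, not part of the Dicelist repository: the function names are invented, the alphabet size of 95 (printable ASCII) is my assumption for "letters, numbers and symbols", and the index only matches a list that is sorted by dice code. With those assumptions, 8 words give roughly 83 bits against roughly 79 bits for the 12-character password.

```python
import math


def passphrase_bits(word_count, list_size):
    """Entropy in bits of word_count words drawn uniformly from list_size words."""
    return word_count * math.log2(list_size)


def password_bits(length, alphabet_size):
    """Entropy in bits of a random password over the given alphabet."""
    return length * math.log2(alphabet_size)


def rolls_to_index(rolls):
    """Read dice rolls (each 1-6) as a base-6 number: a zero-based list index."""
    index = 0
    for roll in rolls:
        if not 1 <= roll <= 6:
            raise ValueError("each roll must be between 1 and 6")
        index = index * 6 + (roll - 1)
    return index


print(round(passphrase_bits(8, 1296), 1))  # 8 words from the 4-dice list
print(round(password_bits(12, 95), 1))     # assumption: 95 printable ASCII characters
print(rolls_to_index([1, 2, 3, 4]))        # position of combination 1234 in a sorted list
```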

I've excluded the numbers and the strange combinations of characters that are 
included in the original Diceware list. I have also tried to exclude rare 
words, offensive words, homophones and words hard to spell.

Contributions are welcome! Make a wordlist in your own language. It's fairly 
easy if your language is used in Apertium. It's an advantage if you have access 
to lists of vulgar words and of homophones in your language. A word-frequency 
list is useful as well.

More information:
https://en.wikipedia.org/wiki/Diceware
http://world.std.com/~reinhold/diceware.html
https://www.eff.org/deeplinks/2016/07/new-wordlists-random-passphrases

Yours,
Per Tunedal




Re: [Apertium-stuff] List of verbs

2020-05-11 Thread Per Tunedal
Hi again Kevin,
thank you for the explanation. Case closed!
Yours,
Per Tunedal

On Mon, May 11, 2020, at 13:39, Kevin Brubeck Unhammer wrote:
> "Per Tunedal" wrote:
> 
> > Hi Kevin,
> > Thanks for the explanation. But what's the point of expanding the 
> > dictionary, anyway?
> 
> It gets you all forms, instead of just lemmas, and it can get you lemmas
> even where they're not marked with lm (the lm attribute is just treated
> as a comment by apertium code, whereas the  part of the entry is
> actually used, thus more likely to be checked for correctness)
> 
> 


Re: [Apertium-stuff] List of verbs

2020-05-11 Thread Per Tunedal
Hi Kevin,
Thanks for the explanation. But what's the point of expanding the dictionary, 
anyway?

I successfully tried:
grep "lm=" apertium-swe.swe.dix | grep "__n_"
grep "lm=" apertium-swe.swe.dix | grep "__vblex"
grep "lm=" apertium-swe.swe.dix | grep "__adj"
etc

Faster and easier. But I didn't get the two nouns "tur", due to the comment in 
Norwegian. Had to add tur manually.

I used:

grep "lm=" apertium-swe.swe.dix | grep "__adj" | sed 's/\"><.*//' | sed 's/vals

But it didn't match these two lines:
 turtur¹
 turtur²

I didn't care, as it was just two lines that had comments.
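
Entries that carry a comment stop being a problem if the dictionary is read as XML instead of with grep and sed. A minimal sketch with Python's standard library, assuming each entry is an e element with an lm attribute and a par child naming the paradigm; the sample entries, the paradigm names and the function name are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature dictionary; real paradigm names will differ.
SAMPLE_DIX = """<dictionary>
  <section id="main" type="standard">
    <e lm="tur" c="flaks"><i>tur</i><par n="tur__n_ut"/></e>
    <e lm="hus"><i>hus</i><par n="hus__n_nt"/></e>
    <e lm="abonnera"><i>abonner</i><par n="abonner/a__vblex"/></e>
  </section>
</dictionary>"""


def lemmas_by_paradigm(dix_xml, marker):
    """Return the sorted lm values of entries whose paradigm name contains marker."""
    root = ET.fromstring(dix_xml)
    lemmas = set()
    for entry in root.iter("e"):
        lemma = entry.get("lm")
        if lemma is None:
            continue  # entries inside paradigm definitions carry no lemma
        if any(marker in par.get("n", "") for par in entry.iter("par")):
            lemmas.add(lemma)
    return sorted(lemmas)


print(lemmas_by_paradigm(SAMPLE_DIX, "__n_"))
print(lemmas_by_paradigm(SAMPLE_DIX, "__vblex"))
```

For the real file one would call ET.parse on apertium-swe.swe.dix and pass its root instead of a string. The extra attribute on the first entry does not affect the result.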

Yours,
Per Tunedal



On Mon, May 11, 2020, at 10:18, Kevin Brubeck Unhammer wrote:
> "Per Tunedal" wrote:
> 
> [...]
> 
> > arna
> > arnas
> > arnas-
> > ars
> > ars-
> > I have so far not been able to find out where they come from. They are not 
> > listed as nouns in apertium-swe.swe.dix
> 
> Probably the sed not being able to handle lines like
> 
> DJ:arna:DJ 
> 
> You may have to grep out lines with two colons first.
> 
> > Among the adjectives I got e.g. the following verbs:
> > abbreviera
> > abdikera
> > abonnera
> > abortera
> 
> Participles of verbs get tagged  / . You can grep
> them out.
> 
> 


Re: [Apertium-stuff] List of verbs

2020-05-11 Thread Per Tunedal
Hi again,
I've found that the solution you suggested doesn't work properly. Some 
non-existent words are produced in the process and are kept throughout the 
filtering. This got worse when I tried to get adjectives. The list was full of 
strange words, as well as words of other kinds, like e.g. verbs.

I suspected that the expansion produces some output that pollutes the result. 
Thus I tried working directly on apertium-swe.swe.dix, like this:

grep "lm=" apertium-swe.swe.dix | grep "__n_" | less

This produced a usable list of nouns. A side effect is that this is far faster.

Remember, I asked about some very strange Swedish "nouns":
arna
arnas
arnas-
ars
ars-
I have so far not been able to find out where they come from. They are not 
listed as nouns in apertium-swe.swe.dix

Among the adjectives I got e.g. the following verbs:
abbreviera
abdikera
abonnera
abortera

I used:
lt-expand apertium-swe.swe.dix | grep -E "[^<:>]+:[^<:>]+" | sed -E 's/[^<:>]+:([^<:>]+).*/\1/g' | sed 's/[¹²³]//g'

Does anyone have a clue?

Yours,
Per Tunedal

On Tue, Apr 28, 2020, at 18:36, Samuel Sloniker wrote:
> egrep and fgrep are deprecated. Use grep -E and grep -F .
> 


Re: [Apertium-stuff] Strange Swedish nouns in apertium-swe.swe.dix

2020-04-28 Thread Per Tunedal
Hi,
I suspect a bug somewhere. I will look into it soon.
Yours,
Per Tunedal

On Fri, Apr 24, 2020, at 15:55, Per Tunedal wrote:
> Hi Fran,
> I thought there might be some reason for the strange nouns. Some trick 
> to solve some problem? Or maybe an error in the expression I've used to 
> get the list:
> 
> lt-expand apertium-swe.swe.dix | grep -E "[^<:>]+:[^<:>]+" | sed -E 's/[^<:>]+:([^<:>]+).*/\1/g' | sort -u
> 
> Yours,
> Per Tunedal
> 
> 
> On Fri, Apr 24, 2020, at 15:51, Francis Tyers wrote:
> > El 2020-04-24 14:48, Per Tunedal escribió:
> > > Hi,
> > > I've found some strange nouns in my list of Swedish nouns from
> > > apertium-swe.swe.dix:
> > > arna
> > > arnas
> > > arnas-
> > > ars
> > > ars-
> > > 
> > > What's that?
> > > 
> > > And a misspelled word:
> > > Södermalmvåning
> > > should be:
> > > Södermalmsvåning
> > > 
> > > Yours,
> > > Per Tunedal
> > > 
> > 
> > Dear Per,
> > 
> > Please feel free to send a pull request via GitHub!
> > 
> > Best regards,
> > 
> > Fran
> >




Re: [Apertium-stuff] List of verbs

2020-04-28 Thread Per Tunedal
Hi,
thank you all for your kind help. I'm getting the lists I need.
Yours
Per Tunedal

On Mon, Apr 27, 2020, at 20:35, Bernard Chardonneau wrote:
> Yes, me I rather do that instead of
> 
> (|||)
> 
> and I also use fgrep and egrep instead of grep -F and grep -E
> as it was/(is ?) in UNIX.
> 
> 
> > Date: Sun, 26 Apr 2020 10:40:39 -0700
> > From: Samuel Sloniker 
> > To: apertium-stuff@lists.sourceforge.net
> > Reply-To: apertium-stuff@lists.sourceforge.net
> > Subject: Re: [Apertium-stuff] List of verbs
> >
> > Shouldn't  also work?
> >
> > On Fri, Apr 24, 2020 at 7:25 AM Daniel Swanson 
> > wrote:
> >
> > > Also, to explain the patterns
> > >
> > > [^<:>]+ is "match any string of characters that doesn't contain a tag or a
> > > colon"
> > >
> > > So the grep is "anything without tags or colons (i.e. a surface form) then
> > > a colon then another string (a lemma) then a  tag"
> > >
> > > The sed matches roughly the same thing except it has () around the lemma
> > > so it can refer to it later and .* to match whatever tags there may be. \1
> > > then replaces the line with the contents of the first (), i.e. the lemma.
> > >
> 
> 
> Bernard Chardonneau (France)
> Phone : [33] 9 72 36 32 90
> GSM phone : [33] 7 69 46 16 31
> 
> An alternative Apertium translation website :
> http://apertiumtrad.tuxfamily.org
> 
> Multilingual websites for my free softwares :
> http://libremail.free.fr and http://libremail.tuxfamily.org
> http://cyloop.tuxfamily.org (mainly translated with Apertium)
> 
> My general website (in french only)
> http://bech.free.fr
> 
> 


[Apertium-stuff] List of verbs

2020-04-24 Thread Per Tunedal
Hi,
Now I would like to list the verbs. I cannot fully understand the grep- and sed 
expressions for getting the nouns:

lt-expand apertium-swe.swe.dix | grep -E "[^<:>]+:[^<:>]+" | sed -E 's/[^<:>]+:([^<:>]+).*/\1/g' | sort -u

How should I modify the expressions to get the verbs instead?
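
One way to make the part of speech a parameter is sketched below in Python. It assumes that lt-expand prints lines of the form surface:lemma followed by tags in angle brackets, and that the part-of-speech tag (for example n or vblex) comes directly after the lemma; the sample lines and the function name are made up for illustration.

```python
import re

# Homonym markers such as tur¹ / tur² are removed before de-duplicating.
SUPERSCRIPTS = str.maketrans("", "", "¹²³")


def lemmas_with_first_tag(lines, pos):
    """Collect lemmas from 'surface:lemma<tags>' lines whose first tag is pos."""
    pattern = re.compile(r"^[^<:>]+:([^<:>]+)<" + re.escape(pos) + r">")
    found = set()
    for line in lines:
        match = pattern.match(line)
        if match:
            found.add(match.group(1).translate(SUPERSCRIPTS))
    return sorted(found)


# Hypothetical lt-expand output; the exact tag sequences are assumptions.
SAMPLE = [
    "turen:tur¹<n><ut><sg><def><nom>",
    "turer:tur²<n><ut><pl><ind><nom>",
    "abonnerar:abonnera<vblex><pres><actv>",
    "DJ:arna:DJ<n><ut><pl><def><nom>",
]

print(lemmas_with_first_tag(SAMPLE, "n"))
print(lemmas_with_first_tag(SAMPLE, "vblex"))
```

Because the pattern is anchored at the start of the line and the tag must follow the lemma immediately, a line with a colon inside the surface form is skipped instead of yielding a fragment.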

Yours,
Per Tunedal




Re: [Apertium-stuff] How do I get a list of lemmas for nouns

2020-04-23 Thread Per Tunedal
Hi,
Thank you Kevin! Works like a charm.
BTW I've already changed 'uniq' to 'sort -u'
Yours,
Per

On Thu, Apr 23, 2020, at 10:42, Kevin Brubeck Unhammer wrote:
> "Per Tunedal" wrote:
> 
> > Hi Kevin,
> > thanks for the explanation. Thus they are homonyms. How do I get rid of the 
> > duplicates?
> > I just want:
> >
> > tur
> 
> before the `| uniq`, stick in
> 
>  | sed 's/[¹²³]//g'
> 
> 
> (You may have to change `uniq` to `sort -u` in case things are not ordered 
> already)
> 
> 


Re: [Apertium-stuff] How do I get a list of lemmas for nouns

2020-04-23 Thread Per Tunedal
Hi Kevin,
thanks for the explanation. Thus they are homonyms. How do I get rid of the 
duplicates?
I just want:

tur

Yours,
Per Tunedal

On Thu, Apr 23, 2020, at 10:00, Kevin Brubeck Unhammer wrote:
> "Per Tunedal" wrote:
> 
> > Hi Daniel,
> > Thank you! Works like a charm with a small exception.
> >
> > I get some strange duplicates like e.g. tur:
> >
> > tur¹
> > tur²
> 
> slump vs färd, they have different paradigms:
> 
>  turtur¹
>  turtur²
> 
> 


Re: [Apertium-stuff] How do I get a list of lemmas for nouns

2020-04-23 Thread Per Tunedal
Hi Daniel,
Thank you! Works like a charm with a small exception.

I get some strange duplicates like e.g. tur:

tur¹
tur²

Yours,
Per Tunedal

On Wed, Apr 22, 2020, at 16:28, Daniel Swanson wrote:
> Hi Per,
> 
> If I understand correctly, this might give what you want:
> 
> lt-expand apertium-swe.swe.dix | grep -E "[^<:>]+:[^<:>]+" | sed -E 's/[^<:>]+:([^<:>]+).*/\1/g' | uniq
> 
> lt-expand lists all the forms, grep finds all the ones where the first tag is 
> , sed gets rid of everything but the lemma, and uniq removes duplicates.
> 
> Daniel
> 


[Apertium-stuff] How do I get a list of lemmas for nouns

2020-04-22 Thread Per Tunedal
Hi,
I need an ordinary dictionary of Swedish lemmas (just the lemmas, nothing 
else). How do I accomplish this?

I read the Wiki:
http://wiki.apertium.org/wiki/Dixtools:_Grep

Thus I tried:
apertium-dixtools grep --par '.*__n' apertium-swe.swe.dix

but nothing was filtered. I got the whole file.

I have a bit of trouble using grep, as I find regular expressions a bit hard to 
grasp. Unfortunately, I often get it wrong and get unexpected results.

Now, I would like a list of nouns (just the lemmas) for a start. Then I need 
lists of the other parts of speech, verbs for instance.

The expression below from http://wiki.apertium.org/wiki/Dictionary_reader:
apertium-dixtools dic-reader list-lemmas apertium-swe.swe.dix
gives me ALL lemmas. But I would like to choose the part of speech.

I'm running Ubuntu as an app on Windows 10.

Please give me a hand!

Yours,
Per Tunedal








Re: [Apertium-stuff] Where do I find the dictionaries

2020-04-22 Thread Per Tunedal
Hi,
thank you all!
apertium-get apertium-swe worked like a charm.

Yours,
Per Tunedal

On Tue, Apr 21, 2020, at 19:05, Tino Didriksen wrote:
> Correct, data packages are not meant for development use.
> 
> The monolingual packages install only exactly as much as is needed for 
> building pair packages and what an end-user may need for corpus analysis.
> 
> Developers can use the apertium-get helper to install and build a 
> development-usable data package from source. E.g. running "apertium-get 
> apertium-swe" will install apertium-swe in the active folder.
> 
> -- Tino Didriksen
> 
> 
> On Tue, 21 Apr 2020 at 18:29, Jonathan Washington 
>  wrote:
>> Hi Per,
>> 
>> To add to what Daniel said, language data installed from apt is put in 
>> system directories as root, and is not good for doing dev work.
>> 
>> As a fairly up-to-date Apertium language data developer, I don't know the 
>> path of system-installed language data off the top of my head (you can 
>> always run dpkg -L apertium-swe to find out) and I'm not even sure it 
>> includes the uncompiled dictionaries. Maybe I'm just an elite developer 
>> without my pulse on the needs of actual Apertium users.
>> 
>> But I do recommend what Daniel suggested—that would be the easiest approach, 
>> imo.
>> 
>> --
>> Jonathan
> 


[Apertium-stuff] Where do I find the dictionaries

2020-04-20 Thread Per Tunedal
Hi,
I'm a bit rusty, not having used Apertium for a long time.

I would like to get a dictionary containing Swedish lemmas, doing something 
like:

apertium-dixtools grep --par '.*__n' apertium-swe.dix

Where do I find the Swedish monodix?

I'm running Ubuntu as an app on Windows 10. I've installed the Apertium nightly 
build. The language pairs swe-dan and swe-nor are installed from the repository 
with sudo apt-get install ...

And I've successfully installed apertium-dixtools. Then I got stuck. I cannot 
figure out where the language files are installed.

Yours,
Per Tunedal




[Apertium-stuff] AddToDix-for-Apertium

2020-01-31 Thread Per Tunedal
Hi,
I've just uploaded my old java programs to Github:
https://github.com/havet/AddToDix-for-Apertium

I have published them in the hope that they might be useful for the Apertium 
community.

The programs are made for contributors who like human languages better than 
XML, and they work in Windows. The main advantage is that the contributor 
doesn't have to edit the actual dictionaries, which are written in XML. A 
language-savvy person wanting to contribute might be deterred by the look of 
the XML files and might make lots of trivial errors if he/she tries to edit 
the code.

The tools are made for adding to the old deprecated Apertium dictionaries for 
translation between Danish and Swedish (apertium-sv-da.sv etc). They have to 
be adapted to the new dictionaries to be useful for contributing to Apertium.

Yours,
Per Tunedal

PS I noticed the programs were still being downloaded from my old site. I plan 
to remove them from that site, along with other programs. Some of the other 
programs will be published at Github as well.







[Apertium-stuff] Apertium versions for iPhone and iPad

2018-03-06 Thread Per Tunedal
Hi,
may I remind you of the idea to have dual licenses to permit Apertium apps for 
iPhone and iPad? We have the opportunity whenever a language is created from 
scratch. In those cases there is a limited number of developers involved and 
their consent may be acquired.

The same applies to new modules etc.

Yours,
Per Tunedal





Re: [Apertium-stuff] Automatic change of tense

2018-02-27 Thread Per Tunedal
Hi Kevin,
that would work. Obviously, some post editing is needed, but Apertium would do 
a lot of the tedious work.
I'll give it a try.
Nice Python script. Thank you!
Yours,
Per Tunedal

On Tue, Feb 27, 2018, at 21:00, Kevin Brubeck Unhammer wrote:
> 
> $ curl http://termbin.com/sah4 >pres2pret.py
> 
> $ curl 
> https://raw.githubusercontent.com/goavki/streamparser/master/streamparser.py 
> >streamparser.py
> 
> $ sudo apt install apertium-nno
> 
> $ echo 'ååh hald no opp, eg gjer det etter kvart.'  |\
> apertium-deshtml -n |\
> lt-proc /usr/share/apertium/apertium-nno/nno.automorf.bin   |\
> cg-proc -1 /usr/share/apertium/apertium-nno/nno.rlx.bin |\
> python3 pres2pret.py|\
> lt-proc -g /usr/share/apertium/apertium-nno/nno.autogen.bin |\
> apertium-rehtml-noent 
> 
> 
> which gives the (rather ungrammatical) answer "ååh hald no opp, eg
> gjorde det etter kvart".
> 
> 
> Per Tunedal <per.tune...@operamail.com> čálii:
> 
> > Hi all of you,
> > can Apertium be used to change tense in a text?
> >
> > Scenario:
> >
> > I've written a text of some hundred pages in e.g. past tense and would
> > like to have it in present (or the other way around).
> >
> > I suppose all information needed is in the monolingual dictionary for
> > the language in question. We've got an analyser and a generator.
> >
> > The question is:
> >
> > How do I put this together to:
> >
> > - analyse a monolingual text
> > - change the tense of the verbs as needed
> > - generate the text with the chosen tense
> >
> > Yours,
> > Per Tunedal
> 
> Email had 2 attachments:
> + pres2pret.py
>   1k (text/x-python)
> + signature.asc
>   1k (application/pgp-signature)



[Apertium-stuff] Automatic change of tense

2018-02-27 Thread Per Tunedal
Hi all of you,
can Apertium be used to change tense in a text?

Scenario:

I've written a text of some hundred pages in e.g. past tense and would like to 
have it in present (or the other way around).

I suppose all information needed is in the monolingual dictionary for the 
language in question. We've got an analyser and a generator.

The question is:

How do I put this together to:

- analyse a monolingual text
- change the tense of the verbs as needed
- generate the text with the chosen tense
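
The middle step of the list above can be sketched without any extra library, as a rewrite of the Apertium stream format between analysis and generation. This is a simplified illustration, not the pres2pret.py script from the reply: it assumes the text has already been analysed and disambiguated, it takes the first analysis of each unit, it ignores escaped characters, and the sample lemma and tags are only illustrative (vblex, pres and pret are the tags the reply's script works with).

```python
import re

# One lexical unit in stream format: ^surface/analysis1/analysis2$
UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def swap_tense(stream, old="pres", new="pret"):
    """Rewrite analysed text into generator input, retagging verbs."""
    def rewrite(match):
        # keep only the first analysis; the surface form is dropped
        first = match.group(2).split("/")[0]
        if "<vblex>" in first:
            first = first.replace(f"<{old}>", f"<{new}>")
        return f"^{first}$"
    return UNIT.sub(rewrite, stream)


analysed = "^eg/eg<prn>$ ^gjer/gjere<vblex><pres>$"
print(swap_tense(analysed))
```

The output of swap_tense is meant to be piped into the generator (lt-proc -g with the language's autogen file), which turns each ^lemma<tags>$ unit back into a surface form.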

Yours,
Per Tunedal





Re: [Apertium-stuff] apertium-swe-nor 0.2.0 released!

2016-06-08 Thread Per Tunedal
Hi Fran, 
sorry for the confusion. I'm a bit rusty on Apertium.

If the meaning anledning = anledning is the least frequent:

1) Removing the anledning = anledning entry 
and
2) adding an entry möjlighet = anledning (or tillfälle = anledning)

would do the trick.

Yours,
Per Tunedal





On Wed, Jun 8, 2016, at 15:39, Francis Tyers wrote:
> A 2016-06-08 14:37, Per Tunedal escrigué:
> > Hi Fran!
> > You'd better double check this with a Norwegian, because looking in my
> > Norwegian dictionary "anledning" can have more meanings:
> > 1) what I described (very common as far as I know)
> > 2) = tillfälle, tilldragelse (occurrence, event)
> > 3) = anledning, orsak (as in Swedish: reason, motive)
> > 
> > I don't know how frequent the third meaning is, but I have never noticed
> > it - maybe because there wasn't any problem :-)
> > 
> > If the first meaning is the most frequent, I would translate it to
> > "möjlighet", or maybe "tillfälle" would be better as it includes meaning
> > 2 above (but it's not fluent Swedish in the first meaning).
> > 
> > > <e><p><l>anledning<s n="n"/></l><r>möjlighet<s n="n"/></r></p></e>
> 
> That entry doesn't make sense as "möjlighet" is not a Norwegian word.
> 
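> The pair is Swedish -- Norwegian, so that means that 
> <e><p><l>SWEDISH</l><r>NORWEGIAN</r></p></e>
> 
> <e><p><l>anledning<s n="n"/></l><r>anledning<s n="n"/></r></p></e>
> <e><p><l>anledning<s n="n"/></l><r>grunn<s n="n"/></r></p></e>
> <e><p><l>anledning<s n="n"/></l><r>høve<s n="n"/></r></p></e>
> 
> <e><p><l>tillfälle<s n="n"/></l><r>anledning<s n="n"/></r></p></e>
> <e><p><l>möjlighet<s n="n"/></l><r>anledning<s n="n"/></r></p></e>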
> 
> Is what you are suggesting simply to remove the anledning = anledning 
> entry ?
> 
> Fran
> 


Re: [Apertium-stuff] apertium-swe-nor 0.2.0 released!

2016-06-08 Thread Per Tunedal
Hi Fran!
You'd better double check this with a Norwegian, because looking in my
Norwegian dictionary "anledning" can have more meanings:
1) what I described (very common as far as I know)
2) = tillfälle, tilldragelse (occurrence, event)
3) = anledning, orsak (as in Swedish: reason, motive)

I don't know how frequent the third meaning is, but I have never noticed
it - maybe because there wasn't any problem :-)

If the first meaning is the most frequent, I would translate it to
"möjlighet", or maybe "tillfälle" would be better as it includes meaning
2 above (but it's not fluent Swedish in the first meaning).

> <e><p><l>anledning<s n="n"/></l><r>möjlighet<s n="n"/></r></p></e>

Yours,
Per Tunedal

On Wed, Jun 8, 2016, at 11:11, Francis Tyers wrote:
> A 2016-06-08 09:52, Per Tunedal escrigué:
> > Hi!
> > Congratulations!
> > I've done some very superficial testing and so far just found some
> > problems with well-known "false friends", e.g. the common word
> > "anledning". The meaning in Swedish is "reason, motive", but in
> > Norwegian rather "opportunity, possibility". This might cause some
> > severe misunderstandings between Norwegians and Swedes.
> > 
> > If a Swede for instance invites a Norwegian to a
> > party/wedding/christening, the Norwegian might answer "Jeg har ikke
> > anledning til det." A direct, word for word, translation to "Jag har
> > inte anledning till det" would be a very offensive answer to the
> > invitation and might cause lifelong hostility, if the
> > misunderstanding isn't resolved.
> > 
> > My neighbour answered me: "... hvis jeg har anledning til det." And I
> > explained to him that it wasn't a good answer to a Swede ...
> > 
> > It might be a good idea to check for other false friends and adjust the
> > translation.
> > 
> 
> Thanks for the bug report!
> 
> Here are the entries in the bilingual dictionary for "anledning". How 
> would you change them to make them better?
> 
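> <e><p><l>anledning<s n="n"/></l><r>anledning<s n="n"/></r></p></e>
> <e><p><l>anledning<s n="n"/></l><r>grunn<s n="n"/></r></p></e>
> <e><p><l>anledning<s n="n"/></l><r>høve<s n="n"/></r></p></e>
> 
> <e><p><l>tillfälle<s n="n"/></l><r>anledning<s n="n"/></r></p></e>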
> 
> Fran
> 


Re: [Apertium-stuff] apertium-swe-nor 0.2.0 released!

2016-06-08 Thread Per Tunedal
Hi!
Congratulations!
I've done some very superficial testing and so far just found some
problems with well-known "false friends", e.g. the common word
"anledning". The meaning in Swedish is "reason, motive", but in
Norwegian rather "opportunity, possibility". This might cause some
severe misunderstandings between Norwegians and Swedes.

If a Swede for instance invites a Norwegian to a
party/wedding/christening, the Norwegian might answer "Jeg har ikke
anledning til det." A direct, word for word, translation to "Jag har
inte anledning till det" would be a very offensive answer to the
invitation and might cause lifelong hostility, if the
misunderstanding isn't resolved.

My neighbour answered me: "... hvis jeg har anledning til det." And I
explained to him that it wasn't a good answer to a Swede ...

It might be a good idea to check for other false friends and adjust the
translation.

Yours,
Per Tunedal 

On Tue, Jun 7, 2016, at 22:51, Kevin Brubeck Unhammer wrote:
> 111 years ago today, the union between Sweden and Norway was dissolved.
> But we're still good friends.
> 
> Here's the first proper release of apertium-swe-nor, giving translation
> from Swedish→Nynorsk, Nynorsk→Swedish, Swedish→Bokmål and Bokmål→Swedish
> – meaning all directions between Swedish, Norwegian and Danish are now
> covered by apertium :-)
> 
> Changes from the beta[1] include better disambiguation of Swedish, complete
> testvoc, expanded bidix and better transfer from/to supine. We'll be
> asking the natives on this list for some help with evaluation in a bit …
> 
> As with the previous dan-nor and swe-dan releases, this work was
> sponsored by Apertium and Wikimedia Foundation.
> 
> Signed tarballs available from
> https://sourceforge.net/projects/apertium/files/apertium-swe/ (0.7.0)
> https://sourceforge.net/projects/apertium/files/apertium-nno/ (0.9.0)
> https://sourceforge.net/projects/apertium/files/apertium-nob/ (0.9.0)
> https://sourceforge.net/projects/apertium/files/apertium-swe-nor/ (0.2.0)
> 
> The pair is already testable from https://apertium.org and it seems
> Kartik and Tino are hard at work packaging stuff so it should be in
> Content Translation for testing Quite Soon™.
> 
> 
> [1] http://permalink.gmane.org/gmane.comp.nlp.apertium/5809
> 
> -- 
> Kevin Brubeck Unhammer
> 
> GPG: 0x766AC60C


Re: [Apertium-stuff] Backporting of swe-dan to sv-da was: Re: Java Pairs Updated

2016-03-08 Thread Per Tunedal
Hi,

On Mon, Mar 7, 2016, at 09:58, Kevin Brubeck Unhammer wrote:
> Per Tunedal <per.tune...@operamail.com> čálii:
> 
> > Hi Mikel,
> > Would you include it in the Java-versions, if I make a back-port of
> > swe-dan to sv-da?
> 
> As mentioned, you don't need a backport, you just need to remove the CG
> from the pipeline. I don't know how the modes files are represented in
> the omegaT plugin, but presumably there is somewhere in it that says
> 
> "lt-proc swe-dan.automorf.bin"
> "cg-proc swe-dan.rlx.bin"
> "apertium-tagger -g swe-dan.prob"
> 
> etc.; and then you just remove the "cg-proc" bit and leave the rest, and
> it should still run.
> 
> > Is it a requirement that the new version would be "released"? What would
> > qualify the new version of sv-da as a "release"? 
> 
> If we are to serve it from the official channels, yes,

-- snip--

Yes, please.

> 
> -Kevin

Per Tunedal



Re: [Apertium-stuff] Lemmatisation was: Re: apertium-swe-dan-0.7.0 released

2016-03-07 Thread Per Tunedal
Hi,
The page http://wiki.apertium.org/wiki/Lemmatisation tells me to run:

$ echo "Den här är en test." | apertium -d . swe-tagger | cg-proc guesser.bin |\
  sed 's/<[^>]\+>//g' | cg-proc -n guesser.bin

Thus swe-tagger implicitly tells me that CG is used for disambiguation? 
I don't have to change anything at all.

What about other languages? Will the command work by just changing swe
to the actual language? It doesn't matter if the language uses CG or
not?

Yours,
Per Tunedal
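
The three modes spelled out in the quoted reply below follow one pattern, which can be written down as a small table-building function. This is a sketch under assumptions: the file names simply substitute the language code into the names given for swe, the function names are made up for illustration, and it does not settle whether a given language ships an .rlx.bin file at all.

```python
def mode_pipeline(lang, mode):
    """Return the commands behind a monolingual mode as a list of argv lists."""
    if mode not in ("morph", "disam", "tagger"):
        raise ValueError(f"unknown mode: {mode}")
    # every mode starts with morphological analysis
    steps = [["lt-proc", f"{lang}.automorf.bin"]]
    if mode in ("disam", "tagger"):
        # constraint grammar disambiguation
        steps.append(["cg-proc", f"{lang}.rlx.bin"])
    if mode == "tagger":
        # statistical tagger picks among the remaining readings
        steps.append(["apertium-tagger", f"{lang}.prob"])
    return steps


def as_shell(steps):
    """Render the steps as a shell pipeline string, for reading only."""
    return " | ".join(" ".join(step) for step in steps)


for mode in ("morph", "disam", "tagger"):
    print(f"swe-{mode} = {as_shell(mode_pipeline('swe', mode))}")
```

The real modes files may pass extra flags or full paths; the sketch only mirrors the commands as they are written in the reply.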


On Mon, Mar 7, 2016, at 10:20, Kevin Brubeck Unhammer wrote:
> Kevin Brubeck Unhammer <unham...@fsfe.org> čálii:
> 
> > Per Tunedal <per.tune...@operamail.com> čálii:
> >
> >> Hi again,
> >> Obviously, CG would be quite helpful for disambiguation when doing
> >> lemmatisation. Would it be complicated to add an option to use CG (if
> >> present)? Using the cg-rules for the language would probably remove some
> >> more ambiguity.
> >
> > Exchange -tagger for -disam to run CG as well.
> 
> Sorry, I confused myself: -tagger actually runs CG now. So
> 
> swe-morph = lt-proc swe.automorf.bin 
> 
> swe-disam = lt-proc swe.automorf.bin | cg-proc swe.rlx.bin
> 
> swe-tagger = lt-proc swe.automorf.bin | cg-proc swe.rlx.bin |
> apertium-tagger swe.prob


Re: [Apertium-stuff] Software supporting the translation process

2016-03-06 Thread Per Tunedal
Hi Trond,
As I have some experience of both translation and of using OmegaT, I
just want to point out some issues:

1. Translators are very often explicitly forbidden to use any kind of
web tools due to strict confidentiality enforced by the client company.

2. The Apertium plug-in to OmegaT has the great advantage over e.g.
Google Translate that it's run locally.

BTW another advantage is of course that it's free - using the Google API
will be quite expensive in the long run.

Unfortunately, the translation quality is presently far better using
Google Translate. Yet, using Apertium is an interesting option for
translators working in the proliferating commercial market.

Making the OmegaT plugin a web service would certainly make it less
interesting for translators.

Finally, I agree with your reflection:

> I am quite
> satisfied with the Apertium+Omega-T platform as it is, the only problem
> is that it does not work for the languages I work with.

Yours,
Per Tunedal

On Sun, Mar 6, 2016, at 21:22, Trosterud Trond wrote:
> 
> As one of the people working hard for some of the Apertium projects in
> the nursery catalogue, I find it a challenge to convince people in the
> language communities that this whole enterprise is a good idea. Since the
> usage scenario is translation for text production, and not gisting, I am
> dependent upon two things:
> 
> - high quality output (which  obviously I and my team are responsible for
> ourselves), and
> - translation programs to support translators in their work.
> 
> For the last type, there are two candidates for Apertium: The Wikipedia
> content translator, which is great, and has the functionality I want (see
> below), but is only for Wikipedia translation, and the Apertium + Omega-T
> program setup.
> 
> I have for a while tried to get the Apertium+Omega-T working for sme-smn,
> but with no results. My last attempt (and bugfix from one of the involved
> developers, thanks!!) brought me to the point where I got a window
> telling me there was a path problem.
> 
> Annoying as this showstopper is, that is not the point of this letter; it
> is only a symptom of the neglect the issue suffers. My point is that the good
> translation programs we build within the Apertium framework are not put
> into use (and hence lose the opportunity for much developmental feedback
> from users and communities), since we do not have platforms for their
> use.
> 
> Prompsit uses Apertium, this I think is fantastic. But they have their
> own priorities, and most of the language pairs we work with are outside
> those priorities.
> 
> What I would like to see is first and foremost work on the
> Apertium+Omega-T (or similar) platform(s), set up as a **web-based**
> service, so that users may download the program, and set up the MT
> service with paths (preferably by choosing languages from a menu, possibly
> having a menu referring to a dynamic list of language pairs). I am quite
> satisfied with the Apertium+Omega-T platform as it is, the only problem
> is that it does not work for the languages I work with. And when I cannot
> get it to work, the actual translators will not manage either. What they
> need is a setup that saves their time, where they may either take the MT
> sentence offered, or translate for themselves, and where, and this is
> **very** important, the program fixes formatting, pictures, etc. for
> them. The sad thing is that we have all this, we just do not see to it
> that it works.
> 
> I know there is a subset of the Apertium languages for which one is able
> to just click and download. This is fine, for the ones that work with
> those. I am also not against fixes that make it possible for anyone with
> a working commandline version of any Apertium pair to use it in Omega-T.
> On the contrary, that would be great -- for me, as a developer. But that
> will be irrelevant to the language community and their translators. What
> they need is the possibility to use a web-based MT input, just like for
> the Wikipedia Content Translation.
> 
> During Google Code bids I have always favoured projects geared towards
> concrete language work, although I have seen that there have always been
> plenty of programmers applying with a lot of interest but less relevant
> language knowledge.
> 
> This is their time. Here, I really would like to see some input. The
> difference between saying that something __is__ useful and that something
> __could be__ useful is simply too big to be ignored.
> 
> Trond


[Apertium-stuff] CG versus tagger was: Re: Lemmatisation was: Re: apertium-swe-dan-0.7.0 released

2016-03-06 Thread Per Tunedal
Hi,
Just a thought: couldn't this kind of rule just as well be implemented
in the TSX-file that's used to train the tagger? In that case,
retraining the tagger might do the trick as well.
Yours,
Per Tunedal
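
For readers unfamiliar with CG, the rule in the quoted message below can be read roughly as the following Python. This is a simplified reading, not how cg-proc works internally: the set names N, Def, Ind and Det are assumed to correspond to the tags n, def, ind and det, a reading is modelled as a (lemma, tags) pair, the scan is assumed to test the target before the barrier, and the sample sentence is analysed by hand.

```python
def has(reading, *tags):
    """True if the reading carries all the given tags."""
    _lemma, reading_tags = reading
    return set(tags) <= reading_tags


def is_ind_adj(reading):
    # LIST IndA = (adj ind) (adj comp)
    return has(reading, "adj", "ind") or has(reading, "adj", "comp")


def ind_det_to_left(cohorts, i):
    # (*-1 Det + Ind CBARRIER NotIndA): scan leftwards from position i - 1
    for cohort in reversed(cohorts[:i]):
        if any(has(r, "det", "ind") for r in cohort):
            return True
        # careful barrier: give up once a cohort has no IndA reading at all
        if not any(is_ind_adj(r) for r in cohort):
            return False
    return False


def apply_rule(cohorts):
    """REMOVE N + Def IF (0 N + Ind) (*-1 Det + Ind CBARRIER NotIndA)"""
    result = []
    for i, cohort in enumerate(cohorts):
        if any(has(r, "n", "ind") for r in cohort) and ind_det_to_left(cohorts, i):
            kept = [r for r in cohort if not has(r, "n", "def")]
            cohort = kept or cohort  # never delete the last remaining reading
        result.append(cohort)
    return result


# 'ta en blå kon', with hand-written readings
SENTENCE = [
    [("ta", frozenset({"vblex", "imp"}))],
    [("en", frozenset({"det", "ind"}))],
    [("blå", frozenset({"adj", "ind"}))],
    [("kon", frozenset({"n", "ind"})), ("ko", frozenset({"n", "def"}))],
]

print(apply_rule(SENTENCE)[3])
```

In the sample, the definite reading of 'ko' is removed because an indefinite determiner is found to the left with only an indefinite adjective in between.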

On Fri, Mar 4, 2016, at 09:41, Kevin Brubeck Unhammer wrote:
> Per Tunedal <per.tune...@operamail.com> čálii:
> 
> > 'ta en blå kon' (=take a blue cone) to Danish. 'kon' might be the
> > indefinite form of 'kon' (= cone) or the definite form of 'ko' (= the
> > cow). We have:
> >
> >  (kon→ kon/ko)
> >
> > Translating the whole sentence would give us:
> >
> > tag en blå kegle / tag en blå koen (= take a blue cone / take a blue the
> > cow)
> >
> > Wouldn't that be quite revealing in many cases? In this case e.g. a
> > statistical language model could easily separate the wheat from the
> > chaff.
> 
> That example argues against your point – here the source language has
> two analyses of "kon", with different ind/def taggings (as it should).
> 
> This is not a lexical selection problem, but a morphological
> disambiguation problem.
> 
> It took me all of five minutes to write a CG rule to select indefinite
> for nouns after indefinite determiners:
> 
> LIST IndA = (adj ind) (adj comp) ;
> SET NotIndA = (*) - IndA ;
> REMOVE:en-blå-kon N + Def IF (0 N + Ind) (*-1 Det + Ind CBARRIER NotIndA) ;
> 
> and a quick corpus diff seems to show it generalises well:
> 
> http://sprunge.us/hhbf?diff
> 
> -- 
> Kevin Brubeck Unhammer
> 
> GPG: 0x766AC60C


[Apertium-stuff] apertium.org

2016-03-06 Thread Per Tunedal
Hi,
Looking at https://www.apertium.org/ I find that I still cannot
translate the page to Swedish, although I contributed a translation some
time ago. Any plans to update to the latest version of
apertium-html-tools?

BTW Is that page using the new release of the pair swe-dan?

Yours,
Per Tunedal





Re: [Apertium-stuff] Lemmatisation was: Re: apertium-swe-dan-0.7.0 released

2016-03-06 Thread Per Tunedal
Hi Tino, would it be possible to do this with CG or would this need to
be implemented in some new program? Anyhow, I suppose this involves a lot
of work writing clever rules.

My rationale for target language smoothing is:
a) my bad experience of the performance of the old pair sv-da with lots
   of blatant errors made by the tagger.
b) target language smoothing is very easy to implement.

When doing lemmatisation the tagger isn't used, as far as I can see.
This leaves all inherent ambiguities in the Swedish dictionary to be
handled somehow.

So, I just proposed a quick fix that might improve the result
significantly in this special case (and maybe other cases with a lot of
ambiguity that is not properly handled). Other means may be more adequate
and/or more effective.

Yours, Per Tunedal

On Fri, Mar 4, 2016, at 08:53, Tino Didriksen wrote:
> On 4 March 2016 at 07:52, Per Tunedal
> <per.tune...@operamail.com> wrote:
>> Yes, of course! That has always seemed a bit unnatural to me. It's
>> harder to decide on the right source language lemma before translating
>> than doing it after translation.
>
> I almost entirely disagree, and I've got experience and data to back
> it up. Target language smoothing does not help much, if your source
> language analysis is good.
>
> You can disambiguate the source language to nigh-100% if you use more
> analysis levels, such as dependency and semantics. This is what we do
> at GrammarSoft / VISL. It works.
>
> Apertium could also do it this way, and it would benefit all languages
> built from a specific source.
>
> -- Tino Didriksen


Re: [Apertium-stuff] Lemmatisation was: Re: apertium-swe-dan-0.7.0 released

2016-03-06 Thread Per Tunedal
Hi again,
Obviously, CG would be quite helpful for disambiguation when doing
lemmatisation. Would it be complicated to add an option to use CG (if
present)? Using the cg-rules for the language would probably remove some
more ambiguity.

What does the command on the page http://wiki.apertium.org/wiki/Lemmatisation
actually do?

 $ echo "Den här är en test." | apertium -d . swe-tagger | cg-proc guesser.bin |\
   sed 's/<[^>]\+>//g' | cg-proc -n guesser.bin

 Will give lemmatised output where the tokens are encased in ^ and
 $, and ambiguous stems/lemmas are given separated by '/' 
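
The sed step in that command strips every <tag> from the stream, which is what leaves bare lemmas between ^ and $. A minimal Python sketch of the same idea follows; it is not a replacement for the wiki command, it treats every slash-separated field alike (whether the first field is a surface form depends on which stage produced the stream), and the sample readings are invented.

```python
import re

UNIT = re.compile(r"\^([^$]*)\$")  # one ^...$ token
TAG = re.compile(r"<[^>]+>")       # same pattern as the sed expression


def lemmas_per_token(stream):
    """Return, for each token, the distinct lemmas left after stripping tags."""
    tokens = []
    for unit in UNIT.findall(stream):
        fields = TAG.sub("", unit).split("/")
        # drop duplicate lemmas while keeping their order
        tokens.append(list(dict.fromkeys(fields)))
    return tokens


sample = "^den<det><def>$ ^kon<n><ind>/ko<n><def>$"
print(lemmas_per_token(sample))
```

A token that keeps more than one entry in its list is one where the ambiguity between lemmas has not been resolved.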

Yours,
Per Tunedal

On Fri, Mar 4, 2016, at 09:41, Kevin Brubeck Unhammer wrote:
> Per Tunedal <per.tune...@operamail.com> čálii:
> 
> > 'ta en blå kon' (=take a blue cone) to Danish. 'kon' might be the
> > indefinite form of 'kon' (= cone) or the definite form of 'ko' (= the
> > cow). We have:
> >
> >  (kon→ kon/ko)
> >
> > Translating the whole sentence would give us:
> >
> > tag en blå kegle / tag en blå koen (= take a blue cone / take a blue the
> > cow)
> >
> > Wouldn't that be quite revealing in many cases? In this case e.g. a
> > statistical language model could easily separate the wheat from the
> > chaff.
> 
> That example argues against your point – here the source language has
> two analyses of "kon", with different ind/def taggings (as it should).
> 
> This is not a lexical selection problem, but a morphological
> disambiguation problem.
> 
> It took me all of five minutes to write a CG rule to select indefinite
> for nouns after indefinite determiners:
> 
> LIST IndA = (adj ind) (adj comp) ;
> SET NotIndA = (*) - IndA ;
> REMOVE:en-blå-kon N + Def IF (0 N + Ind) (*-1 Det + Ind CBARRIER NotIndA) ;
> 
> and a quick corpus diff seems to show it generalises well:
> 
> http://sprunge.us/hhbf?diff
> 
> -- 
> Kevin Brubeck Unhammer
> 
> GPG: 0x766AC60C


Re: [Apertium-stuff] Backporting of swe-dan to sv-da was: Re: Java Pairs Updated

2016-03-05 Thread Per Tunedal
Hi Mikel,
Would you include it in the Java-versions, if I make a back-port of
swe-dan to sv-da?
Is it a requirement that the new version would be "released"? What would
qualify the new version of sv-da as a "release"? 
Yours,
Per Tunedal

On Thu, Mar 3, 2016, at 12:39, Kevin Brubeck Unhammer wrote:
> Per Tunedal <per.tune...@operamail.com> čálii:
> 
> > Hi,
> > is it possible to do a "back-port" of swe-dan to sv-da? The
> > back-ported version would benefit from the improved Swedish monodix.
> > Unfortunately, the most blatant problem, the untrained tagger, would
> > persist, though.
> 
> Do you mean in order to avoid the CG requirement? In that case, you can
> just remove it from the pipeline, no further change needed. 
> 
> 
> -Kevin


Re: [Apertium-stuff] Lemmatisation was: Re: apertium-swe-dan-0.7.0 released

2016-03-05 Thread Per Tunedal
Hi Keld,
Yes, I remember taking a look at your suggested algorithm. Did you ever
try it out?
I don't remember if you intended to use it on the source language before
the tagger (a possible alternative to, or addition to CG) or if you
intended it for lexical selection for the target language (a possible
alternative to Francis's lexical selection module).
Yours,
Per Tunedal

On Fri, Mar 4, 2016, at 16:19, k...@keldix.com wrote:
> On Fri, Mar 04, 2016 at 02:10:50PM +0100, Per Tunedal wrote:
> > Hi Kevin,
> > 
> > Back to Lemmatisation:
> > What's the easiest way to do a disambiguation, rather than get a list of
> > possible lemmas?
> 
> I am not sure it is the easiest way, but I have previously suggested that
> we 
> use wordnet data, which is freely available for Danish, to find out which
> of the
> lemmas that has the shortest distance to other lemmas in the surrounding
> text
> for the given homonym.
> 
> Best regards
> Keld
> 

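Keld's suggestion quoted above (pick the lemma with the shortest wordnet distance to the lemmas in the surrounding text) could be sketched roughly as follows. This is only an illustration: the relation graph is a hand-made toy, not real wordnet data, and the names build_graph, distance and pick_lemma are hypothetical helpers, not part of Apertium or of any wordnet library.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (lemma, lemma) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def distance(graph, start, goal):
    """Breadth-first search; returns the number of edges, or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def pick_lemma(graph, candidates, context):
    """Choose the candidate lemma with the smallest summed distance to the context."""
    penalty = len(graph)  # assumed cost for an unreachable context lemma

    def cost(lemma):
        total = 0
        for other in context:
            d = distance(graph, lemma, other)
            total += penalty if d is None else d
        return total

    return min(candidates, key=cost)


# Toy relations, invented for illustration only.
TOY_EDGES = [
    ("ko", "djur"), ("djur", "bondgård"), ("ko", "mjölk"),
    ("kon", "form"), ("form", "geometri"),
]
toy_graph = build_graph(TOY_EDGES)
print(pick_lemma(toy_graph, ["kon", "ko"], ["mjölk", "bondgård"]))
```

With a farming context the sketch prefers "ko"; with a context such as "geometri" it would prefer "kon". Real wordnet relations are typed and much denser, so the scoring would need more care than a plain edge count.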

Re: [Apertium-stuff] Lemmatisation was: Re: apertium-swe-dan-0.7.0 released

2016-03-04 Thread Per Tunedal
Hi Francis,
I didn't think so:
 Ta en kon! --> Tage en koen!

I will test some of the other problems later on.

Yours,
Per Tunedal


On Fri, Mar 4, 2016, at 14:14, Francis Tyers wrote:
> On 2016-03-04 14:10, Per Tunedal wrote:
> > Hi Kevin,
> > Yes, this could definitely be fixed before the translation as it's
> > evident looking at the grammatical construction of the sentence. And of
> > course it's much better to fix it before translation than after.
> > 
> > My point was that translation adds more information; this makes it
> > possible to quite easily fix ambiguities that have not been sorted out
> > before translation. Even simple solutions like a language model might
> > help.
> > 
> > And Apertium sv-da has a lot of problems of this kind - I don't know how
> > much training of the tagger would have helped. Anyhow, now we've got a brand
> > new release of Apertium swe-dan with CG. Maybe some of these problems
> > are solved by now. Unfortunately I've not been able to test as the two
> > of my boxes running Apertium are bound for the city dump. I hope to see
> > Apertium swe-dan soon at Apertium.org or maybe I'll find some time to
> > install Apertium on some other box. The Java versions cannot use CG.
> > 
> 
> It's already on apertium.org.
> 
> F.
> 


Re: [Apertium-stuff] Lemmatisation was: Re: apertium-swe-dan-0.7.0 released

2016-03-04 Thread Per Tunedal
Hi Kevin,
Yes, this could definitely be fixed before the translation as it's
evident looking at the grammatical construction of the sentence. And of
course it's much better to fix it before translation than after.

My point was that translation adds more information; this makes it
possible to quite easily fix ambiguities that have not been sorted out
before translation. Even simple solutions like a language model might
help.

And Apertium sv-da has a lot of problems of this kind - I don't know how much
training of the tagger would have helped. Anyhow, now we've got a brand
new release of Apertium swe-dan with CG. Maybe some of these problems
are solved by now. Unfortunately I've not been able to test as the two
of my boxes running Apertium are bound for the city dump. I hope to see
Apertium swe-dan soon at Apertium.org or maybe I'll find some time to
install Apertium on some other box. The Java versions cannot use CG.

Back to Lemmatisation:
What's the easiest way to do a disambiguation, rather than get a list of
possible lemmas?

Yours,
Per Tunedal

On Fri, Mar 4, 2016, at 09:41, Kevin Brubeck Unhammer wrote:
> Per Tunedal <per.tune...@operamail.com> wrote:
> 
> > 'ta en blå kon' (=take a blue cone) to danish. 'kon' might be the
> > indefinite form of 'kon' (= cone) or the definite form of 'ko' (= the
> > cow). We have:
> >
> >  (kon→ kon/ko)
> >
> > Translating the whole sentence would give us:
> >
> > tag en blå kegle / tag en blå koen (= take a blue cone / take a blue the
> > cow)
> >
> > Wouldn't that be quite revealing in many cases? In this case e.g. a
> > statistical language model could easily separate the wheat from the
> > chaff.
> 
> That example argues against your point – here the source language has
> two analyses of "kon", with different ind/def taggings (as it should).
> 
> This is not a lexical selection problem, but a morphological
> disambiguation problem.
> 
> It took me all of five minutes to write a CG rule to select indefinite
> for nouns after indefinite determiners:
> 
> LIST IndA = (adj ind) (adj comp) ;
> SET NotIndA = (*) - IndA ;
> REMOVE:en-blå-kon N + Def IF (0 N + Ind) (*-1 Det + Ind CBARRIER NotIndA) ;
> 
> and a quick corpus diff seems to show it generalises well:
> 
> http://sprunge.us/hhbf?diff
> 
> -- 
> Kevin Brubeck Unhammer
> 
> GPG: 0x766AC60C

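To make the effect of Kevin's CG rule quoted above easier to follow, here is a rough Python approximation of what it does to the 'ta en blå kon' example. It is a sketch under several assumptions: readings are modelled as plain sets of tags with the baseform kept as a quoted tag, the sets N, Def, Ind and Det are assumed to correspond to the tags n, def, ind and det, the readings given for each word are invented, and the scan checks the target before the careful barrier. It is not a substitute for running the rule with cg-proc.

```python
def in_ind_a(reading):
    """LIST IndA = (adj ind) (adj comp)"""
    return {"adj", "ind"} <= reading or {"adj", "comp"} <= reading


def ind_det_to_left(cohorts, pos):
    """Approximates the context (*-1 Det + Ind CBARRIER NotIndA)."""
    for i in range(pos - 1, -1, -1):
        readings = cohorts[i]
        if any({"det", "ind"} <= r for r in readings):
            return True
        # Careful barrier: stop only if every reading is outside IndA.
        if all(not in_ind_a(r) for r in readings):
            return False
    return False


def remove_def_after_ind_det(cohorts):
    """Approximates REMOVE N + Def IF (0 N + Ind) (...)."""
    result = []
    for pos, readings in enumerate(cohorts):
        has_n_ind = any({"n", "ind"} <= r for r in readings)
        if has_n_ind and ind_det_to_left(cohorts, pos):
            readings = [r for r in readings if not {"n", "def"} <= r]
        result.append(readings)
    return result


# Invented, simplified readings for 'ta en blå kon'.
sentence = [
    [{'"ta"', "vblex", "imp"}],
    [{'"en"', "det", "ind"}],
    [{'"blå"', "adj", "ind"}],
    [{'"kon"', "n", "ind"}, {'"ko"', "n", "def"}],
]
disambiguated = remove_def_after_ind_det(sentence)
print(disambiguated[3])
```

In this sketch only the indefinite "kon" reading survives, because the scan passes over the indefinite adjective and reaches the indefinite determiner. If a verb stood between the determiner and the noun, the barrier would stop the scan and both readings would be kept.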

Re: [Apertium-stuff] Lemmatisation was: Re: apertium-swe-dan-0.7.0 released

2016-03-03 Thread Per Tunedal
Hi,
Yes, of course! That has always seemed a bit unnatural to me. It's
harder to decide on the right source language lemma before translating
than doing it after translation.

The ambiguity in the source language is in most cases not present in the
target language. After translation you will have an indication of whether
the translation "makes sense" or not; this could be quite useful when
choosing between two different translations due to ambiguity in the
source language.

But Apertium doesn't work that way. You just bet on one of the possible
source lemmas before translation.
And yes, ambiguity regarding how to translate that one and only source
lemma is a more limited task.

Can Apertium be manipulated somehow to translate all possible source
lemmas into translation hypotheses for the whole sentence (instead of
choosing just one source lemma)?

The lemmatisation seems to do half of the job: displaying all possible
lemmas, separated by '|'. I would like to continue one step further and
translate all possible variants (analyses) of a sentence. An example:

Apertium sv-da has some trouble translating the sentence:

'ta en blå kon' (= take a blue cone) to Danish. 'kon' might be the
indefinite form of 'kon' (= cone) or the definite form of 'ko' (= the
cow). We have:

 (kon→ kon/ko)

Translating the whole sentence would give us:

tag en blå kegle / tag en blå koen (= take a blue cone / take a blue the
cow)

Wouldn't that be quite revealing in many cases? In this case e.g. a
statistical language model could easily separate the wheat from the
chaff.

BTW Apertium sv-da bets on the second option.

Yours,
Per Tunedal


On Thu, Mar 3, 2016, at 21:53, Kevin Brubeck Unhammer wrote:
> Per Tunedal <per.tune...@operamail.com> wrote:
> 
> > If the constraint-based lexical selection module is used for a pair, I
> > cannot see why it couldn't be used. The rules are already in place. All
> > you have to do is to translate the ambiguous sentences and let the
> > module select the best translation.
> >
> > The tricky bit would be to use this information backwards to choose the
> > right lemma in the original language. I'm not savvy enough to figure out
> > how to do it.
> 
> The right source language lemma is already selected by the time lexical
> selection runs. Lexical selection is about selecting the right *target*
> language lemma.
> 
> -- 
> Kevin Brubeck Unhammer
> 
> GPG: 0x766AC60C

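The idea in the message above (expand every analysis into a whole-sentence translation hypothesis and let a statistical language model choose) could look roughly like this. It is a minimal sketch, not something Apertium does: the per-word alternatives are typed in by hand, and the bigram counts are invented stand-ins for a real language model.

```python
from itertools import product


def hypotheses(alternatives):
    """Expand per-word alternatives into every whole-sentence hypothesis."""
    return [" ".join(words) for words in product(*alternatives)]


def score(sentence, bigram_counts):
    """Toy language-model score: the sum of the counts of adjacent word pairs."""
    words = sentence.split()
    return sum(bigram_counts.get(pair, 0) for pair in zip(words, words[1:]))


# Danish alternatives for 'ta en blå kon', as given in the message.
alternatives = [["tag"], ["en"], ["blå"], ["kegle", "koen"]]

# Invented counts, standing in for a model trained on a Danish corpus.
TOY_BIGRAMS = {("en", "blå"): 5, ("blå", "kegle"): 2}

candidates = hypotheses(alternatives)
best = max(candidates, key=lambda s: score(s, TOY_BIGRAMS))
print(best)
```

The number of hypotheses grows multiplicatively with each ambiguous word, so a real implementation would need pruning (for example a beam search) rather than full expansion.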

Re: [Apertium-stuff] Lemmatisation was: Re: apertium-swe-dan-0.7.0 released

2016-03-03 Thread Per Tunedal
Hi Francis,
Well, maybe.

Anyhow, lemmatisation is useful for many applications. One is frequency
lists e.g. as a way to choose what new words to add to a language pair.
It might be worth some effort to figure out an easy way to do
disambiguation, or rather lexical selection.

Intuitively, I would like to use a translation to distinguish between
different significations. A language model might do the trick, but maybe
an Apertium translation wouldn't be fluent enough to stand the test.
Maybe that's why the lextor module proved non-efficient for lexical
selection.

If the constraint-based lexical selection module is used for a pair, I
cannot see why it couldn't be used. The rules are already in place. All
you have to do is to translate the ambiguous sentences and let the
module select the best translation.

The tricky bit would be to use this information backwards to choose the
right lemma in the original language. I'm not savvy enough to figure out
how to do it.

Yours,
Per Tunedal


On Wed, Mar 2, 2016, at 09:11, Francis Tyers wrote:
> On 2016-03-02 09:09, Per Tunedal wrote:
> > Hi again,
> > 
> > On Wed, Mar 2, 2016, at 08:34, Francis Tyers wrote:
> >> On 2016-03-02 08:25, Per Tunedal wrote:
> >> > Hi Francis,
> >> > lemmatisation would be interesting to try, but what about
> >> > disambiguation?
> >> >
> >> > "ambiguous stems/lemmas are given separated by '/' "
> >> >
> >> > Can this be improved by your new lexical selection module somehow? It
> >> > would be better to choose the most probable lemma than simply the
> >> > first.
> >> 
> >> No, it couldn't.
> > 
> > Any other way to do lexical selection that might work?
> > 
> 
> I wouldn't bother, I would let it be ambiguous and then fix it in a 
> post-processing
> step.
> 
> F.
> 

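The frequency-list use of lemmatisation mentioned above can be sketched as follows, assuming analyser output in the Apertium stream format (^surface/analysis1/analysis2$, with unknown words marked by a leading *). The sketch simply takes the first analysis of each word, which is exactly the limitation discussed in this thread; it also ignores escaped characters in the stream. The sample line and its tags are typed in by hand for illustration, not produced by a real analyser.

```python
import re
from collections import Counter

# One lexical unit: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def lemma_counts(stream):
    """Count lemmas, taking the first analysis of each lexical unit."""
    counts = Counter()
    for _surface, analyses in LEXICAL_UNIT.findall(stream):
        first = analyses.split("/")[0]
        if first.startswith("*"):
            continue  # unknown word: no lemma available
        lemma = first.split("<", 1)[0]
        counts[lemma] += 1
    return counts


# Hand-written sample; the tags are illustrative assumptions.
sample = (
    "^kor/ko<n><ut><pl><ind>$ ^äter/äta<vblex><pres>$ "
    "^kon/kon<n><ut><sg><ind>/ko<n><ut><sg><def>$ ^xyzzy/*xyzzy$"
)
counts = lemma_counts(sample)
print(counts.most_common())
```

Running the analyser output through a disambiguator first (CG or the tagger) would leave one analysis per word, so that "first analysis" becomes "chosen analysis".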

Re: [Apertium-stuff] Java Pairs Updated

2016-03-02 Thread Per Tunedal
Hi,
Now there's a brand new swe-dan release! Any chance you could include it?
Yours,
Per Tunedal


On Tue, Feb 23, 2016, at 09:27, Tino Didriksen wrote:
> I have updated all the pairs I could in
> https://svn.code.sf.net/p/apertium/svn/builds with these omissions:
>
>
--snip--
>
> apertium-sv-da has since been renamed to swe-dan, but there is no
> release with that name yet.
>
>
--snip--
> -- Tino Didriksen
> --
> 
>
--snip--


Re: [Apertium-stuff] Lemmatisation was: Re: apertium-swe-dan-0.7.0 released

2016-03-02 Thread Per Tunedal
Hi again,

On Wed, Mar 2, 2016, at 08:34, Francis Tyers wrote:
> On 2016-03-02 08:25, Per Tunedal wrote:
> > Hi Francis,
> > lemmatisation would be interesting to try, but what about
> > disambiguation?
> > 
> > "ambiguous stems/lemmas are given separated by '/' "
> > 
> > Can this be improved by your new lexical selection module somehow? It
> > would be better to choose the most probable lemma than simply the 
> > first.
> 
> No, it couldn't.

Any other way to do lexical selection that might work?

> 
> > And OOV words (not found in the dictionary, but present in the corpus)?
> > How to handle them? Can the lemmas be guessed? I suppose some
> > statistical model might do the trick.
> 
> Those are guessed, read the page ;)
> 

Oops! I've now looked at the Guesser section in the Swedish monodix and
got an idea of the process.

--snip--

Yours,
Per Tunedal



[Apertium-stuff] Lemmatisation was: Re: apertium-swe-dan-0.7.0 released

2016-03-01 Thread Per Tunedal
Hi Francis,
lemmatisation would be interesting to try, but what about
disambiguation?

"ambiguous stems/lemmas are given separated by '/' "

Can this be improved by your new lexical selection module somehow? It
would be better to choose the most probable lemma than simply the first.

And OOV words (not found in the dictionary, but present in the corpus)?
How to handle them? Can the lemmas be guessed? I suppose some
statistical model might do the trick.

Or maybe the dictionary can be used in some inventive way? It contains a
lot of paradigms - but unfortunately nothing about how common they are.
What about sorting them according to frequency in a reference corpus? Or
adding the frequency with a tag in the paradigms?  (Might be useful
anyway, e.g. when adding words to the monodix: a GUI could propose the
most likely paradigms at the top of an arrow list. Might minimise the
risk for choosing a rare and probably wrong paradigm.)

Yours,
Per Tunedal


On Tue, Mar 1, 2016, at 23:27, Francis Tyers wrote:

--snip--

> If you'd like to share any of your probabilistic lexicons for 
> Swedish--Norwegian
> or Swedish--Danish we'd be interested in looking at them.
> 
> If you have experience in SMT, the word alignments for Europarl for 
> Swedish--Danish
> could be pretty useful! Especially if you use the lemmatisation step 
> described here:
> 
> http://wiki.apertium.org/wiki/Lemmatisation
> 
> Fran
> 


Re: [Apertium-stuff] apertium-swe-dan-0.7.0 released

2016-03-01 Thread Per Tunedal
Hi all of you,
Congratulations on the new release! I haven't tested it yet, but I'm
very curious. It will be very interesting to see if at least the most
blatant errors are finally gone.

I'm sorry to hear that I contributed to Lars's bad experiences in the past
:-(
I spent several months struggling with adding words and trying to
correct errors I found. Unfortunately, I didn't manage to solve all of
the tricky issues I found and was very disappointed when the translation
quality stayed very low.

Later I found thousands more errors in the Swedish dictionary by
expanding the words and running a spell check on the expanded list. I
fixed just a few of them, but didn't find any solution to some of the
problems. I simply abandoned the tedious task of correcting the errors -
there were too many.

One big issue that I didn't solve was that the tagger had never been
trained - thus it made some very blatant errors. Furthermore, some of the new
vocabulary wouldn't show up in translations. Is the tagger trained now?

Another problem was lexical selection. I'm eager to see how this is
handled now.

My intention was to create the pair Norwegian-Swedish, but I was advised
to start with Swedish-Danish. My bad experiences with that pair made me
abandon my original intention and explore statistical translation
instead, although I had already learnt some Norwegian.

Anyhow, congratulations on the new release! It was probably a brilliant
idea to simply make a new Swedish dictionary and thus get rid of all the old
errors. I do hope the translation quality has now finally improved!

Yours,
Per Tunedal

PS My Danish isn't any good; I've asked several times for Danes to
review my additions to the Danish dictionary. Now is the time, if it has not
been done yet. Maybe an expansion of those words would reveal quite a few
errors to a native Dane.


On Tue, Mar 1, 2016, at 19:46, Francis Tyers wrote:
> On 2016-03-01 16:10, Lars Aronsson wrote:
> > On 03/01/2016 02:03 PM, Kevin Brubeck Unhammer wrote:
> >> The bidix has a lot of additions by Per Tunedal from earlier (and I
> >> think I saw your name in there as well?), although additions to
> >> *monodix* are mostly encompassed by the new SALDO lexicon. There was a
> >> lot of inconsistency to work out due to the whole lexicon change,
> >> although the swe dictionary is probably a lot more correct now,
> >> considering the original one was initially created by Danes :) (and of
> >> course it's much bigger).
> > 
> > I started to contribute, but failed and left. I don't remember
> > the details, but I think it was something like this: When one
> > grammatical rule covers the inflected forms of many words,
> > and a handful of them need an exception to that rule, Per
> > had a tendency to rewrite the rule for those few words
> > without considering the many others. It was too easy to
> > modify the rule and no regression tests for the other words
> > or phrases that used that rule. It seemed impossible to me
> > to guarantee any quality or correctness.
> 
> That sounds like a pretty bad experience. But now, with the SALDO-based
> dictionary it should be much better. In essence you should only need to
> add words to the bilingual dictionary :)
> 
> Fran
> 


Re: [Apertium-stuff] Apertium-html-tools: sort localisation languages

2015-11-27 Thread Per Tunedal
Done.


On Fri, Nov 27, 2015, at 16:48, Sushain Cherivirala wrote:
> Please file an issue on the GitHub repository (copy your email) and we
> can continue discussion there and preserve it for the future. Thanks!
>
> On Fri, Nov 27, 2015, 9:41 AM Per Tunedal
> <per.tune...@operamail.com> wrote:
>> Hi,
>> I noticed that the list of localisation languages locales.json is used
>> "as is": it would be more user friendly if it was sorted.
>>
>> I suppose this would be a minor change in localization.js, but
>> unfortunately I've never learnt javascript. I got confused looking at
>> the code. I'd better not touch it.
>>
>> Yours,
>> Per Tunedal

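The sorting requested above is small in any language. The actual change would belong in localization.js, so the Python below only illustrates the idea; it assumes, without having checked the file, that locales.json maps language codes to display names, and the sample data is made up.

```python
import json


def sorted_locales(locales_json):
    """Return (code, name) pairs ordered by display name, ignoring case."""
    locales = json.loads(locales_json)
    return sorted(locales.items(), key=lambda item: item[1].casefold())


# Made-up sample in the assumed {code: display name} shape.
sample = '{"swe": "svenska", "eng": "English", "dan": "dansk"}'
for code, name in sorted_locales(sample):
    print(code, name)
```

Sorting with casefold() keeps "English" from being placed before all lower-case names; proper alphabetical order for names with non-ASCII letters would need locale-aware collation.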

Re: [Apertium-stuff] Some silly mistake translating Apertium-html-tools

2015-11-26 Thread PER Tunedal
Hi,
That's true regarding installation, but the procedure for contribution is new. 
Yours, 
Per Tunedal 

Sent from my Sony Xperia™ smartphone

Francis Tyers wrote:

>It shouldn't make a difference as the GitHub and SVN versions should be 
>synched.
>
>Fran
>
>On 2015-11-26 08:19, Per Tunedal wrote:
>> Hi again,
>> 
>> Finally I've sent a pull request (Havet). I had some trouble due to a
>> bug in GitHub desktop, now fixed.
>> 
>> Would you please update the Apertium wiki page on Apertium-html-tools.
>> The information on installation and contribution seems outdated due to
>> the move to GitHub:
>> 
>> http://wiki.apertium.org/wiki/Apertium-html-tools [5]
>> 
>> Yours,
>> 
>> Per Tunedal
>> 
>> On Mon, Nov 23, 2015, at 17:03, Sushain Cherivirala wrote:
>> 
>>> Development of apertium-html-tools is proceeding on GitHub [1],
>>> please either send a pull request or email us the file for manual
>>> inclusion.
>>> 
>>> Thanks for your help!
>>> 
>>> --
>>> 
>>> Sushain K. Cherivirala
>>> 
>>> www.skc.name [2]
>>> 
>>> On Mon, Nov 23, 2015 at 4:51 AM, Per Tunedal
>>> <per.tune...@operamail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> Excellent. I've added the hint to the wiki.
>>> 
>>> Now I have trouble submitting the locales.json file:
>>> 
>>> svn: Server sent unexpected return value (423 Locked) in response to
>>> PUT request for
>>> 
>> '/p/apertium/svn/!svn/wrk/eac98752-ece9-4fe0-ac28-360ac3138eea/trunk/apertium-tools/apertium-html-tools/assets/strings/locales.json'
>>> 
>>> Someone else updating or what? I have tried several times, though.
>>> 
>>> Yours,
>>> 
>>> Per Tunedal
>>> 
>>> On Sun, Nov 22, 2015, at 22:11, Tino Didriksen wrote:
>>> 
>>> Line 49
>>> 
>>> "authors": [Per Tunedal],
>>> 
>>> is not valid JSON. It should be
>>> 
>>> "authors": ["Per Tunedal"],
>>> 
>>> If in doubt, check your JSON with http://jsonlint.com/ [3] or
>>> similar.
>>> 
>>> -- TD
>>> 
>>> On 22 November 2015 at 22:01, Per Tunedal
>>> <per.tune...@operamail.com> wrote:
>>> 
>>> I've committed the file all the same. I suppose I've done some silly
>>> 
>>> mistake. Any clue?
>>> 
>>> 
>> 
>> 
>> 
>> Links:
>> --
>> [1] https://github.com/goavki/apertium-html-tools
>> [2] http://www.skc.name
>> [3] http://jsonlint.com/
>> [4] https://lists.sourceforge.net/lists/listinfo/apertium-stuff

Re: [Apertium-stuff] Some silly mistake translating Apertium-html-tools

2015-11-25 Thread Per Tunedal
Hi again, Finally I've sent a pull request (Havet). I had some trouble
due to a bug in GitHub desktop, now fixed.

Would you please update the Apertium wiki page on Apertium-html-tools.
The information on installation and contribution seems outdated due to
the move to GitHub:

http://wiki.apertium.org/wiki/Apertium-html-tools

Yours, Per Tunedal


On Mon, Nov 23, 2015, at 17:03, Sushain Cherivirala wrote:
> Development of apertium-html-tools is proceeding on GitHub[1], please
> either send a pull request or email us the file for manual inclusion.
>
> Thanks for your help!
>
> --
> Sushain K. Cherivirala www.skc.name
>
> On Mon, Nov 23, 2015 at 4:51 AM, Per Tunedal
> <per.tune...@operamail.com> wrote:
>> Hi, Excellent. I've added the hint to the wiki. Now I have trouble
>> submitting the locales.json file:
>>
>> svn: Server sent unexpected return value (423 Locked) in response to
>> PUT request for
>> '/p/apertium/svn/!svn/wrk/eac98752-ece9-4fe0-ac28-360ac3138eea/trunk/apertium-tools/apertium-html-tools/assets/strings/locales.json'
>>
>> Someone else updating or what? I have tried several times, though.
>>
>> Yours, Per Tunedal
>>
>> On Sun, Nov 22, 2015, at 22:11, Tino Didriksen wrote:
>>> Line 49
>>>     "authors": [Per Tunedal],
>>> is not valid JSON. It should be
>>>     "authors": ["Per Tunedal"],
>>>
>>> If in doubt, check your JSON with http://jsonlint.com/ or similar.
>>>
>>> -- TD
>>>
>>> On 22 November 2015 at 22:01, Per Tunedal
>>> <per.tune...@operamail.com> wrote:
>>>> I've committed the file all the same. I suppose I've done some
>>>> silly mistake. Any clue?


Links:

  1. https://github.com/goavki/apertium-html-tools


Re: [Apertium-stuff] Some silly mistake translating Apertium-html-tools

2015-11-23 Thread Per Tunedal
Hi,
Excellent. I've added the hint to the wiki.

Now I have trouble submitting the locales.json file:

svn: Server sent unexpected return value (423 Locked) in response to
PUT request for
'/p/apertium/svn/!svn/wrk/eac98752-ece9-4fe0-ac28-360ac3138eea/trunk/apertium-tools/apertium-html-tools/assets/strings/locales.json'

Someone else updating or what? I have tried several times, though.

Yours, Per Tunedal



On Sun, Nov 22, 2015, at 22:11, Tino Didriksen wrote:
> Line 49 "authors": [Per Tunedal], is not valid JSON. It should be
> "authors": ["Per Tunedal"],
>
> If in doubt, check your JSON with http://jsonlint.com/ or similar.
>
> -- TD
>
> On 22 November 2015 at 22:01, Per Tunedal
> <per.tune...@operamail.com> wrote:
>> I've committed the file all the same. I suppose I've made some silly
>> mistake. Any clue?
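
Tino's point about the unquoted author name can be checked locally before committing. The following is only a minimal sketch using Python's standard json module; the helper name check_json and the two sample strings are made up for illustration and are not part of localisation-tools.py.

```python
import json


def check_json(text):
    """Return None if text parses as JSON, otherwise the parser's error message."""
    try:
        json.loads(text)
    except ValueError as err:  # json.JSONDecodeError is a subclass of ValueError
        return str(err)
    return None


# Hypothetical one-line excerpts mirroring line 49 of the locale file.
bad = '{"authors": [Per Tunedal]}'
good = '{"authors": ["Per Tunedal"]}'

print(check_json(bad))   # an error message pointing at the unquoted name
print(check_json(good))  # None
```

Running such a check on the whole file before "svn commit" would catch this kind of mistake without needing an online validator.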


[Apertium-stuff] Some silly mistake translating Apertium-html-tools

2015-11-22 Thread Per Tunedal
Hi,
I just tried translating Apertium-html-tools to Swedish, but got an
exception when running ./localisation-tools.py all swe:

Traceback (most recent call last):
  File "/usr/lib/python3.1/json/decoder.py", line 355, in raw_decode
    obj, end = self.scan_once(s, idx)
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./localisation-tools.py", line 73, in <module>
    strings = OrderedDict(filter(lambda x: x[0] in canonicalStrings.keys(), loadJSON(f).items()))
  File "./localisation-tools.py", line 15, in loadJSON
    return json.loads(f.read(), object_pairs_hook=OrderedDict)
  File "/usr/lib/python3.1/json/__init__.py", line 318, in loads
    return cls(**kw).decode(s)
  File "/usr/lib/python3.1/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.1/json/decoder.py", line 357, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

I've committed the file all the same. I suppose I've made some silly
mistake. Any clue?

Yours,
Per Tunedal



Re: [Apertium-stuff] No package apertium-swe found

2015-03-12 Thread Per Tunedal

Hi Ilnar,
Thank you! It works like a charm.

Yours,
Per Tunedal


On Tue, Mar 10, 2015, at 16:32, Ilnar Salimzyan wrote:
 Hello,

 2015-03-10 15:58 GMT+01:00 Per Tunedal <per.tune...@operamail.com>:
  Hi,
  got into trouble when reinstalling.

  ./autogen.sh

  in the swe-dan directory gives the following error message:

  No package 'apertium-swe' found

 You need to tell autogen.sh where the apertium-swe package
 resides, e.g.:

 ./autogen.sh --with-lang1=../../languages/apertium-swe

 And, possibly, the same for the apertium-dan package:

 ./autogen.sh --with-lang1=../../languages/apertium-swe --with-lang2=../../languages/apertium-dan

 Best,
 Ilnar

  although the language package is downloaded from SVN and I've run

  ./autogen.sh
  make

  in the folder. (Make returns: Inget behöver göras för All. )

  Yours,
  Per Tunedal



Re: [Apertium-stuff] Compare adjectives in Swedish

2015-03-12 Thread Per Tunedal
Hi Francis,
well, yes it's both in the monodix and the bidix. There must be some
trivial error I've overlooked:

1. monodix:

<!-- PT: böjs vanligen med mer, mest men andra ord med detta paradigm
kompareras normalt: två paradigm? -->
<pardef n="afrikansk__adj">
  <e>              <p><l></l>     <r><s n="adj"/><s n="pst"/><s n="ut"/><s n="sg"/><s n="ind"/></r></p></e>
  <e>              <p><l>t</l>    <r><s n="adj"/><s n="pst"/><s n="nt"/><s n="sg"/><s n="ind"/></r></p></e>
  <e>              <p><l>e</l>    <r><s n="adj"/><s n="pst"/><s n="m"/><s n="sg"/><s n="def"/></r></p></e>
  <e>              <p><l>a</l>    <r><s n="adj"/><s n="pst"/><s n="un"/><s n="pl"/><s n="ind"/></r></p></e>
  <e>              <p><l>a</l>    <r><s n="adj"/><s n="pst"/><s n="un"/><s n="sp"/><s n="def"/></r></p></e>

  <e r="LR" a="PT"><p><l>are</l>  <r>er<s n="adj"/><s n="comp"/><s n="un"/><s n="sp"/></r></p></e>
  <e r="LR" a="PT"><p><l>ast</l>  <r>er<s n="adj"/><s n="sup"/><s n="un"/><s n="sp"/><s n="ind"/></r></p></e>
  <e r="LR" a="PT"><p><l>aste</l> <r>er<s n="adj"/><s n="sup"/><s n="un"/><s n="sp"/><s n="def"/></r></p></e>
</pardef>

<e lm="afrikansk"><i>afrikansk</i><par n="afrikansk__adj"/></e>


2. bidix:
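<pardef n="afrikansk_amerikansk__adj">
  <e>       <p><l><s n="pst"/><s n="un"/><s n="sp"/><s n="def"/></l><r><s n="pst"/><s n="un"/><s n="sg"/><s n="def"/></r></p></e>
  <e>       <p><l><s n="pst"/><s n="un"/><s n="pl"/><s n="ind"/></l><r><s n="pst"/><s n="un"/><s n="pl"/><s n="ind"/></r></p></e>

  <e r="LR"><p><l><s n="pst"/><s n="ut"/><s n="sg"/><s n="ind"/></l><r><s n="pst"/><s n="un"/><s n="sg"/><s n="ind"/></r></p></e>
  <e r="LR"><p><l><s n="pst"/><s n="nt"/><s n="sg"/><s n="ind"/></l><r><s n="pst"/><s n="un"/><s n="sg"/><s n="ind"/></r></p></e>
  <e r="LR"><p><l><s n="pst"/><s n="m"/><s n="sg"/><s n="def"/></l><r><s n="pst"/><s n="un"/><s n="sg"/><s n="def"/></r></p></e>
  <e r="RL"><p><l><s n="pst"/><s n="GD"/></l><r><s n="pst"/><s n="un"/><s n="sg"/><s n="ind"/></r></p></e>

  <e r="LR" a="PT"><p><l><s n="comp"/><s n="un"/><s n="sp"/></l><r><s n="unsint"/><s n="comp"/><s n="un"/><s n="ND"/></r></p></e>
  <e r="LR" a="PT"><p><l><s n="sup"/><s n="un"/><s n="sp"/></l><r><s n="unsint"/><s n="sup"/><s n="un"/><s n="ND"/></r></p></e>

  <e r="RL" a="PT"><p><l><s n="unsint"/><s n="comp"/><s n="un"/><s n="sp"/></l><r><s n="unsint"/><s n="comp"/><s n="un"/><s n="ND"/></r></p></e>
  <e r="RL" a="PT"><p><l><s n="unsint"/><s n="sup"/><s n="un"/><s n="sp"/></l><r><s n="unsint"/><s n="sup"/><s n="un"/><s n="ND"/></r></p></e>
</pardef>


<e><p><l>…</l><r>afrikansk<s n="adj"/></r></p><par n="afrikansk_amerikansk__adj"/></e>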

I don't know if the last two lines in the bidix pardef are necessary. I added
them just in case, when it didn't work without them, but adding them
didn't help.

Yours,
Per Tunedal
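
As a rough illustration of what the monodix paradigm above is meant to do, the sketch below expands a stem with the suffixes of afrikansk__adj and looks up the analyses of a surface form. It is a simplified model, not lttoolbox: the direction restrictions and any text on the right-hand side are left out, and the names AFRIKANSK_ADJ, expand and analyses are invented for this example.

```python
# Suffix -> tags, copied from the afrikansk__adj pardef quoted above.
# Simplification: r="LR" restrictions and right-hand-side text are ignored.
AFRIKANSK_ADJ = [
    ("", "adj.pst.ut.sg.ind"),
    ("t", "adj.pst.nt.sg.ind"),
    ("e", "adj.pst.m.sg.def"),
    ("a", "adj.pst.un.pl.ind"),
    ("a", "adj.pst.un.sp.def"),
    ("are", "adj.comp.un.sp"),
    ("ast", "adj.sup.un.sp.ind"),
    ("aste", "adj.sup.un.sp.def"),
]


def expand(stem, paradigm):
    """Return every (surface form, tags) pair the paradigm yields for stem."""
    return [(stem + suffix, tags) for suffix, tags in paradigm]


def analyses(surface, stem, paradigm):
    """Return the tag strings of all paradigm entries that produce surface."""
    return [tags for form, tags in expand(stem, paradigm) if form == surface]


print(analyses("afrikanskare", "afrikansk", AFRIKANSK_ADJ))
print(analyses("afrikanska", "afrikansk", AFRIKANSK_ADJ))
```

If the compiled analyser really contains these entries, "afrikanskare" should get a comparative analysis rather than come out as an unknown word.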


On Wed, Mar 11, 2015, at 10:14, Francis Tyers wrote:
 On 2015-03-11 07:51, Per Tunedal wrote:
  Hi,
  it works slightly differently in swe-dan, as the grades are
  included in each paradigm (adjgrad doesn't exist).
  
  I looked at the diskret_amerikansk__adj paradigm and tried to add the
  following lines in
  afrikansk_amerikansk__adj:
  
    <e r="LR" a="PT"><p><l><s n="comp"/><s n="un"/><s n="sp"/></l><r><s n="unsint"/><s n="comp"/><s n="un"/><s n="ND"/></r></p></e>
    <e r="LR" a="PT"><p><l><s n="sup"/><s n="un"/><s n="sp"/></l><r><s n="unsint"/><s n="sup"/><s n="un"/><s n="ND"/></r></p></e>
  
  Unfortunately this doesn't work:
  
  echo afrikanskare | apertium -d . swe-dan
  
  *afrikanskare
  
 
 * means unknown word, is it in the bilingual dictionary and in the 
 morphological analyser?
 
 F.
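
Francis's remark about the asterisk can be turned into a small checker. The sketch below assumes the usual Apertium output marks: * for a word unknown to the analyser, @ for a lemma missing from the bilingual dictionary, and # for a form the generator could not produce. The function name flag_problems and the sample strings are illustrative only.

```python
# Leading marks that Apertium puts on problematic words in its output.
MARKS = {
    "*": "unknown to the morphological analyser",
    "@": "analysed, but the lemma is missing from the bilingual dictionary",
    "#": "translated, but the target form could not be generated",
}


def flag_problems(output):
    """Return (mark, word, meaning) for every marked token in translator output."""
    problems = []
    for token in output.split():
        mark = token[0]
        if mark in MARKS and len(token) > 1:
            problems.append((mark, token[1:], MARKS[mark]))
    return problems


for mark, word, meaning in flag_problems("*afrikanskare"):
    print(mark, word, "->", meaning)
```

For "*afrikanskare" this points at the monodix (or a stale compiled analyser) rather than at the bidix.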
 


Re: [Apertium-stuff] Compare adjectives in Swedish

2015-03-11 Thread Per Tunedal
Hi,
it works slightly differently in swe-dan, as the grades are
included in each paradigm (adjgrad doesn't exist).

I looked at the diskret_amerikansk__adj paradigm and tried to add the
following lines in
afrikansk_amerikansk__adj:

  <e r="LR" a="PT"><p><l><s n="comp"/><s n="un"/><s n="sp"/></l><r><s n="unsint"/><s n="comp"/><s n="un"/><s n="ND"/></r></p></e>
  <e r="LR" a="PT"><p><l><s n="sup"/><s n="un"/><s n="sp"/></l><r><s n="unsint"/><s n="sup"/><s n="un"/><s n="ND"/></r></p></e>

Unfortunately this doesn't work:

echo afrikanskare | apertium -d . swe-dan

*afrikanskare

Yours,
Per Tunedal

On Mon, Mar 9, 2015, at 09:00, Kevin Brubeck Unhammer wrote:
 Per Tunedal <per.tune...@operamail.com> writes:
 
  Hi Francis,
  Excellent! What should I do to get it to work in the opposite direction.
  I would like to keep the analysis of incorrectly inflected adjectives,
  like samhälleligare, but mer samhällelig should be generated.
 
 You could mark samhälleligare as LR.
 
 The nno-nob bidix uses pardefs for the various possibilities like
 adj.sint to adj (or adj to adj.sint). If you allow a synthetic LR form
 in an otherwise analytic monodix pardef, the bidix pardefs could deal
 with that. If we use the bidix pardefs from nno-nob as a basis, it'd
 probably look something like this:
 
 <pardef n="adjgrad" c="Used by adj pardefs">
   <e><p><l><s n="comp"/></l><r><s n="comp"/></r></p></e>
   <e><p><l><s n="pst"/></l><r><s n="pst"/></r></p></e>
   <e><p><l><s n="sup"/></l><r><s n="sup"/></r></p></e>
 </pardef>
 <pardef n="adj" c="Analytic on both sides">
   <e>       <p><l><s n="adj"/></l><r><s n="adj"/></r></p><par n="adjgrad"/></e>
   <e r="LR"><p><l><s n="adj"/><s n="sint"/></l><r><s n="adj"/></r></p><par n="adjgrad"/></e>
   <e r="RL"><p><l><s n="adj"/></l><r><s n="adj"/><s n="sint"/></r></p><par n="adjgrad"/></e>
 </pardef>
 <pardef n="adj_sint" c="Synthetic on both sides">
   <e>       <p><l><s n="adj"/><s n="sint"/></l><r><s n="adj"/><s n="sint"/></r></p><par n="adjgrad"/></e>
 </pardef>
 <pardef n="adj_sint:adj" c="Synthetic left, analytic right">
   <e>       <p><l><s n="adj"/><s n="sint"/></l><r><s n="adj"/></r></p><par n="adjgrad"/></e>
   <e r="RL"><p><l><s n="adj"/><s n="sint"/></l><r><s n="adj"/><s n="sint"/></r></p><par n="adjgrad"/></e>
 </pardef>
 <pardef n="adj:adj_sint" c="Analytic left, synthetic right">
   <e>       <p><l><s n="adj"/></l><r><s n="adj"/><s n="sint"/></r></p><par n="adjgrad"/></e>
   <e r="LR"><p><l><s n="adj"/><s n="sint"/></l><r><s n="adj"/><s n="sint"/></r></p><par n="adjgrad"/></e>
 </pardef>
 
 The analytic adj pardef translates adj to adj, but if it sees an
 adj.sint (samhälleligare), it translates it into analytic adj. Similarly
 for the other pardefs. The adjgrad is there to make sure we don't have
 two pardefs matching the same input.
 
 -
 
 I don't think it makes sense to correct the other direction, that'd just
 lead to overcorrection (try searching the web for e.g. "mer vacker";
 most of the hits seem to be correct, like "lite mer vacker höstskräck",
 "aldrig mer väcker").
 
 -- 
 Kevin Brubeck Unhammer
 
 GPG: 0x766AC60C
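
The direction restrictions in Kevin's "adj" bidix pardef can be modelled in a few lines. This is only a sketch of the idea, not how lt-comp or the bidix lookup is implemented: each entry is a (left tags, right tags, restriction) triple, and the names ADJ and transfer are made up for the example.

```python
# Entries of the "adj" pardef quoted above (the adjgrad part is left out).
# restriction: None = both directions, "LR" = left-to-right only,
# "RL" = right-to-left only.
ADJ = [
    (("adj",), ("adj",), None),
    (("adj", "sint"), ("adj",), "LR"),
    (("adj",), ("adj", "sint"), "RL"),
]


def transfer(tags, direction, pardef=ADJ):
    """Map source-side tags to target-side tags; None if no entry applies."""
    for left, right, restriction in pardef:
        if restriction not in (None, direction):
            continue  # entry is not valid in this translation direction
        source, target = (left, right) if direction == "LR" else (right, left)
        if tuple(tags) == source:
            return target
    return None


print(transfer(("adj", "sint"), "LR"))  # synthetic input, analytic output
print(transfer(("adj", "sint"), "RL"))  # same in the other direction
```

In both directions a synthetic adjective such as "samhälleligare" is accepted as input, while only the analytic reading is ever produced as output.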


[Apertium-stuff] The pair eng-deu was: Re: why two dat sg in apertium-deu?

2015-03-11 Thread Per Tunedal
Hi Wolfgang,
I cannot find any pair eng-deu in the repository.
Yours,
Per Tunedal

On Tue, Mar 10, 2015, at 23:43, wolfgang...@web.de wrote:
 Hi,
 
 I'm working with the German apertium-deu monodix (for an eng-deu
 translation).
 
 I checked the dictionary and noticed that many of the noun pardefs have
 two dative singular forms (one of them wrong, always with an e at the end),
 
 e.g.
 <pardef n="Abf/all__n_m">
   <e r="LR"><p><l>alle</l><r>all<s n="n"/><s n="m"/><s n="sg"/><s n="dat"/></r></p><par n="cmp-R"/></e>
   <e>       <p><l>all</l> <r>all<s n="n"/><s n="m"/><s n="sg"/><s n="dat"/></r></p><par n="cmp-R"/></e>
 
 
 In German grammar there is only one dat sg (and one dat pl). Are these
 second dat entries necessary for old translations? In my local installation I
 removed these entries.
 
 Best regards,
 Wolfgang
 


[Apertium-stuff] No package apertium-swe found

2015-03-10 Thread Per Tunedal
Hi,
got into trouble when reinstalling.

./autogen.sh

in the swe-dan directory gives the following error message:

No package 'apertium-swe' found


although the language package is downloaded from SVN and I've run

./autogen.sh
make

in the folder. (Make returns: "Inget behöver göras för All", i.e. nothing to be done for "all".)

Yours,
Per Tunedal




Re: [Apertium-stuff] Compare adjectives in Swedish

2015-03-05 Thread Per Tunedal
Hi,
I'm wondering if translation from a language with comparative and superlative
forms of an adjective to e.g. Swedish works, if the corresponding
Swedish adjective is compared with mer and mest (more and most).

Can the inflected form of the adjective in the source language be
translated to two words (mer/mest + adjective) in the target language?

Yours,
Per Tunedal

On Thu, Mar 5, 2015, at 08:57, Per Tunedal wrote:
 Hi again Kevin,
 I cannot find the transfer rules in apertium-dan-nor. What file should I
 look for?
 
 I would like to understand how it would work in practice.
 
 Yours,
 Per Tunedal
 
 On Wed, Mar 4, 2015, at 21:19, Kevin Brubeck Unhammer wrote:
  Per Tunedal <per.tune...@operamail.com> writes:
  
   Hi again Kevin.
   funny, I just looked at the apertium-dan-nor.nno.dix !
   Hmm ...
   When updating quite a lot of files disappeared :-)
  
   I would like to see what it looks like in the monolingual dictionaries.
  
  languages/apertium-nno/apertium-nno.nno.dix has e.g.:
  
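  <pardef n="sein__adj">
    <e><p><l>are</l>  <r><s n="adj"/><s n="sint"/><s n="comp"/><s n="un"/><s n="sp"/></r></p></e>
    <e><p><l></l>     <r><s n="adj"/><s n="sint"/><s n="posi"/><s n="mf"/><s n="sg"/><s n="ind"/></r></p></e>
    <e><p><l>t</l>    <r><s n="adj"/><s n="sint"/><s n="posi"/><s n="nt"/><s n="sg"/><s n="ind"/></r></p></e>
    <e><p><l>e</l>    <r><s n="adj"/><s n="sint"/><s n="posi"/><s n="un"/><s n="pl"/><s n="ind"/></r></p></e>
    <e><p><l>e</l>    <r><s n="adj"/><s n="sint"/><s n="posi"/><s n="un"/><s n="sp"/><s n="def"/></r></p></e>
    <e><p><l>aste</l> <r><s n="adj"/><s n="sint"/><s n="sup"/><s n="un"/><s n="sp"/><s n="def"/></r></p></e>
    <e><p><l>ast</l>  <r><s n="adj"/><s n="sint"/><s n="sup"/><s n="un"/><s n="sp"/><s n="ind"/></r></p></e>
  </pardef>
  
  <pardef n="OK__adj">
    <e><p><l></l><r><s n="adj"/><s n="posi"/><s n="mf"/><s n="sg"/><s n="ind"/></r></p></e>
    <e><p><l></l><r><s n="adj"/><s n="posi"/><s n="nt"/><s n="sg"/><s n="ind"/></r></p></e>
    <e><p><l></l><r><s n="adj"/><s n="posi"/><s n="un"/><s n="pl"/><s n="ind"/></r></p></e>
    <e><p><l></l><r><s n="adj"/><s n="posi"/><s n="un"/><s n="sp"/><s n="def"/></r></p></e>
  </pardef>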
  
  (posi→pst is on the TODO …)
  
  
   Now I've got:
  
   <pardef n="samhällelig__adj"> <!-- PT: Kompareras med mer och mest eller inte alls -->
     <e><p><l></l>  <r><s n="adj"/><s n="pst"/><s n="ut"/><s n="sg"/><s n="ind"/></r></p></e>
     <e><p><l>t</l> <r><s n="adj"/><s n="pst"/><s n="nt"/><s n="sg"/><s n="ind"/></r></p></e>
     <e><p><l>e</l> <r><s n="adj"/><s n="pst"/><s n="m"/><s n="sg"/><s n="def"/></r></p></e>
     <e><p><l>a</l> <r><s n="adj"/><s n="pst"/><s n="un"/><s n="pl"/><s n="ind"/></r></p></e>
     <e><p><l>a</l> <r><s n="adj"/><s n="pst"/><s n="un"/><s n="sp"/><s n="def"/></r></p></e>
  
     <e r="RL" c="style:fam" a="PT"><p><l>are</l>  <r><s n="adj"/><s n="comp"/><s n="un"/><s n="sp"/></r></p></e>
     <e r="RL" c="style:fam" a="PT"><p><l>ast</l>  <r><s n="adj"/><s n="sup"/><s n="un"/><s n="sp"/><s n="ind"/></r></p></e>
     <e r="RL" c="style:fam" a="PT"><p><l>aste</l> <r><s n="adj"/><s n="sup"/><s n="un"/><s n="sp"/><s n="def"/></r></p></e>
   </pardef>
  
   <e lm="samhällelig" a="is"><i>samhällelig</i><par n="samhällelig__adj"/></e>
  
   Yours,
   Per Tunedal
  
  -- 
  Kevin Brubeck Unhammer
  
  GPG: 0x766AC60C


Re: [Apertium-stuff] Compare adjectives in Swedish

2015-03-04 Thread Per Tunedal
Hi again Kevin,
I cannot find the transfer rules in apertium-dan-nor. What file should I
look for?

I would like to understand how it would work in practice.

Yours,
Per Tunedal

On Wed, Mar 4, 2015, at 21:19, Kevin Brubeck Unhammer wrote:
 Per Tunedal <per.tune...@operamail.com> writes:
 
  Hi again Kevin.
  funny, I just looked at the apertium-dan-nor.nno.dix !
  Hmm ...
  When updating quite a lot of files disappeared :-)
 
  I would like to see what it looks like in the monolingual dictionaries.
 
 languages/apertium-nno/apertium-nno.nno.dix has e.g.:
 
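 <pardef n="sein__adj">
   <e><p><l>are</l>  <r><s n="adj"/><s n="sint"/><s n="comp"/><s n="un"/><s n="sp"/></r></p></e>
   <e><p><l></l>     <r><s n="adj"/><s n="sint"/><s n="posi"/><s n="mf"/><s n="sg"/><s n="ind"/></r></p></e>
   <e><p><l>t</l>    <r><s n="adj"/><s n="sint"/><s n="posi"/><s n="nt"/><s n="sg"/><s n="ind"/></r></p></e>
   <e><p><l>e</l>    <r><s n="adj"/><s n="sint"/><s n="posi"/><s n="un"/><s n="pl"/><s n="ind"/></r></p></e>
   <e><p><l>e</l>    <r><s n="adj"/><s n="sint"/><s n="posi"/><s n="un"/><s n="sp"/><s n="def"/></r></p></e>
   <e><p><l>aste</l> <r><s n="adj"/><s n="sint"/><s n="sup"/><s n="un"/><s n="sp"/><s n="def"/></r></p></e>
   <e><p><l>ast</l>  <r><s n="adj"/><s n="sint"/><s n="sup"/><s n="un"/><s n="sp"/><s n="ind"/></r></p></e>
 </pardef>
 
 <pardef n="OK__adj">
   <e><p><l></l><r><s n="adj"/><s n="posi"/><s n="mf"/><s n="sg"/><s n="ind"/></r></p></e>
   <e><p><l></l><r><s n="adj"/><s n="posi"/><s n="nt"/><s n="sg"/><s n="ind"/></r></p></e>
   <e><p><l></l><r><s n="adj"/><s n="posi"/><s n="un"/><s n="pl"/><s n="ind"/></r></p></e>
   <e><p><l></l><r><s n="adj"/><s n="posi"/><s n="un"/><s n="sp"/><s n="def"/></r></p></e>
 </pardef>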
 
 (posi→pst is on the TODO …)
 
 
  Now I've got:
 
  <pardef n="samhällelig__adj"> <!-- PT: Kompareras med mer och mest eller inte alls -->
    <e><p><l></l>  <r><s n="adj"/><s n="pst"/><s n="ut"/><s n="sg"/><s n="ind"/></r></p></e>
    <e><p><l>t</l> <r><s n="adj"/><s n="pst"/><s n="nt"/><s n="sg"/><s n="ind"/></r></p></e>
    <e><p><l>e</l> <r><s n="adj"/><s n="pst"/><s n="m"/><s n="sg"/><s n="def"/></r></p></e>
    <e><p><l>a</l> <r><s n="adj"/><s n="pst"/><s n="un"/><s n="pl"/><s n="ind"/></r></p></e>
    <e><p><l>a</l> <r><s n="adj"/><s n="pst"/><s n="un"/><s n="sp"/><s n="def"/></r></p></e>
 
    <e r="RL" c="style:fam" a="PT"><p><l>are</l>  <r><s n="adj"/><s n="comp"/><s n="un"/><s n="sp"/></r></p></e>
    <e r="RL" c="style:fam" a="PT"><p><l>ast</l>  <r><s n="adj"/><s n="sup"/><s n="un"/><s n="sp"/><s n="ind"/></r></p></e>
    <e r="RL" c="style:fam" a="PT"><p><l>aste</l> <r><s n="adj"/><s n="sup"/><s n="un"/><s n="sp"/><s n="def"/></r></p></e>
  </pardef>
 
  <e lm="samhällelig" a="is"><i>samhällelig</i><par n="samhällelig__adj"/></e>
 
  Yours,
  Per Tunedal
 
 -- 
 Kevin Brubeck Unhammer
 
 GPG: 0x766AC60C


Re: [Apertium-stuff] High Frequency Missing Words

2015-03-04 Thread Per Tunedal
Hi,
wouldn't it be great if the input language was detected automatically?
Maybe TextCat http://www.let.rug.nl/vannoord/TextCat/ would do the
trick?
Yours,
Per Tunedal
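
To make the TextCat suggestion concrete, here is a toy sketch of the kind of character n-gram ranking such tools are based on (the "out-of-place" distance between ranked n-gram profiles). It is not TextCat's code: it uses only trigrams, and the two one-sentence training samples and all function names are invented for the example, so it is far too small for real use.

```python
from collections import Counter


def profile(text, n=3, size=300):
    """Rank the most frequent character n-grams of text (rank 0 = most frequent)."""
    padded = " " + " ".join(text.lower().split()) + " "
    grams = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    ranked = [gram for gram, _ in grams.most_common(size)]
    return {gram: rank for rank, gram in enumerate(ranked)}


def distance(doc, lang):
    """Out-of-place distance: sum of rank differences, with a fixed penalty
    for n-grams that do not occur in the language profile at all."""
    penalty = len(lang)
    return sum(abs(rank - lang[gram]) if gram in lang else penalty
               for gram, rank in doc.items())


def guess(text, profiles):
    """Return the name of the language profile closest to text."""
    doc = profile(text)
    return min(profiles, key=lambda name: distance(doc, profiles[name]))


# Tiny made-up training samples; real profiles need far more text.
SAMPLES = {
    "swe": "jag vet inte om det är sant men vi får se vad som händer",
    "eng": "i do not know whether it is true but we will see what happens",
}
PROFILES = {name: profile(text) for name, text in SAMPLES.items()}

print(guess(SAMPLES["swe"], PROFILES))
```

A web interface could run something like this on the pasted text and suggest, rather than silently override, a different source language.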

On Wed, Mar 4, 2015, at 21:16, Flammie Pirinen wrote:
 2015-03-04, Tino Didriksen wrote:
 
 
  There's a lot of source/target language confusion and people using the
  entirely wrong language pair, which means we have a problem and need
  to fix the apertium.org interface so people don't make that mistake.
  Or make it detect languages better and override people's choice when
  they're clearly wrong.
 
 I’m not sure it’s an actual problem in UI, I do it all the time that I
 try out bunch of things back and forth and since these new fangly
 widgets nowadays process what I copy-paste without me clicking any
 buttons it happens all the time that I copy-paste stuff first and
 change languages then and it has already translated things in obviously
 wrong pairs. (Yes I am aware of the check-box but it's not that big of
 an issue for me as a user to have wrong translations on the fly that
 I’d bother...)
 
 -- 
 Flammie, computer scientist bachelor + linguist master = computational
 linguist doctor, free software Finnish localiser,
 and more! http://www.iki.fi/flammie/
 
 
 


Re: [Apertium-stuff] Mitzuli released

2015-03-04 Thread Per Tunedal

Hi Mikel,
Congratulations! The app works great. To make it successful,
the most important thing is to work on the language pairs to improve the
translation quality. (For Swedish-Danish the quality is actually better
than what the GF translator achieves, though. What a surprise!)

Yours,
Per Tunedal


On Mon, Mar 2, 2015, at 10:06, Mikel Artetxe wrote:
 Hi Apertiumers,

 I just wanted to let you know that, after a year as a beta, Mitzuli
 has finally been released today, and it is publicly available on
 Google Play.

 For those of you who have not heard about it, Mitzuli is an Apertium
 based translator app for Android with a nice user interface and
 support for advanced features like ASR (voice input), OCR (camera
 input), and TTS (voice output). For more information, you can visit
 its new website at https://www.mitzuli.com (btw, there is now a
 section for the projects in which Mitzuli is based, and Apertium could
 not be missing there, of course ;-)

 The app can be downloaded from
 https://play.google.com/store/apps/details?id=com.mitzuli

 And its source code can be found at
 https://github.com/artetxem/mitzuli

 Regards, Mikel


[Apertium-stuff] Fwd: [Moses-support] DiscoMT 2015 Shared Task on Pronoun Translation (at EMNLP 2015)

2015-03-04 Thread Per Tunedal

Hi, can Apertium somehow be used for this task? Intuitively, the
analysis (part of speech tags) would be useful. Especially if the tags
could be remembered for the previous sentence.

Yours, Per Tunedal


----- Original message -----
From: joerg <tiede...@gmail.com>
To: moses-support <moses-supp...@mit.edu>
Subject: [Moses-support] DiscoMT 2015 Shared Task on Pronoun Translation (at EMNLP 2015)
Date: Fri, 27 Feb 2015 13:43:50 +0100

===
DiscoMT 2015 Shared Task on Pronoun Translation
===

Website: https://www.idiap.ch/workshop/DiscoMT/shared-task
In connection with EMNLP 2015 (http://emnlp2015.emnlp.org)


We are happy to announce a new exciting task for people interested in
(discourse-aware) machine translation, anaphora resolution and machine
learning in general. The EMNLP 2015 Workshop on Discourse in Machine
Translation features two shared tasks:

Task 1: Pronoun-Focused Machine Translation
Task 2: Cross-Lingual Pronoun Prediction


Task 1 requires machine translation (from English to French) and focuses
on the evaluation of translated pronouns. We provide training data and a
baseline SMT model to get started.

Task 2 is a straightforward classification task in which one has to
predict the correct translation of a given pronoun in English (it or
they) into French (ce, elle, elles, il, ils, ça, cela, on, OTHER). We
provide training and development data and a simple baseline system using
an N-gram language model.

More details of the two tasks are attached below and can be found at our
website: https://www.idiap.ch/workshop/DiscoMT/shared-task


Important Dates:

4 May, 2015    Release of the MT test set (task 1)
10 May, 2015   Submission of translations (task 1)
11 May, 2015   Release of the classification test set (task 2)
18 May, 2015   Submissions of classification results (task 2)
28 May, 2015   System paper submission deadline
Sep., 2015     Workshop in Lisbon


Mailing list: https://groups.google.com/d/forum/discomt2015

Downloads:
https://www.dropbox.com/sh/c8qnpag5z29jyh6/qk1TE9-UvcgEnfccdRwxa?dl=0
Download alternative 1: http://opus.lingfil.uu.se/DiscoMT2015/
Download alternative 2: http://stp.lingfil.uu.se/~joerg/DiscoMT2015/


-
Acknowledgements: Funding for the manual evaluation of the
pronoun-focused translation task is generously provided by the European
Association for Machine Translation (EAMT)
-

==
Detailed Task Description:
==


* Overview

The DiscoMT 2015 shared task will consist of two subtasks, relevant to
both the MT and discourse communities: pronoun-focused translation, a
practical MT task, and cross-lingual pronoun prediction, a
classification task that requires no specific MT expertise and is
interesting as a machine learning task in its own right. For groups
wishing to participate in both tasks, one possibility is to convert a
system for the classification task into an MT feature model using
existing software such as the Docent decoder (Hardmeier et al., ACL
2013). Both tasks use the English–French language pair, which has a
sufficiently high baseline performance to produce basically intelligible
output, as well as interesting differences in their pronoun systems.


* Task 1: Pronoun-Focused Translation Task

In the pronoun-focused translation task, you are given a collection of
English input documents, which you are asked to translate into French.
This task is the same as for other MT shared tasks such as that of WMT.
The difference is in the way the translations are evaluated. Instead of
checking the overall translation quality, we specifically look at how
the English subject pronouns it and they were translated. The principal
evaluation will be carried out manually and will focus specifically on
the correctness of pronoun translation. Thanks to a grant from the EAMT,
the manual evaluation will be run by the organisers and participants
don't have to contribute evaluations. Automatic reference-based metrics
are available for development purposes.


The texts in the test corpus will consist of transcripts of TED talks.
The training data contains an in-domain corpus of TED talks as well as
some additional data from Europarl and news texts. To make the
participating systems as comparable as possible, we ask you to constrain
the training data of your system to the resources listed below as far as
you can, but this is not a strict requirement and we do accept
submissions using additional resources. If your system uses any
resources other than those of the official data release, please be
specific about what was included in the system description paper. For
the same reason, we also suggest that you use the tokeniser provided by
us unless you have a good reason to do otherwise.

The test set will be supplied

Re: [Apertium-stuff] Compare adjectives in Swedish

2015-03-04 Thread Per Tunedal
Hi again Kevin.
funny, I just looked at the apertium-dan-nor.nno.dix !
Hmm ...
When updating quite a lot of files disappeared :-)

I would like to see what it looks like in the monolingual dictionaries.

Now I've got:

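<pardef n="samhällelig__adj"> <!-- PT: Compared with "mer" and "mest", or
not at all -->
  <e>   <p><l></l>  <r><s n="adj"/><s n="pst"/><s n="ut"/>
  <s n="sg"/><s n="ind"/></r></p></e>
  <e>   <p><l>t</l> <r><s n="adj"/><s n="pst"/><s n="nt"/>
  <s n="sg"/><s n="ind"/></r></p></e>
  <e>   <p><l>e</l> <r><s n="adj"/><s n="pst"/><s n="m"/>
  <s n="sg"/><s n="def"/></r></p></e>
  <e>   <p><l>a</l> <r><s n="adj"/><s n="pst"/><s n="un"/>
  <s n="pl"/><s n="ind"/></r></p></e>
  <e>   <p><l>a</l> <r><s n="adj"/><s n="pst"/><s n="un"/>
  <s n="sp"/><s n="def"/></r></p></e>

  <e r="RL" c="style:fam" a="PT">   <p><l>are</l>   <r>
  <s n="adj"/><s n="comp"/><s n="un"/><s n="sp"/></r></p></e>
  <e r="RL" c="style:fam" a="PT">   <p><l>ast</l>   <r>
  <s n="adj"/><s n="sup"/><s n="un"/><s n="sp"/><s n="ind"/></r></p></e>
  <e r="RL" c="style:fam" a="PT">   <p><l>aste</l>  <r>
  <s n="adj"/><s n="sup"/><s n="un"/><s n="sp"/><s n="def"/></r></p></e>
</pardef>

<e lm="samhällelig" a="is"><i>samhällelig</i><par
n="samhällelig__adj"/></e>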

Yours,
Per Tunedal




On Wed, Mar 4, 2015, at 13:46, Kevin Brubeck Unhammer wrote:
 Per Tunedal per.tune...@operamail.com
 writes:
 
  Hi Kevin,
  I cannot find any sint-tag in the dan-nor dictionaries e.g.
  apertium-dan-nor.nno.dix
 
 There is no apertium-dan-nor.nno.dix. 
 
 -- 
 Kevin Brubeck Unhammer
 
 GPG: 0x766AC60C
 --
 Dive into the World of Parallel Programming The Go Parallel Website,
 sponsored
 by Intel and developed in partnership with Slashdot Media, is your hub
 for all
 things parallel software development, from weekly thought leadership
 blogs to
 news, videos, case studies, tutorials and more. Take a look and join the 
 conversation now. http://goparallel.sourceforge.net/
 ___
 Apertium-stuff mailing list
 Apertium-stuff@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/apertium-stuff


Re: [Apertium-stuff] High Frequency Missing Words

2015-03-04 Thread Per Tunedal

Hi Tino, well, the one-time words are not of any interest anyway. They
just make the list much longer. As far as I'm concerned, you could just
delete them.
Yours, Per Tunedal


On Wed, Mar 4, 2015, at 14:24, Tino Didriksen wrote:
 On 4 March 2015 at 14:16, Joonas Kylmälä j.kylm...@gmail.com wrote:
 This looks good! But there is one thing which came to my mind: what if
 people write something personal there and they don't want it to show
 publicly? If the problem has not been taken into account yet, maybe we
 could only show the words which occur two or more times?

 Thought about that, but something secret that you can somehow make out
 the context of when you have 1 token to look at? Highly unlikely.

 Sure, if someone writes
 MyDirtySecretAllInOneTokenAndMyNameIsXAndILiveInY then that'll show,
 but I just can't see that being a real world problem.

 -- TD


Re: [Apertium-stuff] Compare adjectives in Swedish

2015-03-04 Thread Per Tunedal
Hi Kevin,
I cannot find any sint-tag in the dan-nor dictionaries e.g.
apertium-dan-nor.nno.dix

Yours,
Per Tunedal

On Wed, Mar 4, 2015, at 11:42, Kevin Brubeck Unhammer wrote:
 Per Tunedal per.tune...@operamail.com
 writes:
 
  Hi,
  how to treat adjectives that are compared with the help of mer and
  mest (more and most), instead of the more common endings -are and
  -aste?
 
  I cannot have a paradigm with words before the adjective, can I?
 
  It should not be:
 
  *samhällelig
  *samhälleligare
  *samhälleligast
 
  but:
  samhällelig
  mer samhällelig
  mest samhällelig
 
  if you would compare this rare adjective. There are a lot of similar
  adjectives in Swedish. In fact most long adjectives are compared this
  way.
 
 That's handled in transfer; see dan-nor, nno-nob (e.g. macro
 set_grau_aux2).
 
 They should have different taggings; adjectives that inflect
 «-are/-ast(e)» have adjsint (synthetic), while adj's that take
 «mer/mest» should just have adj (analytic). This way, transfer
 knows if it's possible to generate a sup/comp form.
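
A minimal sketch of the tagging distinction described above, written in Python rather than as a transfer rule. The function name compare_adjective and both lookup tables are hypothetical, chosen only for illustration; the suffixing is deliberately naive, and a real pair would leave inflection to the generator.

```python
# Illustrative only: compare_adjective and the two tables are hypothetical
# helpers, not part of Apertium.
SYNTHETIC_SUFFIX = {"comp": "are", "sup": "ast"}
ANALYTIC_AUX = {"comp": "mer", "sup": "mest"}


def compare_adjective(lemma, tags, degree):
    """Return a comparative ('comp') or superlative ('sup') form.

    Adjectives tagged 'adjsint' take an ending; plain 'adj' adjectives
    are compared with the auxiliary words 'mer' and 'mest'.
    """
    if degree not in SYNTHETIC_SUFFIX:
        raise ValueError("degree must be 'comp' or 'sup'")
    if "adjsint" in tags:
        # Naive suffixing: ignores stem changes and irregular adjectives.
        return lemma + SYNTHETIC_SUFFIX[degree]
    return f"{ANALYTIC_AUX[degree]} {lemma}"


print(compare_adjective("varm", ["adjsint"], "comp"))
print(compare_adjective("samhällelig", ["adj"], "sup"))
```

The point of the sketch is only that the choice is driven by the tag, so the dictionary entry for samhällelig needs no -are/-ast forms at all.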
 
 
 -- 
 Kevin Brubeck Unhammer
 
 GPG: 0x766AC60C


Re: [Apertium-stuff] High Frequency Missing Words

2015-03-04 Thread Per Tunedal

Hi Tino, very interesting. Looking at the swe-dan file I'm a bit
confused. I should look for Swedish words, not Danish, shouldn't I? Or
are both directions included?

Yours, Per Tunedal


On Wed, Mar 4, 2015, at 11:22, Tino Didriksen wrote:
 On 4 March 2015 at 10:53, Francis Tyers fty...@prompsit.com wrote:
 For the kaz-tat,tat-kaz directions you could grep out Latin characters,
 would remove at least some of the bokmål and nynorsk :D

 But that would mean adding special case code to the export script,
 which sounds boring.

 Instead, I've added a Download link to each pair so anyone can just
 get the entire dump as a tab-separated UTF-8 plain text file and do
 their own filtering.
 E.g., http://apertium.projectjj.com/missingFreqs.php?export=swe-dan
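
A rough sketch of the kind of local filtering meant here, assuming each line of the dump is "token<TAB>count". That column layout is an assumption, not something confirmed in this thread, and the sample rows are invented for the example. The helper name filter_missing is hypothetical.

```python
import re

# Any basic Latin letter; extend the class if accented letters matter.
LATIN = re.compile(r"[A-Za-z]")


def filter_missing(tsv_text, min_count=2, drop_latin=False):
    """Keep (token, count) rows from a tab-separated dump.

    Assumes the first column is the token and the second its frequency.
    Rows that do not fit that shape are skipped.
    """
    kept = []
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        token, count = fields[0], int(fields[1])
        if count < min_count:
            continue  # drop words seen fewer than min_count times
        if drop_latin and LATIN.search(token):
            continue  # drop tokens containing Latin letters
        kept.append((token, count))
    return kept


# Invented sample rows, only to show the call.
sample = "karavanförare\t3\nболл\t5\nхат\t1\n"
print(filter_missing(sample))
print(filter_missing(sample, drop_latin=True))
```

Setting min_count=2 also covers the suggestion of hiding words that occur only once.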

 -- TD


Re: [Apertium-stuff] High Frequency Missing Words

2015-03-04 Thread Per Tunedal

Hi Tino, Excellent. Interesting to read. Quite an incentive to add some
more words.

It might be a good idea to publish the data on a regular basis, e.g.
once a year.

BTW What puzzles me, is that the missing words are not very frequent in
a general domain corpus. The words apparently reflect the interests of
the users. Maybe the general domain users are quite happy? But some
popular domains are missing?

Yours, Per Tunedal


On Wed, Mar 4, 2015, at 13:54, Tino Didriksen wrote:
 On 4 March 2015 at 13:49, Per Tunedal
 per.tune...@operamail.com wrote:
 very interesting. Looking at the swe-dan file I'm a bit confused. I
 should look for Swedish words, not Danish, shouldn't I? Or are both
 directions included?

 They're separate:
 - http://apertium.projectjj.com/missingFreqs.php?pair=dan-swe
 - http://apertium.projectjj.com/missingFreqs.php?pair=swe-dan

 Which just goes to show people pick the wrong direction, or don't pick
 a direction at all.

 -- TD


[Apertium-stuff] Compare adjectives in Swedish

2015-03-03 Thread Per Tunedal
Hi,
how to treat adjectives that are compared with the help of mer and
mest (more and most), instead of the more common endings -are and
-aste?

I cannot have a paradigm with words before the adjective, can I?

It should not be:

*samhällelig
*samhälleligare
*samhälleligast

but:
samhällelig
mer samhällelig
mest samhällelig

if you would compare this rare adjective. There are a lot of similar
adjectives in Swedish. In fact most long adjectives are compared this
way.

Yours,
Per Tunedal




Re: [Apertium-stuff] The pipeline

2015-02-18 Thread Per Tunedal
Hi Francis,
Thank you! It works like a charm.
Yours,
Per Tunedal

On Wed, Feb 18, 2015, at 17:25, Francis Tyers wrote:
 A 2015-02-18 17:15, Per Tunedal escrigué:
  Hi Francis,
  Thank you. -g works OK:
  
  echo Vi behöver en annan boll | lt-proc sv-da.automorf.bin |
  
  apertium-tagger -g $2 sv-da.prob |
  apertium-pretransfer |
  lt-proc -b sv-da.autobil.bin |
  apertium-transfer -b apertium-sv-da.sv-da.t1x  sv-da.t1x.bin |
  lt-proc -g $1 sv-da.autogen.bin  |
  less
  
  But why isn't there any -g switch after lt-proc in the modes file?
  
  And how do I get rid of * @ etc. see below:
  
  Vi behøver en anden *karavanförare \@genast
  (END)
  
  Yours,
  Per Tunedal
  
 
 You shouldn't be using $1 and $2 in the command line, they refer to 
 command line arguments.
 
 The switch comes from the main apertium script; that is why it is not in 
 the modes file. To get rid of the diagnostics you can use:
 
   -n, --non-marked-gen    morph. generation without unknown word marks
 
 (from the lt-proc --help output)
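
To make the point about $1 and $2 concrete, here is a small sketch of how a wrapper might fill them in before running the pipeline. It assumes the wrapper passes the generation switch as the first argument and an optional tagger option as the second; that mapping, and the helper name fill_mode_placeholders, are illustrative assumptions rather than a description of the actual apertium script.

```python
import shlex


def fill_mode_placeholders(stages, gen_flag="-g", tagger_flag=""):
    """Replace $1 and $2 in each pipeline stage with concrete switches.

    gen_flag stands in for $1 and tagger_flag for $2. A placeholder
    that expands to an empty string is dropped from the command.
    """
    filled = []
    for stage in stages:
        words = []
        for word in shlex.split(stage):
            if word == "$1":
                word = gen_flag
            elif word == "$2":
                word = tagger_flag
            if word:
                words.append(word)
        filled.append(" ".join(words))
    return " | ".join(filled)


stages = [
    "lt-proc sv-da.automorf.bin",
    "apertium-tagger -g $2 sv-da.prob",
    "apertium-pretransfer",
    "lt-proc -b sv-da.autobil.bin",
    "apertium-transfer -b apertium-sv-da.sv-da.t1x sv-da.t1x.bin",
    "lt-proc $1 sv-da.autogen.bin",
]
# Use -n for generation, i.e. no marks on unknown words.
print(fill_mode_placeholders(stages, gen_flag="-n"))
```

When typing the pipeline by hand, the same substitution is done manually: write the switch itself where the mode file says $1, and leave $2 out.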
 
 Fran
 
 --
 Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
 from Actuate! Instantly Supercharge Your Business Reports and Dashboards
 with Interactivity, Sharing, Native Excel Exports, App Integration  more
 Get technology previously reserved for billion-dollar corporations, FREE
 http://pubads.g.doubleclick.net/gampad/clk?id=190641631iu=/4140/ostg.clktrk
 ___
 Apertium-stuff mailing list
 Apertium-stuff@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/apertium-stuff



Re: [Apertium-stuff] The pipeline

2015-02-18 Thread Per Tunedal
Hi Francis,
Thank you. -g works OK:

echo Vi behöver en annan boll | lt-proc sv-da.automorf.bin | 

apertium-tagger -g $2 sv-da.prob | 
apertium-pretransfer | 
lt-proc -b sv-da.autobil.bin | 
apertium-transfer -b apertium-sv-da.sv-da.t1x  sv-da.t1x.bin | 
lt-proc -g $1 sv-da.autogen.bin  | 
less

But why isn't there any -g switch after lt-proc in the modes file?

And how do I get rid of * @ etc. see below:

Vi behøver en anden *karavanförare \@genast
(END) 

Yours,
Per Tunedal

On Wed, Feb 18, 2015, at 16:50, Francis Tyers wrote:
 
 
 A 2015-02-18 14:23, Per Tunedal escrigué:
  Hi Mikel,
  Thank you. All works except the last step:
  lt-proc $1 /home/per/Repository/apertium-sv-da/da-sv.autogen.bin
  
  I run:
  echo Vi behöver en annan boll | lt-proc sv-da.automorf.bin |
  
  apertium-tagger -g $2 sv-da.prob |
  apertium-pretransfer |
  lt-proc -b sv-da.autobil.bin |
  apertium-transfer -b apertium-sv-da.sv-da.t1x  sv-da.t1x.bin |
  lt-proc $1 sv-da.autogen.bin  |
  less
  
  and get:
  std::exception
  
  But if I run the ordinary way everything works OK:
  echo Vi behöver en annan boll | apertium -u -d . da-sv
  
  BTW How do I pass the -u if I run the commands manually?
 
 Try:
 
 lt-proc -n /home/per/Repository/apertium-sv-da/da-sv.autogen.bin
 lt-proc -g /home/per/Repository/apertium-sv-da/da-sv.autogen.bin
 lt-proc -d /home/per/Repository/apertium-sv-da/da-sv.autogen.bin
 
 Different generation modes, the --help output will explain.
 
 F.
 


Re: [Apertium-stuff] The pipeline

2015-02-18 Thread Per Tunedal
Hi Mikel,
Thank you. All works except the last step:
lt-proc $1 /home/per/Repository/apertium-sv-da/da-sv.autogen.bin 

I run:
echo Vi behöver en annan boll | lt-proc sv-da.automorf.bin | 

apertium-tagger -g $2 sv-da.prob | 
apertium-pretransfer | 
lt-proc -b sv-da.autobil.bin | 
apertium-transfer -b apertium-sv-da.sv-da.t1x  sv-da.t1x.bin | 
lt-proc $1 sv-da.autogen.bin  | 
less

and get:
std::exception

But if I run the ordinary way everything works OK:
echo Vi behöver en annan boll | apertium -u -d . da-sv

BTW How do I pass the -u if I run the commands manually?

Yours,
Per Tunedal



On Wed, Feb 18, 2015, at 11:54, Mikel L. Forcada wrote:
 Hi, Per.
 When the modes.xml file is compiled, a set of shell scripts are 
 generated in directory modes/ , one for each mode. Checking these out 
 may give you some inspiration on how to run commands manually.
 
 HTH
 
 Mikel
 
 
 El 18/02/15 a les 09:42, Per Tunedal ha escrit:
  Hi,
  I would like to explore the pipeline Apertium uses for translation. I
  can see the steps in the modes.xml file, but I cannot figure out how to
  write the commands to run them manually. How do I get the output of
  lt-proc as it looks when it is forwarded to the tagger?
 
 <mode name="swe-dan-bytecode" install="yes">
   <pipeline>
     <program name="lt-proc">
       <file name="swe-dan.automorf.bin"/>
     </program>
     <program name="apertium-tagger -g $2">
       <file name="swe-dan.prob"/>
     </program>
 
  The following doesn't work:
 
  echo Vi behöver en annan boll | lt-proc swe-dan.automorf.bin | less
 
  What should I enter instead?
 
  Yours,
  Per Tunedal
 
 
 
 -- 
   Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
 Departament de Llenguatges i Sistemes Informàtics
 Universitat d'Alacant
 E-03071 Alacant, Spain
 Phone: +34 96 590 9776
 Fax: +34 96 590 9326
 
 


Re: [Apertium-stuff] The pipeline

2015-02-18 Thread Per Tunedal
Hi Mikel

On Wed, Feb 18, 2015, at 11:54, Mikel L. Forcada wrote:
 Hi, Per.
 When the modes.xml file is compiled, a set of shell scripts are 
 generated in directory modes/ , one for each mode. Checking these out 
 may give you some inspiration on how to run commands manually.
 
 HTH
 
 Mikel
 
 
 El 18/02/15 a les 09:42, Per Tunedal ha escrit:
  Hi,
  I would like to explore the pipeline Apertium uses for translation. I
  can see the steps in the modes.xml file, but I cannot figure out how to
  write the commands to run them manually. How do I get the output of
  lt-proc as it looks when it is forwarded to the tagger?
 
 <mode name="swe-dan-bytecode" install="yes">
   <pipeline>
     <program name="lt-proc">
       <file name="swe-dan.automorf.bin"/>
     </program>
     <program name="apertium-tagger -g $2">
       <file name="swe-dan.prob"/>
     </program>
 
  The following doesn't work:
 
  echo Vi behöver en annan boll | lt-proc swe-dan.automorf.bin | less
 
  What should I enter instead?
 
  Yours,
  Per Tunedal
 
 
 
 -- 
   Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
 Departament de Llenguatges i Sistemes Informàtics
 Universitat d'Alacant
 E-03071 Alacant, Spain
 Phone: +34 96 590 9776
 Fax: +34 96 590 9326
 
 


[Apertium-stuff] Fwd: languages/apertium-swe failed nightly build

2015-02-16 Thread Per Tunedal
Hi,
don't blame me! The dependencies are not met.
Yours,
Per Tunedal

- Original message -
From: root apertium-packag...@projectjj.com
To: apertium-packag...@lists.sourceforge.net,
tune...@users.sourceforge.net
Subject: languages/apertium-swe failed nightly build
Date: Tue, 17 Feb 2015 03:27:34 +0000 (UTC)


Package: languages/apertium-swe
started: Tue Feb 17 03:22:26 UTC 2015
latest: 0.1.0~r58937
existing: 0.1.0~r58773-1
distv: 1
launching rebuild
data only
stopped: Tue Feb 17 03:27:33 UTC 2015
FAILED:

http://apertium.projectjj.com/apt/logs/apertium-swe/jessie-amd64.log
blames in revisions 58774:58937 :
tunedal



Re: [Apertium-stuff] Merging apertium-is-sv.is-sv.dix into apertium-swe.swe.dix

2015-02-16 Thread Per Tunedal
Hi Francis,
what if someone would like to create a new pair with Swedish and some
more distant language like German, English or French? To me it looks
great to include this feature to make the monolingual dictionary more
independent of the language pairs.

And further, shouldn't this be the standard for all languages? It would
facilitate the development of new pairs: the more distance between the
pairs, the better.

Yours,
Per Tunedal

On Mon, Feb 16, 2015, at 11:33, Francis Tyers wrote:
 A 2015-02-16 09:03, Per Tunedal escrigué:
  Hi,
  can anyone explain the implications of how the months are treated in 
  the
  is-sv pair? It differs significantly from the pair swe-dan. Should this
  somehow be introduced into the pair swe-dan?
  
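  <!-- Punctuation -->
  
  <pardef n="mánuðir">
    <e><p><l>januari</l><r>januari</r></p></e>
    <e><p><l>februari</l><r>februari</r></p></e>
    <e><p><l>mars</l><r>mars</r></p></e>
    <e><p><l>april</l><r>april</r></p></e>
    <e><p><l>maj</l><r>maj</r></p></e>
    <e><p><l>juni</l><r>juni</r></p></e>
    <e><p><l>juli</l><r>juli</r></p></e>
    <e><p><l>augusti</l><r>augusti</r></p></e>
    <e><p><l>september</l><r>september</r></p></e>
    <e><p><l>oktober</l><r>oktober</r></p></e>
    <e><p><l>november</l><r>november</r></p></e>
    <e><p><l>december</l><r>december</r></p></e>
  </pardef>
  
  <!-- 1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th 11th 12th 13th 14th
  15th ... 20th 21st -->
  <pardef n="dates">
    <e>   <re>[2-3]?[0,4-9]</re><p><l><b/></l><r><b/></r></p><par
    n="mánuðir"/><p><l/><r><s n="num"/></r></p></e>
    <e>   <re>1[0-9]</re><p><l><b/></l><r><b/></r></p><par
    n="mánuðir"/><p><l/><r><s n="num"/></r></p></e>
  </pardef>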
  
 
 This is a fine way to deal with months for translation, to e.g. get
 
 3. nóvember - 3rd November
 
 However, I suppose in swe-dan and isl-swe it isn't strictly necessary as 
 they work the same.
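
As a quick check of which day numbers the two patterns in the dates paradigm accept, the sketch below tries them with Python's re module. This is only a stand-in: the regular-expression dialect used by lttoolbox is more limited and may treat some characters differently, and the helper name is_date_number is hypothetical.

```python
import re

# The two patterns from the "dates" paradigm, copied as written.
DATE_PATTERNS = [
    re.compile(r"[2-3]?[0,4-9]"),
    re.compile(r"1[0-9]"),
]


def is_date_number(token):
    """True if the whole token matches one of the two patterns."""
    return any(p.fullmatch(token) for p in DATE_PATTERNS)


for token in ["1", "3", "4", "10", "17", "20", "21", "24", "30"]:
    print(token, is_date_number(token))

# In Python's syntax the comma inside [0,4-9] is a literal member of
# the class, so a lone "," also matches the first pattern.
print(repr(","), is_date_number(","))
```

Under these assumptions 4, 10 to 19, 20, 24 to 29 and 30 are accepted, while 1, 2, 3 and 21 are not, so the two entries shown cover only part of the list in the comment above the paradigm.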
 
 Fran
 


Re: [Apertium-stuff] Merging apertium-is-sv.is-sv.dix into apertium-swe.swe.dix

2015-02-16 Thread Per Tunedal
Hi,
can anyone explain the implications of how the months are treated in the
is-sv pair? It differs significantly from the pair swe-dan. Should this
somehow be introduced into the pair swe-dan?

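<!-- Punctuation -->

<pardef n="mánuðir">
  <e><p><l>januari</l><r>januari</r></p></e>
  <e><p><l>februari</l><r>februari</r></p></e>
  <e><p><l>mars</l><r>mars</r></p></e>
  <e><p><l>april</l><r>april</r></p></e>
  <e><p><l>maj</l><r>maj</r></p></e>
  <e><p><l>juni</l><r>juni</r></p></e>
  <e><p><l>juli</l><r>juli</r></p></e>
  <e><p><l>augusti</l><r>augusti</r></p></e>
  <e><p><l>september</l><r>september</r></p></e>
  <e><p><l>oktober</l><r>oktober</r></p></e>
  <e><p><l>november</l><r>november</r></p></e>
  <e><p><l>december</l><r>december</r></p></e>
</pardef>

<!-- 1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th 11th 12th 13th 14th
15th ... 20th 21st -->
<pardef n="dates">
  <e>   <re>[2-3]?[0,4-9]</re><p><l><b/></l><r><b/></r></p><par
  n="mánuðir"/><p><l/><r><s n="num"/></r></p></e>
  <e>   <re>1[0-9]</re><p><l><b/></l><r><b/></r></p><par
  n="mánuðir"/><p><l/><r><s n="num"/></r></p></e>
</pardef>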


Yours,
Per Tunedal

On Tue, Feb 10, 2015, at 09:01, Per Tunedal wrote:
 Hi,
 
--snip--
 The months are treated differently. I'm not sure exactly what Tihomir
 has done, but it looks neat. This applies to the bidix as well. Can
 anyone explain? Should this be used in apertium-swe.swe.dix and
 apertium-swe-dan.swe-dan.dix ?
 
--snip--
 
 Yours,
 Per Tunedal
 


[Apertium-stuff] Finding errors in dictionaries

2015-02-15 Thread Per Tunedal
Hi,
I've now created the page "Finding errors in dictionaries" in the
wiki:
http://wiki.apertium.org/wiki/Finding_errors_in_dictionaries

I hope it will help contributors to improve translation quality. I
haven't translated it to French yet, though.

Yours,
Per Tunedal



Re: [Apertium-stuff] New GSOC ideas

2015-02-11 Thread Per Tunedal
Hi Francis,
I really like the idea "Make a program which tests Apertium data files
for suspicious or unrecommended constructs (likely to be bugs)". For
someone like me it's very easy to make a minor mistake when editing
those bloody XML files :-) It's quite easy to miss a quotation mark (")
or some other symbols (<>) that aren't all that important in ordinary
language. Or to omit some closing symbols at the right side of the
expression (/>).

One way to improve checking would be not just to have separate programs
like Jimmy O'Regan's lint tool for tsx files, but also to make the make
script more explicit about errors: give some helpful hints about common
errors and print the offending line with explicit info. Or rather the
offending expression? This applies to make scripts for dictionaries as
well as for tagger training.
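
A minimal sketch of the kind of message I have in mind, using only Python's standard XML parser. It catches well-formedness slips such as a missing quotation mark or closing tag and prints the reported line next to the error; it knows nothing about Apertium-specific constructs. The function name check_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses, else a short error report.

    The report gives the line and column where the parser gave up,
    together with the text of that line.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        lines = xml_text.splitlines()
        offending = lines[line - 1].strip() if 0 < line <= len(lines) else ""
        return f"line {line}, column {column}: {err}\n  {offending}"
    return None


# The closing quotation mark after "dates" is missing on purpose.
broken = '<dictionary>\n  <pardef n="dates>\n  </pardef>\n</dictionary>'
print(check_well_formed(broken))
```

The parser reports where it gave up, which can be a line or two after the actual mistake, so printing the neighbouring lines as well would be a natural next step.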

The advantage of this is that everyone has to run the make script, but
it's easy to forget to run a special tool or simply not be aware of
its existence.

Regarding the make scripts for tagger training, it would be very welcome
if they would work with comments in the tsx-files. Working without
comments complicates the work considerably. That's the main reason why I
abandoned the work on retraining the tagger for the pair Swedish-Danish.

Yours,
Per Tunedal


On Thu, Feb 12, 2015, at 01:08, Francis Tyers wrote:
 Hello all,
 
 We've added some new ideas for GSOC:
 
 * Weighted transfer rules
 * Automatic blank handling
 * Integration and debugging tools for Grammatical Framework
 * Weights in lttoolbox
 * Improvements to the Apertium website
 
 Please don't feel shy about fleshing out the ideas and improving the 
 descriptions. :D
 
 http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code
 
 We currently have thirteen ideas and could do with a few more. Something 
 around seven or eight more would be good.
 
 Entry level: 3
 Medium: 5
 Hard: 5
 
 It would be good to have a mix, so 4 more entry level ones and two each 
 medium and hard or so.
 
 Fran
 


[Apertium-stuff] How do I add a page in the Wiki

2015-02-10 Thread Per Tunedal
Hi,
I would like to contribute some experiences of how to find errors in a
dictionary. Unfortunately, I cannot figure out how to add a new page to
the Wiki. I would like to add it with the Documentation page as parent.

Yours,
Per Tunedal



[Apertium-stuff] Changes for pronouns and adjectives in apertium-swe.swe.dix

2015-02-10 Thread Per Tunedal
Hi,
I've made some changes for pronouns and adjectives in
apertium-swe.swe.dix to make it more accurate.

These changes have not been reflected in the Danish dictionary though,
because I believe they might break the pair Danish-Norwegian.
Furthermore, I'm not all that strong in Danish anyway. These are rather
important features in a language and have to be right.

Could anyone more savvy have a look and make the appropriate changes in
the Danish dictionary (and possibly in the bidix)?

What about the tricky word "vad", for instance? Now it looks like this
in the bidix:

<!-- PT: Changed to pronoun: <s n="prn"/><s n="rel"/>
<e>       <p><l>vad<s n="adv"/><s n="itg"/></l><r>hvad<s n="adv"/><s n="itg"/></r></p></e>
-->
<e a="PT">   <p><l>vad<s n="prn"/><s n="rel"/></l><r>hvad<s n="adv"/><s n="itg"/></r></p></e>

Yours,
Per Tunedal



Re: [Apertium-stuff] Why am I getting this for swe-dan

2015-02-10 Thread Per Tunedal

Hi Tino, thank you. Now I've fixed the typo. Yours, Per Tunedal


On Tue, Feb 10, 2015, at 08:36, Tino Didriksen wrote:
 Replied inline...

 On 10 February 2015 at 08:24, Per Tunedal
 per.tune...@operamail.com wrote:
 I've got this strange message:


Package: languages/apertium-swe

started: Tue Feb 10 07:13:28 UTC 2015

latest: 0.1.0~r58796

existing: 0.1.0~r58773-1

distv: 1

launching rebuild

data only

stopped: Tue Feb 10 07:18:25 UTC 2015

FAILED:
 http://apertium.projectjj.com/apt/logs/apertium-swe/wheezy-amd64.log
 http://apertium.projectjj.com/apt/logs/apertium-swe/jessie-amd64.log
 http://apertium.projectjj.com/apt/logs/apertium-swe/sid-amd64.log
 http://apertium.projectjj.com/apt/logs/apertium-swe/precise-amd64.log
 http://apertium.projectjj.com/apt/logs/apertium-swe/trusty-amd64.log
 http://apertium.projectjj.com/apt/logs/apertium-swe/utopic-amd64.log
 http://apertium.projectjj.com/apt/logs/apertium-swe/vivid-amd64.log

blames in revisions 58774:58796 :

tunedal


Why? Have I made some mistake?

 This is the kind of mail you get if you break the build for a package.
 You broke apertium-swe, and the why is in the linked logs, usually at
 the bottom. In this case, the errors are:

 apertium-swe.swe.dix:1264: element e: Schemas validity error : Element
 'e': Character content other than whitespace is not allowed because
 the content type is 'element-only'. apertium-swe.swe.dix:1264: element
 e: validity error : Element e content does not follow the DTD,
 expecting (i | p | par | re)+, got (p CDATA)

 You must have forgotten to check that make for apertium-swe passed
 before committing your changes.

 -- Tino Didriksen
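
The validity error above (character content directly inside an e element)
can also be looked for before committing. The following is only a sketch
using Python's standard xml.etree.ElementTree; the function name
stray_text_entries is hypothetical, and it is not a replacement for
running make or the real DTD/schema validation.

```python
import xml.etree.ElementTree as ET


def stray_text_entries(dix_xml):
    """Yield serialised <e> entries that hold text directly inside <e>.

    Text is only expected inside children such as <i>, <l> or <r>, so any
    non-whitespace text in e.text or in a child's tail is reported.
    """
    root = ET.fromstring(dix_xml)
    for e in root.iter("e"):
        pieces = [e.text] + [child.tail for child in e]
        if any(p and p.strip() for p in pieces):
            yield ET.tostring(e, encoding="unicode").strip()
```

Calling list(stray_text_entries(text)) on the contents of a dictionary
file returns the suspicious entries, which are usually the result of a
typo such as a missing angle bracket.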



Re: [Apertium-stuff] Error in expanded dictionary

2015-02-10 Thread Per Tunedal
Hi Kevin,
Thanks for the reply.
Yours,
Per Tunedal

On Tue, Feb 10, 2015, at 10:49, Kevin Brubeck Unhammer wrote:
 Per Tunedal per.tune...@operamail.com
 writes:
 
  Hi,
  why is:
 
  NON_ANALYSIS
 
  appended after words at many lines?
 
  eg.
 
  aktrisernaNON_ANALYSIS
 
 There's a bug in lt-comp, where if you have a pardef that looks like
 
 <par n="foo">
   <e></e>
 </par>
 
 then it'll produce an FST that leads to lt-proc hanging. So if you want
 a pardef like
 
 <pardef n="cmp">
   <e>       <p><l></l>  <r></r></p></e>
   <e r="RL"><p><l></l>  <r><s n="cmp"/></r></p></e>
 </pardef>
 
 which adds the cmp tag only for the RL FST, then the LR FST uses
 
 <pardef n="cmp">
   <e>       <p><l></l>  <r></r></p></e>
 </pardef>
 
 which gives this bug. Thus we do
 
 <pardef n="cmp">
   <e>       <p><l></l>  <r></r></p></e>
   <e r="RL"><p><l></l>  <r><s n="cmp"/></r></p></e>
   <e>       <p><l>NON_ANALYSIS</l> <r>DUE_TO_LT_PROC_HANG</r></p></e>
 </pardef>
 
 
 Yes, the bug should be fixed, it just hasn't been annoying enough yet
 that anyone's gotten around to it :-)
 
 (And the NON_ANALYSIS of course will presumably never be seen in a
 corpus[1] so it's harmless to have it in there.)
 
 
 [1] Except for the corpus of apertium-stuff emails and #apertium IRC
 logs.
 
 -- 
 Kevin Brubeck Unhammer
 
 GPG: 0x766AC60C



Re: [Apertium-stuff] Changes for pronouns and adjectives in apertium-swe.swe.dix

2015-02-10 Thread Per Tunedal
Hi Francis,
OK, now vad - hvad is treated both as relative pronoun and
interrogative adverb. Before my changes it was treated only as
interrogative adverb. In my Swedish grammar it's both relative and
interrogative pronoun.
Yours,
Per Tunedal

On Tue, Feb 10, 2015, at 10:11, Francis Tyers wrote:
 On 2015-02-10 09:24, Per Tunedal wrote:
  Hi,
  I've made some changes for pronouns and adjectives in
  apertium-swe.swe.dix to make it more accurate.
  
  These changes have not been reflected in the Danish dictionary though,
  because I believe they might break the pair Danish-Norwegian.
  Furthermore, I'm not all that strong in Danish anyway. These are rather
  important features in a language and have to be right.
  
  Could anyone more savvy have a look and make the appropriate changes in
  the Danish dictionary (and possibly in the bidix)?
  
  What about the tricky word "vad", for instance? Now it looks like this
  in the bidix:
  
  <!-- PT: Changed to pronoun: <s n="prn"/><s n="rel"/>
  <e>       <p><l>vad<s n="adv"/><s n="itg"/></l><r>hvad<s n="adv"/><s n="itg"/></r></p></e>
  -->
  <e a="PT">   <p><l>vad<s n="prn"/><s n="rel"/></l><r>hvad<s n="adv"/><s n="itg"/></r></p></e>
 
 When making changes to the Swedish dictionary I propose that you follow
 the patterns in the Norwegian and Danish dictionaries.
 
 dan-nor:
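 <e vr="nob"><p><l>hvad<s n="prn"/><s n="itg"/></l><r>hva<s n="prn"/><s n="itg"/></r></p></e>
 <e vr="nno"><p><l>hvad<s n="prn"/><s n="itg"/></l><r>kva<s n="prn"/><s n="itg"/></r></p></e>
 <e vr="nob"><p><l>hvad<s n="adv"/></l><r>hva<s n="adv"/></r></p></e>
 <e vr="nno"><p><l>hvad<s n="adv"/></l><r>kva<s n="adv"/></r></p></e>
 
 nno-nob:
 <e>       <p><l>kva<s n="prn"/></l><r>hva<s n="prn"/></r></p></e>
 <e>       <p><l>kva<s n="adv"/></l><r>hva<s n="adv"/></r></p></e>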
 
 Fran
 



Re: [Apertium-stuff] apertium-sv-da.sv.dix merged into apertium-swe.swe.dix

2015-02-09 Thread Per Tunedal
Hi,
 I've got stuck:

On Mon, Feb 9, 2015, at 10:24, Kevin Brubeck Unhammer wrote:
 Per Tunedal per.tune...@operamail.com
 writes:
 
  Hi,
  Now I've merged apertium-sv-da.sv.dix  into apertium-swe.swe.dix. Kevin,
  would you please make apertium-sv-da depend on
  languages/apertium-swe with that little change to the makefiles.
 
  I suppose I had better move apertium-sv-da.sv.dix to another
  folder to avoid mistakes in the future.
 
 Set up, and changed to three-letter codes.
 
 Make sure you're running the newest SVN of
 apertium/lttoolbox/apertium-lex-tools (on .deb or .rpm-based linuxes you
 can just use Tino Didriksen's repos,
 http://wiki.apertium.org/wiki/Prerequisites_for_Debian or
 http://wiki.apertium.org/wiki/Prerequisites_for_RPM ).
 
 Then do:
 
 
   for l in swe dan; do
     svn checkout https://svn.code.sf.net/p/apertium/svn/languages/apertium-$l &&
       cd apertium-$l && ./autogen.sh && cd .. || break
   done
   
   svn checkout https://svn.code.sf.net/p/apertium/svn/trunk/apertium-swe-dan

Works OK until next step:

   cd apertium-swe-dan
   ./autogen.sh --with-lang1=../apertium-swe --with-lang2=../apertium-dan

Here I get stuck:

You don't have cg-comp installed

I thought this language pair didn't use any constraint grammar? I've
tried to follow the instructions in the wiki for Debian/Ubuntu, but they
don't work. On this box I am trying Ubuntu for a change.

I get a message that the repository cannot be read because of a wrong
format.

 
 Now compile both the monolingual data and the pair by doing:
 
 make -j3 langs
 
 and this should give some output:
 
 make test
 
 
 -- 
 Kevin Brubeck Unhammer
 
 GPG: 0x766AC60C



Re: [Apertium-stuff] apertium-sv-da.sv.dix merged into apertium-swe.swe.dix

2015-02-09 Thread Per Tunedal
Hi Francis,
Yes, I noted that a long time ago and have used some of those entries.
Goldwashing from SALDO. I was trying to find words from my corpus and
got some.
Yours,
Per Tunedal

On Mon, Feb 9, 2015, at 10:31, Francis Tyers wrote:
 On 2015-02-09 10:24, Kevin Brubeck Unhammer wrote:
  Per Tunedal per.tune...@operamail.com
  writes:
  
  Hi,
  Now I've merged apertium-sv-da.sv.dix  into apertium-swe.swe.dix. 
  Kevin,
  would you please make apertium-sv-da depend on
  languages/apertium-swe with that little change to the makefiles.
  
  I suppose I had better move apertium-sv-da.sv.dix to another
  folder to avoid mistakes in the future.
  
  Set up, and changed to three-letter codes.
 
 ...snip...
 
 Note that one of our GCI students, Joonas, converted the nouns and verbs
 from SALDO to something approximating the Apertium tagset. The files and
 scripts are in apertium-swe/dev/saldo.
 
 Fran
 



[Apertium-stuff] apertium-sv-da.sv.dix merged into apertium-swe.swe.dix

2015-02-08 Thread Per Tunedal
Hi,
Now I've merged apertium-sv-da.sv.dix  into apertium-swe.swe.dix. Kevin,
would you please make apertium-sv-da depend on
languages/apertium-swe with that little change to the makefiles.

I suppose I had better move apertium-sv-da.sv.dix to another
folder to avoid mistakes in the future.

Yours,
Per Tunedal



Re: [Apertium-stuff] Expand monodix: generation only

2015-02-07 Thread Per Tunedal
Hi Kevin,
Thank you for your quick answer.

On Fri, Feb 6, 2015, at 18:51, Kevin Brubeck Unhammer wrote:
 Per Tunedal per.tune...@operamail.com
 writes:
 
  Hi,
  I've successfully extracted a Swedish word list from
  apertium.sv-da.sv.dix  as follows:
--snip--
 
 LR entries are output from lt-expand with :>: as the field separator, so
 you can do
 
 lt-expand *.sv.dix | grep -v ':>:' | cut -f1 -d: > sv.expanded
 
 You might also want to exclude RL-marked entries (they tend to be a bit
 weird in monodixes):
 
 lt-expand *.sv.dix | grep -v ':[<>]:' | cut -f1 -d: > sv.expanded
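
For anyone who prefers to do the same filtering in a script, here is a
small Python sketch. It assumes the lt-expand line format described
above ("form:lemma<tags>" for ordinary entries, ":>:" for LR and ":<:"
for RL); the function name plain_forms and its drop_rl flag are invented
for this example.

```python
def plain_forms(expanded_lines, drop_rl=True):
    """Return the sorted surface forms from lt-expand output.

    LR entries (marked ':>:') are analysis-only and are always skipped;
    RL entries (marked ':<:') are skipped as well unless drop_rl is False.
    """
    forms = set()
    for line in expanded_lines:
        line = line.rstrip("\n")
        if not line or ":>:" in line:
            continue
        if drop_rl and ":<:" in line:
            continue
        # The surface form is everything before the first colon.
        forms.add(line.split(":", 1)[0])
    return sorted(forms)
```

It could be fed from a pipe, for instance plain_forms(sys.stdin) after
importing sys, with the result written out one form per line.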
 

Excellent! Just what I need.

  Anyhow, I continued by checking the list in word-processing programs to
  get the real errors and found quite a lot. I have already corrected
  some of them in the pair sv-da. What about the separate language
  dictionary? Should I merge my corrections somehow? What's the
  recommended procedure when improving/adding to an existing language
  pair?
 
 It'd be great if you could merge your changes in there; before your
 changes the diff was only 32 lines long so I don't think it should be
 much work (you might even be able to just copy it over).
 
OK. I will give it a try.

  By the way: How do I use the separated language monodixes? Can they be
  used for existing pairs or only when creating new pairs? What's the
  recommendation for new pairs? The Apertium New Language Pair HOWTO
  still supposes that the monodixes are made exclusively for the new
  pair.
 
 The challenge is just getting the monodixes merged; if you merge in
 those changes, we can make apertium-sv-da depend on
 languages/apertium-swe with a little change to the makefiles.

Does that mean that the monolingual dictionaries are now independent of
the language pairs? What about the old requirement that all words in the
monodix had to be translated for the pair, i.e. words had to be present
in both monodixes and in the bidix? Has that requirement now been
abandoned? What happens when translating to Swedish if a form in the
foreign language is missing in Swedish, or vice versa?

Is it now possible to extend the Swedish dictionary, without having to
extend the Danish dictionary at the same time? If so, it would
facilitate contributions considerably. Lars Aronsson would be happy. 

 
 (The diff for the Danish side is 67736 lines long, so that may be more
 of a challenge to merge … but I'd still say it's worth it to merge the
 Swedish side right away.)
 

The next step after I merged the sv-da.sv.dix with the swe-dix would be
to merge with the pair is-sv. In that way both the pair sv-da and the
pair is-sv would benefit from corrections in the Swedish monodix.

 
 -- 
 Kevin Brubeck Unhammer
 
 GPG: 0x766AC60C
 --

Yours,
Per Tunedal



[Apertium-stuff] Expand monodix: generation only

2015-02-06 Thread Per Tunedal
Hi,
I've successfully extracted a Swedish word list from
apertium.sv-da.sv.dix  as follows:

lt-expand apertium-sv-da.sv.dix | cut -f1 -d':' > \
  apertium-sv-da.sv.dix.expanded

Going through the list I found lots of errors. I excluded words present
in the Aspell dictionary to get a shorter list of misspelled words. It
was quite long though, and worse: it contained mostly correctly spelled
words that are unknown to Aspell. Hunspell (used by e.g.
OpenOffice/LibreOffice) knows many more words. Does anyone happen to
know how to extract/get Hunspell word lists as text files?

Looking at the misspelled list I realised that many of the errors are
variants added for analysis only (r="LR"). Is there an easy way to
expand only the variants that are used for generation? Such a procedure
would produce a much shorter and more correct list.

Anyhow, I continued by checking the list in word-processing programs to
get the real errors and found quite a lot. I have already corrected
some of them in the pair sv-da. What about the separate language
dictionary? Should I merge my corrections somehow? What's the
recommended procedure when improving/adding to an existing language
pair?

By the way: How do I use the separated language monodixes? Can they be
used for existing pairs or only when creating new pairs? What's the
recommendation for new pairs? The Apertium New Language Pair HOWTO
still supposes that the monodixes are made exclusively for the new
pair.

Yours,
Per Tunedal





Re: [Apertium-stuff] Extract words from monodix

2015-02-02 Thread Per Tunedal
Hi Kevin,
Thank you. It works like a charm.
Yours,
Per Tunedal

On Mon, Feb 2, 2015, at 09:07, Kevin Brubeck Unhammer wrote:
 Per Tunedal per.tune...@operamail.com
 writes:
 
  Hi,
  I've successfully extracted a Swedish word list from
  apertium.sv-da.sv.dix  as follows:
 
  lt-expand apertium-sv-da.sv.dix | cut -f1 -d':' > \
    apertium-sv-da.sv.dix.expanded
 
  I would like to get English and French word lists as well. How do I
  proceed with the pairs fr-es and en-es or en-ca:
 
  there aren't any similar files for English or French in those pairs.
  Only for Spanish.
 
 The dix file is compiled from a .metadix file. First, compile the pair,
 then look for a .dix file, possibly in .deps/, like .deps/en.dix or
 something.
 
  BTW Would it be better to extract words from
  http://wiki.apertium.org/wiki/Languages , rather than from the pairs?
 
 Probably not for those languages … though if you're only after forms
 anyway, you could just grab all the words from all the directories and
 then do
 
 cat apertium-sv-da.sv.dix.expanded apertium-swe.swe.dix.expanded | \
   sort -u > combined-apertium-swe.swe.dix.expanded
 
 
 -Kevin



[Apertium-stuff] Abbreviation list for tokenization was: Re: Using GIZA++

2014-03-03 Thread Per Tunedal
Hi again Miquel,
I've manually replaced the variables and the script
bitextor-builddics.sh works like a charm!

I've got a complaint about a missing list of Swedish abbreviations
though:

TOKENISING THE CORPUS...
WARNING: No known abbreviations for language 'sv', attempting fall-back
to English version...

Where do I find those lists of abbreviations (what program, what
folder)? It would be quite easy for me to supply such a list, as I've
already done so for Apertium-sv-da and for bligner.py.

Yours,
Per Tunedal

On Thu, Feb 20, 2014, at 19:48, Miquel Esplà wrote:

Well, of course you can try to replace the variables manually with paths
(as I told you, you have to replace the variables starting and ending
with __). I don't think I can help you much more because I have never
done this, but I'm sure that with a bit of patience you will manage ;)
Good luck!

Cheers,

Miquel.

---snip---


 
  I'm sorry, I didn't explain it well: as I said, [1]bitextor-builddics.in
  is only the template of the script. What I didn't say is that you need
  to compile the project to get the true script. If you have a look into
  the code of the template, you will see that there are many variables
  starting and ending with __ (such as __PREFIX__). These variables are
  replaced by the corresponding paths at compilation time. So, to use the
  script, you have to download the whole trunk directory, and then run:
  ./autogen.sh
  ./configure
  make
  make install
 
  As you know, you can use the option --prefix=LOCALDIR when running
  ./configure to install bitextor in a specific path (for example,
  LOCALDIR could be /home/per/local/).
 
  Best,
 
  Miquel.
 
 
 
   Yours,
  Per Tunedal
 
  On Tue, Feb 18, 2014, at 12:38, Miquel Esplà wrote:
 
   Hi Per,
 
  I think that the explanation on this website:
  [2]http://rali.iro.umontreal.ca/rali/?q=en/node/1325 is quite useful.
  It helps a lot to understand the structure and the content of each file
  generated by OmegaT.
 
  About the script: in the last release of bitextor we included a script
  called bitextor-builddics (you can find the template of this script
  here:
  [3]https://svn.code.sf.net/p/bitextor/code/trunk/bitextor-builddics.in)
  which uses GIZA++ to obtain a plain text bilingual dictionary, but only
  including pairs of words fulfilling: a) both words occur at least 10
  times in the corpus, and b) the harmonic mean of their probabilities in
  both probabilistic dictionaries (S -> T and T -> S) is higher than 0.2.
  If you want to use this, I recommend the version in the trunk, which
  fixes some minor bugs still present in the release.
 
  Best,
 
  Miquel.
 --snip---

References

1. http://bitextor-builddics.in/
2. http://rali.iro.umontreal.ca/rali/?q=en/node/1325
3. https://svn.code.sf.net/p/bitextor/code/trunk/bitextor-builddics.in


Re: [Apertium-stuff] Using GIZA++

2014-02-20 Thread Per Tunedal
Hi Miquel,
Thanks for your thorough answer.

I've tried ./autogen.sh
I had to install httrack, but then got:
checking for a Python interpreter with version >= 2.7... none
configure: error: You don't have Python 2.7 or later installed.

Is it really necessary to update Python?

It appears that the configure script demands Python >= 2.7. In Debian
Squeeze, Python 2.6.6 is the default.
I'm afraid of messing things up if I install Python manually, and not
with Synaptic. Lots of things depend on Python.

And upgrading to Debian Wheezy might mess things up as well ...

Yours,
Per Tunedal


On Wed, Feb 19, 2014, at 9:58, Miquel Esplà wrote:

Hi Per,

2014-02-18 21:37 GMT+01:00 Per Tunedal per.tune...@operamail.com:

Hi Miquel,
thank you. Looks like a good approach.

Looking at the script:
It runs GIZA++ in both directions to begin with? I just have to supply
the bitext files?


Yes, you only need to provide the bitext files compressed with gzip.


But the script has some trouble finding the GIZA++ files:
per@Pers-debian:~/script$ sh bitextor-builddics.in sv fr /home/per/corpora/OpenOffice3.fr-sv.sv /home/per/corpora/OpenOffice3.fr-sv.fr /home/per/block_world_corpus/GIZA++_wordlists/bitextor/OpenOffice3.gizadict.sv-fr
TOKENISING THE CORPUS...
Can't open perl script __PREFIX__/share/bitextor/utils/tokenizer.perl: No such file or directory
gzip: /home/per/corpora/OpenOffice3.fr-sv.sv: not in gzip format
Can't open perl script __PREFIX__/share/bitextor/utils/tokenizer.perl: No such file or directory
gzip: /home/per/corpora/OpenOffice3.fr-sv.fr: not in gzip format
LOWERCASING THE CORPUS...
FILTERING OUT TOO LONG SENTENCES...
FORMATTING THE CORPUS FOR PROCESSING...
mv: cannot stat /tmp/tempcorpuspreproc.QP7LM/corpus.clean.sv_corpus.clean.fr.snt: No such file or directory
mv: cannot stat /tmp/tempcorpuspreproc.QP7LM/corpus.clean.fr_corpus.clean.sv.snt: No such file or directory
mv: cannot stat /tmp/tempcorpuspreproc.QP7LM/corpus.clean.sv.vcb: No such file or directory
mv: cannot stat /tmp/tempcorpuspreproc.QP7LM/corpus.clean.fr.vcb: No such file or directory
BUILDING WORD CLASSES FOR IMPROVING ALIGNMENT...
CHECKING COOCURRENCE OF WORDS IN THE CORPUS...
BUILDING PROBABILISTIC DICTIONARIES...
FILTERING DICTIONARY...
egrep: /tmp/tempgizamodel.RlVVs/fr.vcb: No such file or directory
egrep: /tmp/tempgizamodel.RlVVs/sv.vcb: No such file or directory
bitextor-builddics.in: 173: __PYTHON__: not found
DONE!


I'm sorry, I didn't explain it well: as I said, bitextor-builddics.in is
only the template of the script. What I didn't say is that you need to
compile the project to get the true script. If you have a look into the
code of the template, you will see that there are many variables
starting and ending with __ (such as __PREFIX__). These variables are
replaced by the corresponding paths at compilation time. So, to use the
script, you have to download the whole trunk directory, and then run:
./autogen.sh
./configure
make
make install

As you know, you can use the option --prefix=LOCALDIR when running
./configure to install bitextor in a specific path (for example,
LOCALDIR could be /home/per/local/).
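
If building the whole project is not an option, the substitution itself
can be imitated. The Python sketch below is only an illustration of what
the build step does to the template: the function name fill_template is
invented, and the values passed in (for example the path of the Python
interpreter) are assumptions that must match the local machine.

```python
import re

# Placeholders look like __PREFIX__ or __PYTHON__ in the .in template.
PLACEHOLDER = re.compile(r"__([A-Z][A-Z0-9]*)__")


def fill_template(text, values):
    """Replace every __NAME__ placeholder with values['NAME'].

    An unknown placeholder raises KeyError, so nothing is silently left
    unreplaced in the generated script.
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value given for __{name}__")
        return values[name]

    return PLACEHOLDER.sub(substitute, text)
```

For instance, fill_template(template_text, {"PREFIX": "/home/per/local",
"PYTHON": "/usr/bin/python"}) would turn
__PREFIX__/share/bitextor/utils/tokenizer.perl into
/home/per/local/share/bitextor/utils/tokenizer.perl.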

Best,

Miquel.


Yours,
Per Tunedal

On Tue, Feb 18, 2014, at 12:38, Miquel Esplà wrote:

Hi Per,

I think that the explanation on this website:
http://rali.iro.umontreal.ca/rali/?q=en/node/1325 is quite useful.
It helps a lot to understand the structure and the content of each file
generated by OmegaT.

About the script: in the last release of bitextor we included a script
called bitextor-builddics (you can find the template of this script
here:
https://svn.code.sf.net/p/bitextor/code/trunk/bitextor-builddics.in
) which uses GIZA++ to obtain a plain text bilingual dictionary, but
only including pairs of words fulfilling: a) both words occur at least
10 times in the corpus, and b) the harmonic mean of their probabilities
in both probabilistic dictionaries (S -> T and T -> S) is higher than
0.2. If you want to use this, I recommend the version in the trunk,
which fixes some minor bugs still present in the release.
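
The two conditions above can be written down in a few lines. This is a
Python sketch of the filtering rule only, not the code of
bitextor-builddics: the function names and the dictionary-based inputs
(probabilities keyed by word pair, frequencies keyed by word) are
assumptions made for the illustration.

```python
def harmonic_mean(a, b):
    """Harmonic mean of two probabilities; 0.0 if both are zero."""
    return 0.0 if a + b == 0 else 2 * a * b / (a + b)


def filter_pairs(s2t, t2s, src_freq, tgt_freq, min_count=10, min_score=0.2):
    """Keep (source, target) pairs that pass both conditions.

    s2t maps (source, target) to P(target | source); t2s maps
    (target, source) to P(source | target). A pair missing from t2s
    counts as probability 0.0 and is therefore dropped.
    """
    kept = {}
    for (src, tgt), p_st in s2t.items():
        # a) both words must occur at least min_count times in the corpus
        if src_freq.get(src, 0) < min_count or tgt_freq.get(tgt, 0) < min_count:
            continue
        # b) harmonic mean of both directions must exceed min_score
        score = harmonic_mean(p_st, t2s.get((tgt, src), 0.0))
        if score > min_score:
            kept[(src, tgt)] = score
    return kept
```

The harmonic mean is low whenever either direction is low, so a pair
survives only if the two probabilistic dictionaries agree on it.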

Best,

Miquel.

2014-02-17 14:21 GMT+01:00 Per Tunedal per.tune...@operamail.com:

Hi Miquel,
thank you for your informative answer. In deed I needed to create a
coocurrence file.
I did successfully create such a file with snt2cooc.out

And GIZA++ has run successfully and made a lot of files in my home
directory (!).

How do I redirect the output to a more suitable folder? -outputpath ?

Where can I find an explanation of the content of the files?

I suppose the dictionary is in the translation table *.t3.final
Is there any convenient way to extract plain-text dictionaries (without
going one step further and using Moses)?
Is some script available to decode the translation table using the
vocabulary files *.vcb?

Yours,
Per Tunedal



On Mon, Feb 17, 2014, at 11:08, Miquel Esplà wrote

Re: [Apertium-stuff] Using GIZA++

2014-02-20 Thread Per Tunedal
Hi Miquel,
yes, that was what I had in mind. But it doesn't help much, though.

The next dependency is some Python library for Levenshtein distance ...

There must be an easier way to test the script and see if it gives me
something useful. I'm not interested in testing the other functions
right now.

Could I just compile the script somehow? Or just hard-code the paths
into the script?

Yours,
Per Tunedal


On Thu, Feb 20, 2014, at 10:46, Miquel Esplà wrote:
 Hi Per,
 
 I didn't try to compile with the version of Python you are using, but you
 can try to change this condition in configure.ac to do so.
 
 Cheers,
 
 Miquel.
 
 
 2014-02-20 10:19 GMT+01:00 Per Tunedal per.tune...@operamail.com:
 
   Hi Miquel,
  Thanks for your thorough answer.
 
  I've tried ./autogen.sh
  I had to install httrack, but then got:
  checking for a Python interpreter with version >= 2.7... none
  configure: error: You don't have Python 2.7 or later installed.
 
  Is it really necessary to update Python?
 
  It appears that the configure script demands Python >= 2.7. In Debian
  Squeeze, Python 2.6.6 is the default.
  I'm afraid of messing things up if I install Python manually rather
  than with Synaptic. Lots of things depend on Python.
 
  And upgrading to Debian Wheezy might mess things up as well ...
 
  Yours,
  Per Tunedal
 
 
  On Wed, Feb 19, 2014, at 9:58, Miquel Esplà wrote:
 
  Hi Per,
 
   2014-02-18 21:37 GMT+01:00 Per Tunedal per.tune...@operamail.com:
 
Hi Miquel,
  thank you. Looks like a good approach.
 
  Looking at the script:
  It runs GIZA++ in both directions to begin with? I just have to supply the
  bitext files?
 
 
  Yes, you only need to provide the bitext files compressed with gzip.
 
 
 
  But the script has some trouble finding the GIZA++ files:
   per@Pers-debian:~/script$ sh bitextor-builddics.in sv fr
  /home/per/corpora/OpenOffice3.fr-sv.sv /home/per/corpora/
  OpenOffice3.fr-sv.fr
  /home/per/block_world_corpus/GIZA++_wordlists/bitextor/OpenOffice3.gizadict.sv-fr
  TOKENISING THE CORPUS...
  Can't open perl script __PREFIX__/share/bitextor/utils/tokenizer.perl:
  Filen eller katalogen finns inte
  gzip: /home/per/corpora/OpenOffice3.fr-sv.sv: not in gzip format
  Can't open perl script __PREFIX__/share/bitextor/utils/tokenizer.perl:
  Filen eller katalogen finns inte
  gzip: /home/per/corpora/OpenOffice3.fr-sv.fr: not in gzip format
  LOWERCASING THE CORPUS...
  FILTERING OUT TOO LONG SENTENCES...
  FORMATTING THE CORPUS FOR PROCESSING...
  mv: kan inte ta status på
  /tmp/tempcorpuspreproc.QP7LM/corpus.clean.sv_corpus.clean.fr.snt: Filen
  eller katalogen finns inte
  mv: kan inte ta status på
  /tmp/tempcorpuspreproc.QP7LM/corpus.clean.fr_corpus.clean.sv.snt: Filen
  eller katalogen finns inte
  mv: kan inte ta status på
  /tmp/tempcorpuspreproc.QP7LM/corpus.clean.sv.vcb: Filen eller katalogen
  finns inte
  mv: kan inte ta status på
  /tmp/tempcorpuspreproc.QP7LM/corpus.clean.fr.vcb: Filen eller katalogen
  finns inte
  BUILDING WORD CLASSES FOR IMPROVING ALIGNMENT...
  CHECKING COOCURRENCE OF WORDS IN THE CORPUS...
  BUILDING PROBABILISTIC DICTIONARIES...
  FILTERING DICTIONARY...
  egrep: /tmp/tempgizamodel.RlVVs/fr.vcbegrep:
  /tmp/tempgizamodel.RlVVs/sv.vcb: Filen eller katalogen finns inte
  : Filen eller katalogen finns inte
  bitextor-builddics.in: 173: __PYTHON__: not found
  DONE!
 
 
  I'm sorry, I didn't explain it well: as I said, bitextor-builddics.in is
  only the template of the script. What I didn't say is that you need to
  compile the project to get the real script. If you look at the
  code of the template, you will see that there are many variables starting
  and ending with __ (such as __PREFIX__). These variables are
  replaced by the corresponding paths at compilation time. So, to use the
  script, you have to download the whole trunk directory and then run:
  ./autogen.sh
  ./configure
  make
  make install
 
  As you know, you can use the option --prefix=LOCALDIR when running
  ./configure to install bitextor in a specific path (for example, LOCALDIR
  could be /home/per/local/).
 
  Best,
 
  Miquel.
 
 
 
   Yours,
  Per Tunedal
 
  On Tue, Feb 18, 2014, at 12:38, Miquel Esplà wrote:
 
   Hi Per,
 
  I think that the explanation in this website:
  http://rali.iro.umontreal.ca/rali/?q=en/node/1325 is quite useful. It
  helps a lot to understand the structure and the content of each file
  generated by OmegaT.
 
  About the script, in the last release of bitextor we included a script
  called bitextor-builddics (you can find the template of this script here:
  https://svn.code.sf.net/p/bitextor/code/trunk/bitextor-builddics.in)
  which uses GIZA++ to obtain a plain text bilingual dictionary, but only
  including pairs of words fulfilling: a) both words occur at least 10 times
  in the corpus, and b) the harmonic mean of their probabilities in both
  probabilistic dictionaries (S - T and T - S) is higher than 0.2. If you
  want to use this, I recommend you to use

Re: [Apertium-stuff] Using GIZA++

2014-02-20 Thread Per Tunedal
Hi Miquel,
thank you. I will give it a try.
Yours,
Per Tunedal

On Thu, Feb 20, 2014, at 19:48, Miquel Esplà wrote:

Well, of course you can try to manually replace the variables with paths
(as I told you, you have to replace the variables starting and
ending with __). I don't think I can help you much more because I have
never done this, but I'm sure that with a bit of patience you will manage ;)
Good luck!
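
A manual replacement of this kind could be sketched in Python roughly as below. This is not how the bitextor build performs the substitution; it is only an illustration. Only __PREFIX__ and __PYTHON__ are known from the error log in this thread; the function name, the regular expression and the example paths are assumptions.

```python
import re

# Matches placeholders such as __PREFIX__ or __PYTHON__.
PLACEHOLDER = re.compile(r"__([A-Z][A-Z0-9_]*?)__")


def fill_template(text, values):
    """Replace __NAME__ placeholders with values[NAME].

    Placeholders without a value are left untouched, so they are
    easy to spot in the output afterwards.
    """
    def substitute(match):
        name = match.group(1)
        return values.get(name, match.group(0))

    return PLACEHOLDER.sub(substitute, text)


if __name__ == "__main__":
    # Hypothetical paths; adjust them to the local installation.
    values = {"PREFIX": "/home/per/local", "PYTHON": "/usr/bin/python"}
    line = "perl __PREFIX__/share/bitextor/utils/tokenizer.perl"
    print(fill_template(line, values))
```

Leaving unknown placeholders in place rather than deleting them makes it possible to search the resulting script for any remaining "__" before running it.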

Cheers,

Miquel.

2014-02-20 14:11 GMT+01:00 Per Tunedal per.tune...@operamail.com:

  Hi Miquel,
  yes, that was what I had in mind. But it doesn't help much, though.
  The next dependency is some Python library for Levenshtein distance ...
  There must be an easier way to test the script and see if it gives me
  something useful. I'm not interested in testing the other functions
  right now.
  Could I just compile the script somehow? Or just hard-code the paths
  into the script?
  Yours,
  Per Tunedal


Re: [Apertium-stuff] The beginnings of an Icelandic - Russian dictionary for Apertium

2014-02-18 Thread Per Tunedal
Hi,
it might be a good idea to store the dictionary some other place, if you
would like it publicly available. Now, the access is restricted.
Yours,
Per Tunedal

On Mon, Feb 17, 2014, at 19:53, Ingibjorg Elsa Bjornsdottir wrote:
 
 Hi there Apertium community,
 
 I have created an Icelandic - Russian wordlist/dictionary in Google Docs.
 You are all welcome to contribute or to send me ideas that you may have.
 
 The path is the following:
 
 https://docs.google.com/spreadsheet/ccc?key=0AtVcoB9lkZjndGRqMVJqWnBUbTdlekRSSVFiSDRNbUE&usp=sharing
  
 
 
 
 Kindest regards from southern Iceland,
 
 Ingibjorg Elsa Bjornsdottir, (Ingella)
 Selfoss
 Southern Iceland.
 



Re: [Apertium-stuff] Using GIZA++

2014-02-18 Thread Per Tunedal
Hi Miquel,
thank you. Looks like a good approach.

Looking at the script:
It runs GIZA++ in both directions to begin with? I just have to supply
the bitext files?

But the script has some trouble finding the GIZA++ files:
per@Pers-debian:~/script$ sh bitextor-builddics.in sv fr
/home/per/corpora/OpenOffice3.fr-sv.sv
/home/per/corpora/OpenOffice3.fr-sv.fr
/home/per/block_world_corpus/GIZA++_wordlists/bitextor/OpenOffice3.gizadict.sv-fr
TOKENISING THE CORPUS...
Can't open perl script __PREFIX__/share/bitextor/utils/tokenizer.perl: Filen eller katalogen finns inte
gzip: /home/per/corpora/OpenOffice3.fr-sv.sv: not in gzip format
Can't open perl script __PREFIX__/share/bitextor/utils/tokenizer.perl: Filen eller katalogen finns inte
gzip: /home/per/corpora/OpenOffice3.fr-sv.fr: not in gzip format
LOWERCASING THE CORPUS...
FILTERING OUT TOO LONG SENTENCES...
FORMATTING THE CORPUS FOR PROCESSING...
mv: kan inte ta status på /tmp/tempcorpuspreproc.QP7LM/corpus.clean.sv_corpus.clean.fr.snt: Filen eller katalogen finns inte
mv: kan inte ta status på /tmp/tempcorpuspreproc.QP7LM/corpus.clean.fr_corpus.clean.sv.snt: Filen eller katalogen finns inte
mv: kan inte ta status på /tmp/tempcorpuspreproc.QP7LM/corpus.clean.sv.vcb: Filen eller katalogen finns inte
mv: kan inte ta status på /tmp/tempcorpuspreproc.QP7LM/corpus.clean.fr.vcb: Filen eller katalogen finns inte
BUILDING WORD CLASSES FOR IMPROVING ALIGNMENT...
CHECKING COOCURRENCE OF WORDS IN THE CORPUS...
BUILDING PROBABILISTIC DICTIONARIES...
FILTERING DICTIONARY...
egrep: /tmp/tempgizamodel.RlVVs/fr.vcbegrep:
/tmp/tempgizamodel.RlVVs/sv.vcb: Filen eller katalogen finns inte
: Filen eller katalogen finns inte
bitextor-builddics.in: 173: __PYTHON__: not found
DONE!
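
The "not in gzip format" lines in the log above suggest that the corpus files were passed as plain text. A minimal Python sketch for checking a file and compressing it when needed could look like this; the function names are made up for the illustration, and the ".gz" naming of the output file is an assumption rather than something the script requires.

```python
import gzip
import shutil

# Every gzip file starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def ensure_gzipped(path):
    """Return the path of a gzip-compressed version of the file.

    If the file is already compressed, its own path is returned;
    otherwise a compressed copy is written next to it as path + ".gz".
    """
    if is_gzipped(path):
        return path
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return gz_path
```

Calling ensure_gzipped on each of the two corpus files before running the script would give paths that gzip can read, while leaving the original files untouched.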

Yours,
Per Tunedal

On Tue, Feb 18, 2014, at 12:38, Miquel Esplà wrote:

Hi Per,

I think that the explanation on this website:
http://rali.iro.umontreal.ca/rali/?q=en/node/1325 is quite useful.
It helps a lot in understanding the structure and content of each file
generated by OmegaT.

About the script: in the last release of bitextor we included a script
called bitextor-builddics (you can find the template of this script
here:
https://svn.code.sf.net/p/bitextor/code/trunk/bitextor-builddics.in),
which uses GIZA++ to obtain a plain-text bilingual dictionary that only
includes pairs of words fulfilling: a) both words occur at least 10
times in the corpus, and b) the harmonic mean of their probabilities in
both probabilistic dictionaries (S - T and T - S) is higher than 0.2.
If you want to use this, I recommend using the version in the
trunk, which fixes some minor bugs still present in the release.

Best,

Miquel.


2014-02-17 14:21 GMT+01:00 Per Tunedal per.tune...@operamail.com:

Hi Miquel,
thank you for your informative answer. Indeed, I needed to create a
cooccurrence file.
I successfully created such a file with snt2cooc.out

And GIZA++ has run successfully and made a lot of files in my home
directory (!).

How do I redirect the output to a more suitable folder? -outputpath ?

Where can I find an explanation of the content of the files?

I suppose the dictionary is in the translation table *.t3.final
Is there any convenient way to extract plain-text dictionaries (without
going one step further and using Moses)?
Is some script available to decode the translation table using the
vocabulary files *.vcb?

Yours,
Per Tunedal



On Mon, Feb 17, 2014, at 11:08, Miquel Esplà wrote:

Hi Per,

if I am not wrong, depending on how you compile GIZA++, it can generate
the cooccurrence files on the fly during alignment, or you may need to
generate them before running the alignment. Actually, I think that, with
the standard compilation, you are in the second case. Have a look
here: https://code.google.com/p/giza-pp/issues/detail?id=9 I hope
the link will be helpful!

Cheers,

Miquel.


2014-02-17 10:30 GMT+01:00 Per Tunedal per.tune...@operamail.com:

  Hi,
  I tried the procedure described at
  http://wiki.apertium.org/wiki/Using_GIZA%2B%2B to get a rough
  dictionary, but encountered the following error in the last step:
  ERROR: NO COOCURRENCE FILE GIVEN!
  Is one step missing in the procedure?
  Yours,
  Per Tunedal
  

[Apertium-stuff] Using GIZA++

2014-02-17 Thread Per Tunedal

Hi,
I tried the procedure described at
http://wiki.apertium.org/wiki/Using_GIZA%2B%2B to get a rough
dictionary, but encountered the following error in the last step:

ERROR: NO COOCURRENCE FILE GIVEN!

Is one step missing in the procedure?

Yours,
Per Tunedal




Re: [Apertium-stuff] Using GIZA++

2014-02-17 Thread Per Tunedal
Hi Miquel,
thank you for your informative answer. Indeed, I needed to create a
cooccurrence file.
I successfully created such a file with snt2cooc.out

And GIZA++ has run successfully and made a lot of files in my home
directory (!).

How do I redirect the output to a more suitable folder? -outputpath ?

Where can I find an explanation of the content of the files?

I suppose the dictionary is in the translation table *.t3.final
Is there any convenient way to extract plain-text dictionaries (without
going one step further and using Moses)?
Is some script available to decode the translation table using the
vocabulary files *.vcb?
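
Decoding the table with the vocabulary files could be sketched in Python as follows. The sketch assumes that each *.vcb line has the form "id word count" and that each *.t3.final line has the form "source_id target_id probability", with id 0 standing for the NULL word; these layouts, the column order and the function names are assumptions that should be checked against the actual files.

```python
def load_vocab(lines):
    """Build an id -> word mapping from the lines of a *.vcb file.

    Assumed line layout: "id word count".
    """
    vocab = {0: "NULL"}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            vocab[int(parts[0])] = parts[1]
    return vocab


def decode_table(table_lines, src_vocab, tgt_vocab, min_prob=0.0):
    """Yield (source_word, target_word, probability) tuples.

    Assumed line layout: "source_id target_id probability".
    Lines that do not have exactly three fields are skipped.
    """
    for line in table_lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        src_id, tgt_id, prob = int(parts[0]), int(parts[1]), float(parts[2])
        if prob >= min_prob:
            src = src_vocab.get(src_id, "<unknown>")
            tgt = tgt_vocab.get(tgt_id, "<unknown>")
            yield src, tgt, prob
```

Both functions take any iterable of lines, so an open file object can be passed directly, for example decode_table(open(table_path), src_vocab, tgt_vocab, min_prob=0.1), where table_path is whatever *.t3.final file GIZA++ produced.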

Yours,
Per Tunedal



On Mon, Feb 17, 2014, at 11:08, Miquel Esplà wrote:

Hi Per,

if I am not wrong, depending on how you compile GIZA++, it can generate
the cooccurrence files on the fly during alignment, or you may need to
generate them before running the alignment. Actually, I think that, with
the standard compilation, you are in the second case. Have a look
here: [1]https://code.google.com/p/giza-pp/issues/detail?id=9 I hope
the link will be helpful!

Cheers,

Miquel.


2014-02-17 10:30 GMT+01:00 Per Tunedal [2]per.tune...@operamail.com:

  Hi,
  I tried the procedure described at
  [3]http://wiki.apertium.org/wiki/Using_GIZA%2B%2B to get a rough
  dictionary, but encountered the following error in the last step:
  ERROR: NO COOCURRENCE FILE GIVEN!
  Is one step missing in the procedure?
  Yours,
  Per Tunedal
  
References

1. https://code.google.com/p/giza-pp/issues/detail?id=9
2. mailto:per.tune...@operamail.com
3. http://wiki.apertium.org/wiki/Using_GIZA%2B%2B


[Apertium-stuff] Translation memories with Apertium

2014-02-13 Thread Per Tunedal
Hi,
I've glanced through the GSOC idea list and found:
Currently Apertium has support for translation memories, basically as
follows: If an input sentence is found exactly in the translation
memory, it is not machine translated but instead retrieved from the
translation memory. 
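
The exact-match behaviour described in that passage can be illustrated with a small Python sketch. This is not Apertium's implementation; the function name, the dictionary used as the memory and the stand-in translation function are all assumptions made for the example.

```python
def translate_with_memory(sentence, memory, machine_translate):
    """Return the stored translation on an exact match, else call MT.

    memory is a dict mapping source sentences to stored translations;
    machine_translate is any function taking a sentence and returning
    a translation (a placeholder for the real engine).
    """
    key = sentence.strip()
    if key in memory:
        return memory[key]
    return machine_translate(sentence)


if __name__ == "__main__":
    # Hypothetical one-entry translation memory.
    memory = {"Hej världen.": "Hello world."}

    def fake_mt(text):
        # Stand-in for a real machine translation call.
        return "[MT] " + text

    print(translate_with_memory("Hej världen.", memory, fake_mt))
    print(translate_with_memory("God morgon.", memory, fake_mt))
```

Because only exact matches are used, a sentence that differs by a single character falls through to machine translation.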

That's very interesting. I've read the wiki with great interest:
http://wiki.apertium.org/wiki/Translation_memory

Yours,
Per Tunedal




Re: [Apertium-stuff] GSOC idea: improve support for non-standard input

2014-02-13 Thread Per Tunedal
Hi,
I agree with Mikel.
Per Tunedal

On Wed, Feb 12, 2014, at 19:22, Mikel Forcada wrote:
 Tweet translation could be a task in itself.
 
 Mikel
 
 On 02/12/2014 05:26 PM, Francis Tyers wrote:
  I came up with another idea for GSOC, what do people think ?
 
  Description:  Machine translation systems, especially rule-based
  systems, are pretty fragile when it comes to non-standard input. Get a
  comma, space, apostrophe or hyphen in the wrong place and it can come
  out all wrong. But, we definitely want to be able to translate IRC, SMS,
  Tweets and Youtube comments... 
 
  This could possibly be merged with the accent/diacritic restoration task
  too.
 
  http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code/Improving_support_for_non-standard_text_input
 
  Note: We have two days left before the deadline. I'd encourage people to
  take a look at the ideas list and add anything you would be interested
  in mentoring. Alternatively, email the list about your idea and we will
  see about adding it.
 
  Fran
 
 
 
 
 -- 
 Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
 Departament de Llenguatges i Sistemes Informàtics
 Universitat d'Alacant
 E-03071 Alacant, Spain
 Phone: +34 96 590 9776
 Fax: +34 96 590 9326
 
 



Re: [Apertium-stuff] GSOC idea: improve support for non-standard input

2014-02-13 Thread Per Tunedal
Hey there!
What happened to the iPad app?
Yours,
Per Tunedal

On Wed, Feb 12, 2014, at 17:26, Francis Tyers wrote:
--snip--
 
 Note: We have two days left before the deadline. I'd encourage people to
 take a look at the ideas list and add anything you would be interested
 in mentoring. Alternatively, email the list about your idea and we will
 see about adding it.
 
 Fran
 
 



Re: [Apertium-stuff] GSOC idea: make an app for Iphone/Ipad

2014-02-06 Thread Per Tunedal
Hi Tino,
I guess that means that the Apertium project doesn't own the code and
cannot release it under any license other than the current one: GPL v.2
or any later version.

In that case the only solution might be a completely new application,
wouldn't it? I'm not even sure if such an application could use the
Apertium online service? What about the dictionaries?

What's the trouble with Apple's requirements for the app store? Are all
open source licences impossible to use? That would explain the absence
of many good open source projects in their store.

Yours,
Per Tunedal

On Wed, Feb 5, 2014, at 9:54, Tino Didriksen wrote:
 On 5 February 2014 07:46, Per Tunedal per.tune...@operamail.com wrote:
 
  what about going a bit commercial? Many companies use a dual licence
  model: GPL + proprietary (e.g. the small Swedish company that made MySQL
  started that way).
 
 
 That's not a trivial task. Apertium doesn't require copyright assignment,
 so you'd have to track down and get assent from each and every Apertium
 contributor to add a non-FOSS licence option.
 
 Plus all 3rd party tools, such as HFST and CG-3 (well, CG-3 already has
 the non-FOSS license option).
 
 -- Tino Didriksen



Re: [Apertium-stuff] GSOC idea: make an app for Iphone/Ipad

2014-02-05 Thread Per Tunedal
Hi,
the point is that we won't reach the intended audience if the iOS app
isn't available through the official app store.

I've not studied the market, but the price would have to be low. With
crowdfunding of the app, it might be possible to set the price to zero
(free).
Is anyone familiar with GOTEO http://goteo.org/ or some other
crowdfunding platform?
Yours,
Per Tunedal
BTW I noticed that GnuPG has raised quite a lot of money through GOTEO.
We won't need that much, I suppose.

On Wed, Feb 5, 2014, at 9:09, Mikel Artetxe wrote:
 On Wed, Feb 5, 2014 at 7:46 AM, Per Tunedal per.tune...@operamail.com wrote:
 
  Hi,
  what about going a bit commercial? Many companies use a dual licence
  model: GPL + proprietary (e.g. the small Swedish company that made MySQL
  started that way).
 
 
 I'm not the one who has to make this decision, but doing it just because
 stupid Apple doesn't want us to use GPL doesn't sound like a strong
 enough reason to me. I mean, I insist that it's not me who has to decide
 what license to use, but neither is Apple, right?
 
 
 
 
  Would it do any harm if the project offered an iOS port for $10? And
  used the income for the development of Apertium?
 
 
 You probably wouldn't sell too many copies. Not for that price and
 without a good marketing campaign, at least.
 
 
 
  Unfortunately, that would not fit into GSOC though. We would have to
  finance it some other way. GOTEO? http://goteo.org/ Or some other crowd
  funding?
 
  Most Ipad/Iphone users are the opposite of hackers: they are happy with
  the limitations of the system. An Apertium-app for Ipad must be
  accessible in the App Store to reach the audience.
 
  Yours,
  Per Tunedal
 
  On Wed, Feb 5, 2014, at 0:08, Jimmy O'Regan wrote:
  --snip--
  
   First of all, the basic problem with an iOS port is that it would not
   be distributable through the App Store (its terms are
   GPL-incompatible). This is the reason why an iOS port was not pursued
   in the past, and nothing has changed: see
  
  http://www.fsf.org/blogs/licensing/more-about-the-app-store-gpl-enforcement
  
  --snip--
   --
   Sefam Are any of the mentors around?
   jimregan yes, they're the ones trolling you
  
  
 
 
 



[Apertium-stuff] GSOC idea: make an app for Iphone/Ipad

2014-02-04 Thread Per Tunedal

Hi,
last summer we got a nice app for Android devices, but that's no good
for my iPad. Maybe it would be easy to make an app for iOS devices?
Yours,
Per Tunedal



Re: [Apertium-stuff] GSOC Idea: Take a language pair and make it state of the art

2014-02-04 Thread Per Tunedal
Hi,
I agree with Lars. It's essential that contributing words becomes VERY
easy. We need to get thousands of people to make small contributions
without effort.
Yours,
Per Tunedal

On Tue, Feb 4, 2014, at 23:33, Lars Aronsson wrote:
 On 02/04/2014 10:47 PM, Francis Tyers wrote:
  Out of the 39 or so language pairs that we have in trunk/, only two or
  three could be considered to offer state of the art performance with
  [...]
  Any thoughts ?
 
-- snip--
 
 It seems much harder to do small, incremental
 improvements of the Swedish (-Danish) language
 pair of Apertium. And quite easy to cause chaos.
 I made a few contributions in August 2013, but
 then I got ideas for some radical changes, but I
 couldn't predict if they were net improvements,
 or if they could have dangerous side effects.
 
 Apertium would benefit if the creation (and testing)
 of new paradigms was more clearly separated
 from adding new words to the dictionaries.
 These are two different roles, that require
 different skills. Right now, everything is code
 that is submitted to SVN, which requires the
 programmer-like ability to edit large text files.
 Adding words to the dictionaries should be more
 on the simple wiki editing skill level.
 
 
 -- 
Lars Aronsson (l...@aronsson.se)
Project Runeberg - free Nordic literature - http://runeberg.org/
 
 
 


Re: [Apertium-stuff] Unsupervised tagger training

2014-01-16 Thread Per Tunedal
Warning: There is not coarse tag for the fine tag '1888num'
 This is because of an incomplete tagset definition or a
 dictionary error
Warning: There is not coarse tag for the fine tag '1865num'
 This is because of an incomplete tagset definition or a
 dictionary error
Warning: There is not coarse tag for the fine tag '1880num'
 This is because of an incomplete tagset definition or a
 dictionary error
Error: A new ambiguity class was found. I cannot continue.
Word 'min' not found in the dictionary.
New ambiguity class: {ABBR,PRNPOS}
Take a look at the dictionary and at the training corpus. Then, retrain.
make: *** [sv-da.prob] Fel 1
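
Editorial note on the error above: as far as I understand the trainer, it
groups every word by the set of coarse tags it can receive (its "ambiguity
class"), collects those classes from the expanded dictionary, and stops when
the corpus yields a class that was never seen there. The Python sketch below
only illustrates that check; it is not the apertium-tagger code, and the
function names and toy data are invented for the example.

```python
def ambiguity_classes(dictionary):
    """Collect the distinct sets of coarse tags seen in the dictionary."""
    return {frozenset(tags) for tags in dictionary.values()}


def find_new_classes(dictionary, corpus):
    """Return (word, class) pairs from the corpus whose class is unknown."""
    known = ambiguity_classes(dictionary)
    unseen = []
    for word, tags in corpus:
        cls = frozenset(tags)
        if cls not in known:
            unseen.append((word, cls))
    return unseen


# Toy data, invented for illustration only.
toy_dictionary = {
    "en": {"DETIND", "NUM"},
    "hus": {"NOUN"},
}
toy_corpus = [
    ("hus", {"NOUN"}),
    ("min", {"ABBR", "PRNPOS"}),  # this class never occurs in the dictionary
]

for word, cls in find_new_classes(toy_dictionary, toy_corpus):
    tags = ",".join(sorted(cls))
    print("New ambiguity class for '" + word + "': {" + tags + "}")
```

In this picture, the fix is either to add the missing analyses to the
dictionary or to adjust the coarse tags in the .tsx file so that the word
falls into an existing class, and then retrain.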

Yours,
Per Tunedal

On Thu, Jan 16, 2014, at 1:04, Francis Tyers wrote:
 On Wed, 15 Jan 2014 at 20:18 +0100, Per Tunedal wrote:
  Thank you Francis!
  I've corrected one thing myself: I changed a coarse tag.
  But this one I don't understand:
  
  per@Pers-debian:~/apertium-sv-da$ make -f sv-da-unsupervised.make
  apertium-validate-tagger apertium-sv-da.sv.tsx
  apertium-tagger -t 8 \
 sv-tagger-data/sv.dic \
 sv-tagger-data/sv.crp \
 apertium-sv-da.sv.tsx \
 sv-da.prob;
  Calculating ambiguity classes...
  
  97 states and 98 ambiguity classes
  Kupiec's initialization of transition and emission probabilities...
  Error: A new ambiguity class was found. I cannot continue.
  Word 'en' not found in the dictionary.
  New ambiguity class: {DETIND,NUM}
  Take a look at the dictionary and at the training corpus. Then, retrain.
  make: *** [sv-da.prob] Fel 1
  
  Yours,
  Per Tunedal
 
 I just made some updates, adding some missing coarse tags to the TSX
 file. Looking at nn-nb, it seems like you might need to make further
 changes to the sv-da-unsupervised.make to deal with compounds.
 
 Fran
 
 
 --
 CenturyLink Cloud: The Leader in Enterprise Cloud Services.
 Learn Why More Businesses Are Choosing CenturyLink Cloud For
 Critical Workloads, Development Environments  Everything In Between.
 Get a Quote or Start a Free Trial Today. 
 http://pubads.g.doubleclick.net/gampad/clk?id=119420431iu=/4140/ostg.clktrk
 ___
 Apertium-stuff mailing list
 Apertium-stuff@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/apertium-stuff



Re: [Apertium-stuff] OT: SVN problem

2014-01-16 Thread Per Tunedal
Eureka!

Debian defaults to using the GNOME keyring, and for some reason that
didn't work.
I've changed ~/.subversion/config by adding a new line:
 password-stores =

Now SVN works as usual.
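
Editorial note: the message does not say where the line goes. To my
knowledge, Subversion reads `password-stores` from the `[auth]` section of
the config file, and an empty value disables the keyring back ends. The
Python sketch below makes that edit on the text of a config file; the
function name is invented, and the section placement is my assumption rather
than something stated in the thread.

```python
def disable_password_stores(config_text):
    """Return config_text with an empty `password-stores` entry under [auth]."""
    out = []
    in_auth = False
    done = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Section header: remember whether we are inside [auth].
            in_auth = stripped == "[auth]"
            out.append(line)
            if in_auth and not done:
                out.append("password-stores =")
                done = True
            continue
        # Drop an existing active setting in [auth]; the empty one replaces it.
        # Commented-out lines (starting with '#') are left untouched.
        if in_auth and stripped.split("=")[0].strip() == "password-stores":
            continue
        out.append(line)
    if not done:
        # No [auth] section at all: append one.
        out.extend(["[auth]", "password-stores ="])
    return "\n".join(out) + "\n"


sample = "[auth]\n# password-stores = gnome-keyring,kwallet\n[helpers]\n"
print(disable_password_stores(sample), end="")
```

Back up ~/.subversion/config before writing the result to it.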

Yours,
Per Tunedal

On Thu, Jan 16, 2014, at 0:54, Francis Tyers wrote:
 On Wed, 15 Jan 2014 at 20:21 +0100, Per Tunedal wrote:
  Hi,
  I cannot submit from my new box: Could not authenticate to server:
  rejected Basic challenge
  
  Found the following in the wiki, but it doesn't help. Nothing changes.
  
  Could not authenticate to server: rejected Basic challenge
  
  This happens because you have checked out with http and not https, you
  need to switch as follows:
  
  $ svn switch --relocate http://svn.code.sf.net/p/apertium/svn/
  https://svn.code.sf.net/p/apertium/svn/
 
 Check your password and try checking out again making sure that you use
 https:// instead of http://.
 
 Fran
 
 


Re: [Apertium-stuff] Unsupervised tagger training

2014-01-15 Thread Per Tunedal
Hi Francis,
Thank you for fixing my typos.

New error:

per@Pers-debian:~/apertium-sv-da$ make -f sv-da-unsupervised.make
Generating sv-tagger-data/sv.dic
This may take some time. Please, take a cup of coffee and come back
later.
apertium-validate-dictionary apertium-sv-da.sv.dix
apertium-validate-tagger apertium-sv-da.sv.tsx
lt-expand apertium-sv-da.sv.dix | grep -v "__REGEXP__" | grep -v ":<:" |\
awk 'BEGIN{FS=":>:|:"}{print $1 ".";}' | apertium-destxt > sv.dic.expanded
lt-proc -a sv-da.automorf.bin < sv.dic.expanded | \
apertium-filter-ambiguity apertium-sv-da.sv.tsx > sv-tagger-data/sv.dic
Error: (137): '#comment' tag unexpected.
make: *** [sv-tagger-data/sv.dic] Fel 1
per@Pers-debian:~/apertium-sv-da$ make -f sv-da-unsupervised.make
apertium-destxt < sv-tagger-data/sv.crp.txt | lt-proc sv-da.automorf.bin > sv-tagger-data/sv.crp
apertium-validate-tagger apertium-sv-da.sv.tsx
apertium-tagger -t 8 \
   sv-tagger-data/sv.dic \
   sv-tagger-data/sv.crp \
   apertium-sv-da.sv.tsx \
   sv-da.prob;
Error: (137): '#comment' tag unexpected.
make: *** [sv-da.prob] Fel 1
per@Pers-debian:~/apertium-sv-da$ 

What now?
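
Editorial note: assuming the number in parentheses in the error message is a
line number in apertium-sv-da.sv.tsx (that is how I read it, but the thread
does not confirm it), one way to see what the parser stumbled over is to
print the lines around it and flag any XML comment that starts there. The
helper below is a hypothetical sketch, not an Apertium tool.

```python
def show_context(text, line_no, radius=2):
    """Return the lines around line_no (1-indexed), flagging XML comment openers."""
    lines = text.splitlines()
    start = max(1, line_no - radius)
    end = min(len(lines), line_no + radius)
    report = []
    for n in range(start, end + 1):
        line = lines[n - 1]
        marker = ">>" if n == line_no else "  "
        note = "  <-- XML comment starts here" if "<!--" in line else ""
        report.append(f"{marker} {n:4d}: {line}{note}")
    return report


# Tiny invented sample; with the real file, read its text and pass line 137.
demo = "<tagset>\n<def-label name=\"NOUN\">\n<!-- stray comment -->\n</def-label>\n</tagset>"
print("\n".join(show_context(demo, 3, radius=1)))
```

If a comment does sit at the reported line, moving it to a position between
top-level elements, or deleting it, would be the first thing to try.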

Yours,
Per Tunedal


On Fri, Jan 10, 2014, at 17:43, Francis Tyers wrote:
 Your XML is really messed up. I've tried to fix it, and now it
 validates, but you might want to check that it is doing the right thing.
 Try doing svn diff -r HEAD apertium-sv-da.sv.tsx to see the diff.
 
 Fran
 
 On Fri, 10 Jan 2014 at 17:20 +0100, Per Tunedal wrote:
  Hi Francis,
  I followed the advice in the wiki - maybe some explanation would be
  appropriate.
  
  Now I've tried the make file that was already present in apertium-sv-da
  and encountered new errors:
  
  Generating sv-tagger-data/sv.dic
  This may take some time. Please, take a cup of coffee and come back
  later.
  apertium-validate-dictionary apertium-sv-da.sv.dix
  apertium-validate-tagger apertium-sv-da.sv.tsx
  apertium-sv-da.sv.tsx:471: parser error : Opening and ending tag
  mismatch: tagset line 119 and def-label
    </def-label>
                ^
  apertium-sv-da.sv.tsx:497: parser error : Opening and ending tag
  mismatch: tagger line 2 and tagset
    </tagset>
             ^
  apertium-sv-da.sv.tsx:499: parser error : Extra content at the end of
  the document
    <forbid>
    ^
  make: *** [sv-tagger-data/sv.dic] Fel 1
  
  I've looked into the tsx file and cannot understand what's wrong. There
  is a tagset tag above line 497, and forbid should not be anything
  strange, should it?
  
  Yours,
  Per Tunedal
  
  On Fri, Jan 10, 2014, at 16:14, Francis Tyers wrote:
   Hi Per, it seems like you've copied a tagger training makefile from a
   language that uses metadix format. If you change .dixtmp1 to .dix it
   should work.
   
   Fran
   
   On Wed, 8 Jan 2014 at 16:13 +0100, Per Tunedal wrote:
Hi,
I've tried to follow the instructions in the Wiki
http://wiki.apertium.org/wiki/Unsupervised_tagger_training to train
sv-da

I got several complaints from the compiler. When I moved the files to
the Apertium-sv-da folder I got a bit further but I'm still stuck:

per@Pers-debian:~/apertium-sv-da$ make -f sv-da-unsupervised.make
make: *** Ingen regel för att skapa målet apertium-sv-da.sv.dixtmp1,
som behövs till sv-tagger-data/sv.dic.  Stannar.

My translation:
No rule to make target apertium-sv-da.sv.dixtmp1, needed by
sv-tagger-data/sv.dic. Stop.

Any idea?

Yours,
Per Tunedal

   
   
   
   
