Re: [Apertium-stuff] separable words module -- call for requests

2017-07-10 Thread Flammie Pirinen
2017-07-10, Benedikt Freisen wrote:

> Well, for German we would need something that correctly handles the
> following:
[...]
> Please note that it is
>   "Sie reisen ab." (= "They depart.")
> but
>   ", wenn sie abreisen." (= ", when they depart.")

I’ve sent a German test set[1] based on my experiences with
apertium-fin-deu, but it would be good if a native speaker could
search and annotate a corpus of real-world examples. Note, though,
that a case like the one above, where there are no intervening words
between the prefix and the verb, is already handled fine.

-- 
Flammie, computer scientist bachelor + linguist master = computational
linguist doctor, free software Finnish localiser,
and more! 



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff


Re: [Apertium-stuff] separable words module -- call for requests

2017-07-10 Thread Benedikt Freisen
[Copy of the reply that I accidentally sent to Francis only.]

Well, for German we would need something that correctly handles the
following:

-- Block-quote from "The Awful German Language" by Mark Twain --

The Germans have another kind of parenthesis, which they make by
splitting a verb in two and putting half of it at the beginning of an
exciting chapter and the other half at the end of it. Can any one
conceive of anything more confusing than that? These things are called
"separable verbs." The German grammar is blistered all over with
separable verbs; and the wider the two portions of one of them are
spread apart, the better the author of the crime is pleased with his
performance. A favorite one is reiste ab -- which means departed. Here
is an example which I culled from a novel and reduced to English:

"The trunks being now ready, he DE- after kissing his mother and
sisters, and once more pressing to his bosom his adored Gretchen, who,
dressed in simple white muslin, with a single tuberose in the ample
folds of her rich brown hair, had tottered feebly down the stairs, still
pale from the terror and excitement of the past evening, but longing to
lay her poor aching head yet once again upon the breast of him whom she
loved more dearly than life itself, PARTED."

-- end of quote --

Please note that it is
  "Sie reisen ab." (= "They depart.")
but
  ", wenn sie abreisen." (= ", when they depart.")

Greetings
Benedikt

On 10.07.2017 at 11:20, Francis Tyers wrote:
> Hello everyone!
> 
> We're making progress on a module for treating separable "multiword"
> expressions. The general idea is to be able to do stuff like
> 
>  ^take$ ^the$ ^rubbish$ ^out$ -> ^take# out$ ^the$ ^rubbish$ -> ^sacar$ ^la$ ^basura$
>  ^be$ ^always$ ^late$ -> ^be# late$ ^always$ -> ^llegar# tarde$ ^siempre$
>  ^take$ ^the$ ^rubbish$ ^out of$ ^here$ -> ^take# out$ ^the$ ^rubbish$ ^of$ ^here$ -> ^sacar$ ^la$ ^basura$ ^de$ ^aquí$
> 
> The general idea is that it will be a finite-state transducer (like the
> existing monolingual and bilingual dictionaries) but that can work over
> words. It will appear between the pretransfer module and the lexical
> transfer module (apertium-pretransfer | new module | lt-proc -b).
> 
> So, this email is a call for language pair developers to give us
> examples of phenomena you would like to treat in your language pair.
> 
> Thanks!
> 
> Fran
> 



[Apertium-stuff] separable words module -- call for requests

2017-07-10 Thread Francis Tyers

Hello everyone!

We're making progress on a module for treating separable "multiword" 
expressions. The general idea is to be able to do stuff like


 ^take$ ^the$ ^rubbish$ ^out$ -> ^take# out$ ^the$ ^rubbish$ -> ^sacar$ ^la$ ^basura$
 ^be$ ^always$ ^late$ -> ^be# late$ ^always$ -> ^llegar# tarde$ ^siempre$
 ^take$ ^the$ ^rubbish$ ^out of$ ^here$ -> ^take# out$ ^the$ ^rubbish$ ^of$ ^here$ -> ^sacar$ ^la$ ^basura$ ^de$ ^aquí$


The general idea is that it will be a finite-state transducer (like the 
existing monolingual and bilingual dictionaries) but that can work over 
words. It will appear between the pretransfer module and the lexical 
transfer module (apertium-pretransfer | new module | lt-proc -b).
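To make the first step of the examples above concrete, here is a
minimal sketch in Python. It is not the planned module: a plain
dictionary lookup stands in for the finite-state transducer, the rule
table SEPARABLE, the max_gap window and all function names are
hypothetical, and the lexical units carry no tags, as in the
simplified examples. It does not split units such as ^out of$, so it
only covers the first two examples.

```python
import re

# Hypothetical rule table: (head word, separated particle) -> merged unit.
# A real module would compile such rules into a transducer.
SEPARABLE = {
    ("take", "out"): "take# out",
    ("be", "late"): "be# late",
}


def parse_stream(stream):
    """Split '^a$ ^b$' into the list ['a', 'b']."""
    return re.findall(r"\^(.*?)\$", stream)


def merge_separable(units, max_gap=5):
    """Merge a head with a matching particle found up to max_gap units later."""
    units = list(units)
    i = 0
    while i < len(units):
        for j in range(i + 1, min(i + 1 + max_gap, len(units))):
            merged = SEPARABLE.get((units[i], units[j]))
            if merged is not None:
                units[i] = merged   # the head becomes the multiword unit
                del units[j]        # the particle is removed from its old slot
                break
        i += 1
    return units


def render(units):
    """Turn ['a', 'b'] back into '^a$ ^b$'."""
    return " ".join(f"^{u}$" for u in units)


print(render(merge_separable(parse_stream("^take$ ^the$ ^rubbish$ ^out$"))))
```

The merged units would then go to lexical transfer, where a bilingual
entry for "take# out" could yield "sacar".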


So, this email is a call for language pair developers to give us 
examples of phenomena you would like to treat in your language pair.


Thanks!

Fran



Re: [Apertium-stuff] apertium-eng and language variants

2017-07-10 Thread Francis Tyers



On 2017-07-09 13:00, Xavi Ivars wrote:

I've applied this change.

Now, apertium-eng builds two different generation files:
eng.autogen.bin and eng_US.autogen.bin. The first one generates
"British English", while the second one generates "American English".
There is only one analyzer, which analyzes both variants.

That will allow all language pairs using English to analyze all
English variants at no cost: there is no need to add the same entry
to the bidix more than once (e.g. colour/color). It's up to the
language pair developers whether they support one or more variants
when generating.
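
The idea of one analyzer feeding several generators can be sketched
roughly as follows. This is only an illustration in Python, not how
apertium-eng is implemented: the tables ANALYSER and GENERATORS, the
choice of "colour" as the shared lemma, and the function names are all
assumptions. Only the variant names eng and eng_US are taken from the
file names above.

```python
# Hypothetical single analyser: every spelling maps to one shared lemma,
# so the bidix needs only one entry for that lemma.
ANALYSER = {
    "colour": "colour",
    "color": "colour",
}

# Hypothetical per-variant generators, keyed like the two autogen files.
GENERATORS = {
    "eng": {"colour": "colour"},     # "British English"
    "eng_US": {"colour": "color"},   # "American English"
}


def analyse(surface):
    """Return the shared lemma for a surface form (unknown forms pass through)."""
    return ANALYSER.get(surface, surface)


def generate(lemma, variant="eng"):
    """Return the surface form of a lemma in the requested variant."""
    return GENERATORS[variant].get(lemma, lemma)


for word in ("colour", "color"):
    lemma = analyse(word)
    print(word, "->", generate(lemma, "eng"), "/", generate(lemma, "eng_US"))
```

Either input spelling reaches the same lemma, and the choice of variant
is made only at generation time.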

Only a few words have been added to the right variant, so there is
still work to do, but the foundation is there.

Regards!


Thanks, Xavi! :)

F.
