Xavi,

One of the two uses of apertium-separable (what we're currently
calling "multiword disassembly", with a mode name of "revautoseq") is
exactly that: to expand single units into multiple tokens.

Since lsx compiles into an FST, we simply reverse the labels (just
like the relationship between morphological analyser and generator
FSTs), and use the output module between structural transfer and
generation.  In a case like this, I could see an argument for wanting
it to happen before structural transfer.
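
To make "reverse the labels" concrete, here is a toy sketch in Python.
It is not how lsx-comp or lsx-proc are implemented: the Arc type, the
invert/run helpers and the "take out" entry are all made up for
illustration, and real lsx entries operate on tagged lexical units
rather than bare strings.

```python
from typing import NamedTuple

EPS = ""  # epsilon label: consumes or emits nothing


class Arc(NamedTuple):
    src: int
    inp: str
    out: str
    dst: int


def invert(arcs):
    """Swap the input and output label on every arc."""
    return [Arc(a.src, a.out, a.inp, a.dst) for a in arcs]


def run(arcs, start, finals, tokens):
    """Return every output sequence produced for `tokens`.

    Plain depth-first search; assumes there are no epsilon-input cycles.
    """
    results = []
    stack = [(start, 0, [])]  # (state, position in tokens, output so far)
    while stack:
        state, pos, out = stack.pop()
        if pos == len(tokens) and state in finals:
            results.append(out)
        for a in arcs:
            if a.src != state:
                continue
            emitted = out + ([a.out] if a.out != EPS else [])
            if a.inp == EPS:
                stack.append((a.dst, pos, emitted))
            elif pos < len(tokens) and tokens[pos] == a.inp:
                stack.append((a.dst, pos + 1, emitted))
    return results


# Hypothetical "assembly" entry: two tokens in, one multiword out.
ASSEMBLE = [Arc(0, "take", "take# out", 1), Arc(1, "out", EPS, 2)]

assembled = run(ASSEMBLE, 0, {2}, ["take", "out"])
disassembled = run(invert(ASSEMBLE), 0, {2}, ["take# out"])
```

With the labels swapped, the same set of arcs turns the single unit
back into two tokens, which is all "disassembly" amounts to here.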

--
Jonathan


On Thu, 28 May 2020 at 18:45, Xavi Ivars
<xavi.iv...@gmail.com> wrote:
>
> The reason I was asking was exactly because of that: we're not trying to 
> rewrite multi-tokens into smaller units but the opposite: expand smaller 
> units into multiple ones.
>
> But just to make sure: not because I thought it doesn't belong there, but 
> because I really don't know what the actual scope of separable is (except 
> for having used it for a few phrasal verbs in eng-cat)
>
>
> --
> Xavi Ivars
> < http://xavi.ivars.me >
>
> On Fri, 29 May 2020, 0:39, Francis Tyers <fty...@prompsit.com>
> wrote:
>>
>> On 2020-05-28 23:12, Xavi Ivars wrote:
>> > How would this fit in apertium-separable?
>> >
>> > As far as I know the goal of apertium separable is to handle
>> > multi-words in a better way than in the monodixes.
>> >
>> > I totally get (and totally agree) that we should put in transfer only
>> > stuff that is really about transfer between both languages and we
>> > don't want to abuse it... But is that a good enough reason to abuse
>> > another module? Or may it be the case that apertium-separable should
>> > handle a broader set of use cases (and probably change its name)?
>> >
>> > --
>> > Xavi Ivars
>> > < http://xavi.ivars.me >
>> >
>> > On Thu, 28 May 2020, 16:14, Jonathan Washington
>> > <jonathan.n.washing...@gmail.com> wrote:
>> >
>> >> This could definitely be done in apertium-separable.  That would be
>> >> by far the most straightforward way to solve this problem.  And if
>> >> you did it as a language-specific lsx file, as has been
>> >> discussed recently, it would serve the purpose you describe.
>> >>
>> >> Don't treat it as a structural transfer issue.  The less lexical
>> >> stuff in transfer the better.
>> >>
>> >> --
>> >> Jonathan
>> >>
>> >> On Thu, May 28, 2020, 06:47 Jaume Ortolà i Font
>> >> <jaumeort...@gmail.com> wrote:
>> >>
>> >> Isn't this something that should go in transfer,
>> >>
>> >> Dropping this "que" is possible in Spanish, but it is not regular
>> >> syntax, it is a mannerism used in bureaucratic jargon. The regular
>> >> syntax is with "que". It makes sense to add it, so all language
>> >> pairs can translate as usual.
>> >>
>> >> Transfer is extremely annoying for this kind of thing, in my
>> >> experience.
>> >>
>> >> or you could
>> >> use apertium-separable for it?
>> >>
>> >> Probably yes. We are not using apertium-separable in spa-cat, and it
>> >> will be useful to do it.
>> >>
>>
>> I think it fits better in separable (rewriting multitokens into smaller
>> units) than in CG (disambiguation).
>>
>> Another place could be in the bilingual dictionary: a special tag or a
>> special lexeme, marking the missing que on the target side.
>>
>> e.g. in the monodix you could have:
>>
>> rogar¹:pregar
>> rogar²:pregar
>>
>>
>> Then in the bidix:
>>
>> rogar¹:pregar
>> rogar²:pregar# que
>>
>> Then if a clean -cat was desired, a transfer rule could just insert a
>> que when lemq was "# que".
>>
>> In fact, if we consider this stylism to be a different lexeme it kind of
>> makes sense.
>>
>> Fran
>
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff

