Hi everyone,

There is a Python library called apertium-streamparser [1]. It is
relatively new, so you might not know about it. It might come in handy if
you need to convert the output of the segmenter into something else.
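
As a rough illustration of that kind of conversion, here is a minimal sketch
in plain Python. It does not use streamparser. It assumes the segmenter output
uses ">" as the morph boundary, as in the examples quoted below; the real
output format may differ. The names to_morphs and to_surface are made up for
this example.

```python
# Sketch only: assumes ">" marks morph boundaries, as written in this thread.
BOUNDARY = ">"


def to_morphs(segmented):
    """Split a segmented word into its list of morphs."""
    return segmented.split(BOUNDARY)


def to_surface(segmented):
    """Recover a word by simply deleting the boundary markers."""
    return "".join(to_morphs(segmented))


if __name__ == "__main__":
    # The two segmentation options discussed in the quoted thread.
    for option in ("поез>де", "поезд>де"):
        print(option, to_morphs(option), to_surface(option))
```

With plain concatenation, only the first option gives back поезде; the second
gives поездде.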

Best,

Ilnar

[1] https://github.com/apertium/streamparser


On 7 March 2019 15:46:48 GMT+03:00, Francis Tyers <fty...@prompsit.com> 
wrote:
>On 2019-03-07 09:22, Antonio Toral wrote:
>> Hi Jonathan, Fran,
>> 
>> Thanks for looking at this, I really appreciate it :)
>> 
>> From those two options, I think the first would be better.
>
>For your purposes I agree.
>
>> If I got it right, the 1st is pure segmentation while the 2nd inserts
>> an additional д.
>
>No, it's the other way around :)
>
>In the first you lose a д through the process of поезд-де -> поез-де:
>there are two underlying д but one is removed.
>
>> Segmenting поезде as поез>де (1st option) would allow us to recover
>> the original word easily from the segmented version. Segmenting as
>> поезд>де (2nd option) would not, as we might wrongly recover the
>> original word as поездде.
>> 
>
>This is correct, my intuition was that you wanted to keep the
>segmented version as close to the surface form as possible.
>
>We have a prototype (thanks Jonathan!), but it needs tweaking and
>testing. Hopefully in the next couple of days...
>
>Fran
>
>
>_______________________________________________
>Apertium-stuff mailing list
>Apertium-stuff@lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/apertium-stuff

-- 
Sorry for the brevity, sent with K-9 Mail.
