... and right after sending my previous email I decided that creating
such a prototype was a more fun use of my flight home than
treebanking, so here's what I came up with:
https://github.com/mr-martian/UD-transfer

It probably needs another couple hours of work before it'll actually
do anything, but I should be able to manage that pretty soon (the
hardest part of any project is starting).

Daniel

On Thu, Dec 16, 2021 at 7:17 PM Daniel Swanson
<awesomeevildu...@gmail.com> wrote:
>
> Greetings Apertiumers!
>
> Figuring out how to incorporate UD parsers into Apertium pipelines is
> something that's been on my todo list for a while, but with the
> unfortunate property that it keeps getting sidelined by projects that
> have deadlines.
>
> With regard to your specific issue, here are the options I can think of:
>
> 1. apertium-transfer / chunking
> The chunker can pretty much only process adjacent words. You can
> encode dependency labels to some extent (e.g.
> ^green/green<adj><sint><@amod>$), and the rules can refer to those
> tags, but I don't think there's any way to access the actual relations
> that isn't incredibly hacky and fragile.
>
> 2. apertium-recursive
> This was created precisely because chunking can't handle long-distance
> relationships, but to actually use it, you'd end up somehow encoding
> and then re-parsing the tree structure, which is still fairly fragile
> and probably an enormous waste of energy.
>
> 3. Constraint Grammar
> VISL CG-3 can manipulate dependency trees, and writing agreement rules
> would be fairly straightforward, though you'd have to write them from
> scratch rather than copying from existing sources.
>
> 4. Bug me to make a real solution
> Prototyping a pipeline module to do pretty much exactly what you're
> talking about is nominally fairly high on my todo list, and if someone
> is actually waiting for it there's a decent amount of hope that I'll
> actually start it rather than some other project.
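
To make the tag encoding from option 1 concrete, here is a minimal Python sketch that formats one parsed token as a stream-format lexical unit with the dependency label appended as an extra tag. The function name and argument layout are hypothetical, and it does not escape reserved characters, so treat it as an illustration rather than a converter.

```python
def to_lexical_unit(surface, lemma, tags, deprel=None):
    """Format one token as ^surface/lemma<tag>...$ (hypothetical helper).

    If a dependency label is given, it is appended as an extra tag of the
    form <@label>, as in the example from option 1. No escaping of
    reserved characters is attempted.
    """
    all_tags = list(tags)
    if deprel is not None:
        all_tags.append("@" + deprel)
    tag_str = "".join("<" + t + ">" for t in all_tags)
    return "^" + surface + "/" + lemma + tag_str + "$"


# Reproduces the lexical unit quoted in option 1.
print(to_lexical_unit("green", "green", ["adj", "sint"], "amod"))
```

Note that this only carries the label along; the head of each relation is lost, which is the limitation option 1 describes.
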
>
> If your main concern is agreement, 3 strikes me as a pretty good
> option. On the other hand, if you actually need to modify the tree
> structure, 3 might get complicated, in which case I'd recommend 4.
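
For the agreement case, the logic itself is small once the head of each word is available. Below is a rough Python sketch, independent of any particular tool: it is not CG-3 and not the UD-transfer prototype. The `Token` class and `agree_gender` function are made up for illustration; the field names follow the CoNLL-U columns, and `Gender`, `amod`, `ADJ` and `NOUN` are the usual UD names.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """One word of a dependency parse (hypothetical container)."""
    id: int
    form: str
    upos: str
    head: int      # id of the head token, 0 for the root
    deprel: str
    feats: dict = field(default_factory=dict)


def agree_gender(tokens):
    """Copy Gender from each noun onto its amod adjectives.

    Returns the ids of the tokens whose features were changed. Only the
    feature is updated; regenerating the surface form is a separate step.
    """
    by_id = {t.id: t for t in tokens}
    changed = []
    for tok in tokens:
        if tok.upos != "ADJ" or tok.deprel != "amod":
            continue
        head = by_id.get(tok.head)
        if head is None or head.upos != "NOUN":
            continue
        gender = head.feats.get("Gender")
        if gender is not None and tok.feats.get("Gender") != gender:
            tok.feats["Gender"] = gender
            changed.append(tok.id)
    return changed


# Two adjectives attached to the same noun; the first one disagrees.
sentence = [
    Token(1, "большой", "ADJ", 3, "amod", {"Gender": "Masc"}),
    Token(2, "красный", "ADJ", 3, "amod", {"Gender": "Fem"}),
    Token(3, "книга", "NOUN", 0, "root", {"Gender": "Fem"}),
]
print(agree_gender(sentence))
```

Because the lookup goes through the head id rather than position, the adjective does not need to be adjacent to the noun, which is the part the chunker cannot do.
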
>
> Daniel
>
> On Thu, Dec 16, 2021 at 5:20 PM Виктор Булатов <bt.uy...@gmail.com> wrote:
> >
> > Hi everyone. The Interslavic language is a constructed language that is 
> > created in such a way that people from Slavic countries are able to 
> > understand most of it without any prior education. It has a Wikipedia page 
> > and everything (maybe we will even have an ISO 639-3 code "ISV" in the 
> > future, fingers crossed!).
> >
> > I'm looking into developing some sort of MT system for Interslavic (mainly 
> > the "Some Natural Slavic Language -> Interslavic" direction). I've managed 
> > to cobble together a prototype with Russian UDPipe and ISV morphological data/rules 
> > before finding out about Apertium (and you guys seem interesting).
> >
> > The thing is, Russian and Czech are probably the richest Slavic languages 
> > in terms of NLP resources. Apertium obviously isn't going to beat a 
> > dependency parser that was trained on >1M labeled sentences. So, I don't 
> > really need any of the earlier stages of the Apertium pipeline. However, 
> > the chunking and multi-word-expression modules seem promising, especially 
> > given that I probably could re-use already existing rules (that are written 
> > for different Slavic languages, but it doesn't matter).
> >
> > So, my question is: is it possible to use the chunking module in isolation? 
> > Preferably in a way that allows manipulation of UDPipe's dependency trees? 
> > For example, to ensure gender agreement between a noun and attached 
> > adjectives.
> >
> > I would be happy to hear any other advice!
> > _______________________________________________
> > Apertium-stuff mailing list
> > Apertium-stuff@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/apertium-stuff

