Hello,

You can find more explanation of what you should do in this project here:
http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code#Extend_weighted_transfer_rules

Sevilay

On Fri, Feb 15, 2019 at 12:28 PM Shivanshu Sharma <sharma.shivansh...@gmail.com> wrote:

> Hello, I would love to work on 1.7 to implement weighted transfer rules on
> a new language pair, hopefully, Hindi-Sanskrit pair. Could someone guide me
> on how to get started?
>
> - Shivanshu
>
> On Mon, Jan 28, 2019 at 10:43 PM Francis Tyers <fty...@prompsit.com> wrote:
>
>> Here is my run-down on the current GSOC ideas page:
>>
>> 1.1 Anaphora resolution for machine translation
>>
>> Nice project idea, but not sure in 3 months.
>>
>> 1.2 Bring a released language pair up to state-of-the-art quality
>>
>> Always needed
>>
>> 1.3 Robust tokenisation in lttoolbox
>>
>> Up for grabs, we need this
>>
>> 1.4 Adopt an unreleased language pair
>>
>> Always needed
>>
>> 1.5 Extend lttoolbox to have the power of HFST
>>
>> I think getting this one is unlikely and requires more than 3 months.
>>
>> 1.6 Robust recursive transfer
>>
>> Keep, this would be really great. I got asked to run a workshop on Apertium
>> recently and then unasked when they found out that the formalisms didn't
>> actually create parse trees :)
>>
>> 1.7 Extend weighted transfer rules
>>
>> There is ongoing work in this, it would need to be supervised carefully:
>>
>> https://github.com/sevilaybayatli/apertium-ambiguous
>>
>> I would say a nice project would be to really use this on a new language
>> pair
>>
>> 1.8 Improvements to the Apertium website
>>
>> Not sure
>>
>> 1.9 User-friendly lexical selection training
>>
>> I think getting this one is unlikely and requires more than 3 months. Also has
>> been tried several times without luck.
>>
>> 1.10 Light alternative format for all XML files in an Apertium language pair
>>
>> I'm not sure about this one.
>>
>> 1.11 Bilingual dictionary enrichment via graph completion
>>
>> There is code for this, it was a GSOC project last year but wasn't merged, I'm
>> not sure how well it works.
>>
>> 1.12 UD and Apertium integration
>>
>> This is a very useful project. If we can take advantage of UD corpora we can
>> make supervised taggers for around 70% of our languages.
>>
>> 1.13 Add weights to lttoolbox
>>
>> This was done last year. A nice project would be to actually make use of it.
>>
>> 1.14 Improving language pairs mining Mediawiki Content Translation postedits
>> 1.15 Unsupervised weighting of automata
>>
>> Open
>>
>> 1.16 Improvements to UD Annotatrix
>>
>> This is a really useful tool.
>>
>> 1.17 apertium-separable language-pair integration
>>
>> Agree, but I think that it should not just be apertium-separable, but perhaps
>> something like "upgrade a language pair to use all the latest apertium tricks"
>>
>> 1.18 Create FST-based module for disambiguating
>>
>> I like this idea, but I'm not sure three months is enough time, without someone
>> who really knows what they are doing with both the FST library and apertium.
>>
>> 1.19 Python API/library for Apertium
>>
>> This was mostly done right? I think this is still a really important project
>>
>> 1.20 TIPP functionality for Apertium
>>
>> Not sure
>>
>> There is a lot of functionality that is not used widely that could be really
>> used to improve performance of language pairs.
>>
>> * apertium-separable
>> * weights in lttoolbox
>> * weighted transfer
>>
>> Fran

_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff