Hello, I would love to work on idea 1.7 and implement weighted transfer rules
on a new language pair, ideally Hindi-Sanskrit. Could someone guide me on how
to get started?

- Shivanshu

On Mon, Jan 28, 2019 at 10:43 PM Francis Tyers <fty...@prompsit.com> wrote:

> Here is my run-down on the current GSOC ideas page:
>
>      1.1 Anaphora resolution for machine translation
>
> Nice project idea, but I'm not sure it is feasible in 3 months.
>
>      1.2 Bring a released language pair up to state-of-the-art quality
>
> Always needed
>
>      1.3 Robust tokenisation in lttoolbox
>
> Up for grabs, we need this
>
>      1.4 Adopt an unreleased language pair
>
> Always needed
>
>      1.5 Extend lttoolbox to have the power of HFST
>
> I think getting this one is unlikely and requires more than 3 months.
>
>      1.6 Robust recursive transfer
>
> Keep, this would be really great. I got asked to run a workshop on Apertium
> recently and then unasked when they found out that the formalisms didn't
> actually create parse trees :)
>
>      1.7 Extend weighted transfer rules
>
> There is ongoing work on this; it would need to be supervised carefully:
>
> https://github.com/sevilaybayatli/apertium-ambiguous
>
> I would say a nice project would be to really use this on a new language pair
>
>      1.8 Improvements to the Apertium website
>
> Not sure
>
>      1.9 User-friendly lexical selection training
>
> I think getting this one is unlikely and requires more than 3 months. It has
> also been tried several times without luck.
>
>      1.10 Light alternative format for all XML files in an Apertium language pair
>
> I'm not sure about this one.
>
>      1.11 Bilingual dictionary enrichment via graph completion
>
> There is code for this; it was a GSOC project last year but wasn't merged,
> and I'm not sure how well it works.
>
>      1.12 UD and Apertium integration
>
> This is a very useful project. If we can take advantage of UD corpora we can
> make supervised taggers for around 70% of our languages.
>
>      1.13 Add weights to lttoolbox
>
> This was done last year. A nice project would be to actually make use of it.
>
>      1.14 Improving language pairs mining Mediawiki Content Translation postedits
>      1.15 Unsupervised weighting of automata
>
> Open
>
>      1.16 Improvements to UD Annotatrix
>
> This is a really useful tool.
>
>      1.17 apertium-separable language-pair integration
>
> Agree, but I think that it should not just be apertium-separable, but perhaps
> something like "upgrade a language pair to use all the latest apertium tricks"
>
>      1.18 Create FST-based module for disambiguating
>
> I like this idea, but I'm not sure three months is enough time, without
> someone who really knows what they are doing with both the FST library and
> apertium.
>
>      1.19 Python API/library for Apertium
>
> This was mostly done, right? I think this is still a really important project
>
>      1.20 TIPP functionality for Apertium
>
> Not sure
>
> There is a lot of functionality that is not widely used but could really
> improve the performance of language pairs:
>
> * apertium-separable
> * weights in lttoolbox
> * weighted transfer
>
> Fran
>
>
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>