Here is my run-down on the current GSOC ideas page:

    1.1 Anaphora resolution for machine translation

Nice project idea, but I'm not sure it can be done in 3 months.

    1.2 Bring a released language pair up to state-of-the-art quality

Always needed

    1.3 Robust tokenisation in lttoolbox

Up for grabs, we need this

    1.4 Adopt an unreleased language pair

Always needed

    1.5 Extend lttoolbox to have the power of HFST

I think getting this one is unlikely and requires more than 3 months.

    1.6 Robust recursive transfer

Keep, this would be really great. I was asked to run a workshop on Apertium
recently, and then unasked when they found out that the formalisms didn't
actually create parse trees :)

    1.7 Extend weighted transfer rules

There is ongoing work on this; it would need to be supervised carefully:

https://github.com/sevilaybayatli/apertium-ambiguous

I would say a nice project would be to really use this on a new language pair.

    1.8 Improvements to the Apertium website

Not sure

    1.9 User-friendly lexical selection training

I think getting this one is unlikely and requires more than 3 months. It has
also been tried several times without luck.

    1.10 Light alternative format for all XML files in an Apertium language pair

I'm not sure about this one.

    1.11 Bilingual dictionary enrichment via graph completion

There is code for this: it was a GSOC project last year but wasn't merged, and
I'm not sure how well it works.

    1.12 UD and Apertium integration

This is a very useful project. If we can take advantage of UD corpora we can
make supervised taggers for around 70% of our languages.

    1.13 Add weights to lttoolbox

This was done last year. A nice project would be to actually make use of it.

    1.14 Improving language pairs mining Mediawiki Content Translation postedits
    1.15 Unsupervised weighting of automata

Open

    1.16 Improvements to UD Annotatrix

This is a really useful tool.

    1.17 apertium-separable language-pair integration

Agree, but I think that it should not just be apertium-separable, but perhaps
something like "upgrade a language pair to use all the latest apertium tricks".

    1.18 Create FST-based module for disambiguating

I like this idea, but I'm not sure three months is enough time without someone
who really knows what they are doing with both the FST library and apertium.

    1.19 Python API/library for Apertium

This was mostly done, right? I think this is still a really important project.

    1.20 TIPP functionality for Apertium

Not sure

There is a lot of functionality that is not widely used and that could really
be used to improve the performance of language pairs:

* apertium-separable
* weights in lttoolbox
* weighted transfer

Fran

