Here is my run-down on the current GSOC ideas page:
1.1 Anaphora resolution for machine translation
Nice project idea, but I'm not sure it is doable in 3 months.
1.2 Bring a released language pair up to state-of-the-art quality
Always needed
1.3 Robust tokenisation in lttoolbox
Up for grabs, we need this
1.4 Adopt an unreleased language pair
Always needed
1.5 Extend lttoolbox to have the power of HFST
I think getting this one is unlikely and requires more than 3 months.
1.6 Robust recursive transfer
Keep, this would be really great. I got asked to run a workshop on
Apertium recently and then unasked when they found out that the
formalisms didn't actually create parse trees :)
1.7 Extend weighted transfer rules
There is ongoing work on this; it would need to be supervised carefully:
https://github.com/sevilaybayatli/apertium-ambiguous
I would say a nice project would be to really use this on a new
language pair.
1.8 Improvements to the Apertium website
Not sure
1.9 User-friendly lexical selection training
I think getting this one is unlikely and requires more than 3 months.
It has also been tried several times without luck.
1.10 Light alternative format for all XML files in an Apertium language pair
I'm not sure about this one.
1.11 Bilingual dictionary enrichment via graph completion
There is code for this. It was a GSOC project last year but wasn't
merged, and I'm not sure how well it works.
1.12 UD and Apertium integration
This is a very useful project. If we can take advantage of UD corpora
we can make supervised taggers for around 70% of our languages.
1.13 Add weights to lttoolbox
This was done last year. A nice project would be to actually make use
of it.
1.14 Improving language pairs mining Mediawiki Content Translation postedits
1.15 Unsupervised weighting of automata
Open
1.16 Improvements to UD Annotatrix
This is a really useful tool.
1.17 apertium-separable language-pair integration
Agree, but I think that it should not just be apertium-separable, but
perhaps something like "upgrade a language pair to use all the latest
apertium tricks"
1.18 Create FST-based module for disambiguating
I like this idea, but I'm not sure three months is enough time, without
someone who really knows what they are doing with both the FST library
and apertium.
1.19 Python API/library for Apertium
This was mostly done, right? I think this is still a really important
project.
1.20 TIPP functionality for Apertium
Not sure
There is a lot of functionality that is not widely used that could
really be used to improve the performance of language pairs:
* apertium-separable
* weights in lttoolbox
* weighted transfer
Fran