On 2015-12-22 19:12, Joan wrote:
> Francis, and all,
>
>> I've recently found out that pairs can be enabled without a real
>> "release".
>
> Fantastic!
>
>> Can you give me a list of what improvements you expect to see? Is it
>> just the vocabulary?
>
> Yes. Besides the file document.txt that I am attaching, there are some
> words such as biographie, filmographie, neveu, petit-fils...
>
>> I didn't understand what you meant by "strange characters".
>
> They are mainly created when the original contains <ref name="XX">.
> The "strange characters" are: >span, abbr class, contenteditable...
> You can see an example by comparing
> https://ca.wikipedia.org/w/index.php?title=Bob_Rafelson&action=edit
> with
> https://fr.wikipedia.org/w/index.php?title=Bob_Rafelson&action=edit

This is a problem for the ContentTranslation team. It seems like the
problem has to do with templates.

>> Would anyone be against me converting fr-ca to:
>>
>> * three-letter codes
>> * monolingual language packages
>> * adding lexical selection support
>> ?
>
> On our part, no problem.

Ok. Here are the problems I find:

* Using the large French lexicon from either br-fr or fr-es causes a
  substantial decrease in translation quality.
* The tagger is really bad.

Here is a comparison of before/after.

http://paste2.org/9GhW8LKe

I can fix the testvoc errors; that will probably take around 2-3 days. I
can also expand the lexicon by crossing fr-es and es-ca. However, I will
need help with:

1) Fixing the tagging. You can write constraint grammar rules, or you can
   manually annotate texts. If you manually annotate texts, we will need
   around 1500 sentences annotated to make a substantial improvement.

2) Proofreading the dictionary.

3) Writing lexical selection rules.

If you would be interested in working on this, I think we can get
something releasable in 2-3 full days of work. Let me know if you would
be interested and we can pick the days to meet up on IRC.
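
To make "crossing fr-es and es-ca" concrete, here is a minimal sketch of the
idea in Python. It is not the actual Apertium tooling (the real bilingual
dictionaries are XML files), and the tuple format, the cross() function and
the toy entries are all assumptions made up for illustration: a fr entry and
a ca entry are paired when they share a Spanish pivot lemma with the same
part-of-speech tag.

```python
from collections import defaultdict


def cross(fr_es, es_ca):
    """Pair fr and ca lemmas that share an es pivot with the same POS tag.

    Both inputs are lists of (source_lemma, pos, target_lemma) tuples.
    This format is hypothetical, not the Apertium dictionary format.
    """
    # Index the es-ca entries by (Spanish lemma, POS).
    by_pivot = defaultdict(set)
    for es, pos, ca in es_ca:
        by_pivot[(es, pos)].add(ca)

    # Every fr entry picks up all ca translations of its Spanish pivot.
    crossed = set()
    for fr, pos, es in fr_es:
        for ca in by_pivot.get((es, pos), ()):
            crossed.add((fr, pos, ca))
    return sorted(crossed)


# Toy entries, invented for illustration only.
fr_es = [("neveu", "n", "sobrino"), ("biographie", "n", "biografía")]
es_ca = [("sobrino", "n", "nebot"), ("biografía", "n", "biografia")]

for entry in cross(fr_es, es_ca):
    print(entry)
```

An ambiguous pivot word pairs up every fr sense with every ca sense, so a
crossed lexicon like this will contain wrong pairs. That is why the
proofreading in point 2 is needed.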

Fran
