On 2015-12-23 17:02, Joan wrote:
> Francis,
> I'm interested, but I don't know how I can help you. I'm not a
> linguist, and understand the paradigms in french/catalan had been
> hard. (points 1, 2, 3). May be the people who worked in the project.
>
> I could work in the 1500 sentences. (point 1) But no in 2-3 days! :)

The point is to get started. I can teach you how to do the stuff, but it's
most convenient for me if we can dedicate a day to learning, and then you
can work autonomously.

> Well, we can meet up on IRC (I never used it). If you explain me how
> to enter, it's possible this evening. Another day...after Christmas.
> If you can explain how to enter (off this list)....

Explained. Please invite other interested parties too.

> I have looked at the last version of the tale. Apparently there are
> many errors, but some are repeated. I don't know the original text,
> but I you attach file with comments.

I've fixed some of them. Here is a newer version:

http://paste2.org/scWY1ts9

Fran

> Joan
>
> 2015-12-22 22:09 GMT+01:00 Francis Tyers <[email protected]>:
>
>> On 2015-12-22 19:12, Joan wrote:
>>> Francis,and all
>>>
>>> I've recently found out that pairs can be enabled without a real
>>> "release". FANTASTIC!!
>>>
>>> Can you give me a list of what improvements you expect to see ? Is it
>>> just the vocabulary? YES. BESIDES THE FILE _DOCUMENT.__TXT_ THAT
>>> I ATTACH YOU, THERE ARE SOME WORDS AS BIOGRAPHIE, FILMOGRAPHIE,
>>> NEVEU, PETIT-FILS... I didn't understand what you meant by "strange
>>> characters". THEY ARE MAINLY CREATED WHEN ORIGINALLY IT FIND "<REF
>>> NAME="XX">. THE “STRANGE CHARACTERS ARE: >SPAN, ABBR CLAS,
>>> CONTENDITABLE...YOU CAN SEE FOR EXEMPLE IN
>>> HTTPS://CA.WIKIPEDIA.ORG/W/INDEX.PHP?TITLE=BOB_RAFELSON&ACTION=EDIT [1]
>>> [9] VERSUS
>>> HTTPS://FR.WIKIPEDIA.ORG/W/INDEX.PHP?TITLE=BOB_RAFELSON&ACTION=EDIT [2]
>>> [10]
>>
>> This is a problem for the ContentTranslation team. It seems like the
>> problem has to do with templates.
>>
>>> Would anyone be against me converting fr-ca to:
>>>
>>> * three-letter codes
>>> * monolingual language packages
>>> * adding lexical selection support
>>> On our part, NO PROBLEM
>>> ?
>>
>> Ok.
>> Here are the problems I find:
>>
>> * Using the large French lexicon from either br-fr or fr-es causes a
>>   substantial decrease in translation quality.
>> * The tagger is really bad.
>>
>> Here is a comparison of before/after.
>>
>> http://paste2.org/9GhW8LKe [3]
>>
>> I can fix the testvoc errors, it will probably take around 2-3 days. I
>> can also expand the lexicon from crossing fr-es and es-ca. However, I
>> will need help with:
>>
>> 1) Fixing the tagging. You can write constraint grammar rules, or you
>>    can manually annotate texts. If you manually annotate texts, we will
>>    need around 1500 sentences annotated to make a substantial improvement
>>
>> 2) Proofreading the dictionary.
>>
>> 3) Writing lexical selection rules.
>>
>> If you would be interested in working on this, I think we can get
>> something releasable in 2-3 full days of work. Let me know if you would
>> be interested and we can pick the days to meet up on IRC.
>>
>> Fran
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Apertium-stuff mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff [4]
>
> Links:
> ------
> [1] HTTPS://CA.WIKIPEDIA.ORG/W/INDEX.PHP?TITLE=BOB_RAFELSON&ACTION=EDIT
> [2] HTTPS://FR.WIKIPEDIA.ORG/W/INDEX.PHP?TITLE=BOB_RAFELSON&ACTION=EDIT
> [3] http://paste2.org/9GhW8LKe
> [4] https://lists.sourceforge.net/lists/listinfo/apertium-stuff
