No problem! :) The script will probably need adapting if your text contains any of the stream format's reserved symbols (see [1]), but otherwise the logic is probably OK.
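
For illustration, here is a minimal sketch of the kind of filter described above. It is not the paste2 script, whose contents are not reproduced in this thread. The function names (`untitle_stream`, `untitle_unit` and the helpers) and the heuristic itself are assumptions: a title-cased surface form is lowercased only when none of its analyses has a capitalised lemma, and sentence-initial words are left alone. Backslash-escaped reserved symbols are honoured; superblanks in square brackets are not handled.

```python
def split_unescaped(text, sep):
    """Split text on sep, skipping separators escaped with a backslash."""
    parts, current, i = [], [], 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape pair together
            i += 2
        elif text[i] == sep:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    parts.append("".join(current))
    return parts


def unescape(text):
    """Drop the backslash in front of escaped reserved symbols."""
    out, i = [], 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def lemma_of(analysis):
    """The lemma is everything before the first unescaped '<'."""
    return unescape(split_unescaped(analysis, "<")[0])


def untitle_unit(unit, sentence_initial):
    """Return the surface form of one lexical unit, lowercased if appropriate."""
    fields = split_unescaped(unit, "/")
    surface = unescape(fields[0])
    if sentence_initial:
        return surface
    # Analyses starting with '*' mark unknown words; they carry no lemma.
    lemmas = [lemma_of(a) for a in fields[1:] if not a.startswith("*")]
    title_cased = surface[:1].isupper() and surface[1:] == surface[1:].lower()
    # Heuristic (assumption): any capitalised lemma means a possible proper noun.
    if title_cased and lemmas and not any(l[:1].isupper() for l in lemmas):
        return surface[0].lower() + surface[1:]
    return surface


def untitle_stream(stream):
    """Turn `lt-proc -w` output back into plain text with header case undone."""
    out, i, n, sentence_initial = [], 0, len(stream), True
    while i < n:
        ch = stream[i]
        if ch == "\\" and i + 1 < n:       # escaped symbol between units
            out.append(stream[i + 1])
            i += 2
        elif ch == "^":                    # start of a lexical unit
            j = i + 1
            while j < n and stream[j] != "$":
                j += 2 if stream[j] == "\\" else 1
            unit = stream[i + 1:j]
            out.append(untitle_unit(unit, sentence_initial))
            sentence_initial = "<sent>" in unit  # next word starts a sentence
            i = j + 1
        else:                              # blanks are copied unchanged
            out.append(ch)
            i += 1
    return "".join(out)


sample = (
    "^A/a<det><ind><sg>$ ^Tourist/tourist<adj>/tourist<n><sg>$"
    "^'s/'s<gen>/be<vbser><pri><p3><sg>$ "
    "^Guide/guide<n><sg>/guide<vblex><inf>/guide<vblex><pres>$ "
    "^To/to<pr>$ ^Barcelona/Barcelona<np><loc><sg>$^./.<sent>$"
)
print(untitle_stream(sample))  # prints: A tourist's guide to Barcelona.
```

To use this as a pipe filter, replace the sample with `sys.stdin.read()` (after `import sys`). Because blanks are copied through unchanged, no extra space appears before "'s", provided the analyser output has none there. Words analysed under a lemma that does not resemble the surface form may still be lowercased wrongly, so treat the heuristic as a starting point.
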
F.

1. http://wiki.apertium.org/wiki/Apertium_stream_format

On 2014-10-13 07:27, Adrián Chaves Fernández wrote:
> Wow, that solves everything. Thank you very much!
>
> 2014-10-11 15:40 GMT+02:00 Francis Tyers <[email protected]>:
>
>> On 2014-10-11 14:22, Adrian Chaves Fernandez wrote:
>>
>>> The first issue I found that I would like to fix is the
>>> capitalization of headers.
>>>
>>> For example, “A General Introduction” is translated as “Unha
>>> Introdución Xeral”, but I want it translated as “Unha introdución
>>> xeral”.
>>>
>>> At the point when I pass the source string to Apertium I know that
>>> it is a header string, so I can actually work around the issue
>>> outside of Apertium. This is not a perfect approach, as I might end
>>> up lowercasing proper nouns, but the headers that I am translating
>>> do not usually have those.
>>>
>>> However, ideally I would like Apertium to detect that the text is
>>> capitalized as a header (all words, or combinations of nouns and
>>> others, are capitalized), and to uncapitalize words.
>>>
>>> But before I go that way, I would like to know if this can be done
>>> on the Apertium side instead somehow, and if so, whether that would
>>> be a good approach, or whether I should perform the changes on the
>>> translated string myself nonetheless.
>>
>> For me, in English "A General Introduction" is bad style, I would
>> prefer to have the header as "A general introduction" in English.
>>
>> So, I think in this case it is probably better to handle this outside
>> of Apertium in a pre-normalisation stage, but using parts of the
>> Apertium pipeline to aid in the normalisation. For example, in order
>> to retrieve the dictionary form of the word, you can use the
>> morphological analyser, with the option -w. E.g.
>>
>> $ echo "A Tourist's Guide To Barcelona." \
>>     | lt-proc -w ~/source/apertium/trunk/apertium-en-es/en-es.automorf.bin
>> ^A/a<det><ind><sg>$ ^Tourist/tourist<adj>/tourist<n><sg>$
>> ^'s/'s<gen>/be<vbser><pri><p3><sg>$
>> ^Guide/guide<n><sg>/guide<vblex><inf>/guide<vblex><pres>$
>> ^To/to<pr>$
>> ^Barcelona/Barcelona<np><loc><sg>$^./.<sent>$
>>
>> Then you could use a script like this:
>>
>> http://paste2.org/gYF4j4Wj
>>
>> $ echo "A Tourist's Guide To Barcelona." \
>>     | lt-proc -w ~/source/apertium/trunk/apertium-en-es/en-es.automorf.bin \
>>     | python3 /tmp/untitle-case.py
>> A tourist 's guide to Barcelona.
>>
>> $ echo "A Tourist's Guide To Barcelona." \
>>     | lt-proc -w ~/source/apertium/trunk/apertium-en-es/en-es.automorf.bin \
>>     | python3 /tmp/untitle-case.py \
>>     | apertium -d ~/source/apertium/trunk/apertium-en-es/ en-es
>> La guía de un turista a Barcelona.
>>
>> vs.
>>
>> $ echo "A Tourist's Guide To Barcelona." \
>>     | apertium -d ~/source/apertium/trunk/apertium-en-es/ en-es
>> La guía de un Turista A Barcelona.
>>
>> The superfluous space could be removed fairly easily. But I leave
>> that as an exercise to the reader :)
>>
>> Regards,
>>
>> Fran

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
