No problem! :) The script will probably need some hacking if your input
contains reserved symbols (see [1]), but otherwise the logic is probably
OK.

F.

1. http://wiki.apertium.org/wiki/Apertium_stream_format
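
The pasted script is not reproduced in this thread, so what follows is only
a rough sketch of the same idea, not the original untitle-case.py. It assumes
input in the stream format produced by lt-proc -w, keeps the first word of
each sentence as it is, lowercases a word only when every analysis has a
lowercase lemma, and treats a backslash as the escape for reserved symbols.
The function names are made up for illustration, and the sample string is the
analysis from the quoted message with the wrapped lines joined by spaces,
which is also an assumption.

```python
import re


def split_unescaped(text, sep):
    """Split text on sep, skipping separators escaped with a backslash."""
    parts, buf, i = [], [], 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            buf.append(text[i:i + 2])  # keep the escape pair together
            i += 2
        elif text[i] == sep:
            parts.append("".join(buf))
            buf = []
            i += 1
        else:
            buf.append(text[i])
            i += 1
    parts.append("".join(buf))
    return parts


def unescape(text):
    """Drop the backslash in front of escaped reserved symbols."""
    return re.sub(r"\\(.)", r"\1", text, flags=re.DOTALL)


def fix_unit(unit, sentence_initial):
    """Return the surface form of one ^surface/analysis/...$ unit."""
    surface, *analyses = split_unescaped(unit, "/")
    if sentence_initial or not analyses:
        return surface
    # The lemma is whatever precedes the first tag of an analysis.
    lemmas = [split_unescaped(a, "<")[0] for a in analyses]
    if any(lemma.startswith("*") for lemma in lemmas):
        return surface  # unknown word: leave it alone
    if all(lemma[:1].islower() for lemma in lemmas):
        # Assumption: only the first letter needs lowercasing (title case).
        return surface[:1].lower() + surface[1:]
    return surface


def untitle_case(stream):
    """Turn analysed text back into plain text with title case undone."""
    out, i, n = [], 0, len(stream)
    sentence_initial = True
    while i < n:
        ch = stream[i]
        if ch == "\\" and i + 1 < n:
            out.append(stream[i + 1])  # escaped symbol outside a unit
            i += 2
        elif ch == "^":
            j = i + 1
            while j < n and stream[j] != "$":  # find the unescaped "$"
                j += 2 if stream[j] == "\\" else 1
            unit = stream[i + 1:j]
            out.append(unescape(fix_unit(unit, sentence_initial)))
            sentence_initial = "<sent>" in unit
            i = j + 1
        else:
            out.append(ch)  # blanks are copied through unchanged
            i += 1
    return "".join(out)


sample = (
    "^A/a<det><ind><sg>$ ^Tourist/tourist<adj>/tourist<n><sg>$ "
    "^'s/'s<gen>/be<vbser><pri><p3><sg>$ "
    "^Guide/guide<n><sg>/guide<vblex><inf>/guide<vblex><pres>$ "
    "^To/to<pr>$ "
    "^Barcelona/Barcelona<np><loc><sg>$^./.<sent>$"
)
print(untitle_case(sample))  # A tourist 's guide to Barcelona.
```

Requiring every analysis to have a lowercase lemma is a deliberately
conservative choice: a word that can also be read as a proper noun keeps its
capital. Superblanks in square brackets are not treated specially here, so
formatted input would need more work.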

On 2014-10-13 07:27, Adrián Chaves Fernández wrote:
> Wow, that solves everything. Thank you very much!
> 
> 2014-10-11 15:40 GMT+02:00 Francis Tyers <[email protected]>:
> 
>> On 2014-10-11 14:22, Adrian Chaves Fernandez wrote:
>> 
>>> The first issue I found that I would like to fix is the capitalization
>>> of headers.
>>> 
>>> For example, “A General Introduction” is translated as “Unha Introdución
>>> Xeral”, but I want it translated as “Unha introdución xeral”.
>>> 
>>> At the point when I pass the source string to Apertium, I know that it
>>> is a header string, so I can actually work around the issue outside of
>>> Apertium. This is not a perfect approach, as I might end up lowercasing
>>> proper nouns, but the headers that I am translating do not usually have
>>> those.
>>> 
>>> However, ideally I would like Apertium to detect that the text is
>>> capitalized as a header (all words, or combinations of nouns and others,
>>> are capitalized), and to uncapitalize the words.
>>> 
>>> But before I go that way, I would like to know if this can be done on
>>> the Apertium side instead somehow, and if so, whether that would be a
>>> good approach, or whether I should perform the changes on the translated
>>> string myself nonetheless.
>> 
>> For me, "A General Introduction" is bad style in English; I would prefer
>> to have the header as "A general introduction".
>> 
>> So, I think in this case it is probably better to handle this outside of
>> Apertium in a pre-normalisation stage, but using parts of the Apertium
>> pipeline to aid in the normalisation. For example, to retrieve the
>> dictionary form of a word, you can use the morphological analyser with
>> the option -w. E.g.
>> 
>> $ echo "A Tourist's Guide To Barcelona." | lt-proc -w \
>>     ~/source/apertium/trunk/apertium-en-es/en-es.automorf.bin
>> ^A/a<det><ind><sg>$ ^Tourist/tourist<adj>/tourist<n><sg>$
>> ^'s/'s<gen>/be<vbser><pri><p3><sg>$
>> ^Guide/guide<n><sg>/guide<vblex><inf>/guide<vblex><pres>$
>> ^To/to<pr>$
>> ^Barcelona/Barcelona<np><loc><sg>$^./.<sent>$
>> 
>> Then you could use a script like this:
>> 
>> http://paste2.org/gYF4j4Wj [1]
>> 
>> $ echo "A Tourist's Guide To Barcelona." | lt-proc -w \
>>     ~/source/apertium/trunk/apertium-en-es/en-es.automorf.bin | \
>>     python3 /tmp/untitle-case.py
>> A tourist 's guide to Barcelona.
>> 
>> $ echo "A Tourist's Guide To Barcelona." | lt-proc -w \
>>     ~/source/apertium/trunk/apertium-en-es/en-es.automorf.bin | \
>>     python3 /tmp/untitle-case.py | \
>>     apertium -d ~/source/apertium/trunk/apertium-en-es/ en-es
>> La guía  de un turista a Barcelona.
>> 
>> vs.
>> 
>> $ echo "A Tourist's Guide To Barcelona." | \
>>     apertium -d ~/source/apertium/trunk/apertium-en-es/ en-es
>> La guía de un Turista A Barcelona.
>> 
>> The superfluous space could be removed fairly easily, but I leave that
>> as an exercise to the reader :)
>> 
>> Regards,
>> 
>> Fran
>> 
>> 
> ------------------------------------------------------------------------------
>> Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
>> Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS
>> Reports
>> Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White
>> paper
>> Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog
>> Analyzer
>> http://p.sf.net/sfu/Zoho [2]
>> _______________________________________________
>> Apertium-stuff mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff [3]
> 
> 
> 
> Links:
> ------
> [1] http://paste2.org/gYF4j4Wj
> [2] http://p.sf.net/sfu/Zoho
> [3] https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> 
