Peter von Kaehne wrote:
> As a side issue of the other debate - how can I achieve NFC for a text I
> am working on via commandline utilities?
> All I can find in ICU documentation is about programming methods
> available, but I have seen no command line utilities.
DM's suggestion of using the Perl facility is fine, and I use it myself
quite often when I'm scripting in Perl. But there's also an ICU utility
that can do normalization (and much more).
uconv (meant as a replacement for iconv, if you're familiar with that)
does codepage/encoding conversion, transliteration, and normalization.
It's part of the standard ICU distribution and we have Windows binaries
on the FTP site:
http://crosswire.org/ftpmirror/pub/sword/utils/win32/uconv.zip
http://crosswire.org/ftpmirror/pub/sword/utils/win32/icudt40-big.zip
(I'd recommend the big, 7.6 MB version of the ICU data for this.)
Usage is fairly straightforward. To take a file "input" and write an
NFC-normalized copy to a file "output", you would use (assuming both are
UTF-8):
    uconv -f utf-8 -t utf-8 -x NFC -o output input
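If you want to confirm that a file really is in NFC afterwards, one option is a short check script. The following is only a sketch in Python, which nobody in this thread has mentioned. The script name check_nfc.py and the helper names are made up for illustration. It relies on the standard unicodedata module and assumes the files are UTF-8:

```python
import sys
import unicodedata


def to_nfc(text: str) -> str:
    """Return text converted to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def is_nfc(text: str) -> bool:
    """Return True if normalizing to NFC leaves the text unchanged."""
    return to_nfc(text) == text


def check_file(path: str) -> bool:
    """Read a UTF-8 file and report whether its contents are in NFC."""
    with open(path, encoding="utf-8") as handle:
        return is_nfc(handle.read())


if __name__ == "__main__":
    # Hypothetical usage: python check_nfc.py output
    for path in sys.argv[1:]:
        print(path, "NFC" if check_file(path) else "not NFC")
```

Running it on "input" and then on "output" should show whether the uconv step changed anything that matters.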
--Chris