Peter von Kaehne wrote:
As a side issue of the other debate - how can I achieve NFC for a text I
am working on via commandline utilities?

All I can find in ICU documentation is about programming methods
available, but I have seen no command line utilities.

Peter
You can use perl to do it, using the following module:
http://search.cpan.org/~sadahiro/Unicode-Normalize-1.02/Normalize.pm
Note that newer versions of Perl support newer versions of Unicode. See the bottom of that page for the mapping.

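Once this is installed, something like the following should work (I'm going from memory, as I haven't used Perl significantly for quite a while):
   perl -CSD -p -i.bak -MUnicode::Normalize -e '$_ = NFC($_)' filename
The -CSD switch tells Perl to read and write the file as UTF-8; without it, NFC() sees raw bytes rather than characters and leaves UTF-8 text effectively unchanged. With -i.bak and -p, Perl renames filename to filename.bak, applies the argument of -e to every line, and prints each resulting line to a new file under the original name.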
For more details see:
   perldoc perlrun
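
If Perl is not at hand, the same idea can be sketched in Python using the standard unicodedata module. This is only a rough equivalent of the one-liner above, not something I have run against real module sources: it assumes the input is UTF-8, and the names to_nfc and normalize_file are made up for this example.

```python
import sys
import unicodedata


def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def normalize_file(path: str, backup_suffix: str = ".bak") -> None:
    """Rewrite a UTF-8 file in NFC, keeping a copy of the original.

    Assumption: the file is UTF-8. newline="" keeps line endings as they are.
    """
    with open(path, encoding="utf-8", newline="") as f:
        original = f.read()
    # Save the untouched text first, like perl's -i.bak does.
    with open(path + backup_suffix, "w", encoding="utf-8", newline="") as f:
        f.write(original)
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(to_nfc(original))


if __name__ == "__main__":
    # Usage (hypothetical script name): python3 nfc.py x.txt y.txt
    for name in sys.argv[1:]:
        normalize_file(name)
```

Saved under a name of your choice and run with one or more filenames, it leaves x.txt.bak beside each normalized x.txt.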

The tei2mod and osis2mod utilities convert their input to Unicode and normalize it to NFC by default. You can turn this off when you know the input is already NFC or that it is cp1252. Chris has said that he'd like all the module-making programs to be modified to do the same.
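
If you are not sure whether a file is already NFC, a quick check along these lines may help. It is a sketch only, again assuming UTF-8 input; first_non_nfc_line is a name invented here, and it simply compares each line with its normalized form.

```python
import sys
import unicodedata


def first_non_nfc_line(lines):
    """Return the 1-based number of the first line not in NFC, or None."""
    for number, line in enumerate(lines, start=1):
        if unicodedata.normalize("NFC", line) != line:
            return number
    return None


if __name__ == "__main__":
    # Assumption: every file named on the command line is UTF-8.
    for name in sys.argv[1:]:
        with open(name, encoding="utf-8") as f:
            bad = first_non_nfc_line(f)
        print(name, "is NFC" if bad is None else f"line {bad} is not NFC")
```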

Hope this helps.

In Him,
   DM




_______________________________________________
sword-devel mailing list: [email protected]
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page