Robert Joop wrote on 2002-01-11 21:07 UTC:
> i'm planning to write a text to postscript converter that can take
> UTF-8 as input.
>
> i won't start from scratch but intend to start from a text to
> postscript converter that uses ucs2 internally (but takes only about a
> dozen different single byte encodings as input) and extend it to read
> UTF-8, too.

Just use either glibc's standard ISO C multibyte conversion functions or
iconv() from libiconv, and it will automatically support a huge number
of encodings with a minimal amount of work. That will also make it pick
the encoding according to LC_CTYPE, not some proprietary command-line
option.

> - what (additional) features would you find nice to have?

Nice would be a best-effort attempt to output as much of Unicode as
feasible with just the standard PostScript fonts available, as
demonstrated for instance in

  http://mail.nl.linux.org/linux-utf8/2001-12/msg00012.html

That already covers most of MES-1 and is a significant improvement over
the Latin parts of ISO 8859.

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
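
As an illustration of the iconv() route suggested above, here is a
minimal sketch in C. It assumes the converter's internal form is
big-endian UCS-2 and that the iconv implementation accepts the name
"UCS-2BE" (glibc and libiconv both do, as far as I know). The helper
names to_ucs2be() and locale_charset_name() are made up for this
example and are not part of any library.

```c
#include <assert.h>
#include <iconv.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: convert inlen bytes of text in the encoding named
 * by `from` into big-endian UCS-2. Returns the number of bytes written
 * to `out`, or (size_t)-1 on any error (unknown encoding, invalid or
 * truncated input, output buffer too small). */
size_t to_ucs2be(const char *from, const char *in, size_t inlen,
                 char *out, size_t outcap)
{
    assert(from != NULL && in != NULL && out != NULL);

    iconv_t cd = iconv_open("UCS-2BE", from);   /* (tocode, fromcode) */
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *inp = (char *)in;     /* iconv() wants char **, input is not modified */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap;

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    return outcap - outleft;
}

/* Hypothetical helper: name of the character encoding selected by the
 * user's LC_CTYPE setting, suitable as the `from` argument above. */
const char *locale_charset_name(void)
{
    setlocale(LC_CTYPE, "");
    return nl_langinfo(CODESET);
}
```

A caller would pass locale_charset_name() as the `from` argument, so
that the input encoding follows LC_CTYPE and UTF-8 becomes just one of
the many encodings handled by the same code path. A real converter
would call iconv() in a loop over input chunks and handle E2BIG and
EINVAL separately instead of treating every failure alike.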
