Hi folks,
I've just committed some improvements to the way the Unicode-enabled
PostScript renderers work (PS, EPS, and potentially DPS if I unearth it one
day). Basically, I've added hooks so that all strings which are going to be
output by the objects are sent to the renderer first for evaluation (in a
separate, first pass), which allows the encoding pages to be built just once
(instead of redefining the encodings each time a few new entries turned up).
That code might be a little controversial, so if there are good souls
willing to review it and shout if it sucks, please do.
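
For reviewers who want the idea without reading the diff, here is a rough
sketch of the two-pass approach in Python. It is not dia's code: the
function names and the 256-slot page size are assumptions made for the
illustration (256 being the size of a PostScript Encoding vector).

```python
# Hypothetical sketch of the two-pass idea; none of these names come from dia.
PAGE_SIZE = 256  # assumed page size: one PostScript Encoding vector


def collect_chars(strings):
    """First pass: record each distinct character in first-seen order."""
    seen = {}
    for s in strings:
        for ch in s:
            seen.setdefault(ch, None)
    return list(seen)


def build_pages(chars, page_size=PAGE_SIZE):
    """Split the collected characters into encoding pages, built only once."""
    return [chars[i:i + page_size] for i in range(0, len(chars), page_size)]


# Every string the objects will output is seen before anything is emitted.
strings = ["Grumpf.", "déjà vu", "Zażółć gęślą jaźń"]
pages = build_pages(collect_chars(strings))
print(len(pages), "page(s),", len(pages[0]), "distinct characters on the first")
```

Because the pages are complete before the second pass starts, the output
only has to define each encoding once, rather than redefining it whenever
a new character shows up.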
I've also used input from #53512, so normally people NOT using ISO 8859-1
(but still an 8-bit encoding) should (or may, or might) have their encoding
used just before PS/EPS output. While I've not yet tested it (Debian is
still too b0rken to go ISO8859-15, and I can't figure out how to fix it),
this code should enable dia to output PS/EPS files which don't need an
ogonkify pass afterwards. Yes, really.
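
To show why the charset matters, here is a small Python illustration (not
dia code): the same byte names a different character in ISO 8859-1 and in
ISO 8859-15, so an encoding page that assumes latin1 would pick the wrong
glyph.

```python
raw = b"\xa4"  # one byte as it might arrive from an 8-bit locale

as_latin1 = raw.decode("iso-8859-1")   # U+00A4 CURRENCY SIGN
as_latin9 = raw.decode("iso-8859-15")  # U+20AC EURO SIGN

print(hex(ord(as_latin1)), hex(ord(as_latin9)))
```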
Performance-wise, this is pretty interesting, too: I've made a (really
biased) "benchmark" which shows good improvements (dia file attached):
muscat%wc /tmp/test*.eps
    136     675    3667 /tmp/testtext-after.eps
    280    2379   11162 /tmp/testtext-before.eps
    364    1434   11209 /tmp/testtext-nouni.eps
(nouni is the regular --disable-unicode code, -before is the state before
my changes, i.e. 0.88 with --enable-unicode, and -after is my tree right
now).
muscat%for j in nouni before after ; do
  echo "timing 100 renderings of testtext-$j.eps"
  time (for i in `seq 100` ; do pstopnm /tmp/testtext-$j.eps >/dev/null 2>/dev/null ; done)
done
timing 100 renderings of testtext-nouni.eps
( for i in `seq 100`; do; pstopnm /tmp/testtext-$j.eps > /dev/null 2> ; done )
61,62s user 7,94s system 98% cpu 1:10,81 total
timing 100 renderings of testtext-before.eps
( for i in `seq 100`; do; pstopnm /tmp/testtext-$j.eps > /dev/null 2> ; done )
20,76s user 3,91s system 96% cpu 25,522 total
timing 100 renderings of testtext-after.eps
( for i in `seq 100`; do; pstopnm /tmp/testtext-$j.eps > /dev/null 2> ; done )
18,15s user 4,43s system 96% cpu 23,348 total
The short story: if you use only a few fonts, the new code gives an EPS
file that renders faster (in the benchmark above, roughly 3x faster than
the non-Unicode output and a little under 10% faster than 0.88 with
--enable-unicode), with custom encodings which don't assume you're using
latin1, and the resulting EPS file might be smaller (YMMV). Theoretically
(this has never been tested), if your 8-bit charset has more than 256
glyphs (sic), several encoding pages will be defined and automatic
switching will occur.
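
The page switching could look roughly like the following Python sketch. It
is an illustration under assumptions, not the committed code: the function
name and the shape of the `pages` list are made up, and the pages are kept
tiny so the switch is visible.

```python
def split_into_runs(text, pages):
    """Return [(page_index, codes)] runs; a new run starts at each page change.

    Raises KeyError for a character that is on no page, which a complete
    first pass over all strings is meant to rule out.
    """
    slot = {ch: (p, i) for p, page in enumerate(pages) for i, ch in enumerate(page)}
    runs = []
    for ch in text:
        page, code = slot[ch]
        if runs and runs[-1][0] == page:
            runs[-1][1].append(code)  # same page: extend the current run
        else:
            runs.append((page, bytearray([code])))  # page change: start a new run
    return [(page, bytes(codes)) for page, codes in runs]


# Two-slot pages for the demo; real pages would hold up to 256 entries.
pages = [["a", "b"], ["c", "d"]]
print(split_into_runs("abca", pages))
```

Each run would then be emitted after selecting the font variant that
carries that page's encoding, so a string which stays on one page costs a
single font selection.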
I'd like to make --enable-unicode the default for the next release (this
will fall back to non-Unicode, of course, if libunicode isn't present). Does
anyone have objections?
-- Cyrille
--
Grumpf.