Hi Sven and Everyone,

Instantiating character encoders is relatively expensive, as the
following snippet shows:

| count str enc new old |

count := 10000.
str := 'test-äöü'.
"new: instantiate a fresh encoder on every iteration"
new := [
    count timesRepeat: [
        (ZnCharacterEncoder newForEncoding: 'utf8') encodeString: str ] ] timeToRunWithoutGC.
"old: reuse a single encoder instance"
enc := ZnCharacterEncoder newForEncoding: 'utf8'.
old := [
    count timesRepeat: [
        enc encodeString: str ] ] timeToRunWithoutGC.
{ old. new }

"#(14 122), i.e. old (reused encoder) = 14, new (fresh encoder each time) = 122"

What do you think of the idea of caching default instances of encoders?

My idea was to have a dictionary in ZnCharacterEncoder and add a class
method #forEncoding:.  This would lazily create an instance, store it
in the dictionary, and then re-use the instance whenever that encoder
is requested.  If people really want their own instance (not sure why,
there aren't any instance variables, and so there's no state), they
can use the existing #newForEncoding:.
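
Roughly, I imagine something like the following on the class side of
ZnCharacterEncoder.  This is an untested sketch: the class variable
name DefaultEncoders is just a placeholder, and I haven't considered
thread safety.  Note that the key is used exactly as given, so
different spellings of the same encoding name would each get their own
cached instance.

forEncoding: encodingName
    "Answer a shared encoder for encodingName, creating and caching
    it on first use.  DefaultEncoders is a hypothetical class variable."

    ^ (DefaultEncoders ifNil: [ DefaultEncoders := Dictionary new ])
        at: encodingName
        ifAbsentPut: [ self newForEncoding: encodingName ]

Callers would then write:

(ZnCharacterEncoder forEncoding: 'utf8') encodeString: 'test-äöü'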

For example, sending #exists to a file reference currently
instantiates two encoders.

This was actually previously fixed by #20273[1], but seems to have
been lost with the refactoring of FilePluginPrims to File.  From
memory, Cyril's application saved 12 seconds by caching the encoder.
But in that case the instance was private to the file plugin
primitives.  This would be a more general solution.

[1] https://pharo.fogbugz.com/f/cases/20273/FilePluginPrims-could-save-an-encoder-to-speedup-some-methods

Thanks,
Alistair
