Hi Alistair,

Thanks for the feedback, but consider:

count := 10000.
str := 'test-äöü'.
[
   count timesRepeat: [
       ZnCharacterEncoder utf8 encodeString: str
] ] timeToRunWithoutGC.

=> 8

#newForEncoding: is indeed slow, as it has to look through all known encoders,
which involves searching all subclasses, and that is a PITA.

ZnCharacterEncoder class>>#utf8 actually uses a cached default instance (see 
ZnUTF8Encoder class>>#default). As this is the most used encoder, with no 
customization options, it makes sense. For example, #utf8Encoded and 
#utf8Decoded use this instance.
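
A quick playground check of this (a sketch; both expressions should answer 
true if the description above holds in your image):

"The convenience accessor answers the cached default instance"
ZnCharacterEncoder utf8 == ZnUTF8Encoder default.
"The String convenience method produces the same bytes as the encoder"
'test-äöü' utf8Encoded = (ZnCharacterEncoder utf8 encodeString: 'test-äöü').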

However, some (byte, utf16, utf32) encoders do have state (endianness, 
strictness, ...), so global caching could be an issue.
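
A sketch of how that could go wrong, assuming the utf16 encoder understands 
#beLittleEndian (that selector is an assumption here, check it in your image):

| shared |
"Imagine this instance came out of a global cache"
shared := ZnCharacterEncoder newForEncoding: 'utf16'.
"One client reconfigures what it believes is its own encoder"
shared beLittleEndian.
"Every other client holding the same instance now encodes little-endian too"
shared encodeString: 'test-äöü'.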

Users of the encoders are free to cache their own instances.
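
For instance, a client could keep a small private cache. A minimal sketch, 
where cache and encoderFor are hypothetical playground variables, not part 
of Zinc:

| cache encoderFor |
cache := Dictionary new.
"Create each encoder once per identifier, then reuse it"
encoderFor := [ :name |
   cache at: name ifAbsentPut: [ ZnCharacterEncoder newForEncoding: name ] ].
(encoderFor value: 'utf8') encodeString: 'test-äöü'.
"The second lookup answers the identical instance"
(encoderFor value: 'utf8') == (encoderFor value: 'utf8').

The slow #newForEncoding: lookup then runs only once per identifier, and the 
instance stays private to its owner.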

I am not (yet) convinced that a global cache is needed.

Sven

> On 28 Aug 2018, at 12:43, Alistair Grant <[email protected]> wrote:
> 
> Hi Sven and Everyone,
> 
> Instantiation of character encoders is relatively expensive, as can be
> seen by the following code snippet:
> 
> | count str ba enc new old |
> 
> count := 10000.
> str := 'test-äöü'.
> new := [
>    count timesRepeat: [
>        (ZnCharacterEncoder newForEncoding: 'utf8') encodeString: str
> ] ] timeToRunWithoutGC.
> enc := ZnCharacterEncoder newForEncoding: 'utf8'.
> old := [
>    count timesRepeat: [
>        enc encodeString: str ] ] timeToRunWithoutGC.
> { old. new. }
> 
> " #(14 122)"
> 
> What do you think of the idea of caching default instances of encoders?
> 
> My idea was to have a dictionary in ZnCharacterEncoder and add a class
> method #forEncoding:.  This would lazily create an instance, store it
> in the dictionary, and then re-use the instance whenever that encoder
> is requested.  If people really want their own instance (not sure why,
> there aren't any instance variables, and so there's no state), they
> can use the existing #newForEncoding:.
> 
> For example, sending #exists to a file reference currently
> instantiates two encoders.
> 
> This was actually previously fixed by #20273[1], but seems to have
> been lost with the refactoring of FilePluginPrims to File.  From
> memory, Cyril's application saved 12 seconds by caching the encoder.
> But in that case the instance was private to the file plugin
> primitives.  This would be a more general solution.
> 
> [1] 
> https://pharo.fogbugz.com/f/cases/20273/FilePluginPrims-could-save-an-encoder-to-speedup-some-methods
> 
> Thanks,
> Alistair
> 

