> On 17 Mar 2015, at 15:45, Stephan Eggermont <[email protected]> wrote:
>
> I tried it myself, java seems to be 7 times faster on a 35 MB jfreechart.mse
> file I found on github. Moose 5.1 managed about
> 30 MB/s.
>
> UTF8 is rather suboptimal for source code. Nearly all of it is
> ASCII which can be processed a machine word at a time, instead of byte. There
> were earlier discussions about that
> http://forum.world.st/Fastest-utf-8-encoder-contest-td4634566.html
>
> Stephan

Thanks for the pointer to the file (finally!).

Using this file:

  https://raw.githubusercontent.com/mircealungu/experiments-polymorphism/master/fileouts/jfreechart.mse

which is indeed 35 MB, we can do better.

Since

  (FileLocator desktop / 'jfreechart.mse') binaryReadStreamDo: [ :in |
    in contents allSatisfy: [ :each | each < 128 ] ].

is true, the file is pure ASCII and we can skip decoding.

For me, it is pretty fast now:

  [ | count |
    count := 0.
    (FileLocator desktop / 'jfreechart.mse') binaryReadStreamDo: [ :in |
      in contents do: [ :each | count := count + 1 ] ].
    count ] timeToRun. "0:00:00:00.637"

Adding UTF8 decoding (implemented in Pharo) makes it more than 10x slower:

  [ | count |
    count := 0.
    (FileLocator desktop / 'jfreechart.mse') binaryReadStreamDo: [ :in |
      in contents utf8Decoded do: [ :each | count := count + 1 ] ].
    count ] timeToRun. "0:00:00:07.45"

HTH,

Sven
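
P.S. A minimal sketch of the "skip decoding when everything is ASCII" idea, in case it helps. The block name decodeFast is made up for illustration and is not part of Pharo or Moose. It assumes the whole input is already in memory as a ByteArray, and it pays for one extra pass over the bytes to do the check, so it is only worthwhile when the input is usually pure ASCII.

```smalltalk
| decodeFast |
"Hypothetical helper: answer a String for the given ByteArray.
 Pure ASCII input (every byte < 128) maps 1:1 onto characters,
 so it can be converted directly without UTF-8 decoding.
 Anything else falls back to the regular UTF-8 decoder."
decodeFast := [ :bytes |
	(bytes allSatisfy: [ :each | each < 128 ])
		ifTrue: [ bytes asString ]
		ifFalse: [ bytes utf8Decoded ] ].

"ASCII-only input takes the fast path"
decodeFast value: #[104 105].

"Non-ASCII input takes the decoding path"
decodeFast value: 'é' utf8Encoded.
```

For the file above this would be used as decodeFast value: in contents inside the binaryReadStreamDo: block.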
