> On 17 Mar 2015, at 15:45, Stephan Eggermont <[email protected]> wrote:
> 
> I tried it myself; Java seems to be 7 times faster on a 35 MB jfreechart.mse
> file I found on GitHub. Moose 5.1 managed about 30 MB/s.
> 
> UTF8 is rather suboptimal for source code. Nearly all of it is
> ASCII, which can be processed a machine word at a time instead of a byte
> at a time. There were earlier discussions about that:
> http://forum.world.st/Fastest-utf-8-encoder-contest-td4634566.html
> 
> Stephan

Thanks for the pointer to the file (finally!).

Using this file: 
https://raw.githubusercontent.com/mircealungu/experiments-polymorphism/master/fileouts/jfreechart.mse
which is indeed 35 MB, we can do better.

Since

(FileLocator desktop / 'jfreechart.mse') binaryReadStreamDo: [ :in | 
  in contents allSatisfy: [ :each | each < 128 ] ].

is true, the file is pure ASCII and we can skip decoding.

For me, just iterating over the raw bytes is pretty fast now:

[
| count |
count := 0.
(FileLocator desktop / 'jfreechart.mse') binaryReadStreamDo: [ :in | 
  in contents do: [ :each | count := count + 1 ] ].
count
] timeToRun. 

"0:00:00:00.637"

Adding UTF8 decoding (implemented in Pharo) makes it more than 10x slower:

[
| count |
count := 0.
(FileLocator desktop / 'jfreechart.mse') binaryReadStreamDo: [ :in | 
  in contents utf8Decoded do: [ :each | count := count + 1 ] ].
count
] timeToRun. "0:00:00:07.45"
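
Putting the two together, one option is to test for pure ASCII first and only fall back to UTF8 decoding when a non-ASCII byte shows up. This is just a rough sketch that I have not timed on this file; decodeFast is a hypothetical helper name, and I am assuming ByteArray>>asString maps each byte straight to a character without any decoding.

```smalltalk
| decodeFast |
"Hypothetical helper: skip UTF8 decoding when every byte is 7-bit ASCII."
decodeFast := [ :bytes |
  (bytes allSatisfy: [ :each | each < 128 ])
    ifTrue: [ bytes asString ]       "assumed: one character per byte, no decoding"
    ifFalse: [ bytes utf8Decoded ] ]. "slow path for real multi-byte content"

"Read the whole file as bytes and answer the number of characters."
(FileLocator desktop / 'jfreechart.mse') binaryReadStreamDo: [ :in |
  (decodeFast value: in contents) size ]
```

The ASCII check costs one extra pass over the bytes, so this only pays off when most files really are ASCII.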

HTH,

Sven


