Ah, you beat me :-)

Still, your implementation isn't loading the whole contents as the
Java version does.

The key issue is the conversion indeed.
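
In Pharo terms, skipping the conversion could look roughly like the sketch below: scan the bytes once and only run the UTF-8 decoder when a non-ASCII byte shows up. It reuses the file path from your snippet and assumes that `asString` on a ByteArray maps each byte straight to a Character. I have not timed it on this file, so treat it as an illustration rather than a measured result.

```smalltalk
"Sketch only: decode as UTF-8 only when the file is not pure ASCII."
| bytes text |
bytes := (FileLocator desktop / 'jfreechart.mse')
	binaryReadStreamDo: [ :in | in contents ].
"ASCII covers byte values 0 to 127, so every byte must be below 128."
text := (bytes allSatisfy: [ :each | each < 128 ])
	ifTrue: [ bytes asString ]
	ifFalse: [ bytes utf8Decoded ].
text size
```

The extra pass over the bytes is not free, so this only pays off when the input is mostly ASCII, as source code usually is.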

Phil
On Tue, Mar 17, 2015 at 4:17 PM, Sven Van Caekenberghe <[email protected]> wrote:
>
>> On 17 Mar 2015, at 15:45, Stephan Eggermont <[email protected]> wrote:
>>
>> I tried it myself. Java seems to be 7 times faster on a 35 MB jfreechart.mse
>> file I found on GitHub. Moose 5.1 managed about 30 MB/s.
>>
>> UTF-8 is rather suboptimal for source code. Nearly all of it is ASCII,
>> which can be processed a machine word at a time instead of byte by byte.
>> There were earlier discussions about that:
>> http://forum.world.st/Fastest-utf-8-encoder-contest-td4634566.html
>>
>> Stephan
>
> Thanks for the pointer to the file (finally !).
>
> Using this file:
> https://raw.githubusercontent.com/mircealungu/experiments-polymorphism/master/fileouts/jfreechart.mse
> which is indeed 35 MB, we can do better.
>
> Since
>
> (FileLocator desktop / 'jfreechart.mse') binaryReadStreamDo: [ :in |
>   in contents allSatisfy: [ :each | each < 128 ] ].
>
> is true, we can skip decoding.
>
> For me, it is pretty fast now
>
> [
> | count |
> count := 0.
> (FileLocator desktop / 'jfreechart.mse') binaryReadStreamDo: [ :in |
>   in contents do: [ :each | count := count + 1 ] ].
> count
> ] timeToRun.
>
> "0:00:00:00.637"
>
> Adding UTF-8 decoding (implemented in Pharo) makes it more than 10x slower:
>
> [
> | count |
> count := 0.
> (FileLocator desktop / 'jfreechart.mse') binaryReadStreamDo: [ :in |
>   in contents utf8Decoded do: [ :each | count := count + 1 ] ].
> count
> ] timeToRun. "0:00:00:07.45"
>
> HTH,
>
> Sven
>
>
>
