Re: [Pharo-project] optimizing primitiveFindFirstInString

Philippe Marschall Mon, 15 Aug 2011 11:47:54 -0700

On 15.08.2011 18:07, Levente Uzonyi wrote:
> On Mon, 15 Aug 2011, Philippe Marschall wrote:
> 
>> On 15.08.2011 00:21, Levente Uzonyi wrote:
>>> On Sun, 14 Aug 2011, Philippe Marschall wrote:
>>>
>>>> On 14.08.2011 22:00, Levente Uzonyi wrote:
>>>>> On Sun, 14 Aug 2011, Philippe Marschall wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> In Seaside we get a lot of performance gains out of
>>>>>> primitiveFindFirstInString. One thing that always annoyed me a bit is
>>>>>> that it's not as optimized as it could be.
>>>>>>
>>>>>> The inclusionMap you give it are 256 consecutive boolean values (0 or
>>>>>> 1). There is no need for this to be a 256 element ByteArray when each
>>>>>> element can only be 0 or 1. We could as well make it a 32 element
>>>>>> ByteArray and each byte holding eight bit values. Instead of using
>>>>>> the
>>>>>> asciiValue to directly index into the inclusionMap we would use
>>>>>> the top
>>>>>> five bits to index into the inclusionMap and the bottom 3 bits to
>>>>>> "index
>>>>>> into the byte".
>>>>>>
>>>>>> Did that make any sense?
>>>>>
>>>>> Do you want to save space?
>>>>
>>>> I'm trying to trade memory access (which is slow) for a bit shift and
>>>> two bit ands (which is fast).
>>>
>>> 256 bytes + object header easily fit into the L1 cache of any recent x86
>>> CPU, even the old P4's had 8kB of it.
>>
>> Sure once it's in the cache then it's in the cache. But a cache line is
>> only 64 bytes. So a 32 element array should fit and a 256 element
>> doesn't. So a 32 element array should be there in one memory access
>> while a 256 element array probably takes five.
> 
> A 32 element array needs 1-2 cache lines, while the 256 element array
> needs 4-5. It's not that big difference IMHO. (Especially if you want to
> pass a ByteArray and convert it ot a 32 byte array in C.)
> 
>>
>>> Moving less bytes around may make
>>> some difference though, but if you create your ByteArrays right before
>>> you use them, then it's pretty likely that your object is already in the
>>> cache.
>>
>> I don't of course.
>>
>>>>
>>>>> Are you storing lots of inclusion maps (maybe
>>>>> CharacterSets)? If not, then IMHO it's not worth to adding this
>>>>> feature
>>>>> to this primitive, because runtime performance will be worse on
>>>>> both the
>>>>> image side and the VM side.
>>>>
>>>> Why?
>>>
>>> Filling the contents of a compact 32 sized ByteArray in the image takes
>>> more time than a 256 byte ByteArray. On the VM side memory access +
>>> shifts + bitands cost more than just memory access.
>>
>> Uhm, it's a primitive. My understanding is this would result in "C
>> shifts + bitands".
> 
> Hm. Are you about to change the primitive to create an array of 32 bytes
> from the ByteArray argument and use that during the search?


Of course not. I'm going to create a new primitive that takes a 32
element byte array. Anything else is pointless.

>>
>>> But (as usual) a
>>> benchmark can tell if it's really worth to do it or not.
>>
>> This there somewhere some documentation how one adds a primitive and
>> compiles a VM (Cog ideally)?
> 
> This primitive is in the image, search for the method with the string
> 'primitiveFindFirstInString'. The same way you can add others methods to
> MiscPrimitivePlugin. IIRC there's a registry (an array in a method on
> the plugin's class' side) for these methods where you should add the new
> method, look for the method name (as string/symbol) to find it.

That part is clear to me.

> There are multiple ways to build VMs, but I guess you want to follow the
> way developed by Igor.

That part is not at all clear to me. Where can I find documentation
about this.

Cheers
Philippe

Re: [Pharo-project] optimizing primitiveFindFirstInString

Reply via email to