Hey Roger,

Thanks for taking the time to engage with me on this. Really appreciate it!


On Monday, June 22, 2020 at 6:01:46 PM UTC-4, rog wrote:
>
>
>
> On Mon, 22 Jun 2020 at 22:49, Andrew Werner <awer...@gmail.com 
> <javascript:>> wrote:
>
>>
>>
>> On Monday, June 22, 2020 at 5:42:12 PM UTC-4, rog wrote:
>>>
>>>
>>> Thanks for pointing this out. I'm cool with this approach. I'll update 
>>>>> my library to utilize it (and consider also adopting the list.List, 
>>>>> though 
>>>>> I do like my freedom to pool list nodes).
>>>>>
>>>>
>>> Personally, I'd start by re-using list.List, and only use pools if there 
>>> really is a significant performance benefit to be gained there.
>>> If you just need zero-valued elements, I could imagine that sync.Pool 
>>> might end up slower than just using the GC.
>>>
>>
>> Fair. 
>>
>>
>>> Hopefully with aggressive inlining of the specialized type the compiler 
>>>>> will be able to make the optimizations that it would be able to make if 
>>>>> the 
>>>>> slicing were performed directly on an array.
>>>>>
>>>>
>>> What optimisations are you thinking of here? Are you concerned that 
>>> slice operations are too slow?
>>>
>>
>>
>> * Bound check elimination
>>
>
> I'm not sure how it could do that, given that it's inevitable that an 
> index is stored in a field with no range limits on it (i guess if you used 
> a 256-element array and a uint8 index, it could, but then it wouldn't work 
> for other array sizes).
>
Fair point. I imagine the fact that it's an array doesn't really change 
when a bounds check needs to occur; it might just change the shape of that 
check. It does seem like, in my ring buffer example, using unsigned 
integers in a few places can eliminate some bounds checking, given that 
I'm going to mod the index by `len(buf)`.

I do recognize that the above case has nothing to do with the slice/array 
discussion.



type ringBuf(type T, ArrayT Slicer(T)) struct {
    head, len uint32
    arr       ArrayT
}

func (rb *ringBuf(T, ArrayT)) at(idx uint32) *T {
    buf := rb.arr.Slice()
    // The unsigned mod by len(buf) keeps the index in range, so the
    // compiler should be able to elide the bounds check.
    return &buf[(rb.head+idx)%uint32(len(buf))]
}
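
For reference, the Slicer(T) constraint I'm assuming above would be roughly 
an interface along these lines (just my sketch in the draft's syntax), where 
Slice is expected to return a view of the underlying array rather than a copy:

type Slicer(type T) interface {
    Slice() []T
}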





> * Avoiding an indirection through the pointer in the slice header would be 
>> nice
>>
>
> Isn't it going to have to indirect through a pointer regardless of whether 
> it's a slice or an array? The array isn't stored inside the Dequeue struct 
> itself.
>

In your example, where you put the slice into the dequeue itself, yes, sure. 
I used the Dequeue as a simplified example; the motivating case is really a 
B-tree. In the B-tree we'd instead be operating on a per-node basis, where 
the node does actually contain the array. In that case we either realize a 
slice on the stack or realize a slice in the node that refers back to the 
node itself. In either case there will be an indirection through the slice 
header. I'll buy the argument that it should be cheap and generally hit cache.
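
To make that concrete, here's a rough hand-specialized sketch of the node 
layout I have in mind (specialized by hand to int; the names are just 
illustrative), contrasting access through a realized slice with direct 
indexing of the inline array:

type node struct {
    count uint16
    elems [16]int // the array is stored inline in the node
}

// Roughly what generic code constrained to a Slice() method would do:
// realize a slice header (ptr, len, cap) pointing back into the node,
// then load the element through that header.
func (n *node) getViaSlice(i int) int {
    buf := n.elems[:]
    return buf[i]
}

// What hand-specialized code can do: index the inline array directly,
// with the length known at compile time.
func (n *node) getDirect(i int) int {
    return n.elems[i]
}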
 

>
> * Compile-time evaluation of len and cap
>>
>
> Isn't that just a one-word load of memory that's very likely in the same 
> cache line as the array-pointer member?
> I can't see that this is likely to make a significant performance 
> difference, but I'd love to see some numbers.
>
Fair enough. Anything we do here is going to be better than what we have 
today. I'm just trying to push on the idea that it will be a bummer if, 
after Go generics land, people in the performance-critical space still find 
themselves using code-generating solutions like 
https://github.com/mmatczuk/go_generics to specialize data structures. 
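
I haven't measured the prototype compiler myself, but the kind of numbers 
I'd want to see would come from a microbenchmark along these lines (just a 
sketch of mine, nothing rigorous), comparing indexing through a slice header 
against indexing an array whose length is a compile-time constant:

package arrbench

import "testing"

var (
    sliceBuf = make([]int, 64) // length is not a compile-time constant
    arrayBuf [64]int
    sink     int
)

func BenchmarkSliceIndex(b *testing.B) {
    for i := 0; i < b.N; i++ {
        // Likely loads the slice header and bounds-checks the index
        // against the loaded length.
        sink += sliceBuf[i&63]
    }
}

func BenchmarkArrayIndex(b *testing.B) {
    for i := 0; i < b.N; i++ {
        // The length is the constant 64 and i&63 < 64, so the bounds
        // check (and the header load) should go away.
        sink += arrayBuf[i&63]
    }
}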
