[go-nuts] Re: alignment of stack-allocated variables?

2023-03-06 Thread TheDiveO
Keith made me aware that my benchmark was calling through the 
binary.ByteOrder interface instead of using the concrete binary.BigEndian 
type, so the call could not be resolved at compile time. With the concrete type:

cpu: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
BenchmarkUnsafe-8   10   5.991 ns/op   0 B/op   0 allocs/op
BenchmarkEnc-8      10   6.327 ns/op   0 B/op   0 allocs/op

The encoding/binary version now comes within 6% of the unsafe one.

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/3e632724-a13d-42bc-a3ef-9c3bf2d88a26n%40googlegroups.com.


[go-nuts] Re: alignment of stack-allocated variables?

2023-03-04 Thread TheDiveO
Keith, thank you very much for your feedback, it is highly appreciated!

With this in mind, it's time for lies, more lies, and statistics, 
benchmarking the three different implementations below:

func (r *Reader) Uint32() uint32 {
    if r.err != nil {
        return 0
    }
    // The zero-length uint32 array gives the struct uint32 alignment.
    var s struct {
        _ [0]uint32
        b [4]byte
    }
    _, r.err = r.buff.Read(s.b[:])
    if r.err != nil {
        return 0
    }
    return *(*uint32)(unsafe.Pointer(&s.b[0]))
}

func (r *Reader) Uint32X() uint32 {
    if r.err != nil {
        return 0
    }
    // Read straight into the uint32, viewed as a 4-byte array.
    var v uint32
    _, r.err = r.buff.Read((*[4]byte)(unsafe.Pointer(&v))[:])
    if r.err != nil {
        return 0
    }
    return v
}

func (r *Reader) Uint32N() uint32 {
    if r.err != nil {
        return 0
    }
    b := make([]byte, 4)
    _, r.err = r.buff.Read(b)
    if r.err != nil {
        return 0
    }
    return hostnative.Uint32(b)
}

The benchmarking results using "go test -bench=. -benchtime=60s -benchmem .":

cpu: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
BenchmarkReadUint32
BenchmarkReadUint32-8    10    5.974 ns/op   0 B/op   0 allocs/op
BenchmarkReadUint32X
BenchmarkReadUint32X-8   10    5.977 ns/op   0 B/op   0 allocs/op
BenchmarkReadUint32N
BenchmarkReadUint32N-8   10   20.81 ns/op    4 B/op   1 allocs/op

The two "unsafe" contenders are absolutely neck and neck, so in terms of 
readability and maintainability your proposed variant wins for me. 
And as I somewhat suspected, the encoding/binary version takes about 3.5 
times as long as the first two implementations and throws a needless heap 
allocation into the bargain. 




[go-nuts] Re: alignment of stack-allocated variables?

2023-03-03 Thread 'Keith Randall' via golang-nuts
If you're using unsafe anyway, I'd go the other direction, casting from the 
larger alignment to the smaller one. That avoids any alignment concerns.

var x uint32
b := (*[4]byte)(unsafe.Pointer(&x))[:]
r.buff.Read(b)
return x

I would encourage you to use encoding/binary though. It all works out just 
as well without unsafe, with a bit of trickiness around making sure that 
calls can be resolved and inlined.

b := make([]byte, 4)
buf.Read(b)
if little { // some global variable (or constant) you set
   return binary.LittleEndian.Uint32(b)
}
return binary.BigEndian.Uint32(b)
On Friday, March 3, 2023 at 12:30:37 PM UTC-8 TheDiveO wrote:

> In dealing with Linux netlink messages I need to decode and encode uint16, 
> uint32, and uint64 numbers that are in an arbitrarily aligned byte buffer at 
> an arbitrary position. In any case, these numbers are in native endianness, 
> so I would like to avoid having to go through encoding/binary.
>
> buff := bytes.NewBuffer(/* some data */)
>
> // ...
>
> func foo() uint32 {
>     var s struct {
>         _ [0]uint32
>         b [4]byte
>     }
>     r.buff.Read(s.b[:])
>     return *(*uint32)(unsafe.Pointer(&s.b[0]))
> }
>
> Will the Go compiler (1.19+) allocate s on the stack with the correct 
> alignment for its field b, so that the unsafe.Pointer conversion works 
> correctly on different CPU architectures?
>
> Or is this inefficient in some subtle way that makes my attempt to avoid 
> heap allocations moot anyway?
>
