[go-nuts] Re: alignment of stack-allocated variables?
Keith made me aware that my benchmark was using binary.BigEndian via an interface instead of "unrolling" the interface to use the specific type directly.

    cpu: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
    BenchmarkUnsafe-8    10    5.991 ns/op    0 B/op    0 allocs/op
    BenchmarkEnc-8       10    6.327 ns/op    0 B/op    0 allocs/op

This now gets within 6% of the unsafe method.

On Saturday, March 4, 2023 at 3:53:42 PM UTC+1 TheDiveO wrote:
> Keith, thank you very much for your feedback, it is highly appreciated!
>
> With this in mind, it's time for lies, more lies, and statistics,
> benchmarking the three different implementations below:
>
>     func (r *Reader) Uint32() uint32 {
>         if r.err != nil {
>             return 0
>         }
>         var s struct {
>             _ [0]uint32
>             b [4]byte
>         }
>         _, r.err = r.buff.Read(s.b[:])
>         if r.err != nil {
>             return 0
>         }
>         return *(*uint32)(unsafe.Pointer(&s.b[0]))
>     }
>
>     func (r *Reader) Uint32X() uint32 {
>         if r.err != nil {
>             return 0
>         }
>         var v uint32
>         _, r.err = r.buff.Read((*[4]byte)(unsafe.Pointer(&v))[:])
>         if r.err != nil {
>             return 0
>         }
>         return v
>     }
>
>     func (r *Reader) Uint32N() uint32 {
>         if r.err != nil {
>             return 0
>         }
>         b := make([]byte, 4)
>         _, r.err = r.buff.Read(b)
>         if r.err != nil {
>             return 0
>         }
>         return hostnative.Uint32(b)
>     }
>
> The benchmarking results using "go test -bench=. -benchtime=60s -benchmem .":
>
>     cpu: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
>     BenchmarkReadUint32
>     BenchmarkReadUint32-8     10    5.974 ns/op    0 B/op    0 allocs/op
>     BenchmarkReadUint32X
>     BenchmarkReadUint32X-8    10    5.977 ns/op    0 B/op    0 allocs/op
>     BenchmarkReadUint32N
>     BenchmarkReadUint32N-8    10    20.81 ns/op    4 B/op    1 allocs/op
>
> The two "unsafe" contenders are absolutely neck and neck, so in terms of
> better readability and maintainability your proposed variant wins for me.
> And as I was somehow suspecting, encoding/binary takes about 3.5 times as
> long as the first two implementations, and throws a needless heap
> allocation into the bargain.

-- 
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/3e632724-a13d-42bc-a3ef-9c3bf2d88a26n%40googlegroups.com.
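For readers who want to reproduce the interface-versus-concrete-type point from this reply, here is a rough sketch. The names order, viaInterface, and direct are made up for illustration and do not come from the benchmark code in this thread. The assumption, worth checking on your own toolchain with "go build -gcflags=-m", is that a call through a binary.ByteOrder interface value is dispatched dynamically and is harder for the compiler to inline, whereas a call on binary.BigEndian itself can be resolved at compile time.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// order holds the byte order behind the binary.ByteOrder interface.
// Hypothetical: the thread does not show how its benchmark stored it.
var order binary.ByteOrder = binary.BigEndian

// viaInterface decodes through the interface value, so the method call
// is dispatched dynamically.
func viaInterface(b []byte) uint32 {
	return order.Uint32(b)
}

// direct decodes by calling the method on the concrete binary.BigEndian
// value, which the compiler can resolve statically.
func direct(b []byte) uint32 {
	return binary.BigEndian.Uint32(b)
}

func main() {
	b := []byte{0x01, 0x02, 0x03, 0x04}
	fmt.Printf("%#x %#x\n", viaInterface(b), direct(b))
}
```

Both functions return the same value; any difference would show up only in a benchmark or in the compiler's inlining and escape-analysis report.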
[go-nuts] Re: alignment of stack-allocated variables?
Keith, thank you very much for your feedback, it is highly appreciated!

With this in mind, it's time for lies, more lies, and statistics, benchmarking the three different implementations below:

    func (r *Reader) Uint32() uint32 {
        if r.err != nil {
            return 0
        }
        var s struct {
            _ [0]uint32
            b [4]byte
        }
        _, r.err = r.buff.Read(s.b[:])
        if r.err != nil {
            return 0
        }
        return *(*uint32)(unsafe.Pointer(&s.b[0]))
    }

    func (r *Reader) Uint32X() uint32 {
        if r.err != nil {
            return 0
        }
        var v uint32
        _, r.err = r.buff.Read((*[4]byte)(unsafe.Pointer(&v))[:])
        if r.err != nil {
            return 0
        }
        return v
    }

    func (r *Reader) Uint32N() uint32 {
        if r.err != nil {
            return 0
        }
        b := make([]byte, 4)
        _, r.err = r.buff.Read(b)
        if r.err != nil {
            return 0
        }
        return hostnative.Uint32(b)
    }

The benchmarking results using "go test -bench=. -benchtime=60s -benchmem .":

    cpu: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
    BenchmarkReadUint32
    BenchmarkReadUint32-8     10    5.974 ns/op    0 B/op    0 allocs/op
    BenchmarkReadUint32X
    BenchmarkReadUint32X-8    10    5.977 ns/op    0 B/op    0 allocs/op
    BenchmarkReadUint32N
    BenchmarkReadUint32N-8    10    20.81 ns/op    4 B/op    1 allocs/op

The two "unsafe" contenders are absolutely neck and neck, so in terms of better readability and maintainability your proposed variant wins for me. And as I was somehow suspecting, encoding/binary takes about 3.5 times as long as the first two implementations, and throws a needless heap allocation into the bargain.

On Saturday, March 4, 2023 at 1:20:01 AM UTC+1 Keith Randall wrote:
> If you're using unsafe anyway, I'd go the other direction, casting from
> the larger alignment to the smaller one. That avoids any alignment concerns.
>
>     var x uint32
>     b := (*[4]byte)(unsafe.Pointer(&x))[:]
>     r.buff.Read(b)
>     return x
>
> I would encourage you to use encoding/binary though. It all works out just
> as well without unsafe, with a bit of trickiness around making sure that
> calls can be resolved and inlined.
>
>     b := make([]byte, 4)
>     buf.Read(b)
>     if little { // some global variable (or constant) you set
>         return binary.LittleEndian.Uint32(b)
>     }
>     return binary.BigEndian.Uint32(b)
[go-nuts] Re: alignment of stack-allocated variables?
If you're using unsafe anyway, I'd go the other direction, casting from the larger alignment to the smaller one. That avoids any alignment concerns.

    var x uint32
    b := (*[4]byte)(unsafe.Pointer(&x))[:]
    r.buff.Read(b)
    return x

I would encourage you to use encoding/binary though. It all works out just as well without unsafe, with a bit of trickiness around making sure that calls can be resolved and inlined.

    b := make([]byte, 4)
    buf.Read(b)
    if little { // some global variable (or constant) you set
        return binary.LittleEndian.Uint32(b)
    }
    return binary.BigEndian.Uint32(b)

On Friday, March 3, 2023 at 12:30:37 PM UTC-8 TheDiveO wrote:
> In dealing with Linux netlink messages I need to decode and encode uint16,
> uint32, and uint64 numbers that are in an arbitrarily aligned byte buffer in
> an arbitrary position. In any case, these numbers are in native endianness,
> so I would like to avoid having to go through encoding/binary.
>
>     buff := bytes.NewBuffer(/* some data */)
>
>     // ...
>
>     func foo() uint32 {
>         var s struct {
>             _ [0]uint32
>             b [4]byte
>         }
>         r.buff.Read(s.b[:])
>         return *(*uint32)(unsafe.Pointer(&s.b[0]))
>     }
>
> Will the Go compiler (1.19+) allocate s on the stack with the correct
> alignment for its element b, so that the unsafe.Pointer operation works
> correctly on different CPU architectures?
>
> Or is this inefficient in some subtle way, so that my attempt to avoid
> non-stack allocations is moot anyway?