On Wed, Dec 11, 2024 at 02:14:39AM +0100, nick wrote:
> Thank you! I just ran a quick test and encountered the same crash at the
> same line. I’ll have more time to investigate tomorrow and can provide
> additional details then. Do you have any other ideas I could try in the
> meantime?
A part of my previous explanation was wrong, the issue is not related to
net_copy(), as the crash happened even before.
For some reason, struct bgp_prefix is u64-aligned. That makes sense in
the original code, where u64 alignment is forced by struct net_addr, but
i have no idea why this is true even after the patch removing the forced
alignment.
The slab allocator in sl_alloc() just returns u32-aligned memory, because
it is configured for sizeof(struct bgp_prefix) + sizeof(net_addr_ip6) and
the second is 20 as net_addr_ip6 is u32-aligned. The sl_alloc() internally
only enforces word-size alignment, wich is u32 in your case.
Please disregard the previous patch. Can you try the first attached patch
(log-alignment.patch) to log alignment of different structures with or
without the second patch (net-addr-u32-align.patch) and report me results?
> On 12/11/24 1:41 AM, Ondrej Zajicek wrote:
> > On Tue, Dec 10, 2024 at 09:15:46PM +0100, nick via Bird-users wrote:
> > > I also uploaded the coredumpfile:
> > > https://github.com/PolynomialDivision/coredumpupload/blob/main/bird_coredump
> > Thanks. This seems like an interesting issue. In BIRD, generic net_addr
> > structure is explicitly u64-aligned (to accomodate VPN variants), while
> > specific net_addr_ip4 and net_addr_ip6 are just u32-aligned. In this case
> > net_addr_ip6 is allocated with u32 alignment, but then copied with
> > net_copy(), which assumes generic net_addr for arguments, and compiler
> > probably used some u64-optimized copying, which required 64-bit alignment
> > despite being on 32-bit platform,
> >
> > For starters, try the attached patch. But it is preliminary, we will revisit
> > alignment of these structures.
> >
> >
> > > > > > The root cause appears to be insufficient alignment of memory
> > > > > > allocated for
> > > > > > structures, specifically in this line:
> > > > > >
> > > > > > ```c
> > > > > > px = mb_alloc(c->pool, sizeof(struct bgp_prefix) + net->length);
> > > > > > ```
> > Note that it is really allocated two lines above, here:
> >
> > px = sl_alloc(c->prefix_slab);
> >
--
Elen sila lumenn' omentielvo
Ondrej 'Santiago' Zajicek (email: [email protected])
"To err is human -- to blame it on a computer is even more so."
diff --git a/proto/bgp/bgp.c b/proto/bgp/bgp.c
index c18a73fe..3c18f80a 100644
--- a/proto/bgp/bgp.c
+++ b/proto/bgp/bgp.c
@@ -2771,5 +2771,12 @@ struct protocol proto_bgp = {
void bgp_build(void)
{
+ log(L_WARN "ALIGN: %d %d %d %d %d",
+ (int) _Alignof(net_addr),
+ (int) _Alignof(net_addr_ip4),
+ (int) _Alignof(net_addr_ip6),
+ (int) _Alignof(node),
+ (int) _Alignof(struct bgp_prefix));
+
proto_build(&proto_bgp);
}
diff --git a/lib/net.h b/lib/net.h
index 61ed37ba..eee59de5 100644
--- a/lib/net.h
+++ b/lib/net.h
@@ -51,7 +51,7 @@ typedef struct net_addr {
u8 pxlen;
u16 length;
u8 data[20];
- u64 align[0];
+ u32 align[0];
} net_addr;
typedef struct net_addr_ip4 {