Re: malloc: change chunk sizes to be multiple of 16 instead of power of 2
On Wed, Mar 01, 2023 at 08:39:08AM +0100, Otto Moerbeek wrote: > On Wed, Mar 01, 2023 at 08:31:47AM +0100, Theo Buehler wrote: > > > On Tue, Feb 28, 2023 at 05:52:28PM +0100, Otto Moerbeek wrote: > > > Second iteration. > > > > > > Gain back performance by allocation chunk_info pages in a bundle, and > > > use less buckets is !malloc option S. The chunk sizes used are 16, 32, > > > 48, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384, 448, 512, 640, > > > 768, 896, 1024, 1280, 1536, 1792, 2048 (and a few more for sparc84 > > > with it's 8k sized pages and loongson with it's 16k pages). > > > > > > If malloc option S (or rather cache size 0) is used we use strict > > > multiple of 16 buckets, to get as many buckets as possible. > > > > > > See the find_bucket() and bin_of() functions. Thanks to Tony Finch for > > > pointing me to code to compute nice bucket sizes. > > > > > > I think this is ready for review and wide testing. > > > > Two vala-based ports, graphics/birdfont and productivity/minder, run out > > of memory when attempting to build them with this diff (and its previous > > version) on both amd64 and arm64: > > > > ***MEMORY-ERROR***: valac[93681]: GSlice: failed to allocate 2032 bytes > > (alignment: 2048): Cannot allocate memory > > Thanks, this smells like a bug in the aligned mem case. > > + pof2 = 1 << MALLOC_MINSIZE; > > should be > > + pof2 = MALLOC_MINSIZE; > > By the looks of it. I'll get back to this. I can confirm that changing this fixes this issue with both ports on amd64 and arm64. > > -Otto > > > > > Abort trap (core dumped) > > > > To be able to build birdfont with PORTS_PRIVSEP = Yes, I had to bump > > _pbuild's datasize-cur to 15G, while 14G was not enough. That's nearly > > double the current default. On amd64 without this diff, birdfont builds > > comfortably with a datasize-cur of 1G. > > > > birdfont may be easier to investigate since the error happens early in > > the build. You can get there relatively quickly by doing > > > > cd /usr/ports/graphics/birdfont > > doas pkg_add birdfont > > make FETCH_PACKAGES= prepare > > make > > > > Not sure if the top of the trace is of much use. Here it is: > > > > #0 thrkill () at /tmp/-:3 > > #1 0x486dd8c0aacac468 in ?? () > > #2 0x0c3a34319d0e in _libc_abort () at > > /usr/src/lib/libc/stdlib/abort.c:51 > > #3 0x0c39b735724b in mem_error () from > > /usr/local/lib/libglib-2.0.so.4201.9 > > #4 0x0c39b735604f in slab_allocator_alloc_chunk () from > > /usr/local/lib/libglib-2.0.so.4201.9 > > #5 0x0c39b7355a95 in g_slice_alloc () from > > /usr/local/lib/libglib-2.0.so.4201.9 > > #6 0x0c39b735606e in g_slice_alloc0 () from > > /usr/local/lib/libglib-2.0.so.4201.9 > > #7 0x0c396c2675f5 in g_type_create_instance () from > > /usr/local/lib/libgobject-2.0.so.4200.16 > > #8 0x0c3a19ad in vala_data_type_construct_with_symbol () > >from /usr/local/lib/libvala-0.56.so.0.0 > > #9 0x0c3a19b4ecae in vala_integer_type_construct () from > > /usr/local/lib/libvala-0.56.so.0.0 > > #10 0x0c3a19b4f08c in vala_integer_type_real_copy () from > > /usr/local/lib/libvala-0.56.so.0.0 > > #11 0x0c3a19a9d46f in vala_assignment_real_check () from > > /usr/local/lib/libvala-0.56.so.0.0 > > #12 0x0c3a19ae54e5 in vala_expression_statement_real_check () > >from /usr/local/lib/libvala-0.56.so.0.0 > > #13 0x0c3a19aa6a0d in vala_block_real_check () from > > /usr/local/lib/libvala-0.56.so.0.0
Re: malloc: change chunk sizes to be multiple of 16 instead of power of 2
On Wed, Mar 01, 2023 at 08:31:47AM +0100, Theo Buehler wrote: > On Tue, Feb 28, 2023 at 05:52:28PM +0100, Otto Moerbeek wrote: > > Second iteration. > > > > Gain back performance by allocation chunk_info pages in a bundle, and > > use less buckets is !malloc option S. The chunk sizes used are 16, 32, > > 48, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384, 448, 512, 640, > > 768, 896, 1024, 1280, 1536, 1792, 2048 (and a few more for sparc84 > > with it's 8k sized pages and loongson with it's 16k pages). > > > > If malloc option S (or rather cache size 0) is used we use strict > > multiple of 16 buckets, to get as many buckets as possible. > > > > See the find_bucket() and bin_of() functions. Thanks to Tony Finch for > > pointing me to code to compute nice bucket sizes. > > > > I think this is ready for review and wide testing. > > Two vala-based ports, graphics/birdfont and productivity/minder, run out > of memory when attempting to build them with this diff (and its previous > version) on both amd64 and arm64: > > ***MEMORY-ERROR***: valac[93681]: GSlice: failed to allocate 2032 bytes > (alignment: 2048): Cannot allocate memory Thanks, this smells like a bug in the aligned mem case. + pof2 = 1 << MALLOC_MINSIZE; should be + pof2 = MALLOC_MINSIZE; By the looks of it. I'll get back to this. -Otto > > Abort trap (core dumped) > > To be able to build birdfont with PORTS_PRIVSEP = Yes, I had to bump > _pbuild's datasize-cur to 15G, while 14G was not enough. That's nearly > double the current default. On amd64 without this diff, birdfont builds > comfortably with a datasize-cur of 1G. > > birdfont may be easier to investigate since the error happens early in > the build. You can get there relatively quickly by doing > > cd /usr/ports/graphics/birdfont > doas pkg_add birdfont > make FETCH_PACKAGES= prepare > make > > Not sure if the top of the trace is of much use. Here it is: > > #0 thrkill () at /tmp/-:3 > #1 0x486dd8c0aacac468 in ?? () > #2 0x0c3a34319d0e in _libc_abort () at > /usr/src/lib/libc/stdlib/abort.c:51 > #3 0x0c39b735724b in mem_error () from > /usr/local/lib/libglib-2.0.so.4201.9 > #4 0x0c39b735604f in slab_allocator_alloc_chunk () from > /usr/local/lib/libglib-2.0.so.4201.9 > #5 0x0c39b7355a95 in g_slice_alloc () from > /usr/local/lib/libglib-2.0.so.4201.9 > #6 0x0c39b735606e in g_slice_alloc0 () from > /usr/local/lib/libglib-2.0.so.4201.9 > #7 0x0c396c2675f5 in g_type_create_instance () from > /usr/local/lib/libgobject-2.0.so.4200.16 > #8 0x0c3a19ad in vala_data_type_construct_with_symbol () >from /usr/local/lib/libvala-0.56.so.0.0 > #9 0x0c3a19b4ecae in vala_integer_type_construct () from > /usr/local/lib/libvala-0.56.so.0.0 > #10 0x0c3a19b4f08c in vala_integer_type_real_copy () from > /usr/local/lib/libvala-0.56.so.0.0 > #11 0x0c3a19a9d46f in vala_assignment_real_check () from > /usr/local/lib/libvala-0.56.so.0.0 > #12 0x0c3a19ae54e5 in vala_expression_statement_real_check () >from /usr/local/lib/libvala-0.56.so.0.0 > #13 0x0c3a19aa6a0d in vala_block_real_check () from > /usr/local/lib/libvala-0.56.so.0.0
Re: malloc: change chunk sizes to be multiple of 16 instead of power of 2
On Tue, Feb 28, 2023 at 05:52:28PM +0100, Otto Moerbeek wrote: > Second iteration. > > Gain back performance by allocation chunk_info pages in a bundle, and > use less buckets is !malloc option S. The chunk sizes used are 16, 32, > 48, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384, 448, 512, 640, > 768, 896, 1024, 1280, 1536, 1792, 2048 (and a few more for sparc84 > with it's 8k sized pages and loongson with it's 16k pages). > > If malloc option S (or rather cache size 0) is used we use strict > multiple of 16 buckets, to get as many buckets as possible. > > See the find_bucket() and bin_of() functions. Thanks to Tony Finch for > pointing me to code to compute nice bucket sizes. > > I think this is ready for review and wide testing. Two vala-based ports, graphics/birdfont and productivity/minder, run out of memory when attempting to build them with this diff (and its previous version) on both amd64 and arm64: ***MEMORY-ERROR***: valac[93681]: GSlice: failed to allocate 2032 bytes (alignment: 2048): Cannot allocate memory Abort trap (core dumped) To be able to build birdfont with PORTS_PRIVSEP = Yes, I had to bump _pbuild's datasize-cur to 15G, while 14G was not enough. That's nearly double the current default. On amd64 without this diff, birdfont builds comfortably with a datasize-cur of 1G. birdfont may be easier to investigate since the error happens early in the build. You can get there relatively quickly by doing cd /usr/ports/graphics/birdfont doas pkg_add birdfont make FETCH_PACKAGES= prepare make Not sure if the top of the trace is of much use. Here it is: #0 thrkill () at /tmp/-:3 #1 0x486dd8c0aacac468 in ?? () #2 0x0c3a34319d0e in _libc_abort () at /usr/src/lib/libc/stdlib/abort.c:51 #3 0x0c39b735724b in mem_error () from /usr/local/lib/libglib-2.0.so.4201.9 #4 0x0c39b735604f in slab_allocator_alloc_chunk () from /usr/local/lib/libglib-2.0.so.4201.9 #5 0x0c39b7355a95 in g_slice_alloc () from /usr/local/lib/libglib-2.0.so.4201.9 #6 0x0c39b735606e in g_slice_alloc0 () from /usr/local/lib/libglib-2.0.so.4201.9 #7 0x0c396c2675f5 in g_type_create_instance () from /usr/local/lib/libgobject-2.0.so.4200.16 #8 0x0c3a19ad in vala_data_type_construct_with_symbol () from /usr/local/lib/libvala-0.56.so.0.0 #9 0x0c3a19b4ecae in vala_integer_type_construct () from /usr/local/lib/libvala-0.56.so.0.0 #10 0x0c3a19b4f08c in vala_integer_type_real_copy () from /usr/local/lib/libvala-0.56.so.0.0 #11 0x0c3a19a9d46f in vala_assignment_real_check () from /usr/local/lib/libvala-0.56.so.0.0 #12 0x0c3a19ae54e5 in vala_expression_statement_real_check () from /usr/local/lib/libvala-0.56.so.0.0 #13 0x0c3a19aa6a0d in vala_block_real_check () from /usr/local/lib/libvala-0.56.so.0.0
Re: update: xf86-video-amdgpu 23.0.0
On Mon, Feb 27, 2023 at 09:17:51PM +0100, Matthieu Herrb wrote: > Hi, > > the patch below update the amdgpu(4) X.Org driver to version 23.0.0 > > To test (on machines with an AMD / Radeon GPU) > > # cd /usr/xenocara/driver/xf86-video-amdgpu > # patch -p0 -E < /path/to/this/file > # make -f Makefile.bsd-wrapper obj > # make -f Makefile.bsd-wrapper build > > and restart xenodm(1) > > Tests and oks welcome... tested on: amd64, t14 gen 1 amd, builtin eDP and external DP amdgpu0: RENOIR GC 9.3.0 6 CU rev 0x00 amd64, t495, builtin eDP amdgpu0: PICASSO GC 9.1.0 10 CU rev 0x01 amd64, vega56, DP to DVI amdgpu0: VEGA10 GC 9.0.1 56 CU rev 0x01 ok jsg@
Re: malloc: change chunk sizes to be multiple of 16 instead of power of 2
Second iteration. Gain back performance by allocation chunk_info pages in a bundle, and use less buckets is !malloc option S. The chunk sizes used are 16, 32, 48, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384, 448, 512, 640, 768, 896, 1024, 1280, 1536, 1792, 2048 (and a few more for sparc84 with it's 8k sized pages and loongson with it's 16k pages). If malloc option S (or rather cache size 0) is used we use strict multiple of 16 buckets, to get as many buckets as possible. See the find_bucket() and bin_of() functions. Thanks to Tony Finch for pointing me to code to compute nice bucket sizes. I think this is ready for review and wide testing. -Otto Index: stdlib/malloc.c === RCS file: /home/cvs/src/lib/libc/stdlib/malloc.c,v retrieving revision 1.277 diff -u -p -r1.277 malloc.c --- stdlib/malloc.c 27 Feb 2023 06:47:54 - 1.277 +++ stdlib/malloc.c 28 Feb 2023 16:49:08 - @@ -67,6 +67,11 @@ #define MALLOC_CHUNK_LISTS 4 #define CHUNK_CHECK_LENGTH 32 +#define B2SIZE(b) ((b) * MALLOC_MINSIZE) +#define B2ALLOC(b) ((b) == 0 ? MALLOC_MINSIZE : \ + (b) * MALLOC_MINSIZE) +#define BUCKETS(MALLOC_MAXCHUNK / MALLOC_MINSIZE) + /* * We move allocations between half a page and a whole page towards the end, * subject to alignment constraints. This is the extra headroom we allow. @@ -144,9 +149,9 @@ struct dir_info { int mutex; int malloc_mt; /* multi-threaded mode? */ /* lists of free chunk info structs */ - struct chunk_head chunk_info_list[MALLOC_MAXSHIFT + 1]; + struct chunk_head chunk_info_list[BUCKETS + 1]; /* lists of chunks with free slots */ - struct chunk_head chunk_dir[MALLOC_MAXSHIFT + 1][MALLOC_CHUNK_LISTS]; + struct chunk_head chunk_dir[BUCKETS + 1][MALLOC_CHUNK_LISTS]; /* delayed free chunk slots */ void *delayed_chunks[MALLOC_DELAYED_CHUNK_MASK + 1]; u_char rbytes[32]; /* random bytes */ @@ -155,6 +160,8 @@ struct dir_info { size_t bigcache_used; size_t bigcache_size; struct bigcache *bigcache; + void *chunk_pages; + size_t chunk_pages_used; #ifdef MALLOC_STATS size_t inserts; size_t insert_collisions; @@ -195,8 +202,7 @@ struct chunk_info { LIST_ENTRY(chunk_info) entries; void *page; /* pointer to the page */ u_short canary; - u_short size; /* size of this page's chunks */ - u_short shift; /* how far to shift for this size */ + u_short bucket; u_short free; /* how many free chunks */ u_short total; /* how many chunks */ u_short offset; /* requested size table offset */ @@ -247,11 +253,11 @@ static void malloc_exit(void); #endif /* low bits of r->p determine size: 0 means >= page size and r->size holding - * real size, otherwise low bits are a shift count, or 1 for malloc(0) + * real size, otherwise low bits is the bucket + 1 */ #define REALSIZE(sz, r)\ (sz) = (uintptr_t)(r)->p & MALLOC_PAGEMASK, \ - (sz) = ((sz) == 0 ? (r)->size : ((sz) == 1 ? 0 : (1 << ((sz)-1 + (sz) = ((sz) == 0 ? (r)->size : B2SIZE((sz) - 1)) static inline void _MALLOC_LEAVE(struct dir_info *d) @@ -502,7 +508,7 @@ omalloc_poolinit(struct dir_info *d, int d->r = NULL; d->rbytesused = sizeof(d->rbytes); d->regions_free = d->regions_total = 0; - for (i = 0; i <= MALLOC_MAXSHIFT; i++) { + for (i = 0; i <= BUCKETS; i++) { LIST_INIT(>chunk_info_list[i]); for (j = 0; j < MALLOC_CHUNK_LISTS; j++) LIST_INIT(>chunk_dir[i][j]); @@ -720,7 +726,7 @@ unmap(struct dir_info *d, void *p, size_ /* don't look through all slots */ for (j = 0; j < d->bigcache_size / 4; j++) { - i = (base + j) % d->bigcache_size; + i = (base + j) & (d->bigcache_size - 1); if (d->bigcache_used < BIGCACHE_FILL(d->bigcache_size)) { if (d->bigcache[i].psize == 0) @@ -764,10 +770,13 @@ unmap(struct dir_info *d, void *p, size_ } cache = >smallcache[psz - 1]; if (cache->length == cache->max) { + int fresh; /* use a random slot */ - i = getrbyte(d) % cache->max; + i = getrbyte(d) & (cache->max - 1); r = cache->pages[i]; - if (!mopts.malloc_freeunmap) + fresh = (uintptr_t)r & 1; +
Re: pf(4) drops valid IGMP/MLD messages
Hello Matthieu, On Tue, Feb 28, 2023 at 02:05:50PM +0100, Matthieu Herrb wrote: > > --- a/sys/net/pf.c > > +++ b/sys/net/pf.c > > @@ -6846,8 +6846,12 @@ pf_walk_header(struct pf_pdesc *pd, struct ip *h, > > u_short *reason) > > pd->proto = h->ip_p; > > /* IGMP packets have router alert options, allow them */ > > if (pd->proto == IPPROTO_IGMP) { > > - /* According to RFC 1112 ttl must be set to 1. */ > > - if ((h->ip_ttl != 1) || !IN_MULTICAST(h->ip_dst.s_addr)) { > > + /* > > +* According to RFC 1112 ttl must be set to 1 in all IGMP > > +* packets sent do 224.0.0.1 > > +*/ > > + if ((h->ip_ttl != 1) && > > + (h->ip_dst.s_addr == INADDR_ALLHOSTS_GROUP)) { > > DPFPRINTF(LOG_NOTICE, "Invalid IGMP"); > > REASON_SET(reason, PFRES_IPOPTIONS); > > return (PF_DROP); > > > > > Hi, > > The expression look wrong to me again. > I read in RFC1112 that correct packets should have: > > (h->ip_ttl == 1 && h->ip.ip_dst.s_addr == INADDR_ALLHOST_GROUP) the expression above is true for for Query/Report messages which are a sub-set of IGMP protocol. See Luca's emails with references to RFCs which further extend IGMP protocol. here in this particular place (pf_walk_header()) I think we really want to simply discard all IGMP datagrams sent to 224.0.0.1 with TTL different to 1. anything else should be passed further up and let pf(4) rules to make a decision on those cases. thanks and regards sashan
Re: pf(4) drops valid IGMP/MLD messages
Hi Matthieu, DVMRP messages are sent with IGMP protocol. Some Multicast Control messages (Query, Report) have an IP Destination Address belonging to 224.0.0.0/4, with TTL=1. Other DVMRP multicast control messages (Prune, Graft, Graft-Ack) have an IP Destination Address = Unicast Address of the adjacent multicast router, with TTL >1. After revision 1.1128, PF drops IGMP messages with TTL != 1 or IP Destination Address ! IN_MULTICAST. I think that your proposed test to drop invalid IGMP is too restrictive. In fact, it would block not only Prune, Graft, Graft-Ack messages (the IP Destination Address is Unicast and TTL > 1), but also: - DVMRP Probe and Report (TTL = 1 but IP Destination Address = 224.0.0.4 - DVMRP Routers) - Multicast Control messages sent to 224.0.0.2 (All Routers on this Subnet) - Multicast Control messages sent to 224.0.0.22 (IGMP) For reference: https://www.iana.org/assignments/multicast-addresses/multicast-addresses.xhtml Regards Luca Il giorno mar 28 feb 2023 alle ore 14:05 Matthieu Herrb ha scritto: > On Mon, Feb 27, 2023 at 12:04:54AM +0100, Alexandr Nedvedicky wrote: > > Hello, > > > > > > > > > 8<---8<---8<--8< > > > > diff --git a/sys/net/pf.c b/sys/net/pf.c > > > > index 8cb1326a160..c328109026c 100644 > > > > --- a/sys/net/pf.c > > > > +++ b/sys/net/pf.c > > > > @@ -6847,7 +6847,7 @@ pf_walk_header(struct pf_pdesc *pd, struct ip > *h, u_short *reason) > > > > /* IGMP packets have router alert options, allow them */ > > > > if (pd->proto == IPPROTO_IGMP) { > > > > /* According to RFC 1112 ttl must be set to 1. */ > > > > - if ((h->ip_ttl != 1) || !IN_MULTICAST(h->ip_dst.s_addr)) { > > > > + if ((h->ip_ttl != 1) && IN_MULTICAST(h->ip_dst.s_addr)) { > > > > DPFPRINTF(LOG_NOTICE, "Invalid IGMP"); > > > > REASON_SET(reason, PFRES_IPOPTIONS); > > > > return (PF_DROP); > > > > @@ -7101,8 +7101,8 @@ pf_walk_header6(struct pf_pdesc *pd, struct > ip6_hdr *h, u_short *reason) > > > >* missing then MLD message is invalid and > > > >* should be discarded. > > > >*/ > > > > - if ((h->ip6_hlim != 1) || > > > > - !IN6_IS_ADDR_LINKLOCAL(>ip6_src)) { > > > > + if ((h->ip6_hlim != 1) && > > > > + IN6_IS_ADDR_LINKLOCAL(>ip6_src)) { > > > > DPFPRINTF(LOG_NOTICE, "Invalid > MLD"); > > > > REASON_SET(reason, > PFRES_IPOPTIONS); > > > > return (PF_DROP); > > > > > > > > > > Unless I'm missing more context, this hunk looks wrong to me. Valid > > > MLD packets must have a ttl of 1 *and* come from a LL address. The > > > initial logic seems ok to me. > > > > > > > yes you are right. Your comment made me to take better look > > at RFC 1112 [1]. Section 'Informal Protocol Description' > > reads as follows: > > > >Multicast routers send Host Membership Query messages (hereinafter > >called Queries) to discover which host groups have members on > their > >attached local networks. Queries are addressed to the all-hosts > >group (address 224.0.0.1), and carry an IP time-to-live of 1. > > > > I think I've confused all-hosts group (224.0.0.1) with any multicast > > address (any class-D 224.0.0.0). I think the diff below is what we > > actually need to get IPv4 IGMP working again: > > > > [1] https://www.ietf.org/rfc/rfc1112.txt > > > > 8<---8<---8<--8< > > diff --git a/sys/net/pf.c b/sys/net/pf.c > > index 8cb1326a160..c50173186da 100644 > > --- a/sys/net/pf.c > > +++ b/sys/net/pf.c > > @@ -6846,8 +6846,12 @@ pf_walk_header(struct pf_pdesc *pd, struct ip *h, > u_short *reason) > > pd->proto = h->ip_p; > > /* IGMP packets have router alert options, allow them */ > > if (pd->proto == IPPROTO_IGMP) { > > - /* According to RFC 1112 ttl must be set to 1. */ > > - if ((h->ip_ttl != 1) || !IN_MULTICAST(h->ip_dst.s_addr)) { > > + /* > > + * According to RFC 1112 ttl must be set to 1 in all IGMP > > + * packets sent do 224.0.0.1 > > + */ > > + if ((h->ip_ttl != 1) && > > + (h->ip_dst.s_addr == INADDR_ALLHOSTS_GROUP)) { > > DPFPRINTF(LOG_NOTICE, "Invalid IGMP"); > > REASON_SET(reason, PFRES_IPOPTIONS); > > return (PF_DROP); > > > > > Hi, > > The expression look wrong to me again. > I read in RFC1112 that correct packets should have: > > (h->ip_ttl == 1 && h->ip.ip_dst.s_addr == INADDR_ALLHOST_GROUP) > > so the test to drop invalid IGMP should be: > >if (h->ip_ttl != 1 ||
[patch] usr.sbin/smtpd filter localhost relays
Hi On github someone reported an issue[0] regarding localhost MX entries. Currently smtpd will just use the localhost relay. This leads to a loop. Here a patch filtering localhost and localhost addresses for MX requests. As next step you could implement Null-MX (rfc 7505). Philipp [0] https://github.com/OpenSMTPD/OpenSMTPD/issues/1190 diff --git a/usr.sbin/smtpd/dns.c b/usr.sbin/smtpd/dns.c index f7c6b3df..7389efec 100644 --- a/usr.sbin/smtpd/dns.c +++ b/usr.sbin/smtpd/dns.c @@ -53,6 +53,7 @@ struct dns_lookup { struct dns_session *session; char*host; int preference; + int filter_localhost; }; struct dns_session { @@ -65,7 +66,7 @@ struct dns_session { int refcount; }; -static void dns_lookup_host(struct dns_session *, const char *, int); +static void dns_lookup_host(struct dns_session *, const char *, int, int); static void dns_dispatch_host(struct asr_result *, void *); static void dns_dispatch_mx(struct asr_result *, void *); static void dns_dispatch_mx_preference(struct asr_result *, void *); @@ -139,7 +140,7 @@ dns_imsg(struct mproc *p, struct imsg *imsg) case IMSG_MTA_DNS_HOST: m_get_string(, ); m_end(); - dns_lookup_host(s, host, -1); + dns_lookup_host(s, host, -1, 0); return; case IMSG_MTA_DNS_MX: @@ -205,6 +206,28 @@ dns_imsg(struct mproc *p, struct imsg *imsg) } } +static int +is_localhost(struct sockaddr *sa) +{ + struct sockaddr_in6 *ipv6; + struct sockaddr_in *ipv4; + uint32_t addr; + + switch (sa->sa_family) { + case AF_INET6: + // May check also for v6 mapped v4 addresses + ipv6 = (struct sockaddr_in6 *)sa; + return IN6_IS_ADDR_LOOPBACK(>sin6_addr); + case AF_INET: + ipv4 = (struct sockaddr_in *)sa; + addr = ntohl(ipv4->sin_addr.s_addr); + return ((addr >> 24) & 0xff) == 127; + default: + log_warnx("warn: unknown family in sockaddr"); + } + return 0; +} + static void dns_dispatch_host(struct asr_result *ar, void *arg) { @@ -215,6 +238,10 @@ dns_dispatch_host(struct asr_result *ar, void *arg) s = lookup->session; for (ai = ar->ar_addrinfo; ai; ai = ai->ai_next) { + if (lookup->filter_localhost && is_localhost(ai->ai_addr)) { + log_warnx("warn: ignore localhost address for host %s", lookup->host); + continue; + } s->mxfound++; m_create(s->p, IMSG_MTA_DNS_HOST, 0, 0, -1); m_add_id(s->p, s->reqid); @@ -280,14 +307,18 @@ dns_dispatch_mx(struct asr_result *ar, void *arg) continue; print_dname(rr.rr.mx.exchange, buf, sizeof(buf)); buf[strlen(buf) - 1] = '\0'; - dns_lookup_host(s, buf, rr.rr.mx.preference); + if (strcasecmp("localhost", buf)==0) { + log_warnx("ignore localhost MX-entry for domain <%s>", lookup->host); + continue; + } + dns_lookup_host(s, buf, rr.rr.mx.preference, 1); found++; } free(ar->ar_data); /* fallback to host if no MX is found. */ if (found == 0) - dns_lookup_host(s, s->name, 0); + dns_lookup_host(s, s->name, 0, 1); } static void @@ -340,7 +371,7 @@ dns_dispatch_mx_preference(struct asr_result *ar, void *arg) } static void -dns_lookup_host(struct dns_session *s, const char *host, int preference) +dns_lookup_host(struct dns_session *s, const char *host, int preference, int filter_localhost) { struct dns_lookup *lookup; struct addrinfo hints; @@ -350,6 +381,7 @@ dns_lookup_host(struct dns_session *s, const char *host, int preference) lookup = xcalloc(1, sizeof *lookup); lookup->preference = preference; + lookup->filter_localhost = filter_localhost; lookup->host = xstrdup(host); lookup->session = s; s->refcount++;
Re: pf(4) drops valid IGMP/MLD messages
On Mon, Feb 27, 2023 at 12:04:54AM +0100, Alexandr Nedvedicky wrote: > Hello, > > > > > 8<---8<---8<--8< > > > diff --git a/sys/net/pf.c b/sys/net/pf.c > > > index 8cb1326a160..c328109026c 100644 > > > --- a/sys/net/pf.c > > > +++ b/sys/net/pf.c > > > @@ -6847,7 +6847,7 @@ pf_walk_header(struct pf_pdesc *pd, struct ip *h, > > > u_short *reason) > > > /* IGMP packets have router alert options, allow them */ > > > if (pd->proto == IPPROTO_IGMP) { > > > /* According to RFC 1112 ttl must be set to 1. */ > > > - if ((h->ip_ttl != 1) || !IN_MULTICAST(h->ip_dst.s_addr)) { > > > + if ((h->ip_ttl != 1) && IN_MULTICAST(h->ip_dst.s_addr)) { > > > DPFPRINTF(LOG_NOTICE, "Invalid IGMP"); > > > REASON_SET(reason, PFRES_IPOPTIONS); > > > return (PF_DROP); > > > @@ -7101,8 +7101,8 @@ pf_walk_header6(struct pf_pdesc *pd, struct ip6_hdr > > > *h, u_short *reason) > > >* missing then MLD message is invalid and > > >* should be discarded. > > >*/ > > > - if ((h->ip6_hlim != 1) || > > > - !IN6_IS_ADDR_LINKLOCAL(>ip6_src)) { > > > + if ((h->ip6_hlim != 1) && > > > + IN6_IS_ADDR_LINKLOCAL(>ip6_src)) { > > > DPFPRINTF(LOG_NOTICE, "Invalid MLD"); > > > REASON_SET(reason, PFRES_IPOPTIONS); > > > return (PF_DROP); > > > > > > > Unless I'm missing more context, this hunk looks wrong to me. Valid > > MLD packets must have a ttl of 1 *and* come from a LL address. The > > initial logic seems ok to me. > > > > yes you are right. Your comment made me to take better look > at RFC 1112 [1]. Section 'Informal Protocol Description' > reads as follows: > >Multicast routers send Host Membership Query messages (hereinafter >called Queries) to discover which host groups have members on their >attached local networks. Queries are addressed to the all-hosts >group (address 224.0.0.1), and carry an IP time-to-live of 1. > > I think I've confused all-hosts group (224.0.0.1) with any multicast > address (any class-D 224.0.0.0). I think the diff below is what we > actually need to get IPv4 IGMP working again: > > [1] https://www.ietf.org/rfc/rfc1112.txt > > 8<---8<---8<--8< > diff --git a/sys/net/pf.c b/sys/net/pf.c > index 8cb1326a160..c50173186da 100644 > --- a/sys/net/pf.c > +++ b/sys/net/pf.c > @@ -6846,8 +6846,12 @@ pf_walk_header(struct pf_pdesc *pd, struct ip *h, > u_short *reason) > pd->proto = h->ip_p; > /* IGMP packets have router alert options, allow them */ > if (pd->proto == IPPROTO_IGMP) { > - /* According to RFC 1112 ttl must be set to 1. */ > - if ((h->ip_ttl != 1) || !IN_MULTICAST(h->ip_dst.s_addr)) { > + /* > + * According to RFC 1112 ttl must be set to 1 in all IGMP > + * packets sent do 224.0.0.1 > + */ > + if ((h->ip_ttl != 1) && > + (h->ip_dst.s_addr == INADDR_ALLHOSTS_GROUP)) { > DPFPRINTF(LOG_NOTICE, "Invalid IGMP"); > REASON_SET(reason, PFRES_IPOPTIONS); > return (PF_DROP); > Hi, The expression look wrong to me again. I read in RFC1112 that correct packets should have: (h->ip_ttl == 1 && h->ip.ip_dst.s_addr == INADDR_ALLHOST_GROUP) so the test to drop invalid IGMP should be: if (h->ip_ttl != 1 || h->ip.ip_dst.s_addr != INADDR_ALLHOST_GROUP) { ... return (PF_DROP } Again I may be missing something -- Matthieu Herrb
bgpd, improve RFC9234 support
When I implemented RFC9234 support I was a bit to conservative and only enabled the loop detection if the capability was enabled. This is how every other capability works but RFC9234 is special. On top of this with the addition of ASPA support the way BGP roles are handled changed and the code turned up a bit strange. Considering all of this and rereading RFC9234 I came up with this diff: - add role output to bgpctl, also adjust the capability output. Note, this changes the JSON output of neighbors a bit. - adjust the config parser to enable the RFC9234 role capability when there is a role set. iBGP and sessions with no role will not announce the role capability. - adjust the role capability announcement to be only on sessions that use either AFI IPv4 or IPv6 and SAFI 1 (AID_INET, AID_INET6). If a session has neither of the two then do not announce the role (even if set) - if there is an OPEN notification indicating that the role capability is bad only disable the capability if it is not enforced. If enforced do nothing and let the session fail (again and again). - Adjust capability negotiation, store remote_role on the peer since the neighbors role is no longer needed. - inject the OTC attribute on ingress only for AID_INET and AID_INET6. For other AIDs clear the F_ATTR_OTC_LOOP flag since OTC loop detection must be skipped for other address families. - Adjust the role logic in the RDE and use the peer->role (local role of the system) for all checks. Also remove the check if the role capability was present (removal of peer_has_open_policy()). This change switches all role checks from raw capability codes to enum role and at the same time flips the logic since the view of the checks is changed from remote system to local system. The roles change like this: rs <-> rs-client, customer <-> provider, peer <-> peer - In prefix_eligible() check also if the F_ATTR_OTC_LOOP flag is set. The RFC requires that prefixes must be considered ineligible (and not treat as withdraw) - When generating an UPDATE include the OTC attribute unless the AID is neither AID_INET or AID_INET6 or the roles do not require the addtion. No longer depend on peer_has_open_policy() there. Please test and check that my logic is actually correct and following the RFC. -- :wq Claudio Index: usr.sbin/bgpctl/output.c === RCS file: /cvs/src/usr.sbin/bgpctl/output.c,v retrieving revision 1.36 diff -u -p -r1.36 output.c --- usr.sbin/bgpctl/output.c31 Jan 2023 14:32:43 - 1.36 +++ usr.sbin/bgpctl/output.c28 Feb 2023 05:46:41 - @@ -281,6 +281,8 @@ show_neighbor_full(struct peer *p, struc printf("\n"); if (p->conf.descr[0]) printf(" Description: %s\n", p->conf.descr); + if (p->conf.ebgp && p->conf.role != ROLE_NONE) + printf(" Role: %s\n", log_policy(p->conf.role)); if (p->conf.max_prefix) { printf(" Max-prefix: %u", p->conf.max_prefix); if (p->conf.max_prefix_restart) @@ -330,7 +332,7 @@ show_neighbor_full(struct peer *p, struc } if (hascapamp || hascapaap || p->capa.peer.grestart.restart || p->capa.peer.refresh || p->capa.peer.enhanced_rr || - p->capa.peer.as4byte || p->capa.peer.role_ena) { + p->capa.peer.as4byte || p->capa.peer.policy) { printf(" Neighbor capabilities:\n"); if (hascapamp) show_neighbor_capa_mp(>capa.peer); @@ -344,10 +346,10 @@ show_neighbor_full(struct peer *p, struc show_neighbor_capa_restart(>capa.peer); if (hascapaap) show_neighbor_capa_add_path(>capa.peer); - if (p->capa.peer.role_ena) + if (p->capa.peer.policy) printf("Open Policy role %s (local %s)\n", - log_policy(p->capa.peer.role), - log_policy(p->capa.ann.role)); + log_policy(p->remote_role), + log_policy(p->conf.role)); } hascapamp = 0; @@ -360,7 +362,7 @@ show_neighbor_full(struct peer *p, struc } if (hascapamp || hascapaap || p->capa.neg.grestart.restart || p->capa.neg.refresh || p->capa.neg.enhanced_rr || - p->capa.neg.as4byte || p->capa.neg.role_ena) { + p->capa.neg.as4byte || p->capa.neg.policy) { printf(" Negotiated capabilities:\n"); if (hascapamp) show_neighbor_capa_mp(>capa.neg); @@ -374,10 +376,10 @@ show_neighbor_full(struct peer *p, struc show_neighbor_capa_restart(>capa.neg); if (hascapaap) show_neighbor_capa_add_path(>capa.neg); - if (p->capa.neg.role_ena) + if (p->capa.neg.policy)