On NPF connection state handling
Hi, I have made some adjustments to the NPF connection state handling in the NetBSD -current. The state behaviour can now be configured using the configuration-level parameters -- please see explanations in the manual pages at the following locations: - npf.conf(5) manual page, the "Stateful" section. - npf-params(7) manual page, the section on the 'state.key' parameters. Let me know if you have any questions or if you would like to suggest wording improvements to these sections. Thanks. -- Mindaugas
Re: npf lock issue?
Anthony Mallet| Trying to upgrade from 7.99.44 to today's -current, I have a panic > | right away when starting npf. The boot with npf disabled is fine (see > | note below), then when manually running `npfctl reload` the machine > | reboots right aways with absolutely no diagnostic. This is an issue > | that I experiencing consistently since something like last January or > | so. > > I got a useful backtrace, it's actually failing in sljit: > > #11 0x804b3075 in panic ( > fmt=fmt@entry=0x806b6790 "uvm_km_check_empty: va %p has pa > 0x%llx") > at /usr/src/sys/kern/subr_prf.c:258 > #12 0x8044ed05 in uvm_km_check_empty ( > map=map@entry=0x8081c780 , > start=, end=18446744071572586496) > at /usr/src/sys/uvm/uvm_km.c:563 > #13 0x8045268f in uvm_map ( > map=map@entry=0x8081c780 , > startp=startp@entry=0xfe80cc383918, size=size@entry=65536, > uobj=, uoffset=uoffset@entry=-1, align=, > flags=, flags@entry=5927) at > /usr/src/sys/uvm/uvm_map.c:1096 > #14 0x8044ee4f in uvm_km_alloc ( > map=0x8081c780 , size=size@entry=65536, > align=align@entry=4096, flags=flags@entry=49) > at /usr/src/sys/uvm/uvm_km.c:621 > #15 0x80240a4d in alloc_chunk (size=65536) > at /usr/src/sys/external/bsd/sljit/dist/sljit_src/sljitExecAllocator.c:110 > #16 sljit_malloc_exec (size=) > at /usr/src/sys/external/bsd/sljit/dist/sljit_src/sljitExecAllocator.c:221 > 221 header = (struct block_header*)alloc_chunk(chunk_size); > > Does this ring a bell to anyone? This looks like a bug in sljit rather than NPF per se. The panic message suggests some kind of KVA leak. I suspect it might be a result of e.g. a free_chunk() call with an incorrect size in the sljitExecAllocator.c code. Alex -- do you want to have a look into this? -- Mindaugas
Re: The NPF firewall leaks! (was Re: in_cksum: out of data)
Tom Ivar Helbekkmowrote: > ... > > It's fine and all, but I tend to think that the simplistic first version > might automatically expand to the code in the second one. In fact, the > documentation seems to agree with me: > > By default, a stateful rule implies SYN-only flag check ("flags > S/SAFR") for the TCP packets. It is not advisable to change this > behavior; however, it can be overridden with the flags keyword. > > The code or the documentation needs to change. I vote for the code. :) There is a difference between these two: pass stateful out final all pass stateful out final proto tcp all The latter will generate an implicit "flags S/SAFR", while the former will not as it covers non-TCP protocols too. I agree that this is not really intuitive and the documentation did not clarify this either. -- Mindaugas
Re: A web interface for apropos: http://man-k.org/
Abhinav Upadhyaywrote: > Recently, I have been working on building a web interface for > apropos(1). Although, there are still plenty of things to do but I > think it is in a decent shape now and I wanted to share it with all of > you. > > http://man-k.org/ > > I would appreciate any constructive feedback that you might have and I > will try to implement it :-) That is convenient. Would you be able to add POSIX manuals? https://www.kernel.org/pub/linux/docs/man-pages/man-pages-posix/ Other systems would be handy too, e.g. illumos and some Linux distribution. -- Mindaugas
Re: Removal of acorn26 port
Justin Cormack jus...@specialbusservice.com wrote: acorn26 build but has been suspected of being broken for years. The last post to port-acorn26 was in 2011. Yes, I am sure it qualifies to move to tier III, and has done for some years. Procedurally, moving it officially to tier III and then removing all support for it from the common code to make it even more broken (it would then be removed from the build matrix), and then deleting it in 6 months might be better than just deleting it now. Back in 2009 when I removed uarea swap-out [1], I managed to find only *one* user of acorn26 - its former maintainer (bjh21). He managed to boot to single-user mode but run out of time trying multi-user. Since then nobody managed to show acorn26 being usable without uarea swapout. There is really no point to keep it dusting. [1] https://mail-index.netbsd.org/source-changes/2009/10/21/msg002198.html -- Mindaugas
Re: NPF questions, issues and observations
Harry Waddell wadd...@caravaninfotech.com wrote: I know NPF is a work in progress, and so is its documentation, but now that I have used it for a fairly large project, I have several questions and a few problems. I'm using netbsd-7 as of 3/12/15. 1. this validates $private_addr = { 10.0.0.0/8, 172.16.0.0/14, 192.168.0.0/16 } map vlan200 dynamic $private_addr - $mesh_map_addr pass from mesh_nattable to ngroutes but this does not map vlan200 dynamic mesh_nattable - $mesh_map_addr pass from mesh_nattable to ngroutes This seems like an artificial constraint, but I could be missing something. Yes. In fact, when the extended map syntax is used, the value on the left hand side (in a case of outbound NAT) is ignored because the filter criteria is explicitly defined by the pass ... rule. The parser only validates the syntax. When I was implementing this, I was considering something like: map wm0 dynamic any - $nat_ip pass from table1 to table2 However, I was not sure whether the keyword any (or perhaps explicit) would make it clearer or, contrary, would just confuse users. Thoughts? 2. Is there a way to get a listing of the NAT state table akin to ipnat -l? Well, you can run npfctl save and it will dump the configuration *and* all the connections to the /var/db/npf.db file (in PropertyList format). However, npfctl does not have a command to print them in human readable format yet. This functionality is planned, but my higher priority is to replace proplib/PropertyList with a better library and format (binary, at the very least). 3. I got the npfctl: npfctl_config_send: File exists error message. This is not the world's most useful message. I eventually tracked it down to a duplicate entry in a tree type table loaded from a file. Fixed. There are more user-unfriendly messages lurking.. I will need to go through them in a more structured way at some point. 4. Since group names are unique ( when direction is factored in ), I don't see what he advantage is to the ruleset syntax for dynamic rules. I supect this is because there's a lot of functionality in the group-opt I don't understand. Would someone provide some additional explanation of dynamic rulesets? Dynamic rulesets allow you to add/remove rules on the fly, think of iptables-style rules. In npf.conf, ruleset is just a syntactic sugar for group which indicates that the group will have the rules managed dynamically rather than statically. Does that answer your question? 5. With my large npf.conf file, npfctl comamnds and npf itself seem to hang after repeated reloads and a system reboot is required to clear the problem. Has anyone else experienced this. I think a PR is in order. This is a bug. I need more details about your problem, but I have just committed one fix in -current (and requested a pullup to netbsd-7 branch) which is likely to be a fix for the same problem you are experiencing. Pull-up ticket containing the fix is #630. 6. The line count of /etc/npf.conf and all my files for tables is now 569 lines. The old ipfilter based configuration was 1184 lines. The new configuration has 13 different network security zones — the old one had only 7. Clearly, it's possible to do pretty complicated things with npf with fewer, more readable, lines of configuration and tables make it a lot easier to maintain. Good to hear! 7. It doesn't seem to be possible to use a variable in the definition of another variable. I assume this is because the parser just makes one pass, but it would be really handy if one could do something like: The parser has some limitations. Hopefully, it will get improved soon. -- Mindaugas
Re: pf--?
Thomas Klausner t...@giga.or.at wrote: Is there still much point in keeping pf(4) in NetBSD now that we have npf(4) (and also ipfilter(4))? There are some missing features we need to implement before being able to fully replace it. Once those are done, I am all for nuking pf(4). -- Mindaugas
Re: NetBSD/amd64-current crashes during shutdown
Tom Ivar Helbekkmo t...@hamartun.priv.no wrote: After upgrading to an August 29th snapshot of 7.99.1, my amd64 system crashes during shutdown, always on the same assertion: ... FYI: Fixed in the latest netbsd-7. -- Mindaugas
Re: Kernels w/o IPX, NETATALK support build failure
John D. Baker jdba...@mylinuxisp.com wrote: -c /x/current/src/sys/net/if_loop.c In file included from /x/current/src/sys/net/if_loop.c:88:0: /x/current/src/sys/net/if_loop.c: In function 'lostart': /x/current/src/sys/net/if.h:429:30: error: 'ifq' may be used uninitialized in this function [-Werror=maybe-uninitialized] #define IF_QFULL(ifq) ((ifq)-ifq_len = (ifq)-ifq_maxlen) ^ /x/current/src/sys/net/if_loop.c:362:19: note: 'ifq' was declared here struct ifqueue *ifq; ^ /x/current/src/sys/net/if_loop.c:425:14: error: 'isr' may be used uninitialized in this function [-Werror=maybe-uninitialized] schednetisr (isr); ^ cc1: all warnings being treated as errors *** [if_loop.o] Error code 1 nbmake: stopped in /d0/build/current/obj/i386/sys/arch/i386/compile/FAYE In if_loop.c, function lostart(), ifq and isr are uninitialized unless at least either of options IPX or options NETATALK is present in the kernel configuration AND the address family matches AF_IPX or AF_APPLETALK, in the switch() statement, respectively. For the INET{,6} cases, they remain uninitialized should control reach the statements shown in the error messages above. Fixed. gcc is not being clever here: if those options are not present, then the path taking ifq/isr would not be taken. -- Mindaugas
Re: KASSERT KERNEL_LOCKED_P() failed (if_ethersubr.c:214) when stopping rpcbind
Nicolas Joly nj...@pasteur.fr wrote: njoly@lanfeust [/misc/crash] crash -M netbsd.17.core -N netbsd.17 Crash version 6.99.42,image version 6.99.42. System panicked: kernel diagnostic assertion KERNEL_LOCKED_P()failed: file /local/src/NetBSD/src/sys/net/if_ethersubr.c, line 214 Backtrace from time of crash is available. Fixed. We may start pushing the kernel-lock to the entry points of various subsystems (like Ethernet, in this case), but wrapping attach/detach is more defensive way for now. -- Mindaugas
Re: KASSERT KERNEL_LOCKED_P() failed (if_ethersubr.c:214) when stopping rpcbind
Manuel Bouyer bou...@antioche.eu.org wrote: On Tue, May 20, 2014 at 08:04:23PM +0100, Mindaugas Rasiukevicius wrote: Nicolas Joly nj...@pasteur.fr wrote: njoly@lanfeust [/misc/crash] crash -M netbsd.17.core -N netbsd.17 Crash version 6.99.42,image version 6.99.42. System panicked: kernel diagnostic assertion KERNEL_LOCKED_P ()failed: file /local/src/NetBSD/src/sys/net/if_ethersubr.c, line 214 Backtrace from time of crash is available. Fixed. We may start pushing the kernel-lock to the entry points of various subsystems (like Ethernet, in this case), but wrapping attach/detach is more defensive way for now. the IP stack is not supported to run under KERNEL_LOCK (and actually is not, because ip_input() is explicitely called without the KERNEL_LOCK held. The ethernet subsystem (and probably other link-layer subsystems) still needs the kernel lock, so grabing is again before sending a packet is the way to go (and is what IP is doing). This is why I added this KASSERT(). Running the whole IP stack under KERNEL_LOCK would be a regression. Who said anything about the whole IP stack? We are talking about PCB attach/detach only. I have restored the original behaviour before my recent change. -- Mindaugas
Re: npf build break
Valery Ushakov u...@stderr.spb.ru wrote: uint16_t bswap16(uint16_t) __constfunc; - __attribute__((__const__)) This code has been for years. Are you building with -O0 option? May be we shouldn't depend on the optimizer to take care of this? Should we provide suitable macros, say, HTONS_C(val) and HTONL_C(val) using C99 INT*_C() macros as a naming pattern? __byte_swap_*_constant may not be directly suitable for this as they include a cast which precludes its use in preprocessor directives (assuming that's a desirable feature). I agree. Would you like to add them? -- Mindaugas
Re: something's wrong
Matthias Drochner m.droch...@fz-juelich.de wrote: On Thu, 14 Nov 2013 23:41:33 +0900 Takahiro HAYASHI t-h...@abox3.so-net.ne.jp wrote: I happened unfortunately to meet this problem, but fortunately entered ddb. I was doing ./build release for amd64 on amd64 HEAD around Nov 9. Does this give any help? Yes, thanks - this helps to narrow down the problem. I don't see the real reason yet, but perhaps someone more familiar with synchronization matters can make more sense of it. Just extracting some interesting data. Thomas, Takahiro: can you try with the following change? http://mail-index.netbsd.org/source-changes/2013/11/26/msg049508.html -- Mindaugas
Re: something's wrong
Thomas Klausner w...@netbsd.org wrote: On Tue, Nov 26, 2013 at 08:34:15PM +, Mindaugas Rasiukevicius wrote: Thomas, Takahiro: can you try with the following change? http://mail-index.netbsd.org/source-changes/2013/11/26/msg049508.html Thanks for looking at this! The immediate effect is worse though: repeatable panic right after avail memory: panic: kernel diagnostic assertion level SOFTINT_CONST failed: file .../sys/kern/kern_softint.c, line 343 fatal breakpoint trap in supervisor mode That was a bit too quick from me. Sorry about that. Please cvs up. -- Mindaugas
Re: KASSERT while running dev/sysmon/t_swsensor on tmpfs
Nicolas Joly nj...@pasteur.fr wrote: Hi, When checking NetBSD -current testsuite on my amd64 main workstation i do see tmpfs related KASSERT panics when dev/sysmon/t_swsensor testcases terminate. I can reproduce it easily with the following steps : mount_tmpfs tmpfs /tmp cd /usr/tests/dev/sysmon atf-run t_swsensor System panicked: kernel diagnostic assertion TAILQ_EMPTY (node-tn_spec.tn_dir.tn_dir)failed: file /local/src/NetBSD/src/sys/fs/tmpfs/tmpfs_subr.c, line 248 Just to for the record: this was fixed in -current. -- Mindaugas
Re: Question about tmpfs
chris...@astron.com (Christos Zoulas) wrote: In article 52713dff.2000...@m00nbsd.net, Maxime Villard m...@m00nbsd.net wrote: Hi, I have a question regarding the function tmpfs_alloc_node() in fs/tmpfs/tmpfs_subr.c. When alloc'ing the area for symlinks, there's this code: l.171 nnode-tn_size = strlen(target); if (nnode-tn_size == 0) { nnode-tn_spec.tn_lnk.tn_link = NULL; break; } nnode-tn_spec.tn_lnk.tn_link = tmpfs_strname_alloc(tmp, nnode-tn_size); if (nnode-tn_spec.tn_lnk.tn_link == NULL) { tmpfs_node_put(tmp, nnode); return ENOSPC; } memcpy(nnode-tn_spec.tn_lnk.tn_link, target, nnode-tn_size); Only strlen(target) bytes are allocated for 'target', and only strlen(target) bytes are copied from 'target' to 'tn_link'. Why isn't the '\0' taken into account? Is this intentional? I don't think it is from a quick reading. The only reason it works, is because most of the time it rounds up. It is not a bug, but it is potentially error-prone. I adjusted the code: http://mail-index.netbsd.org/source-changes/2013/10/31/msg048829.html -- Mindaugas
Re: Clue needed - failure trying to build from old sources
Paul Goyette p...@whooppee.com wrote: ... # link npfctl/npfctl /build/test/tools/x86_64/amd64/bin/x86_64--netbsd-gcc\ --sysroot=/build/test/dest/amd64 -o npfctl \ npfctl.o npf_var.o npf_data.o npf_ncgen.o npf_build.o \ npf_extmod.o npf_disassemble.o npf_scan.o npf_parse.o \ -lnpf -lprop -lcrypto -lutil -ly \ -Wl,-dynamic-linker=/libexec/ld.elf_so \ -Wl,-rpath,/lib -L=/lib /lib/libcrypt.so.1: undefined reference to `__explicit_memset' Is it, for some reason, trying to use the host's installed copy of libcrypt rather than the one that was built earlier in $DESTDIR? (I notice that it uses -lcrypto in the Makefile, but the error message refers to libcrypt - without the 'o'). So, 1. Will adding -l crypt to npfctl's Makefile fix this? 2. If I do that, do I also need to add ${LIBCRYPT} to DPADD? 3. Any other additional changes needed? Seems like the same PR/47922, where joerg@ made a fix for ld(1). -- Mindaugas