Re: unkillable firefox
On 12/20/2016 15:29, Steve Kargl wrote:
> Anyone know how to kill firefox?
>
> last pid: 69652;  load averages:  0.49,  0.27,  0.24  up 1+02:40:06  13:16:02
> 126 processes: 1 running, 121 sleeping, 4 stopped
> CPU:  0.8% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> Mem: 2049M Active, 3739M Inact, 496M Laundry, 1365M Wired, 783M Buf, 239M Free
> Swap: 16G Total, 1772K Used, 16G Free
>
>   PID USERNAME  PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
> 63902 kargl      40    0  3157M  2302M STOP   1  10:50  0.00% firefox{firefox}
> 63902 kargl     -16    0  3157M  2302M STOP   2   5:46  0.00% firefox{Composit
> 16874 kargl      40    0   740M   330M STOP   1   0:07  0.00% firefox{firefox}
> 16874 kargl     -16    0   740M   330M STOP   1   0:00  0.00% firefox{Composit
>
> It seems that firefox is wedged in the thread firefox{Compositor},
> and slowly eating up memory. This is on an amd64 system at
> r310125 and latest firefox from ports. procstat suggests that its
> stuck in a vm sleep queue.
>
> % procstat -k 63902
>   PID    TID COMM     TDNAME      KSTACK
> 63902 100504 firefox  -           mi_switch thread_suspend_switch
>     thread_single exit1 sigexit postsig ast Xfast_syscall
> 63902 101494 firefox  Compositor  mi_switch sleepq_wait _sleep
>     vm_page_busy_sleep vm_page_sleep_if_busy vm_fault_hold vm_fault
>     trap_pfault trap calltrap
>

Do you have output of procstat -k for all threads? I'd guess one thread is busy dumping core.

Eric
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Kqueue races causing crashes
Oops, I don't think my attachment worked. This should do the trick: https://drive.google.com/open?id=0B8Lj3D-GnaCcS0taVVNlQktQRkk
Kqueue races causing crashes
Hi there,

There seems to be some racy code in kern_event.c which is causing me to run into some crashes. I’ve attached the test program used to generate these crashes (build it and run the “go” script). They were produced in a VM with 4 cores on 11 Alpha 3 (and originally 10.3). The crashes I’ve seen come in a few varieties:

1. “userret: returning with the following locks held”. This one is the easiest to hit (assuming witness is enabled).

userret: returning with the following locks held:
exclusive sleep mutex process lock (process lock) r = 0 (0xf80006956120) locked @ /usr/src/sys/kern/kern_event.c:2125
panic: witness_warn
cpuid = 2
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe39d8e0
vpanic() at vpanic+0x182/frame 0xfe39d960
kassert_panic() at kassert_panic+0x126/frame 0xfe39d9d0
witness_warn() at witness_warn+0x3c6/frame 0xfe39daa0
userret() at userret+0x9d/frame 0xfe39dae0
amd64_syscall() at amd64_syscall+0x406/frame 0xfe39dbf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe39dbf0
--- syscall (1, FreeBSD ELF64, sys_sys_exit), rip = 0x800b8a0ba, rsp = 0x7fffea98, rbp = 0x7fffeae0 ---
KDB: enter: panic
[ thread pid 64855 tid 100106 ]
Stopped at kdb_enter+0x3b: movq $0,kdb_why
db> show all locks
Process 64855 (watch) thread 0xf800066c3000 (100106)
exclusive sleep mutex process lock (process lock) r = 0 (0xf80006956120) locked @ /usr/src/sys/kern/kern_event.c:2125
Process 64855 (watch) thread 0xf8000696a500 (100244)
exclusive sleep mutex pmap (pmap) r = 0 (0xf800068c3138) locked @ /usr/src/sys/amd64/amd64/pmap.c:4067
exclusive sx vm map (user) (vm map (user)) r = 0 (0xf800068f6080) locked @ /usr/src/sys/vm/vm_map.c:3315
exclusive sx vm map (user) (vm map (user)) r = 0 (0xf800068c3080) locked @ /usr/src/sys/vm/vm_map.c:3311
db> ps
  pid  ppid  pgrp   uid  state  wmesg  wchan  cmd
64855   690   690     0  R+     (threaded)    watch
100106                   Run    CPU 2         main
100244                   Run    CPU 1         procmaker
100245                   Run    CPU 3         reaper

2. “Sleeping thread owns a non-sleepable lock”. This one first drew my attention by showing up in a real world application at work.

Sleeping thread (tid 100101, pid 76857) owns a non-sleepable lock
KDB: stack backtrace of thread 100101:
sched_switch() at sched_switch+0x2a5/frame 0xfe257690
mi_switch() at mi_switch+0xe1/frame 0xfe2576d0
sleepq_catch_signals() at sleepq_catch_signals+0x16c/frame 0xfe257730
sleepq_timedwait_sig() at sleepq_timedwait_sig+0xf/frame 0xfe257760
_sleep() at _sleep+0x234/frame 0xfe2577e0
kern_kevent_fp() at kern_kevent_fp+0x38a/frame 0xfe2579d0
kern_kevent() at kern_kevent+0x9f/frame 0xfe257a30
sys_kevent() at sys_kevent+0x12a/frame 0xfe257ae0
amd64_syscall() at amd64_syscall+0x2d4/frame 0xfe257bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe257bf0
--- syscall (363, FreeBSD ELF64, sys_kevent), rip = 0x800b6afea, rsp = 0x7fffea88, rbp = 0x7fffead0 ---
panic: sleeping thread
cpuid = 3
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe225590
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe225640
vpanic() at vpanic+0x126/frame 0xfe225680
panic() at panic+0x43/frame 0xfe2256e0
propagate_priority() at propagate_priority+0x166/frame 0xfe225710
turnstile_wait() at turnstile_wait+0x282/frame 0xfe225750
__mtx_lock_sleep() at __mtx_lock_sleep+0x26b/frame 0xfe2257d0
__mtx_lock_flags() at __mtx_lock_flags+0x5e/frame 0xfe2257f0
proc_to_reap() at proc_to_reap+0x46/frame 0xfe225840
kern_wait6() at kern_wait6+0x202/frame 0xfe2258f0
sys_wait4() at sys_wait4+0x72/frame 0xfe225ae0
amd64_syscall() at amd64_syscall+0x2d4/frame 0xfe225bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe225bf0
--- syscall (7, FreeBSD ELF64, sys_wait4), rip = 0x800b209ba, rsp = 0x7fffdfdfcf48, rbp = 0x7fffdfdfcf80 ---
KDB: enter: panic
[ thread pid 76857 tid 100225 ]
Stopped at kdb_enter+0x3e: movq $0,kdb_why
db> show allchains
chain 1:
 thread 100225 (pid 76857, reaper) blocked on lock 0xf800413105f0 (sleep mutex) "process lock"
 thread 100101 (pid 76857, main) inhibited

(3./4.)
There are a few others that I hit less frequently (“page fault while in kernel mode”, “Kernel page fault with the following non-sleepable locks held”). I don’t have a backtrace handy for these.

I believe they all have more or less the same cause. The crashes occur because we acquire a knlist lock via the KN_LIST_LOCK macro, but when we call KN_LIST_UNLOCK, the knote’s knlist reference (kn->kn_knlist) has been cleared by another thread. Thus we are unable to unlock the previously acquired lock and hold it until
Re: Appending to message buffer while in ddb
On 08/03/2015 03:21 PM, Marcel Moolenaar wrote:
>> On Aug 3, 2015, at 12:59 PM, Eric Badger eric_bad...@dell.com wrote:
>>
>> Hi there,
>>
>> Since r226435, output from kernel printf/log functions is not appended to the message buffer when in ddb. The commit message doesn't call this out specifically; instead it appears to have been to address double printing to the console while in ddb. I noticed this because a ddb script which previously resulted in some things ending up in a textdump's msgbuf.txt no longer does so. It may be that the answer is use db_printf in ddb, which is ok, but I thought I'd check anyway to see if the aforementioned change was indeed intentional, since I wasn't able to dig up any discussion about it.
>
> It’s a direct consequence.

But is it a necessary consequence? For example, would the below patch also be acceptable (it's perhaps not the cleanest way to do it, but gives the idea)? This way we'll print to the console (once) and, if TOLOG is also specified, append to the message buffer.

If this is not acceptable, then I think all ddb commands not using db_printf (such as 'show rtc') should be converted to doing so (this might be a good idea either way), since their output cannot currently be captured in textdumps.

Thanks,
Eric

diff --git sys/kern/subr_prf.c sys/kern/subr_prf.c
index 4f35838..4739331 100644
--- sys/kern/subr_prf.c
+++ sys/kern/subr_prf.c
@@ -463,19 +463,28 @@ putchar(int c, void *arg)
 	struct putchar_arg *ap = (struct putchar_arg*) arg;
 	struct tty *tp = ap->tty;
 	int flags = ap->flags;
+	int putbuf_done = 0;
 
 	/* Don't use the tty code after a panic or while in ddb. */
 	if (kdb_active) {
 		if (c != '\0')
 			cnputc(c);
-		return;
-	}
-
-	if ((flags & TOTTY) && tp != NULL && panicstr == NULL)
-		tty_putchar(tp, c);
+		/* Prevent double printing. */
+		ap->flags &= ~(TOCONS);
+		flags = ap->flags;
+	} else {
+		if ((panicstr == NULL) && (flags & TOTTY) && (tp != NULL))
+			tty_putchar(tp, c);
 
-	if ((flags & (TOCONS | TOLOG)) && c != '\0')
-		putbuf(c, ap);
+		if (flags & TOCONS) {
+			putbuf(c, ap);
+			putbuf_done = 1;
+		}
+	}
+	if ((flags & TOLOG) && (putbuf_done == 0)) {
+		if (c != '\0')
+			putbuf(c, ap);
+	}
 }
 
 /*
Appending to message buffer while in ddb
Hi there,

Since r226435, output from kernel printf/log functions is not appended to the message buffer when in ddb. The commit message doesn't call this out specifically; instead it appears to have been to address double printing to the console while in ddb. I noticed this because a ddb script which previously resulted in some things ending up in a textdump's msgbuf.txt no longer does so. It may be that the answer is use db_printf in ddb, which is ok, but I thought I'd check anyway to see if the aforementioned change was indeed intentional, since I wasn't able to dig up any discussion about it.

Thanks,
Eric
Re: PCI PF memory decode disable when sizing VF BARs
On 05/06/15 14:54, Ryan Stone wrote:
> On Wed, May 6, 2015 at 2:33 PM, John Baldwin j...@freebsd.org wrote:
>> Ah, sorry, I didn't know you did it in the caller already. Perhaps then something more like your previous patch, but using the test you added here (PCIR_IS_IOV) instead of your previous check against BAR values to decide when to frob the command register?
>
> I think that I prefer the current version, as it keeps the interface consistent. It's redundant now, but caller could evolve in the future. Given that this is just being run during initialization a couple of extra register accesses are irrelevant anyway.
>
> On Wed, May 6, 2015 at 2:58 PM, Eric Badger eric.bad...@compellent.com wrote:
>> Does the disabling of VF MSE in pci_iov_config actually protect anything else beyond what happens in pci_read_bar? I gave a read through which suggests no, but I might have missed something. Just thinking that the code would be a bit more hardy if it were done the same way for both VFs and other devices.
>>
>> Eric
>
> I think that it inherently has to be done differently. For real PCI devices the device might be important during the boot process (e.g. the video card) so we need to stay working. For VFs the devices don't even exist until I enable the VF Enable bit is set, so setting MSE before that point is irrelevant (it's allowed by the spec, but any access to a VF memory space with MSE set and VF Enable clear just gets an Unsupported Request response).

Sure; what I meant was to leave the disabling of VF MSE when sizing VF BARs in pci_read_bar (as in your second patch) for consistency and, if possible, not bother disabling VF MSE in pci_iov_config. But if it's not worth nixing the latter (or not possible), it's no big deal.

I've been testing out the second patch in my environment and it looks good. I might suggest something like the below (which I find more readable) as a cosmetic change:

@@ -2627,9 +2635,18 @@ pci_read_bar(device_t dev, int reg, pci_addr_t *mapp, pci_addr_t *testvalp,
 	 * determining the BAR's length since we will be placing it in
 	 * a weird state.
 	 */
-	cmd = pci_read_config(dev, PCIR_COMMAND, 2);
-	pci_write_config(dev, PCIR_COMMAND,
-	    cmd & ~(PCI_BAR_MEM(map) ? PCIM_CMD_MEMEN : PCIM_CMD_PORTEN), 2);
+#ifdef PCI_IOV
+	if (PCIR_IS_IOV(&dinfo->cfg, reg)) {
+		restore_reg = dinfo->cfg.iov->iov_pos + PCIR_SRIOV_CTL;
+		mask = PCIM_SRIOV_VF_MSE;
+	} else
+#endif
+	{
+		restore_reg = PCIR_COMMAND;
+		mask = PCI_BAR_MEM(map) ? PCIM_CMD_MEMEN : PCIM_CMD_PORTEN;
+	}
+	cmd = pci_read_config(dev, restore_reg, 2);
+	pci_write_config(dev, restore_reg, cmd & ~mask, 2);

Thanks,
Eric
Re: PCI PF memory decode disable when sizing VF BARs
On 05/06/15 13:33, John Baldwin wrote:
> On Wednesday, May 06, 2015 02:24:24 PM Ryan Stone wrote:
>> On Wed, May 6, 2015 at 11:45 AM, John Baldwin j...@freebsd.org wrote:
>>> There are some devices with BARs in non-standard locations. :( If there is a flag to just disable the VF bar decoding, then ideally we should just be doing that and leaving the global decoding flag alone while sizing the VF BAR.
>>
>> Disabling SR-IOV BAR decoding in this function is currently redundant, as it's already done in pci_iov.c, but I guess to keep the interface sane it makes sense to do it here too. Something like this then?
>
> Ah, sorry, I didn't know you did it in the caller already. Perhaps then something more like your previous patch, but using the test you added here (PCIR_IS_IOV) instead of your previous check against BAR values to decide when to frob the command register?

Does the disabling of VF MSE in pci_iov_config actually protect anything else beyond what happens in pci_read_bar? I gave a read through which suggests no, but I might have missed something. Just thinking that the code would be a bit more hardy if it were done the same way for both VFs and other devices.

Eric
PCI PF memory decode disable when sizing VF BARs
Hi Ryan and -current,

During IOV config, when setting up VF bars, several calls are made to 'pci_read_bar' (in sys/dev/pci/pci.c) in order to size VF BARs, which causes memory decoding to be turned off temporarily for the PF associated with those VFs. I'm finding that this can interfere with an already running PF. I've several thoughts about how this might be handled, but I'm not convinced I understand all of the consequences each of them entails, so any thoughts from others would be appreciated. Here are ideas I've considered:

1. Check the value of the 'reg' arg to 'pci_read_bar' and, if it is outside a standard BAR range, don't disable memory decoding. This is simple, but feels a little hackish and may have consequences I'm missing.

2. Pass some flag/context through such that pci_read_bar knows it is configuring VF BARs (we might instead disable VF MSE in this case, if it is enabled). It would be necessary to carry this flag/context through several function calls before reaching pci_read_bar, which might end up being ugly.

3. Rearrange the calls so that VF BARs are sized when the PF is not yet running, and that info saved until VFs are created. Probably it would be done when the PF BARs are sized for any device supporting IOV, even if that device never creates VFs.

Thanks,
Eric
Re: Early use of log() does not end up in kernel msg buffer
On 04/06/2015 04:11 PM, Poul-Henning Kamp wrote:
> In message 2033248.eu3rhs8...@ralph.baldwin.cx, John Baldwin writes:
>> I think phk@ broke this back in 70239. Before that the log() function did this:
>>
>> log()
>> {
>> 	/* log to the msg buffer */
>> 	kvprintf(fmt, msglogchar, ...);
>> 	if (!log_open) {
>> 		/* log to console */
>> 		kvprintf(fmt, putchar, ...);
>> 	}
>> }
>>
>> I think your patch is fine unless phk@ (cc'd) has a reason for not wanting to do this.
>
> The reason was systems not running syslog having slow serial consoles.

Correct me if I've misunderstood, but that doesn't seem to matter here; the proposed change adds logging to the message buffer but leaves logging to the console (when no syslog is listening) unchanged.

Eric
Early use of log() does not end up in kernel msg buffer
Using log(9) when no process is reading the log results in the message going only to the console (contrast with printf(9), which goes to the console and to the kernel message buffer in this case). I believe it is truer to the semantics of logging for messages to *always* go to the message buffer (where they can eventually be collected and in fact put into a logfile). I therefore propose the attached patch, which sends log(9) to the message buffer always, and to the console only if no one has yet opened the log. It may be more complete to log to the console only if the log level is greater than some (user defined) value, but this seems like that might be more than necessary for this case.

Thoughts?

Eric

diff --git share/man/man9/printf.9 share/man/man9/printf.9
index 84ac822..505ea9b 100644
--- share/man/man9/printf.9
+++ share/man/man9/printf.9
@@ -67,7 +67,8 @@ The
 .Fn log
 function sends the message to the kernel logging facility, using
 the log level as indicated by
-.Fa pri .
+.Fa pri ,
+and to the console if no process is yet reading the log.
 .Pp
 Each of these related functions use the
 .Fa fmt
diff --git sys/kern/subr_prf.c sys/kern/subr_prf.c
index 7e6fd09..6509522 100644
--- sys/kern/subr_prf.c
+++ sys/kern/subr_prf.c
@@ -295,7 +295,7 @@ log(int level, const char *fmt, ...)
 	va_list ap;
 
 	va_start(ap, fmt);
-	(void)_vprintf(level, log_open ? TOLOG : TOCONS, fmt, ap);
+	(void)_vprintf(level, log_open ? TOLOG : TOCONS | TOLOG, fmt, ap);
 	va_end(ap);
 
 	msgbuftrigger = 1;
Re: Filepaths in VM map for tmpfs files
On 02/05/2015 07:25 AM, John Baldwin wrote:
> On Thursday, February 05, 2015 10:37:55 AM Konstantin Belousov wrote:
>> On Wed, Feb 04, 2015 at 10:15:04AM -0500, John Baldwin wrote:
>>> On Tuesday, February 03, 2015 10:33:36 PM Konstantin Belousov wrote:
>>>> On Mon, Feb 02, 2015 at 09:50:22PM -0600, Eric Badger wrote:
>>>>> On 02/02/2015 03:30 AM, Konstantin Belousov wrote:
>>>>>> On Sun, Feb 01, 2015 at 08:38:29PM -0600, Eric Badger wrote:
>>>>>>> On 01/31/2015 09:36 AM, Konstantin Belousov wrote:
>>>>>>>> First, shouldn't the kve_type changed to KVME_TYPE_VNODE as well ?
>>>>>>> My thinking is no, because KVME_TYPE_SWAP is in fact the correct type; I'd opine that it is better to be transparent than make it look like there is an OBJT_VNODE object there. It may be that some programs would be confused by VNODE info returned on a SWAP type mapping, though I know that dtrace handles it OK.
>>>>>> kve_vn_* and kve_path fields are defined only for KVME_TYPE_VNODE kve_type. So this is in fact a bug in whatever used the API to access kve_path for KVE_TYPE_SWAP.
>>>>> Hmm, is that documented anywhere? I think it's fair to assume that kve_vn* applies only to the VNODE type, but I know there are several in-tree users that reference kve_path regardless of type (ostensibly relying on the default of an empty string). Maybe one could determine the validity of the kve_vn* fields by inspecting the kve_vn_type (not sure of all the consequences of that)? Or change it to KVME_TYPE_VNODE and deal with the below problem...
>>>> There is no useful documentation for the kern.proc. sysctls. My word (and statements from other involved developers) could be considered as close to the truth as it can be. Somebody taking the efforts to document the stuff would make very valuable contribution.
>>> I think that kve_path should be valid for all types (e.g. shm_open() is not a vnode but has a pathname, and that should be fixed to display if possible). In the equivalent for files (kinfo_file), the pathname is type-independent and always valid.
>> Well, this means that it should be valid for vnodes and shm. My point is that kvme_vn_path should be used only after the check for type. We can and do set it to nul string, but using the path unconditionally is a bug in the user code.
> The problem is that shm's can have different types (DEFAULT vs SWAP vs PHYS). :) For kinfo_file, tools like fstat always print kf_path regardless of type. I do think it would be more consistent if the path in a kvme worked the same way. Then you don't have to update all the tools each time a type starts populating the path.

Re: the kve_vn* fields, isn't setting kve_status = KF_ATTR_VALID the way to mark them as valid (irrespective of kve_type)? As for path name, I'd agree that there's no inherent need to restrict it by type. The field is somewhat self-validating (if something other than an empty string was returned in the path name field, this field is obviously valid).

>>> That said, I think tmpfs nodes should be exposed as files. It is an implementation detail of tmpfs that they are swap-backed, but from a user's perspective these are files, and if you want to expose other vnode-specific fields than just the path, KVME_TYPE_VNODE would be more correct.
>> I agree, but doing it is not easy, since there might be no vnode to get the required information from. We do know that this swap object is for tmpfs node, but currently we only store pointer to object in the node, not pointer to node from the object. When the vnode exists, pointer to vnode is stored in the object. To fix the issue, we should store pointer to node. Code was not done this way, because VM code which handles special-case for OBJT_TMPFS, would need to know tmpfs internals. Right now, code knows about vnodes anyway, so object->vnode does not bring tmpfs internals into vm.
> I'm more arguing in support of your original proposal. Doing a best effort if the vnode exists would certainly be an improvement over what we have now.
I'll make one more brief case for returning tmpfs vm objects as KVME_TYPE_SWAP. Isn't the purpose of this sysctl for debugging, or to help a user understand what is going on internally? I can imagine scenarios where knowing that a mapped file is swap backed is relevant information, and returning it as KVME_TYPE_VNODE would hide this information. I'd put forth a vote for return vnode info on a best-effort basis, at least for now.

Eric
Re: Filepaths in VM map for tmpfs files
On 02/03/2015 02:33 PM, Konstantin Belousov wrote:
> On Mon, Feb 02, 2015 at 09:50:22PM -0600, Eric Badger wrote:
>> On 02/02/2015 03:30 AM, Konstantin Belousov wrote:
>>> On Sun, Feb 01, 2015 at 08:38:29PM -0600, Eric Badger wrote:
>>>> On 01/31/2015 09:36 AM, Konstantin Belousov wrote:
>>>>> First, shouldn't the kve_type changed to KVME_TYPE_VNODE as well ?
>>>> My thinking is no, because KVME_TYPE_SWAP is in fact the correct type; I'd opine that it is better to be transparent than make it look like there is an OBJT_VNODE object there. It may be that some programs would be confused by VNODE info returned on a SWAP type mapping, though I know that dtrace handles it OK.
>>> kve_vn_* and kve_path fields are defined only for KVME_TYPE_VNODE kve_type. So this is in fact a bug in whatever used the API to access kve_path for KVE_TYPE_SWAP.
>> Hmm, is that documented anywhere? I think it's fair to assume that kve_vn* applies only to the VNODE type, but I know there are several in-tree users that reference kve_path regardless of type (ostensibly relying on the default of an empty string). Maybe one could determine the validity of the kve_vn* fields by inspecting the kve_vn_type (not sure of all the consequences of that)? Or change it to KVME_TYPE_VNODE and deal with the below problem...
> There is no useful documentation for the kern.proc. sysctls. My word (and statements from other involved developers) could be considered as close to the truth as it can be. Somebody taking the efforts to document the stuff would make very valuable contribution.

Ok. If I can get a solution figured, I'll plan to include some documentation updates. This problem is somewhat important to me, so I'm going to do some additional digging and see if I can't come up with a solution that takes into account your notes.

Thanks for the help,
Eric
Re: Filepaths in VM map for tmpfs files
On 02/02/2015 03:30 AM, Konstantin Belousov wrote:
> On Sun, Feb 01, 2015 at 08:38:29PM -0600, Eric Badger wrote:
>> On 01/31/2015 09:36 AM, Konstantin Belousov wrote:
>>> First, shouldn't the kve_type changed to KVME_TYPE_VNODE as well ?
>> My thinking is no, because KVME_TYPE_SWAP is in fact the correct type; I'd opine that it is better to be transparent than make it look like there is an OBJT_VNODE object there. It may be that some programs would be confused by VNODE info returned on a SWAP type mapping, though I know that dtrace handles it OK.
> kve_vn_* and kve_path fields are defined only for KVME_TYPE_VNODE kve_type. So this is in fact a bug in whatever used the API to access kve_path for KVE_TYPE_SWAP.

Hmm, is that documented anywhere? I think it's fair to assume that kve_vn* applies only to the VNODE type, but I know there are several in-tree users that reference kve_path regardless of type (ostensibly relying on the default of an empty string). Maybe one could determine the validity of the kve_vn* fields by inspecting the kve_vn_type (not sure of all the consequences of that)? Or change it to KVME_TYPE_VNODE and deal with the below problem...

>>> Second, note that it is possible that the vnode is recycled, so OBJ_TMPFS flag is cleared for tmpfs swap object. The OBJ_TMPFS_NODE flag is still set then. I am not sure what to do in this case, should the type changed to KVME_TYPE_VNODE still, but kve_vn_* fields left invalid ?
>> I think if we changed to KVME_TYPE_VNODE in some cases, it should be done in all cases, even if the vnode has been recycled (but leave vp == NULL in that case). Though if it is left as KVME_TYPE_SWAP, then that concern goes away on its own.
> Concern is not vp == NULL, but the fact that kve_vn* cannot be filled, there is simply no (easy) way to fetch this information.

Right; by leaving vp == NULL, I meant don't populate the kve_vn* fields, which admittedly isn't a great solution. But as you say, the information is not really available once the vnode has been reclaimed. There is some inherent difficulty in the duality of the vm object here; it would be nice if it could be treated uniformly with other vnodes, but I think I lack the expertise to approach a more involved solution that would achieve this.

Incidentally Konstantin, thanks for the feedback and advice.

Eric
Re: Filepaths in VM map for tmpfs files
On 01/31/2015 09:36 AM, Konstantin Belousov wrote:
> First, shouldn't the kve_type changed to KVME_TYPE_VNODE as well ?

My thinking is no, because KVME_TYPE_SWAP is in fact the correct type; I'd opine that it is better to be transparent than make it look like there is an OBJT_VNODE object there. It may be that some programs would be confused by VNODE info returned on a SWAP type mapping, though I know that dtrace handles it OK.

> Second, note that it is possible that the vnode is recycled, so OBJ_TMPFS flag is cleared for tmpfs swap object. The OBJ_TMPFS_NODE flag is still set then. I am not sure what to do in this case, should the type changed to KVME_TYPE_VNODE still, but kve_vn_* fields left invalid ?

I think if we changed to KVME_TYPE_VNODE in some cases, it should be done in all cases, even if the vnode has been recycled (but leave vp == NULL in that case). Though if it is left as KVME_TYPE_SWAP, then that concern goes away on its own.

Eric
Filepaths in VM map for tmpfs files
In FreeBSD 9, examining the VM map of a process (with e.g. 'procstat -v') with a tmpfs file mapped showed a VNODE type and displayed the file path. In 10.0 up to CURRENT (I believe this started at r250030), instead SWAP is shown without a filepath. This has some unfortunate consequences; I discovered this problem when trying to use dtrace's pid provider, which fails to find symbols for executables running from tmpfs.

I've attached a patch which will repair procstat/dtrace. There are a few other places such a patch would be needed. I'm willing to put together such a patch, but would like to first hear some feedback that this seems like a reasonable approach, or if there's anything I've missed.

Thoughts?

Eric

Index: sys/kern/kern_proc.c
===================================================================
--- sys/kern/kern_proc.c	(revision 277957)
+++ sys/kern/kern_proc.c	(working copy)
@@ -2337,6 +2337,11 @@
 			break;
 		case OBJT_SWAP:
 			kve->kve_type = KVME_TYPE_SWAP;
+			if ((lobj->flags & OBJ_TMPFS) != 0)
+			{
+				vp = lobj->un_pager.swp.swp_tmpfs;
+				vref(vp);
+			}
 			break;
 		case OBJT_DEVICE:
 			kve->kve_type = KVME_TYPE_DEVICE;