Re: panic in deadlkres() on r267110
On Fri, Jun 6, 2014 at 5:06 PM, Glen Barber g...@freebsd.org wrote:
On Fri, Jun 06, 2014 at 07:23:49AM -0700, Sean Bruno wrote:
On Fri, 2014-06-06 at 10:12 -0400, Glen Barber wrote:

Two machines in the cluster panicked last night with the same backtrace. It is not yet clear exactly what was happening on the systems, but both are port building machines using ports-mgmt/tinderbox. Any ideas or information on how to further debug this would be appreciated.

These machines were happily running r266621 prior to this update yesterday. So, that gives us a bisection point.

Some more debug information. Thank you to Attilio for information on what data to get.

Script started on Fri Jun 6 15:00:53 2014
command: /bin/sh
# kgdb ./kernel.debug /var/crash/vmcore.0
[...]
#0 doadump (textdump=-946873840) at pcpu.h:219
219 __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) p/x allproc_lock.sx_lock
$1 = 0xf813ae7f4924
Current language: auto; currently minimal
(kgdb) p ((struct thread *)0xf813ae7f4924)

The actual thread address is: 0xf813ae7f4920. Then look at the GDB threads list and match with the tid.

Attilio
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
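For readers following the kgdb session, here is a minimal sketch (not from the thread) of the arithmetic behind the corrected address: the sx lock word carries the owning thread pointer with flag bits in its low-order bits, so the owner is recovered by masking them off. The 0xF mask and the helper names are assumptions for illustration; check SX_LOCK_FLAGMASK in sys/sx.h for the revision being debugged.

```python
# Assumption: the low four bits of sx_lock are flag bits (see SX_LOCK_FLAGMASK
# in sys/sx.h for the revision being debugged); the rest is the owner pointer.
ASSUMED_FLAG_MASK = 0xF


def sx_owner(lock_word):
    """Return the lock word with the assumed flag bits cleared."""
    return lock_word & ~ASSUMED_FLAG_MASK


def sx_flag_bits(lock_word):
    """Return only the assumed flag bits of the lock word."""
    return lock_word & ASSUMED_FLAG_MASK


lock_word = 0xf813ae7f4924            # value printed by kgdb in the session above
print(hex(sx_owner(lock_word)))       # 0xf813ae7f4920
print(hex(sx_flag_bits(lock_word)))   # 0x4
```

The masked value is what should be cast to struct thread * in kgdb and then matched against the tid in the thread list.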
Re: processes stuck in vmo_de state
On Thu, Mar 13, 2014 at 10:18 AM, Xin Li delp...@delphij.net wrote:

Hi, It looks like there is a regression (or a regression that gets exposed by some new feature) that is related to time-keeping or timecounter, although I'm not yet familiar enough with the related code to tell whether my conclusion is right or not.

The problem I observed is that when the system boots up, it sometimes hangs, and pressing ^T on the console tells me that sleep(1) is running with 0 seconds out of 1 second, but the 'real' part of the output is smaller than 1 or sometimes negative. For some reason the console may stop giving any output, but trapping into the debugger would sometimes unblock it.

When sh(1) is stuck in 'vmo_de' state, it never recovers from that and a hard reset is necessary.

If sleeps are not being serviced, 'vmo_de' deadlocks make sense because it is a sleep(1) condition. What is softclock doing at the time the deadlock happens?

Attilio
-- Peace can only be achieved by understanding - A. Einstein
Re: Fwd: Problem with curret in vmware
On Tue, Jul 30, 2013 at 5:55 PM, John Baldwin j...@freebsd.org wrote:
On Tuesday, July 30, 2013 5:25:06 am Alexander Yerenkow wrote:

Hello all. I have panics in vmware with vmwaretools installed (they are the suspected culprit). It seems that memory ballooning (or using more memory in all VMs than there is in the host) produces some kind of weird behavior in FreeBSD. These VMs have not been shut down yet; is there something I can do to help investigate this?

Panic screens:
http://gits.kiev.ua/FreeBSD/panic1.png
http://gits.kiev.ua/FreeBSD/panic2.png

Looks like their code needs to be updated to work with locking changes in HEAD. Attilio is probably the best person to ask.

Exactly which ports did you install?

Attilio
Re: Fwd: Problem with curret in vmware
On Fri, Aug 2, 2013 at 8:27 PM, Alexander Yerenkow yeren...@gmail.com wrote:

Those were their official tools, which came from the ISO that gets mounted by the "install/upgrade client tools" command.

There is not much I can do then, unless they update their source code. Or do you have any pointer?

Attilio
Re: Kernel hangs on reboot on system with 05/2013~06/2013 CURRENT sources
On Tue, Jun 25, 2013 at 11:27 PM, Florian Smeets f...@smeets.im wrote:
On 06/25/2013 22:45, Garrett Cooper wrote:

Long story short is that I've run into an issue on several VM images and real machines where UFS on mpt fails to reboot because it hangs in the kernel. I don't have any specific details, other than it occurs regularly with cam/mpt on VMware boxes running builds; however I've also seen this occur with a Dell box that has an mpt SAS controller with 2 zpools and gobs of RAM. Does anyone know of any issues in this area [recently]? This set of issues appears to have started cropping up after 03/2013, because I was running reliable builds off those sources. Thanks! -Garrett

Yes, I saw the same thing today when rebooting a box running r251905: Tue Jun 18 10:12:42 CEST 2013 with ahci on a zfs only system. I update this box about once a week; the previous kernel was from June 11 and that still rebooted successfully. As the kernel from June 18 is now kernel.old I don't know the SVN rev for the June 11 kernel, but it looks like it was broken between June 11 and 18.

Can you break into KDB once the loop happens?

Attilio
Re: Incorrect comparison of ticks in deadlkres
On Wed, May 29, 2013 at 1:18 AM, Ryan Stone ryst...@gmail.com wrote:
On Tue, May 28, 2013 at 5:29 PM, Ian Lepore i...@freebsd.org wrote:

ticks is defined as a signed integer but conceptually it is unsigned -- it increments from 0 to UINT_MAX (not INT_MAX) then rolls over. If td->td_blktick is captured while ticks = UINT_MAX and later ticks has rolled over and counted back up to 15, then ticks - td->td_blktick gives an elapsed time of 16, as it should be. Whether exploiting this property of signed overflow is elegant or ugly is in the eye of the beholder. :)

If the intent of the ticks < td->td_blktick test is to avoid the deadlock check until after enough time has passed, then I guess it should probably be something more like (ticks - td->blktick) > SOME_THRESHOLD so that it also uses the signed overflow trick. -- Ian

It already does this later on to actually detect the deadlock. The test is reversed, but it was intended to bail out and not calculate the elapsed time at all if ticks had overflowed after td_blktick was captured; as you say, this is unnecessary.

I'm not sure whether there is a direct comparison between the 2 values (ticks, td_slpticks) somewhere, but if there is not and the values are only used in additions/subtractions relative to each other, then it is good to be removed.

Attilio
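To make the difference between the two tests concrete, here is a rough Python model (not code from the tree). It emulates a 32-bit signed C int with masking; old_guard stands for the direct comparison being discussed and threshold_guard for the kind of check Ian suggests. The function names and the sample threshold are assumptions for illustration only.

```python
def to_int32(value):
    """Wrap an integer to a signed 32-bit value, modelling a C int tick counter."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def elapsed(ticks, blktick):
    """Wrapped 32-bit difference, modelling tticks = ticks - td->td_blktick."""
    return to_int32(ticks - blktick)


def old_guard(ticks, blktick):
    """Direct comparison: true only when ticks is numerically below blktick."""
    return ticks < blktick


def threshold_guard(ticks, blktick, threshold):
    """Hypothetical replacement: compare the wrapped difference to a threshold."""
    return elapsed(ticks, blktick) > threshold


# No rollover: blocked at tick 1000, checked 60000 ticks later.
print(old_guard(61000, 1000))               # False: the deadlock check is skipped
print(threshold_guard(61000, 1000, 30000))  # True

# Ian's example: captured at the UINT_MAX bit pattern, counter now at 15.
print(elapsed(15, to_int32(0xFFFFFFFF)))    # 16
```

In this model the direct comparison only becomes true once the signed counter has wrapped, which matches the observation that the check fires only after a rollover.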
Re: Incorrect comparison of ticks in deadlkres
On Sat, May 25, 2013 at 11:55 PM, Ryan Stone ryst...@gmail.com wrote:

Currently deadlkres performs the following comparison when trying to check for threads that have been blocked on a mutex or sleeping on an sx lock:

if (TD_ON_LOCK(td) && ticks < td->td_blktick) { /* check for deadlock...*/

Yes, the check does indeed look inverted.

The test against ticks is incorrect. It results in deadlkres only signaling a deadlock after ticks has rolled over; at 1000 hz this will take up to 49 days. From looking at the history of the code this test appears to be an attempt to deal with ticks rollover. However this is unnecessary; later on the code calculates the amount of time that has passed with:

tticks = ticks - td->td_blktick;

ticks was designed to exploit integer underflow in the case of rollover to guarantee that subtraction produces correct results in all cases (other than a double rollover, of course). I am going to remove the two incorrect tests unless somebody can point out an overflow/underflow case that I haven't considered.

I'm not sure I follow what you are saying. Assume that when thread td goes to sleep, ticks is very close to the 32-bit limit. Then thread td goes to sleep and td->td_blktick is set to a value very close to the 32-bit limit. After a while the deadlkres thread kicks in, and in the meantime the ticks counter has overflowed, rolling back to a very low value. How are you supposed to compute a valid value from this situation? I think that you still need to guard against overflow of ticks for such cases.

Additionally, if you really want to improve deadlkres, you should bring into the logic a fix for adaptive spinning. Think about the schematic LOR case. Because of adaptive spinning, what will happen is that 2 threads deadlocking on 2 different locks will just end up spinning. I think you should import some sort of check just like spin mutexes do, but with a much higher time threshold.

Attilio
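As a worked example for the scenario described above (a thread blocking when ticks is very close to the 32-bit limit), the following Python model walks through the subtraction. It is only a sketch: C int wraparound is emulated with masking, and the helper names are made up for illustration.

```python
def to_int32(value):
    """Wrap an integer to a signed 32-bit value, modelling a C int tick counter."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def advance(ticks, delta):
    """Model the counter moving forward by delta ticks, wrapping as needed."""
    return to_int32(ticks + delta)


def elapsed(now, then):
    """Wrapped 32-bit difference, modelling tticks = ticks - td->td_blktick."""
    return to_int32(now - then)


INT_MAX = 2**31 - 1

blktick = INT_MAX - 5            # thread blocks very close to the 32-bit limit
now = advance(blktick, 1000)     # 1000 ticks later the counter has rolled over

print(now < blktick)             # True: the raw value went "backwards"
print(elapsed(now, blktick))     # 1000: the wrapped difference is still correct

# The documented limitation: a full 2**32 period is invisible to the subtraction.
late = advance(blktick, 2**32 + 7)
print(elapsed(late, blktick))    # 7
```

So a single rollover between capture and check does not disturb the computed interval in this model; only the double rollover case mentioned in the original message does.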
Re: sysutils/fusefs-kmod problem in CURRENT
On Wed, Mar 27, 2013 at 4:00 AM, Marcelo/Porks marceloro...@gmail.com wrote:
On Mar 22, 2013 1:02 AM, Attilio Rao atti...@freebsd.org wrote:
On Fri, Mar 22, 2013 at 2:02 AM, Marcelo/Porks marceloro...@gmail.com wrote:

Hi, I'm facing an error compiling sysutils/fusefs-kmod. I'm using CURRENT from today (2013-03-21). Can someone using CURRENT confirm whether this also happens on your system?

CURRENT should not allow you to build fusefs-kmod at all. Use option FUSEFS from your kernel config.

Sorry, maybe I didn't understand what you said. I tried to use in my kernel conf:
options FUSEFS
option FUSEFS
options fusefs
option fusefs

Sorry, my fault, I misspelled it. It is:
options FUSE

Attilio
Re: sysutils/fusefs-kmod problem in CURRENT
On Fri, Mar 22, 2013 at 2:02 AM, Marcelo/Porks marceloro...@gmail.com wrote:

Hi, I'm facing an error compiling sysutils/fusefs-kmod. I'm using CURRENT from today (2013-03-21). Can someone using CURRENT confirm whether this also happens on your system?

CURRENT should not allow you to build fusefs-kmod at all. Use option FUSEFS from your kernel config.

Attilio
Re: r244036 kernel hangs under load.
On Tue, Dec 11, 2012 at 9:55 PM, Rick Macklem rmack...@uoguelph.ca wrote:
Konstantin Belousov wrote:
On Mon, Dec 10, 2012 at 07:11:59PM -0500, Rick Macklem wrote:
Konstantin Belousov wrote:
On Mon, Dec 10, 2012 at 01:38:21PM -0500, Rick Macklem wrote:
Adrian Chadd wrote:

.. what was the previous kernel version?

Hopefully Tim has it narrowed down more, but I don't see the hangs on a Sept. 7 kernel from head and I do see them on a Dec. 3 kernel from head. (Don't know the exact rNN.) It seems to predate my commit (r244008), which was my first concern.

I use old single core i386 hardware and can fairly reliably reproduce it by doing a kernel build and a svn checkout concurrently. No NFS activity. These are running on a local disk (UFS/FFS). (The kernel I reproduce it on is built via GENERIC for i386. If you want me to start a binary search for which rNN, I can do that, but it will take a while. :-)

I can get out into DDB, but I'll admit I don't know enough about it to know where to look ;-) Here's some lines from db> ps, in case they give someone useful information. (I can leave this box sitting in DB for the rest of to-day, in case someone can suggest what I should look for on it.)

Just snippets...
Ss pause adjkerntz
DL sdflush [sofdepflush]
RL [syncer]
DL vlruwt [vnlru]
DL psleep [bufdaemon]
RL [pagezero]
DL psleep [vmdaemon]
DL psleep [pagedaemon]
DL ccb_scan [xpt_thrd]
DL waiting_ [sctp_iterator]
DL ctl_work [ctl_thrd]
DL cooling [acpi_cooling0]
DL tzpoll [acpi_thermal]
DL (threaded) [usb]
...
DL - [yarrow]
DL (threaded) [geom]
D - [g_down]
D - [g_up]
D - [g_event]
RL (threaded) [intr]
I [irq15: ata1]
...
Run CPU0 [swi6: Giant taskq] -- does this one indicate the CPU is actually running this? (after a db> cont, wait a while, ctrl-alt-esc, db> ps it is still the same)
I [swi4: clock]
I [swi1: netisr 0]
I [swi3: vm]
RL [idle: cpu0]
SLs wait [init]
DL audit_wo [audit]
DLs (threaded) [kernel]
D - [deadlkres]
...
D sched [swapper]

I have no idea if this ps output helps, unless it indicates that it is looping on the Giant taskq?

Might be. You could do 'bt pid' for the process to see where it loops. Another good set of hints is at http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

Kostik, you must be clairvoyant ;-) When I did show alllocks, I found that the syncer process held
- exclusive sleep mutex mount mtx locked @ kern/vfs_subr.c:4720
- exclusive lockmgr syncer locked @ kern/vfs_subr.c:1780

The trace for this process goes like:
spinlock_exit
mtx_unlock_spin_flags
kern_yield
_mnt_vnode_next_active
vnode_next_active
vfs_msync()

So, it seems like your r244095 commit might have fixed this? (I'm not good at this stuff, but from your description, it looks like it did the kern_yield() with the mutex held and maybe got into trouble trying to acquire Giant?) Anyhow, I'm going to test a kernel with r244095 in it and see if I can still reproduce the hang. (There wasn't much else in the show alllocks, except a process that held the exclusive vnode interlock mutex plus a ufs vnode lock, but it's just doing a witness_unlock.)

There must be a thread blocked on the mount interlock for the loop in mnt_vnode_next_active to cause a livelock.

Yes. I am getting hangs with the -current kernel and they seem easier for me to reproduce.

Can you report the svn rev number the kernel is built from?

Attilio
Re: LK_SHARED/LK_DOWNGRADE adjustments to lock.9 manual page
On Thu, Nov 29, 2012 at 12:05 PM, Andriy Gapon a...@freebsd.org wrote:
on 16/11/2012 16:42 Andriy Gapon said the following:
on 15/11/2012 23:44 Attilio Rao said the following:

Do you think you can test this patch?: http://www.freebsd.org/~attilio/lockmgr_forcerec.patch

I will use this patch in my tree, but I think that it is effectively already quite well tested by using INVARIANTS+WITNESS.

I've been using this patch in both debug and non-debug environments and I have not run into any issues. Please commit when you get a chance. Thank you.

Committed as r243900, please proceed with manpage cleanup.

Thanks,
Attilio
Re: panic: vm_object_madvise: page 0xfffffe0413c58630 is fictitious
On Tue, Nov 27, 2012 at 11:26 AM, Andre Oppermann an...@freebsd.org wrote:

FreeBSD bbb.ccc 10.0-CURRENT FreeBSD 10.0-CURRENT #0: Fri Nov 23 17:00:40 CET 2012 a...@bbb.ccc:/usr/obj/usr/src/head/sys/GENERIC amd64

#0 doadump (textdump=-2014022336) at pcpu.h:229
#1 0x8033e2d2 in db_fncall (dummy1=<value optimized out>, dummy2=<value optimized out>, dummy3=<value optimized out>, dummy4=<value optimized out>) at /usr/src/head/sys/ddb/db_command.c:578
#2 0x8033e074 in db_command (last_cmdp=<value optimized out>, cmd_table=<value optimized out>, dopager=1) at /usr/src/head/sys/ddb/db_command.c:449
#3 0x8033dd62 in db_command_loop () at /usr/src/head/sys/ddb/db_command.c:502
#4 0x80340690 in db_trap (type=<value optimized out>, code=0) at /usr/src/head/sys/ddb/db_main.c:231
#5 0x808b375e in kdb_trap (type=3, code=0, tf=<value optimized out>) at /usr/src/head/sys/kern/subr_kdb.c:654
#6 0x80bfc71a in trap (frame=0xff8487f478a0) at /usr/src/head/sys/amd64/amd64/trap.c:579
#7 0x80be65b2 in calltrap () at /tmp/exception-3nQ6Cf.s:179
#8 0x808b2f5e in kdb_enter (why=0x80e5e23b "panic", msg=<value optimized out>) at cpufunc.h:63
#9 0x8088086f in panic (fmt=<value optimized out>) at /usr/src/head/sys/kern/kern_shutdown.c:628
#10 0x80adea4a in vm_object_madvise (object=<value optimized out>, pindex=<value optimized out>, end=8952, advise=<value optimized out>) at /usr/src/head/sys/vm/vm_object.c:1101

Can you do: frame 10 and then info locals and finally p *m?

Thanks,
Attilio
Re: Spurious witness warning when destroying spin mtx
On Sat, Nov 24, 2012 at 3:01 PM, Attilio Rao atti...@freebsd.org wrote:
On Sat, Nov 24, 2012 at 3:08 AM, Ryan Stone ryst...@gmail.com wrote:

Today I saw a spurious witness warning for acquiring duplicate lock of same type. The root cause is that when running mtx_destroy on a spinlock that is held by the current thread, mtx_destroy calls spinlock_exit() before calling WITNESS_UNLOCK, which opens up a window in which the CPU can be interrupted and attempt to acquire another spinlock of the same type as the one being destroyed. This patch should fix it:

I seriously wonder why right now we don't assume the lock is unheld. There are likely historical reasons for that, but I would like to know which ones those are and eventually fix them. AFAIK, all the other locking primitives assume the lock is already unheld when destroying and I think it would be good to have that for mutexes as well. Can you please show which lock triggers the panic you saw?

Ryan, however, I'm sure that such a change would introduce a major shift in our KPI/POLA and would eventually be a bigger piece of work. Your patch is certainly right and I think you should commit it for the time being. For the long term, maybe you would like to work on such a patch instead.

Attilio
Re: Spurious witness warning when destroying spin mtx
On Sat, Nov 24, 2012 at 3:08 AM, Ryan Stone ryst...@gmail.com wrote:

Today I saw a spurious witness warning for acquiring duplicate lock of same type. The root cause is that when running mtx_destroy on a spinlock that is held by the current thread, mtx_destroy calls spinlock_exit() before calling WITNESS_UNLOCK, which opens up a window in which the CPU can be interrupted and attempt to acquire another spinlock of the same type as the one being destroyed. This patch should fix it:

I seriously wonder why right now we don't assume the lock is unheld. There are likely historical reasons for that, but I would like to know which ones those are and eventually fix them. AFAIK, all the other locking primitives assume the lock is already unheld when destroying and I think it would be good to have that for mutexes as well. Can you please show which lock triggers the panic you saw?

Thanks,
Attilio
Re: Spurious witness warning when destroying spin mtx
On Sat, Nov 24, 2012 at 3:46 PM, Ryan Stone ryst...@gmail.com wrote:
On Sat, Nov 24, 2012 at 10:01 AM, Attilio Rao atti...@freebsd.org wrote:

I seriously wonder why right now we don't assume the lock is unheld. There are likely historical reasons for that, but I would like to know which ones those are and eventually fix them. AFAIK, all the other locking primitives assume the lock is already unheld when destroying and I think it would be good to have that for mutexes as well. Can you please show which lock triggers the panic you saw? Thanks, Attilio

It was taskqueue_free:

taskqueue_free() must not be called in places where there are still races, so the lock is not really meaningful and should not be acquired.

Attilio
Re: Spurious witness warning when destroying spin mtx
On Sat, Nov 24, 2012 at 3:51 PM, Attilio Rao atti...@freebsd.org wrote:
On Sat, Nov 24, 2012 at 3:46 PM, Ryan Stone ryst...@gmail.com wrote:
On Sat, Nov 24, 2012 at 10:01 AM, Attilio Rao atti...@freebsd.org wrote:

I seriously wonder why right now we don't assume the lock is unheld. There are likely historical reasons for that, but I would like to know which ones those are and eventually fix them. AFAIK, all the other locking primitives assume the lock is already unheld when destroying and I think it would be good to have that for mutexes as well. Can you please show which lock triggers the panic you saw? Thanks, Attilio

It was taskqueue_free:

taskqueue_free() must not be called in places where there are still races, so the lock is not really meaningful and should not be acquired.

Hmm, I meant to say that after taskqueue_terminate() returns there must not be any races.

Attilio
Re: LK_SHARED/LK_DOWNGRADE adjustments to lock.9 manual page
On 11/15/12, Andriy Gapon a...@freebsd.org wrote:

To people knowing the code, do the following documentation changes look correct?

The latter chunk is not correct. It will panic only if assertions are on. However, I was thinking that it would be a good idea to patch lockmgr to panic in non-debugging kernels as well.

Attilio
Re: LK_SHARED/LK_DOWNGRADE adjustments to lock.9 manual page
On Thu, Nov 15, 2012 at 8:38 PM, Andriy Gapon a...@freebsd.org wrote:
on 15/11/2012 20:46 Attilio Rao said the following:
On 11/15/12, Andriy Gapon a...@freebsd.org wrote:

To people knowing the code, do the following documentation changes look correct?

The latter chunk is not correct. It will panic only if assertions are on.

But the current content is not correct either?

Indeed, the current content is crappy.

However, I was thinking that it would be a good idea to patch lockmgr to panic in non-debugging kernels as well.

It would make sense indeed, IMO.

Do you think you can test this patch?: http://www.freebsd.org/~attilio/lockmgr_forcerec.patch

I think the LK_NOSHARE case is still fine with just asserts. Once this patch goes in, you are free to commit your documentation one. Thanks for fixing the doc.

Attilio
Re: MPSAFE VFS -- update
On Mon, Oct 22, 2012 at 4:50 PM, C. P. Ghost cpgh...@cordula.ws wrote:
On Thu, Oct 18, 2012 at 7:51 PM, Attilio Rao atti...@freebsd.org wrote:

Following the plan reported here: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS We are now at the state where all non-MPSAFE filesystems are disconnected from the tree.

Sad to see PortalFS go. You've served us well here. :-(

So do you think you will be able to test patches if someone fixes it? I've double-checked and unfortunately there is no FUSE module for portalfs.

Attilio
Re: mounting ntfs partition
On Sun, Oct 21, 2012 at 12:53 PM, Raoul rm...@free.fr wrote:

Hi, Trying to mount a partition of type ntfs under the following conditions, I get:
R241700, with fusefs-libs in sync.
kldload fuse: fuse loaded
mount -t ntfs /dev/daXsX: not supported!
mount_ntfs /dev/daXsX: no such file or directory!
In the second case, truss will report: sysctl , ... which means that the path is empty, of course. Perhaps I am missing something???

Try ntfs-3g /dev/daXsX /mnt/ntfs/

Attilio
Re: mounting ntfs partition
On 10/21/12, Raoul MEGELAS rm...@free.fr wrote:
On Sun, 21 Oct 2012 14:04:46 +0100 Attilio Rao atti...@freebsd.org wrote:

Hi Attilio,

On Sun, Oct 21, 2012 at 12:53 PM, Raoul rm...@free.fr wrote:

Hi, Trying to mount a partition of type ntfs under the following conditions, I get:
R241700, with fusefs-libs in sync.
kldload fuse: fuse loaded
mount -t ntfs /dev/daXsX: not supported!
mount_ntfs /dev/daXsX: no such file or directory!
In the second case, truss will report: sysctl , ... which means that the path is empty, of course. Perhaps I am missing something???

Try ntfs-3g /dev/daXsX /mnt/ntfs/

ntfs-3g crashes in the following environment: mac x86/64 hardware, running an external usb drive with a freebsd partition (da0s3; da1s1 da1s2 = ntfs partitions). And even if it mounts fine, the first read/write panics.

and now i investigated a little more.

BTW, are you using the newly imported FUSE module from the stock kernel? I think that fusefs-ntfs will install the fusefs-kmod port, which must not happen now. Please make sure to use the stock FUSE support present in -CURRENT (which means reinstalling kernel and world appropriately).

Attilio
Re: mounting ntfs partition
On Sun, Oct 21, 2012 at 10:55 PM, Raoul MEGELAS rm...@free.fr wrote:

sorry for posting 2 times the same message!

On Sun, 21 Oct 2012 22:07:36 +0100 Attilio Rao atti...@freebsd.org wrote:
On 10/21/12, Raoul MEGELAS rm...@free.fr wrote:
On Sun, 21 Oct 2012 14:04:46 +0100 Attilio Rao atti...@freebsd.org wrote:

Hi Attilio,

On Sun, Oct 21, 2012 at 12:53 PM, Raoul rm...@free.fr wrote:

Hi, Trying to mount a partition of type ntfs under the following conditions, I get:
R241700, with fusefs-libs in sync.
kldload fuse: fuse loaded
mount -t ntfs /dev/daXsX: not supported!
mount_ntfs /dev/daXsX: no such file or directory!
In the second case, truss will report: sysctl , ... which means that the path is empty, of course. Perhaps I am missing something???

Try ntfs-3g /dev/daXsX /mnt/ntfs/

ntfs-3g crashes in the following environment: mac x86/64 hardware, running an external usb drive with a freebsd partition (da0s3; da1s1 da1s2 = ntfs partitions). And even if it mounts fine, the first read/write panics.

and now i investigated a little more.

BTW, are you using the newly imported FUSE module from the stock kernel? I think that fusefs-ntfs will install the fusefs-kmod port, which must not happen now. Please make sure to use the stock FUSE support present in -CURRENT (which means reinstalling kernel and world appropriately).

If CURRENT R241700 has the good one, as UPDATING said, I have it. mount.c is dated October 18; other files in the same directory: February.

Yes, but did you rebuild world and install it? Is your kernel using FUSE support from the kernel config? (and not from the fusefs-kmod port)

Attilio
Re: mounting ntfs partition
On Sun, Oct 21, 2012 at 11:10 PM, Raoul MEGELAS rm...@free.fr wrote:
On Sun, 21 Oct 2012 22:58:00 +0100 Attilio Rao atti...@freebsd.org wrote:
On Sun, Oct 21, 2012 at 10:55 PM, Raoul MEGELAS rm...@free.fr wrote:
On Sun, 21 Oct 2012 22:07:36 +0100 Attilio Rao atti...@freebsd.org wrote:
On 10/21/12, Raoul MEGELAS rm...@free.fr wrote:
On Sun, 21 Oct 2012 14:04:46 +0100 Attilio Rao atti...@freebsd.org wrote:

Hi Attilio,

On Sun, Oct 21, 2012 at 12:53 PM, Raoul rm...@free.fr wrote:

Hi, Trying to mount a partition of type ntfs under the following conditions, I get:
R241700, with fusefs-libs in sync.
kldload fuse: fuse loaded
mount -t ntfs /dev/daXsX: not supported!
mount_ntfs /dev/daXsX: no such file or directory!
In the second case, truss will report: sysctl , ... which means that the path is empty, of course. Perhaps I am missing something???

Try ntfs-3g /dev/daXsX /mnt/ntfs/

ntfs-3g crashes in the following environment: mac x86/64 hardware, running an external usb drive with a freebsd partition (da0s3; da1s1 da1s2 = ntfs partitions). And even if it mounts fine, the first read/write panics.

and now i investigated a little more.

BTW, are you using the newly imported FUSE module from the stock kernel? I think that fusefs-ntfs will install the fusefs-kmod port, which must not happen now. Please make sure to use the stock FUSE support present in -CURRENT (which means reinstalling kernel and world appropriately).

If CURRENT R241700 has the good one, as UPDATING said, I have it. mount.c is dated October 18; other files in the same directory: February.

Yes, but did you rebuild world and install it? Is your kernel using FUSE support from the kernel config? (and not from the fusefs-kmod port)

yes:
deinstall fusefs-ntfs
deinstall fusefs-kmod
svn update
make buildworld: no error
build kernel: no error
mergemaster!
All this on October 18: R241700. I rebuilt mount from /usr/src/head/sbin/mount just now, and mount_ntfs, with the same result.

mount_ntfs plays no role here. It is a remnant of the in-kernel support for NTFS and it is not used at all. You need to install fusefs-ntfs and use ntfs-3g the way I showed you. However, fusefs-ntfs is going to bring along fusefs-kmod as a dependency. We need a patch to disable that. If Florian or George don't beat me to it (as they were going to fix it) I can produce one in 1-2 days.

Attilio
MPSAFE VFS -- update
Following the plan reported here: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS

We are now at the state where all non-MPSAFE filesystems are disconnected from the tree. At this point we can proceed with the import of a revised version of kib's patch, as reported on that page. This will effectively remove Giant from the VFS, buffer cache and GEOM layer. We expect to have this included as soon as possible, maybe before the end of the month.

Thanks,
Attilio
Re: Problem with fuse.ko
On Tue, Oct 16, 2012 at 6:53 PM, AN a...@neu.net wrote:

FreeBSD FBSD10 10.0-CURRENT FreeBSD 10.0-CURRENT #26 r241612: Tue Oct 16 13:03:26 EDT 2012 root@FBSD10:/usr/obj/usr/src/sys/MYKERNEL amd64

I loaded the module with kldload fuse.ko

# kldstat
Id Refs Address    Size   Name
 1   22 0x8020     d75248 kernel
 2    1 0x80f76000 3570   amdtemp.ko
 3    1 0x80f7a000 ee2b08 nvidia.ko
 4    3 0x81e5d000 48508  linux.ko
 5    1 0x81ea6000 4ef30  vboxdrv.ko
 6    1 0x82012000 3dfe   linprocfs.ko
 7    1 0x82183000 9744   fuse.ko

According to /usr/src/UPDATING:
20121014: Import the FUSE kernel and userland support into base system.

What provides libfuse.so.2?

# truecrypt
Shared object libfuse.so.2 not found, required by truecrypt

Previously the library was installed as part of the fusefs-kmod port, how do you install it now?

Install the port sysutils/fusefs-libs.

Attilio
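A quick way to check whether the library is visible after installing the port might look like the sketch below. It is an assumption-laden illustration, not part of the original exchange: the function name find_libfuse is hypothetical, and the directories to search are passed in by the caller (ports normally install under /usr/local/lib, but that path is only a suggestion here).

```shell
#!/bin/sh
# Hypothetical helper: look for libfuse.so.2 in the given directories.
# Prints the first match and returns 0, or prints a hint and returns 1.
find_libfuse() {
    for dir in "$@"; do
        if [ -e "$dir/libfuse.so.2" ]; then
            echo "$dir/libfuse.so.2"
            return 0
        fi
    done
    echo "libfuse.so.2 not found; install sysutils/fusefs-libs" >&2
    return 1
}
```

For example, find_libfuse /usr/local/lib /usr/lib prints the path of the library if it is present in either directory.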
Re: MPSAFE VFS -- List of upcoming actions
On Fri, Sep 21, 2012 at 1:22 AM, Attilio Rao atti...@freebsd.org wrote:
On Wed, Sep 19, 2012 at 3:48 AM, Attilio Rao atti...@freebsd.org wrote:
On Fri, Jul 13, 2012 at 12:18 AM, Attilio Rao atti...@freebsd.org wrote:
2012/7/4 Attilio Rao atti...@freebsd.org:
2012/6/29 Attilio Rao atti...@freebsd.org:

As already published several times, according to the following plan: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS

I still haven't heard from Vivien or Edward. Anyway, as NTFS is basically only used RO these days (also the mount_ntfs code just permits RO mounting), I stripped all the incomplete/bogus write support with the following patch: http://www.freebsd.org/~attilio/ntfs_remove_write.patch

This is an attempt to make the code smaller and possibly just focus on the locking that really matters (as a read-only filesystem). On some points of the patch I'm a bit less sure, as we could easily also take write into account for things like vaccess() arguments, and make it easier to re-add correct write support at some point in the future, but still force RO, even if the approach used in the patch is more correct IMHO. As an added bonus this patch cleans some dirty code in the mount operation and fixes a bug, as vfs_mountedfrom() is called before real mounting is completed and can still fail.

A quick update on this. It looks like NTFS won't be completed for this GSoC, thus I seriously need to find an alternative so as not to lose NTFS support entirely. I tried to look into the NTFS implementation right now and the support is really poor. As Peter has also verified, it can deadlock in no time, it completely violates VFS rules, etc. IMHO it deserves a complete rewrite if we still want to support in-kernel NTFS. I also tried to look at the NetBSD implementation. Their code is somewhat similar to ours, but they used very complicated (and very dirty) code to do the locking. Even if I don't know the NetBSD VFS well enough, I have the impression that not all the races are correctly handled. Definitely not something I would like to port.

Considering all that, the only viable option would be a userland filesystem implementation. My preferred choice would be to import PUFFS and librefuse on top of it, but honestly it requires a lot of time to be completed, time which I don't currently have, as in 2 months Giant must be gone from the VFS. I then decided to switch to gnn's revamp of the FUSE patches. You can find his initial e-mail here: http://lists.freebsd.org/pipermail/freebsd-fs/2012-March/013876.html

I took precisely the second version of George's patch and created this dolphin branch: svn://svn.freebsd.org/base/projects/fuse

I'm fixing low hanging fruit for the moment (see r238411 for example) and I still have to make a thorough review. However, my idea is to commit the support once:
- ntfs-3g is well stress-tested and proves to be bug-free
- there is no major/big technical issue pending after the reviews

In the last weeks Peter, Florian, Gustau and I have been working on stabilizing fuse support. Specifically, Peter has worked hard on producing several utilities to nit stress-test fuse and in particular ntfs, Florian has improved fuse related ports (as explained later) and Gustau has done sparse testing. I feel moderately satisfied with the current level of stability of fuse, enough to propose it for wider usage, in particular given the huge amount of complaints I'm hearing around about occasional fuse users.

The final target of the project is to completely import the content of fusefs-kmod into base, starting from the patches posted earlier by George. So far we have only imported the kernel part into the fuse branch, so the userland part of fusefs-kmod still needs to be installed from ports; I am studying the mount_fusefs licensing before proceeding with the import of its userland bits. The fixing has been happening here: svn://svn.freebsd.org/base/projects/fuse/ which is essentially a HEAD branch + fuse kernel components.

In order to get fuse, please compile a kernel from this branch with the FUSE option or simply build and load the fuse module. Alternatively, a kernel patch that should work with HEAD@240684 is here: http://www.freebsd.org/~attilio/fuse_import/fuse_240684.patch

I guess the patch can easily apply to all FreeBSD branches, really, but it is not tested on anything other than -CURRENT. As said, you currently still need to build the fusefs-kmod port. However, you need these further patches, to be put in the fusefs-kmod/files/ directory:
http://www.freebsd.org/~attilio/fuse_import/patch-Makefile
http://www.freebsd.org/~attilio/fuse_import/patch-mount_fusefs__mount_fusefs2.c

They both disable the old kernel building/linking and import new functionality to let the new kernel support work well in the presence of many consumers. In addition to fusefs-kmod, Bryan and Florian have also updated fusefs-lib and fusefs-ntfs ports
Re: MPSAFE VFS -- List of upcoming actions
On Wed, Oct 10, 2012 at 6:15 AM, Kevin Oberman kob6...@gmail.com wrote:
On Mon, Oct 8, 2012 at 7:57 AM, Attilio Rao atti...@freebsd.org wrote:
On Fri, Sep 28, 2012 at 4:47 PM, Harald Schmalzbauer h.schmalzba...@omnilan.de wrote:
Attilio Rao wrote on 28.09.2012 16:18 (localtime):
On Wed, Sep 26, 2012 at 12:02 PM, Harald Schmalzbauer h.schmalzba...@omnilan.de wrote: ...

Since many people are willing to test fuse on STABLE_9, I made this patch that at least compiles there: http://www.freebsd.org/~attilio/fuse_import/fuse_stable9_241030.patch

Thanks a lot! In the meantime I got the original patch to compile. I simply looked at the changes which were made around July in the fuse project to follow changes in head (checkpath(), vrecycle() and vtruncbuf()) and reverted them. Since I have no idea about the code I modified, I'm happy that you did a more qualified patch set :-)

Of course, I didn't have a chance to test it because I'm also out on vacation right now, but please do and report.

Happy holiday!!! If you're by chance around the Oktoberfest, drop me a note, I'll pay you a Maß (or any other drink if you don't like „Wiesnbier“) :-)

I really hoped to make it this year, but no luck :/ ...

Some questions: Is this planned to be mfc'd and if so, how can one know?

In which sense "how can one know"? We usually specify MFC timeouts in the commit message (not sure if this answers your concerns).

Yep, that's what I wanted to know. So if there's no MFC timeout in the log, it's not intended to be MFCd ever, I guess. Thanks a lot! World/Kernel compiled fine in the meantime, I'll do some sshfs tests.

Did you do any test in the end? Thanks, Attilio

I have done some testing and it clearly is more stable than the old kmod. At least operations that crashed my system now work. I did see one weird anomaly, though. I had several NTFS file systems mounted, one a Windows OS. I also had a GELI encrypted UFS file system mounted. They were both mounted and working. I finished with the data disk and tried to unmount it. I got no error, but it remained mounted. I did not actually try to access it. Figured it would umount when I shut down or end up dirty and I'd have to fsck it. The unmount attempt was using nautilus/gnome-mount. This is not the odd part, though.

Kevin, can you please report the steps required to reproduce it in high detail (rather than a description)? This will help in reproducing it and eventually fixing it.

Attilio
Re: MPSAFE VFS -- List of upcoming actions
On Wed, Oct 10, 2012 at 6:15 AM, Kevin Oberman kob6...@gmail.com wrote:
On Mon, Oct 8, 2012 at 7:57 AM, Attilio Rao atti...@freebsd.org wrote:
On Fri, Sep 28, 2012 at 4:47 PM, Harald Schmalzbauer h.schmalzba...@omnilan.de wrote:
Attilio Rao wrote on 28.09.2012 16:18 (localtime):
On Wed, Sep 26, 2012 at 12:02 PM, Harald Schmalzbauer h.schmalzba...@omnilan.de wrote: ...

Since many people are willing to test fuse on STABLE_9, I made this patch that at least compiles there: http://www.freebsd.org/~attilio/fuse_import/fuse_stable9_241030.patch

Thanks a lot! In the meantime I got the original patch to compile. I simply looked at the changes which were made around July in the fuse project to follow changes in head (checkpath(), vrecycle() and vtruncbuf()) and reverted them. Since I have no idea about the code I modified, I'm happy that you did a more qualified patch set :-)

Of course, I didn't have a chance to test it because I'm also out on vacation right now, but please do and report.

Happy holiday!!! If you're by chance around the Oktoberfest, drop me a note, I'll pay you a Maß (or any other drink if you don't like „Wiesnbier“) :-)

I really hoped to make it this year, but no luck :/ ...

Some questions: Is this planned to be mfc'd and if so, how can one know?

In which sense "how can one know"? We usually specify MFC timeouts in the commit message (not sure if this answers your concerns).

Yep, that's what I wanted to know. So if there's no MFC timeout in the log, it's not intended to be MFCd ever, I guess. Thanks a lot! World/Kernel compiled fine in the meantime, I'll do some sshfs tests.

Did you do any test in the end? Thanks, Attilio

I have done some testing and it clearly is more stable than the old kmod. At least operations that crashed my system now work. I did see one weird anomaly, though. I had several NTFS file systems mounted, one a Windows OS. I also had a GELI encrypted UFS file system mounted. They were both mounted and working. I finished with the data disk and tried to unmount it. I got no error, but it remained mounted. I did not actually try to access it. Figured it would umount when I shut down or end up dirty and I'd have to fsck it. The unmount attempt was using nautilus/gnome-mount. This is not the odd part, though.

After the attempt to unmount the UFS device, I could no longer access the Window_OS file system. An ls showed the mount point to be d- and an attempt to list files in the directory reported that the socket was not found. So it looks like the attempt to unmount one NTFS FS deleted the socket for the other. This makes absolutely no sense to me, but you understand the underlying operations better than I do. Repeated efforts have failed to re-create the problem. I'm baffled. It is possible that there is no relationship between the two odd things happening at about the same time (NTFS volume lost socket and UFS disk won't unmount, but reports no errors), but neither has happened since.

FWIW, I also see that no device numbers are listed for the fuse devices:

/dev/fuse 184319948 165594236 18725712 90% /media/Media
/dev/fuse 110636028  82934424 27701604 75% /media/Windows7_OS

How does the system distinguish between them?

Sorry, I forgot to reply about this and it is due: unlike the fuse4bsd version, this one doesn't do device cloning but uses the devfs_*_cdevpriv() infrastructure. So effectively different file descriptors are handled internally. This is why it requires further changes to mount_fusefs(8) (because the vfs_mount operation also needs further knowledge of the per-file-descriptor handle and it cannot acquire it easily because it is not a devfs operation).

Thanks, Attilio
Re: MPSAFE VFS -- List of upcoming actions
On Fri, Sep 28, 2012 at 4:47 PM, Harald Schmalzbauer h.schmalzba...@omnilan.de wrote:
Attilio Rao wrote on 28.09.2012 16:18 (localtime):
On Wed, Sep 26, 2012 at 12:02 PM, Harald Schmalzbauer h.schmalzba...@omnilan.de wrote: ...

Since many people are willing to test fuse on STABLE_9, I made this patch that at least compiles there: http://www.freebsd.org/~attilio/fuse_import/fuse_stable9_241030.patch

Thanks a lot! In the meantime I got the original patch to compile. I simply looked at the changes which were made around July in the fuse project to follow changes in head (checkpath(), vrecycle() and vtruncbuf()) and reverted them. Since I have no idea about the code I modified, I'm happy that you did a more qualified patch set :-)

Of course, I didn't have a chance to test it because I'm also out on vacation right now, but please do and report.

Happy holiday!!! If you're by chance around the Oktoberfest, drop me a note, I'll pay you a Maß (or any other drink if you don't like „Wiesnbier“) :-)

I really hoped to make it this year, but no luck :/ ...

Some questions: Is this planned to be mfc'd and if so, how can one know?

In which sense "how can one know"? We usually specify MFC timeouts in the commit message (not sure if this answers your concerns).

Yep, that's what I wanted to know. So if there's no MFC timeout in the log, it's not intended to be MFCd ever, I guess. Thanks a lot! World/Kernel compiled fine in the meantime, I'll do some sshfs tests.

Did you do any test in the end?

Thanks, Attilio
Re: MPSAFE VFS -- List of upcoming actions
On Wed, Sep 26, 2012 at 12:02 PM, Harald Schmalzbauer h.schmalzba...@omnilan.de wrote:
Harald Schmalzbauer wrote on 25.09.2012 20:24 (localtime):
Attilio Rao wrote on 21.09.2012 02:22 (localtime):
On Wed, Sep 19, 2012 at 3:48 AM, Attilio Rao atti...@freebsd.org wrote:
On Fri, Jul 13, 2012 at 12:18 AM, Attilio Rao atti...@freebsd.org wrote:
2012/7/4 Attilio Rao atti...@freebsd.org:
2012/6/29 Attilio Rao atti...@freebsd.org:
[ trim ]
Re: MPSAFE VFS -- List of upcoming actions
On Fri, Sep 21, 2012 at 1:22 AM, Attilio Rao atti...@freebsd.org wrote:
[ trim ]

You can use the branch directly or this patch against -CURRENT at 240752: http://www.freebsd.org/~attilio/fuse_import/fuse_240752.patch

In order to test this work, then, you just need to patch your sources with this patch (or use the branch directly) and install ports normally as they work.

Forgot to tell: with the new branch you *must not* install the fusefs-kmod port. Please test it from a pristine installation or double-check that your fusefs-kmod port is completely gone (if already installed) before reporting bugs, as its functionality could be tainting that of the branch.

Attilio
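The patch-and-build procedure described above could be scripted roughly as follows. This is a hedged sketch, not the procedure actually used: the run helper, the RUN variable and the function name build_fuse_kernel are hypothetical, the -p0 strip level is a guess about how the patch was generated, and a kernel build normally also needs a matching toolchain or buildworld first. Without RUN=1 the commands are only printed.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print each command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Apply the fuse patch to a -CURRENT source tree and build/install a kernel.
# Arguments: source dir, patch file, kernel config name (default GENERIC).
# Assumption: the kernel config enables the FUSE option mentioned in the
# thread, or the fuse module is built and loaded afterwards.
# Reminder from the thread: the fusefs-kmod port must NOT be installed.
build_fuse_kernel() {
    src=$1
    patchfile=$2
    kernconf=${3:-GENERIC}
    run patch -p0 -d "$src" -i "$patchfile" || return 1
    run make -C "$src" buildkernel KERNCONF="$kernconf" || return 1
    run make -C "$src" installkernel KERNCONF="$kernconf"
}
```

For instance, build_fuse_kernel /usr/src fuse_240752.patch prints the three commands it would run against a tree checked out at the matching revision.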
Re: MPSAFE VFS -- List of upcoming actions
On Wed, Sep 19, 2012 at 3:48 AM, Attilio Rao atti...@freebsd.org wrote:
On Fri, Jul 13, 2012 at 12:18 AM, Attilio Rao atti...@freebsd.org wrote:
2012/7/4 Attilio Rao atti...@freebsd.org:
2012/6/29 Attilio Rao atti...@freebsd.org:

As already published several times, according to the following plan: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS

I still haven't heard from Vivien or Edward. Anyway, as NTFS is basically only used RO these days (also the mount_ntfs code just permits RO mounting), I stripped all the incomplete/bogus write support with the following patch: http://www.freebsd.org/~attilio/ntfs_remove_write.patch

This is an attempt to make the code smaller and possibly just focus on the locking that really matters (as a read-only filesystem). On some points of the patch I'm a bit less sure, as we could easily also take write into account for things like vaccess() arguments, and make it easier to re-add correct write support at some point in the future, but still force RO, even if the approach used in the patch is more correct IMHO. As an added bonus this patch cleans some dirty code in the mount operation and fixes a bug, as vfs_mountedfrom() is called before real mounting is completed and can still fail.

A quick update on this. It looks like NTFS won't be completed for this GSoC, thus I seriously need to find an alternative so as not to lose NTFS support entirely. I tried to look into the NTFS implementation right now and the support is really poor. As Peter has also verified, it can deadlock in no time, it completely violates VFS rules, etc. IMHO it deserves a complete rewrite if we still want to support in-kernel NTFS. I also tried to look at the NetBSD implementation. Their code is somewhat similar to ours, but they used very complicated (and very dirty) code to do the locking. Even if I don't know the NetBSD VFS well enough, I have the impression that not all the races are correctly handled. Definitely not something I would like to port.

Considering all that, the only viable option would be a userland filesystem implementation. My preferred choice would be to import PUFFS and librefuse on top of it, but honestly it requires a lot of time to be completed, time which I don't currently have, as in 2 months Giant must be gone from the VFS. I then decided to switch to gnn's revamp of the FUSE patches. You can find his initial e-mail here: http://lists.freebsd.org/pipermail/freebsd-fs/2012-March/013876.html

I took precisely the second version of George's patch and created this dolphin branch: svn://svn.freebsd.org/base/projects/fuse

I'm fixing low hanging fruit for the moment (see r238411 for example) and I still have to make a thorough review. However, my idea is to commit the support once:
- ntfs-3g is well stress-tested and proves to be bug-free
- there is no major/big technical issue pending after the reviews

In the last weeks Peter, Florian, Gustau and I have been working on stabilizing fuse support. Specifically, Peter has worked hard on producing several utilities to nit stress-test fuse and in particular ntfs, Florian has improved fuse related ports (as explained later) and Gustau has done sparse testing. I feel moderately satisfied with the current level of stability of fuse, enough to propose it for wider usage, in particular given the huge amount of complaints I'm hearing around about occasional fuse users.

The final target of the project is to completely import the content of fusefs-kmod into base, starting from the patches posted earlier by George. So far we have only imported the kernel part into the fuse branch, so the userland part of fusefs-kmod still needs to be installed from ports; I am studying the mount_fusefs licensing before proceeding with the import of its userland bits. The fixing has been happening here: svn://svn.freebsd.org/base/projects/fuse/ which is essentially a HEAD branch + fuse kernel components.

In order to get fuse, please compile a kernel from this branch with the FUSE option or simply build and load the fuse module. Alternatively, a kernel patch that should work with HEAD@240684 is here: http://www.freebsd.org/~attilio/fuse_import/fuse_240684.patch

I guess the patch can easily apply to all FreeBSD branches, really, but it is not tested on anything other than -CURRENT. As said, you currently still need to build the fusefs-kmod port. However, you need these further patches, to be put in the fusefs-kmod/files/ directory:
http://www.freebsd.org/~attilio/fuse_import/patch-Makefile
http://www.freebsd.org/~attilio/fuse_import/patch-mount_fusefs__mount_fusefs2.c

They both disable the old kernel building/linking and import new functionality to let the new kernel support work well in the presence of many consumers. In addition to fusefs-kmod, Bryan and Florian have also updated fusefs-lib and fusefs-ntfs ports. For instance, please refer to this e-mail: http://lists.freebsd.org/pipermail
Re: MPSAFE VFS -- List of upcoming actions
On Wed, Sep 19, 2012 at 4:47 AM, Kevin Oberman kob6...@gmail.com wrote:
On Tue, Sep 18, 2012 at 7:48 PM, Attilio Rao atti...@freebsd.org wrote:
On Fri, Jul 13, 2012 at 12:18 AM, Attilio Rao atti...@freebsd.org wrote:
2012/7/4 Attilio Rao atti...@freebsd.org:
2012/6/29 Attilio Rao atti...@freebsd.org:
[ trim ]
Re: MPSAFE VFS -- List of upcoming actions
On 9/19/12, Kevin Oberman kob6...@gmail.com wrote: On Wed, Sep 19, 2012 at 12:30 AM, Attilio Rao atti...@freebsd.org wrote: On Wed, Sep 19, 2012 at 4:47 AM, Kevin Oberman kob6...@gmail.com wrote: On Tue, Sep 18, 2012 at 7:48 PM, Attilio Rao atti...@freebsd.org wrote: On Fri, Jul 13, 2012 at 12:18 AM, Attilio Rao atti...@freebsd.org wrote: 2012/7/4 Attilio Rao atti...@freebsd.org: 2012/6/29 Attilio Rao atti...@freebsd.org: As already published several times, according to the following plan: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS I still haven't heard from Vivien or Edward, anyway as NTFS is basically only used RO these days (also the mount_ntfs code just permits RO mounting) I stripped all the uncomplete/bogus write support with the following patch: http://www.freebsd.org/~attilio/ntfs_remove_write.patch This is an attempt to make the code smaller and possibly just focus on the locking that really matter (as read-only filesystem). On some points of the patch I'm a bit less sure as we could easily take into account also write for things like vaccess() arguments, and make easier to re-add correct write support at some point in the future, but still force RO, even if the approach used in the patch is more correct IMHO. As an added bonus this patch cleans some dirty code in the mount operation and fixes a bug as vfs_mountedfrom() is called before real mounting is completed and can still fail. A quick update on this. It looks like NTFS won't be completed for this GSoC thus I seriously need to find an alternative to not loose the NTFS support entirely. I tried to look into the NTFS implementation right now and it is really a poor support. As Peter has also verified, it can deadlock in no-time, it compeltely violates VFS rules, etc. IMHO it deserves a complete rewrite if we would still support in-kernel NTFS. I also tried to look at the NetBSD implementation. 
Their code is somewhat similar to ours, but they used very complicated (and very dirty) code to do the locking. Even if I don't know the NetBSD VFS well enough, I have the impression that not all the races are correctly handled. Definitely not something I would like to port. Considering all that, the only viable option would be a userland filesystem implementation. My preferred choice would be to import PUFFS and librefuse on top of it, but honestly it requires a lot of time to complete, time which I don't currently have, as in 2 months Giant must be gone from the VFS. I then decided to switch to gnn's revamp of the FUSE patches. You can find his initial e-mail here: http://lists.freebsd.org/pipermail/freebsd-fs/2012-March/013876.html I took precisely the second version of George's patch and created this dolphin branch: svn://svn.freebsd.org/base/projects/fuse I'm fixing low-hanging fruit for the moment (see r238411 for example) and I still have to do a thorough review. However, my idea is to commit the support once: - ntfs-3g is well stress-tested and proves to be bug-free - there is no major/big technical issue pending after the reviews In the last weeks Peter, Florian, Gustau and I have been working on stabilizing fuse support. Specifically, Peter has worked hard on producing several utilities to nit stress-test fuse and in particular ntfs, Florian has improved the fuse-related ports (as explained later) and Gustau has done sparse testing. I feel moderately satisfied with the level of stability of fuse now, enough to propose it for wider usage, in particular given the huge amount of complaints I'm hearing from occasional fuse users. The final target of the project is to completely import into base the content of fusefs-kmod, starting from the patches posted earlier by George.
So far, we have only taken care of importing the kernel part into the fuse branch, so the fusefs-kmod userland part still needs to be installed from ports, but I was studying the mount_fusefs licensing before proceeding with the import of its userland bits. The fixing has been happening here: svn://svn.freebsd.org/base/projects/fuse/ which is essentially a HEAD branch + fuse kernel components. In order to get fuse, please compile a kernel from this branch with the FUSE option or simply build and load the fuse module. Alternatively, a kernel patch that should work with HEAD@240684 is here: http://www.freebsd.org/~attilio/fuse_import/fuse_240684.patch I guess the patch can easily apply to all FreeBSD branches, really, but it has not been tested on anything other than -CURRENT. As said, you currently still need to build the fusefs-kmod port. However, you need these further patches, to be put in the fusefs-kmod/files/ directory: http://www.freebsd.org/~attilio/fuse_import/patch-Makefile http://www.freebsd.org/~attilio/fuse_import/patch-mount_fusefs__mount_fusefs2.c They both disable the old kernel building/linking and import new functionality to let the new kernel support work well in the presence
Re: MPSAFE VFS -- List of upcoming actions
On Fri, Jul 13, 2012 at 12:18 AM, Attilio Rao atti...@freebsd.org wrote: 2012/7/4 Attilio Rao atti...@freebsd.org: 2012/6/29 Attilio Rao atti...@freebsd.org: As already published several times, according to the following plan: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS I still haven't heard from Vivien or Edward; anyway, as NTFS is basically only used RO these days (also, the mount_ntfs code only permits RO mounting), I stripped all the incomplete/bogus write support with the following patch: http://www.freebsd.org/~attilio/ntfs_remove_write.patch This is an attempt to make the code smaller and possibly just focus on the locking that really matters (as a read-only filesystem). On some points of the patch I'm a bit less sure, as we could easily also take write into account for things like vaccess() arguments, and make it easier to re-add correct write support at some point in the future, but still force RO, even if the approach used in the patch is more correct IMHO. As an added bonus, this patch cleans up some dirty code in the mount operation and fixes a bug where vfs_mountedfrom() is called before the real mounting is completed and can still fail. A quick update on this. It looks like NTFS won't be completed for this GSoC, thus I seriously need to find an alternative so as not to lose NTFS support entirely. I looked into the NTFS implementation just now and its support is really poor. As Peter has also verified, it can deadlock in no time, it completely violates VFS rules, etc. IMHO it deserves a complete rewrite if we still want to support in-kernel NTFS. I also tried to look at the NetBSD implementation. Their code is somewhat similar to ours, but they used very complicated (and very dirty) code to do the locking. Even if I don't know the NetBSD VFS well enough, I have the impression that not all the races are correctly handled. Definitely not something I would like to port. Considering all that, the only viable option would be a userland filesystem implementation.
My preferred choice would be to import PUFFS and librefuse on top of it, but honestly it requires a lot of time to complete, time which I don't currently have, as in 2 months Giant must be gone from the VFS. I then decided to switch to gnn's revamp of the FUSE patches. You can find his initial e-mail here: http://lists.freebsd.org/pipermail/freebsd-fs/2012-March/013876.html I took precisely the second version of George's patch and created this dolphin branch: svn://svn.freebsd.org/base/projects/fuse I'm fixing low-hanging fruit for the moment (see r238411 for example) and I still have to do a thorough review. However, my idea is to commit the support once: - ntfs-3g is well stress-tested and proves to be bug-free - there is no major/big technical issue pending after the reviews In the last weeks Peter, Florian, Gustau and I have been working on stabilizing fuse support. Specifically, Peter has worked hard on producing several utilities to nit stress-test fuse and in particular ntfs, Florian has improved the fuse-related ports (as explained later) and Gustau has done sparse testing. I feel moderately satisfied with the level of stability of fuse now, enough to propose it for wider usage, in particular given the huge amount of complaints I'm hearing from occasional fuse users. The final target of the project is to completely import into base the content of fusefs-kmod, starting from the patches posted earlier by George. So far, we have only taken care of importing the kernel part into the fuse branch, so the fusefs-kmod userland part still needs to be installed from ports, but I was studying the mount_fusefs licensing before proceeding with the import of its userland bits. The fixing has been happening here: svn://svn.freebsd.org/base/projects/fuse/ which is essentially a HEAD branch + fuse kernel components. In order to get fuse, please compile a kernel from this branch with the FUSE option or simply build and load the fuse module.
Alternatively, a kernel patch that should work with HEAD@240684 is here: http://www.freebsd.org/~attilio/fuse_import/fuse_240684.patch I guess the patch can easily apply to all FreeBSD branches, really, but it has not been tested on anything other than -CURRENT. As said, you currently still need to build the fusefs-kmod port. However, you need these further patches, to be put in the fusefs-kmod/files/ directory: http://www.freebsd.org/~attilio/fuse_import/patch-Makefile http://www.freebsd.org/~attilio/fuse_import/patch-mount_fusefs__mount_fusefs2.c They both disable the old kernel building/linking and import new functionality to let the new kernel support work well in the presence of many consumers. In addition to fusefs-kmod, Bryan and Florian have also updated the fusefs-lib and fusefs-ntfs ports. For instance, please refer to this e-mail: http://lists.freebsd.org/pipermail/freebsd-ports/2012-August/077950.html Even if this work is somewhat independent of the fusefs-kmod import, I
Re: Clang as default compiler November 4th
On 9/11/12, Brooks Davis bro...@freebsd.org wrote: On Tue, Sep 11, 2012 at 01:45:18PM +0300, Konstantin Belousov wrote: On Mon, Sep 10, 2012 at 04:12:07PM -0500, Brooks Davis wrote: For the past several years we've been working towards migrating from GCC to Clang/LLVM as our default compiler. We intend to ship FreeBSD 10.0 with Clang as the default compiler on i386 and amd64 platforms. To this end, we will make WITH_CLANG_IS_CC the default on i386 and amd64 platforms on November 4th. What does the mean to you? * When you build world after the default is changed /usr/bin/cc, cpp, and c++ will be links to clang. * This means the initial phase of buildworld and old style kernel compilation will use clang instead of gcc. This is known to work. * It also means that ports will build with clang by default. A major of ports work, but a significant number are broken or blocked by broken ports. For more information see: http://wiki.freebsd.org/PortsAndClang What issues remain? * The gcc-clang transition currently requires setting CC, CXX, and CPP in addition to WITH_CLANG_IS_CC. I will post a patch to toolchain@ to address this shortly. * Ports compiler selection infrastructure is still under development. * Some ports could build with clang with appropriate tweaks. What can you do to help? * Switch (some of) your systems. Early adoption can help us find bugs. * Fix ports to build with clang. If you don't have a clang system, you can use the CLANG/amd64 or CLANG/i386 build environments on redports.org. tl;dr: Clang will become the default compiler for x86 architectures on 2012-11-04 There was a chorus of voices talking about ports already. My POV is that suggesting to 'fix remaining ports to work with clang' is just a nonsense. You are proposing to fork the development of all the programs which do not compile with clang. Often, upstream developers do not care about clang at all since it not being default compiler in Debian/Fedora/Whatever Linux. 
The project simply do not have resources to maintain the fork of 20K programs. I may have phrased the above poorly, but in most cases I'd be happy with using USE_GCC as a solution, but to the extent that port maintainers can fix their ports to build with clang, that's a good thing. Having a deadline will help focus efforts towards finding the right fix for the most important ports in a timely manner. If we near the deadline and find that we need a few more weeks, nothing prevents us from slipping the date a bit. Another issue with the switch, which seems to be not only not addressed, but even not talked about, is the performance impact of the change. I do not remember any measurements, whatever silly they could be, of the performance change by the compiler switch. We often have serious and argumented push-back for kernel changes that give as low as 2-3% of the speed hit. What are the numbers for clang change, any numbers ? Florian Smeets (flo) did one round of benchmarks back in June with sysbench/mysql. There is a small but measurable slowdown both with world compiled with clang and with mysql compiled with clang. You can see the results on the last page of this document: http://people.freebsd.org/~flo/perf.pdf The total impacts are on the order of 1-2%. That's more than I'd like and I expect some pushback, but I feel it is in the range of acceptable code debt to take on to accomplish a multi-year project goal. 1-2% on SMP workload can just be part of the variance due to memory layout changes. What I would like to see is benchmarks in UP configurations, like machine booting with only one process and doing make buildworld (no -j at all). This could give a good measurement if the compiler changed anything or not. Attilio -- Peace can only be achieved by understanding - A. Einstein ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Clang as default compiler November 4th
On Tue, Sep 11, 2012 at 4:56 PM, Garrett Cooper yaneg...@gmail.com wrote: On Sep 11, 2012, at 8:35 AM, Daniel Eischen wrote: On Tue, 11 Sep 2012, Konstantin Belousov wrote: On Tue, Sep 11, 2012 at 02:06:49PM +0200, Roman Divacky wrote: We currently dont compile 4680 ports (out of 23857). Top 10 ports that prevent the most other ports from compiling together prevent ports from compilation. So if we fixed those 10 ports we could be at around 2500 ports not compiling. Thats quite far from your claim of forking 20k programs. Sorry, I cannot buy the argument. How many patches there are already in the ports tree to cope with clang incompatibility with gcc ? You may declare that all of them are application bugs, but it completely misses the point. [ snip ] I believe majority of the broken ports is broken because their maintainer never saw them being broken with clang just because it's not the default compiler. Thus by making it the default majority of the problems would just go away. Can you, please, read what I wrote ? Fixing _ports_ to compile with clang is plain wrong. Upstream developers use gcc almost always for development and testing. Establishing another constant cost on the porting work puts burden on the ports submitters, maintainers and even ports users. This is a good point! Alternate compilers are being used on other OS distributions, like Arch Linux, Gentoo Linux, etc, so encouraging external developers to correct/simplify their Makefiles and build infrastructures is a good thing (and plus, it makes switching to other compilers like icc, pcc, etc easier). You're going to run into almost the same problem when trying to get stuff to cross-compile for multiple targets, so there's no reason why FreeBSD/Linux should not strive to get others to hardcode less. I wouldn't consider ports to be a stopgap for the clang switchover as much as correctness/performance. 
Broken third-party software can be fixed, but if the underlying foundation doesn't deliver sane code or severely regresses performance (runtime is more important than building IMO because I'd rather have code take a little while longer to compile if the end-result runs faster, and ultimately runtime performance affects build performance), then there's no point in trying to switch over yet. While this is generally true, I think we need to make a distinction. To me, just saying that ports don't compile doesn't mean anything. What are the bugs that actually prevent the vast majority of ports from compiling? (Speaking of which, has anyone tested their actual functionality too?) I really don't expect the same bugs to be repeated over and over: some bugs will depend on brain-os in the ports code and others on clang bugs. As kib@ rightly points out, fixing ports indiscriminately is not the solution, but fixing ports when *the bug is actually in the port itself* is the right solution; for the other class of bugs, fix the compiler. Did the people pushing for default clang make an assessment of the types of ports bugs present? (And I see there are a lot of people aiming for it, so if the ports are split among several people we can get a good handle on it.) Could such bugs be characterized and classified? Making such a huge change is also a matter of losing a lot of time on problems which don't seem directly related to it, but which in fact prevent the system from working correctly. Switching to default CLANG in the current situation (20% of ports not even compiling, ports compiler selection broken, libm loss of precision, performance barely analyzed in a simplified scheme, etc.) is not an option in my head, and people should really reconsider it unless all these points get properly addressed. Attilio -- Peace can only be achieved by understanding - A. 
Einstein
Re: TUNABLE_INT vs TUNABLE_INT_FETCH
On 8/23/12, Luigi Rizzo ri...@iet.unipi.it wrote: Hi, I am a bit unclear on what the pros and cons are of using TUNABLE_INT vs TUNABLE_INT_FETCH within a device driver. TUNABLE_INT is basically the static-initializer version of TUNABLE_INT_FETCH. In short, you use TUNABLE_INT_FETCH() inside normal functions, and TUNABLE_INT() alongside data declarations. Attilio -- Peace can only be achieved by understanding - A. Einstein
Re: TUNABLE_INT vs TUNABLE_INT_FETCH
On Thu, Aug 23, 2012 at 5:05 PM, Luigi Rizzo ri...@iet.unipi.it wrote: On Thu, Aug 23, 2012 at 03:52:56PM +0100, Attilio Rao wrote: On 8/23/12, Luigi Rizzo ri...@iet.unipi.it wrote: Hi, I am a bit unclear on what the pros and cons are of using TUNABLE_INT vs TUNABLE_INT_FETCH within a device driver. TUNABLE_INT is basically the static-initializer version of TUNABLE_INT_FETCH. In short, you use TUNABLE_INT_FETCH() inside normal functions, and TUNABLE_INT() alongside data declarations. The thing is, do we need the data declaration at all? What do you mean by data declaration? We need to mimic static initialization, so what we do is use the first SYSINIT() subsystem available (SI_SUB_TUNABLES). You also need the env variable to look up and the static variable to initialize, so for SYSINIT's sake you need to pack them up in a single argument. Attilio -- Peace can only be achieved by understanding - A. Einstein
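The distinction described above can be sketched in plain userland C. This is only a mock under stated assumptions, not the kernel implementation: every `mock_`/`MOCK_` name, the `mydrv` driver, the `hw.mydrv.*` tunable names and the fake environment table are hypothetical. The real TUNABLE_INT() and TUNABLE_INT_FETCH() macros come from `<sys/kernel.h>` and consult the kernel environment. The sketch shows the function-style lookup, and the declaration-style form that packs the name and the variable into one record for an early-boot pass to process.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the kernel environment ("name=value" strings). */
static const char *mock_kenv[] = {
	"hw.mydrv.max_queues=8",
	"hw.mydrv.debug=0x2",
	NULL
};

/* Return the value string for 'name', or NULL if it is not set. */
const char *
mock_kenv_get(const char *name)
{
	size_t len = strlen(name);
	int i;

	for (i = 0; mock_kenv[i] != NULL; i++)
		if (strncmp(mock_kenv[i], name, len) == 0 &&
		    mock_kenv[i][len] == '=')
			return (mock_kenv[i] + len + 1);
	return (NULL);
}

/* Function-style lookup: the analogue of TUNABLE_INT_FETCH(path, var). */
int
mock_tunable_int_fetch(const char *path, int *var)
{
	const char *s = mock_kenv_get(path);
	char *end;
	long v;

	if (s == NULL || *s == '\0')
		return (0);	/* not set: *var keeps its default */
	v = strtol(s, &end, 0);
	if (*end != '\0')
		return (0);	/* not a number: leave *var untouched */
	*var = (int)v;
	return (1);
}

/* The name and the variable packed together into a single argument. */
struct mock_tunable_int {
	const char *path;
	int *var;
};

/* Declaration-style form: the analogue of TUNABLE_INT(path, var). */
#define MOCK_TUNABLE_INT(path, var) { (path), (var) }

int mydrv_max_queues = 4;	/* default used when the tunable is unset */

static struct mock_tunable_int mock_boot_tunables[] = {
	MOCK_TUNABLE_INT("hw.mydrv.max_queues", &mydrv_max_queues),
};

/* Stand-in for the early-boot SYSINIT pass that walks the records. */
void
mock_run_boot_tunables(void)
{
	size_t n = sizeof(mock_boot_tunables) / sizeof(mock_boot_tunables[0]);
	size_t i;

	for (i = 0; i < n; i++)
		(void)mock_tunable_int_fetch(mock_boot_tunables[i].path,
		    mock_boot_tunables[i].var);
}
```

In a real driver the same two shapes would presumably look like `TUNABLE_INT("hw.mydrv.max_queues", &mydrv_max_queues);` at file scope and `TUNABLE_INT_FETCH("hw.mydrv.debug", &debug);` inside a function such as an attach routine, again with hypothetical names.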
Re: On cooperative work [Was: Re: newbus' ivar's limitation..]
On Wed, Aug 1, 2012 at 5:32 PM, Arnaud Lacombe lacom...@gmail.com wrote: Hi, On Tue, Jul 31, 2012 at 4:14 PM, Attilio Rao atti...@freebsd.org wrote: You don't want to work cooperatively. Why is it that mbuf's refactoring consultation is being held in internal, private, committers-and-invite-only-restricted meeting at BSDCan ? Why is it that so much review and discussion on changes are held privately ? Arnaud, believe me, to date I don't recall a single major technical decision that has been settled exclusively in private (not subjected to peer review), and in particular in person (e-mail helps you focus on a lot of different details that you may not have under control when talking to people, etc). Sometimes it is useful for a limited number of developers to be involved in the initial brainstorming of some work, but after this period constructive people usually ask for peer review, publishing their plans on the mailing lists or other media. If you don't see any further public discussion, this may mean: a) the BSDCan meetings have been fruitless and there is no precise plan/roadmap/etc. b) there is still no consensus on the details, and you can always publicly ask what was decided and what was not. Just send a mail to the interested recipients and CC any FreeBSD mailing list. Attilio -- Peace can only be achieved by understanding - A. Einstein
Re: On cooperative work [Was: Re: newbus' ivar's limitation..]
On 8/1/12, Arnaud Lacombe lacom...@gmail.com wrote: Hi, On Wed, Aug 1, 2012 at 12:40 PM, Attilio Rao atti...@freebsd.org wrote: On Wed, Aug 1, 2012 at 5:32 PM, Arnaud Lacombe lacom...@gmail.com wrote: Hi, On Tue, Jul 31, 2012 at 4:14 PM, Attilio Rao atti...@freebsd.org wrote: You don't want to work cooperatively. Why is it that mbuf's refactoring consultation is being held in internal, private, committers-and-invite-only-restricted meeting at BSDCan ? Why is it that so much review and discussion on changes are held privately ? Arnaud, belive me, to date I don't recall a single major technical decision that has been settled exclusively in private (not subjected to peer review) and in particular in person (e-mail help you focus on a lot of different details that you may not have under control when talking to people, etc). Whose call is it to declare something worth public discussion ? No one. Every time I see a Suggested by:, Submitted by:, Reported by:, and especially Approved by:, there should to be a public reference of the mentioned communication. Not necessarilly. Every developer must ensure to produce a quality work, with definition of quality being discretional. Some people fail this expectation, while others do very good. As a general rule, some people send patches to experts on the matter and they just trust their judgment, others also full-fill testing cycles by thirdy part people, others again ask for public reviews. Often this process is adapted based on the dimension of the work and the number of architectural changes it introduces. 
As a personal matter, for a big architectural change things I would *personally* do are: - Prepare a master-plan with experts of the matter - Post a plan (after having achived consensus) on the public mailing list for further discussions - Adjust the plan based on public feedbacks (if necessary) - Implement the plan - Ask the experts if they have objections to the implementation - Ask testers to perform some stress-testing - Ask testers to perform benchmark (if I find people interested in that) - Send out to the public mailing list for public review - Integrate suggestions - Ask testers to stress-test again - Commit I think this model in general works fairly well, but people may have different ideas on that, meaning that people may want to not involve thirdy part for testing or review. This is going to be risky and lower the quality of their work but it is their call. Sometimes it is useful that a limited number of developers is involved in initial brainstorming of some works, Never. but after this period constructive people usually ask for peer review publishing their plans on the mailing lists or other media. Again, never. By doing so, you merely put the community in a situation where, well, We, committers, have come with this, you can either accept or STFU, but no major changes will be made because we decided so. You are forgetting one specific detail: you can always review a work *after* it entered the tree. This is something you would never do, but sometimes, when poor quality code is committed there is nothing else you can do than just raise your concern after it is in. The callout-ng conference at BSDCan was just beautiful, it was basically: Speaker: we will do this Audience: how about this situation ? What you will do will not work. Speaker: thank you for listening, end of the conference It was beautiful to witness. Not sure if you realized but I was what you mention Audience. 
I think you are referring to a specific case where a quick heads-up on a Summer of Code project was presented; you cannot really believe that all the technical discussion around FreeBSD evolves this way. If you don't see any further public discussion, this may mean: a) the BSDCan meetings have been fruitless and there is no precise plan/roadmap/etc. so not only you make it private, but it shamelessly failed... And so? I think you have a wrong idea of what shameless means... I have been working on the same project for 6 months; I thought I could finish it in 1, but then I understood that in order to get the quality I was hoping for I had to do more work... does it qualify as failed, according to your standard? b) there is still no consensus on the details Then the discussion should stop, public records are kept for reference in the future. There is no problem with this. and you can always publicly ask what was decided and what was not. Just send a mail to the interested recipients and CC any FreeBSD mailing list. This is not the way openness should be about. There is not much more you can do when people don't share details and discussions automatically. Attilio -- Peace can only be achieved by understanding - A. Einstein
Re: On cooperative work [Was: Re: newbus' ivar's limitation..]
On 8/1/12, Arnaud Lacombe lacom...@gmail.com wrote: Hi, On Wed, Aug 1, 2012 at 2:18 PM, Attilio Rao atti...@freebsd.org wrote: [ trimm ] You are forgetting one specific detail: you can always review a work *after* it entered the tree. This is something you would never do, but sometimes, when poor quality code is committed there is nothing else you can do than just raise your concern after it is in. Unfortunately, not. First, the developer will certainly have moved on after the commit, API may have been changed tree-wide, etc. Then, time is likely to have passed between the time you figure potential regression or bugs, which makes the first point even more problematic. Finally, if my point of view is being ignored *before* it goes to the tree, do you really expect it will be considered *after* ? There is one thing you are not considering: committers are as powerless as non-committers in face of someone stuck on his own buggy ideas/implementations. Often people are just convinced their idea is better than your and they won't change their mind, regardeless their status in the opensource community. And there is nothing more you can do apart from learning how to deal with such situations. Granted, there are projects blowing up and people abbandoning successful opensource community because of this. From my external point of view, committers not only have the possibility, but *do* commit mess in the tree without non-committers could say or do anything, just as well as committers being able to arbitrarily close PR even if the original reporter disagree. You should look at svn-src@ more often I suspect. You will see how many discussions are in there. And so? I think you have a wrong point of view of what is shamelessly... I'm working on the same project since 6 months, i thought I could finish it in 1 but then I understood that in order to get the quality I was hoping I had to do more work... does it qualify as failed, according to your standard? 
not specifically, but at the end of that project of your, I would run a post-mortem meeting to see what went correctly and where things got out-of-control. Arnaud, my friend, I have a new for you: this stuff is hard. I see the brightest people I've ever met stuck for weeks on problems, thinking about how to fix them in elegant way. Sometimes things get understimated, sometimes not, this is just part of the game. But an important thing is accepting critics in costructive way and learn. This makes things much easier. As for the mbuf meeting, all I can find from it online is: http://lists.freebsd.org/pipermail/freebsd-arch/2012-June/012629.html which is not worth much... Rather than doing things internally, maybe it is time to open up... oh, wait, you will certainly come to the community with a design plan, figure out it's not gonna work while doing the implementation, change the design plan while implementing, go public with a +3k/-2k loc change patch, ask for benediction, go ahead and commit it up until someone comes with an obvious design flaw which would render the change pretty much useless, but there will be man-month of work to fix it, so it's never gonna be done. One obvious problem in FreeBSD is that committers are prosecutor, judge and jury altogether. That's not the first time I point this out. You are drammatizing. As I already told, please, if you are interested in this topic, ask for the state of the discussion and ask politely to be included from now on. Nobody will reject you only because you don't have a @freebsd.org. b) there is still not consensus on details Then the discussion should stop, public records are kept for reference in the future. There is no problem with this. and you can always publically asked on what was decided and what not. Just send a mail to interested recipients and CC any FreeBSD mailing list. This is not the way openness should be about. 
There is not much more you can do when people don't share details and discussions automatically. keep reporting regressions, interface flaws, POLA violations, ABI breakages, bugs, etc. I agree. But with the correct and humble mindset and avoiding aggressive behaviour. Attilio -- Peace can only be achieved by understanding - A. Einstein
Re: newbus' ivar's limitation..
On Tue, Jul 31, 2012 at 8:47 PM, Arnaud Lacombe lacom...@gmail.com wrote: Hi, On Tue, Jul 31, 2012 at 12:27 PM, Warner Losh i...@bsdimp.com wrote: On Jul 31, 2012, at 9:20 AM, Arnaud Lacombe wrote: Hi, On Mon, Jul 30, 2012 at 11:51 PM, Warner Losh i...@bsdimp.com wrote: [...] We lack that right now, which is why you're trying to shoe-horn the FDT connections into a newbus world and complaining that everything sucks because it is a poor fit. I'd suggest that different mechanisms are necessary. I'm not trying anything, or at least no longer. I do not see the point working on that when I can not get trivial patches in, especially those fixing poorly maintained drivers, whose issues _do_ affect people. Hey Arnaud, sorry to be a little harsh, but maybe if you shouted less and cooperated more, people would be more willing to work with you. I tried to be Mr Nice Guy in the past, all I got in return was being ignored. Lists are full of unanswered problem because people do not want to insist getting an answer. Now, believe me on one point, if you are a driver or subsystem author, might I have an issue with your work, I *will* be a recurring pain in your butt to get the issue fixed, or to get in what I do believe, with the limited set of knowledge and resources to my disposal[0], to be a correct fix for the issue, at the time. If you are condescending, arrogant, or advocates of the status-quo, as have been committers in the past, I will return you favor. Let face it, FreeBSD is not the most outstanding OS out there (despite obvious capabilities), and I would not be too proud of its state. All that to say that asking politely does not work in the arbitrary FreeBSD realm, where the power to serve, is today nothing more that a relic. 
Arnaud, the problem I see here is that, as always, you make strong and false claims about bugs and missing support in the FreeBSD kernel, but when people point out what you are missing or misunderstanding, you turn the whole thread into "FreeBSD is a relic" baby-whining, without replying with real technical arguments and simply ignoring e-mails from FreeBSD developers. I didn't see any response from you to several technical e-mails in this thread and others (please don't force me to open mailman and show exactly all the responses you have deliberately ignored), while you spit disrespectful, poisonous words at the developers of our project. You don't want to work cooperatively. You don't want to build something with the FreeBSD community. So why do you keep insisting on sending e-mails like this? Don't you think it would be more productive for you to stick with another project? (I have a couple of names in my head that may be a good fit for you...). I think it is very offensive that you mock our statement like that. For many people reading this e-mail it has a true meaning; people like you should really watch their mouth when speaking about FreeBSD. Attilio -- Peace can only be achieved by understanding - A. Einstein
Re: MPSAFE VFS -- List of upcoming actions
On 7/21/12, Antony Mawer li...@mawer.org wrote:
On Wed, Jul 18, 2012 at 6:45 PM, Attilio Rao atti...@freebsd.org wrote:
2012/7/18, Gustau Pérez i Querol gpe...@entel.upc.edu:

Sorry for the delay. About the ntfs support, I'd go with fuse and leave the most relevant filesystems in kernel space. In fact, filesystems that are not particularly specific and not tied to our kernel would go to userspace; things like smbfs, nwfs, ntfs, ext2 or ext4, for example, should be in userspace in my opinion (the list is incomplete and I don't really know if all of them are implemented in userspace yet). That would make them easier to maintain (changes in the kernel would only affect fuse; once that is fixed, all the userspace filesystems would work again). As a bonus, we would get many working filesystems based on fuse. On the server side gluster is a desirable thing; on the desktop, things like gvfs (in the Linux world gvfs is used not only by gnome but also by kde or xfce) or truecrypt

I'm really concerned also about ntfs and smbfs at the moment. It seems that there is also a FUSE smbfs port, but I never used it and I'm not sure about its state at all.

From what I understand, Apple have done a considerable amount of work on the FreeBSD-derived smbfs in the latest versions of OS X, based on the existing smbfs in tree:

I've also found that there are 2 FUSE modules for smbfs, but pho@ and flo@ still haven't tested them. It may make sense to do so before we commit FUSE to -CURRENT. However, there is a plan by a $COMPANY to work on the in-kernel version of smbfs and lock it before 10.0 is shipped. In the unlikely event this doesn't happen, we will come up with a different plan (assuming we adopt the FUSE module anyway, if it proves to work well).

http://www.opensource.apple.com/source/smb/smb-552.5/

I imagine things like the filesystem locking are probably somewhat different, but in terms of updating smbfs itself to support newer features it may be a good base (licensing permitting). smbfs at the moment is lacking in some areas such as DFS support, although I do not know if the OS X version is any different there (given the consumer focus of their OS, probably not).

There was also a version spun off by OpenSolaris: http://hub.opensolaris.org/bin/view/Project+smbfs/ which again was based on the FreeBSD + Apple versions.

I also have a vested interest in NWFS continuing to work, only from a legacy point of view, since we still interoperate with a number of Netware 6 servers through it. While those will likely eventually go away, more than likely before we move to 10.x, if there is anyone capable of working on it we could supply a test environment. Unfortunately the actual locking of the NWFS and NCP modules is outside my sphere of knowledge...

If you have NCP, do you think you could try this netncp change, which I never committed because of lack of testing? http://lists.freebsd.org/pipermail/freebsd-fs/2009-January/005617.html

IIRC, Apple does a similar thing for netsmb (which suffers from a similar problem as netncp). Do you know if FUSE can support NWFS in any way?

Starting to stress-test the current codebase for NWFS/NetNCP (and reporting the bugs found, preparing a list) could be a good way to start the locking effort. Interested developers can then look into such a list and provide the necessary insight.

Attilio
Re: MPSAFE VFS -- List of upcoming actions
2012/7/18, Gustau Pérez i Querol gpe...@entel.upc.edu:

Sorry for the delay. About the ntfs support, I'd go with fuse and leave the most relevant filesystems in kernel space. In fact, filesystems that are not particularly specific and not tied to our kernel would go to userspace; things like smbfs, nwfs, ntfs, ext2 or ext4, for example, should be in userspace in my opinion (the list is incomplete and I don't really know if all of them are implemented in userspace yet). That would make them easier to maintain (changes in the kernel would only affect fuse; once that is fixed, all the userspace filesystems would work again). As a bonus, we would get many working filesystems based on fuse. On the server side gluster is a desirable thing; on the desktop, things like gvfs (in the Linux world gvfs is used not only by gnome but also by kde or xfce) or truecrypt

I'm really concerned also about ntfs and smbfs at the moment. It seems that there is also a FUSE smbfs port, but I never used it and I'm not sure about its state at all.

I'm fixing low-hanging fruit for the moment (see r238411 for example) and I still have to make a thorough review. However, my idea is to commit the support once:
- ntfs-3g is well stress-tested and proves to be bug-free
- there is no major technical issue pending after the reviews

I'm now looking for people willing to stick with the branch and stress-test ntfs-3g as much as they can. For example, I know that Gustau (cc'ed) already had issues. It would be good if he tried to reproduce them and made a full report.

I've seen ntfs-3g+fuse crashing a few times, and IIRC most of the time the problem happened while unmounting the filesystem.

The FUSE module you were testing still has several bugs. You can try this patch: http://people.freebsd.org/~attilio/fuse_AW_DONE_ISDOTDOT_collision.patch on top of the FUSE branch I'm working on: svn://svn.freebsd.org/base/projects/fuse/ However, it still doesn't address the cloning races (I'm rewriting that part right now in order to take advantage of the devfs_*_devpriv() interface).

Thanks,
Attilio
Re: MPSAFE VFS -- List of upcoming actions
2012/7/4 Attilio Rao atti...@freebsd.org:
2012/6/29 Attilio Rao atti...@freebsd.org:
As already published several times, according to the following plan: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS

I still haven't heard from Vivien or Edward. Anyway, as NTFS is basically only used RO these days (the mount_ntfs code also only permits RO mounting), I stripped all the incomplete/bogus write support with the following patch: http://www.freebsd.org/~attilio/ntfs_remove_write.patch

This is an attempt to make the code smaller and possibly just focus on the locking that really matters (for a read-only filesystem). On some points of the patch I'm a bit less sure, as we could easily also take write into account for things like vaccess() arguments, making it easier to re-add correct write support at some point in the future while still forcing RO, even if the approach used in the patch is more correct IMHO. As an added bonus, this patch cleans up some dirty code in the mount operation and fixes a bug where vfs_mountedfrom() is called before the real mounting is completed and can still fail.

A quick update on this. It looks like NTFS won't be completed for this GSoC, so I seriously need to find an alternative in order not to lose NTFS support entirely.

I tried to look into the NTFS implementation right now and the support is really poor. As Peter has also verified, it can deadlock in no time, it completely violates VFS rules, etc. IMHO it deserves a complete rewrite if we are to keep supporting in-kernel NTFS. I also tried to look at the NetBSD implementation. Their code is somewhat similar to ours, but they used very complicated (and very dirty) code to do the locking. Even if I don't know the NetBSD VFS well enough, I have the impression that not all the races are correctly handled. Definitely not something I would like to port.

Considering all that, the only viable option would be a userland filesystem implementation. My preferred choice would be to import PUFFS and librefuse on top of it, but honestly it requires a lot of time to complete, time which I don't currently have, as in 2 months Giant must be gone from the VFS.

I then decided to switch to gnn's revamp of the FUSE patches. You can find his initial e-mail here: http://lists.freebsd.org/pipermail/freebsd-fs/2012-March/013876.html

I took the second version of George's patch and created this dolphin branch: svn://svn.freebsd.org/base/projects/fuse

I'm fixing low-hanging fruit for the moment (see r238411 for example) and I still have to make a thorough review. However, my idea is to commit the support once:
- ntfs-3g is well stress-tested and proves to be bug-free
- there is no major technical issue pending after the reviews

I'm now looking for people willing to stick with the branch and stress-test ntfs-3g as much as they can. For example, I know that Gustau (cc'ed) already had issues. It would be good if he tried to reproduce them and made a full report. Please stick with the code contained in this branch for the tests unless advised otherwise.

As a final note, George has agreed to maintain FUSE in the long term and of course I'll give him a hand as time permits.

Thanks,
Attilio
Re: MPSAFE VFS -- List of upcoming actions
2012/6/29 Attilio Rao atti...@freebsd.org:
As already published several times, according to the following plan: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS

I still haven't heard from Vivien or Edward. Anyway, as NTFS is basically only used RO these days (the mount_ntfs code also only permits RO mounting), I stripped all the incomplete/bogus write support with the following patch: http://www.freebsd.org/~attilio/ntfs_remove_write.patch

This is an attempt to make the code smaller and possibly just focus on the locking that really matters (for a read-only filesystem). On some points of the patch I'm a bit less sure, as we could easily also take write into account for things like vaccess() arguments, making it easier to re-add correct write support at some point in the future while still forcing RO, even if the approach used in the patch is more correct IMHO. As an added bonus, this patch cleans up some dirty code in the mount operation and fixes a bug where vfs_mountedfrom() is called before the real mounting is completed and can still fail.

The patch has been kindly tested by pho@. If no one has objections I will commit it Friday evening.

Thanks,
Attilio
Re: MPSAFE VFS -- List of upcoming actions
2012/7/2, Christoph Hellwig h...@infradead.org:
On Sun, Jul 01, 2012 at 03:52:05PM +0200, Attilio Rao wrote:
anything by SoC involved people about NTFS and certainly I don't see a plan to get XFS locked.

Stupid question, but what amount of locking does XFS in FreeBSD still need? I'm one of the maintainers of XFS on Linux, and while I know FreeBSD imported a really old version compared to the current one, the codebases on IRIX and later Linux never relied on any global Giant-style locking. So if there is anything to fix it would be in the small bits of FreeBSD-specific code.

Basically, when it comes to making a filesystem MPSAFE under FreeBSD there are 2 things to pay attention to: filesystem-specific locking and dealing with FreeBSD's VFS locking. The former is usually the tricky part because it varies among filesystems, and it is where the developers may have more knowledge. The latter can be helped by testing with a debugging kernel to catch the low-hanging fruit, but special attention must be paid to things like avoiding putting half-constructed vnodes on the mount lists, lookup races (in particular the DOTDOT case) and others. For reference, one can always look at simple filesystems that have already been made MPSAFE (MSDOS-FS, probably).

In the XFS case, I think it would be desirable to have real maintainership. This means, basically, not only working on the locking but really being keen to have a working XFS. At the moment we might still have write support as well, but it is badly broken. What I suggest for XFS is:
- Remove the current write support entirely
- Try to bring the read-only support up to a new(ish) XFS version (at least to a version that is known not to be totally broken)
- Fix up the support to work with the FreeBSD VFS

Thanks,
Attilio
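The message above names the DOTDOT lookup race without describing a fix. As a rough userland sketch, assuming a parent-before-child lock order, one common pattern is to try the parent lock without sleeping and, if that fails, to drop the child, take both locks in order, and revalidate the child. `Node`, `lookup_dotdot` and the two demo functions are hypothetical names for illustration, not FreeBSD VFS interfaces.

```python
import threading


class Node:
    """Toy stand-in for a vnode: a name, a parent link and one lock."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.lock = threading.Lock()
        self.doomed = False  # set when the node is reclaimed


def lookup_dotdot(child):
    """Lock child's parent; the caller holds child.lock on entry.

    Returns the parent with both locks held, or None (only child.lock
    held) when the child changed while unlocked and the caller must retry.
    """
    parent = child.parent
    # Fast path: taking the parent without sleeping cannot deadlock.
    if parent.lock.acquire(blocking=False):
        return parent
    # Slow path: drop the child so the locks are taken parent first.
    child.lock.release()
    parent.lock.acquire()
    child.lock.acquire()
    # The child was unlocked for a while, so revalidate it.
    if child.doomed or child.parent is not parent:
        parent.lock.release()
        return None
    return parent


def demo_fast_path():
    root = Node("/")
    child = Node("dir", root)
    child.lock.acquire()
    return lookup_dotdot(child) is root and root.lock.locked()


def demo_reclaimed_while_unlocked():
    root = Node("/")
    child = Node("dir", root)
    root.lock.acquire()    # someone else currently holds the parent
    child.lock.acquire()

    def other_thread():
        with child.lock:       # runs once lookup_dotdot drops the child
            child.doomed = True
        root.lock.release()    # only now can lookup_dotdot take the parent

    worker = threading.Thread(target=other_thread)
    worker.start()
    result = lookup_dotdot(child)
    worker.join()
    return result is None and not root.lock.locked() and child.lock.locked()
```

The second demo shows why the revalidation step matters: the child was reclaimed during the window in which it was unlocked, so the lookup has to be restarted rather than trusted.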
Re: MPSAFE VFS -- List of upcoming actions
2012/7/2, Russell Cattelan catte...@thebarn.com:
On 7/2/12 1:12 AM, Christoph Hellwig wrote:
On Sun, Jul 01, 2012 at 03:52:05PM +0200, Attilio Rao wrote:
anything by SoC involved people about NTFS and certainly I don't see a plan to get XFS locked.

Stupid question, but what amount of locking does XFS in FreeBSD still need? I'm one of the maintainers of XFS on Linux, and while I know FreeBSD imported a really old version compared to the current one, the codebases on IRIX and later Linux never relied on any global Giant-style locking. So if there is anything to fix it would be in the small bits of FreeBSD-specific code.

I would be curious as well. Since I'm one of the people who did the port of XFS to FreeBSD, I'm wondering what this whole MP initiative with regard to filesystems is about.

So if you think that XFS doesn't need to acquire Giant, why is it not yet marked MPSAFE? Also, did you ever actually test write support? (Not sure if it was added by you.)

Attilio
Re: MPSAFE VFS -- List of upcoming actions
2012/7/1 C. P. Ghost cpgh...@cordula.ws:
On Fri, Jun 29, 2012 at 10:32 PM, Attilio Rao atti...@freebsd.org wrote:
As already published several times, according to the following plan: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS in 2 months the code dealing with non-MPSAFE filesystems will be removed and filesystems that are not yet MPSAFE will be disconnected from the tree. Their code will, however, remain available in our official repository for a further 6 months. This leaves a total of 8 months to take action.

The current list of unmaintained filesystems is: HPFS, NWFS, PortalFS and XFS. Coda and SMBFS currently have maintainers, but the status of the work has still to be determined. NTFS is being worked on as part of the Summer of Code program. Finally, ReiserFS was successfully locked during this campaign.

Sorry if this has been discussed already, but... it's one thing to obsolete some code, it's a different thing to remove it entirely where there's no user-level replacement, especially since it would also remove the ability to access ancient media that some people may still have access to. Couldn't filesystems that are still not MPSAFE be kept in the tree in such a way that they at least provide read-only access in case of emergencies?

Unfortunately not. First of all, they are mostly READ-ONLY already, in particular XFS and NTFS. Second, it would mean leaving in place the Giant-VFS bloat that we necessarily must get rid of.

The most interesting ones in the list, IMHO, are still SMBFS and NTFS and possibly XFS (but this is really a personal preference). I've received an e-mail explaining that a company has arranged to put its developers to work on SMBFS locking after 1st September, which is certainly good news, but I still haven't heard anything from the people involved in SoC about NTFS, and I certainly don't see a plan to get XFS locked.

However, remember that at worst (for filesystems like PortalFS, for example) the code will remain in the Attic even after it gets purged, and it can then be revived and locked at any point in the future.

Attilio
MPSAFE VFS -- List of upcoming actions
As already published several times, according to the following plan: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS in 2 months the code dealing with non-MPSAFE filesystems will be removed and filesystems that are not yet MPSAFE will be disconnected from the tree. Their code will, however, remain available in our official repository for a further 6 months. This leaves a total of 8 months to take action.

The current list of unmaintained filesystems is: HPFS, NWFS, PortalFS and XFS. Coda and SMBFS currently have maintainers, but the status of the work has still to be determined. NTFS is being worked on as part of the Summer of Code program. Finally, ReiserFS was successfully locked during this campaign.

It is time for community members to step up and offer time and skills to lock a filesystem or to test the efforts of other developers willing to do so.

Thanks,
Attilio
Re: panic td-td_lock == NULL in scheduler(), csup'd 2011-02-19
2012/6/13, Svatopluk Kraus onw...@gmail.com:
Hi, it looks similar to http://lists.freebsd.org/pipermail/freebsd-current/2011-March/023829.html

Yes, that is likely the problem. However, I would really love to work around the pid allocation race in some other way than PRS_NEW, because PRS_NEW imposes an extra, undocumented constraint on every iteration over the processes in the system. If you want to work on a patch for that, you are welcome to do so.

Attilio
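The constraint mentioned above can be pictured as follows: any loop over the process list has to remember to skip entries that are still in the new state, because their per-thread fields may not be set up yet. This is a minimal model under that reading; `ProcState`, `Proc` and `fully_constructed` are illustrative names, not the kernel's structures.

```python
from dataclasses import dataclass
from enum import Enum


class ProcState(Enum):
    NEW = 0       # still being constructed; fields may be unset
    NORMAL = 1
    ZOMBIE = 2


@dataclass
class Proc:
    pid: int
    state: ProcState
    td_lock: object = None  # stays None until construction finishes


def fully_constructed(allproc):
    """Yield only the processes that are safe to inspect."""
    for proc in allproc:
        if proc.state is ProcState.NEW:
            continue  # the rule every iteration has to remember
        yield proc


ALLPROC = [
    Proc(1, ProcState.NORMAL, td_lock="sched lock"),
    Proc(2, ProcState.NEW),  # half-constructed entry already on the list
    Proc(3, ProcState.ZOMBIE, td_lock="sched lock"),
]
SAFE_PIDS = [proc.pid for proc in fully_constructed(ALLPROC)]
```

A loop that forgets the check would hand the half-constructed entry (pid 2, with `td_lock` still `None`) to its caller, which is the kind of situation the thread subject describes.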
Re: new panic in cpu_reset() with WITNESS
2012/5/17, Andriy Gapon a...@freebsd.org:
on 25/01/2012 23:52 Andriy Gapon said the following:
on 24/01/2012 14:32 Gleb Smirnoff said the following:
Yes, now:

Rebooting...
lock order reversal:
 1st 0x80937140 smp rendezvous (smp rendezvous) @ /usr/src/head/sys/kern/kern_shutdown.c:542
 2nd 0xfe0001f5d838 uart_hwmtx (uart_hwmtx) @ /usr/src/head/sys/dev/uart/uart_cpu.h:92
panic: mtx_lock_spin: recursed on non-recursive mutex cnputs_mtx @ /usr/src/head/sys/kern/kern_cons.c:500

OK, so it's just a plain LOR between smp rendezvous and uart_hwmtx, no new details... It's still intriguing to me why the LOR *doesn't* happen [*] with stop_scheduler_on_panic=0. But I don't see a productive way to pursue this investigation further.

Salve Glebius! After your recent nudging I took yet another look at this issue and it seems that I have some findings. For those who might get interested, here is a convenience reference to the whole thread on gmane: http://thread.gmane.org/gmane.os.freebsd.current/139307

A short summary. Prerequisites: an SMP x86 system, its kernel is configured with WITNESS && !WITNESS_SKIPSPIN (this is important), and the system uses a serial console via uart. Then, if stop_scheduler_on_panic is set to zero, the system can be rebooted without a problem. On the other hand, if stop_scheduler_on_panic is enabled, then the system first runs into a LOR when calling printf() in cpu_reset() and then it runs into a panic when printf is recursively called from witness(9) to report the LOR. The panic happens because of the recursion on cnputs_mtx, which should ensure that cnputs() output is not intermingled and which is not flagged to support recursion.

There are two things about this report that greatly confused and puzzled me:

1. The stop_scheduler_on_panic variable is used _only_ in panic(9). So how could it be possible that changing its value affects the behavior of the system when panic(9) is not called?!

2. The LOR in question happens between the smp rendezvous (smp_ipi_mtx) and uart_hwmtx (sc_hwmtx_s in uart core) spin locks. The order of these locks is actually predefined in witness order_lists[] such that uart_hwmtx must come before smp_ipi_mtx. But in the reboot path we first take smp_ipi_mtx in shutdown_reset(), then we call cpu_reset, then it calls printf, and from there we get to uart_hwmtx for serial console output. So the order between these spin locks is always violated in the x86 SMP reboot path. How come witness(9) doesn't _always_ detect this LOR? How come it didn't detect this LOR before any stop scheduler commits?!

[Spoiler alert :-)] Turns out that the two puzzles above are closely related.

Let's first list all the things that change when stop_scheduler_on_panic is enabled and a panic happens:
- other CPUs are stopped (forced to spin)
- interrupts on the current CPU are disabled
- by virtue of the above, the current thread should be the only thread running (unless it executes a voluntary switch)
- all locks are busted; they are completely ignored / bypassed
- by virtue of the above, no lock invariants and witness checks are performed, so no lock order checking, no recursion checking, etc.

So, what I observe is this: when stop_scheduler_on_panic is disabled, the LOR is actually detected as it should be. witness(9) works properly here. Once the LOR is detected, witness(9) wants to report it using printf(9). That's where we run into the cnputs_mtx recursion panic. It's all exactly as with stop_scheduler_on_panic enabled. Then panic(9) wants to report the panic using printf(9), which goes to cnputs() again, where _mtx_lock_spin_flags() detects lock recursion again (this is independent of witness(9)) and calls panic(9) again. Then panic(9) wants to report the panic using printf(9)... I assume that when the stack is exhausted we run into a double fault and dblfault_handler wants to print something again... Likely we eventually run into a triple fault which resets the machine. Although, I recall some old reports of machines hanging during reboot in a place suspiciously close to where the described ordeal happens... But if the machine doesn't hang, we get the full appearance of the reset successfully happening (modulo some last messages missing).

With stop_scheduler_on_panic enabled and all the locks being ignored we, of course, do not run into any secondary lock recursions and resulting panics. So the system is able to at least panic gracefully (still, we should have just reported the LOR gracefully; no panic is required).

Some obvious conclusions:
- the smp rendezvous and uart_hwmtx LOR is real and it is the true cause of the problem; it should be fixed one way or another, either by correcting witness order_lists[] or by avoiding the LOR in shutdown_reset/cpu_reset;
- because witness(9) uses printf(9) to report problems, it is very fragile to use witness with any locks that can be
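The chain of events described above can be replayed in a toy userland model. This is only a sketch of the reasoning in the message, not the kernel's code path: the class, the depth limit standing in for stack exhaustion, and the message strings are all assumptions, and unlike the real log the model's lock-order report never reaches the console before the recursion is hit (it is recorded in a separate `events` list instead).

```python
class Halted(Exception):
    """panic(9) printed its message and stopped the machine."""


class TripleFault(Exception):
    """Nested panics exhausted the stack; the machine silently resets."""


class ToyKernel:
    # Assumed predefined order: uart_hwmtx must be taken before smp rendezvous.
    ORDER = ("uart_hwmtx", "smp rendezvous")
    MAX_PANIC_DEPTH = 8  # arbitrary stand-in for running out of stack

    def __init__(self, stop_scheduler_on_panic):
        self.stop_scheduler_on_panic = stop_scheduler_on_panic
        self.scheduler_stopped = False
        self.held = []      # spin locks held, in acquisition order
        self.events = []    # what the order checker noticed
        self.console = []   # lines that actually reached the console
        self.panic_depth = 0

    def lock(self, name):
        if self.scheduler_stopped:
            return  # locks are bypassed, so nothing is checked
        if name in self.held:
            self.panic("recursed on non-recursive mutex " + name)
        self.check_order(name)
        self.held.append(name)

    def unlock(self, name):
        if not self.scheduler_stopped:
            self.held.remove(name)

    def check_order(self, name):
        if name not in self.ORDER:
            return
        for other in self.held:
            if other in self.ORDER and self.ORDER.index(name) < self.ORDER.index(other):
                self.events.append("LOR: %s then %s" % (other, name))
                self.printf("lock order reversal")  # the report uses printf too

    def printf(self, text):
        self.lock("cnputs_mtx")   # keeps console output from interleaving
        self.lock("uart_hwmtx")   # serial console hardware lock
        self.console.append(text)
        self.unlock("uart_hwmtx")
        self.unlock("cnputs_mtx")

    def panic(self, msg):
        self.panic_depth += 1
        if self.panic_depth > self.MAX_PANIC_DEPTH:
            raise TripleFault()
        if self.stop_scheduler_on_panic:
            self.scheduler_stopped = True
        self.printf("panic: " + msg)
        raise Halted(msg)

    def reboot(self):
        self.lock("smp rendezvous")              # as in shutdown_reset()
        self.printf("message from cpu_reset()")  # placeholder text


def run(stop_scheduler_on_panic):
    kernel = ToyKernel(stop_scheduler_on_panic)
    try:
        kernel.reboot()
        outcome = "rebooted"
    except Halted:
        outcome = "panic reported"
    except TripleFault:
        outcome = "silent reset"
    return outcome, kernel.events, kernel.console
```

In this model both settings detect the same reversal. With the knob enabled, the first panic stops the scheduler and its message is printed; with it disabled, every panic re-enters the console path, recurses on the console lock again, and the run ends in a reset with nothing on the console, which looks like a clean reboot from the outside.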
Re: wdog_kern_pat: liberate from SW_WATCHDOG
2012/5/16, Andriy Gapon a...@freebsd.org:
I would like to commit something like the following patch. I think that in-kernel watchdog patting during crash dump is useful with hardware watchdogs too. The code seems to work fine here. In fact, I am not sure why wdog_kern_pat was originally tied to SW_WATCHDOG.

I don't think I tested this with any hw watchdog. Which one are you using for tests?

BTW, can you please skip adding the blank lines?

Attilio
Re: wdog_kern_pat: liberate from SW_WATCHDOG
2012/5/16, Andriy Gapon a...@freebsd.org:
on 16/05/2012 15:37 Attilio Rao said the following:
2012/5/16, Andriy Gapon a...@freebsd.org:
I would like to commit something like the following patch. I think that in-kernel watchdog patting during crash dump is useful with hardware watchdogs too. The code seems to work fine here. In fact, I am not sure why wdog_kern_pat was originally tied to SW_WATCHDOG.

I don't think I tested this with any hw watchdog. Which one are you using for tests?

amdsbwd and ichwd

BTW, can you please skip adding the blank lines?

I thought that those calls were significant enough to be emphasized by spacing.

I don't think this is the right thing to do style-wise. But it is up to you.

Attilio
Re: fast bcopy...
2012/5/3, Steven Atreju snatr...@googlemail.com:
K. Macy wrote [2012-05-03 02:58+0200]:
It's highly chipset and processor dependent what works best.

Yes, of course. Though i was kinda, even shocked, once i've seen this first: http://marc.info/?l=dragonfly-commits&m=132241713812022&w=2

So we don't use our assembler version for new gccs and HAMMER or SSE3+ (the decision for these was rather arbitrary, except that they already existed for an instant implementation).

Intel now has non-temporal loads and stores which work much better in some cases but provide little benefit in others.

Yes, our 2002 tests have shown that these were *extremely* dependent upon alignment. (Note: 2002. o-) Hmm, it doesn't really matter, but i guess this is a good time to thank the FreeBSD hackers for that FPU stack FILD/FISTP idea! I'll append the copy related notes of our doc/memperf.txt.

Thanks,

Years ago I made an implementation of FPU unwinding and MMX copy (reimplementing bcopy, memcpy, etc.) to see if they really made a difference. What really mattered with the hardware available at that time (Pentium 4) was alignment and the use of non-temporal operations on heavily contended cache lines. In a few words, it is more important that we engineer the buffer layout than the functions themselves.

Attilio
Re: Disabling an arbitrary device
On 20 April 2012 19:18, Arnaud Lacombe lacom...@gmail.com wrote:
Hi,
On Fri, Apr 20, 2012 at 2:16 PM, Arnaud Lacombe lacom...@gmail.com wrote:
Hi,
I will be bringing up an old thread here, but it would seem the situation has not evolved in the past 9 years. I have a machine running 7.1 whose UHCI controller is generating an interrupt storm:

# vmstat -i
interrupt                          total       rate
irq4: sio0                          1328          2
irq19: uhci1+                   63514509      96380
[...]

generating useless load on one CPU:

# top -SH
last pid: 5223; load averages: 0.00, 0.00, 0.00 up 0+00:17:21 13:10:35
117 processes: 14 running, 79 sleeping, 24 waiting
CPU: 0.2% user, 0.0% nice, 0.2% system, 6.6% interrupt, 93.0% idle
Mem: 33M Active, 9348K Inact, 67M Wired, 400K Cache, 29M Buf, 2892M Free
[...]
57 root -64 - 0K 8K CPU0 0 11:59 86.57% irq19: uhci1+

I thought I could use a hint to forbid uhci(4) attachment, a la:

hint.uhci.0.disabled=1
hint.uhci.1.disabled=1

in /boot/loader.conf. However, it would seem that what should be usable with any arbitrary device, i.e. be an integral part of device(9), has to be hardcoded in every driver. Sad.

As for the usual "if you're not happy, please provide a patch": https://github.com/lacombar/freebsd/commit/30786d09b0cb441890cdc749ee5243238e81d2d8

I think a generalized kludge should really belong in device_probe_child() rather than device_add_child_ordered(). When you have a tested patch against -CURRENT, can you please send it to freebsd-newbus@?

BTW, IIRC 7.x doesn't have the new USB stack, which means that you could easily have hit a driver bug there. Did you consider switching to at least an 8.x kernel?

Thanks,
Attilio
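The hint lines quoted above follow the `hint.<driver>.<unit>.<keyword>=<value>` form. As a minimal userland sketch of the generic behaviour being asked for (honouring "disabled" once, before any driver's own probe runs, rather than in each driver), assuming hypothetical helper names `parse_hints`, `is_disabled` and `probe_child` that are not kernel APIs:

```python
def parse_hints(text):
    """Parse hint.<driver>.<unit>.<keyword>=<value> lines into a dict."""
    hints = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        key, sep, value = line.partition("=")
        parts = key.strip().split(".")
        if not sep or len(parts) != 4 or parts[0] != "hint" or not parts[2].isdigit():
            continue  # not a per-unit hint line
        hints[(parts[1], int(parts[2]), parts[3])] = value.strip().strip('"')
    return hints


def is_disabled(hints, driver, unit):
    """True when the unit carries a non-zero "disabled" hint."""
    return hints.get((driver, unit, "disabled"), "0") != "0"


def probe_child(hints, driver, unit, probe):
    """Generic pre-probe check: the driver's own probe never sees the hint."""
    if is_disabled(hints, driver, unit):
        return False
    return probe()


LOADER_CONF = """
hint.uhci.0.disabled=1
hint.uhci.1.disabled="1"
"""
HINTS = parse_hints(LOADER_CONF)
```

With the check in one generic place, a driver that knows nothing about the hint (here any `probe` callable) is still skipped for units 0 and 1, while other units probe normally.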
Re: build with KTR and KTR_CPUMASK
Please read NOTES.

Attilio

On 13 April 2012 14:37, Aleksandr Rybalko r...@dlink.ua wrote:
Hi,
When the kernel is built with option KTR_CPUMASK, the build fails with the following error:

cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. -I/usr/1/MIPS_FreeBSD/HEAD/head/sys -I/usr/1/MIPS_FreeBSD/HEAD/head/sys/contrib/altq -I/usr/1/MIPS_FreeBSD/HEAD/head/sys/contrib/libfdt -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=1 --param large-function-growth=10 --param max-inline-insns-single=1 -fno-pic -mno-abicalls -G0 -DKERNLOADADDR=0x80001000 -march=mips32 -msoft-float -ffreestanding -Werror /usr/1/MIPS_FreeBSD/HEAD/head/sys/kern/kern_ktr.c
cc1: warnings being treated as errors
/usr/1/MIPS_FreeBSD/HEAD/head/sys/kern/kern_ktr.c: In function 'ktr_cpumask_initializer':
/usr/1/MIPS_FreeBSD/HEAD/head/sys/kern/kern_ktr.c:112: warning: passing argument 2 of 'cpusetobj_strscan' makes pointer from integer without a cast
*** Error code 1

SVN revision 234222.

WBW
--
Alexandr Rybalko r...@dlink.ua
aka Alex RAY r...@ddteam.net
Re: [RFT][patch] Scheduling for HTT and not only
On 05 April 2012 19:12, Arnaud Lacombe lacom...@gmail.com wrote:
Hi,
[Sorry for the delay, I got a bit sidetrack'ed...]
2012/2/17 Alexander Motin m...@freebsd.org:
On 17.02.2012 18:53, Arnaud Lacombe wrote:
On Fri, Feb 17, 2012 at 11:29 AM, Alexander Motin m...@freebsd.org wrote:
On 02/15/12 21:54, Jeff Roberson wrote:
On Wed, 15 Feb 2012, Alexander Motin wrote:

I've decided to stop those cache black magic practices and focus on things that really exist in this world -- SMT and CPU load. I've dropped most of the cache-related things from the patch and made the rest more strict and predictable: http://people.freebsd.org/~mav/sched.htt34.patch

This looks great. I think there is value in considering the other approach further but I would like to do this part first. It would be nice to also add priority as a greater influence in the load balancing as well.

I haven't got a good idea yet about balancing priorities, but I've rewritten the balancer itself. Since sched_lowest() / sched_highest() are more intelligent now, they allowed me to remove topology traversal from the balancer itself. That should fix the double-swapping problem, allow some affinity to be kept while moving threads, and make balancing fairer.

I did a number of tests running 4, 8, 9 and 16 CPU-bound threads on 8 CPUs. With 4, 8 and 16 threads everything is stationary, as it should be. With 9 threads I see regular and random load moving between all 8 CPUs. Measurements on a 5-minute run show a deviation of only about 5 seconds. It is the same deviation as I see caused by just scheduling 16 threads on 8 cores without any balancing needed at all. So I believe this code works as it should.

Here is the patch: http://people.freebsd.org/~mav/sched.htt40.patch

I plan this to be the final patch of this series (more to come :)) and if there are no problems or objections, I am going to commit it (except some debugging KTRs) in about ten days. So now it's a good time for reviews and testing. :)

is there a place where all the patches are available ?

All my scheduler patches are cumulative, so all you need is the last one mentioned here, sched.htt40.patch.

You may want to have a look at the results I collected in the `runs/freebsd-experiments' branch of: https://github.com/lacombar/hackbench/ and compare them with the vanilla FreeBSD 9.0 and -CURRENT results available in `runs/freebsd'. On the dual package platform, your patch is not a definite win.

But in some cases, especially for multi-socket systems, to let it show its best, you may want to apply an additional patch from avg@ to better detect the CPU topology: https://gitorious.org/~avg/freebsd/avgbsd/commit/6bca4a2e4854ea3fc275946a023db65c483cb9dd

The test I conducted specifically for this patch did not show much improvement...

Can you please clarify this point? Did the tests you ran compare cases where the topology was detected badly against cases where it was detected correctly with a patched kernel (and you still didn't see a performance improvement), in terms of cache line sharing?

Attilio
Re: [RFT][patch] Scheduling for HTT and not only
On 6 April 2012 at 15:27, Alexander Motin m...@freebsd.org wrote: On 04/06/12 17:13, Attilio Rao wrote: On 5 April 2012 at 19:12, Arnaud Lacombe lacom...@gmail.com wrote:
Hi, [Sorry for the delay, I got a bit sidetracked...]
2012/2/17 Alexander Motin m...@freebsd.org: On 17.02.2012 18:53, Arnaud Lacombe wrote: On Fri, Feb 17, 2012 at 11:29 AM, Alexander Motin m...@freebsd.org wrote: On 02/15/12 21:54, Jeff Roberson wrote: On Wed, 15 Feb 2012, Alexander Motin wrote:
I've decided to stop those cache black magic practices and focus on things that really exist in this world -- SMT and CPU load. I've dropped most of the cache-related things from the patch and made the rest more strict and predictable: http://people.freebsd.org/~mav/sched.htt34.patch
This looks great. I think there is value in considering the other approach further, but I would like to do this part first. It would be nice to also add priority as a greater influence in the load balancing.
I haven't got a good idea yet about balancing priorities, but I've rewritten the balancer itself. Since sched_lowest() / sched_highest() are more intelligent now, they allowed me to remove topology traversal from the balancer itself. That should fix the double-swapping problem, allow keeping some affinity while moving threads, and make balancing fairer.
I did a number of tests running 4, 8, 9 and 16 CPU-bound threads on 8 CPUs. With 4, 8 and 16 threads everything is stationary, as it should be. With 9 threads I see regular and random load movement between all 8 CPUs. Measurements on a 5-minute run show a deviation of only about 5 seconds. It is the same deviation I see caused by merely scheduling 16 threads on 8 cores without any balancing needed at all. So I believe this code works as it should.
Here is the patch: http://people.freebsd.org/~mav/sched.htt40.patch
I plan this to be the final patch of this series (more to come :)) and, if there are no problems or objections, I am going to commit it (except some debugging KTRs) in about ten days. So now is a good time for reviews and testing.
:) is there a place where all the patches are available?
All my scheduler patches are cumulative, so all you need is the last one mentioned here, sched.htt40.patch.
You may want to have a look at the results I collected in the `runs/freebsd-experiments' branch of https://github.com/lacombar/hackbench/ and compare them with the vanilla FreeBSD 9.0 and -CURRENT results available in `runs/freebsd'. On the dual package platform, your patch is not a definite win.
But in some cases, especially for multi-socket systems, to let it show its best, you may want to apply an additional patch from avg@ to better detect CPU topology: https://gitorious.org/~avg/freebsd/avgbsd/commit/6bca4a2e4854ea3fc275946a023db65c483cb9dd
The test I conducted specifically for this patch did not show much improvement...
Can you please clarify this point? Did the test you ran compare cases where the topology was detected badly against cases where it was detected correctly on a patched kernel (and you still didn't see a performance improvement), in terms of cache line sharing?
At this moment SCHED_ULE does almost nothing in terms of cache line sharing affinity (though it is probably worth some further experiments). What this patch may improve is the opposite case -- reducing cache sharing pressure for cache-hungry applications. For example, proper cache topology detection (such as the lack of a global L3 cache, but a shared L2 per pair of cores on Core2Quad class CPUs) increases pbzip2 performance when the number of threads is less than the number of CPUs (i.e. when there is room for optimization).
My question was not really about your patch. I just wanted to know whether he correctly benchmarked a case where the topology was first detected wrongly and then recognized correctly with avg's patch, in terms of cache level aggregation.
Attilio
-- Peace can only be achieved by understanding - A. Einstein
Re: Scheduler + IPC performance on FreeBSD 7.4, 8.2, 9.0 and -CURRENT
On 5 April 2012 at 19:03, Arnaud Lacombe lacom...@gmail.com wrote:
Hi folks,
Over the past months, I ran on a couple of unused boxes the `hackbench'[HACKBENCH] benchmark used by the Linux folks for tracking down various kinds of regressions/improvements. `hackbench' is a scheduler + IPC test (socket xor pipe). It creates producer/consumer groups and lets a variable quantity of small messages flow happily. Producers and consumers are either processes xor threads.
Tested platforms were:
- Atom D510, Intel, (incomplete)
- Core 2 Quad Q9560, Intel
- Soekris net5501, AMD (incomplete)
- Xeon E5645, Intel (incomplete)
- Xeon E5620 (dual package), Intel
- Xeon E5-1650 (pending completion)
- Vortex86, DMP
Tested kernels were:
- FreeBSD 7.4-RELEASE
- FreeBSD 8.2-RELEASE
- FreeBSD 9.0-RC3 and FreeBSD 9.0-RELEASE
- FreeBSD 10-CURRENT as of r231573
Which means you ran 10-CURRENT with all the kernel debugging options on and MALLOC_DEBUG on?
Attilio
-- Peace can only be achieved by understanding - A. Einstein
Re: Scheduler + IPC performance on FreeBSD 7.4, 8.2, 9.0 and -CURRENT
On 6 April 2012 at 18:54, Arnaud Lacombe lacom...@gmail.com wrote:
Hi,
On Fri, Apr 6, 2012 at 10:58 AM, Attilio Rao atti...@freebsd.org wrote: On 5 April 2012 at 19:03, Arnaud Lacombe lacom...@gmail.com wrote:
Hi folks,
Over the past months, I ran on a couple of unused boxes the `hackbench'[HACKBENCH] benchmark used by the Linux folks for tracking down various kinds of regressions/improvements. `hackbench' is a scheduler + IPC test (socket xor pipe). It creates producer/consumer groups and lets a variable quantity of small messages flow happily. Producers and consumers are either processes xor threads.
Tested platforms were:
- Atom D510, Intel, (incomplete)
- Core 2 Quad Q9560, Intel
- Soekris net5501, AMD (incomplete)
- Xeon E5645, Intel (incomplete)
- Xeon E5620 (dual package), Intel
- Xeon E5-1650 (pending completion)
- Vortex86, DMP
Tested kernels were:
- FreeBSD 7.4-RELEASE
- FreeBSD 8.2-RELEASE
- FreeBSD 9.0-RC3 and FreeBSD 9.0-RELEASE
- FreeBSD 10-CURRENT as of r231573
Which means you ran 10-CURRENT with all the kernel debugging options on and MALLOC_DEBUG on?
I already answered that question. Namely: note: rule [I] is alleviated for -CURRENT kernels, which were built with the same alteration made to GENERIC during the CURRENT-RELEASE transition (i.e. WITNESS and a couple of other options disabled). This translates into the following patch (for amd64):
Did you enable MALLOC_PRODUCTION and rebuild libc?
Attilio
-- Peace can only be achieved by understanding - A. Einstein
Re: svn commit: r232619 - in head: . sys/amd64/conf sys/arm/conf sys/i386/conf sys/ia64/conf sys/mips/conf sys/pc98/conf sys/powerpc/conf sys/sparc64/conf
2012/3/6, Attilio Rao atti...@freebsd.org:
Author: attilio
Date: Tue Mar 6 20:01:25 2012
New Revision: 232619
URL: http://svn.freebsd.org/changeset/base/232619
Log: Disable the option VFS_ALLOW_NONMPSAFE by default on all the supported platforms. This will forbid every attempt to mount a non-mpsafe filesystem, unless the kernel is expressly compiled with the VFS_ALLOW_NONMPSAFE option.
This is just a gentle reminder to point you once more to the official page: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS and to mention that the time for removing non-mpsafe filesystems is approaching. In 6 months we will disconnect the non-mpsafe filesystems from the tree and remove the whole non-mpsafe handling infrastructure in the VFS and the buffer cache, so please think about stepping up and converting your favourite filesystem.
Thanks,
Attilio
-- Peace can only be achieved by understanding - A. Einstein
Re: fifo_listen: fchmod public/pickup: Invalid argument with postfix on today's current
On 25 February 2012 at 07:15, Doug Barton do...@freebsd.org wrote: On 02/24/2012 21:00, Doug Barton wrote:
I'm on today's -current (r232126) and I'm getting the error in the subject when trying to start postfix. I recompiled 2.9, and then tried 2.8; both give the same error.
Did you also rebuild world?
Attilio
-- Peace can only be achieved by understanding - A. Einstein
Re: [panic] intr_event_execute_handlers() - Corrupted DWARF expression
2012/1/19 John Baldwin j...@freebsd.org: On Thursday, January 19, 2012 11:02:57 am Glen Barber wrote: On Thu, Jan 19, 2012 at 10:50:45AM -0500, John Baldwin wrote: On Wednesday, January 18, 2012 5:01:37 pm Glen Barber wrote:
Hi, I'm running -CURRENT from about 5 days ago:
nucleus# uname -a
FreeBSD nucleus 10.0-CURRENT FreeBSD 10.0-CURRENT #3 r230037M: Fri Jan 13 17:48:14 EST 2012 gjb@nucleus:/usr/obj/usr/src/sys/NUCLEUS amd64
(The 'M' is kib's DRM patches for Intel GPU.) So far, I haven't had much of a problem with this laptop, but the machine just panicked. I have kgdb output attached, and I'll be happy to provide whatever additional information may be needed. I have core.txt.N available here: http://people.freebsd.org/~gjb/core.txt
In kgdb, can you go to frame 6 and 'p td->td_lock'? If that is non-null, can you do 'p *td->td_lock'?
Sure, script(1) output is attached.
Hmm, I don't think td->td_lock is ever supposed to be NULL.
No, never: it is initialized in sched_fork_thread() and can point to the container's lock or to blocked_lock. I think the memory page holding the pointer could have been zeroed, or it is a hardware bug.
Attilio
-- Peace can only be achieved by understanding - A. Einstein
Re: svn commit: r227333 - in head: . sys/amd64/conf sys/arm/conf sys/conf sys/i386/conf sys/ia64/conf sys/kern sys/mips/conf sys/pc98/conf sys/powerpc/conf sys/sparc64/conf
2011/11/8 Attilio Rao atti...@freebsd.org: 2011/11/8 Attilio Rao atti...@freebsd.org:
Author: attilio
Date: Tue Nov 8 10:18:07 2011
New Revision: 227333
URL: http://svn.freebsd.org/changeset/base/227333
Log: Introduce the option VFS_ALLOW_NONMPSAFE and turn it on by default on all the architectures. The option allows mounting non-MPSAFE filesystems. Without it, the kernel will refuse to mount a non-MPSAFE filesystem. This patch is part of the effort of killing non-MPSAFE filesystems from the tree.
This is just a gentle reminder to point you once more to the official page: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS and to encourage people once again to adopt a dying FS if it really matters to them. So far, unfortunately, I haven't seen a lot of activity in this area, but I hope that this will change soon.
This is a further reminder. So far I've not seen any improvement in the locking of any of our 'legacy' filesystems. I remind you that this may mean disconnecting them from the tree on 1st September 2012, according to this road-map: http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS
In one month I'm going to disable VFS_ALLOW_NONMPSAFE by default in order to see how well users do with this option off. At present this means that from 1st March you won't be able to mount smbfs or ntfs volumes, for example. Please step up and fix your favourite filesystem.
Thanks,
Attilio
-- Peace can only be achieved by understanding - A. Einstein
Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662
2011/12/20 John Baldwin j...@freebsd.org: On Saturday, December 17, 2011 10:41:15 pm m...@freebsd.org wrote: On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev kab...@gmail.com wrote: On Sun, 18 Dec 2011 01:09:00 +0100 O. Hartmann ohart...@zedat.fu-berlin.de wrote:
Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
panic: sleeping thread
cpuid = 0
PID 16 is always USB on my box.
You really need to give us a backtrace when you quote panics. It is impossible to make any sense of the above panic message without more context.
In the case of this panic, the stack of the thread which panics is useless; it's someone trying to propagate priority that discovered it. A backtrace on tid 100033 would be useful. With WITNESS enabled, it's possible to have this panic display the stack of the incorrectly sleeping thread at the time it acquired the lock, as well, but this code isn't in CURRENT or any release. I have a patch at $WORK I can dig up on Monday.
Huh? The stock kernel dumps a stack trace of the offending thread if you have DDB enabled:
    /*
     * If the thread is asleep, then we are probably about
     * to deadlock. To make debugging this easier, just
     * panic and tell the user which thread misbehaved so
     * they can hopefully get a stack trace from the truly
     * misbehaving thread.
     */
    if (TD_IS_SLEEPING(td)) {
        printf(
    "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
            td->td_tid, td->td_proc->p_pid);
    #ifdef DDB
        db_trace_thread(td, -1);
    #endif
        panic("sleeping thread");
    }
It may be that we can make use of the STACK API here instead to output this trace even when DDB isn't enabled. The patch below tries to do that (untested). It does some odd things though, since it is effectively running from a panic context already, so it uses a statically allocated 'struct stack' rather than using stack_create(), and uses stack_print_ddb() since it is holding spin locks and can't possibly grab an sx lock:
Index: subr_turnstile.c
===================================================================
--- subr_turnstile.c (revision 228534)
+++ subr_turnstile.c (working copy)
@@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$");
 #include <sys/proc.h>
 #include <sys/queue.h>
 #include <sys/sched.h>
+#include <sys/stack.h>
 #include <sys/sysctl.h>
 #include <sys/turnstile.h>
@@ -175,6 +176,7 @@ static void turnstile_fini(void *mem, int size);
 static void
 propagate_priority(struct thread *td)
 {
+    static struct stack st;
     struct turnstile *ts;
     int pri;
@@ -217,8 +219,10 @@ propagate_priority(struct thread *td)
             printf(
     "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
                 td->td_tid, td->td_proc->p_pid);
-#ifdef DDB
-            db_trace_thread(td, -1);
+#ifdef STACK
+            stack_zero(&st);
+            stack_save_td(&st, td);
+            stack_print_ddb(&st);
 #endif
             panic("sleeping thread");
         }
--
I'm not sure it is a wise idea to trim the DDB part, because DDB is much more commonly enabled than STACK. Note that stack(9) also works if you define DDB, so I'd say do it for both. Also, I don't think you need the stack_zero() before saving into it.
While we are here, however, I have a question for Robert: do you think we should support the _ddb() variants even when DDB is not enabled in the kernel? Probably the way it is nowadays, it is easier to have stack(9) defined for both DDB and STACK, but in general I wouldn't advertise that.
Attilio
-- Peace can only be achieved by understanding - A. Einstein
Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662
2011/12/20 John Baldwin j...@freebsd.org: On Tuesday, December 20, 2011 9:20:09 am Attilio Rao wrote: 2011/12/20 John Baldwin j...@freebsd.org: On Saturday, December 17, 2011 10:41:15 pm m...@freebsd.org wrote: On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev kab...@gmail.com wrote: On Sun, 18 Dec 2011 01:09:00 +0100 O. Hartmann ohart...@zedat.fu-berlin.de wrote:
Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
panic: sleeping thread
cpuid = 0
PID 16 is always USB on my box.
You really need to give us a backtrace when you quote panics. It is impossible to make any sense of the above panic message without more context.
In the case of this panic, the stack of the thread which panics is useless; it's someone trying to propagate priority that discovered it. A backtrace on tid 100033 would be useful. With WITNESS enabled, it's possible to have this panic display the stack of the incorrectly sleeping thread at the time it acquired the lock, as well, but this code isn't in CURRENT or any release. I have a patch at $WORK I can dig up on Monday.
Huh? The stock kernel dumps a stack trace of the offending thread if you have DDB enabled:
    /*
     * If the thread is asleep, then we are probably about
     * to deadlock. To make debugging this easier, just
     * panic and tell the user which thread misbehaved so
     * they can hopefully get a stack trace from the truly
     * misbehaving thread.
     */
    if (TD_IS_SLEEPING(td)) {
        printf(
    "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
            td->td_tid, td->td_proc->p_pid);
    #ifdef DDB
        db_trace_thread(td, -1);
    #endif
        panic("sleeping thread");
    }
It may be that we can make use of the STACK API here instead to output this trace even when DDB isn't enabled. The patch below tries to do that (untested). It does some odd things though, since it is effectively running from a panic context already, so it uses a statically allocated 'struct stack' rather than using stack_create(), and uses stack_print_ddb() since it is holding spin locks and can't possibly grab an sx lock:
Index: subr_turnstile.c
===================================================================
--- subr_turnstile.c (revision 228534)
+++ subr_turnstile.c (working copy)
@@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$");
 #include <sys/proc.h>
 #include <sys/queue.h>
 #include <sys/sched.h>
+#include <sys/stack.h>
 #include <sys/sysctl.h>
 #include <sys/turnstile.h>
@@ -175,6 +176,7 @@ static void turnstile_fini(void *mem, int size);
 static void
 propagate_priority(struct thread *td)
 {
+    static struct stack st;
     struct turnstile *ts;
     int pri;
@@ -217,8 +219,10 @@ propagate_priority(struct thread *td)
             printf(
     "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
                 td->td_tid, td->td_proc->p_pid);
-#ifdef DDB
-            db_trace_thread(td, -1);
+#ifdef STACK
+            stack_zero(&st);
+            stack_save_td(&st, td);
+            stack_print_ddb(&st);
 #endif
             panic("sleeping thread");
         }
--
I'm not sure it is a wise idea to trim the DDB part, because DDB is much more commonly enabled than STACK. Note that stack(9) also works if you define DDB, so I'd say do it for both. Also, I don't think you need the stack_zero() before saving into it.
Err, STACK is enabled in GENERIC in release kernels but DDB is not, so I think STACK is the more common one. As far as stack_zero(), I was just being paranoid.
And what is the point of not writing the #ifdef STACK as #if defined(STACK) || defined(DDB)?
While we are here, however, I have a question for Robert: do you think we should support the _ddb() variants even when DDB is not enabled in the kernel? Probably the way it is nowadays, it is easier to have stack(9) defined for both DDB and STACK, but in general I wouldn't advertise that.
The _ddb variants are always enabled by my reading. They just use different entry points into the linker that don't use locking.
My question is different: why do we define them anyway, even when DDB is not enabled?
Attilio
-- Peace can only be achieved by understanding - A. Einstein
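The combined guard suggested in this message, #if defined(STACK) || defined(DDB), can be illustrated outside the kernel. The following is only a user-space sketch under stated assumptions: report_sleeping_owner() and print_trace_stub() are hypothetical stand-ins (not kernel functions), and STACK is defined by hand to mimic a kernel configured with STACK but without DDB.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: mimic a kernel built with "options STACK" but without DDB. */
#define STACK 1

/* Hypothetical stub standing in for the kernel's stack-printing call. */
static int
print_trace_stub(int tid)
{
    printf("(stack trace of tid %d would be printed here)\n", tid);
    return (1);
}

/*
 * Report a sleeping lock owner.
 * Returns 1 if a trace was emitted, 0 if neither option is compiled in.
 */
int
report_sleeping_owner(int tid, int pid)
{
    int traced = 0;

    assert(tid >= 0 && pid >= 0);
    printf("Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
        tid, pid);
#if defined(STACK) || defined(DDB)
    /* Compiled in when either option is present, as proposed above. */
    traced = print_trace_stub(tid);
#endif
    return (traced);
}
```

With this guard the trace path is kept whenever either option is configured, so a DDB-only kernel does not lose the backtrace it prints today.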
Re: Benchmark (Phoronix): FreeBSD 9.0-RC2 vs. Oracle Linux 6.1 Server
2011/12/16 Arnaud Lacombe lacom...@gmail.com:
Hi,
On Thu, Dec 15, 2011 at 2:32 AM, O. Hartmann ohart...@zedat.fu-berlin.de wrote:
Just saw this short benchmark on Phoronix dot com today: http://www.phoronix.com/scan.php?page=news_item&px=MTAyNzA
It might be worth highlighting that although Oracle Linux 6.1 Server is using a kernel + compiler almost 2 years old, it still manages to out-perform the bleeding edge FreeBSD :-) Now, from what I've read so far in this thread, it seems that a lot of people are still in abnegation... my 0.2c, - Arnaud
Coming from someone who really thinks passing __FILE__ and __LINE__ to a kernel function is going to give a measurable performance penalty, this is really hilarious however :) It is crystal clear you really don't understand how to make reliable benchmarks (and likely you don't really have a grasp of today's machine contention points), so why do you keep talking about it? It would be more valuable for you and whatever project you follow if you spent your time coding and doing real benchmarking.
Attilio
-- Peace can only be achieved by understanding - A. Einstein
Re: SCHED_ULE should not be the default
2011/12/14 Mike Tancsa m...@sentex.net: On 12/13/2011 7:01 PM, m...@freebsd.org wrote:
Has anyone experiencing problems tried to set sysctl kern.sched.steal_thresh=1 ? I don't remember what our specific problem at $WORK was, perhaps it was just interrupt threads not getting serviced fast enough, but we've hard-coded this to 1 and removed the code that sets it in sched_initticks(). The same effect should be had by setting the sysctl after a box is up.
FWIW, this does impact the performance of pbzip2 on an i7. Using a 1.1G file:
pbzip2 -v -c big > /dev/null
with burnP6 running in the background, sysctl kern.sched.steal_thresh=1 (x) vs sysctl kern.sched.steal_thresh=3 (+):
     N        Min        Max     Median        Avg      Stddev
x   10  38.005022   38.42238  38.194648  38.165052  0.15546188
+    9  38.695417  40.595544  39.392127  39.435384  0.59814114
Difference at 95.0% confidence
    1.27033 +/- 0.412636
    3.32852% +/- 1.08119%
    (Student's t, pooled s = 0.425627)
A value of 1 is *slightly* faster.
Hi Mike, was that the same codebase, only switching between SCHED_4BSD and SCHED_ULE? Also, the results here are in the 3% range for the average case, which is not yet at the 'alarm level' but could still be an indication. I still suspect I/O plays a big role here, however, so the result could be determined by other factors. Could you retry the benchmark while checking CPU usage and possible thread migration in both cases?
Thanks,
Attilio
-- Peace can only be achieved by understanding - A. Einstein
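The arithmetic behind the ministat summary quoted in this message can be rechecked with a short sketch. This is not ministat's source; it only applies the usual pooled two-sample Student's t formulas to the N, Avg and Stddev columns. The function names are made up for illustration, the critical value 2.110 (two-sided 95%, 17 degrees of freedom) is a textbook table value assumed here, and a small Newton square root is used so the sketch builds without libm.

```c
#include <assert.h>

/* Newton's method square root, kept local so the sketch needs no libm. */
static double
sqrt_newton(double x)
{
    double r = x > 1.0 ? x : 1.0;
    int i;

    assert(x >= 0.0);
    if (x == 0.0)
        return (0.0);
    for (i = 0; i < 60; i++)
        r = 0.5 * (r + x / r);
    return (r);
}

/* Pooled standard deviation of two samples, from their sizes and stddevs. */
double
pooled_stddev(int n1, double s1, int n2, double s2)
{
    assert(n1 > 1 && n2 > 1);
    return (sqrt_newton(((n1 - 1) * s1 * s1 + (n2 - 1) * s2 * s2) /
        (n1 + n2 - 2)));
}

/* Half-width of the confidence interval for the difference of two means. */
double
diff_halfwidth(int n1, int n2, double pooled, double t_crit)
{
    assert(n1 > 0 && n2 > 0);
    return (t_crit * pooled * sqrt_newton(1.0 / n1 + 1.0 / n2));
}

/* Difference of means as a percentage of the baseline mean. */
double
percent_diff(double base_avg, double other_avg)
{
    assert(base_avg > 0.0);
    return (100.0 * (other_avg - base_avg) / base_avg);
}
```

Fed with the two rows above (N = 10 and 9), these reproduce the pooled s of about 0.4256, the half-width of about 0.4126 and the difference of about 3.33% reported in the summary.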
Re: SCHED_ULE should not be the default
2011/12/13 Jeremy Chadwick free...@jdc.parodius.com: On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote:
Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...]
Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case.
Within our department, we developed highly scalable code for planetary science purposes on imagery. It utilizes GPUs via OpenCL if present. Otherwise it grabs as many cores as it can. By the end of this year I'll get a new desktop box based on Intel's new Sandy Bridge-E architecture with plenty of memory. If the colleague who developed the code is willing to perform some benchmarks on the same hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most recent Suse. For FreeBSD I also intend to look at performance with both available schedulers.
This is in no way, shape or form the same kind of benchmark as what you're planning to do, but I thought I'd throw it out there for folks to take in as they see fit. I know folks were focused mainly on buildworld. I personally would find it interesting if someone with a higher-end system (e.g. 2 physical CPUs, with 6 or 8 cores per CPU) were to do the same test (changing -jX to -j{numofcores} of course).
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |
sched_ule
===
- time make -j2 buildworld
  1689.831u 229.328s 18:46.20 170.4% 6566+2051k 432+4264io 4565pf+0w
- time make -j2 buildkernel
  640.542u 87.737s 9:01.38 134.5% 6490+1920k 134+5968io 0pf+0w
sched_4bsd
===
- time make -j2 buildworld
  1662.793u 206.908s 17:12.02 181.1% 6578+2054k 23750+4271io 6451pf+0w
- time make -j2 buildkernel
  638.717u 76.146s 8:34.90 138.8% 6530+1927k 6415+5903io 0pf+0w
software
==
* sched_ule test: FreeBSD 8.2-STABLE, Thu Dec 1 04:37:29 PST 2011
* sched_4bsd test: FreeBSD 8.2-STABLE, Mon Dec 12 22:42:54 PST 2011
Hi Jeremy, thanks for the time you spent on this. However, I wanted to ask about, or point out, 3 things:
1) Did you use 2 different code bases for the test? (one updated on December 1 and another one on December 12)
2) Please note that you should repeat this test several times (basically until you get a standard deviation that is acceptable with ministat) and report the ministat output
3) The difference is less than 2%, which I suspect is statistically insignificant, i.e. the same
I'm not really even surprised that ULE is not faster than 4BSD in this case, because buildworld/buildkernel tests are usually dominated by I/O overhead rather than scheduler capacity. It would be more interesting to analyze how buildworld does while another type of workload is running.
Thanks,
Attilio
-- Peace can only be achieved by understanding - A. Einstein
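To compare the time(1) lines quoted in this message, the minutes:seconds elapsed field first has to be turned into seconds. A minimal sketch follows, with hypothetical helper names, and with the caveat that it operates on the single runs shown above, which is exactly the limitation raised in point 2.

```c
#include <assert.h>

/* Convert a csh-style "minutes:seconds" elapsed time to seconds. */
double
elapsed_seconds(int minutes, double seconds)
{
    assert(minutes >= 0 && seconds >= 0.0 && seconds < 60.0);
    return (minutes * 60.0 + seconds);
}

/* Relative difference of `other` against `base`, in percent. */
double
relative_diff_percent(double base, double other)
{
    assert(base > 0.0);
    return (100.0 * (other - base) / base);
}
```

Applied to the buildworld rows, the user CPU times (1689.831u vs 1662.793u) differ by roughly 1.6%, while the elapsed times (18:46.20 vs 17:12.02) differ by roughly 9.1%. The thread does not say which column the "less than 2%" remark refers to, and with one run per scheduler neither figure carries a confidence interval.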
Re: SCHED_ULE should not be the default
2011/12/15 Mike Tancsa m...@sentex.net: On 12/15/2011 11:26 AM, Attilio Rao wrote:
Hi Mike, was that the same codebase, only switching between SCHED_4BSD and SCHED_ULE?
Hi Attilio, it was the same codebase.
Could you retry the benchmark while checking CPU usage and possible thread migration in both cases?
I can, but how do I do that?
I'm now thinking of a better test case for this: can you try that on a tmpfs volume? Also, what filesystem were you using? How many CPUs were in place? Did you reboot before changing the steal_thresh value?
Attilio
-- Peace can only be achieved by understanding - A. Einstein
Re: SCHED_ULE should not be the default
2011/12/15 Mike Tancsa m...@sentex.net: On 12/15/2011 11:42 AM, Attilio Rao wrote:
I'm now thinking of a better test case for this: can you try that on a tmpfs volume?
There is enough RAM in the box so that it should not touch the disk, and I was sending the output to /dev/null, so it was not writing to the disk.
Also, what filesystem were you using?
UFS
How many CPUs were in place?
4
Did you reboot before changing the steal_thresh value?
No.
So, as the very first thing, can you try the following:
- Same codebase, etc. etc.
- Run the test 4 times, discard the first run and ministat the other 3
- Reboot
- Change the steal_thresh value
- Run the test 4 times, discard the first run and ministat the other 3
Then report the discarded values and the ministat output, and we will have more information, I guess (also, I don't think devfs contention should play a role here, so never mind about it for now).
Thanks,
Attilio
-- Peace can only be achieved by understanding - A. Einstein
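The "run 4 times, discard the first" procedure requested in this message amounts to treating the first run as a warm-up and summarizing only the remaining ones. A minimal sketch of that bookkeeping is below; the function names are hypothetical and any timings passed in are the caller's own measurements, not data from this thread. ministat(1) remains the tool actually asked for.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of runs[1..n-1]; runs[0] is treated as a warm-up and ignored. */
double
mean_after_warmup(const double *runs, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(runs != NULL && n >= 2);
    for (i = 1; i < n; i++)
        sum += runs[i];
    return (sum / (double)(n - 1));
}

/* Sample variance (m - 1 denominator) of the m = n - 1 retained runs. */
double
variance_after_warmup(const double *runs, size_t n)
{
    double mean, acc = 0.0;
    size_t i;

    assert(runs != NULL && n >= 3);
    mean = mean_after_warmup(runs, n);
    for (i = 1; i < n; i++)
        acc += (runs[i] - mean) * (runs[i] - mean);
    return (acc / (double)(n - 2));
}
```

Keeping the discarded value around, as the message asks, is still useful: a first run that is much slower than the rest hints at cold caches rather than at the scheduler.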
Re: SCHED_ULE should not be the default
2011/12/15 Jeremy Chadwick free...@jdc.parodius.com: On Thu, Dec 15, 2011 at 05:26:27PM +0100, Attilio Rao wrote: 2011/12/13 Jeremy Chadwick free...@jdc.parodius.com: On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote: Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance then SCHED_4BSD. ??[...] Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned, that SCHED_ULE has better performance on boxes with a ncpu 2. But in the end I see here contradictionary statements. People complain about poor performance (especially in scientific environments), and other give contra not being the case. Within our department, we developed a highly scalable code for planetary science purposes on imagery. It utilizes present GPUs via OpenCL if present. Otherwise it grabs as many cores as it can. By the end of this year I'll get a new desktop box based on Intels new Sandy Bridge-E architecture with plenty of memory. If the colleague who developed the code is willing performing some benchmarks on the same hardware platform, we'll benchmark bot FreeBSD 9.0/10.0 and the most recent Suse. For FreeBSD I intent also to look for performance with both different schedulers available. This is in no way shape or form the same kind of benchmark as what you're planning to do, but I thought I'd throw it out there for folks to take in as they see fit. I know folks were focused mainly on buildworld. I personally would find it interesting if someone with a higher-end system (e.g. 2 physical CPUs, with 6 or 8 cores per CPU) was to do the same test (changing -jX to -j{numofcores} of course). -- | Jeremy Chadwick ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??jdc at parodius.com | | Parodius Networking ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? http://www.parodius.com/ | | UNIX Systems Administrator ?? ?? ?? ?? ?? ?? ?? ?? ?? 
Mountain View, CA, US | | Making life hard for others since 1977. ?? ?? ?? ?? ?? ?? ?? PGP 4BD6C0CB | sched_ule === - time make -j2 buildworld ??1689.831u 229.328s 18:46.20 170.4% 6566+2051k 432+4264io 4565pf+0w - time make -j2 buildkernel ??640.542u 87.737s 9:01.38 134.5% 6490+1920k 134+5968io 0pf+0w sched_4bsd - time make -j2 buildworld ??1662.793u 206.908s 17:12.02 181.1% 6578+2054k 23750+4271io 6451pf+0w - time make -j2 buildkernel ??638.717u 76.146s 8:34.90 138.8% 6530+1927k 6415+5903io 0pf+0w software == * sched_ule test: ??FreeBSD 8.2-STABLE, Thu Dec ??1 04:37:29 PST 2011 * sched_4bsd test: FreeBSD 8.2-STABLE, Mon Dec 12 22:42:54 PST 2011 Hi Jeremy, thanks for the time you spent on this. However, I wanted to ask/let you note 3 things: 1) Did you use 2 different code base for the test? (one updated on December 1 and another one on December 12) No; src-all (/usr/src on this system) was not updated between December 1st and December 12th PST. I do believe I updated it today (15th PST). I can/will obviously hold off so that we have a consistent code base for comparing numbers between schedulers during buildworld and/or buildkernel. 2) Please note that you should have repeated this test several times (basically until you don't get a standard deviation which is acceptable with ministat) and report the ministat output This is the first time I have heard of ministat(1). I'm pretty sure I see what it's for and how it applies to this situation, but boy that man page could use some clarification (I have 3 people looking at this thing right now trying to figure out what means what in the graph :-) ). Anyway, graph or not, I see the point. Regarding multiple tests: yup, you're absolutely right, the only way to do it would be to run a sequence of tests repeatedly (probably 10 per scheduler). Reboots and rm -fr /usr/obj/* would be required after each test too, to guarantee empty kernel caches (of all types) consistently every time. 
> What I posted was supposed to give people just a general idea if there was any gigantic difference between the two, and there really isn't. But, as others have stated (and you below), buildworld may not be an effective way to benchmark what we're trying to test.
>
> Hence me wondering exactly what would make for a good test. Example:
>
> 1. Run + background some program that beats on things (I really don't know what; creation/deletion of threads? CPU benchmark? bonnie++?), with output going to /dev/null.
> 2. Run + background "time make -j2 buildworld" with output going to /dev/null
> 3. Record/save output from "time".
> 4. rm -fr /usr/obj && shutdown -r now
> 5. Repeat all steps ~10 times
> 6. Adjust kernel configuration file to use other scheduler
> 7. Repeat steps 1-5.
>
> What I'm trying to figure out is what #1 and #2 should
Re: SCHED_ULE should not be the default
2011/12/15 Mike Tancsa m...@sentex.net:
> On 12/15/2011 11:56 AM, Attilio Rao wrote:
> > So, as the very first thing, can you try the following:
> > - Same codebase, etc. etc.
> > - Run the test 4 times, discard the first run and ministat the other 3
> > - Reboot
> > - Change the steal_thresh value
> > - Run the test 4 times, discard the first run and ministat the other 3
> >
> > Then report the discarded values and the ministat output and we will have more information, I guess (also, I don't think devfs contention should play a role here, so never mind about it for now).
>
> Results and data at http://www.tancsa.com/ule-bsd.html

I'm not totally sure: what does burnP6 do? Is it a CPU-bound workload? Also, how many threads are spawned in your case for the parallel bzip2?

Also, it would be very good if you could arrange these tests against a newer -CURRENT (with userland and kernel debugging off).

Thanks a lot for your hard work,
Attilio
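As a rough illustration of the "run the test 4 times, discard the first" procedure requested above, the following C sketch summarizes a series of wall-clock timings while skipping the warm-up runs. It is not a replacement for ministat(1); the type `run_stats` and the function `summarize_runs` are names invented for this sketch, and it reports the sample variance (the square of the standard deviation that ministat prints) to avoid depending on the math library.

```c
#include <assert.h>
#include <stddef.h>

/* Summary of a series of timing samples (hypothetical helper type). */
struct run_stats {
	size_t n;		/* number of samples actually used */
	double mean;		/* arithmetic mean of the used samples */
	double variance;	/* sample variance (n - 1 denominator) */
};

/*
 * Summarize timings in seconds, ignoring the first `discard` entries,
 * which are treated as warm-up runs (cold caches, first-time I/O).
 */
struct run_stats
summarize_runs(const double *secs, size_t count, size_t discard)
{
	struct run_stats rs = { 0, 0.0, 0.0 };
	double sum = 0.0, sq = 0.0;
	size_t i;

	assert(secs != NULL && count > discard);
	rs.n = count - discard;
	for (i = discard; i < count; i++)
		sum += secs[i];
	rs.mean = sum / (double)rs.n;
	/* Second pass: accumulate squared deviations from the mean. */
	for (i = discard; i < count; i++)
		sq += (secs[i] - rs.mean) * (secs[i] - rs.mean);
	if (rs.n > 1)
		rs.variance = sq / (double)(rs.n - 1);
	return (rs);
}
```

Calling `summarize_runs(times, 4, 1)` once per scheduler (or per steal_thresh value) gives two summaries that can be compared side by side; whether a difference is significant is still a question for ministat.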
Re: Stop scheduler on panic
2011/12/7 Andriy Gapon a...@freebsd.org:
> on 07/12/2011 00:11 Attilio Rao said the following:
> > I'd just change this check on panicstr:
> >
> > @@ -606,9 +603,13 @@ kdb_trap(int type, int code, struct trapframe *tf)
> >  	intr = intr_disable();
> >  #ifdef SMP
> > -	other_cpus = all_cpus;
> > -	CPU_CLR(PCPU_GET(cpuid), &other_cpus);
> > -	stop_cpus_hard(other_cpus);
> > +	if (panicstr == NULL) {
> > +		other_cpus = all_cpus;
> > +		CPU_CLR(PCPU_GET(cpuid), &other_cpus);
> > +		stop_cpus_hard(other_cpus);
> > +		did_stop_cpus = 1;
> > +	} else
> > +		did_stop_cpus = 0;
> >
> > to be SCHEDULER_STOPPED().
>
> Makes sense. I will do this.
>
> > If you agree I can fix the kern_mutex, kern_sx and kern_rwlock parts and it should be done.
>
> Since I am not very familiar with the details of that code, I can not be against such a proposal :-) What Kostik did seemed quite reasonable to me, but if that can be further improved, then I am all for it.

The following patch is a further add-on to Kostik's:
http://www.freebsd.org/~attilio/scheduler_stopped.patch

- Rework of mutex, rwlock and sxlock to deal correctly with the hard and fast paths
- Protection of the LOCK_PROFILING bits (also missed in my review)
- Protection of WITNESS_SAVE/RESTORE because of Giant handling (also missed in my review)
- Removal of gratuitous white lines
- Usage of SCHEDULER_STOPPED() in the kdb check

What do you think about it? I just test-compiled it with several combinations of LOCK_PROFILING and LOCK_DEBUG, but I didn't change the bulk of it, so it should be perfectly fine. If you like it I'd say go for the commit asap.

I wonder if someone has tried to simulate a livelock and panic and thus verify that stoppcbs is correctly populated as expected (to be honest, this is one of the features I'm most interested in for this patch).

Thanks,
Attilio
Re: Stop scheduler on panic
2011/12/2 Andriy Gapon a...@freebsd.org:
> on 02/12/2011 20:40 John Baldwin said the following:
> > On 12/2/11 12:18 PM, Attilio Rao wrote:
> > > 2011/12/2 John Baldwin j...@freebsd.org:
> > > > On 12/2/11 5:05 AM, Andriy Gapon wrote:
> > > > > on 02/12/2011 06:36 John Baldwin said the following:
> > > > > > Ah, ok (I had thought SCHEDULER_STOPPED was going to always be true when kdb was active). But I think these two changes should cover critical_exit() ok.
> > > > >
> > > > > I attempted to start a discussion about this a few times already :-) Should we treat kdb context the same as SCHEDULER_STOPPED context (in the current definition)? That is, skip all locks in the same fashion? There are pros and cons.
> > > >
> > > > kdb should not block on locks, no. Most debugger commands should not go near locks anyway unless they are intended to carefully modify the existing system in a safe manner (such as the 'kill' command which should only be using try locks and fail if it cannot safely post the signal).
> > >
> > > The biggest problem with treating KDB the same as panic is that doing a proper 'continue' is impossible. One of the features of the 'skip-locking' path is that it doesn't take into account fast locking paths, where the lock sometimes succeeds and sometimes fails, and you don't know which. Also, the restarted CPUs can find corrupted data (as it can be arbitrarily updated); I'm sure it is too panic-prone.
> >
> > Yes, my thought is that kdb commands, etc. should be using dedicated routines that do not use locks whenever possible. The problem of a user calling an arbitrary routine is not solvable (so I don't think we should try to solve that, you use 'call' at your own risk), but built-in commands should explicitly either 1) not use locking, or 2) only use try locks and fail out cleanly (including dropping any try locks acquired) if a try fails. Now, that's an ideal view, I don't know how close we are to that in practice or if it is a realistically attainable goal.
>
> I agree with what Attilio and you say.
> Initially it was tempting for me to apply the same SCHEDULER_STOPPED medicine to the kdb_active context, but after trying to deal with the kdb_active x SCHEDULER_STOPPED x ukbd situation I really changed my mind.
>
> I would classify the code that can be called in kdb_active context as follows:
> o debugger code proper (kdb, ddb, gdb stub, etc) - this obviously must not (doesn't have to) use any locking
> o code that can be invoked via 'call' command - this is essentially any code and I don't think that it can/should do anything special for the kdb_active context [*]
> o debugger helper routines - those that do something trivial should not acquire any locks; those that access shared resources should try the relevant locks and bail out if a resource can be in inconsistent state, or should be equipped to deal correctly with such a state; this is the same as what you say above
> o common code that the debuggers have to use - most obviously this is console code and drivers that serve a particular console; on one hand those drivers can have a non-trivial state that must be lock protected during normal operation, on the other hand the debugger must disregard those locks and grab its console; this is the most complex case in my opinion.

Thanks for summarizing this. However, please note that the code in entries 2 and 4 may have the same issues, or be the same thing, in practice.

Anyway, I'm thinking now that if we really want to stop CPUs when entering KDB (and I think we do), the functions in entries 2 and 4 should basically be totally lockless, or in general totally re-entrant, because when we restart the CPUs we don't want them to find corrupted data. Skipping locking is totally broken for this very reason. Functions in entries 2 and 4 should therefore be totally lockless and possibly work in read-only mode.
For point 2 specifically, I think we need an explicit KPI for defining such functions within the subsystems themselves (something like DB_SHOW_COMMAND()), which unmistakably marks functions as callable from within ddb (the only KDB backend we implement right now). For the functions at point 4 we likely need to find a way to stress that they belong to the KDB area.

Attilio
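The "only use try locks and fail out cleanly (including dropping any try locks acquired)" rule discussed above can be sketched in plain userland C. This is only an analogy: `toy_lock`, `dbg_with_locks` and `dbg_count_action` are hypothetical names built on C11 `atomic_flag`, not FreeBSD kernel primitives, and a real debugger helper would use the kernel's own try-lock operations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy try-lock standing in for a kernel mutex; not a FreeBSD type. */
struct toy_lock {
	atomic_flag held;
};

static bool
toy_trylock(struct toy_lock *l)
{
	/* test_and_set returns the previous value: false means we got it. */
	return (!atomic_flag_test_and_set_explicit(&l->held,
	    memory_order_acquire));
}

static void
toy_unlock(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Example action used to observe whether the helper ran. */
void
dbg_count_action(void *arg)
{
	(*(int *)arg)++;
}

/*
 * Hypothetical debugger helper: acquire every lock with a try
 * operation; if any attempt fails, release what was taken so far and
 * report failure instead of blocking.  Returns true if `action` ran.
 */
bool
dbg_with_locks(struct toy_lock **locks, size_t n, void (*action)(void *),
    void *arg)
{
	size_t i;

	assert(locks != NULL && action != NULL);
	for (i = 0; i < n; i++) {
		if (!toy_trylock(locks[i])) {
			/* Back out: drop locks[0..i-1] in reverse order. */
			while (i > 0)
				toy_unlock(locks[--i]);
			return (false);
		}
	}
	action(arg);
	while (n > 0)
		toy_unlock(locks[--n]);
	return (true);
}
```

The point of the pattern is that the helper never waits: a lock held by a stopped CPU simply makes the command fail, and the system is left exactly as it was found.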
Re: Stop scheduler on panic
2011/12/4 Andriy Gapon a...@freebsd.org:
> on 21/11/2011 18:58 Attilio Rao said the following:
> > I would be very much in favor of having a 'thread trampoline for KDB', so that it can use locks.
>
> I keep hearing the suggestion to add this trampoline, but I admit that I do not understand its technical meaning in this context. And also how it helps with the locking. So I will appreciate an explanation! Thanks!

kdb_trap() currently runs in interrupt context. My suggestion was just to give KDB its own context (a new kernel thread) and yield execution to it when KDB needs to be entered; this way it is possible to use locking and avoid duplicating functions. In theory, this avoids constructing complicated lockless algorithms when implementing the primitives KDB should use.

However, I now realize a problem: if we want to stop CPUs we don't really want to acquire locks anyway, because of the CPU restart. Likely, the KDB trampoline is not a good option for this reason and we should work instead on getting the KDB functions to be totally lockless.

Another thing I'm considering, however, is the entry point for KDB. When I looked into it months ago I thought there was a mismatch between kdb_enter() (which should disable CPUs) and other ways to enter KDB (maybe some paths calling kdb_trap() directly?). We must offer a unified policy and entry point, most likely one that stops the other CPUs on entry.

Attilio
Re: Stop scheduler on panic
2011/12/4 Andriy Gapon a...@freebsd.org:
> on 02/12/2011 19:18 Attilio Rao said the following:
> > BTW, I'm waiting for the details to settle (including the patch we have been discussing internally about binding to CPU0 during ACPI shutdown)
>
> I do not see strong interdependency between that patch and the panic patch. BTW, I think that your patch is good to go.

I agree, we can get back to this once the stop_scheduler patch is in.

Attilio
Re: Stop scheduler on panic
2011/11/13 Kostik Belousov kostik...@gmail.com:
> I was tricked into finishing the work by Andrey Gapon, who developed the patch to reliably stop other processors on panic. The patch greatly improves the chances of getting a dump on panic on an SMP host. Several people already saw the patchset, and I remember that Andrey posted it to some lists.
>
> The change stops other (*) processors early upon the panic. This way, no parallel manipulation of the kernel memory is performed by CPUs. In particular, the kernel memory map is static. The patch prevents the panic thread from blocking and switching out.
>
> * - in the context of the description, other means not current.
>
> Since other threads are not run anymore, a lock owner cannot release a lock which is required by the panic thread. Due to this, we need to fake a lock acquisition after the panic, which adds minimal overhead to the locking cost. The patch tries to not add any overhead on the fast path of the lock acquire. The check for the after-panic condition was reduced to a single memory access, done only when the quick cas lock attempt failed, and braced with the __unlikely compiler hint.
>
> For now, the new mode of operation is disabled by default, since some further USB changes are needed to make a USB keyboard usable in that environment.
>
> With the patch, getting a dump from the machine without a debugger compiled in is much more realistic. Please comment, I will commit the change in 2 weeks unless strong reasons not to are given.
>
> http://people.freebsd.org/~kib/misc/stop_cpus_on_panic.1.patch

Below are my notes on the patch:

- Just by looking at the kern_mutex.c chunk, I see that you have added SCHEDULER_STOPPED() checks in very specific places, usually after the sanity checks by WITNESS (and similar) and sometimes in odd places (such as the chunk involving _mtx_lock_spin_flags). I think that we should just skip all the checks along with the hard locking operation.
Ideally we should also skip the fast path, IMHO, but that is impossible (without polluting it); we can at least skip the vast majority of operations for the hard one, so that we get, for example:

%svn diff -x -p kern/kern_mutex.c | less
Index: kern/kern_mutex.c
===================================================================
--- kern/kern_mutex.c (revision 228308)
+++ kern/kern_mutex.c (working copy)
@@ -232,6 +232,8 @@ void
 _mtx_lock_spin_flags(struct mtx *m, int opts, const char *file, int line)
 {
+	if (SCHEDULER_STOPPED())
+		return;
 	MPASS(curthread != NULL);
 	KASSERT(m->mtx_lock != MTX_DESTROYED,
 	    ("mtx_lock_spin() of destroyed mutex @ %s:%d", file, line));

In this light I'd patch the hard functions directly rather than waiting for them to reach the smallest possible common point (which is _mtx_lock_sleep() and _mtx_lock_spin()). That will make the patch more verbose but also more precise and more correct.

- This chunk is unneeded now:

@@ -577,6 +589,7 @@ retry:
 				m->mtx_recurse++;
 				break;
 			}
+			lock_profile_obtain_lock_failed(&m->lock_object, &contested, &waittime);
 			/* Give interrupts a chance while we spin. */

- I'm not entirely sure why we want to disable interrupts at this moment (before stopping the other CPUs):

@@ -547,13 +555,18 @@ panic(const char *fmt, ...)
 {
 #ifdef SMP
 	static volatile u_int panic_cpu = NOCPU;
+	cpuset_t other_cpus;
 #endif
 	struct thread *td = curthread;
 	int bootopt, newpanic;
 	va_list ap;
 	static char buf[256];
-	critical_enter();
+	if (stop_scheduler_on_panic)
+		spinlock_enter();
+	else
+		critical_enter();
+

- In this chunk I don't entirely understand the kdb_active check:

@@ -566,11 +579,18 @@ panic(const char *fmt, ...)
 		    PCPU_GET(cpuid)) == 0)
 			while (panic_cpu != NOCPU)
 				; /* nothing */
+	if (stop_scheduler_on_panic) {
+		if (panicstr == NULL && !kdb_active) {
+			other_cpus = all_cpus;
+			CPU_CLR(PCPU_GET(cpuid), &other_cpus);
+			stop_cpus_hard(other_cpus);
+		}
+	}
 #endif
 	bootopt = RB_AUTOBOOT;
 	newpanic = 0;
-	if (panicstr)
+	if (panicstr != NULL)
 		bootopt |= RB_NOSYNC;
 	else {
 		bootopt |= RB_DUMP;

Is it to avoid passing an empty mask to stop_cpus() in kdb_trap() (I saw you changed the policy there)? Maybe we can find a better integration between the two. I'd also move the setting of the stop_scheduler variable into the if; it seems a bug to me to have it set otherwise.

- The same reservations expressed about the hard path on mutex also apply to rwlock and sxlock.

- I'm not sure I like to change the policies on cpu stopping for KDB with this
Re: Stop scheduler on panic
2011/12/6 Andriy Gapon a...@freebsd.org:
> on 06/12/2011 20:34 Attilio Rao said the following:
> > [snip]
> > - I'm not entirely sure why we want to disable interrupts at this moment (before stopping the other CPUs):
>
> Because I believe that stop_cpus_hard() should run in a context with interrupts and preemption disabled. Also, I believe that the whole panic handling code should run in the same context. So it was only natural for me to enter that context at this point.

I'm not against that, I don't see anything wrong with having interrupts disabled at that point.

> > @@ -547,13 +555,18 @@ panic(const char *fmt, ...)
> >  {
> >  #ifdef SMP
> >  	static volatile u_int panic_cpu = NOCPU;
> > +	cpuset_t other_cpus;
> >  #endif
> >  	struct thread *td = curthread;
> >  	int bootopt, newpanic;
> >  	va_list ap;
> >  	static char buf[256];
> > -	critical_enter();
> > +	if (stop_scheduler_on_panic)
> > +		spinlock_enter();
> > +	else
> > +		critical_enter();
> > +
> >
> > - In this chunk I don't entirely understand the kdb_active check:
> >
> > @@ -566,11 +579,18 @@ panic(const char *fmt, ...)
> >  		    PCPU_GET(cpuid)) == 0)
> >  			while (panic_cpu != NOCPU)
> >  				; /* nothing */
> > +	if (stop_scheduler_on_panic) {
> > +		if (panicstr == NULL && !kdb_active) {
> > +			other_cpus = all_cpus;
> > +			CPU_CLR(PCPU_GET(cpuid), &other_cpus);
> > +			stop_cpus_hard(other_cpus);
> > +		}
> > +	}
> >  #endif
> >  	bootopt = RB_AUTOBOOT;
> >  	newpanic = 0;
> > -	if (panicstr)
> > +	if (panicstr != NULL)
> >  		bootopt |= RB_NOSYNC;
> >  	else {
> >  		bootopt |= RB_DUMP;
> >
> > Is it to avoid passing an empty mask to stop_cpus() in kdb_trap() (I saw you changed the policy there)?
>
> Yes. You know my position about elimination of the cpuset parameter to stop_cpus_* and my intention to do so. This is related to that. Right now that check is not strictly necessary, but it doesn't do any harm either. We know that all other CPUs are already stopped when kdb_active (ditto for panicstr != NULL).
I see there could be races between disabling interrupts and having 2 different stopping mechanisms that both want to stop CPUs, even when using stop_cpus_hard(), on architectures that don't use a privileged channel (NMI), as is the case for most of our !x86 ports. They are very rare races (requiring kdb_trap() and panic() to happen in parallel on different CPUs).

> > Maybe we can find a better integration between the two.
>
> What kind of integration? Right now I have simple model - both stop all other CPUs.

Well, there is no synchronization atm between panic stopping CPUs and kdb_trap(). When kdb_trap() stops CPUs it has interrupts disabled, and if panic has already started they will both deadlock because IPI_STOP won't be properly delivered. However, I see all this as a problem with other arches not having/not implementing a real dedicated channel for stop_cpus_hard(), so we should not think about it now. I think we may need some sort of control, as panic already does with panic_cpu, before disabling interrupts, but don't worry about it now.

> > I'd also move the setting of the stop_scheduler variable into the if; it seems a bug to me to have it set otherwise.
>
> Can you please explain what bug do you suspect here? I do not see any.

I just find it more natural to move it within the above if (panicstr == NULL ...) condition.

> > [snip]
> > - I'm not sure I like to change the policies on cpu stopping for KDB with this patchset.
>
> I am not sure what exactly you mean by change in policies. I do not see any such change, entering kdb always stops all other CPUs, with or without the patch.

Yes, I was confused because the older code just stopped CPUs manually before kdb_trap(). I think what kdb_trap() does now is ok (and you just retain it).
I'd just change this check on panicstr:

@@ -606,9 +603,13 @@ kdb_trap(int type, int code, struct trapframe *tf)
 	intr = intr_disable();
 #ifdef SMP
-	other_cpus = all_cpus;
-	CPU_CLR(PCPU_GET(cpuid), &other_cpus);
-	stop_cpus_hard(other_cpus);
+	if (panicstr == NULL) {
+		other_cpus = all_cpus;
+		CPU_CLR(PCPU_GET(cpuid), &other_cpus);
+		stop_cpus_hard(other_cpus);
+		did_stop_cpus = 1;
+	} else
+		did_stop_cpus = 0;

to be SCHEDULER_STOPPED().

If you agree I can fix the kern_mutex, kern_sx and kern_rwlock parts and it should be done.

Attilio
Re: Stop scheduler on panic
2011/12/2 John Baldwin j...@freebsd.org:
> On 12/2/11 5:05 AM, Andriy Gapon wrote:
> > on 02/12/2011 06:36 John Baldwin said the following:
> > > Ah, ok (I had thought SCHEDULER_STOPPED was going to always be true when kdb was active). But I think these two changes should cover critical_exit() ok.
> >
> > I attempted to start a discussion about this a few times already :-) Should we treat kdb context the same as SCHEDULER_STOPPED context (in the current definition)? That is, skip all locks in the same fashion? There are pros and cons.
>
> kdb should not block on locks, no. Most debugger commands should not go near locks anyway unless they are intended to carefully modify the existing system in a safe manner (such as the 'kill' command which should only be using try locks and fail if it cannot safely post the signal).

The biggest problem with treating KDB the same as panic is that doing a proper 'continue' is impossible. One of the features of the 'skip-locking' path is that it doesn't take into account fast locking paths, where the lock sometimes succeeds and sometimes fails, and you don't know which. Also, the restarted CPUs can find corrupted data (as it can be arbitrarily updated); I'm sure it is too panic-prone.

BTW, I'm waiting for the details to settle (including the patch we have been discussing internally about binding to CPU0 during ACPI shutdown) before reading the whole thread and starting a proper review. Would it be possible to have an almost-final version of the patch?

Attilio
Re: Stop scheduler on panic
2011/12/2 John Baldwin j...@freebsd.org:
> On 12/2/11 12:18 PM, Attilio Rao wrote:
> > 2011/12/2 John Baldwin j...@freebsd.org:
> > > On 12/2/11 5:05 AM, Andriy Gapon wrote:
> > > > on 02/12/2011 06:36 John Baldwin said the following:
> > > > > Ah, ok (I had thought SCHEDULER_STOPPED was going to always be true when kdb was active). But I think these two changes should cover critical_exit() ok.
> > > >
> > > > I attempted to start a discussion about this a few times already :-) Should we treat kdb context the same as SCHEDULER_STOPPED context (in the current definition)? That is, skip all locks in the same fashion? There are pros and cons.
> > >
> > > kdb should not block on locks, no. Most debugger commands should not go near locks anyway unless they are intended to carefully modify the existing system in a safe manner (such as the 'kill' command which should only be using try locks and fail if it cannot safely post the signal).
> >
> > The biggest problem with treating KDB the same as panic is that doing a proper 'continue' is impossible. One of the features of the 'skip-locking' path is that it doesn't take into account fast locking paths, where the lock sometimes succeeds and sometimes fails, and you don't know which. Also, the restarted CPUs can find corrupted data (as it can be arbitrarily updated); I'm sure it is too panic-prone.
>
> Yes, my thought is that kdb commands, etc. should be using dedicated routines that do not use locks whenever possible. The problem of a user calling an arbitrary routine is not solvable (so I don't think we should try to solve that, you use 'call' at your own risk), but built-in commands should explicitly either 1) not use locking, or 2) only use try locks and fail out cleanly (including dropping any try locks acquired) if a try fails. Now, that's an ideal view, I don't know how close we are to that in practice or if it is a realistically attainable goal.

So you are not in favor of giving KDB its own context?
There are some drawbacks (for example, bugs involving the scheduler or the switching mechanism, but for those we can provide a facility like KDB_LITE if you want to debug a scheduler problem), but in general that would avoid replicating code just to avoid the locking.

If you don't want to give KDB its own context, we should work on a KPI (or similar) that defines the functions that serve as KDB commands, tries to keep things under control, etc.

Attilio
Re: Stop scheduler on panic
2011/11/21 John Baldwin j...@freebsd.org:
> On Friday, November 18, 2011 4:59:32 pm Andriy Gapon wrote:
> > on 17/11/2011 23:38 John Baldwin said the following:
> > > On Thursday, November 17, 2011 4:35:07 pm John Baldwin wrote:
> > > > Hmmm, you could also make critical_exit() not perform deferred preemptions if SCHEDULER_STOPPED? That would fix the recursion and still let the preemption work when resuming from the debugger?
> >
> > Yes, that's a good solution, I think. I just didn't want to touch such a low level code, but I guess there is no rational reason for that.
> >
> > > Or you could let the debugger run in a permanent critical section (though perhaps that breaks too many other things like debugger routines that try to use locks like the 'kill' command (which is useful!)). Effectively what you are trying to do is having the debugger run in a critical section until the debugger is exited. It would be cleanest to let it run that way explicitly if possible since then you don't have to catch as many edge cases.
> >
> > I like this idea, but likely it would take some effort to get done.
>
> Yes, it would take some effort, so checking SCHEDULER_STOPPED in critical_exit() is probably good for the short term. Would be nice to fix it properly some day.
>
> > Related to this is something that I attempted to discuss before. I think that because the debugger acts on a frozen system image (the debugger is a sole actor and observer), the debugger should try to minimize its interaction with the debugged system. In this vein I think that the debugger should also bypass any locks just like with SCHEDULER_STOPPED. The debugger should also be careful to note a state of any subsystems that it uses (e.g. the keyboard) and return them to the initial state if it had to be changed. But that's a bit different story. And I really get beyond my knowledge zone when I try to think about things like handling 'call func_' in the debugger where func_ may want to acquire some locks or noticeably change some state within a system.
> I think that, to some extent, if a user calls a function from the debugger they had better know what they are doing. However, I think it can also be useful to have the debugger modify the system in some cases if it can safely do so (e.g. the 'kill' command from DDB can be very useful, and IIRC, it is careful to only use try locks and just fail if it can't acquire the needed locks).

I would be very much in favor of having a 'thread trampoline for KDB', so that it can use locks. The only downside is that it assumes a healthy state of the switching infrastructure, so maybe it would be fine to wrap this under a proper compile-time option (KDB_LITE - it will run in interrupt context and the users had better know what they are doing).

Attilio
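John's short-term suggestion, having critical_exit() skip the deferred preemption while the scheduler is stopped yet still honour it later, can be modelled with a small userland C sketch. The names below (`toy_thread`, `toy_critical_enter`, `toy_critical_exit`, `toy_scheduler_stopped`) are invented for this illustration and only loosely mirror the kernel's per-thread nesting counter and "owe preempt" flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-thread state; field names are assumptions, not kernel fields. */
struct toy_thread {
	int  critnest;		/* critical section nesting depth */
	bool owepreempt;	/* a preemption was deferred while nested */
	int  switches;		/* stand-in counter for context switches */
};

static bool toy_scheduler_stopped;

void
toy_critical_enter(struct toy_thread *td)
{
	td->critnest++;
}

void
toy_critical_exit(struct toy_thread *td)
{
	assert(td->critnest > 0);
	td->critnest--;
	/*
	 * On the outermost exit, run the deferred preemption unless the
	 * scheduler is stopped.  The flag is left set in that case, so
	 * the preemption still happens after the debugger resumes.
	 */
	if (td->critnest == 0 && td->owepreempt && !toy_scheduler_stopped) {
		td->owepreempt = false;
		td->switches++;
	}
}
```

Because the pending flag survives the stopped period, the first critical section that ends after the scheduler is running again performs the switch that was owed.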
Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]
2011/11/18 Attilio Rao atti...@freebsd.org:
> 2011/11/18 Attilio Rao atti...@freebsd.org:
> > 2011/11/18 Kostik Belousov kostik...@gmail.com:
> > > On Fri, Nov 18, 2011 at 11:40:28AM +0100, Attilio Rao wrote:
> > > > 2011/11/16 Kostik Belousov kostik...@gmail.com:
> > > > > On Tue, Nov 15, 2011 at 07:15:01PM +0100, Attilio Rao wrote:
> > > > > > 2011/11/7 Kostik Belousov kostik...@gmail.com:
> > > > > > > On Mon, Nov 07, 2011 at 11:45:38AM -0600, Alan Cox wrote:
> > > > > > > > Ok. I'll offer one final suggestion. Please consider an alternative suffix to func. Perhaps, kbi or KBI. In other words, something that hints at the function's reason for existing.
> > > > > > >
> > > > > > > Sure. Below is the extraction of only vm_page_lock() bits, together with the suggested rename. When Attilio provides the promised simplification of the mutex KPI, this can be reduced.
> > > > > >
> > > > > > My tentative patch is here:
> > > > > > http://www.freebsd.org/~attilio/mutexfileline.patch
> > > > > >
> > > > > > I need to do more compile testing later, but it already compiles GENERIC + modules fine on HEAD.
> > > > > >
> > > > > > The patch provides a common, option-independent entry point for both the fast case and the debug/compat case. Additionally, it almost entirely fixes the standard violation of the reserved namespace, as you described (the notable exception being the macro used in the fast path, which I want to fix as well, but in a separate commit).
> > > > > >
> > > > > > Now the file/line pair can be passed to the _ suffix variant of the flag functions.
> > > > >
> > > > > Yes, this is exactly KPI that I would use when available for the vm_page_lock() patch.
> > > > >
> > > > > > eadler@ reviewed the mutex.h comment.
> > > > > >
> > > > > > Please let me know what you think about it; as long as we agree on the patch I'll commit it.
> > > > >
> > > > > But I also agree with John that imposing large churn due to the elimination of the '__' prefix is too late now. At least it will make the change non-MFCable. Besides, we already lived with the names for 10+ years.
> > > > >
> > > > > I will be happy to have the part of the patch that exports the mtx_XXX_(mtx, file, line) defines which can be used without taking care of LOCK_DEBUG or MUTEX_NOINLINE in the consumer code.
> > > > Ok, this patch should just add the compat stub:
> > > > http://www.freebsd.org/~attilio/mutexfileline2.patch
> > >
> > > Am I right that I would use mtx_lock_(mtx, file, line) etc ? If yes, I am fine with it.
> >
> > Yes, that is correct. However, I'm a bit confused on one aspect: would you mind using _mtx_lock_flags() instead? If you don't mind the underscore namespace violation I think I can make a much smaller patch against HEAD for it. Otherwise, the one now posted should be ok.
>
> After thinking more about it, I think that is basically the shortest version I can come up with.
>
> Please consider:
> http://www.freebsd.org/~attilio/mutexfileline2.patch

This is now committed as r227758,227759, you can update your patch now.

Attilio
Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]
2011/11/20 Kostik Belousov kostik...@gmail.com:
> On Sun, Nov 20, 2011 at 05:37:33PM +0100, Attilio Rao wrote:
> > 2011/11/18 Attilio Rao atti...@freebsd.org:
> > > Please consider:
> > > http://www.freebsd.org/~attilio/mutexfileline2.patch
> >
> > This is now committed as r227758,227759, you can update your patch now.
>
> Here is it.
>
> diff --git a/sys/vm/vm_page.c b/sys/vm/vm_page.c
> index d592ac0..74e5126 100644
> --- a/sys/vm/vm_page.c
> +++ b/sys/vm/vm_page.c
> @@ -2843,6 +2843,34 @@ vm_page_test_dirty(vm_page_t m)
>  		vm_page_dirty(m);
>  }
>  
> +void
> +vm_page_lock_KBI(vm_page_t m, const char *file, int line)
> +{
> +
> +	mtx_lock_flags_(vm_page_lockptr(m), 0, file, line);
> +}
> +
> +void
> +vm_page_unlock_KBI(vm_page_t m, const char *file, int line)
> +{
> +
> +	mtx_unlock_flags_(vm_page_lockptr(m), 0, file, line);
> +}
> +
> +int
> +vm_page_trylock_KBI(vm_page_t m, const char *file, int line)
> +{
> +
> +	return (mtx_trylock_flags_(vm_page_lockptr(m), 0, file, line));
> +}
> +
> +void
> +vm_page_lock_assert_KBI(vm_page_t m, int a, const char *file, int line)
> +{
> +
> +	mtx_assert_(vm_page_lockptr(m), a, file, line);
> +}
> +
>  int so_zerocp_fullpage = 0;
>  
>  /*
> diff --git a/sys/vm/vm_page.h b/sys/vm/vm_page.h
> index 151710d..fe0295b 100644
> --- a/sys/vm/vm_page.h
> +++ b/sys/vm/vm_page.h
> @@ -218,11 +218,23 @@ extern struct vpglocks pa_lock[];
>  #define PA_LOCK_ASSERT(pa, a) mtx_assert(PA_LOCKPTR(pa), (a))
>  
> +#ifdef KLD_MODULE
> +#define vm_page_lock(m) vm_page_lock_KBI((m), LOCK_FILE, LOCK_LINE)
> +#define vm_page_unlock(m) vm_page_unlock_KBI((m), LOCK_FILE, LOCK_LINE)
> +#define vm_page_trylock(m) vm_page_trylock_KBI((m), LOCK_FILE, LOCK_LINE)
> +#ifdef INVARIANTS
> +#define vm_page_lock_assert(m, a) \
> +    vm_page_lock_assert_KBI((m), (a), LOCK_FILE, LOCK_LINE)

I think you should put the \ in the last tab and also, for consistency, you may want to use __FILE__ and __LINE__ for assert (or maybe I should also switch mutex.h to use LOCK_FILE and LOCK_LINE at some point?).
> +#else
> +#define vm_page_lock_assert(m, a)
> +#endif
> +#else /* KLD_MODULE */

This should be /* !KLD_MODULE */, I guess?

>  #define vm_page_lockptr(m)

This is not defined for the KLD_MODULE case?

Attilio
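The pattern in the patch above (caller-side macros that capture the call site and forward it to an out-of-line wrapper, so that modules never depend on how the lock is actually implemented) can be shown with a small standalone C sketch. Everything here is hypothetical: `toy_page`, `toy_page_lock_KBI` and the `toy_page_lock()` macro only imitate the shape of `vm_page_lock_KBI()` and `vm_page_lock()`, and the "lock" is a plain flag rather than a mutex.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object whose locking details are meant to stay private. */
struct toy_page {
	int         locked;
	const char *lock_file;	/* file of the most recent lock call */
	int         lock_line;	/* line of the most recent lock call */
};

/* Out-of-line wrapper: the only symbol a module would link against. */
void
toy_page_lock_KBI(struct toy_page *p, const char *file, int line)
{
	assert(p != NULL && !p->locked);
	p->locked = 1;
	p->lock_file = file;
	p->lock_line = line;
}

void
toy_page_unlock_KBI(struct toy_page *p, const char *file, int line)
{
	(void)file;
	(void)line;
	assert(p != NULL && p->locked);
	p->locked = 0;
}

/*
 * Caller-side macros record where the call was made, in the same way
 * the patch passes LOCK_FILE and LOCK_LINE to the _KBI functions.
 */
#define	toy_page_lock(p)	toy_page_lock_KBI((p), __FILE__, __LINE__)
#define	toy_page_unlock(p)	toy_page_unlock_KBI((p), __FILE__, __LINE__)
```

Since the macro expands at the call site, the recorded file and line identify the caller rather than the wrapper, which is what lock debugging output needs.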
Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]
It looks good to me.

Attilio

2011/11/20 Kostik Belousov kostik...@gmail.com:
> On Sun, Nov 20, 2011 at 07:02:14PM +0100, Attilio Rao wrote:
>> 2011/11/20 Kostik Belousov kostik...@gmail.com:
>>> +#define vm_page_lock_assert(m, a) \
>>> +	vm_page_lock_assert_KBI((m), (a), LOCK_FILE, LOCK_LINE)
>>
>> I think you should put the \ in the last tab and also, for consistency, you may want to use __FILE__ and __LINE__ for assert (or maybe I should also switch mutex.h to use LOCK_FILE and LOCK_LINE at some point?).
>
> I never saw the requirement for the backslash. I am consistent with PA_UNLOCK_COND() several lines above. Changed assert to use __FILE/LINE__.
>
>>> +#else
>>> +#define vm_page_lock_assert(m, a)
>>> +#endif
>>> +#else /* KLD_MODULE */
>>
>> This should be /* !KLD_MODULE */, I guess?
>
> Changed.
>
>>>  #define vm_page_lockptr(m)
>>
>> This is not defined for the KLD_MODULE case?
>
> Yes, explicitly. This was discussed.
> http://lists.freebsd.org/pipermail/freebsd-current/2011-November/029009.html
>
> diff --git a/sys/vm/vm_page.c b/sys/vm/vm_page.c
> index d592ac0..74e5126 100644
> --- a/sys/vm/vm_page.c
> +++ b/sys/vm/vm_page.c
> @@ -2843,6 +2843,34 @@ vm_page_test_dirty(vm_page_t m)
>  		vm_page_dirty(m);
>  }
>
> +void
> +vm_page_lock_KBI(vm_page_t m, const char *file, int line)
> +{
> +
> +	mtx_lock_flags_(vm_page_lockptr(m), 0, file, line);
> +}
> +
> +void
> +vm_page_unlock_KBI(vm_page_t m, const char *file, int line)
> +{
> +
> +	mtx_unlock_flags_(vm_page_lockptr(m), 0, file, line);
> +}
> +
> +int
> +vm_page_trylock_KBI(vm_page_t m, const char *file, int line)
> +{
> +
> +	return (mtx_trylock_flags_(vm_page_lockptr(m), 0, file, line));
> +}
> +
> +void
> +vm_page_lock_assert_KBI(vm_page_t m, int a, const char *file, int line)
> +{
> +
> +	mtx_assert_(vm_page_lockptr(m), a, file, line);
> +}
> +
>  int so_zerocp_fullpage = 0;
>
>  /*
> diff --git a/sys/vm/vm_page.h b/sys/vm/vm_page.h
> index 151710d..1fab735 100644
> --- a/sys/vm/vm_page.h
> +++ b/sys/vm/vm_page.h
> @@ -218,11 +218,23 @@ extern struct vpglocks pa_lock[];
>
>  #define PA_LOCK_ASSERT(pa, a) mtx_assert(PA_LOCKPTR(pa), (a))
>
> +#ifdef KLD_MODULE
> +#define vm_page_lock(m) vm_page_lock_KBI((m), LOCK_FILE, LOCK_LINE)
> +#define vm_page_unlock(m) vm_page_unlock_KBI((m), LOCK_FILE, LOCK_LINE)
> +#define vm_page_trylock(m) vm_page_trylock_KBI((m), LOCK_FILE, LOCK_LINE)
> +#ifdef INVARIANTS
> +#define vm_page_lock_assert(m, a) \
> +	vm_page_lock_assert_KBI((m), (a), __FILE__, __LINE__)
> +#else
> +#define vm_page_lock_assert(m, a)
> +#endif
> +#else /* !KLD_MODULE */
>  #define vm_page_lockptr(m) (PA_LOCKPTR(VM_PAGE_TO_PHYS((m))))
>  #define vm_page_lock(m) mtx_lock(vm_page_lockptr((m)))
>  #define vm_page_unlock(m) mtx_unlock(vm_page_lockptr((m)))
>  #define vm_page_trylock(m) mtx_trylock(vm_page_lockptr((m)))
>  #define vm_page_lock_assert(m, a) mtx_assert(vm_page_lockptr((m)), (a))
> +#endif
>
>  #define vm_page_queue_free_mtx vm_page_queue_free_lock.data
>
>  /*
> @@ -405,6 +417,11 @@ void vm_page_cowfault (vm_page_t);
>  int vm_page_cowsetup(vm_page_t);
>  void vm_page_cowclear (vm_page_t);
>
> +void vm_page_lock_KBI(vm_page_t m, const char *file, int line);
> +void vm_page_unlock_KBI(vm_page_t m, const char *file, int line);
> +int vm_page_trylock_KBI(vm_page_t m, const char *file, int line);
> +void vm_page_lock_assert_KBI(vm_page_t m, int a, const char *file, int line);
> +
>  #ifdef INVARIANTS
>  void vm_page_object_lock_assert(vm_page_t m);
>  #define VM_PAGE_OBJECT_LOCK_ASSERT(m) vm_page_object_lock_assert(m)

-- 
Peace can only be achieved by understanding - A. Einstein
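The header part of the patch above follows a general pattern: code built as a module reaches the lock only through out-of-line functions, so the lock layout stays private to the kernel, while in-kernel code keeps the inline fast path. Below is a minimal userland sketch of that pattern. `struct toy_lock`, `struct page`, `page_lock()` and `page_lock_KBI()` are hypothetical stand-ins, not the kernel's types or functions, and `KLD_MODULE` is used here only as an illustrative compile-time switch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; hypothetical, not the kernel's vm_page or mtx types. */
struct toy_lock {
	int		 locked;
	const char	*file;	/* where the lock was last taken */
	int		 line;
};

struct page {
	struct toy_lock	lock;
};

/* Out-of-line entry points: all that a "module" has to link against. */
void
page_lock_KBI(struct page *p, const char *file, int line)
{
	assert(!p->lock.locked);
	p->lock.locked = 1;
	p->lock.file = file;
	p->lock.line = line;
}

void
page_unlock_KBI(struct page *p, const char *file, int line)
{
	(void)file;
	(void)line;
	assert(p->lock.locked);
	p->lock.locked = 0;
}

#ifdef KLD_MODULE
/* Module build: never touch the lock layout, always call the function. */
#define	page_lock(p)	page_lock_KBI((p), __FILE__, __LINE__)
#define	page_unlock(p)	page_unlock_KBI((p), __FILE__, __LINE__)
#else
/* "Kernel" build: fast path, open-coded against the private layout. */
#define	page_lock(p)	do {						\
	assert(!(p)->lock.locked);					\
	(p)->lock.locked = 1;						\
	(p)->lock.file = __FILE__;					\
	(p)->lock.line = __LINE__;					\
} while (0)
#define	page_unlock(p)	do {						\
	assert((p)->lock.locked);					\
	(p)->lock.locked = 0;						\
} while (0)
#endif
```

Callers write `page_lock(&pg)` in both builds; only the expansion differs. Compiling the same caller with `-DKLD_MODULE` routes every call through the `_KBI` functions, which is why a change in the lock layout would not require rebuilding such a caller.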
Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]
2011/11/20 Attilio Rao atti...@freebsd.org:
> 2011/11/18 Attilio Rao atti...@freebsd.org:
>> 2011/11/18 Attilio Rao atti...@freebsd.org:
>>> 2011/11/18 Kostik Belousov kostik...@gmail.com:
>>>> On Fri, Nov 18, 2011 at 11:40:28AM +0100, Attilio Rao wrote:
>>>>> 2011/11/16 Kostik Belousov kostik...@gmail.com:
>>>>>> On Tue, Nov 15, 2011 at 07:15:01PM +0100, Attilio Rao wrote:
>>>>>>> 2011/11/7 Kostik Belousov kostik...@gmail.com:
>>>>>>>> On Mon, Nov 07, 2011 at 11:45:38AM -0600, Alan Cox wrote:
>>>>>>>>> Ok. I'll offer one final suggestion. Please consider an alternative suffix to func. Perhaps, kbi or KBI. In other words, something that hints at the function's reason for existing.
>>>>>>>>
>>>>>>>> Sure. Below is the extraction of only vm_page_lock() bits, together with the suggested rename. When Attilio provides the promised simplification of the mutex KPI, this can be reduced.
>>>>>>>
>>>>>>> My tentative patch is here:
>>>>>>> http://www.freebsd.org/~attilio/mutexfileline.patch
>>>>>>>
>>>>>>> I need to make more compile testing later, but it already compiles GENERIC + modules fine on HEAD.
>>>>>>>
>>>>>>> The patch provides a common entrypoint, option independent, for both fast case and debug/compat case. Additionally, it almost entirely fixes the standard violation of the reserved namespace, as you described (the notable exception being the macro used in the fast path, that I want to fix as well, but in a separate commit).
>>>>>>>
>>>>>>> Now the file/line couplet can be passed to the _ suffix variant of the flag functions.
>>>>>>
>>>>>> Yes, this is exactly KPI that I would use when available for the vm_page_lock() patch.
>>>>>>
>>>>>>> eadler@ reviewed the mutex.h comment.
>>>>>>>
>>>>>>> Please let me know what you think about it, as long as we agree on the patch I'll commit it.
>>>>>>
>>>>>> But I also agree with John that imposing large churn due to the elimination of the '__' prefix is too late now. At least it will make the change non-MFCable. Besides, we already lived with the names for 10+ years.
>>>>>>
>>>>>> I will be happy to have the part of the patch that exports the mtx_XXX_(mtx, file, line) defines which can be used without taking care of LOCK_DEBUG or MUTEX_NOINLINE in the consumer code.
>>>>>
>>>>> Ok, this patch should just add the compat stub:
>>>>> http://www.freebsd.org/~attilio/mutexfileline2.patch
>>>>
>>>> Am I right that I would use mtx_lock_(mtx, file, line) etc ? If yes, I am fine with it.
>>>
>>> Yes that is correct. However, I'm a bit confused on one aspect: would you mind using _mtx_lock_flags() instead? If you don't mind the underscore namespace violation I think I can make a much smaller patch against HEAD for it. Otherwise, the one now posted should be ok.
>>
>> After thinking more about it, I think that is basically the shortest version I can come up with.
>>
>> Please consider:
>> http://www.freebsd.org/~attilio/mutexfileline2.patch
>
> This is now committed as r227758,227759, you can update your patch now.

This other patch converts sx to a similar interface which cleans up vm_map.c:
http://www.freebsd.org/~attilio/sxfileline.patch

What do you think about it?

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]
2011/11/20 Kostik Belousov kostik...@gmail.com:
> On Sun, Nov 20, 2011 at 08:04:21PM +0100, Attilio Rao wrote:
>> This other patch converts sx to a similar interface which cleans up vm_map.c:
>> http://www.freebsd.org/~attilio/sxfileline.patch
>>
>> What do you think about it?
>
> This one only changes the KBI ? Note that _sx suffix is not reserved.

In which sense? If you want to keep the shim support for KLD (thus the hard path), you will always need to keep a hard function, and thus you still need a macro acting as a gate between the 'hard function' (or KLD version, if you prefer) and the fast case. That is where the _ suffix came from.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]
2011/11/20 Kostik Belousov kostik...@gmail.com:
> On Sun, Nov 20, 2011 at 08:22:38PM +0100, Attilio Rao wrote:
>> 2011/11/20 Kostik Belousov kostik...@gmail.com:
>>> On Sun, Nov 20, 2011 at 08:04:21PM +0100, Attilio Rao wrote:
>>>> This other patch converts sx to a similar interface which cleans up vm_map.c:
>>>> http://www.freebsd.org/~attilio/sxfileline.patch
>>>>
>>>> What do you think about it?
>>>
>>> This one only changes the KBI ? Note that _sx suffix is not reserved.
>>
>> In which sense? If you want to keep the shim support for KLD (thus the hard path), you will always need to keep a hard function, and thus you still need a macro acting as a gate between the 'hard function' (or KLD version, if you prefer) and the fast case. That is where the _ suffix came from.
>
> As I see, right now kernel exports e.g. _sx_try_slock() for the hard path. After the patch, it will export sx_try_slock_() for the same purpose. The old modules, which call _sx_try_slock(), cannot be loaded into the patched kernel. Am I reading the patch wrong ?

We shouldn't be concerned about it for -CURRENT. When MFCing this patch I'll just make:

#define sx_try_slock_ _sx_try_slock

rather than renaming the function.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
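The MFC approach described above, where the exported function keeps its old name and the new name becomes a preprocessor alias, can be sketched in plain C. Only the `#define` line is taken from the message; `struct toy_sx` and the function body are hypothetical stand-ins for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real sx lock. */
struct toy_sx {
	int	shared_holders;
	int	exclusive;
};

/*
 * The function keeps its old exported name, so binaries built against
 * the old interface still resolve the symbol.
 */
int
_sx_try_slock(struct toy_sx *sx, const char *file, int line)
{
	(void)file;
	(void)line;
	assert(sx != NULL);
	if (sx->exclusive)
		return (0);
	sx->shared_holders++;
	return (1);
}

/* New spelling for new source code; it expands to the old symbol. */
#define	sx_try_slock_	_sx_try_slock
```

Because the alias is resolved by the preprocessor, the object file exports only `_sx_try_slock`. Source written against the new name compiles, and an old module that references the old symbol still links.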
Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]
2011/11/16 Kostik Belousov kostik...@gmail.com:
> On Tue, Nov 15, 2011 at 07:15:01PM +0100, Attilio Rao wrote:
>> 2011/11/7 Kostik Belousov kostik...@gmail.com:
>>> On Mon, Nov 07, 2011 at 11:45:38AM -0600, Alan Cox wrote:
>>>> Ok. I'll offer one final suggestion. Please consider an alternative suffix to func. Perhaps, kbi or KBI. In other words, something that hints at the function's reason for existing.
>>>
>>> Sure. Below is the extraction of only vm_page_lock() bits, together with the suggested rename. When Attilio provides the promised simplification of the mutex KPI, this can be reduced.
>>
>> My tentative patch is here:
>> http://www.freebsd.org/~attilio/mutexfileline.patch
>>
>> I need to make more compile testing later, but it already compiles GENERIC + modules fine on HEAD.
>>
>> The patch provides a common entrypoint, option independent, for both fast case and debug/compat case. Additionally, it almost entirely fixes the standard violation of the reserved namespace, as you described (the notable exception being the macro used in the fast path, that I want to fix as well, but in a separate commit).
>>
>> Now the file/line couplet can be passed to the _ suffix variant of the flag functions.
>
> Yes, this is exactly KPI that I would use when available for the vm_page_lock() patch.
>
>> eadler@ reviewed the mutex.h comment.
>>
>> Please let me know what you think about it, as long as we agree on the patch I'll commit it.
>
> But I also agree with John that imposing large churn due to the elimination of the '__' prefix is too late now. At least it will make the change non-MFCable. Besides, we already lived with the names for 10+ years.
>
> I will be happy to have the part of the patch that exports the mtx_XXX_(mtx, file, line) defines which can be used without taking care of LOCK_DEBUG or MUTEX_NOINLINE in the consumer code.

Ok, this patch should just add the compat stub:
http://www.freebsd.org/~attilio/mutexfileline2.patch

I'll do more test-compiling later in the day; if you agree on it I will commit the patch tomorrow.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]
2011/11/18 Kostik Belousov kostik...@gmail.com:
> On Fri, Nov 18, 2011 at 11:40:28AM +0100, Attilio Rao wrote:
>> 2011/11/16 Kostik Belousov kostik...@gmail.com:
>>> On Tue, Nov 15, 2011 at 07:15:01PM +0100, Attilio Rao wrote:
>>>> 2011/11/7 Kostik Belousov kostik...@gmail.com:
>>>>> On Mon, Nov 07, 2011 at 11:45:38AM -0600, Alan Cox wrote:
>>>>>> Ok. I'll offer one final suggestion. Please consider an alternative suffix to func. Perhaps, kbi or KBI. In other words, something that hints at the function's reason for existing.
>>>>>
>>>>> Sure. Below is the extraction of only vm_page_lock() bits, together with the suggested rename. When Attilio provides the promised simplification of the mutex KPI, this can be reduced.
>>>>
>>>> My tentative patch is here:
>>>> http://www.freebsd.org/~attilio/mutexfileline.patch
>>>>
>>>> I need to make more compile testing later, but it already compiles GENERIC + modules fine on HEAD.
>>>>
>>>> The patch provides a common entrypoint, option independent, for both fast case and debug/compat case. Additionally, it almost entirely fixes the standard violation of the reserved namespace, as you described (the notable exception being the macro used in the fast path, that I want to fix as well, but in a separate commit).
>>>>
>>>> Now the file/line couplet can be passed to the _ suffix variant of the flag functions.
>>>
>>> Yes, this is exactly KPI that I would use when available for the vm_page_lock() patch.
>>>
>>>> eadler@ reviewed the mutex.h comment.
>>>>
>>>> Please let me know what you think about it, as long as we agree on the patch I'll commit it.
>>>
>>> But I also agree with John that imposing large churn due to the elimination of the '__' prefix is too late now. At least it will make the change non-MFCable. Besides, we already lived with the names for 10+ years.
>>>
>>> I will be happy to have the part of the patch that exports the mtx_XXX_(mtx, file, line) defines which can be used without taking care of LOCK_DEBUG or MUTEX_NOINLINE in the consumer code.
>>
>> Ok, this patch should just add the compat stub:
>> http://www.freebsd.org/~attilio/mutexfileline2.patch
>
> Am I right that I would use mtx_lock_(mtx, file, line) etc ? If yes, I am fine with it.

Yes, that is correct. However, I'm a bit confused on one aspect: would you mind using _mtx_lock_flags() instead? If you don't mind the underscore namespace violation I think I can make a much smaller patch against HEAD for it. Otherwise, the one now posted should be ok.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]
2011/11/18 Attilio Rao atti...@freebsd.org:
> 2011/11/18 Kostik Belousov kostik...@gmail.com:
>> On Fri, Nov 18, 2011 at 11:40:28AM +0100, Attilio Rao wrote:
>>> 2011/11/16 Kostik Belousov kostik...@gmail.com:
>>>> On Tue, Nov 15, 2011 at 07:15:01PM +0100, Attilio Rao wrote:
>>>>> 2011/11/7 Kostik Belousov kostik...@gmail.com:
>>>>>> On Mon, Nov 07, 2011 at 11:45:38AM -0600, Alan Cox wrote:
>>>>>>> Ok. I'll offer one final suggestion. Please consider an alternative suffix to func. Perhaps, kbi or KBI. In other words, something that hints at the function's reason for existing.
>>>>>>
>>>>>> Sure. Below is the extraction of only vm_page_lock() bits, together with the suggested rename. When Attilio provides the promised simplification of the mutex KPI, this can be reduced.
>>>>>
>>>>> My tentative patch is here:
>>>>> http://www.freebsd.org/~attilio/mutexfileline.patch
>>>>>
>>>>> I need to make more compile testing later, but it already compiles GENERIC + modules fine on HEAD.
>>>>>
>>>>> The patch provides a common entrypoint, option independent, for both fast case and debug/compat case. Additionally, it almost entirely fixes the standard violation of the reserved namespace, as you described (the notable exception being the macro used in the fast path, that I want to fix as well, but in a separate commit).
>>>>>
>>>>> Now the file/line couplet can be passed to the _ suffix variant of the flag functions.
>>>>
>>>> Yes, this is exactly KPI that I would use when available for the vm_page_lock() patch.
>>>>
>>>>> eadler@ reviewed the mutex.h comment.
>>>>>
>>>>> Please let me know what you think about it, as long as we agree on the patch I'll commit it.
>>>>
>>>> But I also agree with John that imposing large churn due to the elimination of the '__' prefix is too late now. At least it will make the change non-MFCable. Besides, we already lived with the names for 10+ years.
>>>>
>>>> I will be happy to have the part of the patch that exports the mtx_XXX_(mtx, file, line) defines which can be used without taking care of LOCK_DEBUG or MUTEX_NOINLINE in the consumer code.
>>>
>>> Ok, this patch should just add the compat stub:
>>> http://www.freebsd.org/~attilio/mutexfileline2.patch
>>
>> Am I right that I would use mtx_lock_(mtx, file, line) etc ? If yes, I am fine with it.
>
> Yes that is correct. However, I'm a bit confused on one aspect: would you mind using _mtx_lock_flags() instead? If you don't mind the underscore namespace violation I think I can make a much smaller patch against HEAD for it. Otherwise, the one now posted should be ok.

After thinking more about it, I think that is basically the shortest version I can come up with.

Please consider:
http://www.freebsd.org/~attilio/mutexfileline2.patch
as a possible commit candidate for me.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
Re: Stop scheduler on panic
2011/11/17 Andriy Gapon a...@freebsd.org:
> on 17/11/2011 21:09 John Baldwin said the following:
>> On Thursday, November 17, 2011 11:58:03 am Andriy Gapon wrote:
>>> on 17/11/2011 18:37 John Baldwin said the following:
>>>> On Thursday, November 17, 2011 4:47:42 am Andriy Gapon wrote:
>>>>> on 17/11/2011 10:34 Andriy Gapon said the following:
>>>>>> on 17/11/2011 10:15 Kostik Belousov said the following:
>>>>>>> I have the following change for eons on my test boxes. Without it, I simply cannot get _any_ dump.
>>>>>>>
>>>>>>> diff --git a/sys/cam/cam_xpt.c b/sys/cam/cam_xpt.c
>>>>>>> index 10b89c7..a38e42f 100644
>>>>>>> --- a/sys/cam/cam_xpt.c
>>>>>>> +++ b/sys/cam/cam_xpt.c
>>>>>>> @@ -4230,7 +4230,7 @@ xpt_done(union ccb *done_ccb)
>>>>>>>  			TAILQ_INSERT_TAIL(&cam_simq, sim, links);
>>>>>>>  			mtx_unlock(&cam_simq_lock);
>>>>>>>  			sim->flags |= CAM_SIM_ON_DONEQ;
>>>>>>> -			if (first)
>>>>>>> +			if (first && panicstr == NULL)
>>>>>>>  				swi_sched(cambio_ih, 0);
>>>>>>>  		}
>>>>>>>  	}
>>>>>>
>>>>>> I think that this (or similar) change should go into the patch and the tree.
>>>>>
>>>>> And, BTW, I still would like to do something like the following (perhaps with td_oncpu = NOCPU and td_flags &= ~TDF_NEEDRESCHED also moved to the common code):
>>>>>
>>>>> Index: sys/kern/sched_ule.c
>>>>> ===================================================================
>>>>> --- sys/kern/sched_ule.c (revision 227608)
>>>>> +++ sys/kern/sched_ule.c (working copy)
>>>>> @@ -1790,7 +1790,6 @@ sched_switch(struct thread *td, struct thread *new
>>>>>  	td->td_oncpu = NOCPU;
>>>>>  	if (!(flags & SW_PREEMPT))
>>>>>  		td->td_flags &= ~TDF_NEEDRESCHED;
>>>>> -	td->td_owepreempt = 0;
>>>>>  	tdq->tdq_switchcnt++;
>>>>>  	/*
>>>>>  	 * The lock pointer in an idle thread should never change. Reset it
>>>>> Index: sys/kern/kern_synch.c
>>>>> ===================================================================
>>>>> --- sys/kern/kern_synch.c (revision 227608)
>>>>> +++ sys/kern/kern_synch.c (working copy)
>>>>> @@ -406,6 +406,8 @@ mi_switch(int flags, struct thread *newtd)
>>>>>  	    ("mi_switch: switch must be voluntary or involuntary"));
>>>>>  	KASSERT(newtd != curthread, ("mi_switch: preempting back to ourself"));
>>>>>
>>>>> +	td->td_owepreempt = 0;
>>>>> +
>>>>>  	/*
>>>>>  	 * Don't perform context switches from the debugger.
>>>>>  	 */
>>>>> Index: sys/kern/sched_4bsd.c
>>>>> ===================================================================
>>>>> --- sys/kern/sched_4bsd.c (revision 227608)
>>>>> +++ sys/kern/sched_4bsd.c (working copy)
>>>>> @@ -940,7 +940,6 @@ sched_switch(struct thread *td, struct thread *new
>>>>>  	td->td_lastcpu = td->td_oncpu;
>>>>>  	if (!(flags & SW_PREEMPT))
>>>>>  		td->td_flags &= ~TDF_NEEDRESCHED;
>>>>> -	td->td_owepreempt = 0;
>>>>>  	td->td_oncpu = NOCPU;
>>>>>
>>>>>  	/*
>>>>>
>>>>> Does anybody see any potential problems with such a change?
>>>>
>>>> Hmm, does this mean the preemption will be lost if you break into the debugger and continue in the non-panic case?
>>>
>>> Not sure which exact scenario you have in mind. Please note that the above diff just moves resetting of td_owepreempt to an earlier place. As far as I can see there are no checks of td_owepreempt value between the new place and the old places.
>>
>> I'm worried that you are clearing td_owepreempt even in cases where a context switch is not performed. So say you enter DDB with td_owepreempt set and that DDB bails on a context switch. With your change it will now clear td_owepreempt and lose the preemption.
>
> And without the change we get the recursion and double-fault because of kdb_switch -> thread_unlock -> spinlock_exit -> critical_exit -> mi_switch in this case ?
>
> BTW, it is my opinion that we really should not let the debugger code call mi_switch for any reason.

Yes, I agree with this; this is why the sched_bind() in boot() is broken (imagine calling things like doadump from KDB. KDB right now can be thought of as a first cut of this patch because it does disable the CPUs when entering the context; thus, the bug here is that if you stop all CPUs including CPU0 and later on you want to bind to it, you are dead).

We need to discuss this and the patch more extensively. I'm taking my time and hopefully will do a full review during the weekend.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
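John's concern about clearing td_owepreempt before the debugger check can be made concrete with a small userland model. This is only a sketch: `struct toy_thread`, `debugger_active`, `switch_clear_late()` and `switch_clear_early()` are hypothetical names, and the real mi_switch() and sched_switch() do far more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two fields that matter here. */
struct toy_thread {
	bool	owepreempt;	/* a preemption is pending */
	int	switches;	/* context switches actually performed */
};

static bool debugger_active;	/* stand-in for "we are inside the debugger" */

/* Current order: the flag is cleared only when the switch really happens. */
void
switch_clear_late(struct toy_thread *td)
{
	assert(td != NULL);
	if (debugger_active)
		return;		/* bail out; the pending preemption is kept */
	td->owepreempt = false;
	td->switches++;
}

/* Proposed order: the flag is cleared before the debugger check. */
void
switch_clear_early(struct toy_thread *td)
{
	assert(td != NULL);
	td->owepreempt = false;
	if (debugger_active)
		return;		/* bail out; the pending preemption is gone */
	td->switches++;
}
```

With `debugger_active` set, `switch_clear_late()` leaves `owepreempt` untouched while `switch_clear_early()` drops it without switching, which is the lost preemption John describes. The model says nothing about the recursion Andriy mentions; it only shows what the earlier reset trades away.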
Re: Stop scheduler on panic
2011/11/17 m...@freebsd.org:
> On Thu, Nov 17, 2011 at 12:54 PM, Attilio Rao atti...@freebsd.org wrote:
>> 2011/11/17 Andriy Gapon a...@freebsd.org:
>>> BTW, it is my opinion that we really should not let the debugger code call mi_switch for any reason.
>>
>> Yes, I agree with this; this is why the sched_bind() in boot() is broken (imagine calling things like doadump from KDB. KDB right now can be thought of as a first cut of this patch because it does disable the CPUs when entering the context; thus, the bug here is that if you stop all CPUs including CPU0 and later on you want to bind to it, you are dead).
>
> Another patch related to this area we have at $WORK:
>
>  #if defined(SMP)
> -	/*
> -	 * Bind us to CPU 0 so that all shutdown code runs there. Some
> -	 * systems don't shutdown properly (i.e., ACPI power off) if we
> -	 * run on another processor.
> -	 */
> -	thread_lock(curthread);
> -	sched_bind(curthread, 0);
> -	thread_unlock(curthread);
> -	KASSERT(PCPU_GET(cpuid) == 0, ("%s: not running on cpu 0", __func__));
> +	/*
> +	 * sched_bind can't be done reliably inside of panic. cpu_reset() will
> +	 * rebind us in any case, more reliably.
> +	 */
> +	if (panicstr == NULL) {
> +		/*
> +		 * Bind us to CPU 0 so that all shutdown code runs there. Some
> +		 * systems don't shutdown properly (i.e., ACPI power off) if we
> +		 * run on another processor.
> +		 */
> +		thread_lock(curthread);
> +		sched_bind(curthread, 0);
> +		thread_unlock(curthread);
> +		KASSERT(PCPU_GET(cpuid) == 0, ("boot: not running on cpu 0"));
> +	}
>  #endif
>  	/* We're in the process of rebooting. */
>  	rebooting = 1;

This doesn't cover the KDB case, which is the most broken here. (I'm a bit unsure about the names of the functions and I cannot check now, but in short):
- you enter KDB via debug.kdb.enter=1 (for example)
- kdb_enter() stops the CPUs, and if it is on CPU1 it stops CPU0
- you call functions entering boot() from the KDB prompt (IIRC call doadump should do it)
- boot() wants to bind to CPU0, which is turned off

This patch only takes care of panic, which is not enough.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
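The scenario listed above suggests that a `panicstr == NULL` test alone does not tell boot() whether CPU 0 is still running. A hedged sketch of a broader guard follows. `toy_panicstr`, `toy_kdb_active`, `may_bind_panic_only()` and `may_bind_to_cpu0()` are hypothetical userland stand-ins, and whether the real fix should take this shape is exactly what the thread leaves open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static const char *toy_panicstr;	/* non-NULL once a panic has started */
static bool toy_kdb_active;		/* true while the debugger keeps the other CPUs stopped */

/* Guard as in the patch quoted above: only the panic case is covered. */
bool
may_bind_panic_only(void)
{
	return (toy_panicstr == NULL);
}

/* Broader guard: also refuse while the debugger has stopped the CPUs. */
bool
may_bind_to_cpu0(void)
{
	return (toy_panicstr == NULL && !toy_kdb_active);
}
```

In the debugger case with no panic in progress, the first predicate still allows the bind and the second refuses it. That difference is the gap the message points out.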
Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]
2011/11/7 Kostik Belousov kostik...@gmail.com:
> On Mon, Nov 07, 2011 at 11:45:38AM -0600, Alan Cox wrote:
>> Ok. I'll offer one final suggestion. Please consider an alternative suffix to func. Perhaps, kbi or KBI. In other words, something that hints at the function's reason for existing.
>
> Sure. Below is the extraction of only vm_page_lock() bits, together with the suggested rename. When Attilio provides the promised simplification of the mutex KPI, this can be reduced.

My tentative patch is here:
http://www.freebsd.org/~attilio/mutexfileline.patch

I need to do more compile testing later, but it already compiles GENERIC + modules fine on HEAD.

The patch provides a common entrypoint, option independent, for both the fast case and the debug/compat case. Additionally, it almost entirely fixes the standard violation of the reserved namespace, as you described (the notable exception being the macro used in the fast path, which I want to fix as well, but in a separate commit).

Now the file/line couplet can be passed to the _ suffix variant of the flag functions.

eadler@ reviewed the mutex.h comment.

Please let me know what you think about it; as long as we agree on the patch I'll commit it.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]
2011/11/15 m...@freebsd.org:
> On Tue, Nov 15, 2011 at 10:15 AM, Attilio Rao atti...@freebsd.org wrote:
>> 2011/11/7 Kostik Belousov kostik...@gmail.com:
>>> On Mon, Nov 07, 2011 at 11:45:38AM -0600, Alan Cox wrote:
>>>> Ok. I'll offer one final suggestion. Please consider an alternative suffix to func. Perhaps, kbi or KBI. In other words, something that hints at the function's reason for existing.
>>>
>>> Sure. Below is the extraction of only vm_page_lock() bits, together with the suggested rename. When Attilio provides the promised simplification of the mutex KPI, this can be reduced.
>>
>> My tentative patch is here:
>> http://www.freebsd.org/~attilio/mutexfileline.patch
>>
>> I need to make more compile testing later, but it already compiles GENERIC + modules fine on HEAD.
>>
>> The patch provides a common entrypoint, option independent, for both fast case and debug/compat case. Additionally, it almost entirely fixes the standard violation of the reserved namespace, as you described (the notable exception being the macro used in the fast path, that I want to fix as well, but in a separate commit).
>>
>> Now the file/line couplet can be passed to the _ suffix variant of the flag functions.
>>
>> eadler@ reviewed the mutex.h comment.
>>
>> Please let me know what you think about it, as long as we agree on the patch I'll commit it.
>
> Out of curiosity, why are function names explicitly spelled out in panic and log messages, instead of using %s and __func__? I've seen this all around FreeBSD, and if there's no reason otherwise, I'd just as soon change to a version that doesn't need updating when the function names change.

I prefer the __func__ stuff as well, but bde isn't in favor of it because it is more difficult to grep for the message in that case. I'm not sure I'd buy his point on this, honestly, but that is why.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
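The trade-off discussed above is easy to see side by side. In this sketch, `format_hardcoded()` and `format_with_func()` are hypothetical helpers written for illustration; `__func__` itself is standard C99.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hard-coded name: the full message can be found with grep, but it goes
 * stale if the function is ever renamed.
 */
int
format_hardcoded(char *buf, size_t len)
{
	assert(buf != NULL);
	return (snprintf(buf, len, "format_hardcoded: not running on cpu 0"));
}

/*
 * __func__: always matches the enclosing function, but the complete
 * message no longer appears literally anywhere in the source.
 */
int
format_with_func(char *buf, size_t len)
{
	assert(buf != NULL);
	return (snprintf(buf, len, "%s: not running on cpu 0", __func__));
}
```

Both helpers produce a message of the same shape. Searching the source for the exact runtime text finds only the first one, which is the grep objection attributed to bde.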
Re: [PATCH] Intel Sandy Bridge support for hwpmc
2011/11/13 Davide Italiano davide.itali...@gmail.com:
> On Sun, Nov 13, 2011 at 9:52 PM, Davide Italiano davide.itali...@gmail.com wrote:
>> Good evening folks.
>> Over the last few days I've written a patch to add Sandy Bridge support to hwpmc. Until now, the most recent Intel processor microarchitecture supported was Westmere.
>> Testing is appreciated, in order to see if there's anything that has to be fixed.
>> You can find the diff here:
>> http://davit.altervista.rg/hwpmc_sandy_bridge.diff
>>
>> I'd like to thank attilio@, who helped me fix a bug, and gnn@ and fabient@ for the useful suggestions.
>>
>> Best
>>
>> Davide
>
> Sorry, bad link. It should be:
> http://davit.altervista.org/hwpmc_sandy_bridge.diff

There are a few things to clean up, but I can do that myself. Can you tell me how to reproduce the deadlock with THREAD_P?

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
Re: [PATCH] Intel Sandy Bridge support for hwpmc
2011/11/13 Davide Italiano davide.itali...@gmail.com:
> On Sun, Nov 13, 2011 at 9:56 PM, Attilio Rao atti...@freebsd.org wrote:
>> 2011/11/13 Davide Italiano davide.itali...@gmail.com:
>>> On Sun, Nov 13, 2011 at 9:52 PM, Davide Italiano davide.itali...@gmail.com wrote:
>>>> Good evening folks.
>>>> Over the last few days I've written a patch to add Sandy Bridge support to hwpmc. Until now, the most recent Intel processor microarchitecture supported was Westmere.
>>>> Testing is appreciated, in order to see if there's anything that has to be fixed.
>>>> You can find the diff here:
>>>> http://davit.altervista.rg/hwpmc_sandy_bridge.diff
>>>>
>>>> I'd like to thank attilio@, who helped me fix a bug, and gnn@ and fabient@ for the useful suggestions.
>>>>
>>>> Best
>>>>
>>>> Davide
>>>
>>> Sorry, bad link. It should be:
>>> http://davit.altervista.org/hwpmc_sandy_bridge.diff
>>
>> There are a few things to clean up, but I can do that myself. Can you tell me how to reproduce the deadlock with THREAD_P?
>>
>> Attilio
>>
>> --
>> Peace can only be achieved by understanding - A. Einstein
>
> pmcstat -SCPU_CLK_UNHALTED.THREAD_P should do it.
> (Starting pmcstat not in system mode, for example pmcstat -pCPU_CLK_UNHALTED.THREAD_P ls, does not deadlock.)

So in practice, what you saw is that it does not terminate when you try to stop it with ctrl+c?

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
Re: [PATCH] Intel Sandy Bridge support for hwpmc
2011/11/13 Davide Italiano davide.itali...@gmail.com:
> On Sun, Nov 13, 2011 at 9:52 PM, Davide Italiano davide.itali...@gmail.com wrote:
>> Good evening folks.
>> Over the last few days I've written a patch to add Sandy Bridge support to hwpmc. Until now, the most recent Intel processor microarchitecture supported was Westmere.
>> Testing is appreciated, in order to see if there's anything that has to be fixed.
>> You can find the diff here:
>> http://davit.altervista.rg/hwpmc_sandy_bridge.diff
>>
>> I'd like to thank attilio@, who helped me fix a bug, and gnn@ and fabient@ for the useful suggestions.
>>
>> Best
>>
>> Davide
>
> Sorry, bad link. It should be:
> http://davit.altervista.org/hwpmc_sandy_bridge.diff

I can perform some small cleanups and likely test it too. If Fabien or George can review it, I'm fine with committing it once all that is settled.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein