Re: FreeBSD 8.2 - active plus inactive memory leak!?
On Wed, 2012-03-07 at 10:23 +0200, Konstantin Belousov wrote:
> On Wed, Mar 07, 2012 at 12:36:21AM +0000, Luke Marsden wrote:
> > I'm trying to confirm that, on a system with no pages swapped out,
> > the following is a true statement: a page is accounted for in active
> > + inactive if and only if it corresponds to one or more of the pages
> > accounted for in the resident memory lists of all the processes on
> > the system (as per the output of 'top' and 'ps').
>
> No. Pages belonging to a vnode vm object can be active or inactive or
> cached but not mapped into any process address space.

Thank you, Konstantin. Does the number of vnodes we've got open on this
machine (272011) fully explain away the memory gap?

Memory gap: 11264M active + 2598M inactive - 9297M sum-of-resident = 4565M
Active vnodes: vfs.numvnodes: 272011

That gives a lower bound of 17.18KB per vnode (or higher if we take
shared libs, etc. into account); that seems a bit high for a vnode vm
object, doesn't it? If that doesn't fully explain it, what else might be
chewing through active memory? Also, when are vnodes freed?

This system does have some tuning...

kern.maxfiles: 100
vm.pmap.pv_entry_max: 73296250

Could that be contributing to so much active + inactive memory (5GB+
more than expected), or do PV entries live in wired (e.g. kernel)
memory?

On Tue, 2012-03-06 at 17:48 -0700, Ian Lepore wrote:
> In my experience, the bulk of the memory in the inactive category is
> cached disk blocks, at least for UFS (I think ZFS does things
> differently). On this desktop machine I have 12G physical and
> typically have roughly 11G inactive, and I can unmount one particular
> filesystem where most of my work is done and instantly I have almost
> no inactive and roughly 11G free.

Okay, so this could be UFS disk cache, except the system is ZFS-on-root
with no UFS filesystems active or mounted. Can I confirm that no
double-caching of ZFS data is happening in active + inactive (+ cache)
memory?

Thanks,
Luke

--
CTO, Hybrid Logic
+447791750420 | +1-415-449-1165 | www.hybrid-cluster.com
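To make the arithmetic above concrete, a minimal sketch of the
back-of-the-envelope check (the figures are the ones quoted in this
thread; bc(1) just does the division):

# memory gap in MB: active + inactive - sum-of-resident
echo "11264 + 2598 - 9297" | bc            # 4565 MB
# lower bound per vnode in KB: gap / vfs.numvnodes
echo "scale=2; 4565 * 1024 / 272011" | bc  # ~17.18 KB per vnode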
FreeBSD 8.2 - active plus inactive memory leak!?
Hi all,

I'm having some trouble with some production 8.2-RELEASE servers where the 'Active' and 'Inact' memory values reported by top don't seem to correspond with the processes which are running on the machine.

I have two near-identical machines (with slightly different workloads); on one, let's call it A, active + inactive is small (6.5G) and on the other (B) active + inactive is large (13.6G), even though they have almost identical sums-of-resident memory (8.3G on A and 9.3G on B). The only difference is that A has a smaller number of quite long-running processes (it's hosting a small number of busy sites) and B has a larger number of more frequently killed/recycled processes (it's hosting a larger number of quiet sites, so the FastCGI processes get killed and restarted frequently). Notably, B has many more ZFS filesystems mounted than A (around 4,000 versus 100). The machines are otherwise under similar amounts of load. I hope the community can help me understand what's going on with respect to the worryingly large amount of active + inactive memory on B.

Both machines are ZFS-on-root with FreeBSD 8.2-RELEASE and uptimes around 5-6 days. I have recently reduced the ARC cache on both machines since my previous thread [1], and Wired memory usage is now stable at 6G on A and 7G on B with an arc_max of 4G on both machines. Neither machine has any swap in use:

Swap: 10G Total, 10G Free

My current (probably quite simplistic) understanding of the FreeBSD virtual memory system is that, for each process as reported by top:

* Size corresponds to the total size of all the text pages for the process (those belonging to code in the binary itself and linked libraries) plus data pages (including stack and malloc()'d but not-yet-written-to memory segments).

* Resident corresponds to a subset of the pages above: those pages which actually occupy physical/core memory. Notably, pages may appear in size but not in resident: for example, read-only text pages from libraries which have not been touched yet, or memory which has been malloc()'d but not yet written to.

My understanding of the values for the system as a whole (at the top in 'top') is as follows:

* Active and inactive memory are the same kind of thing: resident memory from processes in use. Being on the inactive rather than the active list simply means that the pages in question are less recently used and therefore more likely to get swapped out if the machine comes under memory pressure.

* Wired is mostly kernel memory.

* Cache is freed memory which the kernel has decided to keep in case it corresponds to a useful page in future; it can be cheaply evicted onto the free list.

* Free memory is actually not being used for anything.

It seems that pages which occur in the active + inactive lists must occur in the resident memory of one or more processes (possibly more than one, since processes can share pages, e.g. read-only shared libs or a COW-forked address space). Conversely, if a page *does not* occur in the resident memory of any process, it must not occupy any space in the active + inactive lists. Therefore the active + inactive memory should always be less than or equal to the sum of the resident memory of all the processes on the system, right? But it's not.
So, I wrote a very simple Python script [2] to add up the resident memory values in the output from 'top'. On machine A:

Mem: 3388M Active, 3209M Inact, 6066M Wired, 196K Cache, 11G Free

There were 246 processes totalling 8271 MB resident memory.

Whereas on machine B:

Mem: 11G Active, 2598M Inact, 7177M Wired, 733M Cache, 1619M Free

There were 441 processes totalling 9297 MB resident memory.

Now, on machine A:

3388M active + 3209M inactive - 8271M sum-of-resident = -1674M

I can attribute this negative value to shared libraries between the running processes (which the sum-of-resident is double-counting but active + inactive is not). But on machine B:

11264M active + 2598M inactive - 9297M sum-of-resident = 4565M

I'm struggling to explain how, when there are only 9.2G (worst case, discounting shared pages) of resident processes, the system is using 11G + 2598M = 13.8G of memory! This missing memory is scary, because it seems to be increasing over time, and eventually, when the system runs out of free memory, I'm certain it will crash in the same way described in my previous thread [1].

Is my understanding of the virtual memory system badly broken - in which case please educate me ;-) - or is there a real problem here? If so, how can I dig deeper to help uncover/fix it?

Best Regards,
Luke Marsden

[1] http://lists.freebsd.org/pipermail/freebsd-fs/2012-February/013775.html
[2] https://gist.github.com/1988153

--
CTO, Hybrid Logic
+447791750420 | +1-415-449-1165 | www.hybrid-cluster.com
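For reference, a minimal one-liner sketch of the same tally (this is not
the script linked as [2]; ps reports RSS in kilobytes on FreeBSD):

ps axo rss | awk 'NR > 1 { sum += $1 } END { printf "%d MB resident\n", sum / 1024 }'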
Re: FreeBSD 8.2 - active plus inactive memory leak!?
Thanks for your email, Chuck.

> > Conversely, if a page *does not* occur in the resident memory of any
> > process, it must not occupy any space in the active + inactive lists.
>
> Hmm... if a process gets swapped out entirely, the pages for it will
> be moved to the cache list, flushed, and then reused as soon as the
> disk I/O completes. But there is a window where the process can be
> marked as swapped out (and considered no longer resident), but still
> has some of its pages in physical memory.

There's no swapping happening on these machines (intentionally so, because as soon as we hit swap everything goes tits up), so this window doesn't concern me. I'm trying to confirm that, on a system with no pages swapped out, the following is a true statement: a page is accounted for in active + inactive if and only if it corresponds to one or more of the pages accounted for in the resident memory lists of all the processes on the system (as per the output of 'top' and 'ps').

> > Therefore the active + inactive memory should always be less than or
> > equal to the sum of the resident memory of all the processes on the
> > system, right?
>
> No. If you've got a lot of process pages shared (ie, a webserver with
> lots of httpd children, or a database pulling in a large common shmem
> area), then your process resident sizes can be very large compared to
> the system-wide active+inactive count.

But that's what I'm saying: sum(process resident sizes) >= active + inactive. Or, as I said it above, equivalently: active + inactive <= sum(process resident sizes).

The data I've got from this system, and what's killing us, shows the opposite: active + inactive > sum(process resident sizes) - by over 5GB now and growing, which is what keeps causing these machines to crash. In particular:

Mem: 13G Active, 1129M Inact, 7543M Wired, 120M Cache, 1553M Free

But the total sum of resident memories is 9457M (according to summing the output from ps or top).

13G + 1129M = 14441M (active + inact) > 9457M (sum of res)

That's 4984M out, and that's almost enough to push us over the edge. If my understanding of VM is correct, I don't see how this can happen. But it's happening, and it's causing real trouble here because our free memory keeps hitting zero and then we swap-spiral.

What can I do to investigate this discrepancy? Are there some tools I can use to debug the memory accounted in active, to find out where it's going if not to resident process memory?

Thanks,
Luke

--
CTO, Hybrid Logic
+447791750420 | +1-415-449-1165 | www.hybrid-cluster.com
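One way to start digging, assuming nothing beyond the stock tools: read
the page-queue counters the kernel maintains (these are the numbers
behind top's Active/Inact line) and compare them against the ps tally,
then look at kernel-side zone allocations. A minimal sketch:

pagesize=$(sysctl -n hw.pagesize)
active=$(sysctl -n vm.stats.vm.v_active_count)
inactive=$(sysctl -n vm.stats.vm.v_inactive_count)
echo "$(( (active + inactive) * pagesize / 1048576 )) MB active+inactive"
vmstat -z    # per-zone kernel allocations, to see the wired-side consumers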
Re: Another ZFS ARC memory question
> ...@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64

Thanks!

Luke Marsden

--
CTO, Hybrid Logic
+447791750420 | +1-415-449-1165 | www.hybrid-cluster.com
Another ZFS ARC memory question
Hi all,

Just wanted to get your opinion on best practices for ZFS. We're running 8.2-RELEASE (ZFS v15) in production on 24GB RAM amd64 machines, but have been having trouble with short spikes in application memory usage resulting in huge amounts of swapping, bringing the whole machine to its knees and crashing it hard. I suspect this is because, when there is a sudden spike in memory usage, the ZFS arc reclaim thread is unable to free system memory fast enough. This most recently happened yesterday, as you can see from the following munin graphs:

http://hybrid-logic.co.uk/memory-day.png
http://hybrid-logic.co.uk/swap-day.png

Our response has been to start limiting the ZFS ARC cache to 4GB on our production machines - trading performance for stability is fine with me (and we have L2ARC on SSD, so we still get good levels of caching). My questions are:

* Is this a known problem?

* What is the community's advice for production machines running ZFS on FreeBSD? Is manually limiting the ARC cache (to ensure that there's enough actually-free memory to handle a spike in application memory usage) the best solution to this spike-in-memory-means-crash problem?

* Has FreeBSD 9.0 / ZFS v28 solved this problem?

* Rather than setting a hard limit on the ARC cache size, is it possible to adjust the auto-tuning variables to leave more free memory for spiky memory situations? E.g. set the auto-tuning to make the ARC eat 80% of memory instead of ~95% as at present?

* Could the arc reclaim thread be made to drop ARC pages with higher priority before the system starts swapping out application pages?

Thank you for any/all answers, and thank you for making FreeBSD awesome :-)

Best Regards,
Luke Marsden

--
CTO, Hybrid Logic
+447791750420 | +1-415-449-1165 | www.hybrid-cluster.com
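For anyone looking for the knob: the 4GB cap described above is the
vfs.zfs.arc_max loader tunable, which only takes effect at boot. A
minimal sketch:

# /boot/loader.conf -- cap the ZFS ARC at 4 GB
vfs.zfs.arc_max="4G"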
Re: Swap on zvol - recommendable?
On Feb 6, 2012, at 11:57 AM, Patrick M. Hausen wrote:
> Hi, all, is it possible to make a definite statement about swap on
> zvols? I found some older discussions about a resource-starvation
> scenario in which the ZFS ARC would cause the system to run out of
> memory and try to swap, yet ZFS would not be accessible until some
> memory was freed - leading to a deadlock. Is this still the case with
> RELENG_8? The various root-on-ZFS guides mention both choices
> (dedicated or gmirror partition vs. zvol), yet don't say anything
> about the respective merits or risks. I am aware of the fact that I
> cannot dump to a raidz2 zvol ...

On Tue, 2012-02-07 at 20:53 +0100, Peter Ankerstål wrote:
> I can just tell you I had this problem still in 8.1 and it was a HUGE
> problem. The system stalled every two weeks or so. Now that swap has
> been moved off ZFS it works fine.

I can confirm that this is still a problem on 8.2 and 9.0.

--
CTO, Hybrid Logic
+447791750420 | +1-415-449-1165 | www.hybrid-cluster.com
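For completeness, a hedged sketch of the alternative being described -
swap on a plain partition rather than a zvol (the device name is
illustrative, not from the original posts):

# /etc/fstab -- swap on a dedicated partition instead of a zvol
/dev/gpt/swap0    none    swap    sw    0    0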
Re: 8.1R possible zfs snapshot livelock?
On Wed, 2011-05-18 at 14:05 +0200, Borja Marcos wrote:
> On May 17, 2011, at 1:29 PM, Jeremy Chadwick wrote:
> > * ZFS send | ssh zfs recv results in ZFS subsystem hanging;
> >   8.1-RELEASE; February 2011:
> >   http://lists.freebsd.org/pipermail/freebsd-fs/2011-February/010602.html
>
> I actually found a reproducible deadlock condition. If you keep some
> I/O activity on a dataset into which you are receiving a ZFS
> incremental snapshot at the same time, it can deadlock. Imagine this
> situation: two servers, A and B. A dataset on server A is replicated
> at regular intervals to B, so that you keep a reasonably up-to-date
> copy. Something like (running on server A):
>
> zfs snapshot thepool/thedataset@thistime
> zfs send -Ri thepool/thedataset@previoustime thepool/thedataset@thistime | ssh serverB zfs receive -d thepool
>
> It works, but I suffered a deadlock when one of the periodic daily
> scripts was running. Doing some tests, I saw that ZFS can deadlock if
> you do a zfs receive onto a dataset which has some read activity.
> Disabling atime didn't help either. But if you make sure *not* to
> access the replicated dataset it works; I haven't seen it failing
> otherwise. If you wish to reproduce it, try creating a dataset for
> /usr/obj, running make buildworld on it, replicating it at, say, 30 or
> 60 second intervals, and keeping several scripts (or rsync) reading
> the target dataset files and just copying them to another place in the
> usual, classic way (example: tar cf - . | ( cd /destination && tar xf - )).

Is there a PR for this? I'd like to see it addressed, since read-only I/O on a dataset which is being updated by `zfs recv` is an important part of what we plan to do with ZFS on FreeBSD.

--
Best Regards,
Luke Marsden
CTO, Hybrid Logic Ltd.

Web: http://www.hybrid-cluster.com/
Hybrid Web Cluster - cloud web hosting

Phone: +447791750420
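A hedged sketch of Borja's replication loop, to make the repro setup
concrete (pool/dataset names and serverB are the placeholders from his
message; run on server A while readers hammer the copy on server B):

#!/bin/sh
PREV=initial
zfs snapshot thepool/thedataset@$PREV
while true; do
    NOW=$(date +%Y%m%d%H%M%S)
    zfs snapshot thepool/thedataset@$NOW
    zfs send -Ri thepool/thedataset@$PREV thepool/thedataset@$NOW | \
        ssh serverB zfs receive -d thepool
    PREV=$NOW
    sleep 30    # 30-60 second intervals, per the description above
done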
Re: 8.1R possible zfs snapshot livelock?
Hi all,

On Tue, 2011-05-17 at 04:29 -0700, Jeremy Chadwick wrote:
> There are still some outstanding incidents that directly pertain to
> ZFS snapshots, or are related to ZFS snapshots (meaning things like
> send/recv, which are commonly used alongside snapshots), which I
> remember reading about but really saw no answer to:
>
> * ZFS send | ssh zfs recv results in ZFS subsystem hanging;
>   8.1-RELEASE; February 2011:
>   http://lists.freebsd.org/pipermail/freebsd-fs/2011-February/010602.html

As the original author of this post I wanted to chime in to say that our problem was misdiagnosed here as being related to snapshots and zfs send/receive. Instead, it was a bug [1] relating to force-unmounting a ZFS filesystem which has active child nullfs mounts and active special devices (FIFOs). There is a related kernel panic [1] which suggests that this is a problem area. I've been meaning to collect enough information to submit a proper bug report -- I can at least reliably reproduce the issue -- but have been rather too busy with the 1.0 release of our application, and was put off by one response: "IMO this is expected."

[1] http://lists.freebsd.org/pipermail/freebsd-fs/2011-March/010983.html

Our application -- see HCFS at http://www.hybrid-cluster.com/tech/ -- makes very heavy use of ZFS snapshots and ZFS send/receive on FreeBSD (currently 8.1), and since we engineered it never to attempt foolish force-unmounts on busy filesystems, we've seen no kernel hangs over the course of hundreds of thousands of snapshot and ZFS replication events in testing.

I'm interested to know whether the OP's problem is fixed in 8.2 or 8-STABLE, since it could affect us. Also, thanks for the links to the backports for 8.2, Jeremy; I'll include those in our next system image.

--
Best Regards,
Luke Marsden
CTO, Hybrid Logic Ltd.

Web: http://www.hybrid-cluster.com/
Hybrid Web Cluster - cloud web hosting

Phone: +447791750420
Guaranteed kernel panic with ZFS + nullfs
Hi all,

The following script seems to cause a guaranteed kernel panic on 8.1-R, 8.2-R and 8-STABLE as of today (2011-03-16), with both ZFS v14/15, and v28 on 8.2-R with mm@ patches from 2011-03. I suspect it may also affect 9-CURRENT but have not tested this yet.

#!/usr/local/bin/bash
export POOL=hpool  # change this to your pool name
sudo zfs destroy -r $POOL/foo
sudo zfs create $POOL/foo
sudo zfs set mountpoint=/foo $POOL/foo
sudo mount -t nullfs /foo /bar
sudo touch /foo/baz
ls /bar  # should see baz
sudo zfs umount -f $POOL/foo  # seems okay (ls: /bar: Bad file descriptor)
sudo zfs mount $POOL/foo  # PANIC!

Can anyone suggest a patch which fixes this? Preferably against 8-STABLE :-)

I also have a more subtle problem where, after mounting and then quickly force-unmounting a ZFS filesystem (call it A) with two nullfs-mounted filesystems and a devfs filesystem within it, running ls on the mountpoint of the parent filesystem of A hangs. I'm working on narrowing it down to a shell script like the above - as soon as I have one I'll post a follow-up.

This latter problem is actually more of an issue for me - I can avoid the behaviour which triggers the panic (if it hurts, don't do it), but I need to be able to perform the actions which trigger the deadlock (mounting and unmounting filesystems). This also affects 8.1-R, 8.2-R, 8-STABLE and 8.2-R+v28. It seems to be the `zfs umount -f` process which hangs and causes further accesses to the parent filesystem to hang. Note that I have definitely correctly unmounted the nullfs and devfs mounts from within the filesystem before I force the unmount. Unfortunately the -f is necessary in my application.

After the hang:

hybrid@dev3:/opt/HybridCluster$ sudo ps ax | grep zfs
   41  ??  DL   0:00.11 [zfskern]
 3751  ??  D    0:00.03 /sbin/zfs unmount -f hpool/hcfs/filesystem1

hybrid@dev3:/opt/HybridCluster$ sudo procstat -kk 3751
  PID    TID COMM     TDNAME           KSTACK
 3751 100264 zfs      -                mi_switch+0x16f sleepq_wait+0x42 _sleep+0x31c zfsvfs_teardown+0x269 zfs_umount+0x1a7 dounmount+0x28a unmount+0x3c8 syscall+0x1e7 Xfast_syscall+0xe1

hybrid@dev3:/opt/HybridCluster$ sudo procstat -kk 41
  PID    TID COMM     TDNAME           KSTACK
   41 100058 zfskern  arc_reclaim_thre mi_switch+0x16f sleepq_timedwait+0x42 _cv_timedwait+0x129 arc_reclaim_thread+0x2d1 fork_exit+0x118 fork_trampoline+0xe
   41 100062 zfskern  l2arc_feed_threa mi_switch+0x16f sleepq_timedwait+0x42 _cv_timedwait+0x129 l2arc_feed_thread+0x1be fork_exit+0x118 fork_trampoline+0xe
   41 100090 zfskern  txg_thread_enter mi_switch+0x16f sleepq_wait+0x42 _cv_wait+0x111 txg_thread_wait+0x79 txg_quiesce_thread+0xb5 fork_exit+0x118 fork_trampoline+0xe
   41 100091 zfskern  txg_thread_enter mi_switch+0x16f sleepq_timedwait+0x42 _cv_timedwait+0x129 txg_thread_wait+0x3c txg_sync_thread+0x355 fork_exit+0x118 fork_trampoline+0xe

I will continue to attempt to create a shell script which makes this latter bug easily reproducible. In the meantime, what further information can I gather? I will build a debug kernel in the morning.

If it helps accelerate finding a solution to this problem, Hybrid Logic Ltd might be able to fund a small bounty for a fix. Contact me off-list if you can help in this way.

--
Best Regards,
Luke Marsden
CTO, Hybrid Logic Ltd.

Web: http://www.hybrid-cluster.com/
Hybrid Web Cluster - cloud web hosting

Phone: +441172232002 / +16179496062
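For the debug kernel, a hedged sketch of a config suited to this sort of
hang (the DEBUG name and option selection are illustrative, not from the
thread; these options are commented out in a stock 8.x GENERIC):

# /usr/src/sys/amd64/conf/DEBUG
include         GENERIC
ident           DEBUG
options         INVARIANTS          # runtime consistency checks
options         INVARIANT_SUPPORT
options         WITNESS             # lock-order verification
options         WITNESS_SKIPSPIN    # skip spin mutexes to cut overhead
options         DEBUG_VFS_LOCKS     # catches vnode locking bugs like this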
ZFS hanging with simultaneous zfs recv and zfs umount -f
Hi FreeBSD-{stable,current,fs},

I've reliably been able to cause the ZFS subsystem to hang under FreeBSD 8.1-RELEASE under the following conditions:

Another server is sending this server an incremental snapshot stream, which is in the process of being received with:

zfs send -I $OLD $FS@$NEW | ssh $HOST zfs recv -uF $FILESYSTEM

On the receiving server, we forcibly unmount the filesystem which is being received into with:

zfs umount -f $FILESYSTEM

(the filesystem may or may not actually be mounted)

This causes any ZFS file operation (such as ls) to hang forever, and when attempting to reboot the machine, it goes down and stops responding to pings, but then hangs somewhere in the reboot process and needs a hard power cycle. Unfortunately we don't have a remote console on this machine.

I understand this is a fairly harsh use case, but the ideal behaviour would be for the zfs recv to emit an error message (if necessary) rather than rendering the entire machine unusable ;-)

Let me know if you need any further information. I appreciate that providing a script to reliably reproduce the problem, testing on -CURRENT and 8.2-PRE, and submitting a bug report will help... I will do this in due course, but don't have time right now -- just wanted to get this bug report out there first in case there's an obvious fix.

Thank you for supporting ZFS on FreeBSD!!

--
Best Regards,
Luke Marsden
CTO, Hybrid Logic Ltd.

Web: http://www.hybrid-cluster.com/
Hybrid Web Cluster - cloud web hosting

Phone: +441172232002 / +16179496062
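A minimal sketch of the race as a single script, using the same
placeholder variables as above (the timing is illustrative; run from the
sending host):

#!/bin/sh
# kick off the receive, then force-unmount the target mid-stream
zfs send -I "$OLD" "$FS@$NEW" | ssh "$HOST" zfs recv -uF "$FILESYSTEM" &
sleep 5                                    # let the stream get in flight
ssh "$HOST" zfs umount -f "$FILESYSTEM"    # ZFS ops on $HOST now hang
wait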
Virtio drivers for FreeBSD on KVM
Hi everyone,

With more cloud infrastructure providers using KVM than ever before, the importance of having FreeBSD perform well as a guest on these infrastructures [1], [2], [3] is increasing. It seems that the virtio drivers give a pretty significant performance boost [4], [5].

There was a NetBSD driver, and there seems to (have been) some work happening to port it to DragonFly BSD at [6] and [7] -- does anyone know if this code is stable, if it has stalled, or if anyone's working on it? It may be possible to use the work done on the Xen paravirtualised network and disk drivers, combined with the NetBSD code, as a starting point for an implementation.

My company might soon be in a position to sponsor the work to get this completed and available at some point in FreeBSD 8. I'd be very interested to hear from anyone who's involved, or who might like to be.

--
Best Regards,
Luke Marsden
CTO, Hybrid Logic Ltd.

Web: http://www.hybrid-cluster.com/
Hybrid Web Cluster - cloud web hosting

Mobile: +447791750420

[1] http://www.elastichosts.com/
[2] http://www.cloudsigma.com/
[3] http://beta.brightbox.com/
[4] http://arstechnica.com/civs/viewtopic.php?f=16&t=34039
[5] blog.loftninjas.org/2008/10/22/kvm-virtio-network-performance/
[6] kerneltrap.org/mailarchive/dragonflybsd-kernel/2010/10/23/6884356
[7] http://gitorious.org/virtio-drivers
Re: Problem running 8.1R on KVM with AMD hosts
Hi FreeBSD-stable,

> 1. Please, build your kernel with debug symbols.
> 2. Show kgdb output

I could not convince the kernel to dump (it was looping forever but not panicking), but I have managed to compile a kernel with debugging symbols and DDB which immediately drops into the debugger when the problem occurs; see the screenshot at:

http://lukemarsden.net/kvm-panic.png

Progress, I sense. I tried typing 'panic' on the understanding that this should force a panic and dump core to the configured swap device (I have set the dump* variables in /etc/rc.conf) so that I could get you the kgdb output, but it just looped back into the debugger. This issue seems to occur very early in the boot process.

I would like to invite anyone with the skills and the inclination to have a poke around with this directly over VNC to email me off-list, and I will turn on the VM and send you the VNC credentials. My email address is: luke [at] hybrid-logic.co.uk. Or you can catch me on Skype at luke.marsden. I'm in GMT+1.

I look forward to hearing from you ;-)

--
Best Regards,
Luke Marsden
Hybrid Logic Ltd.

Web: http://www.hybrid-cluster.com/
Hybrid Web Cluster - cloud web hosting based on FreeBSD and ZFS

Mobile: +447791750420
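For reference, the dump* settings referred to above are rc.conf(5)
variables along these lines (the values here are illustrative):

# /etc/rc.conf
dumpdev="AUTO"          # or an explicit swap device, e.g. /dev/ad0s1b
dumpdir="/var/crash"    # where savecore(8) writes the core after reboot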
Re: Problem running 8.1R on KVM with AMD hosts
On Thu, 2010-09-30 at 18:55 -0400, Jung-uk Kim wrote:
> It seems the MCA capability is advertised by the CPUID translator but
> writing to the MSRs causes a GPF. In other words, it seems like a CPU
> emulator bug. A simple workaround is 'set hw.mca.enabled=0' from the
> loader prompt. If it works, add hw.mca.enabled=0 in /boot/loader.conf
> to make it permanent. MCA does not make any sense under emulation
> anyway.

Awesome, this allows us to boot 8.1R on Linux KVM with AMD hardware! Thank you very much. This has just doubled our number of availability zones.

--
Best Regards,
Luke Marsden
Hybrid Logic Ltd.

Web: http://www.hybrid-cluster.com/
Hybrid Web Cluster - cloud web hosting based on FreeBSD and ZFS

Mobile: +447791750420
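Spelled out, the two-step workaround Jung-uk Kim describes - first test
from the loader prompt, then persist it:

OK set hw.mca.enabled=0
OK boot

# then, in /boot/loader.conf, to make it permanent:
hw.mca.enabled="0"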
Re: Problem running 8.1R on KVM with AMD hosts
Hi all,

Thanks for your responses.

> 1. Please, build your kernel with debug symbols.
> 2. Show kgdb output

I will build a debug kernel as per your instructions and post the results as soon as I can, likely in the next couple of days. I have secured test hardware at ElasticHosts to debug this as necessary.

As a reference point, 8.0R runs fine on this particular infrastructure: Linux KVM on AMD hardware. More detail to follow.

Thank you.

--
Best Regards,
Luke Marsden
Hybrid Logic Ltd.

Web: http://www.hybrid-cluster.com/
Hybrid Web Cluster - cloud web hosting based on FreeBSD and ZFS

Mobile: +447791750420
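A hedged sketch of the standard debug-kernel procedure being referred to
(the "DEBUG" config name is illustrative; a stock 8.x GENERIC already
builds symbols via 'makeoptions DEBUG=-g'):

cd /usr/src
make buildkernel KERNCONF=DEBUG
make installkernel KERNCONF=DEBUG
shutdown -r now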