Re: Stale NFS file handles on 8.x amd64
Hi, Adam--

On Nov 29, 2010, at 5:06 PM, Adam McDougall wrote:
> I've been running dovecot 1.1 on FreeBSD 7.x for a while with a bare
> minimum of NFS problems, but it got worse with 8.x. I have 2-4 servers
> (usually just 2) accessing mail on a Netapp over NFSv3 via imapd.
> Delivery is via procmail which doesn't touch the dovecot metadata and
> webmail uses imapd. Client connections to imapd go to random servers
> and I don't yet have solid means to keep certain users on certain
> servers.

Are you familiar with: http://wiki1.dovecot.org/NFS

Basically, you're running a "try to avoid doing this" configuration, but it
does discuss some options to improve the situation. If you can tolerate the
performance hit, try disabling the NFS attribute cache...

Regards,
-- 
-Chuck

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
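For reference, disabling the attribute cache on a FreeBSD NFS client can be sketched as below. The server export and mountpoint are hypothetical, and the option names should be checked against your mount_nfs(8) before use:

```sh
# Hypothetical filer export and mountpoint. Zeroing all four
# attribute-cache timeouts effectively disables attribute caching;
# expect a noticeable increase in GETATTR traffic to the NFS server.
mount_nfs -o acregmin=0,acregmax=0,acdirmin=0,acdirmax=0 \
    netapp:/vol/mail /mail
```

The same options can equally be placed in /etc/fstab for a persistent mount.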
Re: ZFS panic after replacing log device
On 11/16/2010 8:41 PM, Terry Kennedy wrote:
>> I would say it is definitely very odd that writes are a problem.
>> Sounds like it might be a hardware problem. Is it possible to export
>> the pool, remove the ZIL and re-import it? I myself would be pretty
>> nervous trying that, but it would help isolate the problem? If you can
>> risk it.
>
> I think it is unlikely to be a hardware problem. While I haven't run
> any destructive testing on the ZFS pool, the fact that it can be read
> without error, combined with ECC throughout the system and the panic
> always happening on the first write, makes me think that it is a
> software issue in ZFS.
>
> When I do: zpool export data; zpool remove data da0
> I get a "No such pool: data". I then re-imported the pool and did:
> zpool offline data da0; zpool export data; zpool import data
>
> After doing that, I can write to the pool without a panic. But once I
> online the log device and do any writes, I get the panic again.
>
> As I mentioned, I have this data replicated elsewhere, so I can
> experiment with the pool if it will help track down this issue.

Any more news on this?

-- 
Dan Langille - http://langille.org/
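Laid out step by step, the sequence described above that got the pool writable again (pool name and device are from the report) is:

```sh
zpool offline data da0   # take the log device offline first
zpool export data
zpool import data
# The pool now accepts writes without panicking. Per the report,
# 'zpool online data da0' followed by any write re-triggers the panic.
```

Note the ordering matters here: a plain `zpool export data; zpool remove data da0` failed with "No such pool: data", since the pool was no longer imported when the remove ran.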
Re: Stale NFS file handles on 8.x amd64
On 11/29/10 20:35, Chuck Swiger wrote:
> Hi, Adam--
>
> On Nov 29, 2010, at 5:06 PM, Adam McDougall wrote:
>> I've been running dovecot 1.1 on FreeBSD 7.x for a while with a bare
>> minimum of NFS problems, but it got worse with 8.x. I have 2-4 servers
>> (usually just 2) accessing mail on a Netapp over NFSv3 via imapd.
>> Delivery is via procmail which doesn't touch the dovecot metadata and
>> webmail uses imapd. Client connections to imapd go to random servers
>> and I don't yet have solid means to keep certain users on certain
>> servers.
>
> Are you familiar with: http://wiki1.dovecot.org/NFS
>
> Basically, you're running a "try to avoid doing this" configuration,
> but it does discuss some options to improve the situation. If you can
> tolerate the performance hit, try disabling the NFS attribute cache...
>
> Regards,

I am familiar with that page, have taken it into account, worked closely
with Timo the author of Dovecot, and my mail servers have been running
close enough to perfect on 7.x for years. The FreeBSD version is the only
major change that I can think of at this point, other than the versions
of other ports. I'm planning to revert some to 7.x to make sure.
Re: Stale NFS file handles on 8.x amd64
On Mon, Nov 29, 2010 at 08:06:54PM -0500, Adam McDougall wrote:
> I've been running dovecot 1.1 on FreeBSD 7.x for a while with a bare
> minimum of NFS problems, but it got worse with 8.x. I have 2-4 servers
> (usually just 2) accessing mail on a Netapp over NFSv3 via imapd.
> Delivery is via procmail which doesn't touch the dovecot metadata and
> webmail uses imapd. Client connections to imapd go to random servers
> and I don't yet have solid means to keep certain users on certain
> servers. I upgraded some of the servers to 8.x and dovecot 1.2 and ran
> into stale NFS file handles causing index/uidlist corruption, causing
> inboxes to appear as empty when they were not. In some situations their
> corrupt index had to be deleted manually. I first suspected dovecot 1.2
> since it was upgraded at the same time, but I downgraded to 1.1 and
> it's doing the same thing. I don't really have a wealth of details to
> go on yet and I usually stay quiet until I do, and half the time it is
> difficult to reproduce myself, so I've had to put it in production to
> get a feel for progress. This only happens a dozen or so times per
> weekday but I feel the need to start taking bigger steps. I'll probably
> do what I can to get IMAP back on a stable base (7.x?) and also try to
> debug 8.x on the remaining servers. A binary search is within
> possibility if I can reproduce the symptoms often enough, even if I
> have to put a test server in production for a few hours.
>
> Any tips on where we could start looking, or alterations I could try
> making such as sysctls to return to older behavior?

http://wiki1.dovecot.org/NFS is a good start, especially if this problem
is only seen with Dovecot. I would start there, especially adjusting your
dovecot.conf to include the necessary directives.

> It might be worth noting that I've seen a considerable increase in
> traffic from my mail servers since the 8.x upgrade timeframe, on the
> order of 5-10x as much traffic to the NFS server. dovecot tries its
> hardest to flush out the access cache when needed and it was working
> well enough since about 1.0.16 (years ago). It seems like FreeBSD is
> what regressed in this scenario. dovecot 2.x is going in a different
> direction from my situation and I'm not ready to start testing that
> immediately if I can avoid it, as it will involve some restructuring.
>
> Thanks for any input. For now the following errors are about all I have
> to go on:
>
> Nov 29 11:07:54 server1 dovecot: IMAP(user1): o_stream_send(/home/user1/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
> Nov 29 13:19:51 server1 dovecot: IMAP(user1): o_stream_send(/home/user1/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
> Nov 29 14:35:41 server1 dovecot: IMAP(user2): o_stream_send(/home/user2/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
> Nov 29 15:07:05 server1 dovecot: IMAP(user3): read(mail, uid=128990) failed: Stale NFS file handle
>
> Nov 29 11:57:22 server2 dovecot: IMAP(user4): open(/egr/mail/shared/vprgs/dovecot-acl-list) failed: Stale NFS file handle
> Nov 29 14:04:22 server2 dovecot: IMAP(user5): o_stream_send(/home/user5/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
> Nov 29 14:27:21 server2 dovecot: IMAP(user6): o_stream_send(/home/user6/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
> Nov 29 15:44:38 server2 dovecot: IMAP(user7): open(/egr/mail/shared/decs/dovecot-acl-list) failed: Stale NFS file handle
> Nov 29 19:04:54 server2 dovecot: IMAP(user8): o_stream_send(/home/user8/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
>
> Nov 29 06:32:11 server3 dovecot: IMAP(user9): open(/egr/mail/shared/cmsc/dovecot-acl-list) failed: Stale NFS file handle
> Nov 29 10:03:58 server3 dovecot: IMAP(user10): o_stream_send(/home/user10/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle

-- 
| Jeremy Chadwick                              j...@parodius.com |
| Parodius Networking                  http://www.parodius.com/ |
| UNIX Systems Administrator              Mountain View, CA, USA |
| Making life hard for others since 1977.        PGP: 4BD6C0CB  |
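As a starting point, the dovecot.conf directives that the wiki page cited above discusses look roughly like the following for the Dovecot 1.x era. This is a sketch, not the poster's actual configuration; verify each setting name against your Dovecot version's documentation before relying on it:

```
# dovecot.conf fragment for NFS-shared maildirs (Dovecot 1.x era;
# verify against http://wiki1.dovecot.org/NFS before use)
mmap_disable = yes        # don't mmap index files over NFS
mail_nfs_storage = yes    # flush NFS caches around mail file access
mail_nfs_index = yes      # flush NFS caches around index file access
lock_method = fcntl       # fcntl locking generally works over NFSv3
```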
Stale NFS file handles on 8.x amd64
I've been running dovecot 1.1 on FreeBSD 7.x for a while with a bare
minimum of NFS problems, but it got worse with 8.x. I have 2-4 servers
(usually just 2) accessing mail on a Netapp over NFSv3 via imapd.
Delivery is via procmail which doesn't touch the dovecot metadata, and
webmail uses imapd. Client connections to imapd go to random servers and
I don't yet have solid means to keep certain users on certain servers.

I upgraded some of the servers to 8.x and dovecot 1.2 and ran into stale
NFS file handles causing index/uidlist corruption, causing inboxes to
appear as empty when they were not. In some situations their corrupt
index had to be deleted manually. I first suspected dovecot 1.2 since it
was upgraded at the same time, but I downgraded to 1.1 and it's doing the
same thing. I don't really have a wealth of details to go on yet and I
usually stay quiet until I do, and half the time it is difficult to
reproduce myself, so I've had to put it in production to get a feel for
progress. This only happens a dozen or so times per weekday, but I feel
the need to start taking bigger steps. I'll probably do what I can to get
IMAP back on a stable base (7.x?) and also try to debug 8.x on the
remaining servers. A binary search is within possibility if I can
reproduce the symptoms often enough, even if I have to put a test server
in production for a few hours.

Any tips on where we could start looking, or alterations I could try
making such as sysctls to return to older behavior? It might be worth
noting that I've seen a considerable increase in traffic from my mail
servers since the 8.x upgrade timeframe, on the order of 5-10x as much
traffic to the NFS server. dovecot tries its hardest to flush out the
access cache when needed and it was working well enough since about
1.0.16 (years ago). It seems like FreeBSD is what regressed in this
scenario.
dovecot 2.x is going in a different direction from my situation and I'm
not ready to start testing that immediately if I can avoid it, as it will
involve some restructuring.

Thanks for any input. For now the following errors are about all I have
to go on:

Nov 29 11:07:54 server1 dovecot: IMAP(user1): o_stream_send(/home/user1/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 13:19:51 server1 dovecot: IMAP(user1): o_stream_send(/home/user1/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 14:35:41 server1 dovecot: IMAP(user2): o_stream_send(/home/user2/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 15:07:05 server1 dovecot: IMAP(user3): read(mail, uid=128990) failed: Stale NFS file handle

Nov 29 11:57:22 server2 dovecot: IMAP(user4): open(/egr/mail/shared/vprgs/dovecot-acl-list) failed: Stale NFS file handle
Nov 29 14:04:22 server2 dovecot: IMAP(user5): o_stream_send(/home/user5/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 14:27:21 server2 dovecot: IMAP(user6): o_stream_send(/home/user6/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 15:44:38 server2 dovecot: IMAP(user7): open(/egr/mail/shared/decs/dovecot-acl-list) failed: Stale NFS file handle
Nov 29 19:04:54 server2 dovecot: IMAP(user8): o_stream_send(/home/user8/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle

Nov 29 06:32:11 server3 dovecot: IMAP(user9): open(/egr/mail/shared/cmsc/dovecot-acl-list) failed: Stale NFS file handle
Nov 29 10:03:58 server3 dovecot: IMAP(user10): o_stream_send(/home/user10/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Re: ath0: lot of bad series hwrate and AH_SUPPORT_AR5416
(I should get me an AR9285 to test with at some point.)

On 29 November 2010 18:52, David DEMELIER wrote:
> ath0: bad series3 hwrate 0x1b, tries 2 ts_status 0x0
> ath0: bad series3 hwrate 0x1b, tries 2 ts_status 0x0
> ath0: bad series3 hwrate 0x1b, tries 2 ts_status 0x0
> ath0: bad series3 hwrate 0x1b, tries 2 ts_status 0x0

That's ath_rate_sample saying "I don't know about that hardware rate",
but it transmitted successfully! So something queued up a packet at that
hwrate. 0x1B is CCK_1MB_L - that should be fine in 11bg? Or is this
somehow running in 11A mode? Would you please paste 'ifconfig wlan0'
here?

> I also don't understand why the option AH_SUPPORT_AR5416 is needed in
> kernel, I found this
> http://lists.freebsd.org/pipermail/freebsd-current/2009-May/006417.html
> and it should include it if the driver needs it, isn't it? Here my
> kernel won't build if I remove it so I have :
>
> options AH_SUPPORT_AR5416
> device ath
> device ath_hal
> device ath_rate_sample

Would you please file a PR for that and email me the PR number?

Thanks,

adrian
Re: reboot halts on atom D525
On 11/28/10 23:28, Andrei Kolu wrote:
> Hi,
>
> trouble with rebooting on Intel Atom motherboard D525MWV.
> --
> ...
> All buffers synced.
> Uptime 4m 27s.
> Rebooting...
> CPU_reset: Stopping other CPUs
> _
> --
> System halts completely - no response to CTRL+ALT+DEL. Only way to
> restart is to press reset or power off. No error messages.

Will a "sysctl hw.acpi.handle_reboot=1" before rebooting help?

Cheers,
- -- 
Xin LI http://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die
Re: puc(4) and pucdata.c
On Mon, Nov 29, 2010 at 10:31:25AM -0500, John Baldwin wrote:
> On Friday, November 26, 2010 1:10:48 pm Jonathan Chen wrote:
>> Hi,
>>
>> I've recently added a new PCI 1 Parallel Port card, and I'm trying to
>> get it recognised by my 8-STABLE/amd64 system.
>
> puc(4) probably ignores single-port devices. Try adding the device ID
> to sys/dev/ppc/ppc_pci.c instead.

Yup. Got it working now! Thanks!
-- 
Jonathan Chen
--
"If everything's under control, you're going too slow" - Mario Andretti
Re: puc(4) and pucdata.c
On Friday, November 26, 2010 1:10:48 pm Jonathan Chen wrote:
> Hi,
>
> I've recently added a new PCI 1 Parallel Port card, and I'm trying to
> get it recognised by my 8-STABLE/amd64 system.

puc(4) probably ignores single-port devices. Try adding the device ID to
sys/dev/ppc/ppc_pci.c instead.

-- 
John Baldwin
Re: e1000 + vlan = connection lost after ifconfig + vlanid + vlandev
On Monday, November 22, 2010 9:09:17 pm Rudolph Sand wrote:
> Hi, I noticed that *something has changed* regarding vlan creation
> since 8.0-rel and 7.1-rel, compared to 8.1-rel / 7-stable
>
> When creating a vlan under 8.1-rel (csup'ed the sources yesterday),
> the box loses all connectivity

Yes, this has to do with changes to the driver to fix bugs in its VLAN
hardware filter. If you grab the latest 8-stable it should be back to
working fine by default. If you enable 'vlanhwfilter' via ifconfig you
will lose link when adding or removing VLANs again.

-- 
John Baldwin
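For anyone reproducing this, a test might look like the sketch below. The NIC name and VLAN tag are hypothetical; `vlanhwfilter` is a per-interface capability toggled through ifconfig(8):

```sh
# Create VLAN 100 on em0 (hypothetical NIC name):
ifconfig vlan100 create vlan 100 vlandev em0

# Toggle the hardware VLAN filter discussed above; with it enabled,
# adding/removing VLANs can drop link on affected driver versions:
ifconfig em0 vlanhwfilter
ifconfig em0 -vlanhwfilter
```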
Re: memory leak and swapfile
On Sat, 27 Nov 2010, Kevin Oberman wrote:
> > Date: Sat, 27 Nov 2010 04:17:43 -0800
> > From: Jeremy Chadwick
> >
> > On Fri, Nov 26, 2010 at 07:12:59PM -0800, Kevin Oberman wrote:
> > > > From: "Jack Raats"
> > > > Date: Fri, 26 Nov 2010 19:17:05 +0100
> > > > Sender: owner-freebsd-sta...@freebsd.org
> > > >
> > > > It looks like that there may be a memory leak of my swap space
> > > > with one of the processes that is running.
> > > > Big question: How can I determine which process is responsible.
> > > >
> > > > Any suggestions?
> > >
> > > ps -aux. Look for processes with large values in the VSZ column.
> > >
> > > I'm sure that there are other ways to see this, but that's an easy
> > > one. You can, of course, pipe the output to sort and use the -k 5
> > > -n options.
> >
> > I believe he should be looking for a process that has a large value
> > in RSS ("RES" in top), not VSZ ("SIZE" in top).
>
> I believe it's not that simple, but I think my answer is more likely to
> point to the culprit than yours.

I think so too, given Jack suggested growing swap use was the issue.

> I am not terribly familiar with the details of the FreeBSD virtual
> memory system, but I assume that it is similar to others I have known
> very well over the years, primarily VMS and OSF/1 where I did a lot of
> kernel programming.
>
> FreeBSD does not do "greedy" swap allocation. Some systems will
> "reserve" space in swap for all active memory. FreeBSD only uses swap
> space when it is needed. RES shows the amount of physical memory a
> process is using. VSZ is in KB while RES is in pages (I think), so the
> numbers wind up looking "odd".

top's SIZE and RES are given in KB (suffixed 'K', or 'M' over 100M) and
ps' VSZ and RSS are both given in unadorned KB. I expect Jeremy's right
about procstat RES being in pages, and that might be a useful view once
processes are under pressure to swap unused (or leaked!) pages out.
> If VSZ is bigger than (RES * page-size in KB), then the entire process
> memory space is not in physical memory. It is in one of three other
> places:
> 1. Imaginary memory (demand-zero pages)
> 2. Unmapped space (any pages of the image that have not been loaded
>    into physical memory)
> 3. Swap
>
> It's very hard to determine how much is where, though unread image
> pages are not likely to be significant. Some applications set up huge
> buffers of demand-zero memory which may never be used. This is the
> virtual memory equivalent of a sparse file. Until a demand-zero page is
> written to, it takes a page table slot, but does not use either
> physical memory or swap space.
>
> That all said, memory leakage is memory that has been used, but not
> freed. It is never accessed, so drops into swap space when memory
> pressure triggers the system to look for pages not recently accessed.
> It goes to swap and stays there until the process exits, and VSZ just
> keeps growing.

Yep, think I demonstrated just that with my lil'-iron example?

> If you monitor VSZ and it just keeps growing when the process is not
> doing anything that should require ever-increasing memory, it's
> probably a memory leak.
>
> While RES alone tells you only what is in memory and nothing about swap
> use, a leaky process will start by growing RES, then eventually start
> having old pages swapped out, so RES stops growing and VSZ keeps
> growing. If some process grows with pages that are being actively
> accessed (not a leak), RES may get large, but unless memory pressure is
> great enough, will use little swap.
>
> Bottom line is that, if the system works the way I believe it does, VSZ
> is the best, if not ideal, check for memory leaks that fill swap.

Agreed, in my experience anyway.

All this prompted me to write a little script, below. Tested on 8.1-S and
(cough) 5.5-S. Not really sure when top's default display changed.
Things vary over the time it's running of course, but it confirms that
top and ps are seeing the same virtual and resident sizes, of which I'd
never been sure.

% grep memory /var/run/dmesg.boot
real memory  = 167772160 (160 MB)
avail memory = 154341376 (147 MB)
% swapinfo
Device      1K-blocks     Used    Avail Capacity
/dev/ad0s2b    393216   173128   220088      44%
% procsize
From ps -aux: 157 procs, virtual 1022784K (998M), resident 95932K (93M)
From top -S : 157 procs, virtual 1022820K (998M), resident 95976K (93M)

cheers, Ian

===
#!/bin/sh
# procsize 0.5 smithi 27,29/11/10 .. compare ps -aux VSZ,RSS vs top SIZE,RES

tempfile=/tmp/`basename $0`.$$
vszall=0; rssall=0; procnt=0
ps -aux >$tempfile                      # can't update parent vars in pipe
while read user pid cpu mem vsz rss tt stat started time command; do
        [ $user = USER ] && continue
        [ "$command" = "ps -aux" ] && continue  # vs top -t
        vszall=$((vszall + $vsz))
        rssall=$((rssall + $rss))               # both
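The totals the procsize script accumulates can also be had from a one-liner; this is a sketch assuming a ps that accepts `-o vsz`/`-o rss` with values in KB (true of both FreeBSD and Linux ps):

```shell
# Sum VSZ and RSS (in KB) over all processes, as procsize does,
# letting awk do the accumulation instead of a shell read loop.
ps ax -o vsz= -o rss= | awk '
    {v += $1; r += $2; n++}
    END {printf "%d procs, virtual %dK, resident %dK\n", n, v, r}'
```

Comparing its output against a simultaneous `top -S` run is the same cross-check the script performs.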
ath0: lot of bad series hwrate and AH_SUPPORT_AR5416
Hi, I just bought an Atheros 9285 for my laptop. I love it, but there is
a lot of this in my dmesg:

ath0: bad series3 hwrate 0x1b, tries 2 ts_status 0x0
ath0: bad series3 hwrate 0x1b, tries 2 ts_status 0x0
ath0: bad series3 hwrate 0x1b, tries 2 ts_status 0x0
ath0: bad series3 hwrate 0x1b, tries 2 ts_status 0x0

I also don't understand why the option AH_SUPPORT_AR5416 is needed in
the kernel. I found this
http://lists.freebsd.org/pipermail/freebsd-current/2009-May/006417.html
and it should include it if the driver needs it, shouldn't it? Here my
kernel won't build if I remove it, so I have:

options AH_SUPPORT_AR5416
device ath
device ath_hal
device ath_rate_sample

Kind regards,
-- 
Demelier David