Re: kern/135412: [zfs] [nfs] zfs(v13)+nfs and open(..., O_WRONLY|O_CREAT|O_EXCL, ...)
On 2009-06-30, Mike Andrews wrote: Jaakko Heinonen wrote: On 2009-06-30, Danny Braniss wrote:

This PR is really holding me back: I can't upgrade this server, and telling several tens of users to use cp, etc. is not an option. The open works fine if not using O_EXCL.

I guess that r185586 needs to be MFCd to stable/7. Here's an untested patch against stable/7:

The patch doesn't help over here, sorry. Simply doing 'touch' or 'mv' to an NFSv3 mount (using either a v6 or v13 zpool) is the test case I've been using; touch doesn't even use O_EXCL as far as I can tell.

I could reproduce the problem with O_EXCL and verified that the patch fixes it. However, I couldn't reproduce the problem you are seeing with touch and mv.

same here, touch worked before too, so I think it's unrelated. btw, it seems that the problem does not exist on i386, though I'm pretty sure I tried there too. oh well, thanks!

danny
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
kern/135412: [zfs] [nfs] zfs(v13)+nfs and open(..., O_WRONLY|O_CREAT|O_EXCL, ...)
hi,

This PR is really holding me back: I can't upgrade this server, and telling several tens of users to use cp, etc. is not an option. The open works fine if not using O_EXCL.

Thanks,
danny
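For anyone wanting to check the O_EXCL case from the PR title on their own mount, a minimal sketch follows. It is not the test program from the PR; the `TESTDIR` environment variable and the `excl-test` file name are assumptions made up for this example.

```python
import os
import tempfile


def exclusive_create(path):
    """Create path with O_WRONLY|O_CREAT|O_EXCL.

    Returns True if the file was created and False if it already existed.
    Any other failure (such as the one described in the PR) is raised
    as OSError so that the errno is visible.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False
    os.close(fd)
    return True


if __name__ == "__main__":
    # Assumption: TESTDIR points at a directory on the NFS mount under test.
    # Without it, a local temporary directory is used as a sanity check.
    testdir = os.environ.get("TESTDIR") or tempfile.mkdtemp()
    target = os.path.join(testdir, "excl-test")
    print("first create succeeded:", exclusive_create(target))
    print("second create succeeded:", exclusive_create(target))
    os.unlink(target)
```

On a healthy filesystem the first call should succeed and the second should report that the file exists; an OSError on the first call against the NFS mount would match the symptom in the subject line.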
Re: reecommendations for an 'appliance platform ?
I'm not 100% sure, but fairly sure that you'll have a hard time finding something that combines the low-power standalone type spec with a 64-bit capable processor. Once you get the higher-end processor,

That was my experience when shopping around, yes. Annoying, as I don't need anything particularly low power (it ain't going to be my leccy bill :-).

becomes a more likely failure point, and so on. Can you elaborate a bit more on which parts of that system spec you really need - do you need the GigE? Two ethernets? The external SATA?

It needs to be:

1) Complete as purchased - I don't want to build a machine
2) Capable of having a simple boot device (e.g. CF card) dropped in
3) At least one ether port. 100 meg will do.
4) Small enough to be posted to the end user
5) Cheap - under 400 euros, preferably 300

I do not really care about processor speed, or memory, or power consumption. It needs to run FreeBSD, and I would prefer amd64 as we haven't written or used any of our code on 32 bit in a long time, and I would feel uneasy that there might be latent bugs in it if we simply recompiled it for 32 bit.

I bought one of these from them last year: http://www.fit-pc.com/new/fit-pc-1-0-specifications.html

Thanks for the links - that's pretty interesting! I notice the newer ones are also Atom based, so similarly spec'd to what I was looking at, but they may be more suitable.

cheers,
-pete.

I've had very good experience with: http://www.pcengines.ch/

danny
(no subject)
latest -stable (June 11) is causing problems. MB is intel SE7320VP21:

msk0: Ethernet address: 00:0e:0c:6a:85:a8
miibus0: MII bus on msk0
e1000phy0: Marvell 88E Gigabit PHY PHY 0 on miibus0
e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
msk0: watchdog timeout (missed Tx interrupts) -- recovering
msk0: watchdog timeout (missed Tx interrupts) -- recovering
msk0: watchdog timeout (missed Tx interrupts) -- recovering
...

cheers,
danny
Re: msk/stable
On Fri, Jun 12, 2009 at 10:57:42AM +0300, Danny Braniss wrote:

latest -stable (June 11) is causing problems. MB is intel SE7320VP21:

msk0: Ethernet address: 00:0e:0c:6a:85:a8
miibus0: MII bus on msk0
e1000phy0: Marvell 88E Gigabit PHY PHY 0 on miibus0
e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
msk0: watchdog timeout (missed Tx interrupts) -- recovering
msk0: watchdog timeout (missed Tx interrupts) -- recovering
msk0: watchdog timeout (missed Tx interrupts) -- recovering
...

I think there were not many msk(4) changes in stable. msk(4) in CURRENT has a lot of changes to support newer controllers. Does msk(4) in CURRENT make any difference? Also, please show me the dmesg output (the msk(4) related part) so I know which controller you have.

hrumph, missed some lines:

mskc0: Marvell Yukon 88E8050 Gigabit Ethernet port 0xb800-0xb8ff mem 0xdeefc000-0xdeef irq 16 at device 0.0 on pci2
msk0: Marvell Technology Group Ltd. Yukon EC Id 0xb6 Rev 0x02 on mskc0
msk0: Ethernet address: 00:0e:0c:6a:85:a8
miibus0: MII bus on msk0
miibus0: MII bus on msk0
e1000phy0: Marvell 88E Gigabit PHY PHY 0 on miibus0
e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
mskc0: [FILTER]

Btw, are you using MSI?

yes, but it was (so it seemed) working ok. i'll try again soon without msi. in the meantime, I'm running an older kernel, trying to finish a very long process (svn/svk), which when done, I will be able to compile current.

thanks,
danny
unionfs unlocking unheld lock
hi,

sporadically, I see this:

lockmgr: thread 0xff0004a8b390 unlocking unheld lock
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_lockmgr() at _lockmgr+0x6ae
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x46
unionfs_unlock() at unionfs_unlock+0x22f
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x46
vn_read() at vn_read+0x264
dofileread() at dofileread+0xa1
kern_readv() at kern_readv+0x4c
read() at read+0x54
syscall() at syscall+0x256
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (3, FreeBSD ELF64, read), rip = 0x8017545bc, rsp = 0x7fffaf98, rbp = 0x7fffb025 ---

sf-02 zgrep 'unlocking unheld lock' /var/log/messages*
/var/log/messages:May 25 03:03:37 sf-02 kernel: lockmgr: thread 0xff0004ed0720 unlocking unheld lock
/var/log/messages:May 31 03:03:10 sf-02 kernel: lockmgr: thread 0xff0004ed6ab0 unlocking unheld lock
/var/log/messages:Jun 10 03:03:19 sf-02 kernel: lockmgr: thread 0xff0004a8b390 unlocking unheld lock

it happens around 3 am, so I guess it must be some daily script that trips this.

danny
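The "around 3 am" observation can be checked mechanically by counting hits per hour of day. The sketch below uses the three log lines quoted above as sample input; the helper name `hits_per_hour` is made up for this example, and the regular expression assumes the usual syslog `HH:MM:SS` timestamp.

```python
import re
from collections import Counter

# The three matches quoted in the message above.
LOG_LINES = [
    "/var/log/messages:May 25 03:03:37 sf-02 kernel: lockmgr: thread 0xff0004ed0720 unlocking unheld lock",
    "/var/log/messages:May 31 03:03:10 sf-02 kernel: lockmgr: thread 0xff0004ed6ab0 unlocking unheld lock",
    "/var/log/messages:Jun 10 03:03:19 sf-02 kernel: lockmgr: thread 0xff0004a8b390 unlocking unheld lock",
]

# Assumption: syslog timestamps look like HH:MM:SS.
TIME_RE = re.compile(r"\b(\d{2}):(\d{2}):(\d{2})\b")


def hits_per_hour(lines):
    """Count matching log lines per hour of day (0-23)."""
    counts = Counter()
    for line in lines:
        match = TIME_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    for hour, count in sorted(hits_per_hour(LOG_LINES).items()):
        print(f"{hour:02d}:00  {count} hit(s)")
```

All three samples fall in the 03:00 hour, which is consistent with the guess that a scheduled daily job triggers the message; it does not identify which job.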
Re: Stable from May 31 - zfs list locked
Hello,

I have encountered this problem for the second time. The system is working perfectly well, but suddenly the command `zfs list' doesn't work and can't be killed. Here is a procstat of the culprit:

[r...@morzine ~]# procstat -k 91766
  PID    TID COMM             TDNAME           KSTACK
91766 100490 zfs              -                mi_switch sleepq_switch sleepq_wait _cv_wait zio_wait dbuf_read dmu_buf_hold zap_lockdir zap_lookup_norm zap_lookup dsl_prop_get_dd dsl_dataset_get_ref dsl_dataset_hold dmu_objset_open zfs_ioc_objset_stats zfsdev_ioctl devfs_ioctl_f kern_ioctl

The same thing happens if I try to run `zpool list' in another terminal.

Henri

same here, but with a twist: it used to happen on a 7.1, then after an upgrade, sometime in April, to 7.2-PRERELEASE, things were ok till today! I was about to blame a recent upgrade of the PERC firmware (a very long shot :-), but now I don't know if upgrading to 7.2-stable will help. This is a production host, with 12TB serving many nfs clients.

danny
msk(4) and Yukon
hi,

Since I saw some activity, I decided to try out my msks, so the Yukon 88E8050 on my Intel SE7320VP21 now works with hw.msk.legacy_intr=0 which didn't before (sorry, but the best I can say is 'long time ago' ;-)

on an Asus P5K-VM with Yukon 88E8056, it panics when used to PXE boot, but otherwise works fine (it used to hang the boot before).

cheers,
danny
Re: ZFS MFC heads up
I will be MFC'ing the newer ZFS support some time this afternoon. Both world and kernel will need to be re-built. Existing pools will continue to work without upgrade. If you choose to upgrade a pool to take advantage of new features you will no longer be able to use it with sources prior to today. 'zfs send/recv' is not expected to inter-operate between different pool versions.

I think this is not a zfs issue, but it does trigger the problem: on a zfs mounted system via nfs, from linux, an ls .zfs used to panic the server, now it just doesn't work :-). (http://lists.freebsd.org/pipermail/freebsd-fs/2008-October/005217.html) so is this being worked on?

btw, i will be trying the new version over the weekend.

danny
Re: FreeBSD and iSCSI for disks.
Danny Braniss wrote: Garance A Drosihn wrote:

Some friends of mine are looking at the new DroboPro, which makes a lot of disk space available via iSCSI (in addition to firewire 800), and they were wondering how well iSCSI works with FreeBSD. I haven't paid attention to iSCSI support. Is there anyone using it heavily for disk-storage under FreeBSD? Has there been much changed for iSCSI support in the 8.x branch, or is 7.x support working fine?

I suppose you are interested in the client (initiator) side of iSCSI support. It hasn't changed much between 7.x and 8.x but there are apparently some announcements of a newer version: http://lists.freebsd.org/pipermail/freebsd-scsi/2009-March/003834.html I can't find any more information on it.

the latest is in: http://www.cs.huji.ac.il/~danny/ftp/freebsd/iscsi-2.1.1.tar.gz

Thanks! Is there anything in particular you'd like to get tested in the new version, any significant changes or improvements?

mainly fixed some bugs, and some code cleanup. give it a spin, and let me know what target you are testing. btw, the default tag opening is a bit conservative (1), you might want to change it to somewhat larger, say 64 or 128.

cheers,
danny
Re: FreeBSD and iSCSI for disks.
Danny Braniss wrote: Danny Braniss wrote: Garance A Drosihn wrote:

Some friends of mine are looking at the new DroboPro, which makes a lot of disk space available via iSCSI (in addition to firewire 800), and they were wondering how well iSCSI works with FreeBSD. I haven't paid attention to iSCSI support. Is there anyone using it heavily for disk-storage under FreeBSD? Has there been much changed for iSCSI support in the 8.x branch, or is 7.x support working fine?

I suppose you are interested in the client (initiator) side of iSCSI support. It hasn't changed much between 7.x and 8.x but there are apparently some announcements of a newer version: http://lists.freebsd.org/pipermail/freebsd-scsi/2009-March/003834.html I can't find any more information on it.

the latest is in: http://www.cs.huji.ac.il/~danny/ftp/freebsd/iscsi-2.1.1.tar.gz

Thanks! Is there anything in particular you'd like to get tested in the new version, any significant changes or improvements?

mainly fixed some bugs, and some code cleanup. give it a spin, and let me know what target you are testing. btw, the default tag opening is a bit conservative (1), you might want to change it to somewhat larger, say 64 or 128.

Hi, camcontrol tags hangs:

Apr 9 15:36:36 terminator kernel: da3 at iscsi0 bus 0 target 1 lun 0
Apr 9 15:36:36 terminator kernel: da3: FreeBSD iSCSI DISK 0001 Fixed Direct Access SCSI-5 device
Apr 9 15:36:38 terminator kernel: (da2:iscsi0:0:0:0): lost device
Apr 9 15:36:38 terminator kernel: (da2:iscsi0:0:0:0): removing device entry

terminator:~ivoras/temp/sbin/iscontrol# ls /dev/da*
/dev/da0 /dev/da0s1 /dev/da0s1a /dev/da0s1b /dev/da0s1c /dev/da1 /dev/da3
terminator:~ivoras/temp/sbin/iscontrol# camcontrol tags da3

The configuration is:

target0 {
    targetaddress = 161.53.72.65
    targetname = iqn.2007-09.jp.ne.peach:disk1
    tags = 16
}

Q: what kernel?
Q: what target?

btw, without the camcontrol tags, is it working?

danny
Re: FreeBSD and iSCSI for disks.
Garance A Drosihn wrote:

Some friends of mine are looking at the new DroboPro, which makes a lot of disk space available via iSCSI (in addition to firewire 800), and they were wondering how well iSCSI works with FreeBSD. I haven't paid attention to iSCSI support. Is there anyone using it heavily for disk-storage under FreeBSD? Has there been much changed for iSCSI support in the 8.x branch, or is 7.x support working fine?

I suppose you are interested in the client (initiator) side of iSCSI support. It hasn't changed much between 7.x and 8.x but there are apparently some announcements of a newer version: http://lists.freebsd.org/pipermail/freebsd-scsi/2009-March/003834.html I can't find any more information on it.

the latest is in: http://www.cs.huji.ac.il/~danny/ftp/freebsd/iscsi-2.1.1.tar.gz

cheers,
danny
Re: Intel Integrated Raid (iir) relevance
Hi Xin LI,

(It would be probably good idea to redirect this discussion to -stable@, redirected)

ok by me.

Hi, Danny,

Danny Braniss wrote:

It's no longer working (for me) under 7.2, and so far I am not getting any feedback, so since it seems that this particular hardware has reached EOL, I was wondering if, a) it's true, b) drop it, and replace it. c) should time be spent in getting it to work again.

I'm not very sure about your problem with iir(4). A diff against RELENG_7_1 does not reveal any change on the driver itself. Are you sure that 7.1-R can have the device working?

it's definitely broken for me, it broke sometime after rev 189591. but the main questions are still unanswered. The problem I'm facing, together with the amr, is on hosts that are being decommissioned, and though I'll be sad turning them to scrap, they did serve us well.

thanks,
danny
Re: amr driver broken since March 12
Danny Braniss wrote:

it seems March 12 was a bit off :-) it took some time, but I managed to close the gap:

189100 ok
189150 fails

I will continue tomorrow, but this should be helpful.

189150 is in the middle of a big string of related commits. Try updating to the following change numbers and retesting:

189088
189107
189161

If the last one does not work, try editing /sys/dev/amr/amr.c to change

#define AMR_ENABLE_CAM 1

to

#define AMR_ENABLE_CAM 0

Scott

189161 works, also for the iir

now what?

danny
Re: amr driver broken since March 12
Danny Braniss wrote: Danny Braniss wrote:

it seems March 12 was a bit off :-) it took some time, but I managed to close the gap:

189100 ok
189150 fails

I will continue tomorrow, but this should be helpful.

189150 is in the middle of a big string of related commits. Try updating to the following change numbers and retesting:

189088
189107
189161

If the last one does not work, try editing /sys/dev/amr/amr.c to change

#define AMR_ENABLE_CAM 1

to

#define AMR_ENABLE_CAM 0

Scott

189161 works, also for the iir

now what?

Next set to try:

189219 broken
189229 broken

any point in going on?

danny

189253
189402
189531
189569
189591

Scott
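The back-and-forth above is a manual bisection between a known-good and a known-bad revision. A minimal sketch of the same search is below, assuming exactly one good-to-bad transition in the range. The `is_good` callback is hypothetical: in practice it stands for "update to that revision, rebuild the kernel, boot, and see whether the controller attaches", which only a person at the machine can answer.

```python
def bisect_revisions(good, bad, is_good):
    """Return the first bad revision in the range (good, bad].

    good    -- a revision known to work
    bad     -- a later revision known to fail
    is_good -- hypothetical callback: True if the given revision works
    Assumes every revision up to some point works and every one after fails.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_good(mid):
            good = mid
        else:
            bad = mid
    return bad


if __name__ == "__main__":
    # Bounds taken from the thread: 189161 works, 189219 is broken.
    # The lambda is a stand-in that pretends the breakage landed at 189200.
    first_bad = bisect_revisions(189161, 189219, lambda rev: rev < 189200)
    print("first bad revision:", first_bad)
```

Not every number in the range necessarily corresponds to a commit on the branch being tested, so a real run would skip to the nearest revision that does.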
Re: amr driver broken since March 12
Danny Braniss danny at cs.huji.ac.il writes:

at least for me :-) [and sorry for the cross posting]

[...]

amr0: LSILogic MegaRAID 1.53 mem 0xfbef-0xfbef,0xfe58-0xfe5f irq 27 at device 0.0 on pci4
amr0: [ITHREAD]
amr0: delete logical drives supported by controller
amr0: LSILogic Intel(R) RAID Controller SRCU42X Firmware 414I, BIOS A100, 128MB RAM
amr0: adapter is busy
amr0: adapter is busy
amr0: delete logical drives supported by controller
(probe0:amr0:0:6:0): TEST UNIT READY. CDB: 0 0 0 0 0 0
(probe0:amr0:0:6:0): CAM Status: SCSI Status Error
(probe0:amr0:0:6:0): SCSI Status: Check Condition
(probe0:amr0:0:6:0): ILLEGAL REQUEST asc:24,0
(probe0:amr0:0:6:0): Invalid field in CDB
(probe0:amr0:0:6:0): Unretryable error

FWIW, I have an amr device (Dell PERC 3/DC) which is working fine with a -STABLE dated after March 12th:

FreeBSD 7.2-PRERELEASE #2: Thu Mar 26 09:41:58 EDT 2009 te...@test4.tmk.com:/usr/obj/usr/src/sys/PE1550
[snip]
amr0: LSILogic MegaRAID 1.53 mem 0xf000-0xf7ff irq 25 at device 0.0 on pci3
amr0: [ITHREAD]
amr0: delete logical drives supported by controller
amr0: LSILogic PERC 3/DC Firmware 199D, BIOS 3.35, 128MB RAM
amr0: delete logical drives supported by controller
amrd0: LSILogic MegaRAID logical drive on amr0
amrd0: 69360MB (142049280 sectors) RAID 5 (optimal)
ses0 at amr0 bus 0 target 6 lun 0
ses0: DELL 1x3 U2W SCSI BP 1.21 Fixed Processor SCSI-2 device
ses0: SAF-TE Compliant Device
Trying to mount root from ufs:/dev/amrd0s1a

This is on a dual-processor Dell PowerEdge 1550. So this may only affect certain models or firmware revisions of amr devices. Of course, since each LSI OEM uses their own firmware and BIOS numbering scheme, it'll be hard to tell which one is newer than the other. I have a bazillion of these cards if one would be helpful to a developer.

well, it's broken on my Dell PowerEdge 2940

amr0: LSILogic MegaRAID 1.53
amr0: Series 467 Firmware 1.06

and pciconf:

a...@pci0:0:2:1: class=0x0e0001 card=0x04671028 chip=0x19608086 rev=0x02 hdr=0x00
    vendor = 'Intel Corporation'
    device = '80960RP i960RP Microprocessor'
    class = intelligent I/O controller
    subclass = I2O

now try to follow the rebranding trail :-)
Re: ALT_BREAK_TO... + ILO ... missing something in config ...
Due to an issue I'm having with 7.x, and trying to track it down, I spent tonight getting my server set up to allow me to break into the debugger when it hangs, and hopefully dump core ... But, although I *think* I've got it all, I'm obviously missing something, as it isn't breaking ...

First ... I'm running a proliant server, and when I connect via SSH to ILO on that machine, and type 'vsp', I get a shell as I expect, I can type, etc ... when I reboot the machine, I get the opening splash screen with the 7(?) options (normal boot, single user mode, etc, etc) ... but I get nothing between that and the login prompt ... first sign of a problem, maybe?

not really, you at least have confirmation that you are talking correctly via the serial port. Till this point boot is using the BIOS routines to talk via the serial port. Later, the kernel tries to use its own routines/knowledge of the serial port.

Next, the easy question ... what is the key stroke to issue when one has ALT_BREAK_TO_DEBUGGER set in the kernel? I thought it was CR ~ ^b ... is that correct? I'm using putty to connect via ssh, if that makes a difference ... I've also tried using the browser interface into ilo / vsp, same lack of a result ...

unless the serial port is set up as console (check if /boot/device.hints has: hint.sio.0.flags=0x10), escaping to the debugger is not caught. btw, Jeremy Chadwick had a nice explanation, but I lost the URL.

Beyond adding the sio device driver to my kernel, I've also got:

options ALT_BREAK_TO_DEBUGGER
options KDB
options DDB

Missing a kernel option maybe? I have the following in /boot/loader.conf:

comconsole_speed=9600
console=vidconsole,comconsole # A comma separated list of console(s)
boot_multicons=-D # -D: Use multiple consoles
boot_serial=-h # -h: Use serial console

So ... either I don't have it enabled like I think, or I'm doing the wrong key stroke ... or ...

Thx

you are very close!, but each hardware/bios needs a different solution :-(

danny
Re: amr driver broken since March 12
Danny Braniss wrote: Danny Braniss wrote:

at least for me :-) [and sorry for the cross posting] old (March 12 , i know need the svn rev number but...)

None of the commit activity on March 12 is jumping out at me as being suspicious. However, you are now the second person who has told me about AMR problems in 7.1 recently. If you have a precise svn change number, it would help greatly.

Scott

my bad. the last working amr/iir is from March 12. I first detected the problem sometime later, but not later than March 23. So it has to be changes in that time frame.

both drivers are showing similar symptoms: waiting for not busy. the iir goes on for ever, and it's the cam that eventually panics:

run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config

(actually not 100% true, depending if WITNESS is on or off, it sometimes just hangs). the amr seems to time out:

amr0: adapter is busy

thanks for looking into the problem,
danny

Ok, here are a series of revisions to step through, in forward order. Make sure that you are starting with at least revision 189568. Then, update to exactly the revision numbers below, recompile the kernel, and test:

190087
190091

it seems March 12 was a bit off :-) it took some time, but I managed to close the gap:

189100 ok
189150 fails

I will continue tomorrow, but this should be helpful.

cheers,
danny
amr driver broken since March 12
at least for me :-) [and sorry for the cross posting]

old (March 12 , i know need the svn rev number but...)

dmesg | grep amr
amr0: LSILogic MegaRAID 1.53 mem 0xfbef-0xfbef,0xfe58-0xfe5f irq 27 at device 0.0 on pci4
amr0: [ITHREAD]
amr0: delete logical drives supported by controller
amr0: LSILogic Intel(R) RAID Controller SRCU42X Firmware 414I, BIOS A100, 128MB RAM
amr0: delete logical drives supported by controller
amrd0: LSILogic MegaRAID logical drive on amr0
amrd0: 34857MB (71387136 sectors) RAID 0 (optimal)
amrd1: LSILogic MegaRAID logical drive on amr0
amrd1: 280024MB (573489152 sectors) RAID 5 (optimal)

and a recent 7.2 (same host):

amr0: LSILogic MegaRAID 1.53 mem 0xfbef-0xfbef,0xfe58-0xfe5f irq 27 at device 0.0 on pci4
amr0: [ITHREAD]
amr0: delete logical drives supported by controller
amr0: LSILogic Intel(R) RAID Controller SRCU42X Firmware 414I, BIOS A100, 128MB RAM
amr0: adapter is busy
amr0: adapter is busy
amr0: delete logical drives supported by controller
(probe0:amr0:0:6:0): TEST UNIT READY. CDB: 0 0 0 0 0 0
(probe0:amr0:0:6:0): CAM Status: SCSI Status Error
(probe0:amr0:0:6:0): SCSI Status: Check Condition
(probe0:amr0:0:6:0): ILLEGAL REQUEST asc:24,0
(probe0:amr0:0:6:0): Invalid field in CDB
(probe0:amr0:0:6:0): Unretryable error

btw, since I also have similar problems with another kind of raid card (iir), I suspect some related changes are the cause.

danny
Re: amr driver broken since March 12
Danny Braniss wrote:

at least for me :-) [and sorry for the cross posting] old (March 12 , i know need the svn rev number but...)

None of the commit activity on March 12 is jumping out at me as being suspicious. However, you are now the second person who has told me about AMR problems in 7.1 recently. If you have a precise svn change number, it would help greatly.

Scott

my bad. the last working amr/iir is from March 12. I first detected the problem sometime later, but not later than March 23. So it has to be changes in that time frame.

both drivers are showing similar symptoms: waiting for not busy. the iir goes on for ever, and it's the cam that eventually panics:

run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config

(actually not 100% true, depending if WITNESS is on or off, it sometimes just hangs). the amr seems to time out:

amr0: adapter is busy

thanks for looking into the problem,
danny
Re: dump | restore fails: unknown tape header type 1853384566
Daniel O'Connor wrote: On Tuesday 24 March 2009 11:55:07 Mikhail T. wrote:

I'm trying to migrate a filesystem from one disk to another using:

dump a0hCf 0 32 - /old | restore -rf -

(/old is already mounted read-only). The process runs for a while and then stops with:

[...]
DUMP: 22.85% done, finished in 3:57 at Tue Mar 24 01:03:21 2009
DUMP: 24.66% done, finished in 3:50 at Tue Mar 24 01:00:58 2009
DUMP: 26.44% done, finished in 3:43 at Tue Mar 24 00:59:14 2009
unknown tape header type 1853384566
abort? [yn]

Any idea what's going on? Why can't FreeBSD's restore read FreeBSD's dump's output?

What happens if you don't use the cache?

No big difference:

dump a0f - /old | restore -rf -
[...]
DUMP: 17.25% done, finished in 3:27 at Tue Mar 24 05:42:00 2009
DUMP: 20.36% done, finished in 3:09 at Tue Mar 24 05:28:13 2009
DUMP: 23.83% done, finished in 2:50 at Tue Mar 24 05:14:32 2009
unknown tape header type -621260722
abort? [yn]

Looks like a junk value somewhere... Uninitialized variable or some such.

can you try splitting it in 2, ie no pipe?

dump a0f some.file /old
(or dump 0f - /old | gzip -c > file.dump.gz)
restore rf some.file

danny
Intel Integrated RAID iir not working under 7.2
Hi,

After turning debugging on, it seems that the iir driver is losing an interrupt while probing:

...
gdt_next(0xc7666000)
gdt_mpr_test_busy(0xc7666000)
gdt_intr(0xc7666000)
gdt_mpr_get_status(0xc7666000)
gdt_mpr_intr(0xc7666000)
gdt_free_ccb(0xc7666000, 0xc767e444)
gdt_sync_event(0xc7666000, 3, 5, 0xc767e444)
gdt_next(0xc7666000)
gdt_mpr_test_busy(0xc7666000)
run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config
panic: run_interrupt_driven_config_hooks: waited too long

btw, older (7.0/7.1) still works. any ideas?

thanks,
danny
7.2 and iir stuck on boot.
Hi,

after upgrading to 7.2, booting the kernel gets stuck with:

run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config

from a 7.1 dmesg, it seems that it's in the iir driver.

danny
7.2-PRERELEASE/sunx2200/bge/msi broken
Hi,

between March 16 and now, bge on a Sun X2200 stopped working; turning off msi (via hw.pci.enable_msi=0) got it working again. I first tried replacing bge with an older version, but that did not help.

please advise :-)

Danny
Re: 7.2-PRERELEASE/sunx2200/bge/msi broken
On Sun, Mar 22, 2009 at 03:10:07PM +0200, Mikolaj Golub wrote: On Sun, 22 Mar 2009 12:55:02 +0200 Danny Braniss wrote:

DB Hi,
DB between March 16 and now, bge on a Sun X2200 stopped working,
DB turning off msi (via hw.pci.enable_msi=0) got it working again.
DB I tried first replacing bge with an older version but that did not help.

It looks to be related to this report: http://www.freebsd.org/cgi/getmsg.cgi?fetch=1253844+1263253+/usr/local/www/db/text/2009/freebsd-bugs/20090322.freebsd-bugs

Could you please give the following patch a try? http://people.freebsd.org/~marius/bge_intx.diff

Marius

it works!

thanks,
danny
Re: Tester wanted for multipath failover iSCSI target software
Now istgt is a part of ports. (net/istgt) FreeBSD issue is solved by danny's patch. After applying the patch, iscontrol can connect to istgt.

I am interested in giving this a try, though not immediately as I am away from the office at the moment. Do I need to apply a patch to iscontrol to make it work though ? I can't work it out from your statement above.

english version: (ungoogled :-)

the latest is in: http://www.cs.huji.ac.il/~danny/ftp/freebsd/iscsi-2.1.1.tar.gz

and if you already have 2.1, apply:

--- iscsi.c.orig2008-09-21 10:01:50.0 +0300
+++ iscsi.c 2009-03-11 13:29:04.250472000 +0200
@@ -62,7 +62,7 @@
 #include <dev/iscsi/initiator/iscsi.h>
 #include <dev/iscsi/initiator/iscsivar.h>
-static char *iscsi_driver_version = "2.1.0";
+static char *iscsi_driver_version = "2.1.1";
 static struct isc_softc isc;
--- isc_sm.c.orig 2008-07-19 14:04:23.0 +0300
+++ isc_sm.c2009-03-11 13:30:20.672791000 +0200
@@ -508,7 +508,7 @@
 sn->cmd++;
 case ISCSI_WRITE_DATA:
- bhs->ExpStSN = htonl(sn->stat);
+ bhs->ExpStSN = htonl(sn->stat + 1);
 break;
 default:

danny
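The functional change in the patch is the `+ 1` inside `htonl()`. As an illustration only, the sketch below encodes an expected status sequence number as a 4-byte big-endian field. It assumes that `sn->stat` in the driver holds the last StatSN received from the target, which is an inference from the patch rather than something stated in the thread; the function name is made up for this example.

```python
import struct


def exp_stat_sn_field(last_stat_sn):
    """Encode ExpStatSN as 4 bytes in network (big-endian) byte order.

    Assumption: last_stat_sn is the most recent StatSN received from the
    target, so the next one expected is last_stat_sn + 1, wrapping at 2**32.
    """
    expected = (last_stat_sn + 1) & 0xFFFFFFFF
    # "!I" is an unsigned 32-bit integer in network order, the same layout
    # htonl() produces when the result is stored into the header.
    return struct.pack("!I", expected)


if __name__ == "__main__":
    print(exp_stat_sn_field(5).hex())           # bytes for the value 6
    print(exp_stat_sn_field(0xFFFFFFFF).hex())  # wraps around to 0
```

Under that assumption, sending the unincremented value would tell the target that the last status had not been received yet, which a stricter target could reasonably object to.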
Re: FBSD 7.1 XEON Quad Core
On 12.02.2009 at 3:12, SDH Support wrote:

Yes, if you have (or plan to have) more than 3 GB of memory. FYI I have had a lot of problems with FBSD7.x and HP DL-series hardware + amd64. There is a bug IIRC in the loader.

I had problems with DL3X0G5 when booting with PXE; every now and then the loader would hang when trying to load the kernel from an i386 8-CURRENT NFS server. As soon as I had installed everything and was booting from the disks, the problem did not show up anymore. I also used amd64. Do you use PXE boot?

pxeboot is problematic on some platforms, try an older version. btw, if you succeed let me know.

thanks
danny
Re: impossible packet length ...
I'm reposting this to hackers, and there is some more info.

Hi, on 2 different servers, running 7.1-stable + zfs, I get this error rather frequently:

Feb 5 17:01:03 warhol-00 kernel: impossible packet length (543383918) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (1936028704) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (1869363744) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (1667787057) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (976040755) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (1953459488) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (1348825156) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (0) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (1647208041) from nfs server sunfire:/dist

in this case the server is running FreeBSD-7.0-stable, but I also get it when the server is a netapp. is there a connection?

thanks, danny

going through the logs, after it happened again, I got a glimpse of this:

Feb 6 18:00:13 warhol-00.cs.huji.ac.il kernel: bce0: discard frame w/o leading ethernet header (len 0 pkt len 0)
Feb 6 18:00:19 klee-05.cs.huji.ac.il kernel: nfs: server warhol-00 not responding, timed out
...
Feb 6 19:00:00 warhol-00.cs.huji.ac.il amd[715]: More than a single value for /defaults in hesiod.local
Feb 6 19:00:00 warhol-00.cs.huji.ac.il amd[715]: Unknown $ sequence in rhost:=${RHOST};type:=nfsl;fs:=${FS};rfs:=$huldigC0#^ZM-^KoM- abase
Feb 6 19:00:00 warhol-00.cs.huji.ac.il kernel: impossible packet length (2068989523) from nfs server sunfire:/dist

which seems to point fingers at bce...

danny
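Editor's note on the numbers in these log lines: read as big-endian 32-bit words, several of them are plain ASCII. 543383918 is 0x2063616e (" can") and 1936028704 is 0x73657420 ("set "). That would be consistent with the client taking a length word from the middle of payload data rather than from a real RPC record mark, though the log alone does not prove it. The helper below is a hypothetical sketch for checking other values by hand; it is not taken from the NFS client.

```c
#include <stdint.h>

/*
 * Hypothetical helper: render the four bytes of a 32-bit word, most
 * significant byte first, as characters. Unprintable bytes become '.'.
 * out must have room for 5 bytes (4 characters plus the terminator).
 */
void
length_as_text(uint32_t word, char out[5])
{
	for (int i = 0; i < 4; i++) {
		unsigned char c = (unsigned char)(word >> (24 - 8 * i));
		out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
	}
	out[4] = '\0';
}
```

Feeding it the logged values one by one shows which of them look like text and which (such as 0) do not.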
Re: impossible packet length ...
On Sun, 8 Feb 2009, Peter Jeremy wrote:

On 2009-Feb-08 11:31:45 +0200, Danny Braniss da...@cs.huji.ac.il wrote:

Q: with rxcsum on, and a bad checksum packet is received, is it dropped by the NIC? if not, then it somewhat explains the behaviour

If checksum offloading is working correctly then a bad packet should be dropped by the NIC. If checksum offloading isn't working correctly then you can wind up in the situation where both the NIC and the driver think the other party has verified the checksum. It's also possible that you may be running into corruption during DMA transfer from the NIC to RAM. ISTR there have been some issues reported recently with checksum offloading on some NICs - though I don't have details to hand - you might like to search the lists.

changing the nic is tough, but if needed will be done.

If disabling checksum offloading fixes the problem and the additional CPU load is acceptable (at least until you find a real fix) then there's no need to change NICs.

Actually, my understanding was that packets with bad checksums are delivered to software, and flags in the descriptor ring header for each packet tell us whether the checksum was (a) checked and (b) validated by the hardware. We then propagate these to mbuf flags so that higher stack layers know whether or not to calculate the checksum themselves. Regardless of the specifics, though, packets with checked but bad checksums shouldn't make it to the socket layer where they would be visible to NFS. If the NIC is marking apparently bad packets as good, there are a number of possible sources -- be it bad checksum handling in the card, or corruption between the card and higher levels of the stack (a DMA problem, as you point out, would have this symptom).

looking at the bce source, it's not clear (to me :-). If errors are detected in bce_rx_intr(), the packet gets dropped, which I would expect to be the treatment of an offloaded checksum error, but it seems that is not the case.

danny
Re: impossible packet length ...
On Sun, 8 Feb 2009, Danny Braniss wrote:

looking at the bce source, it's not clear (to me :-). If errors are detected in bce_rx_intr(), the packet gets dropped, which I would expect to be the treatment of an offloaded checksum error, but it seems that is not the case.

I think we're thinking of different checksums -- devices/device drivers drop frames with bad ethernet checksums, but not IP and above layer checksums.

I know I'm stepping on thin ice here - haven't touched Stevens for a while (and I doubt it mentions offloading) - but if the offload checksum is bad, why not just drop the packet? The way I read the driver, if the offload checksum is on, and if no errors were detected, then it's marked as ok.

danny
Re: impossible packet length ...
On Sun, 8 Feb 2009, Danny Braniss wrote:

On Sun, 8 Feb 2009, Danny Braniss wrote:

looking at the bce source, it's not clear (to me :-). If errors are detected in bce_rx_intr(), the packet gets dropped, which I would expect to be the treatment of an offloaded checksum error, but it seems that is not the case.

I think we're thinking of different checksums -- devices/device drivers drop frames with bad ethernet checksums, but not IP and above layer checksums.

I know I'm stepping on thin ice here - haven't touched Stevens for a while (and I doubt it mentions offloading) - but if the offload checksum is bad, why not just drop the packet? The way I read the driver, if the offload checksum is on, and if no errors were detected, then it's marked as ok.

There are a few good reasons I can think of, but this is hardly a comprehensive list:

(1) If there are bad higher level checksums on the wire, you want to see them in tcpdump, so allow them to get up to a higher layer if network layer checksums aren't good.
(2) It's a matter of local policy as to whether UDP checksums (for example) are observed or not.
(3) If you're forwarding or bridging packets, it should be up to the end nodes how they deal with bad UDP checksums on packets to them, not the routers.

ok, I can understand the logic.

Looking at if_bce.c, the following seems to be reasonable logic; first, ethernet-layer checksums:

5902 /* Check the received frame for errors. */
5903 if (status & (L2_FHDR_ERRORS_BAD_CRC |
5904     L2_FHDR_ERRORS_PHY_DECODE | L2_FHDR_ERRORS_ALIGNMENT |
5905     L2_FHDR_ERRORS_TOO_SHORT | L2_FHDR_ERRORS_GIANT_FRAME)) {
5906
5907         /* Log the error and release the mbuf. */
5908         ifp->if_ierrors++;
5909         DBRUN(sc->l2fhdr_status_errors++);
5910
5911         m_freem(m0);
5912         m0 = NULL;
5913         goto bce_rx_int_next_rx;
5914 }

I.e., if there are ethernet-level CRC failures, drop the packet.

5922 /* Validate the checksum if offload enabled. */
5923 if (ifp->if_capenable & IFCAP_RXCSUM) {
5924
5925         /* Check for an IP datagram. */
5926         if (!(status & L2_FHDR_STATUS_SPLIT) &&
5927             (status & L2_FHDR_STATUS_IP_DATAGRAM)) {
5928                 m0->m_pkthdr.csum_flags |= CSUM_IP_CHECKED;
5929
5930                 /* Check if the IP checksum is valid. */
5931                 if ((l2fhdr->l2_fhdr_ip_xsum ^ 0xffff) == 0)
5932                         m0->m_pkthdr.csum_flags |= CSUM_IP_VALID;
5933         }
5934
5935         /* Check for a valid TCP/UDP frame. */
5936         if (status & (L2_FHDR_STATUS_TCP_SEGMENT |
5937             L2_FHDR_STATUS_UDP_DATAGRAM)) {
5938
5939                 /* Check for a good TCP/UDP checksum. */
5940                 if ((status & (L2_FHDR_ERRORS_TCP_XSUM |
5941                     L2_FHDR_ERRORS_UDP_XSUM)) == 0) {
5942                         m0->m_pkthdr.csum_data =
5943                             l2fhdr->l2_fhdr_tcp_udp_xsum;
5944                         m0->m_pkthdr.csum_flags |= (CSUM_DATA_VALID
5945                             | CSUM_PSEUDO_HDR);
5946                 }
5947         }
5948 }

Only look at higher level checksums if policy enables it on the interface; then, only if the hardware has a view on the IP-layer checksums, propagate that information to the mbuf flags from the descriptor ring entry flags, both whether or not the checksum was verified, and whether or not it was good. If policy disables it, or the hardware expresses no view, we don't set flags, which simply defers checksumming to a higher layer (if required -- for forwarded packets, we won't test UDP-layer checksums at all).

I missed line 5928, and as usual, your explanation is most educational! The comment in line 5939 is a bit misleading; the way I read the code, it does not check for a good checksum.

Cheers,
danny
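Editor's sketch of the flag propagation described above, as a stand-alone function that can be exercised outside the kernel. All bit values and names here (ST_*, ERR_*, F_*, rx_csum_flags) are illustrative stand-ins; the real constants live in the bce register header and sys/mbuf.h. The shape of the logic follows the quoted if_bce.c excerpt: a bad TCP/UDP checksum does not drop the frame, it just leaves the "valid" flags unset so that the stack checks in software.

```c
#include <stdint.h>

/* Illustrative status and error bits (assumed values, not the driver's). */
enum {
	ST_IP_DATAGRAM  = 1 << 0,
	ST_SPLIT        = 1 << 1,
	ST_TCP_SEGMENT  = 1 << 2,
	ST_UDP_DATAGRAM = 1 << 3,
	ERR_TCP_XSUM    = 1 << 4,
	ERR_UDP_XSUM    = 1 << 5
};

/* Illustrative mbuf checksum flags (assumed values). */
enum {
	F_IP_CHECKED = 1 << 0,
	F_IP_VALID   = 1 << 1,
	F_DATA_VALID = 1 << 2,
	F_PSEUDO_HDR = 1 << 3
};

/* Return the checksum flags the driver would attach to a received frame. */
uint32_t
rx_csum_flags(int rxcsum_enabled, uint32_t status, uint16_t ip_xsum)
{
	uint32_t flags = 0;

	if (!rxcsum_enabled)
		return 0;	/* policy off: leave everything to software */

	if (!(status & ST_SPLIT) && (status & ST_IP_DATAGRAM)) {
		flags |= F_IP_CHECKED;
		if ((ip_xsum ^ 0xffff) == 0)
			flags |= F_IP_VALID;
	}

	if (status & (ST_TCP_SEGMENT | ST_UDP_DATAGRAM)) {
		/* No error bit set means the hardware found the checksum good. */
		if ((status & (ERR_TCP_XSUM | ERR_UDP_XSUM)) == 0)
			flags |= F_DATA_VALID | F_PSEUDO_HDR;
	}
	return flags;
}
```

A UDP datagram with ERR_UDP_XSUM set comes back without F_DATA_VALID, which is the case danny was asking about: the frame is passed up unmarked rather than dropped.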
impossible packet length ...
Hi, on 2 different servers, running 7.1-stable + zfs, I get this error rather frequently:

Feb 5 17:01:03 warhol-00 kernel: impossible packet length (543383918) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (1936028704) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (1869363744) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (1667787057) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (976040755) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (1953459488) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (1348825156) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (0) from nfs server sunfire:/dist
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (1647208041) from nfs server sunfire:/dist

in this case the server is running FreeBSD-7.0-stable, but I also get it when the server is a netapp. is there a connection?

thanks, danny
Re: impossible packet length ...
On 2009-Feb-06 08:32:27 +0200, Danny Braniss da...@cs.huji.ac.il wrote:

on 2 different servers, running 7.1-stable + zfs, I get this error rather frequently:
Feb 5 17:01:03 warhol-00 kernel: impossible packet length (543383918) from nfs server sunfire:/dist

So many questions :-)

I gather warhol-00 is running 7.1-S+ZFS. How recent a 'stable' is it?

very: FreeBSD warhol-00 7.1-STABLE FreeBSD 7.1-STABLE #37: Fri Jan 23 10:41:54 IST 2009
and it's amd64.

Where does ZFS fit in? Is the sunfire:/dist mountpoint in a local ZFS, or is a local ZFS mountpoint inside the sunfire:/dist mount?

warhol is an nfs server, the storage is a local ZFS raid, and the errors occur on an nfs/tcp mounted file system on warhol.

Do you get the same problems without any ZFS mounts?

I have several hosts running 7.1-stable without nfs-exported ZFS; none have this error. That is why I think there is a connection, because on the two which have ZFS exported the problem appears.

Is this a TCP or UDP NFS mount? What happens if you switch protocols?

i'll try, but it's not trivial. the other difference between the boxes is that one is dataless, while the other is stand-alone (well, / is on a local disk, but /usr/local and home dirs are on the network/nfs).

What NIC are you using and are you seeing any network errors?

bce, the boxes are Dell-2950, but no visible errors.

Are you able to capture a protocol trace showing the transaction including the erroneous packet?

I have started the capture, but since I don't know what triggers the problem, it will take some time. I will also start capturing packets at the router level, but that will have to wait till next week.

thanks, danny

--
Peter Jeremy
Re: Unhappy Xorg upgrade
As a general note, this is the second time in a row that an X.org upgrade broke X for a significant number of people. IMO, this suggests that our approach to X.org upgrades needs significant changes (see below). X11 is a critical component for anyone who is using FreeBSD as a desktop, and having upgrades fail or come with significant POLA violations and regressions for significant numbers of people is not acceptable.

you took the words out of my mouth! Some days ago, I compiled wine from ports; among its dependencies were cups (why in the name of G_D?) and x11-xcb (which did not ring any special bells - stupidly I thought it meant some x11 cut buffer gizmo :-) Anyways, next day, I couldn't open windows (x11, not MS) from some hosts; some debugging later, it was xauth failing. Now xcb did ring bells! A year ago we found a bug in libxcb, where the treatment of xauth was broken. We sent a patch, but it is still waiting. BTW, I opened a PR, http://www.freebsd.org/cgi/query-pr.cgi?pr=131120, where it's now going the way of the salmon, upstream, waiting for some kind soul to apply it.

On 2009-Jan-29 08:40:11 -0500, Robert Noland rnol...@freebsd.org wrote:

I've had patches available for probably a couple of months now posted to freebsd-...@. For the few people who tested it, I had no real issues reported.

I didn't recall seeing any reference to patches so I went looking. All I could find is a couple of references to a patchset existing, buried inside threads discussing specific problems with X. The majority of people who didn't have those specific problems probably skipped the thread and never saw that a patchset was available. When the X.org 7.0 upgrade was planned, a heads-up went out on a number of mailing lists, together with a pointer to the patchset and upgrade instructions, and the upgrade did not proceed until both a reasonable number of people reported success and reported problems had been ironed out.

Given the ongoing problems with code provided by X.org, I suggest that this approach needs to be followed for every future release of X.org until (if) the X.org Project demonstrates that they can provide release-quality code.

This update also brings in support for a lot of people who are running newer hardware.

And breaks support for lots of people who used to have functional X servers.

merging /usr/X11R6 into /usr/local was a bad idea!

cheers, danny

--
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement an MTA that is either RFC2821-compliant or matches their claimed behaviour.
Re: more marvell marvels
On Fri, Jan 09, 2009 at 01:48:24PM +0200, Danny Braniss wrote:

hi, the mb is asus P5K-VM, the onboard nic is, according to pciconf:

ms...@pci0:1:0:0: class=0x02 card=0x81f81043 chip=0x436411ab rev=0x12 hdr=0x00
    vendor = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
    device = '88E8056 Yukon PCI-E Gigabit Ethernet Controller'
    class = network
    subclass = ethernet
    cap 01[48] = powerspec 3 supports D0 D1 D2 D3 current D0
    cap 03[50] = VPD
    cap 05[5c] = MSI supports 1 message, 64 bit
    cap 10[e0] = PCI-Express 1 legacy endpoint

nothing new here, problems have been reported before, but: my very first attempt - after a very long time - of booting 7.1-stable produced a panic because msk could not find its physio; by the time i had the serial console attached and working, that problem disappeared :-( now, after reboot, it sometimes hangs - because the net is not working - and only if I unplug the ethernet (no signs of the driver seeing this) and replug do things begin to work. btw, i had to set hw.msk.legacy_intr=1 to get things working. any patches for 7.1-stable to test?

If memory serves me right you have Yukon EC Ultra with 88E1149 PHY, right? CURRENT has some stability fixes but the source won't compile on stable/7 yet due to KPI differences. I plan to add some features next week which will make it possible to use the HEAD version on stable/7. I'm not sure the patch for 88E8040 can be applied to stable/7, but the patch has some fixes for link state handling. Would you give it a try?

http://people.freebsd.org/~yongari/msk/msk.88E8040.patch14

Note, the 88E8040 patch is not complete yet and may cause other problems too.

tried to apply patches, but the if_mskreg.h patches failed, and hand stitching didn't help (I have 7.1-Stable)

danny

--
Regards,
Pyun YongHyeon
more marvell marvels
hi, the mb is asus P5K-VM, the onboard nic is, according to pciconf:

ms...@pci0:1:0:0: class=0x02 card=0x81f81043 chip=0x436411ab rev=0x12 hdr=0x00
    vendor = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
    device = '88E8056 Yukon PCI-E Gigabit Ethernet Controller'
    class = network
    subclass = ethernet
    cap 01[48] = powerspec 3 supports D0 D1 D2 D3 current D0
    cap 03[50] = VPD
    cap 05[5c] = MSI supports 1 message, 64 bit
    cap 10[e0] = PCI-Express 1 legacy endpoint

nothing new here, problems have been reported before, but: my very first attempt - after a very long time - of booting 7.1-stable produced a panic because msk could not find its physio; by the time i had the serial console attached and working, that problem disappeared :-( now, after reboot, it sometimes hangs - because the net is not working - and only if I unplug the ethernet (no signs of the driver seeing this) and replug do things begin to work. btw, i had to set hw.msk.legacy_intr=1 to get things working. any patches for 7.1-stable to test?

danny
Re: more marvell marvels
On Fri, Jan 09, 2009 at 01:48:24PM +0200, Danny Braniss wrote:

hi, the mb is asus P5K-VM, the onboard nic is, according to pciconf:

ms...@pci0:1:0:0: class=0x02 card=0x81f81043 chip=0x436411ab rev=0x12 hdr=0x00
    vendor = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
    device = '88E8056 Yukon PCI-E Gigabit Ethernet Controller'
    class = network
    subclass = ethernet
    cap 01[48] = powerspec 3 supports D0 D1 D2 D3 current D0
    cap 03[50] = VPD
    cap 05[5c] = MSI supports 1 message, 64 bit
    cap 10[e0] = PCI-Express 1 legacy endpoint

nothing new here, problems have been reported before, but: my very first attempt - after a very long time - of booting 7.1-stable produced a panic because msk could not find its physio; by the time i had the serial console attached and working, that problem disappeared :-( now, after reboot, it sometimes hangs - because the net is not working - and only if I unplug the ethernet (no signs of the driver seeing this) and replug do things begin to work. btw, i had to set hw.msk.legacy_intr=1 to get things working. any patches for 7.1-stable to test?

If memory serves me right you have Yukon EC Ultra with 88E1149 PHY, right?

e1000phy0: Marvell 88E1149 Gigabit PHY PHY 0 on miibus0
e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
mskc0: [ITHREAD]

CURRENT has some stability fixes but the source won't compile on stable/7 yet due to KPI differences. I plan to add some features next week which will make it possible to use the HEAD version on stable/7. I'm not sure the patch for 88E8040 can be applied to stable/7, but the patch has some fixes for link state handling. Would you give it a try?

http://people.freebsd.org/~yongari/msk/msk.88E8040.patch14

Note, the 88E8040 patch is not complete yet and may cause other problems too.

I'll try asap

thanks, danny

--
Regards,
Pyun YongHyeon
Re: newfs(8) parameters from dumpfs -m have bad -s value?
On Mon, Jan 05, 2009 at 08:23:53PM +0100, Oliver Fromme wrote:

This seems to be a bug in dumpfs(8). It simply prints the value of the fs_size field of the superblock, which is wrong. The -s option of newfs(8) expects the available size in sectors (i.e. 512 bytes), but the fs_size field contains the size of the file system in 2KB units. This seems to be the fragment size, but I'm not sure if this is just

This *is* the fragment size. UFS/FFS uses the plain term "block" to mean the fragment size. All blocks are indexed with this number, unlike "block size", which is almost always 8 fragments (blocks). Confusing.

So, dumpfs(8) needs to be fixed to perform the proper calculations when printing the value for the -s option. Unfortunately I'm not sufficiently much of a UFS guru to offer a fix. My best guess would be to multiply the fs_size value by the fragment size (measured in 512 byte units), i.e. multiply by 4 in the most common case. But I'm afraid the real solution is not that simple.

The sector size and filesystem size parameters in newfs are remnants. Everything is converted to number of media sectors (sector size as specified by the device). So one could assume for dumpfs to always use 512, since it's rarely different, and multiply fs_size by fs_fsize and divide by 512, and then output -S 512.

don't assume 512; in the iscsi world I have seen all kinds of sector sizes, making it a PITA to get things right.

Better yet would be to add a parameter (-z perhaps) to newfs(8) to accept number of bytes instead of multiples of sectorsize. I would be willing to write up patches for dumpfs and newfs to both add the raw byte size and the 512-byte sector size handling to correct said mistake, unless someone else would rather.

--
Rick C. Petty
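Editor's sketch of the conversion discussed in this thread: fs_size counts fragments, newfs -s wants media sectors, so the value has to be scaled by fragment size over sector size. The function name and the "return 0 on a non-integral result" convention are hypothetical; neither dumpfs nor newfs is claimed to contain this code, and overflow of the byte count is not handled.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a file system size in fragments to a
 * size in sectors. Returns 0 if sector_size is 0 or the byte count is
 * not a whole number of sectors.
 */
uint64_t
newfs_size_in_sectors(uint64_t fs_size_frags, uint32_t frag_size,
    uint32_t sector_size)
{
	uint64_t bytes;

	if (sector_size == 0)
		return 0;
	bytes = fs_size_frags * frag_size;
	if (bytes % sector_size != 0)
		return 0;
	return bytes / sector_size;
}
```

With 2048-byte fragments and 512-byte sectors this is the "multiply by 4" case; passing the device's real sector size instead of a fixed 512 covers danny's iSCSI objection.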
Re: amd(8) cores dump when load high
On Sat, Dec 27, 2008 at 01:03:54PM +0200, Danny Braniss wrote:

well, I'm running 7.1-PRERELEASE, what do the amd logs show?
..
Dec 27 10:37:01 sf-02 amd[856]: am-utils version 6.1.5 (build 1).
Dec 27 10:37:01 sf-02 amd[856]: Report bugs to https://bugzilla.am-utils.org/ or am-ut...@am-utils.org.
Dec 27 10:37:01 sf-02 amd[856]: Configured by da...@sunfire on date Sun Jun 29 16:59:06 IDT 2008.
Dec 27 10:37:01 sf-02 amd[856]: Built by da...@sunfire on date Sun Jun 29 17:02:07 IDT 2008.
Dec 27 10:37:01 sf-02 amd[856]: cpu=x86_64 (little-endian), arch=amd64, karch=amd64.
Dec 27 10:37:01 sf-02 amd[856]: full_os=freebsd7.0, os=freebsd7, osver=7.0, vendor=unknown, distro=none.

How did you get this output from /usr/sbin/amd? It should be:

# amq -v
Copyright (c) 1997-2006 Erez Zadok
Copyright (c) 1990 Jan-Simon Pendry
Copyright (c) 1990 Imperial College of Science, Technology & Medicine
Copyright (c) 1990 The Regents of the University of California.
am-utils version 6.1.5 (build 800059).
Report bugs to https://bugzilla.am-utils.org/ or am-ut...@am-utils.org.
Configured by David O'Brien obr...@freebsd.org on date 4-December-2007 PST.
Built by r...@quynh.nuxi.org on date Fri Dec 19 15:29:18 PST 2008.
cpu=amd64 (little-endian), arch=amd64, karch=amd64.
full_os=freebsd8.0, os=freebsd8, osver=8.0, vendor=undermydesk, distro=The FreeBSD Project.

So many of your fields aren't what I expect: build#, configured by, configured date, cpu, vendor, nor distro.

--
-- David (obr...@freebsd.org)

as explained in a later message, I rolled my own :-) with the unofficial patches.

danny
Re: amd(8) cores dump when load high
Yes, we found that it crashes when swap is used.

on an amd64 architecture, amd could not plock its pages in memory, and would, under memory pressure, be swapped out and break. I just ran some tests under 7.1-PRERELEASE, and
- it seems that plock is working.
- amd is not being swapped out.
are you running with amd -S ?

danny

On Fri, Dec 26, 2008 at 6:02 PM, Rong-en Fan gra...@gmail.com wrote:

On Tue, Dec 23, 2008 at 12:44 AM, Lin Jui-Nan Eric eric...@tamama.org wrote:

Dear listers, We recently found that amd frequently dumps core while load is high (about 4~5) after we upgraded world and kernel from 7.0-RELEASE to 7.1-PRERELEASE. I have read -stable and the svn log of 7-STABLE, but could not find a report or a solution. Did anyone have the same issue? Thank you very much.

According to my previous experience, amd 6.1.5 crashes under low memory situations. Not necessarily high load.

Regards, Rong-En Fan
Re: amd(8) cores dump when load high
No, we are not running amd with -S.

# ps auxww | grep amd
root 706 0.0 0.1 7660 5416 ?? Ss Wed05PM 4:48.12 /usr/sbin/amd -p -k amd64 -x all /net amd.map

well, I'm running 7.1-PRERELEASE, what do the amd logs show?

Dec 27 10:37:01 sf-02 amd[856]: AM-UTILS VERSION INFORMATION:
Dec 27 10:37:01 sf-02 amd[856]: Copyright (c) 1997-2006 Erez Zadok
Dec 27 10:37:01 sf-02 amd[856]: Copyright (c) 1990 Jan-Simon Pendry
Dec 27 10:37:01 sf-02 amd[856]: Copyright (c) 1990 Imperial College of Science, Technology & Medicine
Dec 27 10:37:01 sf-02 amd[856]: Copyright (c) 1990 The Regents of the University of California.
Dec 27 10:37:01 sf-02 amd[856]: am-utils version 6.1.5 (build 1).
Dec 27 10:37:01 sf-02 amd[856]: Report bugs to https://bugzilla.am-utils.org/ or am-ut...@am-utils.org.
Dec 27 10:37:01 sf-02 amd[856]: Configured by da...@sunfire on date Sun Jun 29 16:59:06 IDT 2008.
Dec 27 10:37:01 sf-02 amd[856]: Built by da...@sunfire on date Sun Jun 29 17:02:07 IDT 2008.
Dec 27 10:37:01 sf-02 amd[856]: cpu=x86_64 (little-endian), arch=amd64, karch=amd64.
Dec 27 10:37:01 sf-02 amd[856]: full_os=freebsd7.0, os=freebsd7, osver=7.0, vendor=unknown, distro=none.
Dec 27 10:37:01 sf-02 amd[856]: domain=unknown.domain, host=sf-02, hostd=sf-02.unknown.domain.
Dec 27 10:37:01 sf-02 amd[856]: Map support for: root, passwd, hesiod, union, nis, ndbm, file, exec, error.
Dec 27 10:37:01 sf-02 amd[856]: AMFS: nfs, link, nfsx, nfsl, host, linkx, program, union, ufs, cdfs,
Dec 27 10:37:01 sf-02 amd[856]: pcfs, auto, direct, toplvl, error, inherit.
Dec 27 10:37:01 sf-02 amd[856]: FS: cd9660, nfs, nfs3, nullfs, msdosfs, ufs, unionfs.
Dec 27 10:37:01 sf-02 amd[856]: Network: wire=132.65.16.0 (netnumber=132.65.16).
Dec 27 10:37:01 sf-02 amd[856]: My ip addr is 127.0.0.1
Dec 27 10:37:01 sf-02 amd[857]: released controlling tty using setsid()
Dec 27 10:37:01 sf-02 amd[857]: Locked process pages in memory **
Re: amd(8) cores dump when load high
I got "Couldn't lock process pages in memory using mlockall()" too: [...]

On Sat, Dec 27, 2008 at 9:07 PM, Rong-en Fan gra...@gmail.com wrote:

On Sat, Dec 27, 2008 at 7:03 PM, Danny Braniss da...@cs.huji.ac.il wrote:

No, we are not running amd with -S.

# ps auxww | grep amd
root 706 0.0 0.1 7660 5416 ?? Ss Wed05PM 4:48.12 /usr/sbin/amd -p -k amd64 -x all /net amd.map

well, I'm running 7.1-PRERELEASE, what do the amd logs show?
[...]
Dec 27 10:37:01 sf-02 amd[857]: Locked process pages in memory **

Hmm.. interesting, I got this

Dec 26 15:32:11 bsd2 amd[39723]: Couldn't lock process pages in memory using mlockall(): Resource temporarily unavailable

w/ 7-STABLE around Sep 4. I don't put plock = no in amd.conf, so by default it's plock'ed.

Regards, Rong-En Fan

some more ingredients: when running vanilla amd it also fails to lock pages:

Couldn't lock process pages in memory using mlockall(): Resource temporarily unavailable

while the amd I'm running, which includes the latest - non official - patches, works fine. but, the main diff I see is:

opteron ldd /usr/sbin/amd
/usr/sbin/amd:
    libc.so.7 => /lib/libc.so.7 (0x80065a000)

while

opteron ldd /SBIN/amd
/SBIN/amd:
    librt.so.1 => /usr/lib/librt.so.1 (0x800658000)
    librpcsvc.so.4 => /usr/lib/librpcsvc.so.4 (0x80075d000)
    libwrap.so.5 => /usr/lib/libwrap.so.5 (0x800866000)
    libc.so.7 => /lib/libc.so.7 (0x80096f000)

danny
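Editor's sketch for anyone who wants to check outside amd whether page locking works on a given box. "Resource temporarily unavailable" is the usual rendering of EAGAIN, so the helper below returns the errno from mlockall() so that case can be told apart from a permission problem. The function name try_lock_pages is hypothetical, and this is not how amd itself is written.

```c
#include <errno.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: try to lock all current and future pages.
 * Returns 0 on success, otherwise the errno set by mlockall().
 */
int
try_lock_pages(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) == 0) {
		/* Test only: a real daemon would keep its pages locked. */
		munlockall();
		return 0;
	}
	return errno;
}
```

Running it as the same user and under the same resource limits as amd is what makes the result comparable; the outcome will differ between root and unprivileged users.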
Re: RELENG_7_1: bce driver change generating too much interrupts ?
Hi, Nawfal,

Nawfal bin Mohmad Rouyan wrote:

I have been using a Dell machine with 2 bce interfaces as a bridge between my LAN and firewall to shape the traffic. Since the update, the machine can only run for a few minutes and after that no more connections can go through. Ping from LAN to Internet is OK, but when I telnet to, say, www.yahoo.com at port 80 and issue GET / HTTP/1.0, I can see data from different applications mixed in with the HTML text. For example, I can see uTorrent packets with binaries and also the HTML page being cut short. It's as if I'm seeing packets jumbled together from different applications. I'm using PF to shape the traffic. If I reboot the server, it will panic, and I have about 3 different vmcores in /var/crash and am not sure what to do with them :( . I've tested the patch to remove stat_IfInFramesL2FilterDiscards but the problem still occurs.

The last patch is not a functional change, but a behavior change that removes the L2FilterDiscards from being counted to match previous behavior. Would you please do this:

script bt.txt
kgdb /boot/kernel/kernel.symbols /var/crash/vmcore.0

Then, do 'bt', press enter until all display has finished, then exit kgdb, and send me the result (bt.txt)?

As for now, I'm not using the server to shape the traffic because I suspect the driver isn't reliable. I'm going to revert to the previous driver and hope it's going to work. Sorry if there is not much detail since I'm not sure what to provide. Just tell me what to provide and I'd be happy to do so.

I don't know if the following is related, but:
- while stress testing nfs/zfs, I get many weird things on the server (dell-2950/bce), for example:

impossible packet length (33555456) from nfs server fr-01:/vol/system/share
impossible packet length (1792323116) from nfs server fr-01:/vol/system/share
...

and things get worse soon after. Now, there are no input errors, so it seems some memory starvation is not properly handled ...

cheers, danny
more zfs/nfs panics
Hi,
I'm trying to tar a rather big directory via nfs (some 800gb). It has many subdirectories, some of them with many files (close to 10^6 :-). Just before the server panics, the tar (on the client) starts complaining about lost files, or permission denied, but not in the pathological directories.

    panic: kmem_malloc(-1661382656): kmem_map too small: 645009408 total allocated
    cpuid = 3
    KDB: enter: panic
    [thread pid 881 tid 100112 ]
    Stopped at kdb_enter_why+0x3d: movq $0,0x5ef3e8(%rip)
    db> tr
    Tracing pid 881 tid 100112 td 0xff0004ba2000
    kdb_enter_why() at kdb_enter_why+0x3d
    panic() at panic+0x17b
    kmem_malloc() at kmem_malloc+0x565
    uma_large_malloc() at uma_large_malloc+0x4a
    malloc() at malloc+0xd7
    nfsrv_readdir() at nfsrv_readdir+0x4e1
    nfssvc() at nfssvc+0x400
    syscall() at syscall+0x1bb
    Xfast_syscall() at Xfast_syscall+0xab
    --- syscall (155, FreeBSD ELF64, nfssvc), rip = 0x8006885cc, rsp = 0x7fffea28, rbp = 0 ---

I have increased

    vm.kmem_size_max=1024M
    vm.kmem_size=1024M
    vfs.zfs.arc_max=800M

but it just seems to delay the panic. It smells like some memory leak ...

The host is running amd64 quad core, 7.1-prerelease and 8GB.

danny
Re: bce reporting fantom input errors?
Danny Braniss wrote:

Hi,
After changing cables, switches and ports, I came to the conclusion that bce is reporting input errors that are not there, or creating them. I checked this with 3 different boxes, all Dell-2950/Broadcom NetXtreme II BCM5708 1000Base-T (B2), and one of them, while running Solaris, reported 0 errors after a week, and freebsd after a few minutes its count was 100. The errors appear under 7.-PRERELEASE, but not under 7.0
Anybody else seeing this?

Please apply this patch, it was committed as revision 186169 about 3 hours ago against -HEAD. I'll MFC it after 3 days.

Cheers,
--
Xin LI delp...@delphij.net http://www.delphij.net/
FreeBSD - The Power to Serve!

bce-noL2Filter.diff:

Index: if_bce.c
===================================================================
--- if_bce.c (revision 186076)
+++ if_bce.c (working copy)
@@ -7408,7 +7408,6 @@
     (u_long) sc->stat_IfInMBUFDiscards +
     (u_long) sc->stat_Dot3StatsAlignmentErrors +
     (u_long) sc->stat_Dot3StatsFCSErrors +
-    (u_long) sc->stat_IfInFramesL2FilterDiscards +
     (u_long) sc->stat_IfInRuleCheckerDiscards +
     (u_long) sc->stat_IfInFTQDiscards +
     (u_long) sc->com_no_buffers;

thanks!
so actually it was counting IfInFramesL2FilterDiscards. btw, it worked, it's now 0 input errors.

danny
bce reporting fantom input errors?
Hi,
After changing cables, switches and ports, I came to the conclusion that bce is reporting input errors that are not there, or creating them. I checked this with 3 different boxes, all Dell-2950/Broadcom NetXtreme II BCM5708 1000Base-T (B2). One of them, while running Solaris, reported 0 errors after a week, while under FreeBSD its count reached 100 after a few minutes. The errors appear under 7.-PRERELEASE, but not under 7.0.

Anybody else seeing this?

danny
Re: zfs panics
Hi,

On 2008-12-10, Danny Braniss wrote:

from a solaris or linux client, doing a ls(1) of a nfs exported zfs file, for example: ls /net/zfs-server/h/.zfs/snapshot, panics the server. The server is running latest 7.1-prerelease.

This has been reported as PR kern/125149. I have described the problem in this message:
http://lists.freebsd.org/pipermail/freebsd-fs/2008-October/005217.html
See the PR for RELENG_7 patches. (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/125149)

--
Jaakko

Hi Jaakko,
did you apply the patches, and did they solve the problem? And, btw, which patch?

To Jeremy,
How about adding a line explaining that it would be prudent to 'zfs set snapdir=hidden' ..., or, of course, fix the bug :-)

I will apply the patch/es and see what happens.

cheers,
danny
Re: zfs panics
Hi,

On 2008-12-10, Danny Braniss wrote:

from a solaris or linux client, doing a ls(1) of a nfs exported zfs file, for example: ls /net/zfs-server/h/.zfs/snapshot, panics the server. The server is running latest 7.1-prerelease.

This has been reported as PR kern/125149. I have described the problem in this message:
http://lists.freebsd.org/pipermail/freebsd-fs/2008-October/005217.html
See the PR for RELENG_7 patches. (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/125149)

--
Jaakko

Hi Jaakko,
did you apply the patches, and did they solve the problem? And, btw, which patch?

To Jeremy,
How about adding a line explaining that it would be prudent to 'zfs set snapdir=hidden' ..., or, of course, fix the bug :-)

I will apply the patch/es and see what happens.

cheers,
danny

the patch to nfs_server.c does indeed prevent the panics.

danny
zfs panics
hi,
from a solaris or linux client, doing a ls(1) of a nfs exported zfs file, for example: ls /net/zfs-server/h/.zfs/snapshot, panics the server. The server is running latest 7.1-prerelease.

when the client is freebsd, it mostly works, but in a few cases the server just goes into a coma.

btw, the server is running vanilla zfs, no tuning, and the server is 64bit with 8gb of memory and quad core (dell-pe2950)

    Fatal trap 12: page fault while in kernel mode
    cpuid = 0; apic id = 00
    fault virtual address = 0x168
    fault code = supervisor write data, page not present
    instruction pointer = 0x8:0x804a9175
    stack pointer = 0x10:0xb71fc550
    frame pointer = 0x10:0xb71fc560
    code segment = base 0x0, limit 0xf, type 0x1b
                 = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags = interrupt enabled, resume, IOPL = 0
    current process = 802 (nfsd)
    [thread pid 802 tid 100185 ]
    Stopped at _mtx_lock_flags+0x15: lock cmpxchgq %rsi,0x50(%rdi)
    db> tr
    Tracing pid 802 tid 100185 td 0xff0004d576e0
    _mtx_lock_flags() at _mtx_lock_flags+0x15
    vput() at vput+0x45
    nfsrv_readdirplus() at nfsrv_readdirplus+0x83e
    nfssvc() at nfssvc+0x400
    syscall() at syscall+0x1bb
    Xfast_syscall() at Xfast_syscall+0xab
    --- syscall (155, FreeBSD ELF64, nfssvc), rip = 0x8006885cc, rsp = 0x7fffea2
btx/pxeboot problem
latest pxeboot (7.1):

    mother-board        NIC/LOM  CPU   result
    ------------------  -------  ----  ------
    Intel SWV25         em       xeon  works fine
    SUN X2200           bge      amd   works fine
    DELL PE 2950        bce      xeon  fails 95% of the time - hangs or goes into btx dump regs. mode :-)
    Intel SE7320VP21    msk      xeon  fails 50% of the time - hangs

pxeboot with btx.S 1.45 2008/02/27 23:35:39 works fine.

so it seems that changes since 1.45 have fixed it for some, but it breaks for others :-). I can help testing, but btx is way out of my league.

danny
Re: dhclient doing DISCOVER with bad IP checksum - bge (7.1 show stopper??)
Can someone please confirm or rule out my issue with dhclient sending bad IP checksum packets. It would really suck if 7.1 was released with a broken DHCP client.

I've had many problems lately, but none involved checksums or dhcpd (btw, I assume that you are seeing the bad checksum on the receiving server). Could you add a nic to your PE1750?

danny

Jonathan Feally wrote:

Sorry for the cross-post, but this could be either list's problem.

I have 2 boxes running 7-STABLE as of 20081130, both i386 SMP. One is running ISC DHCPD 3.0.x from recent ports, and the other dhclient from make world. The server is refusing to answer the DISCOVER request, as it thinks the IP checksum is wrong, which tcpdump also confirms. Other DHCP clients are working fine on this network, so I do not believe it to be the network, server or dhcpd.

Server is running a 2 Port Intel card - em driver. Client is a Dell PE1750 with 2 onboard NICs - bge driver.

I have tried turning off both RXCSUM and TXCSUM on both the client and server machines with no luck. I also tried the second NIC on the server with the same result. This setup was working just a couple of weeks ago, and the only thing that has changed is updating the src for a make world. PXE booting this server does result in an IP being issued, so it is pointing towards something new/changed in 7-STABLE.

I have attached a 3 packet dump of the DISCOVER requests. Can anybody shed some light on this for me?

Thanks,
-Jon
Re: diskless+pxe notes
Hi, i finally decided to try and use pxeboot to replace the etherboot method I was using so far for diskless setups. The goal is to fully share the server's root and /usr directories, as documented in diskless(8). I'd like to share the following notes, hopefully to go in the manpage.

cheers
luigi

Hi,
With a slightly modified libstand/bootp.c (a PR was sent way back, but you can check ftp://ftp.cs.huji.ac.il/users/danny/freebsd/diskless-boot) you can control the diskless boot options. By commenting out kernel= in /boot/defaults/loader.conf you can set it in dhcpd.conf. Since most of the tags received via dhcp are placed in kenv, the crucial options are there!

BTW, we use diskless servers/workstations for 90% of our hosts, the exceptions being:
- the dhcp/tftp server
- a 'lagged' server - the router/server get confused :-)
- our mail servers, there is a bug somewhere, where some critical network resources get deadlocked.
- our development servers.

The / of the diskless hosts is almost identical to the server's, but for many reasons I like to keep it apart. The trick to overcome the read-only problem is using unionfs for /etc. In rc.initdiskless:

    if [ -e /conf/union ]; then
        kldload unionfs
        mount_md 4096 /.etc
        mount_unionfs -o transparent /.etc /etc
    fi

The /conf is nfs mounted from a central site; the location is passed via dhcp:

    confpath=`kenv conf-path`
    if [ -n "$confpath" ] ; then
        if [ `expr "$confpath" : '\(.*\):'` ] ; then
            echo "Mounting $confpath on /conf"
            mount_nfs $confpath /conf
            chkerr $? "mount_nfs $confpath /conf"
            to_umount="${to_umount} $confpath"
        fi
    fi

The actual rc.conf is configured like this:

    eval `kenv | sed -n 's/^rc\.//p'`
    rm -f /etc/rc.conf /etc/rc.conf.local
    for fc in $conf0 $conf1 $conf2 $conf3 $conf4 $conf5 $conf6 $conf7 $conf8 $conf9 rc.conf.$hostname
    do
        ho=`expr $fc : '\(.*\):'`
        fl=`expr $fc : '.*/\(.*\)'`
        if [ "${ho}" != "" ]; then
            mp=`expr $fc : '\(.*\)/.*'`
            mount_nfs $mp /mnt > /dev/null 2>&1
            if [ -f /mnt/$fl ]; then
                echo "# from $fc /mnt/$fl" >> /etc/rc.conf
                cat /mnt/$fl >> /etc/rc.conf
            fi
            umount /mnt > /dev/null 2>&1
        elif [ -e /conf/$fc ] ; then
            echo "# from /conf/$fc" >> /etc/rc.conf
            cat /conf/$fc >> /etc/rc.conf
        fi
    done

--- root path configuration ---

There seems to be a well known problem in pxeloader, see kern/106493, where pxeloader defaults to using a root path of /pxeroot when offered / . The patch suggested in http://www.freebsd.org/cgi/query-pr.cgi?pr=106493 is trivial, and judging from it I believe this is addressing a true bug and not a feature. Fortunately there is a workaround (suggested in the PR), which is using // as a root path.

--- sharing /boot with the server ---

I believe it is quite useful to share the whole root partition between the server and the diskless client. This would require at a minimum some conditional code in loader.conf (or loader.rc, etc) so that at least you point to different kernels. A minimalistic approach can be adding this line to /boot/loader.conf:

    bootfile=kernel\\${loaddev};kernel

The variable $loaddev contains the name of the load device, which is pxe0 in the case of pxeboot, and disk* in other cases when loading from the local disk. If you make sure that there is no 'kernel.disk*' in the directory, and instead there is a kernel.pxe0 in the same directory, then the diskless machines and the server will boot from the proper file.

Unfortunately i don't know how to implement a conditional in /boot/loader.conf -- otherwise one could do much nicer things such as differentiate which modules to load and so on.

--- pxeloader bug in 7.x ---

Also worth mentioning is an annoying bug in pxeloader as compiled on 7.x, see http://www.freebsd.org/cgi/query-pr.cgi?pr=118222 i.e. the pxeloader in 7.x fails to proceed and prints the message "can't figure out which disk we are booting from". The workaround is to use a pxeloader from FreeBSD 6. I guess this is a compiler-related problem (given that 6.x uses gcc 3.4 as a compiler, while 7.x uses gcc 4.2).
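The expr(1) patterns in the rc.conf loop above split an entry of the form host:/dir/file into a host, an NFS mount point and a file name. A minimal sketch of just that parsing step follows; the function name split_conf and the sample value nfs1:/export/conf/rc.conf.common are assumptions made up for illustration and are not part of the posted script.

```sh
#!/bin/sh
# Hypothetical helper: split "host:/dir/file" the same way the loop does.
split_conf() {
    fc=$1
    ho=`expr "$fc" : '\(.*\):'`      # host part (everything before the last ':')
    fl=`expr "$fc" : '.*/\(.*\)'`    # file name after the last '/'
    mp=`expr "$fc" : '\(.*\)/.*'`    # everything before the last '/', used as the mount source
    echo "host=$ho mount=$mp file=$fl"
}

# Made-up sample entry, for illustration only.
split_conf "nfs1:/export/conf/rc.conf.common"
```

An entry without a colon leaves the host part empty, which is what sends the loop down the /conf/$fc branch instead of the NFS mount.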
Re: bad NFS/UDP performance
On Sat, 4 Oct 2008, Danny Braniss wrote:

at the moment, the best I can do is run it on a different hardware that has if_em, the results are in ftp://ftp.cs.huji.ac.il/users/danny/lock.prof/7.1-1000.em the benchmark ran better with the Intel NIC, averaged UDP 54MB/s, TCP 53MB/s (I get the same numbers with an older kernel).

Dear Danny:

Unfortunately, I was left slightly unclear on the comparison you are making above. Could you confirm whether or not, with if_em, you see a performance regression using UDP NFS between 7.0-RELEASE and the most recent 7.1-STABLE, and if you do, whether or not the RLOCK-WLOCK change has any effect on performance? It would be nice to know on the same hardware but at least with different hardware we get a sense of whether or not this might affect other systems or whether it's limited to a narrower set of configurations.

Thanks,

    7.1-1000.em         vanilla 7.1         1 x Intel Core Duo
    7.1-1000.x2200.em   vanilla 7.1         2 x Dual-Core AMD Opteron
    7.0-1000.x2200.em   7.0 + RLOCK-WLOCK

the plot thickens. I put an em card in, and the throughput is almost the same as with the bge.

all the tests were done on the same host, a Sun x2200/amd/2cpux2core, except for the one over the weekend, which is an Intel Core Duo, and not the same if_em card. Sorry about that, but one has PCI-X, the other PCI Express :-(.

what is becoming obvious is that NFS/UDP is very temperamental/sensitive :-)

danny
Re: bad NFS/UDP performance
On Fri, 3 Oct 2008, Danny Braniss wrote:

On Fri, 3 Oct 2008, Danny Braniss wrote:

gladly, but have no idea how to do LOCK_PROFILING, so some pointers would be helpful.

The LOCK_PROFILING(9) man page isn't a bad starting point -- I find that the defaults work fine most of the time, so just use them. Turn the enable sysctl on just before you begin a run, and turn it off immediately afterwards. Make sure to reset between reruns (rebooting to a new kernel is fine too!).

in ftp://ftp.cs.huji.ac.il/users/danny/lock.prof there are 3 files:

    7.1-100    host connected at 100 running -prerelease
    7.1-1000   same but connected at 1000
    7.0-1000   -stable with your 'patch'

at 100 my benchmark didn't suffer from the profiling, average was about 9. at 1000 the benchmark got really hit, average was around 12 for the patched, and 4 for the unpatched (less than at 100).

Interesting. A bit of post-processing:

[EMAIL PROTECTED]:/tmp cat 7.1-1000 | awk -F' ' '{print $3 " " $9}' | sort -n | tail -10
2413283 /r+d/7/sys/kern/kern_mutex.c:141
2470096 /r+d/7/sys/nfsclient/nfs_socket.c:1218
2676282 /r+d/7/sys/net/route.c:293
2754866 /r+d/7/sys/kern/vfs_bio.c:1468
3196298 /r+d/7/sys/nfsclient/nfs_bio.c:1664
3318742 /r+d/7/sys/net/route.c:1584
3711139 /r+d/7/sys/dev/bge/if_bge.c:3287
3753518 /r+d/7/sys/net/if_ethersubr.c:405
3961312 /r+d/7/sys/nfsclient/nfs_subs.c:1066
10688531 /r+d/7/sys/dev/bge/if_bge.c:3726

[EMAIL PROTECTED]:/tmp cat 7.0-1000 | awk -F' ' '{print $3 " " $9}' | sort -n | tail -10
468631 /r+d/hunt/src/sys/nfsclient/nfs_nfsiod.c:286
501989 /r+d/hunt/src/sys/nfsclient/nfs_vnops.c:1148
631587 /r+d/hunt/src/sys/nfsclient/nfs_socket.c:1198
701155 /r+d/hunt/src/sys/nfsclient/nfs_socket.c:1258
718211 /r+d/hunt/src/sys/kern/kern_mutex.c:141
1118711 /r+d/hunt/src/sys/nfsclient/nfs_bio.c:1664
1169125 /r+d/hunt/src/sys/nfsclient/nfs_subs.c:1066
1222867 /r+d/hunt/src/sys/kern/vfs_bio.c:1468
3876072 /r+d/hunt/src/sys/netinet/udp_usrreq.c:545
5198927 /r+d/hunt/src/sys/netinet/udp_usrreq.c:864

The first set above is with the unmodified 7-STABLE tree, the second with a reversion of read locking on the UDP inpcb. The big blinking sign of interest is that the bge interface lock is massively contended in the first set of output, and basically doesn't appear in the second. There are various reasons bge could stand out quite so much -- one possibility is that previously, the udp lock serialized all access to the interface from the send code, preventing the send and receive paths from contending.

A few things to try:

- Let's compare the context switch rates on the two benchmarks. Could you run vmstat and look at the cpu cs line during the benchmarks and see how similar the two are as the benchmarks run? You'll want to run it with vmstat -w 1 and collect several samples per benchmark, since we're really interested in the distribution rather than an individual sample.

- Is there any chance you could drop an if_em card into the same box and run the identical benchmarks with and without LOCK_PROFILING to see whether it behaves differently than bge when the patch is applied? if_em's interrupt handling is quite different, and may significantly affect lock use, and hence contention.

at the moment, the best I can do is run it on a different hardware that has if_em, the results are in ftp://ftp.cs.huji.ac.il/users/danny/lock.prof/7.1-1000.em the benchmark ran better with the Intel NIC, averaged UDP 54MB/s, TCP 53MB/s (I get the same numbers with an older kernel).

danny
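The post-processing pipeline above can be wrapped in a small function so the same ranking is applied to each profile file. This is only a sketch: the name top_locks is made up here, and it assumes, as the command above does, that column 3 is the figure worth sorting on and column 9 is the source location of the lock.

```sh
#!/bin/sh
# top_locks FILE [N]: print the N rows with the largest value in column 3,
# keeping only column 3 and column 9 (assumed to be the lock's source location).
top_locks() {
    awk '{ print $3 " " $9 }' "$1" | sort -n | tail -n "${2:-10}"
}

# Example use against the files named in the thread:
#   top_locks 7.1-1000 10
#   top_locks 7.0-1000 10
```

Any header line in the file sorts as zero under sort -n, so it lands at the top and is cut off by tail.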
Re: bad NFS/UDP performance
it more difficult than I expected. for one, the kernel date was missleading, the actual source update is the key, so the window of changes is now 28/July to 19/August. I have the diffs, but nothing yet seems relevant. on the other hand, I tried NFS/TCP, and there things seem ok, ie the 'good' and the 'bad' give the same throughput, which seem to point to UDP changes ...

Can you post the network-numbers?

so I ran some more test, these are for writes IO:

server is a NetApp:

kernel from 18/08/08 00:00:00:

                        /------ UDP ------/   /------ TCP ------/
    BS          Count   time    rate          time    rate
    1*512       38528   0.19s   83.50MB/s     0.20s   80.82MB/s
    2*512       19264   0.21s   76.83MB/s     0.21s   77.57MB/s
    4*512        9632   0.19s   85.51MB/s     0.22s   73.13MB/s
    8*512        4816   0.19s   83.76MB/s     0.21s   75.84MB/s
    16*512       2408   0.19s   83.99MB/s     0.21s   77.18MB/s
    32*512       1204   0.19s   84.45MB/s     0.22s   71.79MB/s
    64*512        602   0.20s   79.98MB/s     0.20s   78.44MB/s
    128*512       301   0.18s   86.51MB/s     0.22s   71.53MB/s
    256*512       150   0.19s   82.83MB/s     0.20s   78.86MB/s
    512*512        75   0.19s   82.77MB/s     0.21s   76.39MB/s
    1024*512       37   0.19s   85.62MB/s     0.21s   76.64MB/s
    2048*512       18   0.21s   77.72MB/s     0.20s   80.30MB/s
    4096*512        9   0.26s   61.06MB/s     0.30s   53.79MB/s
    8192*512        4   0.83s   19.20MB/s     0.41s   39.12MB/s
    16384*512       2   0.84s   19.01MB/s     0.41s   39.03MB/s
    32768*512       1   0.82s   19.59MB/s     0.39s   40.89MB/s

kernel from 19/08/08 00:00:00:

    1*512       38528   0.45s   35.59MB/s     0.20s   81.43MB/s
    2*512       19264   0.45s   35.56MB/s     0.20s   79.24MB/s
    4*512        9632   0.49s   32.66MB/s     0.22s   73.72MB/s
    8*512        4816   0.47s   34.06MB/s     0.21s   75.52MB/s
    16*512       2408   0.53s   30.16MB/s     0.22s   72.58MB/s
    32*512       1204   0.31s   51.68MB/s     0.40s   40.14MB/s
    64*512        602   0.43s   37.23MB/s     0.25s   63.57MB/s
    128*512       301   0.51s   31.39MB/s     0.26s   62.70MB/s
    256*512       150   0.47s   34.02MB/s     0.23s   69.06MB/s
    512*512        75   0.47s   34.01MB/s     0.23s   70.52MB/s
    1024*512       37   0.53s   30.12MB/s     0.22s   73.01MB/s
    2048*512       18   0.55s   29.07MB/s     0.23s   70.64MB/s
    4096*512        9   0.46s   34.69MB/s     0.21s   75.92MB/s
    8192*512        4   0.81s   19.66MB/s     0.43s   36.89MB/s
    16384*512       2   0.80s   19.99MB/s     0.40s   40.29MB/s
    32768*512       1   1.11s   14.41MB/s     0.38s   42.56MB/s
Re: bad NFS/UDP performance
On Fri, 3 Oct 2008, Danny Braniss wrote:

it more difficult than I expected. for one, the kernel date was missleading, the actual source update is the key, so the window of changes is now 28/July to 19/August. I have the diffs, but nothing yet seems relevant. on the other hand, I tried NFS/TCP, and there things seem ok, ie the 'good' and the 'bad' give the same throughput, which seem to point to UDP changes ...

Can you post the network-numbers?

so I ran some more test, these are for writes IO:

OK, so it looks like this was almost certainly the rwlock change. What happens if you pretty much universally substitute the following in udp_usrreq.c:

    Currently            Change to
    ---------            ---------
    INP_RLOCK            INP_WLOCK
    INP_RUNLOCK          INP_WUNLOCK
    INP_RLOCK_ASSERT     INP_WLOCK_ASSERT

I guess you were almost certainly correct :-) I did the global subst. on the udp_usrreq.c from 19/08,

    __FBSDID("$FreeBSD: src/sys/netinet/udp_usrreq.c,v 1.218.2.3 2008/08/18 23:00:41 bz Exp $");

and now udp is fine again!

danny

Robert N M Watson
Computer Laboratory
University of Cambridge
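The global substitution described above can be sketched with sed(1). This is an illustration of the textual change only, not the commands actually used in the thread; the function name rlock_to_wlock is made up for the example.

```sh
#!/bin/sh
# Rewrite the read-lock macros to their write-lock counterparts on stdin.
# INP_RLOCK_ASSERT needs no rule of its own: it contains INP_RLOCK, so the
# first expression already turns it into INP_WLOCK_ASSERT.
rlock_to_wlock() {
    sed -e 's/INP_RLOCK/INP_WLOCK/g' \
        -e 's/INP_RUNLOCK/INP_WUNLOCK/g'
}

# Three sample lines, one per macro in the table above.
printf '%s\n' 'INP_RLOCK(inp);' 'INP_RLOCK_ASSERT(inp);' 'INP_RUNLOCK(inp);' | rlock_to_wlock
```

Applied to a source file it would be run as rlock_to_wlock < udp_usrreq.c > udp_usrreq.c.new, with the result reviewed by hand before rebuilding the kernel.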
Re: bad NFS/UDP performance
forget it about LOCK_PROFILING, I'm RTFM now :-) though some hints on values might be helpful.

have a nice weekend,
danny
Re: bad NFS/UDP performance
On Fri, 3 Oct 2008, Danny Braniss wrote:

OK, so it looks like this was almost certainly the rwlock change. What happens if you pretty much universally substitute the following in udp_usrreq.c:

    Currently            Change to
    ---------            ---------
    INP_RLOCK            INP_WLOCK
    INP_RUNLOCK          INP_WUNLOCK
    INP_RLOCK_ASSERT     INP_WLOCK_ASSERT

I guess you were almost certainly correct :-) I did the global subst. on the udp_usrreq.c from 19/08,

    __FBSDID("$FreeBSD: src/sys/netinet/udp_usrreq.c,v 1.218.2.3 2008/08/18 23:00:41 bz Exp $");

and now udp is fine again!

OK. This is a change I'd rather not back out since it significantly improves performance for many other UDP workloads, so we need to figure out why it's hurting us so much here so that we know if there are reasonable alternatives.

Would it be possible for you to do a run of the workload with both kernels using LOCK_PROFILING around the benchmark, and then we can compare lock contention in the two cases? What we often find is that relieving contention at one point causes new contention at another point, and if the primitive used at that point handles contention less well for whatever reason, performance can be reduced rather than improved. So maybe we're looking at an issue in the dispatched UDP code from so_upcall? Another less satisfying (and fundamentally more difficult) answer might be something to do with the scheduler, but a bit more analysis may shed some light.

gladly, but have no idea how to do LOCK_PROFILING, so some pointers would be helpful.

as a side note, many years ago I checked out NFS/TCP and it was really bad, I even remember NetApp telling us to drop TCP, but now, things look rather better. Wonder what caused it.

danny
Re: bad NFS/UDP performance
On Fri, 3 Oct 2008, Danny Braniss wrote:

gladly, but have no idea how to do LOCK_PROFILING, so some pointers would be helpful.

The LOCK_PROFILING(9) man page isn't a bad starting point -- I find that the defaults work fine most of the time, so just use them. Turn the enable sysctl on just before you begin a run, and turn it off immediately afterwards. Make sure to reset between reruns (rebooting to a new kernel is fine too!).

in ftp://ftp.cs.huji.ac.il/users/danny/lock.prof there are 3 files:

    7.1-100    host connected at 100 running -prerelease
    7.1-1000   same but connected at 1000
    7.0-1000   -stable with your 'patch'

at 100 my benchmark didn't suffer from the profiling, average was about 9. at 1000 the benchmark got really hit, average was around 12 for the patched, and 4 for the unpatched (less than at 100).

danny
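The enable/run/disable/reset sequence described above might look roughly like the following. It is a sketch under assumptions: the sysctl names are the debug.lock.prof.* knobs as I understand them from LOCK_PROFILING(9) and should be checked against the man page for the release in use, the kernel must have been built with the LOCK_PROFILING option, and the function name profile_run and the SYSCTL override are made up for the example.

```sh
#!/bin/sh
# profile_run OUTFILE COMMAND...: wrap one benchmark run in lock profiling.
# SYSCTL may be overridden for a dry run; it defaults to the real sysctl(8).
profile_run() {
    out=$1; shift
    ${SYSCTL:-sysctl} debug.lock.prof.reset=1     # clear counters left by earlier runs
    ${SYSCTL:-sysctl} debug.lock.prof.enable=1    # start collecting just before the run
    "$@"                                          # the benchmark itself
    ${SYSCTL:-sysctl} debug.lock.prof.enable=0    # stop collecting immediately afterwards
    ${SYSCTL:-sysctl} debug.lock.prof.stats > "$out"
}

# Example use (./nfs-bench is a placeholder for whatever benchmark is being run):
#   profile_run 7.1-1000 ./nfs-bench
```

Resetting at the start of every run keeps the saved statistics limited to that run, which is what makes files from different kernels comparable.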
Re: bad NFS/UDP performance
On Fri, 26 Sep 2008, Danny Braniss wrote:

after more testing, it seems it's related to changes made between Aug 4 and Aug 29, ie, a kernel built on Aug 4 works fine, Aug 29 is slow. I'll now try and close the gap.

I think this is the best way forward -- skimming August changes, there are a number of candidate commits, including retuning of UDP hashes by mav, my rwlock changes, changes to mbuf chain handling, etc.

it's more difficult than I expected. for one, the kernel date was misleading, the actual source update is the key, so the window of changes is now 28/July to 19/August. I have the diffs, but nothing yet seems relevant. on the other hand, I tried NFS/TCP, and there things seem ok, ie the 'good' and the 'bad' give the same throughput, which seems to point to UDP changes ...

danny

Grr, there goes binary search theory out of the window. So far I have managed to pinpoint the day that the changes affect the throughput: between

    18/08/08 00:00:00
    19/08/08 00:00:00

(I assume cvs's date is GMT). Now would be a good time for some help, especially on how to undo changes; my knowledge of csup/cvs is close to zero.

danny
Re: bad NFS/UDP performance
it more difficult than I expected. for one, the kernel date was missleading, the actual source update is the key, so the window of changes is now 28/July to 19/August. I have the diffs, but nothing yet seems relevant. on the other hand, I tried NFS/TCP, and there things seem ok, ie the 'good' and the 'bad' give the same throughput, which seem to point to UDP changes ...

Can you post the network-numbers?

[again :-]

Writing 16 MB file

    BS          Count   /------ 7.0 ------/   /------ 7.1 ------/
should now read:
                        /----- Aug 18 ----/   /----- Aug 19 ----/
    1*512       32768   0.16s    98.11MB/s    0.43s   37.18MB/s
    2*512       16384   0.17s    92.04MB/s    0.46s   34.79MB/s
    4*512        8192   0.16s   101.88MB/s    0.43s   37.26MB/s
    8*512        4096   0.16s    99.86MB/s    0.44s   36.41MB/s
    16*512       2048   0.16s   100.11MB/s    0.50s   32.03MB/s
    32*512       1024   0.26s    61.71MB/s    0.46s   34.79MB/s
    64*512        512   0.22s    71.45MB/s    0.45s   35.41MB/s
    128*512       256   0.21s    77.84MB/s    0.51s   31.34MB/s
    256*512       128   0.19s    82.47MB/s    0.43s   37.22MB/s
    512*512        64   0.18s    87.77MB/s    0.49s   32.69MB/s
    1024*512       32   0.18s    89.24MB/s    0.47s   34.02MB/s
    2048*512       16   0.17s    91.81MB/s    0.30s   53.41MB/s
    4096*512        8   0.16s   100.56MB/s    0.42s   38.07MB/s
    8192*512        4   0.82s    19.56MB/s    0.80s   19.95MB/s
    16384*512       2   0.82s    19.63MB/s    0.95s   16.80MB/s
    32768*512       1   0.81s    19.69MB/s    0.96s   16.64MB/s
    Average:                     75.86                33.00
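The two averages quoted above can be rechecked from the table rows with a few lines of awk. This is a sketch written for this note, not the script that produced the original figures; the function name avg_rates is an assumption.

```sh
#!/bin/sh
# Average columns 4 and 6 (the two MB/s columns) of rows shaped like
#   1*512 32768 0.16s 98.11MB/s 0.43s 37.18MB/s
# awk's numeric conversion uses the leading number and ignores "MB/s".
avg_rates() {
    awk 'NF == 6 { a += $4; b += $6; n++ }
         END { if (n) printf "%.2f %.2f\n", a / n, b / n }'
}

# The 16 rows of the 16 MB write test (Aug 18 kernel, then Aug 19 kernel).
avg_rates <<'EOF'
1*512 32768 0.16s 98.11MB/s 0.43s 37.18MB/s
2*512 16384 0.17s 92.04MB/s 0.46s 34.79MB/s
4*512 8192 0.16s 101.88MB/s 0.43s 37.26MB/s
8*512 4096 0.16s 99.86MB/s 0.44s 36.41MB/s
16*512 2048 0.16s 100.11MB/s 0.50s 32.03MB/s
32*512 1024 0.26s 61.71MB/s 0.46s 34.79MB/s
64*512 512 0.22s 71.45MB/s 0.45s 35.41MB/s
128*512 256 0.21s 77.84MB/s 0.51s 31.34MB/s
256*512 128 0.19s 82.47MB/s 0.43s 37.22MB/s
512*512 64 0.18s 87.77MB/s 0.49s 32.69MB/s
1024*512 32 0.18s 89.24MB/s 0.47s 34.02MB/s
2048*512 16 0.17s 91.81MB/s 0.30s 53.41MB/s
4096*512 8 0.16s 100.56MB/s 0.42s 38.07MB/s
8192*512 4 0.82s 19.56MB/s 0.80s 19.95MB/s
16384*512 2 0.82s 19.63MB/s 0.95s 16.80MB/s
32768*512 1 0.81s 19.69MB/s 0.96s 16.64MB/s
EOF
# prints: 75.86 33.00
```

The result matches the Average line in the message, so the Aug 19 kernel delivers a bit under half the UDP write throughput of the Aug 18 kernel in this test.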
Re: bad NFS/UDP performance
:-vfs.nfs.realign_test: 22141777
:+vfs.nfs.realign_test: 498351
:
:-vfs.nfsrv.realign_test: 5005908
:+vfs.nfsrv.realign_test: 0
:
:+vfs.nfsrv.commit_miss: 0
:+vfs.nfsrv.commit_blks: 0
:
: changing them did nothing - or at least with respect to nfs throughput :-)
:
:I'm not sure what any of these do, as NFS is a bit out of my league.
::-) I'll be following this thread though!
:
:--
:| Jeremy Chadwick jdc at parodius.com |

A non-zero nfs_realign_count is bad, it means NFS had to copy the mbuf chain to fix the alignment. nfs_realign_test is just the number of times it checked. So nfs_realign_test is irrelevant. it's nfs_realign_count that matters.

it's zero, so I guess I'm ok there. funny though, on my 'good' machine, vfs.nfsrv.realign_test: 5862999 and on the slow one, it's 0 - but then again the good one has been up for several days.

Several things can cause NFS payloads to be improperly aligned. Anything from older network drivers which can't start DMA on a 2-byte boundary, resulting in the 14-byte encapsulation header causing improper alignment of the IP header payload, to rpc embedded in NFS TCP streams winding up being misaligned.

Modern network hardware either supports 2-byte-aligned DMA, allowing the encapsulation to be 2-byte aligned so the payload winds up being 4-byte aligned, or supports DMA chaining allowing the payload to be placed in its own mbuf, or pads, etc.

--

One thing I would check is to be sure a couple of nfsiod's are running on the client when doing your tests. If none are running the RPCs wind up being more synchronous and less pipelined. Another thing I would check is IP fragment reassembly statistics (for UDP) - there should be none for TCP connections no matter what the NFS I/O size selected.

ahh, nfsiod, it seems that it's now dynamically started! at least none show when the host is idle, after i run my tests there are 20! with ppid 0. need to refresh my NFS knowledge. how can I see the IP fragment reassembly statistics?

(It does seem more likely to be scheduler-related, though).

tend to agree, I tried both ULE/BSD, but the badness is there.

-Matt

thanks,
danny
Re: bad NFS/UDP performance
David,

You beat me to it.

Danny, read the iperf man page:

  -b, --bandwidth n[KM]
      set target bandwidth to n bits/sec (default 1 Mbit/sec).
      This setting requires UDP (-u).

The page needs updating, though. It should read -b, --bandwidth n[KMG]. It also does NOT require -u. If you use -b, UDP is assumed.

I did RTFM(*), but when I tried it, it just wouldn't work. I tried today and it's actually working - so don't RTFM before coffee!

btw, even though iperf sucks, netperf udp tends to bring the server down to its knees.

danny

PS: * - I don't seem to have the iperf man page, all I have is iperf -h
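For readers following along, here is a rough sketch of what a UDP run with an explicit target rate might look like. The default target of 1 Mbit/sec quoted above would probably explain a UDP result of about 1 Mbit/s when -b is left out. The host name and the 900M rate are placeholders, not values from this thread, and the function only prints the iperf (version 2) command lines so it can be tried on a machine without iperf installed.

```sh
#!/bin/sh
# Dry run: print the two iperf commands for a UDP bandwidth test.
# "nfs-test-host" and "900M" below are hypothetical example values.

udp_test_cmds() {
    server="$1"   # host running the iperf server
    rate="$2"     # target bandwidth, e.g. 900M

    # Receiving side: server mode, UDP.
    echo "iperf -s -u"
    # Sending side: client mode, UDP, explicit target bandwidth.
    echo "iperf -c $server -u -b $rate"
}

udp_test_cmds nfs-test-host 900M
```

Run the first printed command on the receiving host and the second on the sender; without -b the client keeps the 1 Mbit/sec default.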
Re: bad NFS/UDP performance
On Fri, 26 Sep 2008, Danny Braniss wrote:

after more testing, it seems it's related to changes made between Aug 4 and Aug 29, i.e. a kernel built on Aug 4 works fine, Aug 29 is slow. I'll now try and close the gap.

I think this is the best way forward -- skimming August changes, there are a number of candidate commits, including retuning of UDP hashes by mav, my rwlock changes, changes to mbuf chain handling, etc.

it's more difficult than I expected. for one, the kernel date was misleading; the actual source update is the key, so the window of changes is now 28/July to 19/August. I have the diffs, but nothing yet seems relevant.

on the other hand, I tried NFS/TCP, and there things seem ok, i.e. the 'good' and the 'bad' give the same throughput, which seems to point to UDP changes ...

danny
bad NFS/UDP performance
Hi, There seems to be some serious degradation in performance. Under 7.0 I get about 90 MB/s (on write), while, on the same machine under 7.1 it drops to 20! Any ideas? thanks, danny
Re: bad NFS/UDP performance
On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:

Hi, There seems to be some serious degradation in performance. Under 7.0 I get about 90 MB/s (on write), while, on the same machine under 7.1 it drops to 20! Any ideas?

1) Network card driver changes,

could be, but at least iperf/tcp is ok - can't get udp numbers, do you know of any tool to measure udp performance?
BTW, I also checked on different hardware, and the badness is there.

2) This could be relevant, but rwatson@ will need to help determine that.
http://lists.freebsd.org/pipermail/freebsd-stable/2008-September/045109.html

gut feeling is that it's somewhere else:

Writing 16 MB file

       BS  Count   /---- 7.0 ------/   /---- 7.1 -----/
    1*512  32768   0.16s   98.11MB/s   0.43s  37.18MB/s
    2*512  16384   0.17s   92.04MB/s   0.46s  34.79MB/s
    4*512   8192   0.16s  101.88MB/s   0.43s  37.26MB/s
    8*512   4096   0.16s   99.86MB/s   0.44s  36.41MB/s
   16*512   2048   0.16s  100.11MB/s   0.50s  32.03MB/s
   32*512   1024   0.26s   61.71MB/s   0.46s  34.79MB/s
   64*512    512   0.22s   71.45MB/s   0.45s  35.41MB/s
  128*512    256   0.21s   77.84MB/s   0.51s  31.34MB/s
  256*512    128   0.19s   82.47MB/s   0.43s  37.22MB/s
  512*512     64   0.18s   87.77MB/s   0.49s  32.69MB/s
 1024*512     32   0.18s   89.24MB/s   0.47s  34.02MB/s
 2048*512     16   0.17s   91.81MB/s   0.30s  53.41MB/s
 4096*512      8   0.16s  100.56MB/s   0.42s  38.07MB/s
 8192*512      4   0.82s   19.56MB/s   0.80s  19.95MB/s
16384*512      2   0.82s   19.63MB/s   0.95s  16.80MB/s
32768*512      1   0.81s   19.69MB/s   0.96s  16.64MB/s
Average:                   75.86              33.00

the nfs filer is a Network Appliance, and is in use, so I get fluctuations in the measurements, but the relation is similar: good on 7.0, bad on 7.1

Cheers,
danny
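As a sanity check on the two averages, the MB/s columns can be re-added with a few lines of awk. This is only a sketch for readers: the figures are copied from the table in this message, and the function names are made up for illustration.

```sh
#!/bin/sh
# Recompute the per-release average throughput from the write test.
# Columns: blocksize count time(7.0) MB/s(7.0) time(7.1) MB/s(7.1)

results() {
cat <<'EOF'
1*512     32768 0.16s  98.11MB/s 0.43s 37.18MB/s
2*512     16384 0.17s  92.04MB/s 0.46s 34.79MB/s
4*512      8192 0.16s 101.88MB/s 0.43s 37.26MB/s
8*512      4096 0.16s  99.86MB/s 0.44s 36.41MB/s
16*512     2048 0.16s 100.11MB/s 0.50s 32.03MB/s
32*512     1024 0.26s  61.71MB/s 0.46s 34.79MB/s
64*512      512 0.22s  71.45MB/s 0.45s 35.41MB/s
128*512     256 0.21s  77.84MB/s 0.51s 31.34MB/s
256*512     128 0.19s  82.47MB/s 0.43s 37.22MB/s
512*512      64 0.18s  87.77MB/s 0.49s 32.69MB/s
1024*512     32 0.18s  89.24MB/s 0.47s 34.02MB/s
2048*512     16 0.17s  91.81MB/s 0.30s 53.41MB/s
4096*512      8 0.16s 100.56MB/s 0.42s 38.07MB/s
8192*512      4 0.82s  19.56MB/s 0.80s 19.95MB/s
16384*512     2 0.82s  19.63MB/s 0.95s 16.80MB/s
32768*512     1 0.81s  19.69MB/s 0.96s 16.64MB/s
EOF
}

averages() {
    # awk takes the leading number of "98.11MB/s" when doing arithmetic,
    # so the unit suffix does not need to be stripped first.
    awk '{ a += $4; b += $6; n++ }
         END { printf "%.2f %.2f\n", a / n, b / n }'
}

results | averages
```

The two numbers printed are the 7.0 and 7.1 averages in MB/s, in that order.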
Re: bad NFS/UDP performance
On Fri, Sep 26, 2008 at 12:27:08PM +0300, Danny Braniss wrote:
On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:

Hi, There seems to be some serious degradation in performance. Under 7.0 I get about 90 MB/s (on write), while, on the same machine under 7.1 it drops to 20! Any ideas?

1) Network card driver changes,

could be, but at least iperf/tcp is ok - can't get udp numbers, do you know of any tool to measure udp performance?
BTW, I also checked on different hardware, and the badness is there.

According to INDEX, benchmarks/iperf does UDP bandwidth testing.

I know, but I get about 1mgb, which seems somewhat low :-(

benchmarks/nttcp should as well. What network card is in use? If Intel, what driver version (should be in dmesg).

bge: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x9003
and
bce: Broadcom NetXtreme II BCM5708 1000Base-T (B2)
and intels, but haven't tested there yet.

2) This could be relevant, but rwatson@ will need to help determine that.
http://lists.freebsd.org/pipermail/freebsd-stable/2008-September/045109.html

gut feeling is that it's somewhere else:

Writing 16 MB file

       BS  Count   /---- 7.0 ------/   /---- 7.1 -----/
    1*512  32768   0.16s   98.11MB/s   0.43s  37.18MB/s
    2*512  16384   0.17s   92.04MB/s   0.46s  34.79MB/s
    4*512   8192   0.16s  101.88MB/s   0.43s  37.26MB/s
    8*512   4096   0.16s   99.86MB/s   0.44s  36.41MB/s
   16*512   2048   0.16s  100.11MB/s   0.50s  32.03MB/s
   32*512   1024   0.26s   61.71MB/s   0.46s  34.79MB/s
   64*512    512   0.22s   71.45MB/s   0.45s  35.41MB/s
  128*512    256   0.21s   77.84MB/s   0.51s  31.34MB/s
  256*512    128   0.19s   82.47MB/s   0.43s  37.22MB/s
  512*512     64   0.18s   87.77MB/s   0.49s  32.69MB/s
 1024*512     32   0.18s   89.24MB/s   0.47s  34.02MB/s
 2048*512     16   0.17s   91.81MB/s   0.30s  53.41MB/s
 4096*512      8   0.16s  100.56MB/s   0.42s  38.07MB/s
 8192*512      4   0.82s   19.56MB/s   0.80s  19.95MB/s
16384*512      2   0.82s   19.63MB/s   0.95s  16.80MB/s
32768*512      1   0.81s   19.69MB/s   0.96s  16.64MB/s
Average:                   75.86              33.00

the nfs filer is a Network Appliance, and is in use, so I get fluctuations in the measurements, but the relation is similar: good on 7.0, bad on 7.1

Do you have any NFS-related tunings in /etc/rc.conf or /etc/sysctl.conf?

no, but diffing the sysctl output shows:

-vfs.nfs.realign_test: 22141777
+vfs.nfs.realign_test: 498351
-vfs.nfsrv.realign_test: 5005908
+vfs.nfsrv.realign_test: 0
+vfs.nfsrv.commit_miss: 0
+vfs.nfsrv.commit_blks: 0

changing them did nothing - or at least with respect to nfs throughput :-)

danny
Re: bad NFS/UDP performance
On Fri, 2008-09-26 at 10:04 +0300, Danny Braniss wrote:

Hi, There seems to be some serious degradation in performance. Under 7.0 I get about 90 MB/s (on write), while, on the same machine under 7.1 it drops to 20! Any ideas?

The scheduler has been changed to ULE, and NFS has historically been very sensitive to changes like that. You could try switching back to the 4BSD scheduler and seeing if that makes a difference. If it does, toggling PREEMPTION would also be interesting to see the results of.

Gavin

I'm testing 7.0-stable vs 7.1-prerelease, and both have ULE. BTW, the nfs client hosts I'm testing are idle.

danny
Re: bad NFS/UDP performance
On Fri, Sep 26, 2008 at 12:27:08PM +0300, Danny Braniss wrote:
On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:

Hi, There seems to be some serious degradation in performance. Under 7.0 I get about 90 MB/s (on write), while, on the same machine under 7.1 it drops to 20! Any ideas?

1) Network card driver changes,

could be, but at least iperf/tcp is ok - can't get udp numbers, do you know of any tool to measure udp performance?
BTW, I also checked on different hardware, and the badness is there.

According to INDEX, benchmarks/iperf does UDP bandwidth testing. benchmarks/nttcp should as well. What network card is in use? If Intel, what driver version (should be in dmesg).

2) This could be relevant, but rwatson@ will need to help determine that.
http://lists.freebsd.org/pipermail/freebsd-stable/2008-September/045109.html

gut feeling is that it's somewhere else:

Writing 16 MB file

       BS  Count   /---- 7.0 ------/   /---- 7.1 -----/
    1*512  32768   0.16s   98.11MB/s   0.43s  37.18MB/s
    2*512  16384   0.17s   92.04MB/s   0.46s  34.79MB/s
    4*512   8192   0.16s  101.88MB/s   0.43s  37.26MB/s
    8*512   4096   0.16s   99.86MB/s   0.44s  36.41MB/s
   16*512   2048   0.16s  100.11MB/s   0.50s  32.03MB/s
   32*512   1024   0.26s   61.71MB/s   0.46s  34.79MB/s
   64*512    512   0.22s   71.45MB/s   0.45s  35.41MB/s
  128*512    256   0.21s   77.84MB/s   0.51s  31.34MB/s
  256*512    128   0.19s   82.47MB/s   0.43s  37.22MB/s
  512*512     64   0.18s   87.77MB/s   0.49s  32.69MB/s
 1024*512     32   0.18s   89.24MB/s   0.47s  34.02MB/s
 2048*512     16   0.17s   91.81MB/s   0.30s  53.41MB/s
 4096*512      8   0.16s  100.56MB/s   0.42s  38.07MB/s
 8192*512      4   0.82s   19.56MB/s   0.80s  19.95MB/s
16384*512      2   0.82s   19.63MB/s   0.95s  16.80MB/s
32768*512      1   0.81s   19.69MB/s   0.96s  16.64MB/s
Average:                   75.86              33.00

the nfs filer is a Network Appliance, and is in use, so I get fluctuations in the measurements, but the relation is similar: good on 7.0, bad on 7.1

Do you have any NFS-related tunings in /etc/rc.conf or /etc/sysctl.conf?

after more testing, it seems it's related to changes made between Aug 4 and Aug 29, i.e. a kernel built on Aug 4 works fine, Aug 29 is slow. I'll now try and close the gap.

danny
Re: RELENG_7 hangs on boot w/Gigabyte MA78GM-S2H MB
On Sat, Sep 20, 2008 at 08:05:33AM -0700, Jeremy Chadwick wrote:
On Sat, Sep 20, 2008 at 09:45:10AM -0500, Bob Willcox wrote:
On Sat, Sep 20, 2008 at 07:04:56AM -0700, Jeremy Chadwick wrote:
On Sat, Sep 20, 2008 at 08:24:29AM -0500, Bob Willcox wrote:

1) It would be helpful to know if you installed i386 or amd64 FreeBSD,

This is amd64 on this particular machine.

2) With regards to the lock-up after mount root, if you press NumLock or CapsLock, do the keyboard LEDs turn on/off?

Nope, no keys do anything. You must either push reset or pull the plug.

Is it possible to get the output when booting in verbose mode? If not, what are the last few lines before the machine locks up when booting verbosely?

Yep, just did that. The last things printed right before hang are:

ioapic0: Assigning ISA IRQ 1 to local APIC 0
ioapic0: Assigning ISA IRQ 4 to local APIC 1
ioapic0: Assigning ISA IRQ 6 to local APIC 2
ioapic0: Assigning ISA IRQ 7 to local APIC 0
ioapic0: Assigning ISA IRQ 9 to local APIC 1
ioapic0: Assigning ISA IRQ 12 to local APIC 2
ioapic0: Assigning ISA IRQ 14 to local APIC 0
ioapic0: Assigning ISA PCI 16 to local APIC 1
ioapic0: Assigning ISA PCI 17 to local APIC 2
ioapic0: Assigning ISA PCI 18 to local APIC 0
ioapic0: Assigning ISA PCI 19 to local APIC 1
ioapic0: Assigning ISA PCI 22 to local APIC 2
trying to mount root from ufs:/dev/ad4s1a
start_init: trying /sbin/init
[hung at this point]

3) Many others have seen the hanging/lock-up after mount root. I believe one found a workaround by setting ATA_STATIC_ID in their kernel configuration. I realise this is a problem when you can't get the system up to a point of building a kernel; chicken-and-egg problem,

Well, I can build a kernel if I run the 7.0-release kernel. That's how I got to 7-stable on the machine in the first place. I used sneaker net to copy it to this one via a CD (as I mentioned, the 7.0 kernel boots but the Realtek ethernet device is not recognized).
So the problem is that 7.0-RELEASE works fine for you, but after upgrading your RELENG_7 source (to what is now 7.1-BETA), the machine hangs after printing the mount root message. Is this correct?

Yes, that is pretty much it. The Realtek ethernet isn't working in 7.0-RELEASE either, but I'm guessing that that is a different (and less serious) problem related to changes in that device.

Here's another question: does booting into single-user exhibit the same problem as multi-user?

It looks like when I try a single-user mode (and verbose) boot the only difference is that the last line shown above (the start_init line) isn't printed. Otherwise, the hang is the same.

4) The Realtek NIC on that motherboard is probably too new to be supported under RELENG_7. Realtek has a history of releasing different sub-revisions of the same NIC/PHY, and the internal changes are severe enough to cause the NIC to not work correctly (under any OS) without full driver support for that specific sub-revision.

That's what I suspected. The values displayed when doing a pciconf -lv are similar to those for the system I'm using to type this, but now that I look closer and make a direct comparison, the failing device has a rev=0x02 vs. rev=0x01 for the working one. The pciconf -lv output for the failing mb is:

[EMAIL PROTECTED]:2:0:0: class=0x02 card=0xe0001458 chip=0x816810ec rev=0x02 hdr=0x00
    vendor   = 'Realtek Semiconductor'
    device   = 'RTL8168/8111 PCI-E Gigabit Ethernet NIC'
    class    = network
    subclass = ethernet

Regarding the Realtek issue: I've CC'd PYUN Yong-Hyeon (surname in caps), who maintains the re(4) driver for FreeBSD. He might have a patch available for you to try, or help determine how to get this NIC working on FreeBSD. He'll probably need more than just pciconf -lv output, but should be able to work with you.

Ok, that'd be great. I must say that I'm close to simply returning this MB and going with something not quite so new that is more likely to work.
I was hoping to get this system up and running this weekend. :(

I wish I knew what was causing the lock-up for you. I'm truly baffled, especially given that the system is able to boot + find the kernel + load kernel modules. Debugging this problem is out of my field; jhb@ might have some ideas, as I'm not sure what magic happens immediately before the root filesystem is mounted.

Those debugging/helping may want disklabel -r -A ad4s1 output. At least you can boot 7.0-RELEASE to get that information.

Regarding hardware: I myself purchased an Asus P5Q SE board, with an
Re: bin/121684: : dump(8) frequently hangs
Danny,

Thanks for the suggestion, but my system is a P-III so there is only one CPU. At 1GHz, I think that this easily qualifies as an older, slower, non-smp host.

I just tried it on a GEODE/7.0 stable - slower than a P-III :-), and dump went smoothly. So, to help isolate the problem, if it still happens to you, let me know what os/version - I can probably find a P-III around here.

danny
Re: bin/121684: : dump(8) frequently hangs
take a look at: http://www.freebsd.org/cgi/query-pr.cgi?pr=117603 danny
Re: Using iscsi with multiple targets
FreeBSD 7.0

I have 2 machines with identical configurations/hardware, let's call them A (master) and B (slave). I have installed iscsi-target from ports and have set up 3 targets representing the 3 drives I wish to be connected to from A.

The Targets file:

# extents       file            start   length
extent0         /dev/da1        0       465GB
extent1         /dev/da2        0       465GB
extent2         /dev/da3        0       465GB

# target        flags   storage         netmask
target0         rw      extent0         192.168.0.1/24
target1         rw      extent1         192.168.0.1/24
target2         rw      extent2         192.168.0.1/24

I then start up iscsi_target and all is good.

Now on A I have set up my /etc/iscsi.conf file as follows:

# cat /etc/iscsi.conf
data1 {
        targetaddress=192.168.0.252
        targetname=iqn.1994-04.org.netbsd.iscsi-target:target0
        initiatorname=iqn.2005-01.il.ac.huji.cs::BSD-2-1.sven.local
}
data2 {
        targetaddress=192.168.0.252
        targetname=iqn.1994-04.org.netbsd.iscsi-target:target1
        initiatorname=iqn.2005-01.il.ac.huji.cs::BSD-2-1.sven.local
}
data3 {
        targetaddress=192.168.0.252
        targetname=iqn.1994-04.org.netbsd.iscsi-target:target2
        initiatorname=iqn.2005-01.il.ac.huji.cs::BSD-2-1.sven.local
}

So far so good, now come the issues. First of all, it would appear that with iscontrol one can only start one named session at a time; for example

/sbin/iscontrol -n data1
/sbin/iscontrol -n data2
/sbin/iscontrol -n data3

I guess that is ok, except that each invocation of iscontrol resets the other sessions. Here is the camcontrol and dmesg output from running the above 3 commands.
# camcontrol devlist
<AMCC 9550SXU-8L DISK 3.08>        at scbus0 target 0 lun 0 (pass0,da0)
<AMCC 9550SXU-8L DISK 3.08>        at scbus0 target 1 lun 0 (pass1,da1)
<AMCC 9550SXU-8L DISK 3.08>        at scbus0 target 2 lun 0 (pass2,da2)
<AMCC 9550SXU-8L DISK 3.08>        at scbus0 target 3 lun 0 (pass3,da3)
<NetBSD NetBSD iSCSI 0>            at scbus1 target 0 lun 0 (da5,pass5)
<NetBSD NetBSD iSCSI 0>            at scbus1 target 1 lun 0 (da6,pass6)
<NetBSD NetBSD iSCSI 0>            at scbus1 target 2 lun 0 (da4,pass4)

[ /sbin/iscontrol -n data1 ]
da4 at iscsi0 bus 0 target 0 lun 0
da4: <NetBSD NetBSD iSCSI 0> Fixed Direct Access SCSI-3 device

[ /sbin/iscontrol -n data2 ]
(da4:iscsi0:0:0:0): lost device
(da4:iscsi0:0:0:0): removing device entry
da4 at iscsi0 bus 0 target 0 lun 0
da4: <NetBSD NetBSD iSCSI 0> Fixed Direct Access SCSI-3 device
da5 at iscsi0 bus 0 target 1 lun 0
da5: <NetBSD NetBSD iSCSI 0> Fixed Direct Access SCSI-3 device

[ /sbin/iscontrol -n data3 ]
(da4:iscsi0:0:0:0): lost device
(da4:iscsi0:0:0:0): removing device entry
(da5:iscsi0:0:1:0): lost device
(da5:iscsi0:0:1:0): removing device entry
da4 at iscsi0 bus 0 target 2 lun 0
da4: <NetBSD NetBSD iSCSI 0> Fixed Direct Access SCSI-3 device
da5 at iscsi0 bus 0 target 0 lun 0
da5: <NetBSD NetBSD iSCSI 0> Fixed Direct Access SCSI-3 device
da6 at iscsi0 bus 0 target 1 lun 0
da6: <NetBSD NetBSD iSCSI 0> Fixed Direct Access SCSI-3 device

It would appear that rather than appending the new device to the end of the da devices, it starts to do some type of naming queue after the second device. If I am to use these devices in any type of automated setup, how can I make sure that after these commands, da6 will always be target 1 (i.e. /dev/da2 on the slave machine)?

Next, there is no startup script for iscontrol - would that simply have to be added to the system or is there a way with sysctl that it could be done?

The plan here is to use gmirror such that /dev/da1 on A is mirrored with the /dev/da1 on B using iscsi.
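To make the ordering problem concrete, here is a small sketch that pulls the target-to-device mapping out of the devlist output above. It assumes the usual camcontrol devlist layout with the description in angle brackets; the function names are invented for this note, and the sample lines are the ones from this message.

```sh
#!/bin/sh
# Map iSCSI targets to da devices from "camcontrol devlist" output.

devlist() {
cat <<'EOF'
<AMCC 9550SXU-8L DISK 3.08>        at scbus0 target 0 lun 0 (pass0,da0)
<AMCC 9550SXU-8L DISK 3.08>        at scbus0 target 1 lun 0 (pass1,da1)
<AMCC 9550SXU-8L DISK 3.08>        at scbus0 target 2 lun 0 (pass2,da2)
<AMCC 9550SXU-8L DISK 3.08>        at scbus0 target 3 lun 0 (pass3,da3)
<NetBSD NetBSD iSCSI 0>            at scbus1 target 0 lun 0 (da5,pass5)
<NetBSD NetBSD iSCSI 0>            at scbus1 target 1 lun 0 (da6,pass6)
<NetBSD NetBSD iSCSI 0>            at scbus1 target 2 lun 0 (da4,pass4)
EOF
}

target_to_da() {
    bus="$1"   # bus name as printed by camcontrol, e.g. scbus1

    # Keep only lines for the requested bus, reduced to
    # "<target> <comma-separated device names>".
    sed -n "s/.* at $bus target \([0-9][0-9]*\) lun [0-9][0-9]* (\(.*\))\$/\1 \2/p" |
    # Pick the daN name out of the list (the order inside the
    # parentheses varies), then sort by target number.
    awk '{
        n = split($2, names, ",")
        for (i = 1; i <= n; i++)
            if (names[i] ~ /^da[0-9]+$/) print $1, names[i]
    }' | sort -n
}

devlist | target_to_da scbus1
```

On a live system the input would come from `camcontrol devlist` instead of the embedded sample. For the listing above it shows target 2 sitting on da4 while targets 0 and 1 are on da5 and da6, which is the reshuffling described in the message.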
Hi Sven,

I just tried it here, and it seems that at the end all is ok :-)
I think the lost/removing/found has something to do with iscontrol calling camcontrol rescan - I will check this later, but the end result is that you should have all the /dev/da's.

I don't see any reasonably safe way to tie a scsi# (/dev/dan), except to label (see glabel) the disk.

The startup script is, at the moment, not trivial, but I'm attaching it, so someone can suggest improvements :-)

#!/bin/sh

# PROVIDE: iscsi
# REQUIRE: NETWORKING
# BEFORE: DAEMON
# KEYWORD: nojail shutdown

#
# Add the following lines to /etc/rc.conf to enable iscsi:
#
# iscsi_enable=YES
# iscsi_fstab=/etc/fstab.iscsi

. /etc/rc.subr
. /cs/share/etc/rc.subr

name=iscsi
rcvar=`set_rcvar`

command=/sbin/iscontrol

iscsi_enable=${iscsi_enable:-NO}
iscsi_fstab=${iscsi_fstab:-/etc/fstab.iscsi}
iscsi_exports=${iscsi_exports:-/etc/exports.iscsi}
iscsi_debug=${iscsi_debug:-0}
start_cmd=iscsi_start
faststop_cmd=iscsi_stop
stop_cmd=iscsi_stop
start_precmd=iscontrol_precmd

iscontrol_prog=${iscontrol_prog:-iscontrol}
iscontrol_log=${iscontrol_log:-/var/log/$iscontrol_prog}
Re: ata on alix/geode stopped being detected.
hi,

latest changes in dev/ata broke this. On older -stable:

...
ata0-master: pio=PIO4 wdma=UNSUPPORTED udma=UNSUPPORTED cable=40 wire
ad0: success setting PIO4 on National chip
ad0: 977MB <SanDisk SDCFB-1024 Rev 0.00> at ata0-master PIO4

on latest -stable:

ata0-master: pio=PIO4 wdma=WDMA2 udma=UNSUPPORTED cable=40 wire

and no disk.

cheers,
danny

problem solved: somehow 'device atadisk' was lost from the kernel configuration file

ata0-master: pio=PIO4 wdma=WDMA2 udma=UNSUPPORTED cable=40 wire
ad0: setting PIO4 on CS5536 chip
ad0: setting WDMA2 on CS5536 chip
ad0: 1953MB <SanDisk SDCFB-2048 HDX 3.21> at ata0-master WDMA2
ad0: 4001760 sectors [3970C/16H/63S] 4 sectors/interrupt 1 depth queue

danny
ata on alix/geode stopped being detected.
hi,

latest changes in dev/ata broke this. On older -stable:

...
ata0-master: pio=PIO4 wdma=UNSUPPORTED udma=UNSUPPORTED cable=40 wire
ad0: success setting PIO4 on National chip
ad0: 977MB <SanDisk SDCFB-1024 Rev 0.00> at ata0-master PIO4

on latest -stable:

ata0-master: pio=PIO4 wdma=WDMA2 udma=UNSUPPORTED cable=40 wire

and no disk.

cheers,
danny
Re: auto_nlist failed on cp_time at location 1
In the last episode (Apr 23), Tim Stoddard said:

I just upgraded from FreeBSD 6.2 -> 6.3 (using source tree). I then recompiled my net-snmp port binaries (using portupgrade). I am now getting an error message in my logs every five secs. I am sure my libkvm is in sync with my kernel. I do not know what else to look at.

You got bit by

  revision 1.178.2.5
  date: 2008/04/09 19:47:20; author: peter; state: Exp; lines: +68 -5
  MFC: record per-cpu stats for %user/%nice/%system/%idle

, which removed the kernel variable that net-snmp uses to track CPU usage. Try this patch (put it in /usr/ports/net-mgmt/net-snmp/files and rebuild net-snmp). I've sent it to the net-snmp port maintainer so hopefully it will be committed soon.

--
Dan Nelson
[EMAIL PROTECTED]

the same goes for rpc.rstatd :-), see
http://www.freebsd.org/cgi/query-pr.cgi?pr=123014

Attachment: patch-cpu_nlist.c

--- agent/mibgroup/hardware/cpu/cpu_nlist.c 2007-01-19 10:53:44.0 -0600
+++ agent/mibgroup/hardware/cpu/cpu_nlist.c 2008-04-22 00:13:48.330686919 -0500
@@ -1,5 +1,5 @@
 /*
- *   nlist() interface
+ *   sysctl() interface
  *     e.g. FreeBSD
  */
 #include <net-snmp/net-snmp-config.h>
@@ -12,24 +12,9 @@
 
 #include <sys/types.h>
 #include <sys/resource.h>
-#ifdef HAVE_SYS_DKSTAT_H
-#include <sys/dkstat.h>
-#endif
 #ifdef HAVE_SYS_SYSCTL_H
 #include <sys/sysctl.h>
 #endif
-#ifdef HAVE_SYS_VMMETER_H
-#include <sys/vmmeter.h>
-#endif
-#ifdef HAVE_VM_VM_PARAM_H
-#include <vm/vm_param.h>
-#endif
-#ifdef HAVE_VM_VM_EXTERN_H
-#include <vm/vm_extern.h>
-#endif
-
-#define CPU_SYMBOL "cp_time"
-#define MEM_SYMBOL "cnt"
 
 void _cpu_copy_stats( netsnmp_cpu_info *cpu );
 
@@ -67,11 +52,12 @@
      */
 int netsnmp_cpu_arch_load( netsnmp_cache *cache, void *magic ) {
     long cpu_stats[CPUSTATES];
-    struct vmmeter mem_stats;
+    int size, tempval;
+
     netsnmp_cpu_info *cpu = netsnmp_cpu_get_byIdx( -1, 0 );
 
-    auto_nlist( CPU_SYMBOL, (char *) cpu_stats, sizeof(cpu_stats));
-    auto_nlist( MEM_SYMBOL, (char *)&mem_stats, sizeof(mem_stats));
+    size = sizeof(cpu_stats);
+    sysctlbyname("kern.cp_time", cpu_stats, &size, NULL, 0);
     cpu->user_ticks = (unsigned long)cpu_stats[CP_USER];
     cpu->nice_ticks = (unsigned long)cpu_stats[CP_NICE];
@@ -85,15 +71,19 @@
      * Interrupt/Context Switch statistics
      *   XXX - Do these really belong here ?
      */
-#if defined(openbsd2) || defined(darwin)
-    cpu->swapIn = (unsigned long)mem_stats.v_swpin;
-    cpu->swapOut = (unsigned long)mem_stats.v_swpout;
-#else
-    cpu->swapIn = (unsigned long)mem_stats.v_swappgsin+mem_stats.v_vnodepgsin;
-    cpu->swapOut = (unsigned long)mem_stats.v_swappgsout+mem_stats.v_vnodepgsout;
-#endif
-    cpu->nInterrupts = (unsigned long)mem_stats.v_intr;
-    cpu->nCtxSwitches = (unsigned long)mem_stats.v_swtch;
+    size = sizeof(int);
+#define GET_VM_STATS(cat, name, netsnmpname) \
+    do { \
+        sysctlbyname("vm.stats." #cat "." #name, &tempval, &size, NULL, 0); \
+        cpu->netsnmpname = (unsigned long) tempval; \
+    } while(0)
+
+    GET_VM_STATS(vm, v_swappgsin, swapIn);
+    GET_VM_STATS(vm, v_swappgsout, swapOut);
+    GET_VM_STATS(vm, v_vnodepgsin, pageIn);
+    GET_VM_STATS(vm, v_vnodepgsout, pageOut);
+    GET_VM_STATS(sys, v_intr, nInterrupts);
+    GET_VM_STATS(sys, v_swtch, nCtxSwitches);
 
 #ifdef PER_CPU_INFO
     for ( i = 0; i < n; i++ ) {
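For anyone wanting to check the counters the patch reads before rebuilding net-snmp, the GET_VM_STATS macro appears to paste its first two arguments into a sysctl name of the form vm.stats.<cat>.<name>. A small sketch of that name construction follows; the helper name is made up for this note and is not part of the patch.

```sh
#!/bin/sh
# Build the sysctl names that GET_VM_STATS(cat, name, ...) would query.

vm_stat_name() {
    # $1: category (vm or sys), $2: counter name
    echo "vm.stats.$1.$2"
}

# The six category:counter pairs used in the patch.
for pair in vm:v_swappgsin vm:v_swappgsout vm:v_vnodepgsin vm:v_vnodepgsout \
            sys:v_intr sys:v_swtch; do
    vm_stat_name "${pair%%:*}" "${pair#*:}"
done
```

On a FreeBSD host each printed name can be passed to `sysctl -n`, and `sysctl -n kern.cp_time` shows the CPU tick counters the patch reads in place of the removed cp_time symbol.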
Marvell Technology Group Ltd. Yukon EC Ultra
Hi,

Under load, the msk has problems, with hw.msk.legacy_intr=1 and 0.

with = 1, I get
    TCP segmentation error
    watchdog timeout
with = 0,
    Tx MAC parity error
    watchdog timeout

the board is an Asus P5K-VM

Cheers,
danny
Re: Marvell Technology Group Ltd. Yukon EC Ultra
On Thu, Mar 27, 2008 at 12:57:31PM +0200, Danny Braniss wrote:

Hi,

Under load, the msk has problems, with hw.msk.legacy_intr=1 and 0.

with = 1, I get
    TCP segmentation error
    watchdog timeout
with = 0,
    Tx MAC parity error
    watchdog timeout

Would you show me verbose boot messages related to msk(4)/e1000phy(4)?

mskc0: <Marvell Yukon 88E8056 Gigabit Ethernet> port 0xc800-0xc8ff mem 0xfeafc000-0xfeaf irq 17 at device 0.0 on pci1
mskc0: Reserved 0x4000 bytes for rid 0x10 type 3 at 0xfeafc000
mskc0: MSI count : 1
mskc0: attempting to allocate 1 MSI vectors (1 supported)
msi: routing MSI IRQ 256 to vector 52
mskc0: using IRQ 256 for MSI
mskc0: RAM buffer size : 128KB
mskc0: Port 0 : Rx Queue 85KB(0x:0x000153ff)
mskc0: Port 0 : Tx Queue 43KB(0x00015400:0x0001)
msk0: <Marvell Technology Group Ltd. Yukon EC Ultra Id 0xb4 Rev 0x03> on mskc0
msk0: bpf attached
msk0: Ethernet address: 00:1e:8c:6d:5c:fe
miibus0: <MII bus> on msk0
e1000phy0: <Marvell 88E1149 Gigabit PHY> PHY 0 on miibus0
e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
mskc0: [MPSAFE]
mskc0: [FILTER]

is this enough?

danny

the board is an Asus P5K-VM

Cheers,
danny

--
Regards,
Pyun YongHyeon
Re: Marvell Technology Group Ltd. Yukon EC Ultra
On Thu, Mar 27, 2008 at 01:32:46PM +0200, Danny Braniss wrote:
On Thu, Mar 27, 2008 at 12:57:31PM +0200, Danny Braniss wrote:

Hi,

Under load, the msk has problems, with hw.msk.legacy_intr=1 and 0.

with = 1, I get
    TCP segmentation error
    watchdog timeout
with = 0,
    Tx MAC parity error
    watchdog timeout

Would you show me verbose boot messages related to msk(4)/e1000phy(4)?

mskc0: <Marvell Yukon 88E8056 Gigabit Ethernet> port 0xc800-0xc8ff mem 0xfeafc000-0xfeaf irq 17 at device 0.0 on pci1
mskc0: Reserved 0x4000 bytes for rid 0x10 type 3 at 0xfeafc000
mskc0: MSI count : 1
mskc0: attempting to allocate 1 MSI vectors (1 supported)
msi: routing MSI IRQ 256 to vector 52
mskc0: using IRQ 256 for MSI
mskc0: RAM buffer size : 128KB
mskc0: Port 0 : Rx Queue 85KB(0x:0x000153ff)
mskc0: Port 0 : Tx Queue 43KB(0x00015400:0x0001)
msk0: <Marvell Technology Group Ltd. Yukon EC Ultra Id 0xb4 Rev 0x03> on mskc0
msk0: bpf attached
msk0: Ethernet address: 00:1e:8c:6d:5c:fe
miibus0: <MII bus> on msk0
e1000phy0: <Marvell 88E1149 Gigabit PHY> PHY 0 on miibus0
e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
mskc0: [MPSAFE]
mskc0: [FILTER]

is this enough?

Yes, it seems that the 88E8056/88E1149 PHY has several issues. I recall that there have been several reports of this issue. Since nfe(4) with the 88E1149 also has some stability issues, e1000phy(4) seems to lack required code for the 88E1149 PHY. So far I couldn't find a clue, sorry. I'll let you know if I have code for you to give a spin.

great and thanks,
danny
Re: BTX on USB pen drive
On Sunday 09 March 2008 09:07:03 am Torfinn Ingolfsen wrote:
On Sat, 08 Mar 2008 18:44:50 -0800 Jeremy Chadwick [EMAIL PROTECTED] wrote:

Your boot0cfg line to reinstall the boot0 MBR looks fine, but I don't use boot0 myself (I prefer to go right into boot2/loader).

I used 'fdisk -B da0' to install /boot/mbr to the disk for testing. When I now boot the disk on the Acer laptop, it just displays one register dump followed by BTX halted.

You haven't updated boot2 (via bsdlabel -B) which sits in between boot0/mbr and /boot/loader.

think you can apply the same magic to pxeldr?

danny
Promise driver/support
is there support for this Promise card?

[EMAIL PROTECTED]:4:14:0: class=0x010400 card=0x0374105a chip=0x8350105a rev=0x00 hdr=0x00
    vendor   = 'Promise Technology Inc'
    class    = mass storage
    subclass = RAID

thanks,
danny
Re: Documentation: Installing FreeBSD 7.0 via serial console and PXE
Hi Jeremy,

I'm very glad that
a) you can write!
b) you are actually doing something with respect to the zillions of misguided how-to's :-)

Having some experience with the subject, and please, don't read me wrong, I see some different approaches:

- indeed this IS the 21st century, and it's unbelievable that we still have to deal with baudrates! (why can't they be more like modems? autosense :-)
- newer servers don't have serial anymore :-(, they have IPMI/ILO/etc; some only have com2

what I'm trying to say is that hard coding where the console is, is a 'problem'. It's my belief that the setting of the console can be done via DHCP - at the moment I can select the com1/2 - sio.0/sio.1 - via dhcp.

the other item I would like to raise is the way we do it here.

1st: I boot the new host diskless. This allows us to find out quickly if all hardware is working, using a tested root/kernel - since DHCP/TFTP/NFS are working, it takes only a few minutes to bring up a new host:
    set it to boot pxe
    add the mac address to the dhcp.conf
    and reboot

2nd: if and when we decide to make the host 'stand-alone', we do
    sysinstall to partition the disk (or via bsdlabel if you are good at maths)
    cd /mnt-root
    rsh -n server dump 0f - /the/root/partition | restore rf -
    change the bios setting to boot off disk (or if you have the console, reboot and hit ESC when doing dhcp ...)

ok, so I fibbed a bit :-), there are some small 'configurables'(*) missing, but I hope you get the idea.

Cheers,
danny

PS: *: like setting up a diskless setup, which is rather simple, and I can gladly try to explain it so that you can then write it out in readable English :-)
Re: ad0: TIMEOUT - WRITE_DMA type errors with 7.0-RC1
Henri Hennebert wrote:
Jeremy Chadwick wrote:
On Fri, Jan 25, 2008 at 06:17:24PM -0700, Joe Peterson wrote:

Glad you got it back! Yes, when I was first playing with ZFS, I noticed that booting between single and multi user mode could make the pools invisible. Import seemed to bring them back...

I did go into single-user mode and attempt to do ZFS-related commands, which might explain the "no datasets available" once I was back in multiuser! I would classify that as a bug, and one which is going to cause all sorts of hair-pulling for administrators in the future. I wonder what it's caused by.

In single user / is read only and so /boot/zfs/zpool.cache can't be created/updated

But it's still readable. The issue is that hostid isn't set (by /etc/rc.d/hostid).

if the root is read only, as in the case of a diskless/dataless boot, it's the fact that /boot/zfs/zpool.cache cannot be used which causes the problem, so adding zpool import -a solves the issue.

danny
Re: strings /boot/kernel/kernel | grep ___
On 7-stable
    strings /boot/kernel/kernel | grep ___
fails to show kernel config, whereas on 6.2-REL before it worked. Also in 7 there's no START CONFIG FILE / END CONFIG FILE. Is this deliberate or a mistake?
strings are still there though, look for ^options CONFIG_AUTOGENERATED (assuming your kernel config file included options INCLUDE_CONFIG_FILE)
    config -x /boot/kernel/kernel
Re: 7.0-BETA4 and msk problems
On Mon, Dec 10, 2007 at 11:03:47AM +0200, Danny Braniss wrote: On Sun, Dec 09, 2007 at 02:41:28PM +0200, Danny Braniss wrote:
with this onboard NIC (LOB?)
    mskc0: Marvell Yukon 88E8056 Gigabit Ethernet
    e1000phy0: Marvell 88E1149 Gigabit PHY PHY 0 on miibus0
    [EMAIL PROTECTED]:1:0:0: class=0x02 card=0x81f81043 chip=0x436411ab rev=0x12 hdr=0x00
        vendor = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
        device = '88E8056 Yukon PCI-E Gigabit Ethernet Controller'
        class = network
        subclass = ethernet
I'm getting a lot of:
    msk0: watchdog timeout
and
    mskc0: Tx descriptor error
and
    msk0: link state changed to DOWN
and
    msk0: link state changed to UP
any help is most welcome, danny
It seems that the issue happens only on 88E8056/88E1149 PHY. See PR 116853 and 114631. Sorry, I have no clue yet.
to add some more noise, this is the first host that panicked too :-) anything I can do to help?
Probably ship the hardware to me? :)
love to, but the hardware is not mine :-)
here is some more info, this is a different board, but with the same Marvell 88E8056, and it panics after printing 'no PHY found!' and the ethernet is -1 (ff.ff...)
danny
zfs diskless boot problem
Zfs uses /boot/zfs to keep track of its pools, but in a diskless environment, this is a read-only fs. This causes several inconveniences,
- /etc/rc.d/zfs needs:
    zfs_start_main()
    {
        dlv=`/sbin/sysctl -n vfs.nfs.diskless_valid 2> /dev/null`
        if [ ${dlv:=0} -ne 0 ]; then
            zpool import -a
        fi
        ...
- how important is /boot/zfs?
danny
unionfs lock problems
with 7.0-Beta4, I'm getting quite a few of these:
lockmgr: thread 0xff00039269f0 unlocking unheld lock
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_lockmgr() at _lockmgr+0x6ae
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x46
unionfs_unlock() at unionfs_unlock+0x22f
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x46
vn_read() at vn_read+0x264
dofileread() at dofileread+0xa1
kern_readv() at kern_readv+0x4c
read() at read+0x54
syscall() at syscall+0x254
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (3, FreeBSD ELF64, read), rip = 0x80187b18c, rsp = 0x7fffafb8, rbp = 0x7fffb045 ---
danny
ufs_rename: fvp == tvp (can't happen)
Hi, I'm also getting quite a few of these "can't happen" messages. The system is running 7.0-beta4, and is doing 'portsupgrade -af'
danny
Re: 7.0-BETA4 and msk problems
On Sun, Dec 09, 2007 at 02:41:28PM +0200, Danny Braniss wrote:
with this onboard NIC (LOB?)
    mskc0: Marvell Yukon 88E8056 Gigabit Ethernet
    e1000phy0: Marvell 88E1149 Gigabit PHY PHY 0 on miibus0
    [EMAIL PROTECTED]:1:0:0: class=0x02 card=0x81f81043 chip=0x436411ab rev=0x12 hdr=0x00
        vendor = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
        device = '88E8056 Yukon PCI-E Gigabit Ethernet Controller'
        class = network
        subclass = ethernet
I'm getting a lot of:
    msk0: watchdog timeout
and
    mskc0: Tx descriptor error
and
    msk0: link state changed to DOWN
and
    msk0: link state changed to UP
any help is most welcome, danny
It seems that the issue happens only on 88E8056/88E1149 PHY. See PR 116853 and 114631. Sorry, I have no clue yet.
to add some more noise, this is the first host that panicked too :-) anything I can do to help?
danny
7.0-BETA4 and msk problems
with this onboard NIC (LOB?)
    mskc0: Marvell Yukon 88E8056 Gigabit Ethernet
    e1000phy0: Marvell 88E1149 Gigabit PHY PHY 0 on miibus0
    [EMAIL PROTECTED]:1:0:0: class=0x02 card=0x81f81043 chip=0x436411ab rev=0x12 hdr=0x00
        vendor = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
        device = '88E8056 Yukon PCI-E Gigabit Ethernet Controller'
        class = network
        subclass = ethernet
I'm getting a lot of:
    msk0: watchdog timeout
and
    mskc0: Tx descriptor error
and
    msk0: link state changed to DOWN
and
    msk0: link state changed to UP
any help is most welcome,
danny
Re: FreeBSD-6.2, 7.0-BETA1 on X60
try monitoring the traffic (tcpdump/wireshark), this should give you a good starting point.
danny
bge/ilo blues on Sun X2200
newer kernels (after Oct. 20) are breaking bge. i've checked the cvsdiffs, but nothing seems relevant. the ilo (Integrated Lights Out) port, bge1, though not configured, gets configured to 10baseT/UTP full-duplex, and no magic will get it to 100/full-duplex, which is what the ilo is using. this works fine on an Oct 20 system. any ideas?
cheers, danny
Re: /usr/share/man/man8/MAKEDEV.8
On Sun, Oct 28, 2007 at 12:37:45PM -0400, Andrew Lankford wrote:
Thanks for replying, but once again I've miscommunicated the issue. I meant that MAKEDEV.8 is in /usr/src/share/man/man8/, and the modification time is Oct 13th, recent, which suggests to me that it's still in the 7.0-BETA1 cvs tree.
And you're very much correct: http://www.freebsd.org/cgi/cvsweb.cgi/src/share/man/man8/MAKEDEV.8
how about reading what it says? it might be very educational :-)
NAME
     MAKEDEV -- old script for creating device nodes
DESCRIPTION
     The MAKEDEV script was deprecated by devfs(5) and removed from FreeBSD after devfs(5) became mandatory.
any hope for nfe/msk?
Hi, these drivers don't work under 7.0. As soon as some mild pressure is applied, they start losing interrupts, and in my case the hosts come to a total stand-still, since they are diskless and rely on the network. This happens at 1Gb and at 100Mb.
Maybe the problem is with the shared interrupts?
    irq16: mskc0 uhci0 3308351 13
or
    irq21: nfe0 ohci0 1584415 24
but I have no idea how to uncouple this
danny
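To check a host for the kind of interrupt sharing suspected in this message, the `vmstat -i` listing can be filtered for irq lines that name more than one device. This is only a rough sketch: the function name `shared_irqs` is made up here, and the line layout (`irqN: dev1 [dev2 ...] total rate`) is assumed from the two lines quoted in the message; other FreeBSD versions may print the table differently.

```shell
#!/bin/sh
# shared_irqs (hypothetical helper): read `vmstat -i`-style lines on stdin
# and print the irq lines that list more than one device.
# Assumed layout per line:  irqN: dev1 [dev2 ...] total rate
shared_irqs() {
    awk '$1 ~ /^irq[0-9]+:$/ && NF > 4 {
        # fields 2 .. NF-2 are device names; the last two are total and rate
        printf "%s", $1
        for (i = 2; i <= NF - 2; i++) printf " %s", $i
        printf "\n"
    }'
}

# Usage on a live system (not run here):
#   vmstat -i | shared_irqs
```

It only reports which devices sit on the same irq line; it says nothing about whether sharing is the actual cause of the lost interrupts.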
Re: PXE booting issues
I'm currently working on an embedded project which will be built around a BSD (I'm not sure which yet), currently I have an image up and running DragonFly and I'm currently attempting to do the same with FreeBSD for comparison. I'm more or less following the miniBSD tutorials (updating the various file lists and such as I go). The system is being built around 6.2-STABLE but I'm having some issues getting it to PXE boot.
On my FreeBSD host I've enabled TFTP, exported the rootfs for the system via NFS, built and installed isc-dhcpd and configured it with the extra options for PXE booting. The client currently gets its IP address from the server successfully, retrieves and loads pxeboot but when it comes to launch the kernel it eventually throws an NFS timeout. I know the export is working because I used NFS to pull the rootfs over to my DragonFly boot host to see what the result was when I booted the same image from a known working boot host (it worked correctly until it hit a problem in the image I will detail in a separate message).
I did attempt to rebuild the pxeboot loader, following the standard instructions;
    set LOADER_TFTP_SUPPORT=YES in /etc/make.conf
    cd /usr/src/sys/boot
    make clean
    make depend
    make
It appears to be successful (and the output would support this) but the i386/pxeldr/pxeboot and i386/loader/loader files do not exist, my only guess is that I've not set a make variable I should have, the most confusing part is that the dd command which is the final step in generating pxeboot appears in the output and appears to be successful;
==
dd if=pxeboot.tmp of=pxeboot obs=2k conv=osync
425+0 records in
107+0 records out
==
The discrepancy in the records in and the records out is concerning but I would expect the file to exist regardless, I'm currently using the default /boot/pxeboot. Any suggestions as to what might be causing this would be greatly appreciated.
1- try sniffing (wireshark, not the kind that will get you high :-) and see where it hangs.
2- the dhcp should tell pxeloader where the root is: option root-path ip:/path
danny
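For point 2, a per-host entry for ISC dhcpd might look roughly like what the helper below prints. This is a sketch under assumptions, not the poster's configuration: the function name `pxe_host_stanza`, the host name, MAC, addresses and export path are all invented for illustration, and `filename "pxeboot"` assumes pxeboot sits at the top of the TFTP root.

```shell
#!/bin/sh
# pxe_host_stanza NAME MAC IP SERVER ROOTPATH   (hypothetical helper)
# Print an ISC dhcpd.conf host entry for a diskless client. SERVER is
# used both as the TFTP server (next-server) and as the NFS server in
# root-path, which is an assumption made for this sketch.
pxe_host_stanza() {
    name=$1 mac=$2 ip=$3 server=$4 rootpath=$5
    cat <<EOF
host ${name} {
    hardware ethernet ${mac};
    fixed-address ${ip};
    next-server ${server};
    filename "pxeboot";
    option root-path "${server}:${rootpath}";
}
EOF
}

# Example with made-up values (documentation address range):
#   pxe_host_stanza testbox 00:11:22:33:44:55 192.0.2.10 192.0.2.1 /diskless/root
```

The printed stanza would still need to be placed inside the matching subnet declaration by hand.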
Re: iSCSI initiator tester wanted
Danny Braniss wrote:
Hi all, I'm in the last mile before crossing the beta-release line, so I'd like to get some input, and update the list of targets it supports. you can obtain the driver from:
Have you tested it with net/iscsi-target from ports? I get the following messages and errors when I try it:
Jun 7 22:53:08 finstall kernel: ic_action: called
Jun 7 22:53:08 finstall kernel: ic_action: func_code=0x901 flags=0xc0 status=0x0 target=0 lun=0 retry_count=1 timeout=30
Jun 7 22:53:08 finstall kernel: ic_action: XPT_SCSI_IO cmd=0x35
Jun 7 22:53:08 finstall kernel: scsi_encap: called
Jun 7 22:53:08 finstall kernel: scsi_encap: ccb-sp=0xc2e97c00
Jun 7 22:53:08 finstall kernel: dwl: called
Jun 7 22:53:08 finstall kernel: isc_qout: called
Jun 7 22:53:08 finstall kernel: 0] isc_qout: enqued: pq=0xc37ff0bc
Jun 7 22:53:08 finstall kernel: proc_out: called
Jun 7 22:53:08 finstall kernel: 0] proc_out: opcode=0x1 sn(cmd=0x27 expCmd=0x26 maxCmd=0x26 expStat=0x0 itt=0x27)
Jun 7 22:53:08 finstall kernel: isc_sendPDU: called
Jun 7 22:53:08 finstall kernel: 0] ism_proc: odone=1
Jun 7 22:53:08 finstall kernel: proc_out: called
Jun 7 22:53:08 finstall kernel: 0] ism_proc: odone=0
Jun 7 22:53:08 finstall kernel: so_input: called
Jun 7 22:53:08 finstall kernel: so_getbhs: called
Jun 7 22:53:08 finstall kernel: proc_out: called
Jun 7 22:53:08 finstall kernel: 0] ism_proc: odone=0
Jun 7 22:53:08 finstall kernel: so_recv: called
Jun 7 22:53:08 finstall kernel: 0] so_recv: len=48] opcode=0x21 ahs_len=0x0 ds_len=0x0
Jun 7 22:53:08 finstall kernel: ism_recv: called
Jun 7 22:53:08 finstall kernel: 0] ism_recv: opcode=0x21 itt=0x27 stat#0x1 maxcmd=0x27
Jun 7 22:53:08 finstall kernel: _scsi_rsp: called
Jun 7 22:53:08 finstall kernel: _scsi_rsp: itt=27 pq=0xc37ff5e0 opq=0xc37ff0bc
Jun 7 22:53:08 finstall kernel: iscsi_done: called
Jun 7 22:53:08 finstall kernel: _scsi_done: called
Jun 7 22:53:08 finstall kernel: _scsi_done: ccb_h-status=1
Jun 7 22:53:08 finstall kernel: so_input: called
Jun 7 22:53:08 finstall kernel: so_getbhs: called
Jun 7 22:53:08 finstall kernel: proc_out: called
Jun 7 22:53:08 finstall kernel: 0] ism_proc: odone=0
Jun 7 22:53:08 finstall iscontrol[2084]: cam_open_btl: no passthrough device found at 1:0:1
Jun 7 22:53:08 finstall iscontrol[2084]: cam_open_btl: no passthrough device found at 1:0:2
Jun 7 22:53:08 finstall iscontrol[2084]: cam_open_btl: no passthrough device found at 1:0:3
Jun 7 22:53:38 finstall kernel: _nop_out: called
Jun 7 22:53:38 finstall kernel: 0] _nop_out: cws=1
Jun 7 22:53:38 finstall kernel: proc_out: called
Jun 7 22:53:38 finstall kernel: 0] ism_proc: odone=0
The message on the machine running scsi-target is: Unsupported INQUIRY VPD page 80
yes, I have tested it against ports/net/iscsi-target, I use it to try out error recovery :-), and as far as I could tell it's harmless. Another thing I can report is that running both (target/initiator) does not work, and it seems that the target gets stuck.
danny
Re: iSCSI initiator tester wanted
... The message on the machine running scsi-target is: Unsupported INQUIRY VPD page 80
yes, I have tested it against ports/net/iscsi-target, I use it to try out error recovery :-), and as far as I could tell it's harmless. Another thing I can report is that running both (target/initiator) on the same host does not work, and it seems that the target gets stuck.
cheers, danny
Re: iSCSI initiator tester wanted
A couple comments just from reading through this, see below.

    #!/bin/sh

    # PROVIDE: iscsi
    # REQUIRE: NETWORKING
    # BEFORE: DAEMON
    # KEYWORD: nojail shutdown

    #
    # Add the following lines to /etc/rc.conf to enable iscsi:
    #
    # iscsi_enable="YES"
    # iscsi_fstab="/etc/fstab.iscsi"

The iscsi_exports knob should also be documented here.

agreed

    . /etc/rc.subr

    name="iscsi"
    rcvar=`set_rcvar`

    command="/usr/local/sbin/iscontrol"

Assuming this gets committed this will want to be /sbin/iscontrol.

absolutely

    iscsi_enable=${iscsi_enable:-"NO"}
    iscsi_fstab=${iscsi_fstab:-"/etc/fstab.iscsi"}
    iscsi_exports=${iscsi_exports:-"/etc/exports.iscsi"}

    start_cmd="iscsi_start"
    faststop_cmd="iscsi_stop"
    stop_cmd="iscsi_stop"

    # note: returns 0 on timeout and 1 once the device exists
    iscsi_wait()
    {
        dev=$1
        trap "echo 'wait loop cancelled'; exit 1" 2
        count=0
        while true; do
            if [ -c $dev ]; then
                break;
            fi
            if [ $count -eq 0 ]; then
                echo -n "Waiting for ${dev}: "
            fi
            count=$((${count} + 1))
            if [ $count -eq 6 ]; then
                echo ' Failed'
                return 0
            fi
            echo -n '.'
            sleep 5;
        done
        echo '.'
        return 1
    }

    iscsi_start()
    {
        #
        # load needed modules
        for m in iscsi_initiator geom_label; do
            kldstat -qm $m || kldload $m
        done

Good thinking making geom_label a pseudo-requirement. Examples and documentation for fstab.iscsi should strongly recommend its use, since device names will vary.

        sysctl debug.iscsi=2

Maybe make this another rc variable that could be set in /etc/rc.conf. You'll probably also want to change the module's default verbosity level once it becomes more official.

it will be zero by default, and no reason to clobber rc.conf.

        #
        # start iscontrol for each target
        if [ -n "${iscsi_targets}" ]; then
            for target in ${iscsi_targets}; do
                ${command} ${rc_flags} -n ${target}
            done
        fi

        if [ -f "${iscsi_fstab}" ]; then
            while read spec file type opt t1 t2
            do
                case ${spec} in
                \#*|'')
                    ;;
                *)
                    if iscsi_wait ${spec}; then
                        break;
                    fi
                    echo type=$type spec=$spec file=$file
                    fsck -p ${spec}
                    mount ${spec} ${file}
                    ;;
                esac
            done < ${iscsi_fstab}
        fi

        if [ -f "${iscsi_exports}" ]; then
            cat ${iscsi_exports} >> /etc/exports
            #/etc/rc.d/mountd reload does not work, why?
            kill -1 `cat /var/run/mountd.pid`
        fi
    }

Look at how Pawel handled this with ZFS (mostly in the zfs and mountd rc.d scripts), and use the fact that mountd can take multiple exports files on its command line to your advantage. i.e. appending to the normal exports file is not really what you want to do.

I like the idea of keeping things from spreading around, and maybe /etc/rc.d/mountd can be taught to use all exportfs.something files; might be an idea, especially since sometimes one has to '/etc/rc.d/mountd reload' - i miss 'exportfs -a' :-)

    iscsi_stop()
    {
        echo 'iscsi stopping'
        while read spec file type opt t1 t2
        do
            case ${spec} in
            \#*|'')
                ;;
            *)
                echo iscsi: umount $spec
                umount -fv $spec
                # and remove from the exports ...

See above; this could be a no-op.

                ;;
            esac
        done < ${iscsi_fstab}
    }

    load_rc_config $name
    run_rc_command "$1"

--
problems with the above script:
- no background fsck

It would be nice not to re-invent the wheel here, and there are other reasons it would be nice to just use /etc/fstab instead of adding a new file -- a number of utilities use /etc/fstab to map between mountpoints and device names even if the device isn't mounted. Did you try this approach, and if so what obstacles did you encounter? I will play around with this if I have time. The late fstab/mount option will probably be useful here.
it all boils down to my not-liking-to-spread-out syndrome, rc.conf should have all that is needed to configure a host, but alas, that is too minimalistic an approach, since there are also config files.
well, some of the solutions take into consideration my local environment, most of the servers and workstations are 'dataless', they share many files, and via DHCP/rc.conf and some other magic, it all works. Except for 'small' changes in configuration files, ie: most of the hosts have serial console enabled, but a few problematic ones don't. [easy solution: a script that changes off/on according to some rc.conf tunable]. most have a common fstab (cdrom, proc, linproc), but different disks (da, ad, etc). so it would be nice to be able to keep the common stuff (DEFAULT) and then merge the diffs. and i don't want to go the XML road, nor any other heavy handed solution.
ok, enough ramblings for a busy morning.
cheers, danny
PS:
iSCSI initiator tester wanted
Hi all, I'm in the last mile before crossing the beta-release line, so I'd like to get some input, and update the list of targets it supports. you can obtain the driver from: ftp://ftp.cs.huji.ac.il/users/danny/freebsd/iscsi-2.0.92.tar.gz
Cheers, danny
Re: iSCSI initiator tester wanted
Quoting Danny Braniss [EMAIL PROTECTED]:
I'm in the last mile before crossing the beta-release line, so I'd like to get some input, and update the list of targets it supports. you can obtain the driver from: ftp://ftp.cs.huji.ac.il/users/danny/freebsd/iscsi-2.0.92.tar.gz
Looks great! I've done some basic testing against our cluster of three LeftHand Networks NSM 160's running SAN/iQ 6.6SP1. My machine is running -CURRENT as of a couple days ago (with gcc 4.2 and symbol versioning). I've tested previous snapshots of the driver against the same SAN on this and another machine running -STABLE with good results.
so i'm updating my Targets file.
Is there anything specific you'd like tested? What connection interruption scenarios does the driver try to recover from? I'm running some backups to an iSCSI mount now. When that finishes (and my machine is otherwise unoccupied) I'll play around with temporarily yanking the ethernet cable and other fun tricks.
it 'should' recover from network disconnects, like pulling out the cable, or rebooting the target, but I think that this will only work if there is no major activity - I better test this one again. it should also flush buffers when you shut down the host, this was a major pain with the old versions.
Thanks for the Makefiles. Your blurb text incorrectly directs the reader to run make in sys/dev/iscsi_initiator (which doesn't exist, and there's no Makefile in sys/dev/iscsi). Obviously you meant sys/modules/iscsi_initiator. Also, a line about running make in iscontrol/ would be helpful, as would an install target in that Makefile.
ok, fixed the 'typos', I also forgot the sample rc.d/iscsi,
Do you have any suggestions on startup integration (rc script, fstab magic, etc)? I know you said once before that that was hopefully coming soon..
this is an attempt:

    #!/bin/sh

    # PROVIDE: iscsi
    # REQUIRE: NETWORKING
    # BEFORE: DAEMON
    # KEYWORD: nojail shutdown

    #
    # Add the following lines to /etc/rc.conf to enable iscsi:
    #
    # iscsi_enable="YES"
    # iscsi_fstab="/etc/fstab.iscsi"

    . /etc/rc.subr

    name="iscsi"
    rcvar=`set_rcvar`

    command="/usr/local/sbin/iscontrol"

    iscsi_enable=${iscsi_enable:-"NO"}
    iscsi_fstab=${iscsi_fstab:-"/etc/fstab.iscsi"}
    iscsi_exports=${iscsi_exports:-"/etc/exports.iscsi"}

    start_cmd="iscsi_start"
    faststop_cmd="iscsi_stop"
    stop_cmd="iscsi_stop"

    # note: returns 0 on timeout and 1 once the device exists
    iscsi_wait()
    {
        dev=$1
        trap "echo 'wait loop cancelled'; exit 1" 2
        count=0
        while true; do
            if [ -c $dev ]; then
                break;
            fi
            if [ $count -eq 0 ]; then
                echo -n "Waiting for ${dev}: "
            fi
            count=$((${count} + 1))
            if [ $count -eq 6 ]; then
                echo ' Failed'
                return 0
            fi
            echo -n '.'
            sleep 5;
        done
        echo '.'
        return 1
    }

    iscsi_start()
    {
        #
        # load needed modules
        for m in iscsi_initiator geom_label; do
            kldstat -qm $m || kldload $m
        done

        sysctl debug.iscsi=2

        #
        # start iscontrol for each target
        if [ -n "${iscsi_targets}" ]; then
            for target in ${iscsi_targets}; do
                ${command} ${rc_flags} -n ${target}
            done
        fi

        if [ -f "${iscsi_fstab}" ]; then
            while read spec file type opt t1 t2
            do
                case ${spec} in
                \#*|'')
                    ;;
                *)
                    # stop at the first device that never shows up
                    if iscsi_wait ${spec}; then
                        break;
                    fi
                    echo type=$type spec=$spec file=$file
                    fsck -p ${spec}
                    mount ${spec} ${file}
                    ;;
                esac
            done < ${iscsi_fstab}
        fi

        if [ -f "${iscsi_exports}" ]; then
            cat ${iscsi_exports} >> /etc/exports
            #/etc/rc.d/mountd reload does not work, why?
            kill -1 `cat /var/run/mountd.pid`
        fi
    }

    iscsi_stop()
    {
        echo 'iscsi stopping'
        while read spec file type opt t1 t2
        do
            case ${spec} in
            \#*|'')
                ;;
            *)
                echo iscsi: umount $spec
                umount -fv $spec
                # and remove from the exports ...
                ;;
            esac
        done < ${iscsi_fstab}
    }

    load_rc_config $name
    run_rc_command "$1"

--
problems with the above script:
- no background fsck
- restart will mess the exports file
- the wait loop should be replaced by something more deterministic.

Thanks again. I'll post again if I manage to break something.
Ok, but can't say I look forward to hear from you :-)
danny
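As a small step on the wait-loop item in the problem list, the loop could at least take its retry count and interval as parameters and return the conventional 0 on success. The sketch below is not from the posted script: `wait_for_dev` is a made-up name, it tests with `-e` rather than the script's `-c`, and it still polls, so it does not make the wait any more deterministic.

```shell
#!/bin/sh
# wait_for_dev PATH [TRIES] [INTERVAL]   (hypothetical helper)
# Poll until PATH exists. Returns 0 once it is there, 1 after TRIES
# unsuccessful checks. Defaults mirror the posted loop: 6 tries, 5 s apart.
wait_for_dev() {
    dev=$1
    tries=${2:-6}
    interval=${3:-5}
    count=0
    while [ ! -e "$dev" ]; do
        count=$((count + 1))
        if [ "$count" -ge "$tries" ]; then
            return 1
        fi
        sleep "$interval"
    done
    return 0
}
```

With this return convention a caller would read `wait_for_dev /dev/da0 || break` (the device name is only an example), which is the opposite sense of the `if iscsi_wait ...; then break; fi` test in the script.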