Re: kern/135412: [zfs] [nfs] zfs(v13)+nfs and open(..., O_WRONLY|O_CREAT|O_EXCL, ...)

2009-07-01 Thread Danny Braniss
 On 2009-06-30, Mike Andrews wrote:
  Jaakko Heinonen wrote:
   On 2009-06-30, Danny Braniss wrote:
   This pr is really holding me back, I can't upgrade this server, and
   telling several tens of users to use cp, etc. is not an option. The open
   works fine if not using O_EXCL.
   
   I guess that r185586 needs to be MFCd to stable/7. Here's an untested
   patch against stable/7:
  
  The patch doesn't help over here, sorry.
  
  Simply doing 'touch' or 'mv' to an NFSv3 mount (using either a v6 or v13 
  zpool) is the test case I've been using; touch doesn't even use O_EXCL as 
  far as I can tell.
 
 I could reproduce the problem with O_EXCL and verified that the patch
 fixes it. However I couldn't reproduce the problem you are seeing with
 touch and mv.

same here, touch worked before too, so I think it's unrelated.
btw, it seems that the problem does not exist on i386, though
I'm pretty sure I tried there too. oh well.

thanks!
danny
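
For anyone who wants to check the O_EXCL case on their own mount, here is a minimal sketch in Python, whose os.open passes these flags straight to open(2). The helper name exclusive_create and the file name excl-test are made up for illustration, and the comments describe what POSIX prescribes, not output captured from the affected server.

```python
import errno
import os
import sys
import tempfile


def exclusive_create(path, mode=0o644):
    """Call open(path, O_WRONLY|O_CREAT|O_EXCL, mode).

    Returns None on success, or the symbolic errno name (e.g. "EEXIST")
    if the call fails.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return None


if __name__ == "__main__":
    # Pass a directory on the NFS-mounted ZFS filesystem to test that setup;
    # without an argument a local temporary directory is used instead.
    directory = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    target = os.path.join(directory, "excl-test")  # hypothetical file name

    first = exclusive_create(target)   # a healthy mount should give None
    second = exclusive_create(target)  # the file exists now, so EEXIST
    print("first create: ", first)
    print("second create:", second)

    if os.path.exists(target):
        os.unlink(target)
```

Running it once against a local directory and once against the NFS mount gives a side-by-side comparison; any errno other than EEXIST on the second call, or any failure on the first, is worth adding to the PR.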


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


kern/135412: [zfs] [nfs] zfs(v13)+nfs and open(..., O_WRONLY|O_CREAT|O_EXCL, ...)

2009-06-30 Thread Danny Braniss
hi,
This pr is really holding me back, I can't upgrade this server, and
telling several tens of users to use cp, etc. is not an option. The open
works fine if not using O_EXCL.

Thanks,
danny




Re: reecommendations for an 'appliance platform ?

2009-06-13 Thread Danny Braniss
  I'm not 100% sure, but fairly sure that you'll have a hard time
  finding something that combines the low-power standalone type spec with
  a 64-bit capable processor.  Once you get the higher-end processor,
 
 That was my experience when shopping around, yes - annoying as I
 don't need anything particularly low power (it ain't going to
 be my leccy bill :-).
 
  becomes a more likely failure point, and so on.  Can you elaborate a
  bit more on which parts of that system spec you really need - do you
  need the GigE?  Two ethernets? The external SATA?
 
 It needs to be:
 
 1) Complete as purchased - I don't want to build a machine
 2) Capable of having a simple boot device (e.g. CF card) dropped in
 3) At least one ether port. 100 meg will do.
 4) Small enough to be posted to the end user
 5) Cheap - under 400 euros, preferably 300
 
 I do not really care about processor speed, or memory, or power
 consumption. It needs to run FreeBSD, and I would prefer amd64
 as we haven't written or used any of our code on 32 bit in a long
 time, and I would feel uneasy that there might be latent bugs in
 it if we simply recompiled it for 32 bit.
 
  I bought one of these from them last year:
http://www.fit-pc.com/new/fit-pc-1-0-specifications.html
 
 Thanks for the links - that's pretty interesting! I notice the newer
 ones are also Atom based, so similarly spec'd to what I was
 looking at, but they may be more suitable.
 
 cheers,
 
 -pete.

I've had very good experience with:
http://www.pcengines.ch/

danny





(no subject)

2009-06-12 Thread Danny Braniss
latest -stable (June 11) is causing problems:
MB is intel SE7320VP21,

msk0: Ethernet address: 00:0e:0c:6a:85:a8
miibus0: MII bus on msk0
e1000phy0: Marvell 88E Gigabit PHY PHY 0 on miibus0
e1000phy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

msk0: watchdog timeout (missed Tx interrupts) -- recovering
msk0: watchdog timeout (missed Tx interrupts) -- recovering
msk0: watchdog timeout (missed Tx interrupts) -- recovering
...

cheers,
danny




Re: msk/stable

2009-06-12 Thread Danny Braniss
 On Fri, Jun 12, 2009 at 10:57:42AM +0300, Danny Braniss wrote:
  latest -stable (June 11) is causing problems:
  MB is intel SE7320VP21,
  
  msk0: Ethernet address: 00:0e:0c:6a:85:a8
  miibus0: MII bus on msk0
  e1000phy0: Marvell 88E Gigabit PHY PHY 0 on miibus0
  e1000phy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
  1000baseT-FDX, auto
  
  msk0: watchdog timeout (missed Tx interrupts) -- recovering
  msk0: watchdog timeout (missed Tx interrupts) -- recovering
  msk0: watchdog timeout (missed Tx interrupts) -- recovering
  ...
  
 
 I think there was not much msk(4) changes in stable. msk(4) in
 CURRENT has a lot changes to support newer controllers. Does msk(4)
 in CURRENT make any difference?
 Also please show me dmesg output(msk(4) related one) to know which
 controller you have.
hrumph, missed some lines:

mskc0: Marvell Yukon 88E8050 Gigabit Ethernet port 0xb800-0xb8ff mem 
0xdeefc000-0xdeef irq 16 at device 0.0 on pci2
msk0: Marvell Technology Group Ltd. Yukon EC Id 0xb6 Rev 0x02 on mskc0
msk0: Ethernet address: 00:0e:0c:6a:85:a8
miibus0: MII bus on msk0
miibus0: MII bus on msk0
e1000phy0: Marvell 88E Gigabit PHY PHY 0 on miibus0
e1000phy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
mskc0: [FILTER]

 
 Btw, are you using MSI?
yes, but it was (so it seemed) working ok.
I'll try again soon without MSI.

in the meantime, I'm running an older kernel while a very long
svn/svk process finishes; once that's done, I will be able to
compile current.

thanks,
danny




unionfs unlocking unheld lock

2009-06-10 Thread Danny Braniss
hi,
sporadically, I see this:

lockmgr: thread 0xff0004a8b390 unlocking unheld lock
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_lockmgr() at _lockmgr+0x6ae
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x46
unionfs_unlock() at unionfs_unlock+0x22f
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x46
vn_read() at vn_read+0x264
dofileread() at dofileread+0xa1
kern_readv() at kern_readv+0x4c
read() at read+0x54
syscall() at syscall+0x256
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (3, FreeBSD ELF64, read), rip = 0x8017545bc, rsp = 0x7fffaf98, rbp = 0x7fffb025 ---

sf-02> zgrep 'unlocking unheld lock' /var/log/messages*
/var/log/messages:May 25 03:03:37 sf-02 kernel: lockmgr: thread 0xff0004ed0720 unlocking unheld lock
/var/log/messages:May 31 03:03:10 sf-02 kernel: lockmgr: thread 0xff0004ed6ab0 unlocking unheld lock
/var/log/messages:Jun 10 03:03:19 sf-02 kernel: lockmgr: thread 0xff0004a8b390 unlocking unheld lock

it happens around 3 am, so I guess it must be some daily script that trips 
this.

danny
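
The "around 3 am" observation can be checked mechanically by counting the matching log lines per hour of day. The sketch below does that in Python; the function name hours_of_matches is invented here, and the sample lines are the three from the zgrep output in this message, each written on one line. On a stock FreeBSD install /etc/crontab starts `periodic daily` at 03:01, which would fit the timestamps, but that is only a guess about this host.

```python
import re
from collections import Counter

# Matches the syslog timestamp, optionally preceded by the "file:" prefix
# that zgrep prints when it searches more than one file. Captures the hour.
STAMP = re.compile(r"^(?:\S+:)?[A-Z][a-z]{2}\s+\d{1,2}\s+(\d{2}):\d{2}:\d{2}\s")


def hours_of_matches(lines, needle="unlocking unheld lock"):
    """Count log lines containing `needle`, grouped by hour of day (0-23)."""
    hours = Counter()
    for line in lines:
        if needle not in line:
            continue
        match = STAMP.match(line)
        if match:
            hours[int(match.group(1))] += 1
    return hours


if __name__ == "__main__":
    sample = [
        "/var/log/messages:May 25 03:03:37 sf-02 kernel: lockmgr: thread 0xff0004ed0720 unlocking unheld lock",
        "/var/log/messages:May 31 03:03:10 sf-02 kernel: lockmgr: thread 0xff0004ed6ab0 unlocking unheld lock",
        "/var/log/messages:Jun 10 03:03:19 sf-02 kernel: lockmgr: thread 0xff0004a8b390 unlocking unheld lock",
    ]
    print(dict(hours_of_matches(sample)))  # -> {3: 3}
```

If all hits land in hour 3, running the daily periodic scripts by hand one at a time would be a reasonable next step for finding the one that trips the lock.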




Re: Stable from May 31 - zfs list locked

2009-06-06 Thread Danny Braniss
 Hello,
 
 I encounter this problem for the second time. The system is working
 perfectly well but suddenly the command `zfs list' doesn't work and can't
 be killed.
 
 Here is a procstat of the culprit:
 
 [r...@morzine ~]# procstat -k 91766
   PID    TID COMM             TDNAME           KSTACK
 91766 100490 zfs              -                mi_switch sleepq_switch
 sleepq_wait _cv_wait zio_wait dbuf_read dmu_buf_hold zap_lockdir
 zap_lookup_norm zap_lookup dsl_prop_get_dd dsl_dataset_get_ref
 dsl_dataset_hold dmu_objset_open zfs_ioc_objset_stats zfsdev_ioctl
 devfs_ioctl_f kern_ioctl
 
 same thing happens if I try to run `zpool list' in another terminal.
 
 Henri

same here, but with a twist:
it used to happen on 7.1; then, after an upgrade to 7.2-PRERELEASE
sometime in April, things were ok till today!
I was about to blame a recent upgrade of the PERC firmware (a very long shot :-),
but now I don't know if upgrading to 7.2-stable will help. This is a
production host, with 12TB serving many nfs clients.

danny





msk(4) and Yukon

2009-05-24 Thread Danny Braniss
hi,
Since I saw some activity, I decided to try out my msks,
so the Yukon 88E8050 on my Intel SE7320VP21 now works with
hw.msk.legacy_intr=0
which didn't before (sorry, but the best I can say is 'long time ago' ;-)

on an Asus P5K-VM with Yukon 88E8056, it panics when used to PXE boot,
but otherwise works fine (it used to hang the boot before).

cheers,
danny




Re: ZFS MFC heads up

2009-05-21 Thread Danny Braniss
 I will be MFC'ing the newer ZFS support some time this afternoon. Both
 world and kernel will need to be re-built. Existing pools will
 continue to work without upgrade.
 
 
 If you choose to upgrade a pool to take advantage of new features you
 will no longer be able to use it with sources prior to today. 'zfs
 send/recv' is not expected to inter-operate between different pool
 versions.

I think this is not a zfs issue, but zfs does trigger the problem:
on a zfs filesystem mounted via nfs from a linux client, an 'ls .zfs'
used to panic the server; now it just doesn't work :-).
(http://lists.freebsd.org/pipermail/freebsd-fs/2008-October/005217.html)

so is this being worked on?

btw, i will be trying the new version over the weekend.

danny





Re: FreeBSD and iSCSI for disks.

2009-04-09 Thread Danny Braniss
 Danny Braniss wrote:
  Garance A Drosihn wrote:
  Some friends of mine are looking at the new DroboPro, which makes a
  lot of disk space available via iSCSI (in addition to firewire 800),
  and they were wondering how well iSCSI works with FreeBSD.  I haven't
  paid attention to iSCSI support.  Is there anyone using it heavily
  for disk-storage under FreeBSD?  Has there been much changed for
  iSCSI support in the 8.x branch, or is 7.x support working fine?
  I suppose you are interested in the client (initiator) side of iSCSI
  support. It hasn't changed much between 7.x and 8.x but there are
  apparently some announcements of a newer version:
 
  http://lists.freebsd.org/pipermail/freebsd-scsi/2009-March/003834.html
 
  I can't find any more information on it.
 
  the latest is in:
  http://www.cs.huji.ac.il/~danny/ftp/freebsd/iscsi-2.1.1.tar.gz
 
 Thanks!
 
 Is there anything in particular you'd like to get tested in the new
 version, any significant changes or improvements?
mainly fixed some bugs, and some code cleanup.

give it a spin, and let me know what target you are testing.
btw, the default tag opening is a bit conservative (1), you might want to
change it to something larger, say 64 or 128.

cheers,
danny





Re: FreeBSD and iSCSI for disks.

2009-04-09 Thread Danny Braniss
 Danny Braniss wrote:
  Danny Braniss wrote:
  Garance A Drosihn wrote:
  Some friends of mine are looking at the new DroboPro, which makes a
  lot of disk space available via iSCSI (in addition to firewire 800),
  and they were wondering how well iSCSI works with FreeBSD.  I haven't
  paid attention to iSCSI support.  Is there anyone using it heavily
  for disk-storage under FreeBSD?  Has there been much changed for
  iSCSI support in the 8.x branch, or is 7.x support working fine?
  I suppose you are interested in the client (initiator) side of iSCSI
  support. It hasn't changed much between 7.x and 8.x but there are
  apparently some announcements of a newer version:
 
  http://lists.freebsd.org/pipermail/freebsd-scsi/2009-March/003834.html
  I can't find any more information on it.
  the latest is in:
  http://www.cs.huji.ac.il/~danny/ftp/freebsd/iscsi-2.1.1.tar.gz
  Thanks!
 
  Is there anything in particular you'd like to get tested in the new
  version, any significant changes or improvements?
  mainly fixed some bugs, and some code cleanup.
 
  give it a spin, and let me know what target you are testing.
  btw, the default tag opening is a bit conservative (1), you might want to
  change it to somewhat larger, say 64 or 128.
 
 Hi,
 
 camcontrol tags hangs:
 
 Apr  9 15:36:36 terminator kernel: da3 at iscsi0 bus 0 target 1 lun 0
 Apr  9 15:36:36 terminator kernel: da3: FreeBSD iSCSI DISK 0001 Fixed
 Direct Access SCSI-5 device
 Apr  9 15:36:38 terminator kernel: (da2:iscsi0:0:0:0): lost device
 Apr  9 15:36:38 terminator kernel: (da2:iscsi0:0:0:0): removing device entry
 terminator:~ivoras/temp/sbin/iscontrol# ls /dev/da*
 /dev/da0 /dev/da0s1   /dev/da0s1a  /dev/da0s1b  /dev/da0s1c
 /dev/da1 /dev/da3
 terminator:~ivoras/temp/sbin/iscontrol# camcontrol tags da3
 
 
 The configuration is:
 
 target0 {
 targetaddress = 161.53.72.65
 targetname = iqn.2007-09.jp.ne.peach:disk1
 tags = 16
 }
 

Q: what kernel?
Q: what target?

btw, without the camcontrol tags, is it working?

danny




Re: FreeBSD and iSCSI for disks.

2009-04-08 Thread Danny Braniss

 Garance A Drosihn wrote:
  Some friends of mine are looking at the new DroboPro, which makes a
  lot of disk space available via iSCSI (in addition to firewire 800),
  and they were wondering how well iSCSI works with FreeBSD.  I haven't
  paid attention to iSCSI support.  Is there anyone using it heavily
  for disk-storage under FreeBSD?  Has there been much changed for
  iSCSI support in the 8.x branch, or is 7.x support working fine?
 
 I suppose you are interested in the client (initiator) side of iSCSI
 support. It hasn't changed much between 7.x and 8.x but there are
 apparently some announcements of a newer version:
 
 http://lists.freebsd.org/pipermail/freebsd-scsi/2009-March/003834.html
 
 I can't find any more information on it.
the latest is in:
http://www.cs.huji.ac.il/~danny/ftp/freebsd/iscsi-2.1.1.tar.gz
cheers,
danny




Re: Intel Integrated Raid (iir) relevance

2009-04-01 Thread Danny Braniss
Hi Xin LI,
 (It would be probably good idea to redirect this discussion to -stable@,
 redirected)
 
ok by me.

 Hi, Danny,
 
 Danny Braniss wrote:
  It's no longer working (for me) under 7.2, and so far
  I am not getting any feedback, so since it seems that
  this particular hardware has reached EOL, I was wondering
  if,
   a) it's true,
   b) drop it, and replace it.
   c) should time be spent in getting it to work again.
 
 I'm not very sure about your problem with iir(4).  A diff against
 RELENG_7_1 does not reveal any change on the driver itself.  Are you
 sure that 7.1-R can have the device working?
 
it's definitely broken for me; it broke sometime after rev 189591.
but the main questions are still unanswered. The problem I'm facing,
together with the amr one, is on hosts that are being decommissioned,
and though I'll be sad turning them to scrap, they did serve us well.

thanks,
danny





Re: amr driver broken since March 12

2009-03-29 Thread Danny Braniss
 Danny Braniss wrote:
  it seems March 12 was a bit off :-)
  it took some time, but I managed to close the gap:
  189100  ok
  189150  fails
  I will continue tomorrow, but this should be helpful.
  
 
 
 189150 is in the middle of a big string of related commits.  Try
 updating to the following change numbers and retesting:
 
 189088
 189107
 189161
 
 If the last one does not work, try editing /sys/dev/amr/amr.c to change
 
 #define AMR_ENABLE_CAM 1
 
 to
 
 #define AMR_ENABLE_CAM 0
 
 Scott

189161 works, also for the iir
now what?

danny




Re: amr driver broken since March 12

2009-03-29 Thread Danny Braniss
 Danny Braniss wrote:
  Danny Braniss wrote:
  it seems March 12 was a bit off :-)
  it took some time, but I managed to close the gap:
189100  ok
189150  fails
  I will continue tomorrow, but this should be helpful.
 
 
  189150 is in the middle of a big string of related commits.  Try
  updating to the following change numbers and retesting:
 
  189088
  189107
  189161
 
  If the last one does not work, try editing /sys/dev/amr/amr.c to change
 
  #define AMR_ENABLE_CAM 1
 
  to
 
  #define AMR_ENABLE_CAM 0
 
  Scott
  
  189161 works, also for the iir
  now what?
  
 
 Next set to try:
 
 189219
broken
 189229
broken
any point in going on?
danny

 189253
 189402
 189531
 189569
 189591
 
 Scott
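
The stepping through revision numbers in this thread is a bisection done by hand, and the same search can be written down as a small routine. The sketch below is only an illustration: first_bad_revision and pretend_test are invented names, the cut-off 189200 is made up so the example runs, and in practice each "test" means updating the tree, rebuilding the kernel and booting it. It also assumes that once the driver breaks it stays broken, which this thread suggests is not guaranteed (189150 failed while the later 189161 worked).

```python
def first_bad_revision(revisions, is_good):
    """Return the earliest revision in `revisions` that fails.

    Assumptions: `revisions` is sorted ascending, its first entry is known
    good, its last entry is known bad, and once a revision is bad every
    later one is bad too.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(revisions[mid]):
            lo = mid
        else:
            hi = mid
    return revisions[hi]


if __name__ == "__main__":
    # 189161 worked and 189219 was broken, per the messages above.
    candidates = list(range(189161, 189220))

    # Stand-in for "update, rebuild the kernel, boot, check amr/iir".
    # The cut-off 189200 is invented purely to make the sketch runnable.
    def pretend_test(rev):
        return rev < 189200

    print(first_bad_revision(candidates, pretend_test))  # -> 189200
```

With 58 revisions between the known-good and known-bad ends, this takes roughly six builds rather than one per revision, and fewer still if the candidate list is limited to commits that touch the kernel.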




Re: amr driver broken since March 12

2009-03-29 Thread Danny Braniss
 Danny Braniss danny at cs.huji.ac.il writes:
  at least for me :-)
  [and sorry for the cross posting]
 
 [...]
 
  amr0: LSILogic MegaRAID 1.53 mem 
  0xfbef-0xfbef,0xfe58-0xfe5f 
  irq 27 at device 0.0 on pci4
  amr0: [ITHREAD]
  amr0: delete logical drives supported by controller
  amr0: LSILogic Intel(R) RAID Controller SRCU42X Firmware 414I, BIOS A100, 
  128MB RAM
  amr0: adapter is busy
  amr0: adapter is busy
  amr0: delete logical drives supported by controller
  (probe0:amr0:0:6:0): TEST UNIT READY. CDB: 0 0 0 0 0 0 
  (probe0:amr0:0:6:0): CAM Status: SCSI Status Error
  (probe0:amr0:0:6:0): SCSI Status: Check Condition
  (probe0:amr0:0:6:0): ILLEGAL REQUEST asc:24,0
  (probe0:amr0:0:6:0): Invalid field in CDB
  (probe0:amr0:0:6:0): Unretryable error
 
   FWIW, I have an amr device (Dell PERC 3/DC) which is working fine with
 a -STABLE dated after March 12th:
 
 FreeBSD 7.2-PRERELEASE #2: Thu Mar 26 09:41:58 EDT 2009
 te...@test4.tmk.com:/usr/obj/usr/src/sys/PE1550
 [snip]
 amr0: LSILogic MegaRAID 1.53 mem 0xf000-0xf7ff irq 25 at device 0.0 
 on pci3
 amr0: [ITHREAD]
 amr0: delete logical drives supported by controller
 amr0: LSILogic PERC 3/DC Firmware 199D, BIOS 3.35, 128MB RAM
 amr0: delete logical drives supported by controller
 amrd0: LSILogic MegaRAID logical drive on amr0
 amrd0: 69360MB (142049280 sectors) RAID 5 (optimal)
 ses0 at amr0 bus 0 target 6 lun 0
 ses0: DELL 1x3 U2W SCSI BP 1.21 Fixed Processor SCSI-2 device 
 ses0: SAF-TE Compliant Device
 Trying to mount root from ufs:/dev/amrd0s1a
 
   This is on a dual-processor Dell PowerEdge 1550.
 
   So this may only affect certain models or firmware revisions of amr
 devices. Of course, since each LSI OEM uses their own firmware and
 BIOS numbering scheme, it'll be hard to tell which one is newer than
 the other.
 
   I have a bazillion of these cards if one would be helpful to a
 developer.

well, it's broken on my Dell PowerEdge 2940
amr0: LSILogic MegaRAID 1.53
amr0: Series 467 Firmware 1.06

and pciconf:
a...@pci0:0:2:1: class=0x0e0001 card=0x04671028 chip=0x19608086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '80960RP i960RP Microprocessor'
    class      = intelligent I/O controller
    subclass   = I2O

now try to follow the rebranding trail :-)






Re: ALT_BREAK_TO... + ILO ... missing something in config ...

2009-03-28 Thread Danny Braniss
 
 Due to an issue I'm having with 7.x, and trying to track it down, I spent 
 tonight getting my server set up to allow me to break into the debugger
 when it hangs, and hopefully dump core ...
 
 But, although I *think* I've got it all, I'm obviously missing something, 
 as it isn't breaking ...
 
 First ... I'm running a proliant server, and when I connect via SSH to ILO 
 on that machine, and type 'vsp', I get a shell as I expect, I can type, 
 etc ... when I reboot the machine, I get the opening splash screen with 
 the 7(?) options (normal boot, single user mode, etc, etc) ... but I get 
 nothing between that and the login prompt ... first sign of a problem, 
 maybe?
 

not really, you at least have confirmation that you are talking
correctly via the serial port. Up to this point the boot code is using
the BIOS routines to talk via the serial port. Later, the kernel
uses its own routines/knowledge of the serial port.

 Next, the easy question ... what is the key stroke to issue when one has 
 ALT_BREAK_TO_DEBUGGER is set in the kernel? I thought it was CR ~ ^b ... 
 is that correct?  I'm using putty to connect via ssh, if that makes a 
 difference ... I've also tried using the browser interface into ilo / vsp, 
 same lack of a result ...

unless the serial port is set up as the console, the escape to the
debugger is not caught. check that /boot/device.hints has:
hint.sio.0.flags=0x10
btw, Jeremy Chadwick had a nice explanation, but I lost the URL.

 
 Beyond adding sio device driver to my kernel, I've also got:
 
 options ALT_BREAK_TO_DEBUGGER
 options KDB
 options DDB
 
 Missing a kernel option maybe?
 
 I have the following in /boot/loader.conf:
 
 comconsole_speed=9600
 console=vidconsole,comconsole # A comma separated list of console(s)
 boot_multicons=-D # -D: Use multiple consoles
 boot_serial=-h # -h: Use serial console
 
 So ... either I don't have it enabled like I think, or I'm doing the wrong
 key stroke ... or ...
 
 Thx
 

you are very close! but each hardware/bios needs a different solution :-(

danny
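
The 0x10 in the hint above is a bit mask, so it may help to see which bits a given flags value sets. The sketch below decodes a value against a small subset of the sio(4) flag bits; the descriptions are paraphrased from memory of the man page and the function name describe_sio_flags is made up, so please check sio(4) on your release before relying on it.

```python
# Subset of the flag bits documented in sio(4); written down from memory,
# so verify against the man page of the release you are running.
SIO_FLAGS = {
    0x10: "potential system console",
    0x20: "forced to become system console",
    0x80: "used for remote kernel debugging",
}


def describe_sio_flags(value):
    """Return descriptions of the known bits set in a hint.sio.N.flags value."""
    return [text for bit, text in sorted(SIO_FLAGS.items()) if value & bit]


if __name__ == "__main__":
    # The value from /boot/device.hints, as written there.
    flags = int("0x10", 16)
    print(describe_sio_flags(flags))
```

Without the console bit on the port, a break sequence typed over the serial line has nothing to catch it, which would match the symptom described in this message.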




Re: amr driver broken since March 12

2009-03-28 Thread Danny Braniss
 Danny Braniss wrote:
  Danny Braniss wrote:
  at least for me :-)
  [and sorry for the cross posting]
 
  old (March 12 , i know need the svn rev number but...)
  None of the commit activity on March 12 is jumping out at me as being 
  suspicious.  However, you are now the second person who has told me 
  about AMR problems in 7.1 recently.  If you have a precise svn change
  number, it would help greatly.
 
  Scott
  my bad. the last working amr/iir is from March 12.
  I first detected the problem sometime later, but not later than March 23.
  So it has to be changes in that time frame.
  
  both drivers are showing similar symptoms:
  waiting for not busy
  the iir goes on for ever, and it's the cam that eventually panics,
   run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config
  (actually not 100% true, depending if WITNESS is on or off, it sometimes
  just hangs).
  the amr seems to time out:
  amr0: adapter is busy
  
  thanks for looking into the problem,
  
  danny
  
  
 
 Ok, here are a series of revisions to step through, in forward order.
 Make sure that you are starting with at least revision 189568.  Then,
 update to exactly the revision numbers below, recompile the kernel, and
 test:
 
 190087
 190091
 
it seems March 12 was a bit off :-)
it took some time, but I managed to close the gap:
189100  ok
189150  fails
I will continue tomorrow, but this should be helpful.

cheers,
danny




amr driver broken since March 12

2009-03-27 Thread Danny Braniss
at least for me :-)
[and sorry for the cross posting]

old (March 12; I know you need the svn rev number, but...)
dmesg | grep amr
amr0: LSILogic MegaRAID 1.53 mem 0xfbef-0xfbef,0xfe58-0xfe5f 
irq 27 at device 0.0 on pci4
amr0: [ITHREAD]
amr0: delete logical drives supported by controller
amr0: LSILogic Intel(R) RAID Controller SRCU42X Firmware 414I, BIOS A100, 
128MB RAM
amr0: delete logical drives supported by controller
amrd0: LSILogic MegaRAID logical drive on amr0
amrd0: 34857MB (71387136 sectors) RAID 0 (optimal)
amrd1: LSILogic MegaRAID logical drive on amr0
amrd1: 280024MB (573489152 sectors) RAID 5 (optimal)

and a recent 7.2 (same host):

amr0: LSILogic MegaRAID 1.53 mem 0xfbef-0xfbef,0xfe58-0xfe5f 
irq 27 at device 0.0 on pci4
amr0: [ITHREAD]
amr0: delete logical drives supported by controller
amr0: LSILogic Intel(R) RAID Controller SRCU42X Firmware 414I, BIOS A100, 
128MB RAM
amr0: adapter is busy
amr0: adapter is busy
amr0: delete logical drives supported by controller
(probe0:amr0:0:6:0): TEST UNIT READY. CDB: 0 0 0 0 0 0 
(probe0:amr0:0:6:0): CAM Status: SCSI Status Error
(probe0:amr0:0:6:0): SCSI Status: Check Condition
(probe0:amr0:0:6:0): ILLEGAL REQUEST asc:24,0
(probe0:amr0:0:6:0): Invalid field in CDB
(probe0:amr0:0:6:0): Unretryable error

btw, since I also have similar problems with another kind of raid card (iir),
I suspect some related changes are the cause.

danny







Re: amr driver broken since March 12

2009-03-27 Thread Danny Braniss
 Danny Braniss wrote:
  at least for me :-)
  [and sorry for the cross posting]
  
  old (March 12 , i know need the svn rev number but...)
 
 None of the commit activity on March 12 is jumping out at me as being 
 suspicious.  However, you are now the second person who has told me 
 about AMR problems in 7.1 recently.  If you have a precise svn change
 number, it would help greatly.
 
 Scott
my bad. the last working amr/iir is from March 12.
I first detected the problem sometime later, but not later than March 23.
So it has to be changes in that time frame.

both drivers are showing similar symptoms:
waiting for not busy
the iir goes on for ever, and it's the cam that eventually panics,
 run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config
(actually not 100% true, depending if WITNESS is on or off, it sometimes
just hangs).
the amr seems to time out:
amr0: adapter is busy

thanks for looking into the problem,

danny




Re: dump | restore fails: unknown tape header type 1853384566

2009-03-24 Thread Danny Braniss
 Daniel O'Connor wrote:
 
 On Tuesday 24 March 2009 11:55:07 Mikhail T. wrote:
 
   
  I'm trying to migrate a filesystem from one disk to another using:
 
  dump a0hCf 0 32 - /old | restore -rf -
 
  (/old is already mounted read-only). The process runs for a while and  
  then stops with:
 
  [...]
DUMP: 22.85% done, finished in 3:57 at Tue Mar 24 01:03:21 2009
DUMP: 24.66% done, finished in 3:50 at Tue Mar 24 01:00:58 2009
DUMP: 26.44% done, finished in 3:43 at Tue Mar 24 00:59:14 2009
  unknown tape header type 1853384566  abort? [yn]
 
  Any idea, what's going on? Why can't FreeBSD's restore read FreeBSD's
  dump's output?
  
 
  What happens if you don't use the cache?

 No big difference:
  dump a0f  - /old | restore -rf -
 [...]
   DUMP: 17.25% done, finished in 3:27 at Tue Mar 24 05:42:00 2009
   DUMP: 20.36% done, finished in 3:09 at Tue Mar 24 05:28:13 2009
   DUMP: 23.83% done, finished in 2:50 at Tue Mar 24 05:14:32 2009
 unknown tape header type -621260722 abort? [yn]
 
 Looks like a junk value somewhere... Uninitialized variable or some such.
 
can you try splitting it in 2, i.e. no pipe?
dump a0f some.file /old (or dump 0f - /old | gzip -c > file.dump.gz)
restore rf some.file

danny
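
One cheap check on the two "unknown tape header type" values quoted above is to look at them as raw bytes. The sketch below does that; the helper names are invented, and the interpretation is only a hint. Valid record types in a dump stream are small integers (1 through 6 in protocols/dumprestore.h, if memory serves), so a value whose bytes are all printable ASCII would be consistent with restore reading file data where it expected a header, though it does not prove it.

```python
import struct


def header_type_bytes(value):
    """Return the four bytes of a 32-bit header-type value, big-endian."""
    return struct.pack(">I", value & 0xFFFFFFFF)


def looks_like_text(raw):
    """True if every byte is printable ASCII."""
    return all(0x20 <= b <= 0x7E for b in raw)


if __name__ == "__main__":
    # The two values reported by restore in the quoted runs.
    for reported in (1853384566, -621260722):
        raw = header_type_bytes(reported)
        print(reported, raw.hex(), looks_like_text(raw))
    # 1853384566 -> 6e786776 ("nxgv"), printable
    # -621260722 -> daf8504e, not printable
```

The byte order shown is big-endian; on a little-endian host the bytes appear reversed in the stream, which does not change whether they are printable.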




Intel Integrated RAID iir not working under 7.2

2009-03-24 Thread Danny Braniss
Hi,
After turning debugging on, it seems that the iir driver
is losing an interrupt while probing:
...
gdt_next(0xc7666000)
gdt_mpr_test_busy(0xc7666000) gdt_intr(0xc7666000)
gdt_mpr_get_status(0xc7666000) gdt_mpr_intr(0xc7666000) 
gdt_free_ccb(0xc7666000, 0xc767e444)
gdt_sync_event(0xc7666000, 3, 5, 0xc767e444)
gdt_next(0xc7666000)
gdt_mpr_test_busy(0xc7666000)
run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config
panic: run_interrupt_driven_config_hooks: waited too long

btw, older (7.0/7.1) still works.
any ideas?
thanks,
danny




7.2 and iir stuck on boot.

2009-03-23 Thread Danny Braniss
Hi,
after upgrading to 7.2, booting the kernel gets stuck with:
  run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config

from a 7.1 dmesg, it seems that it's in the iir driver.

danny




7.2-PRERELEASE/sunx2200/bge/msi broken

2009-03-22 Thread Danny Braniss
Hi,
between March 16 and now, bge on a Sun X2200 stopped working;
turning off msi (via hw.pci.enable_msi=0) got it working again.
I first tried replacing bge with an older version, but that did not help.

please advise :-)

Danny



Re: 7.2-PRERELEASE/sunx2200/bge/msi broken

2009-03-22 Thread Danny Braniss
 On Sun, Mar 22, 2009 at 03:10:07PM +0200, Mikolaj Golub wrote:
  
  On Sun, 22 Mar 2009 12:55:02 +0200 Danny Braniss wrote:
  
   DB Hi,
   DB between March 16 and now, bge on a Sun X2200 stopped working,
   DB turning off msi (via hw.pci.enable_msi=0) got it working again.
   DB I tried first replacing bge with an older version but that did not help.
  
  It looks like related to this report:
  
  http://www.freebsd.org/cgi/getmsg.cgi?fetch=1253844+1263253+/usr/local/www/db/text/2009/freebsd-bugs/20090322.freebsd-bugs
  
 
 Could you please give the following patch a try?
 http://people.freebsd.org/~marius/bge_intx.diff
 
 Marius
 
it works!
thanks,
danny




Re: Tester wanted for multipath failover iSCSI target software

2009-03-11 Thread Danny Braniss
  Now istgt is a part of ports. (net/istgt)
  FreeBSD issue is solved by danny's patch.
  After applying the patch, iscontrol can connect to istgt.
 
 I am interested in giving this a try, though not immediately as I
 am away from the office at the moment. Do I need to apply a patch
 to iscontrol to make it work though ? I can't work it out from your
 statement above.

english version: (ungoogled :-)
the latest is in:
http://www.cs.huji.ac.il/~danny/ftp/freebsd/iscsi-2.1.1.tar.gz
and if you already have 2.1, apply:

--- iscsi.c.orig        2008-09-21 10:01:50.0 +0300
+++ iscsi.c     2009-03-11 13:29:04.250472000 +0200
@@ -62,7 +62,7 @@
 #include <dev/iscsi/initiator/iscsi.h>
 #include <dev/iscsi/initiator/iscsivar.h>
 
-static char *iscsi_driver_version = "2.1.0";
+static char *iscsi_driver_version = "2.1.1";
 
 static struct isc_softc isc;
 
--- isc_sm.c.orig       2008-07-19 14:04:23.0 +0300
+++ isc_sm.c    2009-03-11 13:30:20.672791000 +0200
@@ -508,7 +508,7 @@
          sn->cmd++;
 
     case ISCSI_WRITE_DATA:
-         bhs->ExpStSN = htonl(sn->stat);
+         bhs->ExpStSN = htonl(sn->stat + 1);
          break;
 
     default:
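
For anyone wondering what the second hunk changes: as far as I understand RFC 3720, ExpStatSN is the next StatSN the initiator expects, i.e. the last StatSN it has seen plus one, sent in network byte order. Below is a minimal Python sketch of just that arithmetic. It is not the driver code, and the function name is made up for illustration.

```python
import struct


def exp_stat_sn_field(last_stat_sn: int) -> bytes:
    """Encode ExpStatSN (last StatSN seen + 1) as a 32-bit big-endian field.

    Hypothetical helper, only meant to illustrate the "+ 1" in the patch.
    """
    # iSCSI sequence numbers are 32 bits wide and wrap around modulo 2**32.
    expected = (last_stat_sn + 1) & 0xFFFFFFFF
    # ">I" packs an unsigned 32-bit integer in network (big-endian) order,
    # which is the byte layout htonl() produces in the C code.
    return struct.pack(">I", expected)


print(exp_stat_sn_field(41).hex())          # 0000002a
print(exp_stat_sn_field(0xFFFFFFFF).hex())  # 00000000 (wrap-around)
```

The mask matters only at the wrap-around point; in C the same effect comes for free from unsigned 32-bit arithmetic.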



danny




Re: FBSD 7.1 XEON Quad Core

2009-02-12 Thread Danny Braniss
 On 12.02.2009 3:12 Uhr, SDH Support wrote:
 
 
  Yes, if you have (or plan to have) more than 3 GB of memory.
 
 
  FYI I have had a lot of problems with FBSD7.x and HP DL-series hardware + 
  amd64. There is a bug IIRC in the loader.
 
 
 
 I had problems with DL3X0G5 when booting with PXE, every now and then the 
 loader would hang when trying to load the kernel from an i386 8-CURRENT 
 NFS server. As soon as i had installed everything and was booting from 
 the disks the problem did not show up anymore. I also used amd64.
 
 Do you use PXE boot?

pxeboot is problematic on some platforms, try an older version
btw, if you succeed let me know.

thanks
danny




Re: impossible packet length ...

2009-02-08 Thread Danny Braniss
I'm reposting this to hackers, and there is some more info.

 Hi,
 on 2 different servers, running 7.1-stable + zfs, I get this
 error rather frequently:
 
 Feb  5 17:01:03 warhol-00 kernel: impossible packet length (543383918) from nfs server sunfire:/dist
 Feb  5 17:01:03 warhol-00 kernel: impossible packet length (1936028704) from nfs server sunfire:/dist
 Feb  5 17:01:03 warhol-00 kernel: impossible packet length (1869363744) from nfs server sunfire:/dist
 Feb  5 17:01:03 warhol-00 kernel: impossible packet length (1667787057) from nfs server sunfire:/dist
 Feb  5 17:01:03 warhol-00 kernel: impossible packet length (976040755) from nfs server sunfire:/dist
 Feb  5 17:01:03 warhol-00 kernel: impossible packet length (1953459488) from nfs server sunfire:/dist
 Feb  5 17:01:03 warhol-00 kernel: impossible packet length (1348825156) from nfs server sunfire:/dist
 Feb  5 17:01:03 warhol-00 kernel: impossible packet length (0) from nfs server sunfire:/dist
 Feb  5 17:01:03 warhol-00 kernel: impossible packet length (1647208041) from nfs server sunfire:/dist
 
 in this case the server is running FreeBSD 7.0-stable, but I also get it when
 the server is a netapp.
 
 is there a connection?
 
 thanks,
   danny

going through the logs, after it happened again, I got a glimpse of this:

Feb  6 18:00:13 warhol-00.cs.huji.ac.il kernel: bce0: discard frame w/o leading ethernet header (len 0 pkt len 0)
Feb  6 18:00:19 klee-05.cs.huji.ac.il kernel: nfs: server warhol-00 not responding, timed out
...
Feb  6 19:00:00 warhol-00.cs.huji.ac.il amd[715]: More than a single value for /defaults in hesiod.local
Feb  6 19:00:00 warhol-00.cs.huji.ac.il amd[715]: Unknown $ sequence in rhost:=${RHOST};type:=nfsl;fs:=${FS};rfs:=$huldigC0#^ZM-^KoM- abase
Feb  6 19:00:00 warhol-00.cs.huji.ac.il kernel: impossible packet length (2068989523) from nfs server sunfire:/dist

which seems to point fingers at bce...
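
A side observation on the numbers themselves. NFS over TCP uses ONC RPC record marking: each fragment starts with a 4-byte big-endian word whose top bit is the last-fragment flag and whose low 31 bits are the length. My assumption is that the client logs "impossible packet length" when that decoded length is out of range. If so, writing the logged values back out as 4 bytes shows what was really sitting where the record mark should have been. A small Python sketch (helper names are mine, not from any kernel source):

```python
import struct

LAST_FRAGMENT = 0x80000000  # top bit of an RPC record mark


def record_length(mark: bytes) -> int:
    """Fragment length from a 4-byte record mark (low 31 bits, big-endian)."""
    (word,) = struct.unpack(">I", mark)
    return word & ~LAST_FRAGMENT & 0xFFFFFFFF


def length_as_bytes(length: int) -> bytes:
    """The 4 bytes that would decode to the given logged length."""
    return struct.pack(">I", length & 0xFFFFFFFF)


# Three of the lengths from the logs in this thread.
for logged in (543383918, 1936028704, 2068989523):
    print(logged, length_as_bytes(logged))
# 543383918 b' can'
# 1936028704 b'set '
# 2068989523 b'{RFS'
```

The values are printable ASCII, and the last one, "{RFS", looks like a piece of the ${RFS} variable that is mangled in the amd map line logged at the same second. That would fit text from unrelated traffic ending up in the NFS stream, though it does not by itself say where the mix-up happens.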

danny





Re: impossible packet length ...

2009-02-08 Thread Danny Braniss
 On Sun, 8 Feb 2009, Peter Jeremy wrote:
 
  On 2009-Feb-08 11:31:45 +0200, Danny Braniss da...@cs.huji.ac.il wrote:
  Q: with rxcsum on, and a bad checksum packet is received, is it
dropped by the NIC? if not, then it somewhat explains the behaviour
 
  If checksum offloading is working correctly then a bad packet should be
  dropped by the NIC.  If checksum offloading isn't working correctly then
  you can wind up in the situation where both the NIC and the driver think
  the other party has verified the checksum.  It's also possible that you may
  be running into corruption during DMA transfer from the NIC to RAM.  ISTR
  there have been some issues reported recently with checksum offloading on
  some NICs - though I don't have details to hand - you might like to search
  the lists.
 
  changing the nic is tough, but if needed will be done.
 
  If disabling checksum offloading fixes the problem and the additional CPU
  load is acceptable (at least until you find a real fix) then there's no
  need to change NICs.
 
 Actually, my understanding was that packets with bad checksums are delivered
 to software, and a flag in the descriptor ring header for each packet tells us
 whether the checksum was (a) checked and (b) validated by the hardware.  We
 then propagate these to mbuf flags so that higher stack layers know whether or
 not to calculate the checksum themselves.  Regardless of the specifics,
 though, packets with checked but bad checksums shouldn't make it to the socket
 layer where they would be visible to NFS.  If the NIC is marking apparently
 bad packets as good, there are a number of possible sources -- be it bad
 checksum handling in the card, or corruption between the card and higher
 levels of the stack (a DMA problem, as you point out, would have this symptom).

looking at the bce source, it's not clear (to me :-). If errors are detected in
bce_rx_intr(), the packet gets dropped, which I would expect to be the
treatment of an offloaded checksum error, but it seems that is not the case.
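
For reference, this is the check the stack has to do in software whenever the NIC has not vouched for a packet. It is a minimal Python sketch of the RFC 1071 ones-complement sum, nothing to do with the bce source itself. The sample bytes are a commonly used textbook IPv4 header, not traffic from these machines.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit ones-complement sum of data, as used by IP/TCP/UDP checksums."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def header_checksum_ok(header: bytes) -> bool:
    """A header that includes a correct checksum field sums to 0xFFFF."""
    return ones_complement_sum(header) == 0xFFFF


# 20-byte IPv4 header with checksum field 0xb861.
good = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
bad = bytearray(good)
bad[8] ^= 0x01  # flip one bit in the TTL byte

print(header_checksum_ok(good))        # True
print(header_checksum_ok(bytes(bad)))  # False
```

With rxcsum enabled the host skips this work and trusts the flags handed up by the driver, so a card or DMA path that corrupts data after the hardware check would go unnoticed at this layer.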

danny





Re: impossible packet length ...

2009-02-08 Thread Danny Braniss
 
 On Sun, 8 Feb 2009, Danny Braniss wrote:
 
  looking at the bce source, it's not clear (to me :-). If errors are detected
  in bce_rx_intr(), the packet gets dropped, which I would expect to be the
  treatment of an offloaded checksum error, but it seems that is not the case.
 
 I think we're thinking of different checksums -- devices/device drivers drop 
 frames with bad ethernet checksums, but not IP and above layer checksums.

I know I'm stepping on thin ice here - haven't touched Stevens for a while,
(and I doubt it mentions offloading), but if the offload checksum is bad,
why not just drop the packet?

The way I read the driver, if the offload checksum is on, and if no
errors were detected, then it's marked as ok.

danny




Re: impossible packet length ...

2009-02-08 Thread Danny Braniss
 
 On Sun, 8 Feb 2009, Danny Braniss wrote:
 
  On Sun, 8 Feb 2009, Danny Braniss wrote:
 
  looking at the bce source, it's not clear (to me :-). If errors are
  detected in bce_rx_intr(), the packet gets dropped, which I would expect
  to be the treatment of an offloaded checksum error, but it seems that is
  not the case.
 
  I think we're thinking of different checksums -- devices/device drivers 
  drop frames with bad ethernet checksums, but not IP and above layer 
  checksums.
 
  I know I'm stepping on thin ice here - haven't touched Stevens for a while,
  (and I doubt it mentions offloading), but if the offload checksum is bad,
  why not just drop the packet?
 
  The way I read the driver, if the offload checksum is on, and if no errors
  were detected, then it's marked as ok.
 
 There are a few good reasons I can think of, but this is hardly a 
 comprehensive list:
 
 (1) If there are bad higher level checksums on the wire, you want to see them
  in tcpdump, so allow them to get up to a higher layer if network layer
  checksums aren't good.
 
 (2) It's a matter of local policy as to whether UDP checksums (for example)
  are observed or not.
 
 (3) If you're forwarding or bridging packets, it should be up to the end nodes
  how they deal with bad UDP checksums on packets to them, not the routers.

ok, I can understand the logic.

 
 Looking at if_bce.c, the following seems to be reasonable logic; first, 
 ethernet-layer checksums:
 
 5902         /* Check the received frame for errors. */
 5903         if (status & (L2_FHDR_ERRORS_BAD_CRC |
 5904             L2_FHDR_ERRORS_PHY_DECODE | L2_FHDR_ERRORS_ALIGNMENT |
 5905             L2_FHDR_ERRORS_TOO_SHORT  | L2_FHDR_ERRORS_GIANT_FRAME)) {
 5906
 5907                 /* Log the error and release the mbuf. */
 5908                 ifp->if_ierrors++;
 5909                 DBRUN(sc->l2fhdr_status_errors++);
 5910
 5911                 m_freem(m0);
 5912                 m0 = NULL;
 5913                 goto bce_rx_int_next_rx;
 5914         }
 
 I.e., if there are ethernet-level CRC failures, drop the packet.
 
 5922         /* Validate the checksum if offload enabled. */
 5923         if (ifp->if_capenable & IFCAP_RXCSUM) {
 5924
 5925                 /* Check for an IP datagram. */
 5926                 if (!(status & L2_FHDR_STATUS_SPLIT) &&
 5927                     (status & L2_FHDR_STATUS_IP_DATAGRAM)) {
 5928                         m0->m_pkthdr.csum_flags |= CSUM_IP_CHECKED;
 5929
 5930                         /* Check if the IP checksum is valid. */
 5931                         if ((l2fhdr->l2_fhdr_ip_xsum ^ 0xffff) == 0)
 5932                                 m0->m_pkthdr.csum_flags |= CSUM_IP_VALID;
 5933                 }
 5934
 5935                 /* Check for a valid TCP/UDP frame. */
 5936                 if (status & (L2_FHDR_STATUS_TCP_SEGMENT |
 5937                     L2_FHDR_STATUS_UDP_DATAGRAM)) {
 5938
 5939                         /* Check for a good TCP/UDP checksum. */
 5940                         if ((status & (L2_FHDR_ERRORS_TCP_XSUM |
 5941                             L2_FHDR_ERRORS_UDP_XSUM)) == 0) {
 5942                                 m0->m_pkthdr.csum_data =
 5943                                     l2fhdr->l2_fhdr_tcp_udp_xsum;
 5944                                 m0->m_pkthdr.csum_flags |= (CSUM_DATA_VALID
 5945                                     | CSUM_PSEUDO_HDR);
 5946                         }
 5947                 }
 5948         }
 
 Only look at higher level checksums if policy enables it on the interface;
 then, only if the hardware has a view on the IP-layer checksums, propagate
 that information to the mbuf flags from the descriptor ring entry flags, both
 whether or not the checksum was verified, and whether or not it was good.  If
 policy disables it, or the hardware expresses no view, we don't set flags,
 which simply defers checksumming to a higher layer (if required -- for
 forwarded packets, we won't test UDP-layer checksums at all).

I missed line 5928, and as usual, your explanation is most educational!
The comment in line 5939 is a bit misleading, the way I read the code, it
does not check for good checksum.
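
To make the flag propagation easier to follow, here is a rough Python model of the decision logic described above. It is a sketch only. The bit values are made up for illustration (the real ones live in the driver and mbuf headers), the SPLIT status test is left out, and it assumes the IP comparison is against 0xffff.

```python
# Illustrative bit values only; NOT the real driver or mbuf constants.
STATUS_IP_DATAGRAM = 0x01
STATUS_TCP_SEGMENT = 0x02
STATUS_UDP_DATAGRAM = 0x04
ERRORS_TCP_XSUM = 0x08
ERRORS_UDP_XSUM = 0x10

CSUM_IP_CHECKED = 0x01
CSUM_IP_VALID = 0x02
CSUM_DATA_VALID = 0x04
CSUM_PSEUDO_HDR = 0x08


def rx_csum_flags(rxcsum_enabled: bool, status: int, ip_xsum: int) -> int:
    """Return the checksum flags a received packet would carry up the stack."""
    flags = 0
    if not rxcsum_enabled:
        return flags  # policy off: the stack verifies everything itself

    if status & STATUS_IP_DATAGRAM:
        flags |= CSUM_IP_CHECKED  # hardware looked at the IP header
        if (ip_xsum ^ 0xFFFF) == 0:  # assumed constant, see lead-in
            flags |= CSUM_IP_VALID

    if status & (STATUS_TCP_SEGMENT | STATUS_UDP_DATAGRAM):
        # "Good" here means only that no error bit is set.
        if (status & (ERRORS_TCP_XSUM | ERRORS_UDP_XSUM)) == 0:
            flags |= CSUM_DATA_VALID | CSUM_PSEUDO_HDR
    return flags


# A TCP segment flagged with a checksum error gets no flags at all, so it
# is neither dropped nor marked valid; the stack recomputes the checksum.
print(rx_csum_flags(True, STATUS_TCP_SEGMENT | ERRORS_TCP_XSUM, 0))  # 0
```

Seen this way, the TCP/UDP branch tests for the absence of an error bit rather than computing anything, and a bad packet is simply passed up unmarked instead of being dropped.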

Cheers,
danny





impossible packet length ...

2009-02-05 Thread Danny Braniss
Hi,
on 2 different servers, running 7.1-stable + zfs, I get this
error rather frequently:

Feb  5 17:01:03 warhol-00 kernel: impossible packet length (543383918) from nfs server sunfire:/dist
Feb  5 17:01:03 warhol-00 kernel: impossible packet length (1936028704) from nfs server sunfire:/dist
Feb  5 17:01:03 warhol-00 kernel: impossible packet length (1869363744) from nfs server sunfire:/dist
Feb  5 17:01:03 warhol-00 kernel: impossible packet length (1667787057) from nfs server sunfire:/dist
Feb  5 17:01:03 warhol-00 kernel: impossible packet length (976040755) from nfs server sunfire:/dist
Feb  5 17:01:03 warhol-00 kernel: impossible packet length (1953459488) from nfs server sunfire:/dist
Feb  5 17:01:03 warhol-00 kernel: impossible packet length (1348825156) from nfs server sunfire:/dist
Feb  5 17:01:03 warhol-00 kernel: impossible packet length (0) from nfs server sunfire:/dist
Feb  5 17:01:03 warhol-00 kernel: impossible packet length (1647208041) from nfs server sunfire:/dist

in this case the server is running FreeBSD 7.0-stable, but I also get it when
the server is a netapp.

is there a connection?

thanks,
danny




Re: impossible packet length ...

2009-02-05 Thread Danny Braniss

 On 2009-Feb-06 08:32:27 +0200, Danny Braniss da...@cs.huji.ac.il wrote:
 on 2 different servers, running 7.1-stable + zfs, I get this
 error rather frequently:
 
 Feb  5 17:01:03 warhol-00 kernel: impossible packet length (543383918) from nfs server sunfire:/dist
 
So many questions :-)

 I gather warhol-00 is running 7.1-S+ZFS.
 How recent a 'stable' is it?
very:
FreeBSD warhol-00 7.1-STABLE FreeBSD 7.1-STABLE #37: Fri Jan 23 10:41:54 IST 2009
and it's amd64.

 Where does ZFS fit in?  Is sunfire:/dist mountpoint in a local ZFS or
 is a local ZFS mountpoint inside the sunfire:/dist mount?
warhol is an NFS server, the storage is a local ZFS raid, the errors
occur on an nfs/tcp mounted file system on warhol.

 Do you get the same problems without any ZFS mounts?
I have several hosts running 7.1-stable without nfs exported ZFS, none have
this error. That is why I think there is a connection, because on the two
which have ZFS exported, the problem appears.

 Is this a TCP or UDP NFS mount?  What happens if you switch protocols?
i'll try but not trivial.
the other difference between the boxes is that one is dataless, while
the other is stand-alone (well, / is on a local disk, but /usr/local and
home dirs are on the network/nfs).

 What NIC are you using and are you seeing any network errors?
bce, the boxes are Dell-2950, but no visible errors.

 Are you able to capture a protocol trace showing the transaction including
 erroneous packet?
I have started the capture, but since I don't know what triggers the problem,
it will take some time. I will also start capturing packets at the router
level, but that will have to wait till next week.

thanks,
danny

 
 -- 
 Peter Jeremy
 




Re: Unhappy Xorg upgrade

2009-01-31 Thread Danny Braniss

 As a general note, this is the second time in a row that an X.org
 upgrade broke X for a significant number of people.  IMO, this
 suggests that our approach to X.org upgrades needs significant changes
 (see below).  X11 is a critical component for anyone who is using
 FreeBSD as a desktop and having upgrades fail or come with significant
 POLA violations and regressions for significant numbers of people is
 not acceptable.
 
you took the words out of my mouth!
Some days ago, I compiled wine from ports, among its dependencies
was cups (why in the name of G_D?), and x11-xcb (which did not ring
any special bells - stupidly I thought it meant some x11 cut buffer gizmo :-)
Anyways, next day, I couldn't open windows (x11 not MS) from some hosts,
some debugging later, it was xauth failing. Now xcb did ring bells! A year
ago we found a bug in libxcb, where the treatment of xauth was broken,
we sent a patch, but it is still waiting.
BTW, I opened a PR, http://www.freebsd.org/cgi/query-pr.cgi?pr=131120,
where it's now going the way of the salmon, upstream, waiting for some kind
soul to apply it.

 On 2009-Jan-29 08:40:11 -0500, Robert Noland rnol...@freebsd.org wrote:
 I've had patches available for probably a couple of months now posted to
 freebsd-...@.  For the few people who tested it, I had no real issues
 reported.
 
 I didn't recall seeing any reference to patches so I went looking.
 All I could find is a couple of references to a patchset existing
 buried inside threads discussing specific problems with X.  The
 majority of people who didn't have those specific problems probably
 skipped the thread and never saw that a patchset was available.
 
 When the X.org 7.0 upgrade was planned, a heads-up went out on a
 number of mailing lists, together with a pointer to the patchset and
 upgrade instructions and the upgrade did not proceed until both a
 reasonable number of people reported success and reported problems had
 been ironed out.  Given the ongoing problems with code provided by
 X.org, I suggest that this approach needs to be followed for every
 future release of X.org until (if) the X.org Project demonstrates that
 they can provide release-quality code.
 
   This update also brings in support for a
 lot of people who are running newer hardware.
 
 And breaks support for lots of people who used to have functional
 X servers.
 
merging /usr/X11R6 into /usr/local was a bad idea!

cheers,
danny
 -- 
 Peter Jeremy
 Please excuse any delays as the result of my ISP's inability to implement
 an MTA that is either RFC2821-compliant or matches their claimed behaviour.
 
 




Re: more marvell marvels

2009-01-10 Thread Danny Braniss
 On Fri, Jan 09, 2009 at 01:48:24PM +0200, Danny Braniss wrote:
   hi, the mb is asus P5K-VM, the onboard nic is, according to pciconf:
   
   ms...@pci0:1:0:0:   class=0x02 card=0x81f81043 chip=0x436411ab rev=0x12 hdr=0x00
   vendor = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
   device = '88E8056 Yukon PCI-E Gigabit Ethernet Controller'
   class  = network
   subclass   = ethernet
   cap 01[48] = powerspec 3  supports D0 D1 D2 D3  current D0
   cap 03[50] = VPD
   cap 05[5c] = MSI supports 1 message, 64 bit 
   cap 10[e0] = PCI-Express 1 legacy endpoint
   
   nothing new here, problems have been reported before, but:
   
   my very first attempt - after a very long time - of booting 7.1-stable,
   produced a panic because msk could not find its physio, by the time i had
   the serial console attached and working, that problem disappeared :-(
   now, after reboot, it sometimes hangs - because the net is not working,
   and only if I unplug the ethernet, (no signs of the driver seeing this),
   and replug things begin to work. btw, i had to set
   hw.msk.legacy_intr=1
   to get things working.
   
   any patches for 7.1-stable to test?
   
 
 If memory serves me right you have Yukon EC Ultra with 88E1149 PHY,
 right? CURRENT has some stability fixes but the source won't
 compile on stable/7 yet due to KPI differences. I plan to add
 some features next week which will make it possible to use the HEAD
 version on stable/7.
 
 I'm not sure the patch for 88E8040 can be applied to stable/7
 but the patch has some fixes for link state handling. Would you
 give it a try?
 http://people.freebsd.org/~yongari/msk/msk.88E8040.patch14
 Note, the 88E8040 patch is not complete yet and may cause other
 problems too.
 

tried to apply patches, but if_mskreg.h patches failed, and hand stitching
didn't help (I have 7.1-Stable)

danny

 -- 
 Regards,
 Pyun YongHyeon




more marvell marvels

2009-01-09 Thread Danny Braniss
hi, the mb is asus P5K-VM, the onboard nic is, according to pciconf:

ms...@pci0:1:0:0:   class=0x02 card=0x81f81043 chip=0x436411ab rev=0x12 hdr=0x00
vendor = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
device = '88E8056 Yukon PCI-E Gigabit Ethernet Controller'
class  = network
subclass   = ethernet
cap 01[48] = powerspec 3  supports D0 D1 D2 D3  current D0
cap 03[50] = VPD
cap 05[5c] = MSI supports 1 message, 64 bit 
cap 10[e0] = PCI-Express 1 legacy endpoint

nothing new here, problems have been reported before, but:

my very first attempt - after a very long time - of booting 7.1-stable,
produced a panic because msk could not find its physio, by the time i had the
serial console attached and working, that problem disappeared :-(
now, after reboot, it sometimes hangs - because the net is not working, and
only if I unplug the ethernet, (no signs of the driver seeing this), and
replug things begin to work. btw, i had to set
 hw.msk.legacy_intr=1
to get things working.

any patches for 7.1-stable to test?

danny





Re: more marvell marvels

2009-01-09 Thread Danny Braniss
 On Fri, Jan 09, 2009 at 01:48:24PM +0200, Danny Braniss wrote:
   hi, the mb is asus P5K-VM, the onboard nic is, according to pciconf:
   
   ms...@pci0:1:0:0:   class=0x02 card=0x81f81043 chip=0x436411ab rev=0x12 hdr=0x00
   vendor = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
   device = '88E8056 Yukon PCI-E Gigabit Ethernet Controller'
   class  = network
   subclass   = ethernet
   cap 01[48] = powerspec 3  supports D0 D1 D2 D3  current D0
   cap 03[50] = VPD
   cap 05[5c] = MSI supports 1 message, 64 bit 
   cap 10[e0] = PCI-Express 1 legacy endpoint
   
   nothing new here, problems have been reported before, but:
   
   my very first attempt - after a very long time - of booting 7.1-stable,
   produced a panic because msk could not find its physio, by the time i had
   the serial console attached and working, that problem disappeared :-(
   now, after reboot, it sometimes hangs - because the net is not working,
   and only if I unplug the ethernet, (no signs of the driver seeing this),
   and replug things begin to work. btw, i had to set
   hw.msk.legacy_intr=1
   to get things working.
   
   any patches for 7.1-stable to test?
   
 
 If memory serves me right you have Yukon EC Ultra with 88E1149 PHY,
 right?

e1000phy0: <Marvell 88E1149 Gigabit PHY> PHY 0 on miibus0
e1000phy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
mskc0: [ITHREAD]

 CURRENT has some stability fixes but the source won't
 compile on stable/7 yet due to KPI differences. I plan to add
 some features next week which will make it possible to use the HEAD
 version on stable/7.
 
 I'm not sure the patch for 88E8040 can be applied to stable/7
 but the patch has some fixes for link state handling. Would you
 give it a try?
 http://people.freebsd.org/~yongari/msk/msk.88E8040.patch14
 Note, the 88E8040 patch is not complete yet and may cause other
 problems too.

I'll try asap
thanks,
danny

 
 -- 
 Regards,
 Pyun YongHyeon




Re: newfs(8) parameters from dumpfs -m have bad -s value?

2009-01-05 Thread Danny Braniss
 On Mon, Jan 05, 2009 at 08:23:53PM +0100, Oliver Fromme wrote:
  
  This seems to be a bug in dumpfs(8).  It simply prints
  the value of the fs_size field of the superblock, which
  is wrong.
  
  The -s option of newfs(8) expects the available size in
  sectors (i.e. 512 bytes), but the fs_size field contains
  the size of the file system in 2KB units.  This seems to
  be the fragment size, but I'm not sure if this is just
 
 This *is* the fragment size.  UFS/FFS uses the plain term block to mean
 the fragment size.  All blocks are indexed with this number, unlike block
 size which is almost always 8 fragments (blocks).  Confusing.
 
  So, dumpfs(8) needs to be fixed to perform the proper
  calculations when printing the value for the -s option.
  Unfortunately I'm not sufficiently much of a UFS guru
  to offer a fix.  My best guess would be to multiply the
  fs_size value by the fragment size (measured in 512 byte
  units), i.e. multiply by 4 in the most common case.
  But I'm afraid the real solution is not that simple.
 
 The sector size and filesystem size parameters in newfs are remnants.
 Everything is converted to number of media sectors (sector size as
 specified by the device).  So one could assume for dumpfs to always use
 512, since it's rarely different, and multiply fs_size by fs_fsize and
 divide by 512, and then output -S 512.
 

don't assume 512, in the iscsi world I have seen all kinds of sector sizes,
making it a PITA to get things right.
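
The conversion being discussed (fs_size counted in fragments, newfs -s wanting device sectors) is simple enough to write down with the sector size as a parameter instead of a hard-coded 512. A minimal Python sketch, where the function name and the sample numbers are mine and not taken from dumpfs or newfs:

```python
def newfs_size_in_sectors(fs_size: int, fs_fsize: int, sector_size: int) -> int:
    """Convert a filesystem size in fragments to a count of device sectors.

    fs_size     -- size of the filesystem in fragments (superblock fs_size)
    fs_fsize    -- fragment size in bytes (superblock fs_fsize)
    sector_size -- sector size reported by the device, in bytes
    """
    if fs_size < 0 or fs_fsize <= 0 or sector_size <= 0:
        raise ValueError("sizes must be positive")
    total_bytes = fs_size * fs_fsize
    if total_bytes % sector_size:
        # Refuse to round silently; the caller should decide what to do.
        raise ValueError("filesystem size is not a whole number of sectors")
    return total_bytes // sector_size


# 1000 fragments of 2048 bytes on two different kinds of device:
print(newfs_size_in_sectors(1000, 2048, 512))   # 4000
print(newfs_size_in_sectors(1000, 2048, 4096))  # 500
```

The same filesystem gives a different -s value depending on the device, which is why assuming 512 breaks on targets that report another sector size.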

 Better yet would be to add a parameter (-z perhaps) to newfs(8) to accept
 number of bytes instead of multiples of sectorsize.
 
 I would be willing to write up patches for dumpfs and newfs to both add the
 raw byte size and the 512-byte sector size handling to correct said
 mistake, unless someone else would rather.
 
 -- Rick C. Petty
 




Re: amd(8) cores dump when load high

2008-12-29 Thread Danny Braniss
 On Sat, Dec 27, 2008 at 01:03:54PM +0200, Danny Braniss wrote:
  well, I'm running 7.1-PRERELEASE, what do the amd logs show?
 ..
  Dec 27 10:37:01 sf-02 amd[856]: am-utils version 6.1.5 (build 1).
  Dec 27 10:37:01 sf-02 amd[856]: Report bugs to https://bugzilla.am-utils.org/ or am-ut...@am-utils.org.
  Dec 27 10:37:01 sf-02 amd[856]: Configured by da...@sunfire on date Sun Jun 29 16:59:06 IDT 2008.
  Dec 27 10:37:01 sf-02 amd[856]: Built by da...@sunfire on date Sun Jun 29 17:02:07 IDT 2008.
  Dec 27 10:37:01 sf-02 amd[856]: cpu=x86_64 (little-endian), arch=amd64, karch=amd64.
  Dec 27 10:37:01 sf-02 amd[856]: full_os=freebsd7.0, os=freebsd7, osver=7.0, vendor=unknown, distro=none.
 
 How did you get this output from /usr/sbin/amd?
 
 It should be:
 # amq -v
 Copyright (c) 1997-2006 Erez Zadok
 Copyright (c) 1990 Jan-Simon Pendry
 Copyright (c) 1990 Imperial College of Science, Technology & Medicine
 Copyright (c) 1990 The Regents of the University of California.
 am-utils version 6.1.5 (build 800059).
 Report bugs to https://bugzilla.am-utils.org/ or am-ut...@am-utils.org.
 Configured by David O'Brien obr...@freebsd.org on date 4-December-2007 PST.
 Built by r...@quynh.nuxi.org on date Fri Dec 19 15:29:18 PST 2008.
 cpu=amd64 (little-endian), arch=amd64, karch=amd64.
 full_os=freebsd8.0, os=freebsd8, osver=8.0, vendor=undermydesk, distro=The FreeBSD Project.
 
 So many of your fields aren't what I expect: build#, configured by,
 configured date, cpu, vendor, nor distro.
 
 -- 
 -- David(obr...@freebsd.org)

as explained in a later message, I rolled my own :-)
with the unofficial patches.

danny






Re: amd(8) cores dump when load high

2008-12-27 Thread Danny Braniss
 Yes, we found that it crashes when swap is used.

on an amd64 architecture, amd could not plock its pages in memory, and would,
under memory pressure, be swapped out, and break.
I just ran some tests under 7.1-PRERELEASE, and
- it seems that plock is working.
- amd is not being swapped out.
are you running with amd -S ?
danny

 
 
 On Fri, Dec 26, 2008 at 6:02 PM, Rong-en Fan gra...@gmail.com wrote:
  On Tue, Dec 23, 2008 at 12:44 AM, Lin Jui-Nan Eric eric...@tamama.org wrote:
  Dear listers,
 
  We currently found that amd frequently dumps core while load is
  high (about 4~5) after we upgraded world & kernel from 7.0-RELEASE to
  7.1-PRERELEASE.
 
  I have read -stable and the svn log of 7-STABLE, but could not find a
  report or a solution. Did anyone have the same issue? Thank you very
  much.
 
 
  According to my previous experience, amd 6.1.5 crashes
  under low memory situations. Not necessarily high load.
 
  Regards,
  Rong-En Fan
 
 




Re: amd(8) cores dump when load high

2008-12-27 Thread Danny Braniss
 No, we are not running amd with -S.
 
 # ps auxww | grep amd
 root  706  0.0  0.1  7660  5416  ??  Ss   Wed05PM   4:48.12
 /usr/sbin/amd -p -k amd64 -x all /net amd.map
 
well, I'm running 7.1-PRERELEASE, what do the amd logs show?

Dec 27 10:37:01 sf-02 amd[856]: AM-UTILS VERSION INFORMATION:
Dec 27 10:37:01 sf-02 amd[856]: Copyright (c) 1997-2006 Erez Zadok
Dec 27 10:37:01 sf-02 amd[856]: Copyright (c) 1990 Jan-Simon Pendry
Dec 27 10:37:01 sf-02 amd[856]: Copyright (c) 1990 Imperial College of Science, Technology & Medicine
Dec 27 10:37:01 sf-02 amd[856]: Copyright (c) 1990 The Regents of the University of California.
Dec 27 10:37:01 sf-02 amd[856]: am-utils version 6.1.5 (build 1).
Dec 27 10:37:01 sf-02 amd[856]: Report bugs to https://bugzilla.am-utils.org/ or am-ut...@am-utils.org.
Dec 27 10:37:01 sf-02 amd[856]: Configured by da...@sunfire on date Sun Jun 29 16:59:06 IDT 2008.
Dec 27 10:37:01 sf-02 amd[856]: Built by da...@sunfire on date Sun Jun 29 17:02:07 IDT 2008.
Dec 27 10:37:01 sf-02 amd[856]: cpu=x86_64 (little-endian), arch=amd64, karch=amd64.
Dec 27 10:37:01 sf-02 amd[856]: full_os=freebsd7.0, os=freebsd7, osver=7.0, vendor=unknown, distro=none.
Dec 27 10:37:01 sf-02 amd[856]: domain=unknown.domain, host=sf-02, hostd=sf-02.unknown.domain.
Dec 27 10:37:01 sf-02 amd[856]: Map support for: root, passwd, hesiod, union, nis, ndbm, file, exec, error.
Dec 27 10:37:01 sf-02 amd[856]: AMFS: nfs, link, nfsx, nfsl, host, linkx, program, union, ufs, cdfs,
Dec 27 10:37:01 sf-02 amd[856]:   pcfs, auto, direct, toplvl, error, inherit.
Dec 27 10:37:01 sf-02 amd[856]: FS: cd9660, nfs, nfs3, nullfs, msdosfs, ufs, unionfs.
Dec 27 10:37:01 sf-02 amd[856]: Network: wire=132.65.16.0 (netnumber=132.65.16).
Dec 27 10:37:01 sf-02 amd[856]: My ip addr is 127.0.0.1
Dec 27 10:37:01 sf-02 amd[857]: released controlling tty using setsid()
Dec 27 10:37:01 sf-02 amd[857]: Locked process pages in memory
**

 
 On Sat, Dec 27, 2008 at 4:51 PM, Danny Braniss da...@cs.huji.ac.il wrote:
  Yes, we found that it crashes when swap is used.
 
  on an amd64 architecture, amd could not plock its pages in memory, and
  would, under memory pressure, be swapped out, and break.
  I just ran some tests under 7.1-PRERELEASE, and
  - it seems that plock is working.
  - amd is not being swapped out.
  are you running with amd -S ?
 danny
 
 
 
  On Fri, Dec 26, 2008 at 6:02 PM, Rong-en Fan gra...@gmail.com wrote:
   On Tue, Dec 23, 2008 at 12:44 AM, Lin Jui-Nan Eric eric...@tamama.org 
   wrote:
   Dear listers,
  
   We recently found that amd frequently dumps core while the load is
   high (about 4~5), after we upgraded world and kernel from 7.0-RELEASE to
   7.1-PRERELEASE.
  
   I have read -stable and the svn log of 7-STABLE, but could not find a
   report or a solution. Has anyone had the same issue? Thank you very
   much.
  
  
   In my previous experience, amd 6.1.5 crashes
   in low-memory situations, not necessarily under high load.
  
   Regards,
   Rong-En Fan
  
 
 
 
 


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: amd(8) cores dump when load high

2008-12-27 Thread Danny Braniss
 I got "Couldn't lock process pages in memory using mlockall()" too:
[...]
 On Sat, Dec 27, 2008 at 9:07 PM, Rong-en Fan gra...@gmail.com wrote:
  On Sat, Dec 27, 2008 at 7:03 PM, Danny Braniss da...@cs.huji.ac.il wrote:
  No, we are not running amd with -S.
 
  # ps auxww | grep amd
  root  706  0.0  0.1  7660  5416  ??  Ss   Wed05PM   4:48.12
  /usr/sbin/amd -p -k amd64 -x all /net amd.map
 
  well, I'm running 7.1-PRERELEASE, what does the amd logs show?
 
  [...]
  Dec 27 10:37:01 sf-02 amd[857]: Locked process pages in memory
 **
 
  Hmm.. interesting, I got this
 
  Dec 26 15:32:11 bsd2 amd[39723]: Couldn't lock process pages in memory using mlockall(): Resource temporarily unavailable
 
  w/ 7-STABLE around Sep 4. I don't put plock = no in amd.conf, so
  by default it's plock'ed.
 
  Regards,
  Rong-En Fan
 

some more ingredients:
when running vanilla amd it also fails to lock pages:
Couldn't lock process pages in memory using mlockall(): Resource temporarily unavailable
while the amd I'm running, which includes the latest (non official) patches,
works fine.
but the main diff I see is:
opteron> ldd /usr/sbin/amd
/usr/sbin/amd:
        libc.so.7 => /lib/libc.so.7 (0x80065a000)
while
opteron> ldd /SBIN/amd
/SBIN/amd:
        librt.so.1 => /usr/lib/librt.so.1 (0x800658000)
        librpcsvc.so.4 => /usr/lib/librpcsvc.so.4 (0x80075d000)
        libwrap.so.5 => /usr/lib/libwrap.so.5 (0x800866000)
        libc.so.7 => /lib/libc.so.7 (0x80096f000)
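
To tell quickly which of the two outcomes a given amd instance hit, the syslog can be scanned for the two messages quoted in this thread. This is only a minimal sketch: the function name plock_status is invented for illustration, and it assumes the message texts are exactly as they appear above.

```sh
#!/bin/sh
# Classify amd's page-locking outcome from syslog lines read on stdin.
# The two message texts are the ones quoted in this thread; the function
# name plock_status is made up for this sketch.
plock_status() {
    awk '
        /Locked process pages in memory/        { state = "locked" }
        /Couldn.t lock process pages in memory/ { state = "failed" }
        END { print (state == "" ? "unknown" : state) }
    '
}

# Example with a line taken from the log above.
echo "Dec 27 10:37:01 sf-02 amd[857]: Locked process pages in memory" | plock_status
```

If both messages occur in the input, the last one wins, which matches reading a log in chronological order.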

danny




Re: RELENG_7_1: bce driver change generating too much interrupts ?

2008-12-18 Thread Danny Braniss
 
 Hi, Nawfal,
 
 Nawfal bin Mohmad Rouyan wrote:
  I have been using a Dell machine with 2 bce interfaces as a bridge
  between my LAN and Firewall to shape the traffic. Since after the
  update, the machine can only run for a few minutes and after that no
  more connection can go through.
  
  Ping from LAN to Internet is OK but when I telnet say to www.yahoo.com
  at port 80 and issue GET / HTTP/1.0 I can see the data of different
  application including the HTML text.
  
  For example, I can see uTorrent packets with binaries and also the HTML
  page being cut short. It's as if, I'm seeing packets jumbled together
  from different application.
  
  I'm using PF to shape the traffic. If I reboot the server, it will panic
  and I have about 3 different vmcores in /var/crash and not sure what to
  do with it :( . I've tested the patch to remove
  stat_IfInFramesL2FilterDiscards but the problem still occurs.
 
 The last patch is not a functional change, but a behavior change that
 removes the L2FilterDiscards from being counted to match previous behavior.
 
 Would you please do this:
 
 script bt.txt kgdb /boot/kernel/kernel.symbols /var/crash/vmcore.0
 
 Then, do 'bt', press enter until all display has finished, then exit
 kgdb, and send me the result (bt.txt)?
 
  For now, I'm not using the server to shape the traffic because I
  suspect the driver isn't reliable. I'm going to revert to the
  previous driver and hope it's going to work.
  
  Sorry if there is not much detail since I'm not sure what to provide.
  Just tell me what to provide and I'd be happy to do so.

I don't know if the following is related, but:
- while stress testing nfs/zfs, I get many weird things on the server (dell-2950/bce)
example:
impossible packet length (33555456) from nfs server fr-01:/vol/system/share
impossible packet length (1792323116) from nfs server fr-01:/vol/system/share
...
and things get worse soon after. Now, there are no input errors, so it seems
some memory starvation is not properly handled ...
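
One way to look at such lengths is to print them as hex so the individual bytes are visible; values that look like arbitrary payload bytes rather than a plausible size would fit the picture of corrupted or misaligned data. This is a sketch only, it proves nothing about the cause, and to_hex is a hypothetical helper name.

```sh
#!/bin/sh
# Print a reported packet length as 32-bit hex so its bytes can be inspected.
# to_hex is a hypothetical helper name used only in this sketch.
to_hex() {
    printf '0x%08x\n' "$1"
}

# The two lengths from the messages above.
to_hex 33555456
to_hex 1792323116
```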

cheers,
danny




more zfs/nfs panics

2008-12-16 Thread Danny Braniss
Hi,
I'm trying to tar a rather big directory via nfs (some 800gb); it
has many subdirectories, some of them with many files (close to 10^6 :-)

just before the server panics, the tar (on the client) starts complaining
about lost files, or permission denied, but not in the pathological directories.
panic: kmem_malloc(-1661382656): kmem_map too small: 645009408 total allocated
cpuid = 3
KDB: enter: panic
[thread pid 881 tid 100112 ]
Stopped at  kdb_enter_why+0x3d: movq    $0,0x5ef3e8(%rip)
db> tr
Tracing pid 881 tid 100112 td 0xff0004ba2000
kdb_enter_why() at kdb_enter_why+0x3d
panic() at panic+0x17b
kmem_malloc() at kmem_malloc+0x565
uma_large_malloc() at uma_large_malloc+0x4a
malloc() at malloc+0xd7
nfsrv_readdir() at nfsrv_readdir+0x4e1
nfssvc() at nfssvc+0x400
syscall() at syscall+0x1bb
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (155, FreeBSD ELF64, nfssvc), rip = 0x8006885cc, rsp = 0x7fffea28, rbp = 0 ---

I have increased 
vm.kmem_size_max=1024M
vm.kmem_size=1024M
vfs.zfs.arc_max=800M
it just seems to delay the panic though, it smells like some memory leak ...
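
A quick piece of arithmetic on the panic message, under one assumption that the thread does not confirm: that the negative size is a value which wrapped around in a 32-bit signed integer. Read as unsigned, the request would then be far larger than what kmem_map had allocated in total. The helper names as_u32 and to_mib are made up for this sketch, and the shell needs 64-bit arithmetic.

```sh
#!/bin/sh
# Reinterpret a negative 32-bit value as unsigned (assumption: the size
# printed by the panic wrapped around in a 32-bit signed integer).
as_u32() {
    echo $(( $1 + 4294967296 ))
}

# Convert bytes to whole MiB (integer division).
to_mib() {
    echo $(( $1 / 1048576 ))
}

req=$(as_u32 -1661382656)
echo "requested: $req bytes (~$(to_mib "$req") MiB)"
echo "kmem_map total allocated: ~$(to_mib 645009408) MiB"
```

If that reading is right, raising vm.kmem_size to 1024M could not satisfy such a request either, which would be consistent with the panic only being delayed.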

the host is running amd64 quad core, 7.1-prerelease and 8GB.

danny




Re: bce reporting fantom input errors?

2008-12-16 Thread Danny Braniss
 Danny Braniss wrote:
  Hi,
  After changing cables,switches,ports, I came to the conclusion
  that bce is reporting input errors that are not there, or creating them.
  I checked this with 3 different boxes, all Dell-2950/Broadcom NetXtreme II
  BCM5708 1000Base-T (B2), and one of them, while running Solaris, reported
  0 errors after a week, and freebsd after a few minutes its count was > 100.
  The errors appear under 7.1-PRERELEASE, but not under 7.0
  
  Anybody else seeing this?
 
 Please apply this patch, it was committed as revision 186169 about 3
 hours ago against -HEAD.  I'll MFC it after 3 days.
 
 Cheers,
 -- 
 Xin LI delp...@delphij.net  http://www.delphij.net/
 FreeBSD - The Power to Serve!
 
 Attachment: bce-noL2Filter.diff
 
 Index: if_bce.c
 ===
 --- if_bce.c  (revision 186076)
 +++ if_bce.c  (working copy)
 @@ -7408,7 +7408,6 @@
   (u_long) sc->stat_IfInMBUFDiscards +
   (u_long) sc->stat_Dot3StatsAlignmentErrors +
   (u_long) sc->stat_Dot3StatsFCSErrors +
 - (u_long) sc->stat_IfInFramesL2FilterDiscards +
   (u_long) sc->stat_IfInRuleCheckerDiscards +
   (u_long) sc->stat_IfInFTQDiscards +
   (u_long) sc->com_no_buffers;
 

thanks! so actually it was counting IfInFramesL2FilterDiscards.
btw, it worked, it's now 0 input errors.

danny








bce reporting fantom input errors?

2008-12-15 Thread Danny Braniss
Hi,
After changing cables, switches and ports, I came to the conclusion
that bce is reporting input errors that are not there, or creating them.
I checked this with 3 different boxes, all Dell-2950/Broadcom NetXtreme II
BCM5708 1000Base-T (B2), and one of them, while running Solaris, reported
0 errors after a week, while under freebsd its count was > 100 after a few minutes.
The errors appear under 7.1-PRERELEASE, but not under 7.0

Anybody else seeing this?

danny




Re: zfs panics

2008-12-11 Thread Danny Braniss
 
 Hi,
 
 On 2008-12-10, Danny Braniss wrote:
  from a solaris or linux client, doing a ls(1) of a nfs exported zfs 
  file,
  for example: ls /net/zfs-server/h/.zfs/snapshot,
  panics the server. The server is running latest 7.1-prerelease.
 
 This has been reported as PR kern/125149. I have described the problem
 in this message:
 
 http://lists.freebsd.org/pipermail/freebsd-fs/2008-October/005217.html
 
 See the PR for RELENG_7 patches.
 (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/125149)
 
 -- 
 Jaakko

Hi Jaakko,
did you apply the patches, and did they solve the problem?
and, btw, which patch?

To Jeremy,
How about adding a line explaining that it would be
prudent to 'zfs set snapdir=hidden' ..., or, of course,
fix the bug :-)
I will apply the patch/es and see what happens.

cheers,
danny




Re: zfs panics

2008-12-11 Thread Danny Braniss
  
  Hi,
  
  On 2008-12-10, Danny Braniss wrote:
 from a solaris or linux client, doing a ls(1) of a nfs exported zfs 
   file,
   for example: ls /net/zfs-server/h/.zfs/snapshot,
   panics the server. The server is running latest 7.1-prerelease.
  
  This has been reported as PR kern/125149. I have described the problem
  in this message:
  
  http://lists.freebsd.org/pipermail/freebsd-fs/2008-October/005217.html
  
  See the PR for RELENG_7 patches.
  (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/125149)
  
  -- 
  Jaakko
 
 Hi Jaakko,
   did you apply the patches, and did they solve the problem?
 and, btw, which patch?
 
 To Jeremy,
   How about adding a line explaining that it would be
 prudent to 'zfs set snapdir=hidden' ..., or, of course,
 fix the bug :-)
 I will apply the patch/es and see what happens.
 
 cheers,
   danny

the patch to nfs_server.c does indeed prevent the panics.

danny





zfs panics

2008-12-10 Thread Danny Braniss
hi,
from a solaris or linux client, doing an ls(1) of an nfs-exported zfs file,
for example: ls /net/zfs-server/h/.zfs/snapshot,
panics the server. The server is running latest 7.1-prerelease.
when the client is freebsd, it mostly works, but in a few cases
the server just goes into a coma.
btw, the server is running vanilla zfs, no tuning, and the server is
64bit with 8gb of memory and quad core (dell-pe2950)

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x168
fault code  = supervisor write data, page not present
instruction pointer = 0x8:0x804a9175
stack pointer   = 0x10:0xb71fc550
frame pointer   = 0x10:0xb71fc560
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 802 (nfsd)
[thread pid 802 tid 100185 ]
Stopped at  _mtx_lock_flags+0x15:   lock cmpxchgq   %rsi,0x50(%rdi)
db> tr
Tracing pid 802 tid 100185 td 0xff0004d576e0
_mtx_lock_flags() at _mtx_lock_flags+0x15
vput() at vput+0x45
nfsrv_readdirplus() at nfsrv_readdirplus+0x83e
nfssvc() at nfssvc+0x400
syscall() at syscall+0x1bb
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (155, FreeBSD ELF64, nfssvc), rip = 0x8006885cc, rsp = 
0x7fffea2




btx/pxeboot problem

2008-12-02 Thread Danny Braniss
latest pxeboot (7.1):
mother-board      NIC/LOM  CPU
------------      -------  ---
Intel SWV25       em       xeon  works fine
SUN X2200         bge      amd   works fine
DELL PE 2950      bce      xeon  fails 95% of the time:
                                 hangs or goes into btx dump regs. mode :-)
Intel SE7320VP21  msk      xeon  fails 50% of the time - hangs

pxeboot with btx.S 1.45 2008/02/27 23:35:39 works fine.

so it seems that changes since 1.45 have fixed it for some, but it
breaks for others :-). I can help with testing, but btx is way out of
my league.

danny





Re: dhclient doing DISCOVER with bad IP checksum - bge (7.1 show stopper??)

2008-12-02 Thread Danny Braniss
 Can someone please confirm or rule out my issue with dhclient sending 
 bad IP checksum packets. It would really suck if 7.1 was released with a 
 broken DHCP client.
 
I've had many problems lately, but none involved checksum nor the dhcpd
(btw, I assume that you are seeing bad checksum on the receiving server)
could you add a nic to your PE1750? 

danny


 Jonathan Feally wrote:
  Sorry for the cross-post, but this could be either lists problem.
 
  I have 2 boxes running 7-STABLE as of 20081130, both i386 SMP. One is 
  running ISC DHCPD 3.0.x from recent ports, and the other dhclient from 
  make world.
 
  The server is refusing to answer the DISCOVER request, as it thinks 
  the IP checksum is wrong, which tcpdump also confirms. Other DHCP 
  clients are working fine on this network, so I do not believe it to be 
  the network, server or dhcpd.
 
  Server is running a 2 Port Intel card - em driver.
 
  Client is a Dell PE1750 with 2 onboard NIC's - bge driver.
 
  I have tried turning off both RXCSUM and TXCSUM on both the client and 
  server machines with no luck. I also tried the second NIC on the 
  server with the same result.
 
  This setup was working just a couple of weeks ago, and the only thing 
  that has changed is updating the src for a make world. PXE booting 
  this server does result in an IP being issued, so it is pointing 
  towards something new/changed in 7-STABLE.
 
  I have attached a 3 packet dump of the DISCOVER requests.
 
  Can anybody shed some light on this for me?
 
  Thanks, -Jon
 
  
 
 
 
 




Re: diskless+pxe notes

2008-11-15 Thread Danny Braniss
 Hi,
 i finally decided to try and use pxeboot to replace the etherboot
 method I was using so far for diskless setups.
 
 The goal is to fully share the server's root and /usr directories,
 as documented in diskless(8). I'd like to share the following
 notes, hopefully to go in the manpage.
 
   cheers
   luigi
 
Hi,
With a slightly modified libstand/bootp.c (a PR was sent
way back, but you can check
ftp://ftp.cs.huji.ac.il/users/danny/freebsd/diskless-boot)
you can control the diskless boot options.
by commenting out kernel= in
/boot/defaults/loader.conf
you can set it in dhcpd.conf.
since most of the tags received via dhcp are placed in kenv,
the crucial options are there!

BTW, we use diskless servers/workstations for 90% of our hosts, the
exceptions being:
 - the dhcp/tftp server
 - a 'lagged' server - the router/server get confused :-)
 - our mail servers, there is a bug somewhere, where some critical
   network resources get deadlocked.
 - our development servers.

the / of the diskless is almost identical to the server, but for
many reasons, I like to keep it apart. The trick to overcome the
read-only problem is using unionfs for /etc:
in rc.initdiskless:

if [ -e /conf/union ]; then
    kldload unionfs
    mount_md 4096 /.etc
    mount_unionfs -o transparent /.etc /etc
fi

the /conf is nfs mounted from a central site, the location is passed
via dhcp:

confpath=`kenv conf-path`
if [ -n "$confpath" ] ; then
    if [ `expr "$confpath" : '\(.*\):'` ] ; then
        echo "Mounting $confpath on /conf"
        mount_nfs $confpath /conf
        chkerr $? "mount_nfs $confpath /conf"
        to_umount="${to_umount} $confpath"
    fi
fi
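
The inner test above relies on expr printing the text before the colon only when the value looks like host:/path. A small stand-alone sketch of that check follows; nfs_host is a name invented here, and the server name and paths are placeholders, not values from the real setup.

```sh
#!/bin/sh
# Print the part before the last ':' of a "host:/path" spec, or nothing
# when there is no ':' in it. nfs_host is a name made up for this sketch.
nfs_host() {
    # expr exits non-zero when the match is empty; "|| true" hides that.
    expr "$1" : '\(.*\):' || true
}

# Placeholder specs: one with a host part, one without.
for spec in "nfs-server:/export/conf" "/local/conf"; do
    if [ -n "$(nfs_host "$spec")" ]; then
        echo "$spec: would be NFS-mounted on /conf"
    else
        echo "$spec: no host part, skipped"
    fi
done
```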


the actual rc.conf is configured like this:
eval `kenv | sed -n 's/^rc\.//p'`
rm -f /etc/rc.conf /etc/rc.conf.local
for fc in $conf0 $conf1 $conf2 $conf3 $conf4 $conf5 $conf6 $conf7 $conf8 $conf9 rc.conf.$hostname
do
    ho=`expr $fc : '\(.*\):'`
    fl=`expr $fc : '.*/\(.*\)'`
    if [ "${ho}" != "" ]; then
        mp=`expr $fc : '\(.*\)/.*'`
        mount_nfs $mp /mnt > /dev/null 2>&1
        if [ -f /mnt/$fl ]; then
            echo "# from $fc /mnt/$fl" >> /etc/rc.conf
            cat /mnt/$fl >> /etc/rc.conf
        fi
        umount /mnt > /dev/null 2>&1
    elif [ -e /conf/$fc ] ; then
        echo "# from /conf/$fc" >> /etc/rc.conf
        cat /conf/$fc >> /etc/rc.conf
    fi
done
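
The two building blocks of that loop can be tried without a diskless client. The sketch below feeds simulated kenv output through the same sed expression and splits one spec with the same three expr patterns. The function names, the server name and the file names are all invented for illustration, and it assumes kenv prints one name="value" pair per line.

```sh
#!/bin/sh
# strip_rc_prefix: keep only "rc.NAME=VALUE" lines and drop the "rc." prefix,
# using the same sed expression as the script above.
strip_rc_prefix() {
    sed -n 's/^rc\.//p'
}

# split_spec: break "host:/dir/file" into host, mount point and file name
# with the same three expr patterns used in the loop above.
split_spec() {
    ho=$(expr "$1" : '\(.*\):' || true)
    mp=$(expr "$1" : '\(.*\)/.*' || true)
    fl=$(expr "$1" : '.*/\(.*\)' || true)
    echo "host=$ho mount=$mp file=$fl"
}

# Simulated kenv output; the variable values are invented for illustration.
strip_rc_prefix <<'EOF'
boot_verbose="NO"
rc.conf0="nfs-server:/export/conf/rc.conf.base"
rc.conf1="rc.conf.lab"
EOF

split_spec "nfs-server:/export/conf/rc.conf.base"
```

An entry without a host part, such as rc.conf.lab here, yields an empty host, which is what sends the loop down the /conf/$fc branch.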

 -- root path configuration -
 
 There seems to be a well known problem in pxeloader, see
 kern/106493 , where pxeloader defaults to using a root path of
 /pxeroot when offered / .  The patch suggested in
 
   http://www.freebsd.org/cgi/query-pr.cgi?pr=106493
 
 is trivial and judging from it I believe this is addressing a
 true bug and not a feature.  Fortunately there is a workaround
 (suggested in the PR) which is using // as a root path.
 
 - sharing /boot with the server ---
 
 I believe it is quite useful to share the whole root
 partition between the server and the diskless client.
 This would require at a minimum some conditional code
 in loader.conf (or loader.rc, etc) so that at least you
 point to different kernels.
 
 A minimalistic approach can be adding this line to /boot/loader.conf
 
   bootfile=kernel\\${loaddev};kernel
 
 The variable $loaddev contains the name of the load device,
 which is pxe0 in the case of pxeboot, and disk* in other
 cases when loading from the local disk.
   If you make sure that there is no 'kernel.disk*' on the
 directory, and instead there is a kernel.pxe0 in the same
 directory, then the diskless machines and the server will boot
 from the proper file.
 
 Unfortunately i don't know how to implement a conditional
 in /boot/loader.conf -- otherwise one could do much nicer things
 such as differentiate which modules to load and so on.
 
 --- pxeloader bug in 7.x ---
 Also worth mentioning is an annoying bug in pxeloader as compiled
 on 7.x, see http://www.freebsd.org/cgi/query-pr.cgi?pr=118222
 i.e. the pxeloader in 7.x fails to proceed and prints the message
 "can't figure out which disk we are booting from".
 
 The workaround is to use a pxeloader from FreeBSD 6.
 I guess this is a compiler-related problem (given that 6.x uses gcc 3.4
 as a compiler, while 7.x uses gcc 4.2).
 
 -
 




Re: bad NFS/UDP performance

2008-10-06 Thread Danny Braniss
 
 On Sat, 4 Oct 2008, Danny Braniss wrote:
 
  at the moment, the best I can do is run it on a different hardware that has 
  if_em, the results are in
  ftp://ftp.cs.huji.ac.il/users/danny/lock.prof/7.1-1000.em the 
  benchmark ran better with the Intel NIC, averaged UDP 54MB/s, TCP 53MB/s (I 
  get the same numbers with an older kernel).
 
 Dear Danny:
 
 Unfortunately, I was left slightly unclear on the comparison you are making 
 above.  Could you confirm whether or not, with if_em, you see a performance 
 regression using UDP NFS between 7.0-RELEASE and the most recent 7.1-STABLE, 
 and if you do, whether or not the RLOCK-WLOCK change has any effect on 
 performance?  It would be nice to know on the same hardware but at least with 
 different hardware we get a sense of whether or not this might affect other 
 systems or whether it's limited to a narrower set of configurations.
 
 Thanks,

7.1-1000.em        vanilla 7.1        1 x Intel Core Duo
7.1-1000.x2200.em  vanilla 7.1        2 x Dual-Core AMD Opteron
7.0-1000.x2200.em  7.0 + RLOCK-WLOCK

the plot thickens.
I put an em card in, and the throughput is almost the same as with the bge.

all the tests were done on the same host, a Sun x2200/amd/2cpux2core,
except for the one over the weekend, which was on an Intel Core Duo and not with the same
if_em card; sorry about that, but one has PCI-X, the other PCI Express :-(.

what is becoming obvious is that NFS/UDP is very temperamental/sensitive :-)

danny




Re: bad NFS/UDP performance

2008-10-04 Thread Danny Braniss
 
 On Fri, 3 Oct 2008, Danny Braniss wrote:
 
  On Fri, 3 Oct 2008, Danny Braniss wrote:
 
  gladly, but have no idea how to do LOCK_PROFILING, so some pointers would
  be helpful.
 
  The LOCK_PROFILING(9) man page isn't a bad starting point -- I find that
  the defaults work fine most of the time, so just use them.  Turn the enable
  sysctl on just before you begin a run, and turn it off immediately
  afterwards.  Make sure to reset between reruns (rebooting to a new kernel
  is fine too!).
 
  in ftp://ftp.cs.huji.ac.il/users/danny/lock.prof
  there are 3 files:
  7.1-100   host connected at 100 running -prerelease
  7.1-1000  same but connected at 1000
  7.0-1000  -stable with your 'patch'
  at 100 my benchmark didn't suffer from the profiling, average was about 9.
  at 1000 the benchmark really got hit, average was around 12 for the patched,
  and 4 for the unpatched (less than at 100).
 
 Interesting.  A bit of post-processing:
 
 [EMAIL PROTECTED]:/tmp cat 7.1-1000 | awk -F' ' '{print $3 " " $9}' | sort -n | tail -10
 2413283 /r+d/7/sys/kern/kern_mutex.c:141
 2470096 /r+d/7/sys/nfsclient/nfs_socket.c:1218
 2676282 /r+d/7/sys/net/route.c:293
 2754866 /r+d/7/sys/kern/vfs_bio.c:1468
 3196298 /r+d/7/sys/nfsclient/nfs_bio.c:1664
 3318742 /r+d/7/sys/net/route.c:1584
 3711139 /r+d/7/sys/dev/bge/if_bge.c:3287
 3753518 /r+d/7/sys/net/if_ethersubr.c:405
 3961312 /r+d/7/sys/nfsclient/nfs_subs.c:1066
 10688531 /r+d/7/sys/dev/bge/if_bge.c:3726
 [EMAIL PROTECTED]:/tmp cat 7.0-1000 | awk -F' ' '{print $3 " " $9}' | sort -n | tail -10
 468631 /r+d/hunt/src/sys/nfsclient/nfs_nfsiod.c:286
 501989 /r+d/hunt/src/sys/nfsclient/nfs_vnops.c:1148
 631587 /r+d/hunt/src/sys/nfsclient/nfs_socket.c:1198
 701155 /r+d/hunt/src/sys/nfsclient/nfs_socket.c:1258
 718211 /r+d/hunt/src/sys/kern/kern_mutex.c:141
 1118711 /r+d/hunt/src/sys/nfsclient/nfs_bio.c:1664
 1169125 /r+d/hunt/src/sys/nfsclient/nfs_subs.c:1066
 1222867 /r+d/hunt/src/sys/kern/vfs_bio.c:1468
 3876072 /r+d/hunt/src/sys/netinet/udp_usrreq.c:545
 5198927 /r+d/hunt/src/sys/netinet/udp_usrreq.c:864
 
 The first set above is with the unmodified 7-STABLE tree, the second with a
 reversion of read locking on the UDP inpcb.  The big blinking sign of interest
 is that the bge interface lock is massively contended in the first set of
 output, and basically doesn't appear in the second.  There are various reasons
 bge could stand out quite so much -- one possibility is that previously, the udp
 lock serialized all access to the interface from the send code, preventing the
 send and receive paths from contending.
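
To put a number on how much of the top-ten list belongs to one source file, the post-processed "value location" lines can be summed per pattern. This is a sketch only: sum_matching is a hypothetical helper, and the first column is simply whatever field $3 of the profile output holds, which the thread does not spell out.

```sh
#!/bin/sh
# Sum the first column of "VALUE LOCATION" lines whose location matches a
# pattern. sum_matching is a hypothetical helper, not part of any tool.
sum_matching() {
    awk -v pat="$1" '$2 ~ pat { total += $1 } END { printf "%d\n", total }'
}

# The ten 7.1-1000 entries from the listing above.
sum_matching 'if_bge' <<'EOF'
2413283 /r+d/7/sys/kern/kern_mutex.c:141
2470096 /r+d/7/sys/nfsclient/nfs_socket.c:1218
2676282 /r+d/7/sys/net/route.c:293
2754866 /r+d/7/sys/kern/vfs_bio.c:1468
3196298 /r+d/7/sys/nfsclient/nfs_bio.c:1664
3318742 /r+d/7/sys/net/route.c:1584
3711139 /r+d/7/sys/dev/bge/if_bge.c:3287
3753518 /r+d/7/sys/net/if_ethersubr.c:405
3961312 /r+d/7/sys/nfsclient/nfs_subs.c:1066
10688531 /r+d/7/sys/dev/bge/if_bge.c:3726
EOF
```

For these ten lines the two if_bge.c entries add up to 14399670 out of a total of 38944067, a bit more than a third of the listed values.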
 
 A few things to try:
 
 - Let's compare the context switch rates on the two benchmarks.  Could
   you run vmstat and look at the cpu cs line during the benchmarks and see how
   similar the two are as the benchmarks run?  You'll want to run it with
   vmstat -w 1 and collect several samples per benchmark, since we're really
   interested in the distribution rather than an individual sample.
 
 - Is there any chance you could drop an if_em card into the same box and run
   the identical benchmarks with and without LOCK_PROFILING to see whether it
   behaves differently than bge when the patch is applied?  if_em's interrupt
   handling is quite different, and may significantly affect lock use, and
   hence contention.

at the moment, the best I can do is run it on different hardware that has if_em;
the results are in
ftp://ftp.cs.huji.ac.il/users/danny/lock.prof/7.1-1000.em
the benchmark ran better with the Intel NIC, averaged UDP 54MB/s, TCP 53MB/s
(I get the same numbers with an older kernel).

danny







Re: bad NFS/UDP performance

2008-10-03 Thread Danny Braniss
  it's more difficult than I expected.
  for one, the kernel date was misleading, the actual source update is the key, so
  the window of changes is now 28/July to 19/August. I have the diffs, but nothing
  yet seems relevant.
 
  on the other hand, I tried NFS/TCP, and there things seem ok, ie the 'good'
  and the 'bad' give the same throughput, which seems to point to UDP changes ...
 
 Can you post the network-numbers?
so I ran some more tests, these are for write IO:

server is a NetApp:

kernel from 18/08/08 00:00:00:
                    /---- UDP ----/  /----- TCP -----/
    1*512   38528   0.19s  83.50MB   0.20s  80.82MB/s
    2*512   19264   0.21s  76.83MB   0.21s  77.57MB/s
    4*512    9632   0.19s  85.51MB   0.22s  73.13MB/s
    8*512    4816   0.19s  83.76MB   0.21s  75.84MB/s
   16*512    2408   0.19s  83.99MB   0.21s  77.18MB/s
   32*512    1204   0.19s  84.45MB   0.22s  71.79MB/s
   64*512     602   0.20s  79.98MB   0.20s  78.44MB/s
  128*512     301   0.18s  86.51MB   0.22s  71.53MB/s
  256*512     150   0.19s  82.83MB   0.20s  78.86MB/s
  512*512      75   0.19s  82.77MB   0.21s  76.39MB/s
 1024*512      37   0.19s  85.62MB   0.21s  76.64MB/s
 2048*512      18   0.21s  77.72MB   0.20s  80.30MB/s
 4096*512       9   0.26s  61.06MB   0.30s  53.79MB/s
 8192*512       4   0.83s  19.20MB   0.41s  39.12MB/s
16384*512       2   0.84s  19.01MB   0.41s  39.03MB/s
32768*512       1   0.82s  19.59MB   0.39s  40.89MB/s

kernel from 19/08/08 00:00:00:
    1*512   38528   0.45s  35.59MB   0.20s  81.43MB/s
    2*512   19264   0.45s  35.56MB   0.20s  79.24MB/s
    4*512    9632   0.49s  32.66MB   0.22s  73.72MB/s
    8*512    4816   0.47s  34.06MB   0.21s  75.52MB/s
   16*512    2408   0.53s  30.16MB   0.22s  72.58MB/s
   32*512    1204   0.31s  51.68MB   0.40s  40.14MB/s
   64*512     602   0.43s  37.23MB   0.25s  63.57MB/s
  128*512     301   0.51s  31.39MB   0.26s  62.70MB/s
  256*512     150   0.47s  34.02MB   0.23s  69.06MB/s
  512*512      75   0.47s  34.01MB   0.23s  70.52MB/s
 1024*512      37   0.53s  30.12MB   0.22s  73.01MB/s
 2048*512      18   0.55s  29.07MB   0.23s  70.64MB/s
 4096*512       9   0.46s  34.69MB   0.21s  75.92MB/s
 8192*512       4   0.81s  19.66MB   0.43s  36.89MB/s
16384*512       2   0.80s  19.99MB   0.40s  40.29MB/s
32768*512       1   1.11s  14.41MB   0.38s  42.56MB/s
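
To compare the two kernels with one figure each, the UDP column can be averaged over the rows of a table. A minimal sketch, assuming rows in the layout above; mean_udp is a name invented here, and the sample input is just the first three rows of the 18/08 table.

```sh
#!/bin/sh
# Average the UDP throughput column of benchmark rows read on stdin.
# The column is located by counting from the end of the line, so a row
# still works if its block-size and count columns have run together.
mean_udp() {
    awk 'NF >= 5 { sum += $(NF-2) + 0; n++ }
         END     { if (n) printf "%.2f\n", sum / n }'
}

# First three rows of the 18/08 kernel table.
mean_udp <<'EOF'
    1*512   38528   0.19s  83.50MB   0.20s  80.82MB/s
    2*512   19264   0.21s  76.83MB   0.21s  77.57MB/s
    4*512    9632   0.19s  85.51MB   0.22s  73.13MB/s
EOF
```

Adding 0 to a field such as 83.50MB makes awk use its leading numeric part, so the unit suffix does not need to be stripped first.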






Re: bad NFS/UDP performance

2008-10-03 Thread Danny Braniss
 
 On Fri, 3 Oct 2008, Danny Braniss wrote:
 
  it's more difficult than I expected.
  for one, the kernel date was misleading, the actual source update is the key, so
  the window of changes is now 28/July to 19/August. I have the diffs, but nothing
  yet seems relevant.
 
  on the other hand, I tried NFS/TCP, and there things seem ok, ie the
  'good' and the 'bad' give the same throughput, which seems to point to UDP
  changes ...
 
  Can you post the network-numbers?
  so I ran some more tests, these are for write IO:
 
 OK, so it looks like this was almost certainly the rwlock change.  What 
 happens if you pretty much universally substitute the following in 
 udp_usrreq.c:
 
 Currently          Change to
 ---------          ---------
 INP_RLOCK          INP_WLOCK
 INP_RUNLOCK        INP_WUNLOCK
 INP_RLOCK_ASSERT   INP_WLOCK_ASSERT
 

I guess you were almost certainly correct :-)
I did the global subst. on the udp_usrreq.c from 19/08,
__FBSDID("$FreeBSD: src/sys/netinet/udp_usrreq.c,v 1.218.2.3 2008/08/18 23:00:41 bz Exp $");
and now udp is fine again!
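
For anyone wanting to repeat the experiment, the global substitution can be written as a small sed filter. This is a sketch of one way to do it, not necessarily how it was done here; rlock_to_wlock is an invented name, and the sample lines are illustrative rather than copied from udp_usrreq.c.

```sh
#!/bin/sh
# Rewrite read-lock macros to their write-lock counterparts on stdin.
# INP_RLOCK_ASSERT is covered by the first expression because it begins
# with INP_RLOCK.
rlock_to_wlock() {
    sed -e 's/INP_RLOCK/INP_WLOCK/g' -e 's/INP_RUNLOCK/INP_WUNLOCK/g'
}

# Illustrative input lines, one per macro.
printf '%s\n' 'INP_RLOCK(inp);' 'INP_RLOCK_ASSERT(inp);' 'INP_RUNLOCK(inp);' | rlock_to_wlock
```

Names that merely end in RLOCK, such as an INP_INFO_RLOCK if the file has one, are left alone because they do not contain the string INP_RLOCK.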

danny


 Robert N M Watson
 Computer Laboratory
 University of Cambridge
 
 
  server is a NetApp:
 
  kernel from 18/08/08 00:00:00:
                      /---- UDP ----/  /----- TCP -----/
      1*512   38528   0.19s  83.50MB   0.20s  80.82MB/s
      2*512   19264   0.21s  76.83MB   0.21s  77.57MB/s
      4*512    9632   0.19s  85.51MB   0.22s  73.13MB/s
      8*512    4816   0.19s  83.76MB   0.21s  75.84MB/s
     16*512    2408   0.19s  83.99MB   0.21s  77.18MB/s
     32*512    1204   0.19s  84.45MB   0.22s  71.79MB/s
     64*512     602   0.20s  79.98MB   0.20s  78.44MB/s
    128*512     301   0.18s  86.51MB   0.22s  71.53MB/s
    256*512     150   0.19s  82.83MB   0.20s  78.86MB/s
    512*512      75   0.19s  82.77MB   0.21s  76.39MB/s
   1024*512      37   0.19s  85.62MB   0.21s  76.64MB/s
   2048*512      18   0.21s  77.72MB   0.20s  80.30MB/s
   4096*512       9   0.26s  61.06MB   0.30s  53.79MB/s
   8192*512       4   0.83s  19.20MB   0.41s  39.12MB/s
  16384*512       2   0.84s  19.01MB   0.41s  39.03MB/s
  32768*512       1   0.82s  19.59MB   0.39s  40.89MB/s
 
  kernel from 19/08/08 00:00:00:
      1*512   38528   0.45s  35.59MB   0.20s  81.43MB/s
      2*512   19264   0.45s  35.56MB   0.20s  79.24MB/s
      4*512    9632   0.49s  32.66MB   0.22s  73.72MB/s
      8*512    4816   0.47s  34.06MB   0.21s  75.52MB/s
     16*512    2408   0.53s  30.16MB   0.22s  72.58MB/s
     32*512    1204   0.31s  51.68MB   0.40s  40.14MB/s
     64*512     602   0.43s  37.23MB   0.25s  63.57MB/s
    128*512     301   0.51s  31.39MB   0.26s  62.70MB/s
    256*512     150   0.47s  34.02MB   0.23s  69.06MB/s
    512*512      75   0.47s  34.01MB   0.23s  70.52MB/s
   1024*512      37   0.53s  30.12MB   0.22s  73.01MB/s
   2048*512      18   0.55s  29.07MB   0.23s  70.64MB/s
   4096*512       9   0.46s  34.69MB   0.21s  75.92MB/s
   8192*512       4   0.81s  19.66MB   0.43s  36.89MB/s
  16384*512       2   0.80s  19.99MB   0.40s  40.29MB/s
  32768*512       1   1.11s  14.41MB   0.38s  42.56MB/s
 
 
 
 
 




Re: bad NFS/UDP performance

2008-10-03 Thread Danny Braniss
forget about LOCK_PROFILING, I'm RTFM now :-)
though some hints on values might be helpful.

have a nice weekend,
danny




Re: bad NFS/UDP performance

2008-10-03 Thread Danny Braniss
 
 On Fri, 3 Oct 2008, Danny Braniss wrote:
 
  OK, so it looks like this was almost certainly the rwlock change.  What 
  happens if you pretty much universally substitute the following in 
  udp_usrreq.c:
 
  Currently          Change to
  ---------          ---------
  INP_RLOCK          INP_WLOCK
  INP_RUNLOCK        INP_WUNLOCK
  INP_RLOCK_ASSERT   INP_WLOCK_ASSERT
 
  I guess you were almost certainly correct :-) I did the global subst. on the
  udp_usrreq.c from 19/08, __FBSDID("$FreeBSD: src/sys/netinet/udp_usrreq.c,v
  1.218.2.3 2008/08/18 23:00:41 bz Exp $"); and now udp is fine again!
 
 OK.  This is a change I'd rather not back out since it significantly improves 
 performance for many other UDP workloads, so we need to figure out why it's 
 hurting us so much here so that we know if there are reasonable alternatives.
 
 Would it be possible for you to do a run of the workload with both kernels 
 using LOCK_PROFILING around the benchmark, and then we can compare lock 
 contention in the two cases?  What we often find is that relieving contention 
 at one point causes new contention at another point, and if the primitive used
 at that point handles contention less well for whatever reason, performance
 can be reduced rather than improved.  So maybe we're looking at an issue in
 the dispatched UDP code from so_upcall?  Another less satisfying (and 
 fundamentally more difficult) answer might be something to do with the 
 scheduler, but a bit more analysis may shed some light.

gladly, but have no idea how to do LOCK_PROFILING, so some pointers would be
helpful.

as a side note, many years ago I checked out NFS/TCP and it was really bad,
I even remember NetApp telling us to drop TCP, but now, things look rather
better. Wonder what caused it.

danny
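
For anyone wanting to repeat the global substitution described in the quoted
message, a minimal sketch follows. It is written in Python rather than
whatever editor or sed command was actually used, and the function name
read_to_write_locks is made up for illustration; only the three macro pairs
come from the thread. Matching whole identifiers matters here, because
INP_RLOCK is a prefix of INP_RLOCK_ASSERT.

```python
import re

# Macro pairs taken from the table in the quoted message.
SUBSTITUTIONS = {
    "INP_RLOCK": "INP_WLOCK",
    "INP_RUNLOCK": "INP_WUNLOCK",
    "INP_RLOCK_ASSERT": "INP_WLOCK_ASSERT",
}

# \b on both sides restricts matches to whole identifiers, so INP_RLOCK
# never matches inside INP_RLOCK_ASSERT. Longest names are listed first.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(SUBSTITUTIONS, key=len, reverse=True)) + r")\b"
)


def read_to_write_locks(source: str) -> str:
    """Return source text with the read-lock macros swapped for write-lock ones."""
    return _PATTERN.sub(lambda m: SUBSTITUTIONS[m.group(1)], source)
```

Reading udp_usrreq.c, passing its text through the function and writing the
result back should reproduce the experiment; other macros, INP_INFO_RLOCK for
example, are left untouched.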




Re: bad NFS/UDP performance

2008-10-03 Thread Danny Braniss
 
 On Fri, 3 Oct 2008, Danny Braniss wrote:
 
  gladly, but have no idea how to do LOCK_PROFILING, so some pointers would be
  helpful.
 
 The LOCK_PROFILING(9) man page isn't a bad starting point -- I find that the 
 defaults work fine most of the time, so just use them.  Turn the enable sysctl
 on just before you begin a run, and turn it off immediately afterwards.  Make 
 sure to reset between reruns (rebooting to a new kernel is fine too!).
 
in ftp://ftp.cs.huji.ac.il/users/danny/lock.prof
there are 3 files:
    7.1-100     host connected at 100 running -prerelease
    7.1-1000    same but connected at 1000
    7.0-1000    -stable with your 'patch'
at 100 my benchmark didn't suffer from the profiling, average was about 9.
at 1000 the benchmark got really hit, average was around 12 for the patched,
and 4 for the unpatched (less than at 100).

danny
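
The procedure from the quoted message (reset, enable just before the run,
disable immediately afterwards) can be wrapped so that the profiling window
stays tight around the benchmark. This is a rough sketch only: it assumes a
kernel built with options LOCK_PROFILING and the sysctl names
debug.lock.prof.reset and debug.lock.prof.enable as I remember them from
LOCK_PROFILING(9), so check the man page on your release. The helper name
lock_profiling and its injectable run parameter are invented for this example.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def lock_profiling(run=subprocess.run):
    """Profile locks only while the body of the with-block executes.

    Assumed sysctl names (verify against LOCK_PROFILING(9)):
    debug.lock.prof.reset and debug.lock.prof.enable.
    """
    # Clear counters left over from an earlier run, then start collecting.
    run(["sysctl", "debug.lock.prof.reset=1"], check=True)
    run(["sysctl", "debug.lock.prof.enable=1"], check=True)
    try:
        yield
    finally:
        # Stop collecting even if the benchmark raised an exception.
        run(["sysctl", "debug.lock.prof.enable=0"], check=True)
```

The benchmark would then go inside a "with lock_profiling():" block, and the
collected numbers can be read afterwards (debug.lock.prof.stats, again going
by the man page) and compared between the two kernels.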




Re: bad NFS/UDP performance

2008-09-29 Thread Danny Braniss
  On Fri, 26 Sep 2008, Danny Braniss wrote:
  
   after more testing, it seems it's related to changes made between Aug 4 and
   Aug 29 ie, a kernel built on Aug 4 works fine, Aug 29 is slow. I'll now try
   and close the gap.
  
  I think this is the best way forward -- skimming August changes, there are a
  number of candidate commits, including retuning of UDP hashes by mav, my
  rwlock changes, changes to mbuf chain handling, etc.
 
 it's more difficult than I expected.
 for one, the kernel date was misleading, the actual source update is the key,
 so the window of changes is now 28/July to 19/August. I have the diffs, but
 nothing yet seems relevant.
 
 on the other hand, I tried NFS/TCP, and there things seem ok, ie the 'good'
 and the 'bad' give the same throughput, which seems to point to UDP changes ...
 
 danny

Grr, there goes the binary search theory out of the window.
So far I have managed to pinpoint the day that the changes affect the
throughput:
    18/08/08 00:00:00   19/08/08 00:00:00
(I assume cvs's date is GMT).
now would be a good time for some help, especially how to undo changes, my
knowledge of csup/cvs is close to zero.

danny
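
The date narrowing attempted in this thread amounts to a bisection over source
checkout dates. A minimal sketch of that idea, assuming the regression appears
on one day and stays afterwards; the function first_bad_day and the is_bad
callback are hypothetical, and in practice is_bad would mean "check out the
sources as of that date, build a kernel, run the NFS/UDP benchmark":

```python
from datetime import date, timedelta
from typing import Callable


def first_bad_day(good: date, bad: date, is_bad: Callable[[date], bool]) -> date:
    """Return the earliest day in (good, bad] whose sources show the regression.

    Requires is_bad(good) to be False and is_bad(bad) to be True, and assumes
    every day after the first bad one is bad as well.
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid   # regression already present: look earlier
        else:
            good = mid  # still fast: look later
    return bad
```

With the 28/July to 19/August window mentioned above this needs about five
build-and-test rounds rather than one per day.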




Re: bad NFS/UDP performance

2008-09-29 Thread Danny Braniss
  it's more difficult than I expected.
  for one, the kernel date was misleading, the actual source update is the key,
  so the window of changes is now 28/July to 19/August. I have the diffs, but
  nothing yet seems relevant.
 
  on the other hand, I tried NFS/TCP, and there things seem ok, ie the 'good'
  and the 'bad' give the same throughput, which seems to point to UDP changes ...
 
 Can you post the network-numbers?
[again :-]
  Writing 16 MB file
         BS   Count   /---- 7.0 -----/   /---- 7.1 -----/

should now read:
                      /--- Aug 18 ---/   /--- Aug 19 ---/
      1*512   32768   0.16s  98.11MB/s   0.43s  37.18MB/s
      2*512   16384   0.17s  92.04MB/s   0.46s  34.79MB/s
      4*512    8192   0.16s 101.88MB/s   0.43s  37.26MB/s
      8*512    4096   0.16s  99.86MB/s   0.44s  36.41MB/s
     16*512    2048   0.16s 100.11MB/s   0.50s  32.03MB/s
     32*512    1024   0.26s  61.71MB/s   0.46s  34.79MB/s
     64*512     512   0.22s  71.45MB/s   0.45s  35.41MB/s
    128*512     256   0.21s  77.84MB/s   0.51s  31.34MB/s
    256*512     128   0.19s  82.47MB/s   0.43s  37.22MB/s
    512*512      64   0.18s  87.77MB/s   0.49s  32.69MB/s
   1024*512      32   0.18s  89.24MB/s   0.47s  34.02MB/s
   2048*512      16   0.17s  91.81MB/s   0.30s  53.41MB/s
   4096*512       8   0.16s 100.56MB/s   0.42s  38.07MB/s
   8192*512       4   0.82s  19.56MB/s   0.80s  19.95MB/s
  16384*512       2   0.82s  19.63MB/s   0.95s  16.80MB/s
  32768*512       1   0.81s  19.69MB/s   0.96s  16.64MB/s

  Average:                   75.86              33.00




Re: bad NFS/UDP performance

2008-09-27 Thread Danny Braniss
 
 :-vfs.nfs.realign_test: 22141777
 :+vfs.nfs.realign_test: 498351
 : 
 :-vfs.nfsrv.realign_test: 5005908
 :+vfs.nfsrv.realign_test: 0
 : 
 :+vfs.nfsrv.commit_miss: 0
 :+vfs.nfsrv.commit_blks: 0
 : 
 : changing them did nothing - or at least with respect to nfs throughput :-)
 :
 :I'm not sure what any of these do, as NFS is a bit out of my league.
 ::-)  I'll be following this thread though!
 :
 :-- 
 :| Jeremy Chadwick                                jdc at parodius.com |
 
 A non-zero nfs_realign_count is bad, it means NFS had to copy the
 mbuf chain to fix the alignment.  nfs_realign_test is just the
 number of times it checked.  So nfs_realign_test is irrelevant.
 it's nfs_realign_count that matters.
 
it's zero, so I guess I'm ok there.
funny though, on my 'good' machine, vfs.nfsrv.realign_test: 5862999
and on the slow one, it's 0 - but then again the good one has been up
for several days.

 Several things can cause NFS payloads to be improperly aligned.
 Anything from older network drivers which can't start DMA on a 
 2-byte boundary, resulting in the 14-byte encapsulation header 
 causing improper alignment of the IP header & payload, to rpc
 embedded in NFS TCP streams winding up being misaligned.
 
 Modern network hardware either support 2-byte-aligned DMA, allowing
 the encapsulation to be 2-byte aligned so the payload winds up being
 4-byte aligned, or support DMA chaining allowing the payload to be
 placed in its own mbuf, or pad, etc.
 
 --
 
 One thing I would check is to be sure a couple of nfsiod's are running
 on the client when doing your tests.  If none are running the RPCs wind
 up being more synchronous and less pipelined.  Another thing I would
 check is IP fragment reassembly statistics (for UDP) - there should be
 none for TCP connections no matter what the NFS I/O size selected.
 
ahh, nfsiod, it seems that it's now dynamically started! at least none show
when the host is idle, after i run my tests there are 20! with ppid 0
need to refresh my NFS knowledge.
how can I see the IP fragment reassembly statistics?

 (It does seem more likely to be scheduler-related, though).
 

tend to agree, I tried both ULE/BSD, but the badness is there.

   -Matt
 

thanks,
danny




Re: bad NFS/UDP performance

2008-09-27 Thread Danny Braniss
 
 David,
 
 You beat me to it.
 
 Danny, read the iperf man page:
-b, --bandwidth n[KM]
   set  target  bandwidth to n bits/sec (default 1 Mbit/sec).  This
   setting requires UDP (-u).
 
 The page needs updating, though. It should read -b, --bandwidth
 n[KMG]. It also does NOT require -u. If you use -b, UDP is assumed.

I did RTFM(*), but when i tried it just wouldn't work, I tried today
and it's actually working - so don't RTFM before coffee!
btw, even though iperf sucks, netperf udp tends to bring the server down
to its knees.

danny
PS: * - i don't seem to have the iperf man, all I have is iperf -h




Re: bad NFS/UDP performance

2008-09-27 Thread Danny Braniss
 On Fri, 26 Sep 2008, Danny Braniss wrote:
 
  after more testing, it seems it's related to changes made between Aug 4 and 
  Aug 29 ie, a kernel built on Aug 4 works fine, Aug 29 is slow. I'll now try 
  and close the gap.
 
 I think this is the best way forward -- skimming August changes, there are a 
 number of candidate commits, including retuning of UDP hashes by mav, my 
 rwlock changes, changes to mbuf chain handling, etc.

it's more difficult than I expected.
for one, the kernel date was misleading, the actual source update is the key,
so the window of changes is now 28/July to 19/August. I have the diffs, but
nothing yet seems relevant.

on the other hand, I tried NFS/TCP, and there things seem ok, ie the 'good' and
the 'bad' give the same throughput, which seems to point to UDP changes ...

danny




bad NFS/UDP performance

2008-09-26 Thread Danny Braniss
Hi,
There seems to be some serious degradation in performance.
Under 7.0 I get about 90 MB/s (on write), while, on the same machine
under 7.1 it drops to 20!
Any ideas?

thanks,
danny




Re: bad NFS/UDP performance

2008-09-26 Thread Danny Braniss
 On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:
  Hi,
  There seems to be some serious degradation in performance.
  Under 7.0 I get about 90 MB/s (on write), while, on the same machine
  under 7.1 it drops to 20!
  Any ideas?
 
 1) Network card driver changes,
could be, but at least iperf/tcp is ok - can't get udp numbers, do you
know of any tool to measure udp performance?
BTW, I also checked on different hardware, and the badness is there.
 
 2) This could be relevant, but rwatson@ will need to help determine
that.

 http://lists.freebsd.org/pipermail/freebsd-stable/2008-September/045109.html

gut feeling is that it's somewhere else:


Writing 16 MB file
         BS   Count   /---- 7.0 -----/   /---- 7.1 -----/
      1*512   32768   0.16s  98.11MB/s   0.43s  37.18MB/s
      2*512   16384   0.17s  92.04MB/s   0.46s  34.79MB/s
      4*512    8192   0.16s 101.88MB/s   0.43s  37.26MB/s
      8*512    4096   0.16s  99.86MB/s   0.44s  36.41MB/s
     16*512    2048   0.16s 100.11MB/s   0.50s  32.03MB/s
     32*512    1024   0.26s  61.71MB/s   0.46s  34.79MB/s
     64*512     512   0.22s  71.45MB/s   0.45s  35.41MB/s
    128*512     256   0.21s  77.84MB/s   0.51s  31.34MB/s
    256*512     128   0.19s  82.47MB/s   0.43s  37.22MB/s
    512*512      64   0.18s  87.77MB/s   0.49s  32.69MB/s
   1024*512      32   0.18s  89.24MB/s   0.47s  34.02MB/s
   2048*512      16   0.17s  91.81MB/s   0.30s  53.41MB/s
   4096*512       8   0.16s 100.56MB/s   0.42s  38.07MB/s
   8192*512       4   0.82s  19.56MB/s   0.80s  19.95MB/s
  16384*512       2   0.82s  19.63MB/s   0.95s  16.80MB/s
  32768*512       1   0.81s  19.69MB/s   0.96s  16.64MB/s

Average:                     75.86              33.00

the nfs filer is a Network Appliance, and is in use, so i get fluctuations in
the measurements, but the relations are similar: good on 7.0, bad on 7.1

Cheers,
danny
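
As a cross-check of the table above, the MB/s figures are the 16 MB file size
divided by the elapsed time, and the Average row is the plain mean of each
MB/s column. A small sketch, with the two columns copied from the table; the
helper names are made up for this example, and since the printed times are
rounded to 0.01s, recomputed rates only approximate the printed ones (16 MB in
0.16s gives 100 MB/s, where the table shows 98.11 to 101.88).

```python
FILE_MB = 16  # "Writing 16 MB file"

# MB/s columns copied from the table, rows 1*512 down to 32768*512.
RATES_70 = [98.11, 92.04, 101.88, 99.86, 100.11, 61.71, 71.45, 77.84,
            82.47, 87.77, 89.24, 91.81, 100.56, 19.56, 19.63, 19.69]
RATES_71 = [37.18, 34.79, 37.26, 36.41, 32.03, 34.79, 35.41, 31.34,
            37.22, 32.69, 34.02, 53.41, 38.07, 19.95, 16.80, 16.64]


def throughput_mb_s(seconds, file_mb=FILE_MB):
    """Throughput in MB/s for writing file_mb megabytes in the given time."""
    return file_mb / seconds


def mean(values):
    """Arithmetic mean, as used for the Average row."""
    return sum(values) / len(values)
```

The means come out at 75.86 and 33.00, matching the Average row, so 7.1 runs
at well under half the 7.0 write throughput on this test.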
 



Re: bad NFS/UDP performance

2008-09-26 Thread Danny Braniss
 On Fri, Sep 26, 2008 at 12:27:08PM +0300, Danny Braniss wrote:
   On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:
Hi,
There seems to be some serious degradation in performance.
Under 7.0 I get about 90 MB/s (on write), while, on the same machine
under 7.1 it drops to 20!
Any ideas?
   
   1) Network card driver changes,
  could be, but at least iperf/tcp is ok - can't get udp numbers, do you
  know of any tool to measure udp performance?
  BTW, I also checked on different hardware, and the badness is there.
 
 According to INDEX, benchmarks/iperf does UDP bandwidth testing.

I know, but I get about 1mgb, which seems somewhat low :-(

 
 benchmarks/nttcp should as well.
 
 What network card is in use?  If Intel, what driver version (should be
 in dmesg).

bge: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x9003 
and
bce: Broadcom NetXtreme II BCM5708 1000Base-T (B2)
and intels, but haven't tested there yet.

 
   2) This could be relevant, but rwatson@ will need to help determine
  that.
  
   http://lists.freebsd.org/pipermail/freebsd-stable/2008-September/045109.html
  
  gut feeling is that it's somewhere else:
  
  Writing 16 MB file
         BS   Count   /---- 7.0 -----/   /---- 7.1 -----/
      1*512   32768   0.16s  98.11MB/s   0.43s  37.18MB/s
      2*512   16384   0.17s  92.04MB/s   0.46s  34.79MB/s
      4*512    8192   0.16s 101.88MB/s   0.43s  37.26MB/s
      8*512    4096   0.16s  99.86MB/s   0.44s  36.41MB/s
     16*512    2048   0.16s 100.11MB/s   0.50s  32.03MB/s
     32*512    1024   0.26s  61.71MB/s   0.46s  34.79MB/s
     64*512     512   0.22s  71.45MB/s   0.45s  35.41MB/s
    128*512     256   0.21s  77.84MB/s   0.51s  31.34MB/s
    256*512     128   0.19s  82.47MB/s   0.43s  37.22MB/s
    512*512      64   0.18s  87.77MB/s   0.49s  32.69MB/s
   1024*512      32   0.18s  89.24MB/s   0.47s  34.02MB/s
   2048*512      16   0.17s  91.81MB/s   0.30s  53.41MB/s
   4096*512       8   0.16s 100.56MB/s   0.42s  38.07MB/s
   8192*512       4   0.82s  19.56MB/s   0.80s  19.95MB/s
  16384*512       2   0.82s  19.63MB/s   0.95s  16.80MB/s
  32768*512       1   0.81s  19.69MB/s   0.96s  16.64MB/s
  
  Average:                   75.86              33.00
  
  the nfs filer is a NetWork Appliance, and is in use, so i get fluctuations in
  the measurements, but the relation are similar, good on 7.0, bad on 7.1
 
 Do you have any NFS-related tunings in /etc/rc.conf or /etc/sysctl.conf?
 
no, but diffing the sysctl show:

-vfs.nfs.realign_test: 22141777
+vfs.nfs.realign_test: 498351

-vfs.nfsrv.realign_test: 5005908
+vfs.nfsrv.realign_test: 0

+vfs.nfsrv.commit_miss: 0
+vfs.nfsrv.commit_blks: 0

changing them did nothing - or at least with respect to nfs throughput :-)

danny




Re: bad NFS/UDP performance

2008-09-26 Thread Danny Braniss
 On Fri, 2008-09-26 at 10:04 +0300, Danny Braniss wrote:
  Hi,
  There seems to be some serious degradation in performance.
  Under 7.0 I get about 90 MB/s (on write), while, on the same machine
  under 7.1 it drops to 20!
  Any ideas?
 
 The scheduler has been changed to ULE, and NFS has historically been
 very sensitive to changes like that.  You could try switching back to
 the 4BSD scheduler and seeing if that makes a difference.  If it does,
 toggling PREEMPTION would also be interesting to see the results of.
 
 Gavin

I'm testing 7.0-stable vs 7.1-prerelease, and both have ULE.
BTW, the nfs client hosts I'm testing are idle.

danny




Re: bad NFS/UDP performance

2008-09-26 Thread Danny Braniss
 On Fri, Sep 26, 2008 at 12:27:08PM +0300, Danny Braniss wrote:
   On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:
Hi,
There seems to be some serious degradation in performance.
Under 7.0 I get about 90 MB/s (on write), while, on the same machine
under 7.1 it drops to 20!
Any ideas?
   
   1) Network card driver changes,
  could be, but at least iperf/tcp is ok - can't get udp numbers, do you
  know of any tool to measure udp performance?
  BTW, I also checked on different hardware, and the badness is there.
 
 According to INDEX, benchmarks/iperf does UDP bandwidth testing.
 
 benchmarks/nttcp should as well.
 
 What network card is in use?  If Intel, what driver version (should be
 in dmesg).
 
   2) This could be relevant, but rwatson@ will need to help determine
  that.
  
   http://lists.freebsd.org/pipermail/freebsd-stable/2008-September/045109.html
  
  gut feeling is that it's somewhere else:
  
  Writing 16 MB file
         BS   Count   /---- 7.0 -----/   /---- 7.1 -----/
      1*512   32768   0.16s  98.11MB/s   0.43s  37.18MB/s
      2*512   16384   0.17s  92.04MB/s   0.46s  34.79MB/s
      4*512    8192   0.16s 101.88MB/s   0.43s  37.26MB/s
      8*512    4096   0.16s  99.86MB/s   0.44s  36.41MB/s
     16*512    2048   0.16s 100.11MB/s   0.50s  32.03MB/s
     32*512    1024   0.26s  61.71MB/s   0.46s  34.79MB/s
     64*512     512   0.22s  71.45MB/s   0.45s  35.41MB/s
    128*512     256   0.21s  77.84MB/s   0.51s  31.34MB/s
    256*512     128   0.19s  82.47MB/s   0.43s  37.22MB/s
    512*512      64   0.18s  87.77MB/s   0.49s  32.69MB/s
   1024*512      32   0.18s  89.24MB/s   0.47s  34.02MB/s
   2048*512      16   0.17s  91.81MB/s   0.30s  53.41MB/s
   4096*512       8   0.16s 100.56MB/s   0.42s  38.07MB/s
   8192*512       4   0.82s  19.56MB/s   0.80s  19.95MB/s
  16384*512       2   0.82s  19.63MB/s   0.95s  16.80MB/s
  32768*512       1   0.81s  19.69MB/s   0.96s  16.64MB/s
  
  Average:                   75.86              33.00
  
  the nfs filer is a NetWork Appliance, and is in use, so i get fluctuations in
  the measurements, but the relation are similar, good on 7.0, bad on 7.1
 
 Do you have any NFS-related tunings in /etc/rc.conf or /etc/sysctl.conf?
 

after more testing, it seems it's related to changes made between Aug 4 and
Aug 29, ie, a kernel built on Aug 4 works fine, Aug 29 is slow.
I'll now try and close the gap.

danny




Re: RELENG_7 hangs on boot w/Gigabyte MA78GM-S2H MB

2008-09-20 Thread Danny Braniss
 On Sat, Sep 20, 2008 at 08:05:33AM -0700, Jeremy Chadwick wrote:
  On Sat, Sep 20, 2008 at 09:45:10AM -0500, Bob Willcox wrote:
   On Sat, Sep 20, 2008 at 07:04:56AM -0700, Jeremy Chadwick wrote:
On Sat, Sep 20, 2008 at 08:24:29AM -0500, Bob Willcox wrote:
  1) It would be helpful to know if you installed i386 or amd64 FreeBSD,
 
 This is amd64 on this particular machine.
 
  2) With regards to the lock-up after mount root, if you press NumLock
  or CapsLock, do the keyboard LEDs turn on/off?
 
 Nope, no keys do anything. You must either push reset or pull the plug.

Is it possible to get the output when booting in verbose mode?  If not,
what are the last few lines before the machine locks up when booting
verbosely?
   
   Yep, just did that. The last things printed right before hang are:
   
   ioapic0: Assigning ISA IRQ 1 to local APIC 0
   ioapic0: Assigning ISA IRQ 4 to local APIC 1
   ioapic0: Assigning ISA IRQ 6 to local APIC 2
   ioapic0: Assigning ISA IRQ 7 to local APIC 0
   ioapic0: Assigning ISA IRQ 9 to local APIC 1
   ioapic0: Assigning ISA IRQ 12 to local APIC 2
   ioapic0: Assigning ISA IRQ 14 to local APIC 0
   ioapic0: Assigning ISA PCI 16 to local APIC 1
   ioapic0: Assigning ISA PCI 17 to local APIC 2
   ioapic0: Assigning ISA PCI 18 to local APIC 0
   ioapic0: Assigning ISA PCI 19 to local APIC 1
   ioapic0: Assigning ISA PCI 22 to local APIC 2
   trying to mount root from ufs:/dev/ad4s1a
   start_init: trying /sbin/init
   [hung at this point]
   
  3) Many others have seen the hanging/lock-up after mount root.  I
  believe one found a workaround by setting ATA_STATIC_ID in their kernel
  configuration.  I realise this is a problem when you can't get the
  system up to a point of building a kernel; chicken-and-egg problem,
 
 Well, I can build a kernel if I run the 7.0-release kernel. That's how I
 got to 7-stable on the machine in the first place. I used sneaker net
 to copy it to this one via a CD (as I mentioned, the 7.0 kernel boots
 but the Realtek ethernet device is not recognized).

So the problem is that 7.0-RELEASE works fine for you, but after
upgrading your RELENG_7 source (to what is now 7.1-BETA), the machine
hangs after printing the mount root message.  Is this correct?
   
   Yes, that is pretty much it. The Realtek ethernet isn't working in
   7.0-RELEASE either, but I'm guessing that that is a different (and less
   serious) problem related to changes in that device.
   
Here's another question: does booting into single-user exhibit the same
problem as multi-user?
   
   It looks like when I try a single-user mode (and verbose) boot the only
   difference is that the last line shown above (the start_init line) isn't
   printed. Otherwise, the hang is the same.
   
  4) The Realtek NIC on that motherboard is probably too new to be
  supported under RELENG_7.  Realtek has a history of releasing different
  sub-revisions of the same NIC/PHY, and the internal changes are severe
  enough to cause the NIC to not work correctly (under any OS) without
  full driver support for that specific sub-revision.
 
 That's what I suspected. The values displayed when doing a pciconf -lv
 are similar as for this system I'm using to type this, but now that
 I look closer and make a direct comparison, the failing device has a
 rev=0x02 vs. rev=0x01 for the working one. The pciconf -lv output for
 the failing mb is:
 
 [EMAIL PROTECTED]:2:0:0: class=0x02 card=0xe0001458 chip=0x816810ec rev=0x02 hdr=0x00
 vendor = 'Realtek Semiconductor'
 device = 'RTL8168/8111 PCI-E Gigabit Ethernet NIC'
 class  = network
 subclass   = ethernet

Regarding the Realtek issue: I've CC'd PYUN Yong-Hyeon (surname in
caps), who maintains the re(4) driver for FreeBSD.  He might have a
patch available for you to try, or help determine how to get this NIC
working on FreeBSD.  He'll probably need more than just pciconf -lv
output, but should be able to work with you.
   
   Ok, that'd be great. I must say that I'm close to simply returning this
   MB and going with something not quite so new that is more likely to
   work. I was hoping to get this system up and running this weekend. :(
  
  I wish I knew what was causing the lock-up for you.  I'm truly baffled,
  especially given that the system is able to boot + find the kernel +
  load kernel modules.  Debugging this problem is out of field; jhb@ might
  have some ideas, as I'm not sure what magic happens immediately before
  the root filesystem is mounted.
  
  Those debugging/helping may want disklabel -r -A ad4s1 output.  At
  least you can boot 7.0-RELEASE to get that information.
  
  Regarding hardware:
  
  I myself purchased an Asus P5Q SE board, with an 

Re: bin/121684: : dump(8) frequently hangs

2008-09-03 Thread Danny Braniss
 Danny,
 
 Thanks for the suggestion, but my system is a P-III so there is only one
 CPU. At 1GHz, I think that this easily qualifies as an older, slower,
 non-smp host.

I just tried it on a GEODE/7.0 stable - slower than a P-III :-), and
dump went smoothly so, to help isolate the problem, if it still happens to you,
let me know what os/version - I can probably find a P-III around here.

danny
 



Re: bin/121684: : dump(8) frequently hangs

2008-09-02 Thread Danny Braniss
take a look at:
http://www.freebsd.org/cgi/query-pr.cgi?pr=117603
danny




Re: Using iscsi with multiple targets

2008-07-14 Thread Danny Braniss
 FreeBSD 7.0
 
 I have 2 machines with identical configurations/hardware, let's call them A
 (master) and B (slave). I have installed iscsi-target from ports and have set
 up 3 targets representing the 3 drives I wish to be connected to from A.
 
 The Targets file:
 # extents   file        start   length
 extent0     /dev/da1    0       465GB
 extent1     /dev/da2    0       465GB
 extent2     /dev/da3    0       465GB
 
 # target    flags   storage     netmask
 target0     rw      extent0     192.168.0.1/24
 target1     rw      extent1     192.168.0.1/24
 target2     rw      extent2     192.168.0.1/24
 
 I then start up iscsi_target and all is good.
 
 Now on A I have set up my /etc/iscsi.conf file as follows:
 
 # cat /etc/iscsi.conf
 data1 {
  targetaddress=192.168.0.252
  targetname=iqn.1994-04.org.netbsd.iscsi-target:target0
  initiatorname=iqn.2005-01.il.ac.huji.cs::BSD-2-1.sven.local
 }
 data2 {
  targetaddress=192.168.0.252
  targetname=iqn.1994-04.org.netbsd.iscsi-target:target1
  initiatorname=iqn.2005-01.il.ac.huji.cs::BSD-2-1.sven.local
 }
 data3 {
  targetaddress=192.168.0.252
  targetname=iqn.1994-04.org.netbsd.iscsi-target:target2
  initiatorname=iqn.2005-01.il.ac.huji.cs::BSD-2-1.sven.local
 }
 
 So far so good, now come the issues. First of all, it would appear that with
 iscontrol one can only start one named session at a time; for example
 /sbin/iscontrol -n data1
 /sbin/iscontrol -n data2
 /sbin/iscontrol -n data3
 
 I guess that is ok, except that each invocation of iscontrol resets the other
 sessions. Here is the camcontrol and dmesg output from running the above 3
 commands.
 
 # camcontrol devlist
 <AMCC 9550SXU-8L DISK 3.08>        at scbus0 target 0 lun 0 (pass0,da0)
 <AMCC 9550SXU-8L DISK 3.08>        at scbus0 target 1 lun 0 (pass1,da1)
 <AMCC 9550SXU-8L DISK 3.08>        at scbus0 target 2 lun 0 (pass2,da2)
 <AMCC 9550SXU-8L DISK 3.08>        at scbus0 target 3 lun 0 (pass3,da3)
 <NetBSD NetBSD iSCSI 0>            at scbus1 target 0 lun 0 (da5,pass5)
 <NetBSD NetBSD iSCSI 0>            at scbus1 target 1 lun 0 (da6,pass6)
 <NetBSD NetBSD iSCSI 0>            at scbus1 target 2 lun 0 (da4,pass4)
 
 
 [ /sbin/iscontrol -n data1 ]
 da4 at iscsi0 bus 0 target 0 lun 0
 da4: <NetBSD NetBSD iSCSI 0> Fixed Direct Access SCSI-3 device
 
 [ /sbin/iscontrol -n data2 ]
 (da4:iscsi0:0:0:0): lost device
 (da4:iscsi0:0:0:0): removing device entry
 da4 at iscsi0 bus 0 target 0 lun 0
 da4: <NetBSD NetBSD iSCSI 0> Fixed Direct Access SCSI-3 device
 da5 at iscsi0 bus 0 target 1 lun 0
 da5: <NetBSD NetBSD iSCSI 0> Fixed Direct Access SCSI-3 device
 
 [ /sbin/iscontrol -n data3 ]
 (da4:iscsi0:0:0:0): lost device
 (da4:iscsi0:0:0:0): removing device entry
 (da5:iscsi0:0:1:0): lost device
 (da5:iscsi0:0:1:0): removing device entry
 da4 at iscsi0 bus 0 target 2 lun 0
 da4: <NetBSD NetBSD iSCSI 0> Fixed Direct Access SCSI-3 device
 da5 at iscsi0 bus 0 target 0 lun 0
 da5: <NetBSD NetBSD iSCSI 0> Fixed Direct Access SCSI-3 device
 da6 at iscsi0 bus 0 target 1 lun 0
 da6: <NetBSD NetBSD iSCSI 0> Fixed Direct Access SCSI-3 device
 
 
 It would appear that rather than appending the new device to the end of the
 da devices, it starts to do some type of naming queue after the second device.
 If I am to use these devices in any type of automated setup, how can I make
 sure that after these commands, da6 will always be target 1 (i.e. /dev/da2 on
 the slave machine)?
 
 Next, there is no startup script for iscontrol - would that simply have to be
 added to the system or is there a way with sysctl that it could be done? The
 plan here is to use gmirror such that /dev/da1 on A is mirrored with the
 /dev/da1 on B using iscsi.

Hi Sven,
I just tried it here, and it seems that at the end all is ok :-)
I think the lost/removing/found has something to do with iscontrol calling
camcontrol rescan - I will check this later, but the end result is that
you should have all /dev/da's.
I don't see any reasonably safe way to tie a scsi# (/dev/daN),
except to label (see glabel) the disk.
The startup script is, at the moment, not trivial, but I'm attaching
it, so someone can suggest improvements :-)
#!/bin/sh

# PROVIDE: iscsi
# REQUIRE: NETWORKING
# BEFORE:  DAEMON
# KEYWORD: nojail shutdown

#
# Add the following lines to /etc/rc.conf to enable iscsi:
#
# iscsi_enable=YES
# iscsi_fstab=/etc/fstab.iscsi

. /etc/rc.subr
. /cs/share/etc/rc.subr

name=iscsi
rcvar=`set_rcvar`

command=/sbin/iscontrol

iscsi_enable=${iscsi_enable:-NO}
iscsi_fstab=${iscsi_fstab:-/etc/fstab.iscsi}
iscsi_exports=${iscsi_exports:-/etc/exports.iscsi}
iscsi_debug=${iscsi_debug:-0}
start_cmd=iscsi_start
faststop_cmd=iscsi_stop
stop_cmd=iscsi_stop

start_precmd=iscontrol_precmd
iscontrol_prog=${iscontrol_prog:-iscontrol}
iscontrol_log=${iscontrol_log:-/var/log/$iscontrol_prog}

Re: ata on alix/geode stopped being detected.

2008-06-25 Thread Danny Braniss
 hi,
   latest changes in dev/ata broke this, on older -stable ...
 ata0-master: pio=PIO4 wdma=UNSUPPORTED udma=UNSUPPORTED cable=40 wire
 ad0: success setting PIO4 on National chip
 ad0: 977MB SanDisk SDCFB-1024 Rev 0.00 at ata0-master PIO4
 
 on latest -stable:
 ata0-master: pio=PIO4 wdma=WDMA2 udma=UNSUPPORTED cable=40 wire
 
 and no disk.
 
 cheers,
   danny

problem solved:
somehow 'device atadisk' was lost from the kernel configuration file

ata0-master: pio=PIO4 wdma=WDMA2 udma=UNSUPPORTED cable=40 wire
ad0: setting PIO4 on CS5536 chip
ad0: setting WDMA2 on CS5536 chip
ad0: 1953MB SanDisk SDCFB-2048 HDX 3.21 at ata0-master WDMA2
ad0: 4001760 sectors [3970C/16H/63S] 4 sectors/interrupt 1 depth queue

danny




ata on alix/geode stopped being detected.

2008-06-24 Thread Danny Braniss
hi,
latest changes in dev/ata broke this, on older -stable
...
ata0-master: pio=PIO4 wdma=UNSUPPORTED udma=UNSUPPORTED cable=40 wire
ad0: success setting PIO4 on National chip
ad0: 977MB SanDisk SDCFB-1024 Rev 0.00 at ata0-master PIO4

on latest -stable:
ata0-master: pio=PIO4 wdma=WDMA2 udma=UNSUPPORTED cable=40 wire

and no disk.

cheers,
danny




Re: auto_nlist failed on cp_time at location 1

2008-04-24 Thread Danny Braniss
 
 
 In the last episode (Apr 23), Tim Stoddard said:
  I just upgraded from FreeBSD 6.2 -> 6.3 (using source tree).  I then
  recompiled my net-snmp port binaries (using portupgrade).  I am now getting
  an error message in my logs every five secs.
  I am sure my libkvm is in sync with my kernel.  I do not know what else
  to look at.  
 
 You got bit by 
 
  revision 1.178.2.5
  date: 2008/04/09 19:47:20;  author: peter;  state: Exp;  lines: +68 -5
  MFC: record per-cpu stats for %user/%nice/%system/%idle
 
 , which removed the kernel variable that net-snmp uses to track CPU
 usage. Try this patch (put it in /usr/ports/net-mgmt/net-snmp/files and
 rebuild net-snmp).  I've sent it to the net-snmp port maintainer so
 hopefully it will be committed soon.
 
 -- 
   Dan Nelson
   [EMAIL PROTECTED]
 
the same goes for rpc.rstatd :-), see
http://www.freebsd.org/cgi/query-pr.cgi?pr=123014
 --dc+cDN39EJAMEtIO
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename=patch-cpu_nlist.c
 
 --- agent/mibgroup/hardware/cpu/cpu_nlist.c   2007-01-19 10:53:44.0 -0600
 +++ agent/mibgroup/hardware/cpu/cpu_nlist.c   2008-04-22 00:13:48.330686919 -0500
 @@ -1,5 +1,5 @@
  /*
 - *   nlist() interface
 + *   sysctl() interface
   * e.g. FreeBSD
   */
  #include <net-snmp/net-snmp-config.h>
 @@ -12,24 +12,9 @@
  #include <sys/types.h>
  #include <sys/resource.h>
  
 -#ifdef HAVE_SYS_DKSTAT_H
 -#include <sys/dkstat.h>
 -#endif
  #ifdef HAVE_SYS_SYSCTL_H
  #include <sys/sysctl.h>
  #endif
 -#ifdef HAVE_SYS_VMMETER_H
 -#include <sys/vmmeter.h>
 -#endif
 -#ifdef HAVE_VM_VM_PARAM_H
 -#include <vm/vm_param.h>
 -#endif
 -#ifdef HAVE_VM_VM_EXTERN_H
 -#include <vm/vm_extern.h>
 -#endif
 -
 -#define CPU_SYMBOL  "cp_time"
 -#define MEM_SYMBOL  "cnt"
  
  void _cpu_copy_stats( netsnmp_cpu_info *cpu );
  
 @@ -67,11 +52,12 @@
   */
  int netsnmp_cpu_arch_load( netsnmp_cache *cache, void *magic ) {
  long   cpu_stats[CPUSTATES];
 -struct vmmeter mem_stats;
 +int size, tempval;
 +
  netsnmp_cpu_info *cpu = netsnmp_cpu_get_byIdx( -1, 0 );
  
 -auto_nlist( CPU_SYMBOL, (char *) cpu_stats, sizeof(cpu_stats));
 -auto_nlist( MEM_SYMBOL, (char *)&mem_stats, sizeof(mem_stats));
 +size = sizeof(cpu_stats);
 +sysctlbyname("kern.cp_time", cpu_stats, &size, NULL, 0);
  
  cpu->user_ticks = (unsigned long)cpu_stats[CP_USER];
  cpu->nice_ticks = (unsigned long)cpu_stats[CP_NICE];
 @@ -85,15 +71,19 @@
   * Interrupt/Context Switch statistics
   *   XXX - Do these really belong here ?
   */
 -#if defined(openbsd2) || defined(darwin)
 -cpu->swapIn  = (unsigned long)mem_stats.v_swpin;
 -cpu->swapOut = (unsigned long)mem_stats.v_swpout;
 -#else
 -cpu->swapIn  = (unsigned long)mem_stats.v_swappgsin+mem_stats.v_vnodepgsin;
 -cpu->swapOut = (unsigned long)mem_stats.v_swappgsout+mem_stats.v_vnodepgsout;
 -#endif
 -cpu->nInterrupts  = (unsigned long)mem_stats.v_intr;
 -cpu->nCtxSwitches = (unsigned long)mem_stats.v_swtch;
 +size = sizeof(int);
 +#define GET_VM_STATS(cat, name, netsnmpname) \
 +do { \
 +sysctlbyname("vm.stats." #cat "." #name, &tempval, &size, NULL, 0); \
 +cpu->netsnmpname = (unsigned long) tempval; \
 +} while(0)
 +
 +GET_VM_STATS(vm,  v_swappgsin,   swapIn);
 +GET_VM_STATS(vm,  v_swappgsout,  swapOut);
 +GET_VM_STATS(vm,  v_vnodepgsin,  pageIn);
 +GET_VM_STATS(vm,  v_vnodepgsout, pageOut);
 +GET_VM_STATS(sys, v_intr,        nInterrupts);
 +GET_VM_STATS(sys, v_swtch,       nCtxSwitches);
  
  #ifdef PER_CPU_INFO
  for ( i = 0; i < n; i++ ) {
 
 --dc+cDN39EJAMEtIO
 Content-Type: text/plain; charset=us-ascii
 MIME-Version: 1.0
 Content-Transfer-Encoding: 7bit
 Content-Disposition: inline
 
 ___
 freebsd-stable@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-stable
 To unsubscribe, send any mail to [EMAIL PROTECTED]
 --dc+cDN39EJAMEtIO--
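
The counters the patch reads via sysctlbyname() can also be looked at from
the shell. The following is only a sketch, assuming that kern.cp_time returns
five tick counters in user/nice/sys/intr/idle order (the CP_* order); the
function name cp_time_pct is made up for illustration and is not part of
net-snmp or the patch.

```shell
# cp_time_pct OLD NEW: print the user/nice/sys/intr/idle percentages between
# two samples of "sysctl -n kern.cp_time" (five tick counters per sample).
cp_time_pct() {
    echo "$1 $2" | awk '{
        total = 0
        # fields 1-5 are the old sample, fields 6-10 the new one
        for (i = 1; i <= 5; i++) { d[i] = $(i + 5) - $i; total += d[i] }
        if (total <= 0) { print "no ticks elapsed"; exit 1 }
        printf "user=%.1f nice=%.1f sys=%.1f intr=%.1f idle=%.1f\n",
            100 * d[1] / total, 100 * d[2] / total, 100 * d[3] / total,
            100 * d[4] / total, 100 * d[5] / total
    }'
}

# On a FreeBSD host one would sample twice (not executed here):
#   a=$(sysctl -n kern.cp_time); sleep 5; b=$(sysctl -n kern.cp_time)
#   cp_time_pct "$a" "$b"
```

If the sysctl is missing or returns fewer than five values, the function
reports bogus numbers, so treat it as a quick sanity check only.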
 


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Marvell Technology Group Ltd. Yukon EC Ultra

2008-03-27 Thread Danny Braniss
Hi,
Under load, the msk has problems, with both hw.msk.legacy_intr=1 and 0.
with = 1, i get
    TCP segmentation error
    watchdog timeout
with = 0,
    Tx MAC parity error
    watchdog timeout

the board is an Asus P5K-VM

Cheers,
danny




Re: Marvell Technology Group Ltd. Yukon EC Ultra

2008-03-27 Thread Danny Braniss
 On Thu, Mar 27, 2008 at 12:57:31PM +0200, Danny Braniss wrote:
   Hi,
  Under load, the msk has problems,  with hw.msk.legacy_intr=1 and 0.
   with = 1, i get
  TCP segementation error
  watchdog timeout
   with = 0,
  Tx MAC parity error
  watchdog timeout
   
 
 Would you show me verbosed boot messages related with msk(4)/e1000phy(4)?

mskc0: <Marvell Yukon 88E8056 Gigabit Ethernet> port 0xc800-0xc8ff mem 0xfeafc000-0xfeaf irq 17 at device 0.0 on pci1
mskc0: Reserved 0x4000 bytes for rid 0x10 type 3 at 0xfeafc000
mskc0: MSI count : 1
mskc0: attempting to allocate 1 MSI vectors (1 supported)
msi: routing MSI IRQ 256 to vector 52
mskc0: using IRQ 256 for MSI
mskc0: RAM buffer size : 128KB
mskc0: Port 0 : Rx Queue 85KB(0x:0x000153ff)
mskc0: Port 0 : Tx Queue 43KB(0x00015400:0x0001)
msk0: <Marvell Technology Group Ltd. Yukon EC Ultra Id 0xb4 Rev 0x03> on mskc0
msk0: bpf attached
msk0: Ethernet address: 00:1e:8c:6d:5c:fe
miibus0: <MII bus> on msk0
e1000phy0: <Marvell 88E1149 Gigabit PHY> PHY 0 on miibus0
e1000phy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
mskc0: [MPSAFE]
mskc0: [FILTER]

is this enough?
danny

   the board is a Asus P5K-VM
   
   Cheers,
  danny
   
   
 
 -- 
 Regards,
 Pyun YongHyeon




Re: Marvell Technology Group Ltd. Yukon EC Ultra

2008-03-27 Thread Danny Braniss
 On Thu, Mar 27, 2008 at 01:32:46PM +0200, Danny Braniss wrote:
On Thu, Mar 27, 2008 at 12:57:31PM +0200, Danny Braniss wrote:
  Hi,
Under load, the msk has problems,  with hw.msk.legacy_intr=1 
 and 0.
  with = 1, i get
TCP segementation error
watchdog timeout
  with = 0,
Tx MAC parity error
watchdog timeout
  

Would you show me verbosed boot messages related with msk(4)/e1000phy(4)?
   
   mskc0: Marvell Yukon 88E8056 Gigabit Ethernet port 0xc800-0xc8ff mem 
   0xfeafc000-0xfeaf irq 17 at device 0.0 on pci1
   mskc0: Reserved 0x4000 bytes for rid 0x10 type 3 at 0xfeafc000
   mskc0: MSI count : 1
   mskc0: attempting to allocate 1 MSI vectors (1 supported)
   msi: routing MSI IRQ 256 to vector 52
   mskc0: using IRQ 256 for MSI
   mskc0: RAM buffer size : 128KB
   mskc0: Port 0 : Rx Queue 85KB(0x:0x000153ff)
   mskc0: Port 0 : Tx Queue 43KB(0x00015400:0x0001)
   msk0: Marvell Technology Group Ltd. Yukon EC Ultra Id 0xb4 Rev 0x03 on 
 mskc0
   msk0: bpf attached
   msk0: Ethernet address: 00:1e:8c:6d:5c:fe
   miibus0: MII bus on msk0
   e1000phy0: Marvell 88E1149 Gigabit PHY PHY 0 on miibus0
   e1000phy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 
 1000baseTX-FDX, 
   auto
   mskc0: [MPSAFE]
   mskc0: [FILTER]
   
   is this enough?
 
 Yes, it seems that the 88E8056/88E1149 PHY has several issues. I recall
 that there have been several reports of this. Since nfe(4) with the
 88E1149 also has some stability issues, e1000phy(4) probably lacks code
 that the 88E1149 PHY requires. So far I couldn't find a clue, sorry.
 I'll let you know when I have code for you to give a spin.

great and thanks,

danny




Re: BTX on USB pen drive

2008-03-13 Thread Danny Braniss
 On Sunday 09 March 2008 09:07:03 am Torfinn Ingolfsen wrote:
  On Sat, 08 Mar 2008 18:44:50 -0800
  Jeremy Chadwick [EMAIL PROTECTED] wrote:
  
   Your boot0cfg line to reinstall the boot0 MBR looks fine, but I don't
   use boot0 myself (I prefer to go right into boot2/loader).
  
  I used 'fdisk -B da0' to install /boot/mbr to the disk for testing.
  When I now boot the disk on the Acer laptop, it just displays one
  register dump followed by BTX halted.
 
 You haven't updated boot2 (via bsdlabel -B) which sits in between boot0/mbr 
 and /boot/loader.
 

think you can apply the same magic to pxeldr?

danny




Promise driver/support

2008-02-14 Thread Danny Braniss
is there support for this Promise card?

[EMAIL PROTECTED]:4:14:0:  class=0x010400 card=0x0374105a chip=0x8350105a 
rev=0x00 hdr=0x00
vendor = 'Promise Technology Inc'
class  = mass storage
subclass   = RAID

thanks,
danny




Re: Documentation: Installing FreeBSD 7.0 via serial console and PXE

2008-01-30 Thread Danny Braniss
Hi Jeremy,
I'm very glad that you a) can write! b) are actually doing
something about the zillions of misguided how-to's :-)
Having some experience with the subject, and please, don't read me
wrong, I see some different approaches:
- indeed this IS the 21st century, and it's unbelievable that we still have
  to deal with baud rates! (why can't they be more like modems? autosense :-)
- newer servers don't have serial anymore :-(, they have IPMI/ILO/etc,
  some only have com2

what I'm trying to say is that hard-coding where the console is is a 'problem'.
It's my belief that the setting of the console can be done via DHCP - at
the moment I can select com1/com2 - sio.0/sio.1 - via dhcp.

the other item I would like to raise is the way we do it here.
1st: I boot the new host diskless; this allows us to find out quickly
whether all the hardware is working, using a tested root/kernel - since
DHCP/TFTP/NFS are already working, it takes only a few minutes to bring up
a new host:
    set it to boot pxe
    add the mac address to dhcpd.conf
    and reboot
2nd: if and when we decide to make the host 'stand-alone', we
    run sysinstall to partition the disk (or use bsdlabel if you are
    good at maths)
    cd /mnt-root
    rsh -n server dump 0f - /the/root/partition | restore rf -
    change the bios setting to boot off disk (or if you have the console,
    reboot and hit ESC when doing dhcp ...)

ok, so i fibbed a bit :-), there are some small 'configurables'(*) missing, but
I hope you get the idea.
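
To make the "2nd" step concrete, here is a dry-run sketch that only prints
the copy commands instead of running them. The function name standalone_plan
is made up, and its three arguments (server, root partition on the server,
local mount point) are placeholders taken from the example above.

```shell
# standalone_plan SERVER SRC MNT: print (do not run) the commands that copy
# the tested root from SERVER onto the freshly partitioned local disk.
standalone_plan() {
    server=$1 src=$2 mnt=$3
    cat <<EOF
cd $mnt
rsh -n $server dump 0f - $src | restore rf -
EOF
}

# Print the plan for the example values used in the text.
standalone_plan server /the/root/partition /mnt-root
```

Reviewing the printed commands before pasting them into a root shell is
cheap insurance, since restore writes into whatever the current directory is.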

Cheers,
danny
PS: *: like setting up a diskless host, which is rather simple, and I'll gladly
   try to explain it so that you can then write it out in readable english :-)





Re: ad0: TIMEOUT - WRITE_DMA type errors with 7.0-RC1

2008-01-27 Thread Danny Braniss

 Henri Hennebert wrote:
  Jeremy Chadwick wrote:
  On Fri, Jan 25, 2008 at 06:17:24PM -0700, Joe Peterson wrote:
  Glad you got it back!  Yes, when I was first playing with ZFS, I noticed
  that booting between single and multi user mode could make the pools
  invisible.  Import seemed to bring them back...
 
  I did go into single-user mode and attempt to do ZFS-related commands,
  which might explain the "no datasets available" once I was back in
  multiuser!  I would classify that as a bug, and one which is going to
  cause all sorts of hair-pulling for administrators in the future.  I
  wonder what it's caused by.
 
  In single user / is read only and so /boot/zfs/zpool.cache can't be
  created/updated
 
 But it's still readable. The issue is that hostid isn't set (by
 /etc/rc.d/hostid).

if the root is read-only, as in the case of a diskless/dataless boot, it's
the fact that /boot/zfs/zpool.cache cannot be used which causes
the problem, so adding zpool import -a solves the issue.
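
A sketch of how a boot script might decide when the extra import is needed,
assuming that "the cache directory is not writable" is a good enough test
for a read-only root. The helper name need_pool_import is hypothetical;
only zpool import -a comes from the discussion above.

```shell
# need_pool_import DIR: succeed (exit 0) when DIR is not writable, which is
# the case when a read-only root holds the zpool cache directory.
need_pool_import() {
    [ ! -w "$1" ]
}

# Possible use on a real system (zpool is not invoked in this sketch):
#   need_pool_import /boot/zfs && zpool import -a
```

The check also succeeds when the directory does not exist at all, which
seems to be the right default for a diskless client.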

danny




Re: strings /boot/kernel/kernel | grep ___

2008-01-05 Thread Danny Braniss
 On 7-stable
   strings /boot/kernel/kernel | grep ___
 fails to show kernel config,
 whereas on 6.2-REL  before it worked.
 Also in 7 there's no
   START CONFIG FILE
   END CONFIG FILE
 Is this deliberate or a mistake ?
 strings are still there though, look for
   ^options CONFIG_AUTOGENERATED
 (assuming your kernel config file included
   options INCLUDE_CONFIG_FILE
 )

config -x /boot/kernel/kernel





Re: 7.0-BETA4 and msk problems

2007-12-27 Thread Danny Braniss
 On Mon, Dec 10, 2007 at 11:03:47AM +0200, Danny Braniss wrote:
On Sun, Dec 09, 2007 at 02:41:28PM +0200, Danny Braniss wrote:
  with this onboard NIC (LOB?)
  
  mskc0: Marvell Yukon 88E8056 Gigabit Ethernet 
  e1000phy0: Marvell 88E1149 Gigabit PHY PHY 0 on miibus0
  
  [EMAIL PROTECTED]:1:0:0:   class=0x02 card=0x81f81043 
 chip=0x436411ab 
  rev=0x12 hdr=0x00
  vendor = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
  device = '88E8056 Yukon PCI-E Gigabit Ethernet Controller'
  class  = network
  subclass   = ethernet
  
  I'm getting allot of:
msk0: watchdog timeout
  and
mskc0: Tx descriptor error
  and 
msk0: link state changed to DOWN
  and
msk0: link state changed to UP
  
  any help is most welcome,
danny
  
  

It seems that the issue happens only on 88E8056/88E1149 PHY.
See PR 116853 and 114631.
Sorry, I have no cluet yet.
   
   to add some more noise, this is the first host that panicked too :-)
   anything I can do to help?
 
 Probably ship the hardware to me? :)
love to, but the hardware is not mine :-) 

here is some more info: this is a different board, but with the
same Marvell 88E8056, and it panics after printing 'no PHY found!'
and the ethernet address is -1 (ff.ff...)

danny





zfs diskless boot problem

2007-12-25 Thread Danny Braniss
Zfs uses /boot/zfs to keep track of its pools, but in a diskless
environment, this is a read-only fs. This causes several inconveniences:
- /etc/rc.d/zfs needs:
    zfs_start_main()
    {
        dlv=`/sbin/sysctl -n vfs.nfs.diskless_valid 2> /dev/null`
        if [ ${dlv:=0} -ne 0 ]; then
            zpool import -a
        fi
        ...
- how important is /boot/zfs?

danny




unionfs lock problems

2007-12-17 Thread Danny Braniss
with 7.0-BETA4, I'm getting quite a few of these:

lockmgr: thread 0xff00039269f0 unlocking unheld lock
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_lockmgr() at _lockmgr+0x6ae
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x46
unionfs_unlock() at unionfs_unlock+0x22f
VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0x46
vn_read() at vn_read+0x264
dofileread() at dofileread+0xa1
kern_readv() at kern_readv+0x4c
read() at read+0x54
syscall() at syscall+0x254
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (3, FreeBSD ELF64, read), rip = 0x80187b18c, rsp = 0x7fffafb8, 
rbp = 0x7fffb045 ---

danny




ufs_rename: fvp == tvp (can't happen)

2007-12-17 Thread Danny Braniss
Hi,
I'm also getting quite a few of these "can't happen" messages. The system
is running 7.0-BETA4, and is doing 'portupgrade -af'

danny




Re: 7.0-BETA4 and msk problems

2007-12-10 Thread Danny Braniss
 On Sun, Dec 09, 2007 at 02:41:28PM +0200, Danny Braniss wrote:
   with this onboard NIC (LOB?)
   
   mskc0: Marvell Yukon 88E8056 Gigabit Ethernet 
   e1000phy0: Marvell 88E1149 Gigabit PHY PHY 0 on miibus0
   
   [EMAIL PROTECTED]:1:0:0:   class=0x02 card=0x81f81043 
 chip=0x436411ab 
   rev=0x12 hdr=0x00
   vendor = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
   device = '88E8056 Yukon PCI-E Gigabit Ethernet Controller'
   class  = network
   subclass   = ethernet
   
   I'm getting allot of:
  msk0: watchdog timeout
   and
  mskc0: Tx descriptor error
   and 
  msk0: link state changed to DOWN
   and
  msk0: link state changed to UP
   
   any help is most welcome,
  danny
   
   
 
 It seems that the issue happens only on 88E8056/88E1149 PHY.
 See PR 116853 and 114631.
 Sorry, I have no cluet yet.

to add some more noise, this is the first host that panicked too :-)
anything I can do to help?
danny




7.0-BETA4 and msk problems

2007-12-09 Thread Danny Braniss
with this onboard NIC (LOB?)

mskc0: <Marvell Yukon 88E8056 Gigabit Ethernet>
e1000phy0: <Marvell 88E1149 Gigabit PHY> PHY 0 on miibus0

[EMAIL PROTECTED]:1:0:0:   class=0x02 card=0x81f81043 chip=0x436411ab 
rev=0x12 hdr=0x00
vendor = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
device = '88E8056 Yukon PCI-E Gigabit Ethernet Controller'
class  = network
subclass   = ethernet

I'm getting a lot of:
msk0: watchdog timeout
and
mskc0: Tx descriptor error
and 
msk0: link state changed to DOWN
and
msk0: link state changed to UP

any help is most welcome,
danny




Re: FreeBSD-6.2, 7.0-BETA1 on X60

2007-12-03 Thread Danny Braniss
try monitoring the traffic (tcpdump/wireshark), this should give you
a good starting point.

danny




bge/ilo blues on Sun X2200

2007-11-08 Thread Danny Braniss
newer kernels (after Oct. 20) are breaking bge.
i've checked the cvs diffs, but nothing seems relevant.

the ilo (Integrated Lights Out) port, bge1, though not configured, gets
set to 10baseT/UTP full-duplex, and no magic will
get it to 100/full-duplex, which is what the ilo is using.

this works fine on an Oct 20 system.

any ideas?

cheers,
danny




Re: /usr/share/man/man8/MAKEDEV.8

2007-10-28 Thread Danny Braniss
 On Sun, Oct 28, 2007 at 12:37:45PM -0400, Andrew Lankford wrote:
  Thanks for replying, but once again I've miscommunicated the issue.  I 
  meant that MAKEDEV.8 is in /usr/src/share/man/man8/, and the
  modification time is Oct 13th, recent, which suggests to me that it's still 
  in the 7.0-BETA1 cvs tree.
 
 And you're very much correct:
 
 http://www.freebsd.org/cgi/cvsweb.cgi/src/share/man/man8/MAKEDEV.8

how about reading what it says? it might be very educational :-)

NAME
 MAKEDEV -- old script for creating device nodes

DESCRIPTION
 The MAKEDEV script was deprecated by devfs(5) and removed from FreeBSD
 after devfs(5) became mandatory.





any hope for nfe/msk?

2007-10-24 Thread Danny Braniss
Hi,
these drivers don't work under 7.0.
As soon as some mild pressure is applied, they start losing interrupts, and
in my case the hosts come to a total standstill, since they are diskless
and rely on the network.
This happens at 1Gb and at 100Mb.

Maybe the problem is with the shared interrupts?

irq16: mskc0 uhci0   3308351 13
or
irq21: nfe0 ohci0    1584415 24

but I have no idea how to uncouple this
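
To see at a glance which interrupt lines are shared, the interrupt listing
can be filtered. This is a sketch that assumes the line layout shown above
(an "irqN:" label, one or more device names, then two numeric columns); the
function name shared_irqs is made up.

```shell
# shared_irqs: read interrupt-listing lines on stdin and print only those
# where more than one device name sits between "irqN:" and the two counters.
shared_irqs() {
    # a non-shared line has 4 fields: label, device, total, rate
    awk '$1 ~ /^irq[0-9]+:$/ && NF > 4 { print }'
}

# On the affected host this would be fed from the interrupt statistics,
# e.g. "vmstat -i | shared_irqs" (not executed here).
```

It does not uncouple anything, but it shows quickly whether moving a card or
disabling an unused USB controller changed the sharing.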

danny




Re: PXE booting issues

2007-06-24 Thread Danny Braniss
 I'm currently working on an embedded project which will be built
 around a BSD (I'm not sure which yet), currently I have an image up
 and running DragonFly and I'm currently attempting to do the same with
 FreeBSD for comparison.
 
 I'm more or less following the miniBSD tutorials (updating the various
 file lists and such as I go).
 
 The system is being built around 6.2-STABLE but I'm having some issues
 getting it to PXE boot.
 
 On my FreeBSD host I've enabled TFTP, exported the rootfs for the
 system via NFS, built and installed isc-dhcpd and configured it with
 the extra options for PXE booting.
 
 The client currently gets its IP address from the server successfully,
 retrieves and loads pxeboot but when it comes to launch the kernel it
 eventually throws an NFS timeout.
 
 I know the export is working because I used NFS to pull the rootfs
 over to my DragonFly boot host to see what the result was when I
 booted the same image from a known working boot host (it worked
 correctly until it hit a problem in the image I will detail in a
 separate message).
 
 I did attempt to rebuild the pxeboot loader, following the standard
 instructions;
 set LOADER_TFTP_SUPPORT=YES in /etc/make.conf
 cd /usr/src/sys/boot
 make clean && make depend && make
 
 It appears to be successful (and the output would support this) but
 the i386/pxeldr/pxeboot and i386/loader/loader files do not exist, my
 only guess is that I've not set a make variable I should have, the
 most confusing part is that the dd command which is the final step in
 generating pxeboot appears in the output and appears to be successful;
 ==
 dd if=pxeboot.tmp of=pxeboot obs=2k conv=osync
 425+0 records in
 107+0 records out
 ==
 The discrepancy in the records in and the records out is concerning
 but I would expect the file to exist regardless, I'm currently using
 the default /boot/pxeboot.
 
 Any suggestions as to what might be causing this would be greatly appreciated.

1- try sniffing (wireshark, not the kind that will get you high :-)
   and see where it hangs.
2- dhcp should tell the pxe loader where the root is:
    option root-path "ip:/path";
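
As an illustration of point 2, here is a sketch that prints an ISC dhcpd
host entry for a PXE client. Every value (host name, MAC, addresses, export
path) is a made-up placeholder, and pxe_host_stanza is a hypothetical helper;
only the root-path option comes from the advice above.

```shell
# pxe_host_stanza NAME MAC IP SERVER ROOT: print a dhcpd.conf host entry that
# hands the client its boot file and the NFS root it should mount.
pxe_host_stanza() {
    cat <<EOF
host $1 {
    hardware ethernet $2;
    fixed-address $3;
    next-server $4;
    filename "pxeboot";
    option root-path "$4:$5";
}
EOF
}

# Example with placeholder values (documentation address range):
pxe_host_stanza testbox 00:11:22:33:44:55 192.0.2.10 192.0.2.1 /export/diskless
```

The sketch assumes the TFTP server and the NFS server are the same machine;
if they differ, the root-path address has to be given separately.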

danny





Re: iSCSI initiator tester wanted

2007-06-08 Thread Danny Braniss
 Danny Braniss wrote:
  Hi all,
  I'm in the last mile before crossing the beta-release line,
  so I'd like to get some input, and update the list of targets it supports.
  you can obtain the driver from:
 
 Have you tested it with net/iscsi-target from ports? I get the following
 messages and errors when I try it:
 
 Jun  7 22:53:08 finstall kernel: ic_action: called
 Jun  7 22:53:08 finstall kernel: ic_action: func_code=0x901 flags=0xc0
 status=0x0 target=0 lun=0 retry_count=1 timeout=30
 Jun  7 22:53:08 finstall kernel: ic_action: XPT_SCSI_IO cmd=0x35
 Jun  7 22:53:08 finstall kernel: scsi_encap: called
 Jun  7 22:53:08 finstall kernel: scsi_encap: ccb->sp=0xc2e97c00
 Jun  7 22:53:08 finstall kernel: dwl: called
 Jun  7 22:53:08 finstall kernel: isc_qout: called
 Jun  7 22:53:08 finstall kernel: 0] isc_qout: enqued: pq=0xc37ff0bc
 Jun  7 22:53:08 finstall kernel: proc_out: called
 Jun  7 22:53:08 finstall kernel: 0] proc_out: opcode=0x1 sn(cmd=0x27
 expCmd=0x26 maxCmd=0x26 expStat=0x0 itt=0x27)
 Jun  7 22:53:08 finstall kernel: isc_sendPDU: called
 Jun  7 22:53:08 finstall kernel: 0] ism_proc: odone=1
 Jun  7 22:53:08 finstall kernel: proc_out: called
 Jun  7 22:53:08 finstall kernel: 0] ism_proc: odone=0
 Jun  7 22:53:08 finstall kernel: so_input: called
 Jun  7 22:53:08 finstall kernel: so_getbhs: called
 Jun  7 22:53:08 finstall kernel: proc_out: called
 Jun  7 22:53:08 finstall kernel: 0] ism_proc: odone=0
 Jun  7 22:53:08 finstall kernel: so_recv: called
 Jun  7 22:53:08 finstall kernel: 0] so_recv: len=48] opcode=0x21
 ahs_len=0x0 ds_len=0x0
 Jun  7 22:53:08 finstall kernel: ism_recv: called
 Jun  7 22:53:08 finstall kernel: 0] ism_recv: opcode=0x21 itt=0x27
 stat#0x1 maxcmd=0x27
 Jun  7 22:53:08 finstall kernel: _scsi_rsp: called
 Jun  7 22:53:08 finstall kernel: _scsi_rsp: itt=27 pq=0xc37ff5e0
 opq=0xc37ff0bc
 Jun  7 22:53:08 finstall kernel: iscsi_done: called
 Jun  7 22:53:08 finstall kernel: _scsi_done: called
 Jun  7 22:53:08 finstall kernel: _scsi_done: ccb_h->status=1
 Jun  7 22:53:08 finstall kernel: so_input: called
 Jun  7 22:53:08 finstall kernel: so_getbhs: called
 Jun  7 22:53:08 finstall kernel: proc_out: called
 Jun  7 22:53:08 finstall kernel: 0] ism_proc: odone=0
 Jun  7 22:53:08 finstall iscontrol[2084]: cam_open_btl: no passthrough
 device found at 1:0:1
 Jun  7 22:53:08 finstall iscontrol[2084]: cam_open_btl: no passthrough
 device found at 1:0:2
 Jun  7 22:53:08 finstall iscontrol[2084]: cam_open_btl: no passthrough
 device found at 1:0:3
 Jun  7 22:53:38 finstall kernel: _nop_out: called
 Jun  7 22:53:38 finstall kernel: 0] _nop_out: cws=1
 Jun  7 22:53:38 finstall kernel: proc_out: called
 Jun  7 22:53:38 finstall kernel: 0] ism_proc: odone=0
 
 The message on the machine running scsi-target is: Unsupported INQUIRY
 VPD page 80
yes, I have tested it against ports/net/iscsi-target, I use it to try out
error recovery :-), and as far as I could tell it's harmless.
Another thing I can report is that running both (target/initiator) does not
work, and it seems that the target gets stuck.

danny




Re: iSCSI initiator tester wanted

2007-06-08 Thread Danny Braniss
...
  The message on the machine running scsi-target is: Unsupported INQUIRY
  VPD page 80
 yes, I have tested it against ports/net/iscsi-target, I use it to try out
 error recovery :-), and as far as I could tell it's harmless. 
 Another thing I can report is that running both (target/initiator) does not
 ^
on the same host
 work, and it seems that the target gets stuck.

cheers,
danny




Re: iSCSI initiator tester wanted

2007-06-08 Thread Danny Braniss

 A couple comments just from reading through this, see below.
 
  #!/bin/sh
 
  # PROVIDE: iscsi
  # REQUIRE: NETWORKING
  # BEFORE:  DAEMON
  # KEYWORD: nojail shutdown
 
  #
  # Add the following lines to /etc/rc.conf to enable iscsi:
  #
  # iscsi_enable=YES
  # iscsi_fstab=/etc/fstab.iscsi
 
 The iscsi_exports knob should also be documented here.
agreed

 
  . /etc/rc.subr
 
  name=iscsi
  rcvar=`set_rcvar`
 
  command=/usr/local/sbin/iscontrol
 
 Assuming this gets commited this will want to be /sbin/iscontrol.
 
absolutely

  iscsi_enable=${iscsi_enable:-NO}
  iscsi_fstab=${iscsi_fstab:-/etc/fstab.iscsi}
  iscsi_exports=${iscsi_exports:-/etc/exports.iscsi}
 
  start_cmd=iscsi_start
  faststop_cmp=iscsi_stop
  stop_cmd=iscsi_stop
 
  iscsi_wait()
  {
 dev=$1
 trap "echo 'wait loop cancelled'; exit 1" 2
 count=0
 while true; do
  if [ -c $dev ]; then
  break;
  fi
  if [ $count -eq 0 ]; then
   echo -n Waiting for ${dev}': '
  fi
  count=$((${count} + 1))
  if [ $count -eq 6 ]; then
  echo ' Failed'
  return 0
  break
  fi
  echo -n '.'
  sleep 5;
 done
 echo '.'
 return 1
  }
 
  iscsi_start()
  {
 #
 # load needed modules
 for m in iscsi_initiator geom_label; do
  kldstat -qm $m || kldload $m
 done
 
 Good thinking making geom_label a pseudo-requirement. Examples and 
 documentation for fstab.iscsi should strongly recommend its use, since 
 device names will vary.
 
 sysctl debug.iscsi=2
 
 Maybe make this another rc variable that could be set in /etc/rc.conf. 
 You'll probably also want to change the module's default verbosity 
 level once it becomes more official.
it will be zero by default, and no reason to clobber rc.conf.

 
 #
 # start iscontrol for each target
 if [ -n "${iscsi_targets}" ]; then
  for target in ${iscsi_targets}; do
  ${command} ${rc_flags} -n ${target}
  done
 fi
 
 if [ -f ${iscsi_fstab} ]; then
  while read spec file type opt t1 t2
  do
case ${spec} in
\#*|'')
  ;;
*)
  if iscsi_wait ${spec}; then
  break;
  fi
  echo type=$type spec=$spec file=$file
  fsck -p ${spec} && mount ${spec} ${file}
  ;;
esac
  done < ${iscsi_fstab}
 fi
 
 if [ -f ${iscsi_exports} ]; then
  cat ${iscsi_exports} >> /etc/exports
  #/etc/rc.d/mountd reload does not work, why?
  kill -1 `cat /var/run/mountd.pid`
 fi
  }
 
 Look at how Pawel handled this with ZFS (mostly in the zfs and mountd 
 rc.d scripts), and use the fact that mountd can take multiple exports 
 files on its command line to your advantage. i.e. appending to the 
 normal exports file is not really what you want to do.

I like the idea of keeping things from spreading around; maybe teaching
/etc/rc.d/mountd to use all exports.something files is an idea, especially
since sometimes one has to '/etc/rc.d/mountd reload' - i miss 'exportfs -a' :-)
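
A sketch of the "multiple exports files" idea: collect whichever export
files exist and hand them to mountd as arguments instead of appending to
/etc/exports. The helper name exports_args is hypothetical, and how the
resulting list is wired into the rc.d scripts is left open here.

```shell
# exports_args FILE...: print, space-separated, only those export files that
# actually exist, in the order given.
exports_args() {
    args=
    for f in "$@"; do
        [ -f "$f" ] && args="$args $f"
    done
    # unquoted on purpose: drops the leading blank
    echo $args
}

# Possible use when starting mountd (not executed here):
#   mountd $(exports_args /etc/exports /etc/exports.iscsi)
```

Since nothing is appended to /etc/exports, a restart cannot duplicate
entries, and stopping iscsi needs no cleanup of the exports file.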


 
  iscsi_stop()
  {
 echo 'iscsi stopping'
 while read spec file type opt t1 t2
  do
case ${spec} in
\#*|'')
  ;;
*)
  echo iscsi: umount $spec
  umount -fv $spec
  # and remove from the exports ...
 
 See above; this could be a no-op.
 
  ;;
esac
  done < ${iscsi_fstab}
  }


 
  load_rc_config $name
  run_rc_command "$1"
  --
  problems with the above script:
  - no background fsck
 
 It would be nice not to re-invent the wheel here, and there are other 
 reasons it would be nice to just use /etc/fstab instead of adding a new 
 file -- a number of utilities use /etc/fstab to map between mountpoints 
 and device names even if the device isn't mounted. Did you try this 
 approach, and if so what obstacles did you encounter? I will play 
 around with this if I have time. The late fstab/mount option will 
 probably be useful here.

it all boils down to my not-liking-to-spread-out syndrome: rc.conf should
have all that is needed to configure a host, but alas, that is too
minimalistic an approach, since there are also config files.

well, some of the solutions take into consideration my local environment:
most of the servers and workstations are 'dataless', they share many files, and
via DHCP/rc.conf and some other magic, it all works. Except for 'small' changes
in configuration files, e.g.:
most of the hosts have serial console enabled, but a few problematic ones
don't.
[easy solution: a script that changes off/on according to some rc.conf
tunable].

most have a common fstab (cdrom, proc, linproc), but different disks
(da, ad, etc).
so it would be nice to be able to keep the common stuff (DEFAULT) and then
merge the diffs.
and i don't want to go the XML road, nor any other heavy-handed solution.

ok, enough ramblings for a busy morning.

cheers,
danny

PS: 

iSCSI initiator tester wanted

2007-06-06 Thread Danny Braniss
Hi all,
I'm in the last mile before crossing the beta-release line,
so I'd like to get some input, and update the list of targets it supports.
you can obtain the driver from:
ftp://ftp.cs.huji.ac.il/users/danny/freebsd/iscsi-2.0.92.tar.gz

Cheers,
danny




Re: iSCSI initiator tester wanted

2007-06-06 Thread Danny Braniss
 Quoting Danny Braniss [EMAIL PROTECTED]:
  I'm in the last mile before crossing the beta-release line,
  so I'd like to get some input, and update the list of targets it supports.
  you can obtain the driver from:
  ftp://ftp.cs.huji.ac.il/users/danny/freebsd/iscsi-2.0.92.tar.gz
 
 Looks great! I've done some basic testing against our cluster of three 
 LeftHand Networks NSM 160's running SAN/iQ 6.6SP1. My machine is 
 running -CURRENT as of a couple days ago (with gcc 4.2 and symbol 
 versioning). I've tested previous snapshots of the driver against the 
 same SAN on this and another machine running -STABLE with good results.
 
so i'm updating my Targets file.

 Is there anything specific you'd like tested? What connection 
 interruption scenarios does the driver try to recover from? I'm running 
 some backups to an iSCSI mount now. When that finishes (and my machine 
 is otherwise unoccupied) I'll play around with temporarily yanking the 
 ethernet cable and other fun tricks.
 
it 'should' recover from network disconnects, like pulling out the cable or
rebooting the target, but I think that this will only work if there is no
major activity - I better test this one again.
it should also flush buffers when you shut down the host; this was a major
pain with the old versions.

 Thanks for the Makefiles. Your blurb text incorrectly directs the 
 reader to run make in sys/dev/iscsi_initiator (which doesn't exist, and 
 there's no Makefile in sys/dev/iscsi). Obviously you meant 
 sys/modules/iscsi_initiator. Also, a line about running make in 
 iscontrol/ would be helpful, as would an install target in that 
 Makefile.
ok, fixed the 'typos', I also forgot the sample rc.d/iscsi, 
 
 Do you have any suggestions on startup integration (rc script, fstab 
 magic, etc)? I know you said once before that that was hopefully coming 
 soon..
 
this is an attempt:
#!/bin/sh

# PROVIDE: iscsi
# REQUIRE: NETWORKING
# BEFORE:  DAEMON
# KEYWORD: nojail shutdown

#
# Add the following lines to /etc/rc.conf to enable iscsi:
#
# iscsi_enable="YES"
# iscsi_fstab="/etc/fstab.iscsi"

. /etc/rc.subr

name=iscsi
rcvar=`set_rcvar`

command=/usr/local/sbin/iscontrol

iscsi_enable=${iscsi_enable:-NO}
iscsi_fstab=${iscsi_fstab:-/etc/fstab.iscsi}
iscsi_exports=${iscsi_exports:-/etc/exports.iscsi}

start_cmd=iscsi_start
faststop_cmd=iscsi_stop
stop_cmd=iscsi_stop

# Wait up to 30 seconds for the device node $1 to show up.
# NOTE: the return value is inverted: 0 (true) means the wait FAILED,
# 1 means the device is there. The caller relies on this.
iscsi_wait()
{
	dev=$1
	trap "echo 'wait loop cancelled'; exit 1" 2
	count=0
	while true; do
		if [ -c $dev ]; then
			break;
		fi
		if [ $count -eq 0 ]; then
			echo -n "Waiting for ${dev}: "
		fi
		count=$((${count} + 1))
		if [ $count -eq 6 ]; then
			echo ' Failed'
			return 0
		fi
		echo -n '.'
		sleep 5;
	done
	echo '.'
	return 1
}

iscsi_start()
{
	#
	# load needed modules
	for m in iscsi_initiator geom_label; do
		kldstat -qm $m || kldload $m
	done

	sysctl debug.iscsi=2
	#
	# start iscontrol for each target (one instance per entry,
	# passed to iscontrol via -n)
	if [ -n "${iscsi_targets}" ]; then
		for target in ${iscsi_targets}; do
			${command} ${rc_flags} -n ${target}
		done
	fi

	#
	# check and mount every filesystem listed in ${iscsi_fstab}
	if [ -f ${iscsi_fstab} ]; then
		while read spec file type opt t1 t2
		do
			case ${spec} in
			\#*|'')
				;;
			*)
				# give up on the first device that never appears
				if iscsi_wait ${spec}; then
					break;
				fi
				echo type=$type spec=$spec file=$file
				fsck -p ${spec} && mount ${spec} ${file}
				;;
			esac
		done < ${iscsi_fstab}
	fi

	#
	# append the iscsi exports and tell mountd to reread them
	if [ -f ${iscsi_exports} ]; then
		cat ${iscsi_exports} >> /etc/exports
		#/etc/rc.d/mountd reload does not work, why?
		kill -1 `cat /var/run/mountd.pid`
	fi
}

iscsi_stop()
{
	echo 'iscsi stopping'
	while read spec file type opt t1 t2
	do
		case ${spec} in
		\#*|'')
			;;
		*)
			echo iscsi: umount $spec
			umount -fv $spec
			# and remove from the exports ...
			;;
		esac
	done < ${iscsi_fstab}
}

load_rc_config $name
run_rc_command "$1"
--
problems with the above script:
- no background fsck
- restart will mess the exports file
- the wait loop should be replaced by something more deterministic.
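
One possible way around the 'restart will mess the exports file' problem, as a
rough sketch only: keep the iscsi entries between two marker lines, and always
strip the old marked block before adding a new one. The marker strings and the
function names strip_iscsi_block and update_exports are made up for this
example; nothing like them exists in the driver tarball or in the script above.

```shell
#!/bin/sh
# Hypothetical marker lines (an assumption for this sketch).
ISCSI_BEGIN='# BEGIN iscsi exports'
ISCSI_END='# END iscsi exports'

# Print file $1 without any previously added marker block.
strip_iscsi_block()
{
	awk -v b="$ISCSI_BEGIN" -v e="$ISCSI_END" '
		$0 == b { skip = 1; next }
		$0 == e { skip = 0; next }
		!skip   { print }
	' "$1"
}

# Replace the marker block in exports file $1 with the contents of $2.
# With no $2 (or if $2 does not exist) the block is only removed.
update_exports()
{
	exports=$1
	extra=${2:-}
	[ -f "$exports" ] || : > "$exports"
	tmp=$(mktemp "${TMPDIR:-/tmp}/exports.XXXXXX") || return 1
	strip_iscsi_block "$exports" > "$tmp"
	if [ -n "$extra" ] && [ -f "$extra" ]; then
		{
			echo "$ISCSI_BEGIN"
			cat "$extra"
			echo "$ISCSI_END"
		} >> "$tmp"
	fi
	# Overwrite in place so the owner and mode of the exports file survive.
	cat "$tmp" > "$exports"
	rm -f "$tmp"
}
```

With something like this, iscsi_start could call
'update_exports /etc/exports ${iscsi_exports}' instead of appending, and
iscsi_stop could call 'update_exports /etc/exports' to drop the entries again,
so running start twice leaves only one copy. mountd would still need the
kill -1 afterwards.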

 Thanks again. I'll post again if I manage to break something.
 
Ok, but can't say I look forward to hearing from you :-)

danny

