status of autotuning freebsd for 9.2

2013-07-07 Thread Alfred Perlstein

Andre,

Are you going to have time to MFC things from -current for auto-tuning 
-stable before 9.2?


I fear (maybe unnecessarily?) that we are about to ship yet another 
release that can't do basic 10gigE when sufficient memory exists.


If you don't have time, then let me know and I'll see what I can do.

--
Alfred Perlstein
VP Software Engineering, iXsystems

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-07-07 Thread Andre Albsmeier
On Thu, 04-Jul-2013 at 08:15:50 +0200, Konstantin Belousov wrote:
 On Thu, Jul 04, 2013 at 07:27:00AM +0200, Andre Albsmeier wrote:
  On Thu, 04-Jul-2013 at 07:24:40 +0200, Konstantin Belousov wrote:
   On Thu, Jul 04, 2013 at 07:14:09AM +0200, Andre Albsmeier wrote:
On Mon, 17-Jun-2013 at 21:30:31 +0200, John Baldwin wrote:
 On Sunday, June 16, 2013 2:39:42 am Andre Albsmeier wrote:
  On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote:
   On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote:
Each day at 5:15 we generate snapshots on various machines.
This used to work perfectly under 7-STABLE for years, but since
we started to use 9.1-STABLE the machine reboots in about 10%
of all cases.

After rebooting we find a new snapshot file which is a bit
smaller than the good ones and has different permissions.
It does not pass fsck. In this example it is the one
whose name begins with s3:

-r--r-   1 root  operator  snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04
-r   1 root  operator  snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03
-r--r-   1 root  operator  snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44
-r--r-   1 root  operator  snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03
-r--r-   1 root  operator  snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03

After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel
I see the following LORs (mksnap_ffs starts exactly at 5:15):

May 29 05:15:00 kern.crit palveli kernel: lock order reversal:
May 29 05:15:00 kern.crit palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240
May 29 05:15:00 kern.crit palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414
May 29 05:15:04 kern.crit palveli kernel: lock order reversal:
May 29 05:15:04 kern.crit palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976
May 29 05:15:04 kern.crit palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626

Unfortunately, no core files are being generated ;-(.

I have checked and even rebuilt the (UFS1) fs in question
from scratch. I have also seen this happen on a UFS2 fs on
another machine and on a third one when running dump -L
on a root fs.

Any hints of how to proceed?
   
   Would it be possible to set up a serial console that is logged on this machine
   to see if it is panic'ing but failing to write out a crashdump?
  
  Couldn't attach the serial console yet ;-(. But I had people
  attach a KVMoverIP switch and enabled the various KDB options
  in the kernel. Now we can see a bit more (see below) -- no
  crashdump is being generated though.
 
 :(  Unfortunately these LORs don't really help with discerning the cause of
 the reboot.  If you have remote power access (and still wanted to test this)
 one option would be to change KDB to drop into the debugger on a panic.
 Then you could connect over the KVM and take images of the original panic
 along with a stack trace.

After a few days of no problems, the box decided to crash
during mksnap_ffs today ;-(. But now I have a crashdump,
see below. Unfortunately, I cannot upload the dump anywhere,
but if you ask me to check anything I'll be happy to help.

kgdb /usr/obj/src/src-9/sys/palveli/kernel.debug vmcore.4
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as i386-marcel-freebsd...

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xcfb5e000
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc07cb2fe
stack pointer           = 0x28:0xd83545d0
frame pointer           = 0x28:0xd835490c
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12929 (mksnap_ffs)
trap number             = 12
panic: page fault
KDB: stack backtrace:

RE: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Hans Petter Selasky
Hi,

FYI: The USB stack will currently run a complete controller reset upon resume, 
like during boot.

--HPS 

 
-Original message-
 From: Ian Smith smi...@nimnet.asn.au
 Sent: Sunday 7th July 2013 7:52
 To: Adrian Chadd adr...@freebsd.org
 Cc: freebsd-a...@freebsd.org; freebsd-stable@freebsd.org; freebsd-...@freebsd.org
 Subject: Re: USB ports on Lenovo T400 do not work after a suspend/resume
 
 On Sun, 30 Jun 2013 15:02:57 -0700, Adrian Chadd wrote:
   On 30 June 2013 07:22, Ian Smith smi...@nimnet.asn.au wrote:
 [..]
Nothing of note that I can see, if that usb hub-to-bus remapping is
normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
Maybe someone who knows might comment on that?
 
 Does no one know what that signifies?  Maybe it's not relevant to this.
 
Just checking: you've tried other USB devices apart from uftdi0?
   
   Yup, there's no 5v on the port.
 
 I was rather taken aback to hear this.  Would not this indicate a 
 failure to reinitialise the basic underlying USB hardware on resume?
 
 More than a bit bemused, Ian
 ___
 freebsd-a...@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-acpi
 To unsubscribe, send any mail to freebsd-acpi-unsubscr...@freebsd.org
 



RE: XHCI umass support breaks between r248085 and r252560 on 9-STABLE

2013-07-07 Thread Hans Petter Selasky
Hi,

Check for CAM/SCSI related changes. There have not been many USB changes
recently, so this is possibly not USB related.

Thank you,

--HPS
 
-Original message-
 From: Alexandre Kovalenko bsd.gai...@gmail.com
 Sent: Thursday 4th July 2013 20:58
 To: freebsd-...@freebsd.org
 Cc: freebsd-stable@freebsd.org
 Subject: XHCI umass support breaks between r248085 and r252560 on 9-STABLE
 
 Three different external hard drives (Seagate, Western Digital and a no-name USB
 3.0 enclosure) are no longer recognized as umass devices. Reverting
 /usr/src/sys/dev/bsd/controller to r248085, then building and loading just the
 xhci module, makes the drives appear again. Below are snippets from the log in
 both cases:
 
 Non working:
 
 Jul  4 14:35:17 twinhead kernel: xhci0: XHCI (generic) USB 3.0 controller mem 0xfddfe000-0xfddf irq 16 at device 0.0 on pci2
 Jul  4 14:35:17 twinhead kernel: xhci0: 64 byte context size.
 Jul  4 14:35:17 twinhead kernel: usbus0 on xhci0
 Jul  4 14:35:17 twinhead kernel: usbus0: 5.0Gbps Super Speed USB v3.0
 Jul  4 14:35:17 twinhead kernel: ugen0.1: 0x1912 at usbus0
 Jul  4 14:35:17 twinhead kernel: uhub0: 0x1912 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1 on usbus0
 Jul  4 14:35:17 twinhead kernel: uhub0: 8 ports with 8 removable, self powered
 Jul  4 14:35:24 twinhead kernel: ugen0.2: ASMedia at usbus0
 Jul  4 14:35:24 twinhead kernel: umass0: ASMedia AS2105, class 0/0, rev 3.00/0.01, addr 1 on usbus0
 Jul  4 14:35:29 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
 Jul  4 14:35:29 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
 Jul  4 14:35:29 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
 Jul  4 14:35:30 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
 Jul  4 14:35:30 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
 Jul  4 14:35:30 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
 Jul  4 14:35:35 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
 Jul  4 14:35:35 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
 Jul  4 14:35:35 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
 Jul  4 14:35:36 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
 Jul  4 14:35:36 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
 Jul  4 14:35:36 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
 Jul  4 14:35:41 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
 Jul  4 14:35:41 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
 Jul  4 14:35:41 twinhead kernel: (probe0:umass-sim0:0:0:0): Error 5, Retries exhausted
 
 Working:
 
 Jul  4 14:40:20 twinhead kernel: ugen0.2: ASMedia at usbus0 (disconnected)
 Jul  4 14:40:20 twinhead kernel: umass0: at uhub0, port 2, addr 1 (disconnected)
 Jul  4 14:40:27 twinhead kernel: ugen0.2: vendor 0x174c at usbus0
 Jul  4 14:40:27 twinhead kernel: umass0: vendor 0x174c product 0x5106, class 0/0, rev 3.00/0.01, addr 1 on usbus0
 Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): REPORT LUNS. CDB: a0 00 00 00 00 00 00 00 00 10 00 00
 Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: SCSI Status Error
 Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): SCSI status: Check Condition
 Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
 Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): Error 22, Unretryable error
 Jul  4 14:40:27 twinhead kernel: da0 at umass-sim0 bus 0 scbus4 target 0 lun 0
 Jul  4 14:40:27 twinhead kernel: da0: Hitachi HTS542520K9SA00 BBDO Fixed Direct Access SCSI-5 device
 Jul  4 14:40:27 twinhead kernel: da0: 400.000MB/s transfers
 Jul  4 14:40:27 twinhead kernel: da0: 190782MB (390721968 512 byte sectors: 255H 63S/T 24321C)
 Jul  4 14:40:27 twinhead kernel: da0: quirks=0x2NO_6_BYTE
 
 I can provide additional information or try  patches as necessary.
 
 Alexandre Sunny Kovalenko (Олександр Коваленко)
 
 ___
 freebsd-...@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-usb
 To unsubscribe, send any mail to freebsd-usb-unsubscr...@freebsd.org


Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-07-07 Thread Konstantin Belousov
On Sun, Jul 07, 2013 at 09:25:53AM +0200, Andre Albsmeier wrote:
 OK, here we go (looks better now):
 
 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type show copying to see the conditions.
 There is absolutely no warranty for GDB.  Type show warranty for details.
 This GDB was configured as i386-marcel-freebsd...
 
 Unread portion of the kernel message buffer:
 dev = stripe/p, block = 592, fs = /palveli
 panic: ffs_blkfree_cg: freeing free block
 KDB: stack backtrace:
 db_trace_self_wrapper(c08207eb,d70fc924,c05fdfc9,c081df13,c08a82e0,...) at db_trace_self_wrapper+0x26/frame 0xd70fc8f4
 kdb_backtrace(c081df13,c08a82e0,c0833a0b,d70fc930,d70fc930,...) at kdb_backtrace+0x29/frame 0xd70fc900
 panic(c0833a0b,c2aae178,250,0,c2af80d4,...) at panic+0xc9/frame 0xd70fc924
 ffs_blkfree_cg(250,0,8000,49f,d70fcad0,...) at ffs_blkfree_cg+0x399/frame 0xd70fc9c8
 ffs_blkfree(c2b35100,c2af8000,c2b0d470,250,0,...) at ffs_blkfree+0xad/frame 0xd70fca00
 indir_trunc(fffa3ff4,,0,8000,0,...) at indir_trunc+0x658/frame 0xd70fcae0
 indir_trunc(dff3,,c072df0a,c2d68d00,c087abd8,...) at indir_trunc+0x514/frame 0xd70fcbc0
 handle_workitem_freeblocks(0,d70fcc4c,2,246,c2ab1000,...) at handle_workitem_freeblocks+0x2dc/frame 0xd70fcc24
 process_worklist_item(0,0,0,c086ae78,0,...) at process_worklist_item+0x27a/frame 0xd70fcc6c
 softdep_process_worklist(c2b36548,0,54,c0835825,64,...) at softdep_process_worklist+0x91/frame 0xd70fcc9c
 softdep_flush(0,d70fcd08,0,c2aac2f0,0,...) at softdep_flush+0x3e4/frame 0xd70f
 fork_exit(c0738bb0,0,d70fcd08) at fork_exit+0xa2/frame 0xd70fccf4
 fork_trampoline() at fork_trampoline+0x8/frame 0xd70fccf4
 --- trap 0, eip = 0, esp = 0xd70fcd40, ebp = 0 ---
 Uptime: 2d16h29m37s
 Physical memory: 503 MB
 Dumping 95 MB: 80 64 48 32 16
 
 No symbol stopped_cpus in current context.
 No symbol stoppcbs in current context.
 #0  doadump (textdump=1) at pcpu.h:249
 249 pcpu.h: No such file or directory.
 in pcpu.h
 (kgdb) where
 #0  doadump (textdump=1) at pcpu.h:249
 #1  0xc05f in kern_reboot (howto=260) at 
 /src/src-9/sys/kern/kern_shutdown.c:449
 #2  0xc05fe028 in panic (fmt=value optimized out) at 
 /src/src-9/sys/kern/kern_shutdown.c:637
 #3  0xc0717899 in ffs_blkfree_cg (ump=0xc2b35100, fs=0xc2af8000, 
 devvp=0xc2b0d470, bno=592, 
 size=32768, inum=1183, dephd=0xd70fcad0) at 
 /src/src-9/sys/ufs/ffs/ffs_alloc.c:2151
 #4  0xc0717c8d in ffs_blkfree (ump=0xc2b35100, fs=0xc2af8000, 
 devvp=0xc2b0d470, bno=592, 
 size=32768, inum=1183, vtype=VREG, dephd=0xd70fcad0) at 
 /src/src-9/sys/ufs/ffs/ffs_alloc.c:2280
 #5  0xc0730348 in indir_trunc (freework=0xc2f99100, dbn=1642816, lbn=-376844)
 at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7965
 #6  0xc0730204 in indir_trunc (freework=0xc2f99100, dbn=1639680, lbn=-8205)
 at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7946
 #7  0xc07324bc in handle_workitem_freeblocks (freeblks=0xc2fc1e00, flags=512)
 at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7588
 #8  0xc0730dfa in process_worklist_item (mp=0xc2b36548, target=10, flags=512)
 at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1774
 #9  0xc07360c1 in softdep_process_worklist (mp=0xc2b36548, full=0)
 at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1558
 #10 0xc0738f94 in softdep_flush () at 
 /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
 #11 0xc05d1b82 in fork_exit (callout=0xc0738bb0 softdep_flush, arg=0x0, 
 frame=0xd70fcd08)
 at /src/src-9/sys/kern/kern_fork.c:988
 #12 0xc07ba904 in fork_trampoline () at 
 /src/src-9/sys/i386/i386/exception.s:279
 (kgdb) up 10
 #10 0xc0738f94 in softdep_flush () at 
 /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
 1414progress += softdep_process_worklist(mp, 0);
 
   -Andre
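
The hexadecimal words that the KDB backtrace prints for ffs_blkfree_cg and the decimal arguments that kgdb prints for the same frame appear to describe the same call. The following Python sketch, which is not part of the original thread, checks that correspondence. Treating the first two words as the low and high halves of a 64-bit block number is an assumption made here for illustration.

```python
# Raw words printed by the KDB backtrace for ffs_blkfree_cg (hexadecimal).
ddb_words = ["250", "0", "8000", "49f"]

# Symbolic values printed by kgdb for the same frame (decimal).
kgdb_args = {"bno": 592, "size": 32768, "inum": 1183}


def decode(words):
    """Turn the four hexadecimal words into named decimal values."""
    low, high, size, inum = (int(word, 16) for word in words)
    # Assumption: the first two 32-bit words are the low and high halves
    # of a 64-bit block number on this i386 kernel.
    return {"bno": (high << 32) | low, "size": size, "inum": inum}


decoded = decode(ddb_words)
print(decoded)
print("matches kgdb:", decoded == kgdb_args)
```

If the two views agree, the block number in the "freeing free block" panic message (block = 592) is the same block that kgdb reports as bno.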

This looks unrelated. This exact panic usually has one of two
causes:
- corrupted filesystem, run fsck to recheck it;
- faulty hardware, most likely RAM, but might be CPU/CPU cache/bus.

Is it the same machine where the bcopy panic occurred?




Re: status of autotuning freebsd for 9.2

2013-07-07 Thread Andre Oppermann

On 07.07.2013 08:32, Alfred Perlstein wrote:

Andre,

Are you going to have time to MFC things from -current for auto-tuning -stable 
before 9.2?


I simply ran out of time on Friday and MFCing such a big change requires
more testing.


I fear (maybe unnecessarily?) that we are about to ship yet another release
that can't do basic 10gigE when sufficient memory exists.


I debated with myself whether such a behavior-changing MFC
would be appropriate for a mid-stream stable release.  I guess yes, though
a number of people who currently set the parameters manually would have
to remove their tuning settings.


If you don't have time, then let me know and I'll see what I can do.


Can you help me with testing?

--
Andre



Re: make buildworld is now 50% slower

2013-07-07 Thread Daniel Braniss
 On Fri, Jul 05, 2013 at 02:39:00PM +0200, Dimitry Andric wrote:
  [redirecting to the correct mailing list, freebsd-stable@ ...]
  
  On Jul 5, 2013, at 10:53, Daniel Braniss da...@cs.huji.ac.il wrote:
   after today's update of 9.1-STABLE I noticed that make build[world|kernel] are
   taking considerably more time, is it because of the upgrade of clang?
   and if so, is the code produced any better?
   
   before:
   buildworld:   26m4.52s real  2h28m32.12s user  36m6.27s sys
   buildkernel:  7m29.42s real    23m22.22s user  4m26.26s sys
   
   today:
   buildworld:  34m29.80s real   2h38m9.37s user  37m7.61s sys
   buildkernel: 15m31.52s real    22m59.40s user  4m33.06s sys
  
  Ehm, your user and sys times are not that much different at all, they
  add up to about 5% slower for buildworld, and 1% faster for buildkernel.
  Are you sure nothing else is running on that machine, eating up CPU time
  while you are building? :)
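
The comparison of CPU time against wall-clock time made above can be reproduced from the quoted figures. This is a minimal Python sketch, not part of the original thread; the helper names to_seconds and percent_change are made up for illustration, and the only data used are the time(1) values quoted in the message.

```python
import re


def to_seconds(text):
    """Convert a time(1)-style duration such as '2h28m32.12s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?([\d.]+)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds)


# (real, user, sys) exactly as quoted in the message above.
timings = {
    "buildworld": {"before": ("26m4.52s", "2h28m32.12s", "36m6.27s"),
                   "today": ("34m29.80s", "2h38m9.37s", "37m7.61s")},
    "buildkernel": {"before": ("7m29.42s", "23m22.22s", "4m26.26s"),
                    "today": ("15m31.52s", "22m59.40s", "4m33.06s")},
}


def percent_change(target):
    """Return (wall-clock change, user+sys change) in percent for one target."""
    before = [to_seconds(t) for t in timings[target]["before"]]
    today = [to_seconds(t) for t in timings[target]["today"]]
    real = (today[0] / before[0] - 1) * 100
    cpu = (sum(today[1:]) / sum(before[1:]) - 1) * 100
    return real, cpu


for target in timings:
    real, cpu = percent_change(target)
    print(f"{target}: real {real:+.1f}%, user+sys {cpu:+.1f}%")
```

The user+sys totals move by only a few percent, while the real time grows far more. That gap is why the reply asks whether something else was competing for the machine during the build.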
  
  But yes, clang 3.3 is of course somewhat larger than 3.2.  You might
  especially notice that, if you are using gcc, which is very slow at
  compiling C++.
  
  In any case, if you do not care about clang, just set WITHOUT_CLANG= in
  your /etc/src.conf, and you can shave off some build time.
 
 I just built world/kernel (stable/9 r252769) 5 hours ago.  Results:
 
 time make -j4 buildworld  = roughly 21 minutes on my hardware
 time make -j4 buildkernel = roughly 8 minutes on my hardware
 

It's been a long time since I saw such numbers, maybe it's time
to see where time is being spent, I will run it without clang to compare with
your numbers.

 These numbers are about the norm for me, meaning I do not see a
 substantial increase in build times.
 
 Key point: I do not use/build/grok clang, i.e. WITHOUT_CLANG=true is in
 my src.conf.  But I am aware of the big clang change in r252723.
 
 If hardware details are wanted, ask, but I don't think it's relevant to
 what the root cause is.
 

from what you are saying, I guess clang is not responsible.
looking for my Sherlock Holmes hat.
thanks,
danny

 -- 
 | Jeremy Chadwick                                   j...@koitsu.org |
 | UNIX Systems Administrator                 http://jdc.koitsu.org/ |
 | Making life hard for others since 1977.             PGP 4BD6C0CB |
 




Fwd: ixgbe Jumbo race condition leading to Deadlock

2013-07-07 Thread Kaushal Bhandankar
In the 82599, for a jumbo packet of 9.5 K (which consumes 5 descriptors of
2048 bytes each), when does the descriptor write-back happen? Does it
happen per descriptor or once for the whole group of descriptors? Is it possible
for all descriptors except the last one to be written back, so that when I read
the RDH register I get the last pending descriptor still waiting inside the 82599?
We are using srrctl |= IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;

In my setup, I don't see EOP set even after I have read 5
descriptors. Checking DD returns an incomplete packet. What should I
do in such a case?

References from Data sheet:
- Checking through DD bits eliminates a potential race condition: all
descriptor data is posted internally prior to incrementing the head
register and a read of the head register could potentially pass the
descriptor waiting inside the 82599.
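
One common way for a receive path to cope with a frame whose descriptors are not all written back yet is to consume only slots with DD set and to keep the partially assembled frame until a slot with EOP arrives. The Python model below is a sketch of that idea only. It is not ixgbe driver code and it makes no claim about how the 82599 schedules write-backs; the RxDescriptor and RxRing classes and their fields are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class RxDescriptor:
    """Simplified stand-in for a written-back receive descriptor."""
    dd: bool = False    # Descriptor Done: the slot has been written back
    eop: bool = False   # End Of Packet: last buffer of the frame
    length: int = 0     # bytes placed in this descriptor's buffer


class RxRing:
    def __init__(self, descriptors):
        self.ring = descriptors
        self.next = 0       # next slot software will examine
        self.partial = []   # buffer lengths of a frame still being assembled

    def poll(self):
        """Consume every completed slot; return the sizes of finished frames."""
        frames = []
        while True:
            desc = self.ring[self.next]
            if not desc.dd:
                # Not written back yet: stop and keep self.partial for later.
                break
            self.partial.append(desc.length)
            desc.dd = False  # hand the slot back for reuse
            self.next = (self.next + 1) % len(self.ring)
            if desc.eop:
                frames.append(sum(self.partial))
                self.partial = []
        return frames


# Hypothetical 9.5 KB frame (9728 bytes) spread over five 2048-byte buffers.
ring = RxRing([RxDescriptor() for _ in range(8)])
for slot in range(4):
    ring.ring[slot] = RxDescriptor(dd=True, length=2048)
print(ring.poll())  # four buffers seen, no EOP yet, so no frame is delivered
ring.ring[4] = RxDescriptor(dd=True, eop=True, length=1536)
print(ring.poll())  # the frame is delivered once the EOP slot is written back
```

In this model an incomplete packet is never handed up; the poll simply returns and the next poll resumes where the previous one stopped.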

Regards,
Kaushal


Re: make buildworld is now 50% slower

2013-07-07 Thread Jeremy Chadwick
On Sun, Jul 07, 2013 at 11:50:29AM +0300, Daniel Braniss wrote:
  On Fri, Jul 05, 2013 at 02:39:00PM +0200, Dimitry Andric wrote:
   [redirecting to the correct mailing list, freebsd-stable@ ...]
   
   On Jul 5, 2013, at 10:53, Daniel Braniss da...@cs.huji.ac.il wrote:
after today's update of 9.1-STABLE I noticed that make build[world|kernel] are
taking considerably more time, is it because of the upgrade of clang?
and if so, is the code produced any better?

before:
buildworld:   26m4.52s real  2h28m32.12s user  36m6.27s sys
buildkernel:  7m29.42s real    23m22.22s user  4m26.26s sys

today:
buildworld:  34m29.80s real   2h38m9.37s user  37m7.61s sys
buildkernel: 15m31.52s real    22m59.40s user  4m33.06s sys
   
   Ehm, your user and sys times are not that much different at all, they
   add up to about 5% slower for buildworld, and 1% faster for buildkernel.
   Are you sure nothing else is running on that machine, eating up CPU time
   while you are building? :)
   
   But yes, clang 3.3 is of course somewhat larger than 3.2.  You might
   especially notice that, if you are using gcc, which is very slow at
   compiling C++.
   
   In any case, if you do not care about clang, just set WITHOUT_CLANG= in
   your /etc/src.conf, and you can shave off some build time.
  
  I just built world/kernel (stable/9 r252769) 5 hours ago.  Results:
  
  time make -j4 buildworld  = roughly 21 minutes on my hardware
  time make -j4 buildkernel = roughly 8 minutes on my hardware
  
 
 It's been a long time since I saw such numbers, maybe it's time
 to see where time is being spent, I will run it without clang to compare with
 your numbers.
 
  These numbers are about the norm for me, meaning I do not see a
  substantial increase in build times.
  
  Key point: I do not use/build/grok clang, i.e. WITHOUT_CLANG=true is in
  my src.conf.  But I am aware of the big clang change in r252723.
  
  If hardware details are wanted, ask, but I don't think it's relevant to
  what the root cause is.
  
 
 from what you are saying, I guess clang is not responsible.
 looking for my Sherlock Holmes hat.

Some points to those numbers I stated above:

- System is an Intel Q9550 with 8GB of RAM

- Single SSD (UFS2+SU+TRIM) is used for root, /usr, /var, /tmp, and swap

- /usr/src is on ZFS (raidz1 + 3 disks) -- however I got equally small
numbers when it was on the SSD

- /usr/src is using compression=lz4  (to folks from -fs: yeah, I'm
trying it out to see how much of an impact it has on interactivity.  I
can still tell when it kicks in, but it's way, way better than lzjb.
Rather not get into that here)

- Contents of /etc/src.conf (to give you some idea of what I disable):

WITHOUT_ATM=true
WITHOUT_BLUETOOTH=true
WITHOUT_CLANG=true
WITHOUT_FLOPPY=true
WITHOUT_FREEBSD_UPDATE=true
WITHOUT_INET6=true
WITHOUT_IPFILTER=true
WITHOUT_IPX=true
WITHOUT_KERBEROS=true
WITHOUT_LIB32=true
WITHOUT_LPR=true
WITHOUT_NDIS=true
WITHOUT_NETGRAPH=true
WITHOUT_PAM_SUPPORT=true
WITHOUT_PPP=true
WITHOUT_SENDMAIL=true
WITHOUT_WIRELESS=true
WITH_OPENSSH_NONE_CIPHER=true

It's WITHOUT_CLANG that cuts down the buildworld time by a *huge* amount
(I remember when it got introduced, my buildworld jumped up to something
like 40 minutes); the rest probably save a minute or two at most.

- /etc/make.conf doesn't contain much that's relevant, other than:

CPUTYPE?=core2

# For DTrace; also affects ports
STRIP=
CFLAGS+=-fno-omit-frame-pointer

- I do some tweaks in /etc/sysctl.conf (mainly vfs.read_min and
vfs.read_max), but I will admit I am not completely sure what those
do quite yet (I just saw the commit from scottl@ a while back talking
about how an increased vfs.read_min helps them at Netflix quite a
lot).  I also adjust kern.maxvnodes.

- Some ZFS ARC settings are adjusted in /boot/loader.conf (I'm playing
with some stuff I read in Andriy Gapon's ZFS PDF), but they definitely
do not have a major impact on the numbers I listed off.

- I do increase kern.maxdsiz, kern.dfldsiz, and kern.maxssiz in
/boot/loader.conf to 2560M/2560M/256M respectively, but that was mainly
from the days when I ran MySQL and needed huge userland processes.

All in all my numbers are low/small because of two things: the SSD, and
WITHOUT_CLANG.

Hope this gives you somewhere to start/stuff to ponder.

-- 
| Jeremy Chadwick                                   j...@koitsu.org |
| UNIX Systems Administrator                 http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |



Re: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Jeremy Chadwick
On Sun, Jul 07, 2013 at 03:51:12PM +1000, Ian Smith wrote:
 On Sun, 30 Jun 2013 15:02:57 -0700, Adrian Chadd wrote:
   On 30 June 2013 07:22, Ian Smith smi...@nimnet.asn.au wrote:
 [..]
Nothing of note that I can see, if that usb hub-to-bus remapping is
normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
Maybe someone who knows might comment on that?
 
 Does no one know what that signifies?  Maybe it's not relevant to this.

It's too vague to know.  The error comes from lapic_handle_error(),
which is a generic/small routine which pulls the local APIC error status
register.  (Note I'm saying APIC, not ACPI -- two different things)

apic_vector.S sets this up/makes use of this function, and it's done as
an interrupt handler.

I think this is one of those situations where you have to know *what* is
being set up/done at that moment in time for the error code to mean
something.  Maybe booting verbose would give more information as to what
was being done that led up to the line.

I've CC'd John Baldwin who might have some ideas.

-- 
| Jeremy Chadwick                                   j...@koitsu.org |
| UNIX Systems Administrator                 http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |



Re: make buildworld is now 50% slower

2013-07-07 Thread Matthew D. Fuller
Apropos of nothing, but...

On Sun, Jul 07, 2013 at 03:17:14AM -0700 I heard the voice of
Jeremy Chadwick, and lo! it spake thus:

 WITHOUT_LIB32=true

suggests you're running amd64, which I'm pretty sure means

 - I do increase kern.maxdsiz, kern.dfldsiz, and kern.maxssiz in
 /boot/loader.conf to 2560M/2560M/256M respectively, but that was mainly
 from the days when I ran MySQL and needed huge userland processes.

are not necessarily _in_creases, and may well be mostly _de_creases.
e.g., on a RELENG_9 box with 8 gig of physical RAM:

% sysctl kern.{max{d,s},dfld}siz
kern.maxdsiz: 34359738368
kern.maxssiz: 536870912
kern.dfldsiz: 134217728

while a -CURRENT box with 16 has dfldsiz blown all the way up too.  I
don't recall doing anything to change them at all recently, and a
glance over loader.conf, sysctl.conf, rc.local, and the kernel configs
doesn't turn up anything.
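
To make the comparison above concrete, the sysctl values from the RELENG_9 box can be set against the manual 2560M/2560M/256M settings. This is a small Python sketch, not part of the original thread; parse_size, human and compare are hypothetical helper names, and the sketch assumes the K/M/G suffixes are powers of 1024.

```python
# Assumption: size suffixes are binary multiples (powers of 1024).
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Parse a size such as '2560M' into bytes."""
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)


def human(nbytes):
    """Render a byte count using the largest unit that divides it evenly."""
    for suffix in ("G", "M", "K"):
        if nbytes % UNITS[suffix] == 0:
            return f"{nbytes // UNITS[suffix]}{suffix}"
    return str(nbytes)


# Values reported by sysctl on the RELENG_9 box quoted above.
defaults = {"kern.maxdsiz": 34359738368,
            "kern.dfldsiz": 134217728,
            "kern.maxssiz": 536870912}

# Manual loader.conf settings quoted above (2560M/2560M/256M).
tuned = {"kern.maxdsiz": "2560M",
         "kern.dfldsiz": "2560M",
         "kern.maxssiz": "256M"}


def compare(name):
    """Return (default, tuned, verdict) for one tunable."""
    new, old = parse_size(tuned[name]), defaults[name]
    verdict = "increase" if new > old else "decrease" if new < old else "unchanged"
    return human(old), human(new), verdict


for name in defaults:
    old, new, verdict = compare(name)
    print(f"{name}: default {old} -> tuned {new} ({verdict})")
```

Only the tunables whose manual value exceeds the reported default are real increases on such a box; the others lower the limit.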


-- 
Matthew Fuller (MF4839)   |  fulle...@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
   On the Internet, nobody can hear you scream.


Re: make buildworld is now 50% slower

2013-07-07 Thread Jeremy Chadwick
On Sun, Jul 07, 2013 at 05:47:31AM -0500, Matthew D. Fuller wrote:
 Apropos of nothing, but...
 
 On Sun, Jul 07, 2013 at 03:17:14AM -0700 I heard the voice of
 Jeremy Chadwick, and lo! it spake thus:
 
  WITHOUT_LIB32=true
 
 suggests you're running amd64, which I'm pretty sure means
 
  - I do increase kern.maxdsiz, kern.dfldsiz, and kern.maxssiz in
  /boot/loader.conf to 2560M/2560M/256M respectively, but that was mainly
  from the days when I ran MySQL and needed huge userland processes.
 
 are not necessarily _in_creases, and may well be mostly _de_creases.
 e.g., on a RELENG_9 box with 8 gig of physical RAM:
 
 % sysctl kern.{max{d,s},dfld}siz
 kern.maxdsiz: 34359738368
 kern.maxssiz: 536870912
 kern.dfldsiz: 134217728

 while a -CURRENT box with 16 has dfldsiz blown all the way up too.  I
 don't recall doing anything to change them at all recently, and a
 glance over loader.conf, sysctl.conf, rc.local, and the kernel configs
 doesn't turn up anything.

Thanks!

The settings I mention are from ancient times -- specifically RELENG_6
on i386 (I know because I found an old mailing list post of mine
discussing the settings with a user).

The problem as I said was that mysqld would crap itself (crash and be
quite loud about it) if the process allocated too much memory/became too
large.  I am fairly certain the issue related to the data size, **not**
the stack size (but I didn't see the harm in increasing that either).

It's good to know I can remove these on amd64.  Yay, one less thing in
loader.conf I have to deal with...  :-)  Thanks again!

-- 
| Jeremy Chadwick                                   j...@koitsu.org |
| UNIX Systems Administrator                 http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |



Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-07-07 Thread Andre Albsmeier
On Sun, 07-Jul-2013 at 09:41:12 +0200, Konstantin Belousov wrote:
 On Sun, Jul 07, 2013 at 09:25:53AM +0200, Andre Albsmeier wrote:
  OK, here we go (looks better now):
  
  GNU gdb 6.1.1 [FreeBSD]
  Copyright 2004 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain 
  conditions.
  Type show copying to see the conditions.
  There is absolutely no warranty for GDB.  Type show warranty for details.
  This GDB was configured as i386-marcel-freebsd...
  
  Unread portion of the kernel message buffer:
  dev = stripe/p, block = 592, fs = /palveli
  panic: ffs_blkfree_cg: freeing free block
  KDB: stack backtrace:
  db_trace_self_wrapper(c08207eb,d70fc924,c05fdfc9,c081df13,c08a82e0,...) at 
  db_trace_self_wrapper+0x26/frame 0xd70fc8f4
  kdb_backtrace(c081df13,c08a82e0,c0833a0b,d70fc930,d70fc930,...) at 
  kdb_backtrace+0x29/frame 0xd70fc900
  panic(c0833a0b,c2aae178,250,0,c2af80d4,...) at panic+0xc9/frame 0xd70fc924
  ffs_blkfree_cg(250,0,8000,49f,d70fcad0,...) at ffs_blkfree_cg+0x399/frame 
  0xd70fc9c8
  ffs_blkfree(c2b35100,c2af8000,c2b0d470,250,0,...) at ffs_blkfree+0xad/frame 
  0xd70fca00
  indir_trunc(fffa3ff4,,0,8000,0,...) at indir_trunc+0x658/frame 
  0xd70fcae0
  indir_trunc(dff3,,c072df0a,c2d68d00,c087abd8,...) at 
  indir_trunc+0x514/frame 0xd70fcbc0
  handle_workitem_freeblocks(0,d70fcc4c,2,246,c2ab1000,...) at 
  handle_workitem_freeblocks+0x2dc/frame 0xd70fcc24
  process_worklist_item(0,0,0,c086ae78,0,...) at 
  process_worklist_item+0x27a/frame 0xd70fcc6c
  softdep_process_worklist(c2b36548,0,54,c0835825,64,...) at 
  softdep_process_worklist+0x91/frame 0xd70fcc9c
  softdep_flush(0,d70fcd08,0,c2aac2f0,0,...) at softdep_flush+0x3e4/frame 
  0xd70f
  fork_exit(c0738bb0,0,d70fcd08) at fork_exit+0xa2/frame 0xd70fccf4
  fork_trampoline() at fork_trampoline+0x8/frame 0xd70fccf4
  --- trap 0, eip = 0, esp = 0xd70fcd40, ebp = 0 ---
  Uptime: 2d16h29m37s
  Physical memory: 503 MB
  Dumping 95 MB: 80 64 48 32 16
  
  No symbol stopped_cpus in current context.
  No symbol stoppcbs in current context.
  #0  doadump (textdump=1) at pcpu.h:249
  249 pcpu.h: No such file or directory.
  in pcpu.h
  (kgdb) where
  #0  doadump (textdump=1) at pcpu.h:249
  #1  0xc05f in kern_reboot (howto=260) at 
  /src/src-9/sys/kern/kern_shutdown.c:449
  #2  0xc05fe028 in panic (fmt=<value optimized out>) at 
  /src/src-9/sys/kern/kern_shutdown.c:637
  #3  0xc0717899 in ffs_blkfree_cg (ump=0xc2b35100, fs=0xc2af8000, 
  devvp=0xc2b0d470, bno=592, 
  size=32768, inum=1183, dephd=0xd70fcad0) at 
  /src/src-9/sys/ufs/ffs/ffs_alloc.c:2151
  #4  0xc0717c8d in ffs_blkfree (ump=0xc2b35100, fs=0xc2af8000, 
  devvp=0xc2b0d470, bno=592, 
  size=32768, inum=1183, vtype=VREG, dephd=0xd70fcad0) at 
  /src/src-9/sys/ufs/ffs/ffs_alloc.c:2280
  #5  0xc0730348 in indir_trunc (freework=0xc2f99100, dbn=1642816, 
  lbn=-376844)
  at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7965
  #6  0xc0730204 in indir_trunc (freework=0xc2f99100, dbn=1639680, lbn=-8205)
  at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7946
  #7  0xc07324bc in handle_workitem_freeblocks (freeblks=0xc2fc1e00, 
  flags=512)
  at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7588
  #8  0xc0730dfa in process_worklist_item (mp=0xc2b36548, target=10, 
  flags=512)
  at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1774
  #9  0xc07360c1 in softdep_process_worklist (mp=0xc2b36548, full=0)
  at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1558
  #10 0xc0738f94 in softdep_flush () at 
  /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
  #11 0xc05d1b82 in fork_exit (callout=0xc0738bb0 <softdep_flush>, arg=0x0, 
  frame=0xd70fcd08)
  at /src/src-9/sys/kern/kern_fork.c:988
  #12 0xc07ba904 in fork_trampoline () at 
  /src/src-9/sys/i386/i386/exception.s:279
  (kgdb) up 10
  #10 0xc0738f94 in softdep_flush () at 
  /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
  1414            progress += softdep_process_worklist(mp, 0);
  
  -Andre
 
 This looks unrelated, and exactly this panic usually has one of two
 causes:
 - corrupted filesystem, run fsck to recheck it;

root@palveli:~fsck /dev/stripe/p 
** /dev/stripe/p
** Last Mounted on /palveli
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
9895 files, 2039706 used, 15697693 free (5397 frags, 1961537 blocks, 0.0% 
fragmentation)

***** FILE SYSTEM IS CLEAN *****

 - faulty hardware, most likely RAM, but might be CPU/CPU cache/bus.

Well, of course I cannot prove that this is not the case.
But the box runs flawlessly otherwise. RAM is ECC monitored,
PSU is OK and airflow is OK. Sure, I can't look inside of
CPU etc.

 
 Is it the same machine where the bcopy panic occurred?

Yes. Let's see what it does over the next few days...

-Andre

Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-07-07 Thread Jeremy Chadwick
On Sun, Jul 07, 2013 at 02:13:54PM +0200, Andre Albsmeier wrote:
 On Sun, 07-Jul-2013 at 09:41:12 +0200, Konstantin Belousov wrote:
  On Sun, Jul 07, 2013 at 09:25:53AM +0200, Andre Albsmeier wrote:
   OK, here we go (looks better now):
   
   [..]
   
 -Andre
  
  This looks unrelated, and exactly this panic usually has one of two
  causes:
  - corrupted filesystem, run fsck to recheck it;
 
 root@palveli:~fsck /dev/stripe/p 
 ** /dev/stripe/p
 ** Last Mounted on /palveli
 ** Phase 1 - Check Blocks and Sizes
 ** Phase 2 - Check Pathnames
 ** Phase 3 - Check Connectivity
 ** Phase 4 - Check Reference Counts
 ** Phase 5 - Check Cyl groups
 9895 files, 2039706 used, 15697693 free (5397 frags, 1961537 blocks, 0.0% 
 fragmentation)
 
 ***** FILE SYSTEM IS CLEAN *****

Taken from your previous mail (showing only UFS stuff):

http://lists.freebsd.org/pipermail/freebsd-stable/2013-June/073817.html

 fstab:
 --
 /dev/da0s1a/   ufs noatime,rw  
 

Re: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Ian Smith
On Sun, 7 Jul 2013 03:26:24 -0700, Jeremy Chadwick wrote:
  On Sun, Jul 07, 2013 at 03:51:12PM +1000, Ian Smith wrote:
   On Sun, 30 Jun 2013 15:02:57 -0700, Adrian Chadd wrote:
 On 30 June 2013 07:22, Ian Smith smi...@nimnet.asn.au wrote:
   [..]
  Nothing of note that I can see, if that usb hub-to-bus remapping is
  normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
  Maybe someone who knows might comment on that?
   
   Does no one know what that signifies?  Maybe it's not relevant to this.
  
  It's too vague to know.  The error comes from lapic_handle_error(),
  which is a generic/small routine which pulls the local APIC error status
  register.  (Note I'm saying APIC, not ACPI -- two different things)

Indeed; I've been familiar with PICs since c.'79.  Googling to check 
what the 'A' stood for I found this .. from '97 but usefully descriptive 
perhaps: http://people.freebsd.org/~fsmp/SMP/papers/apicsubsystem.txt

I also found this from March 2011 involving Mike Tancsa, you and jhb@ :)
http://freebsd.1045724.n5.nabble.com/CPU0-local-APIC-error-0x40-CPU1-local-APIC-error-0x40-td3961805.html

  apic_vector.S sets this up/makes use of this function, and it's done as
  an interrupt handler.
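
The value in that message can at least be decoded. The number printed is the content of the local APIC Error Status Register, and going by the bit layout documented in the Intel SDM (an assumption taken from that manual, not from the FreeBSD source), 0x40 is bit 6, "Received Illegal Vector". A minimal Python sketch follows; `LAPIC_ESR_BITS` and `decode_lapic_esr` are illustrative names, not kernel identifiers:

```python
# Bit names follow the Error Status Register layout in the Intel SDM.
# This table is an assumption for illustration, not copied from FreeBSD.
LAPIC_ESR_BITS = {
    0: "Send Checksum Error",
    1: "Receive Checksum Error",
    2: "Send Accept Error",
    3: "Receive Accept Error",
    4: "Redirectable IPI",
    5: "Send Illegal Vector",
    6: "Received Illegal Vector",
    7: "Illegal Register Address",
}


def decode_lapic_esr(esr):
    """Return the names of all error bits set in an ESR value, lowest bit first."""
    return [name for bit, name in sorted(LAPIC_ESR_BITS.items())
            if esr & (1 << bit)]


if __name__ == "__main__":
    # The value reported in this thread.
    print(decode_lapic_esr(0x40))
```

This only names the bit; it does not say which device or code path delivered the bad vector, which is the part that still needs context.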

Whether an (unserviced?) interrupt error is related to Adrian's symptom 
- apparent total failure of USB reinitialisation on resume, but only if 
no USB devices exist in the external slots - remains to be seen.  hps@ 
has just confirmed that it should work the same as on boot, but then 
this error was flagged on boot - perhaps it also manifests on resume?

  I think this is one of those situations where you have to know *what* is
  being set up/done at that moment in time for the error code to mean
  something.  Maybe booting verbose would give more information as to what
  was being done that led up to the line.
  
  I've CC'd John Baldwin who might have some ideas.

Thanks.  We have verbose dmesg already.  Thread starts (in -stable) at
http://lists.freebsd.org/pipermail/freebsd-stable/2013-June/073917.html
and amidst some wild goose chases, pointer to verbose dmesg etc is at
http://lists.freebsd.org/pipermail/freebsd-stable/2013-June/074018.html

cheers, Ian


Re: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Adrian Chadd
I don't think it's a USB controller issue.

Those ports are connected to USB hubs, right? I wonder if there's some
ACPI nonsense that's resulting in the hubs not being powered up on
resume.



-adrian

On 7 July 2013 00:32, Hans Petter Selasky
hans.petter.sela...@bitfrost.no wrote:
 Hi,

 FYI: The USB stack will currently run a complete controller reset upon
 resume, like during boot.

 --HPS



 -Original message-
 From:Ian Smith smi...@nimnet.asn.au
 Sent: Sunday 7th July 2013 7:52
 To: Adrian Chadd adr...@freebsd.org
 Cc: freebsd-a...@freebsd.org; freebsd-stable@freebsd.org;
 freebsd-...@freebsd.org
 Subject: Re: USB ports on Lenovo T400 do not work after a suspend/resume

 On Sun, 30 Jun 2013 15:02:57 -0700, Adrian Chadd wrote:
   On 30 June 2013 07:22, Ian Smith smi...@nimnet.asn.au wrote:
 [..]
Nothing of note that I can see, if that usb hub-to-bus remapping is
normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
Maybe someone who knows might comment on that?

 Does no one know what that signifies?  Maybe it's not relevant to this.

Just checking: you've tried other USB devices apart from uftdi0?
  
   Yup, there's no 5v on the port.

 I was rather taken aback to hear this.  Would not this indicate a
 failure to reinitialise the basic underlying USB hardware on resume?

 More than a bit bemused, Ian
 ___
 freebsd-a...@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-acpi
 To unsubscribe, send any mail to freebsd-acpi-unsubscr...@freebsd.org



Re: ixgbe Jumbo race condition leading to Deadlock

2013-07-07 Thread Jack Vogel
The potential race condition, as the data sheet puts it, arises only when
you try to manage your RX ring by reading the RDH register. That is a bad
idea anyway, and none of our (Intel) drivers do it. Using the DD bit is
what you want to do. The DD bit is set when the descriptor is written
back, and that happens when the DMA is complete.

The packet is incomplete until you reach the descriptor with EOP set. In
my code an mbuf chain is created, and as each new descriptor is processed
the pointer to the head of the whole chain is kept in rxbuf->fmp, so when
you get to the EOP descriptor you are ready to send the whole chain to
the stack.
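
To make the DD/EOP handling concrete, here is a small Python model of that receive loop. It is only a sketch under stated assumptions, not the driver code: `RxDesc`, `RxRing` and `rx_poll` are hypothetical names, the `DD` and `EOP` values are placeholders for the status bits, and `ring.chain` merely stands in for the chain head kept in rxbuf->fmp.

```python
from dataclasses import dataclass, field

DD = 0x1   # descriptor done: the hardware has written this descriptor back
EOP = 0x2  # end of packet: last descriptor of a frame


@dataclass
class RxDesc:
    status: int = 0
    data: bytes = b""


@dataclass
class RxRing:
    descs: list
    next_to_check: int = 0
    # Partial packet carried across calls (the role rxbuf->fmp plays).
    chain: list = field(default_factory=list)


def rx_poll(ring):
    """Consume written-back descriptors and return the completed packets.

    Only the DD bit decides whether a descriptor is consumed; the head
    register is never consulted. A packet is delivered once its EOP
    descriptor has been seen; until then its buffers stay in ring.chain.
    """
    packets = []
    while True:
        desc = ring.descs[ring.next_to_check]
        if not (desc.status & DD):
            break  # not written back yet; try again on the next poll
        ring.chain.append(desc.data)
        is_eop = bool(desc.status & EOP)
        # Hand the slot back (a real driver would refresh the buffer here).
        desc.status = 0
        desc.data = b""
        ring.next_to_check = (ring.next_to_check + 1) % len(ring.descs)
        if is_eop:
            packets.append(b"".join(ring.chain))
            ring.chain = []
    return packets
```

With four of the five descriptors of a jumbo frame written back, `rx_poll` returns nothing but remembers the partial chain; once the EOP descriptor shows up, the next call returns the whole frame.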

It's good that you are using ONEBUF, since packet split has hardware issues
on 82599.

Are you developing a new driver, or simply having issues using mine?

Regards,

Jack



On Sun, Jul 7, 2013 at 2:24 AM, Kaushal Bhandankar <kaushal...@gmail.com> wrote:

 In 82599, for a jumbo packet of 9.5 K (which consumes 5 descriptors of
 2048 bytes each), when does the descriptor write-back happen? Does it
 happen per descriptor, or once for the whole group of descriptors? Is it
 possible that all descriptors except the last one are written back, so
 that when I read the RDH register the last descriptor is still pending
 inside the 82599?
 We are using srrctl |= IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;

 In my setup I do not see EOP set even after reading 5 descriptors, so
 checking DD gives me an incomplete packet. What should I do in such a
 case?

 References from Data sheet:
 - Checking through DD bits eliminates a potential race condition: all
 descriptor data is posted internally prior to incrementing the head
 register and a read of the head register could potentially pass the
 descriptor waiting inside the 82599.

 Regards,
 Kaushal


Re: ixgbe Jumbo race condition leading to Deadlock

2013-07-07 Thread Kaushal Bhandankar
Hi Jack,
Thanks for the explanation. Are you suggesting that I keep reading RX
descriptors that have the DD bit set, and hold them pending until I get
the descriptor with EOP set? What is the maximum delay that can be
expected before the EOP descriptor is written back?

Regards,
Kaushal




Re: ixgbe Jumbo race condition leading to Deadlock

2013-07-07 Thread Kaushal Bhandankar
The solution I have in mind is this:

I have a function which pre-calculates the buffers required for processing
the packet, to eliminate any lack-of-memory errors during processing.
In this function I loop over the descriptors from next_to_check onwards.
If I loop over descriptors with DD set and do not see EOP set by the time
I reach 5 descriptors, I return a failure and leave next_to_check etc.
unchanged. On the next read, or a later one, the descriptor with EOP will
hopefully have been written back. This should ensure that the race
condition does not happen.
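
A rough Python sketch of that pre-scan, for discussion only. The ring is reduced to a list of status words, and `descriptors_ready`, `MAX_DESC_PER_PKT` and the `DD`/`EOP` values are hypothetical names and placeholders rather than anything from the ixgbe sources:

```python
DD = 0x1   # descriptor done (written back by the hardware)
EOP = 0x2  # end of packet
MAX_DESC_PER_PKT = 5  # a 9.5 K frame in 2048-byte buffers, as in this thread


def descriptors_ready(status, next_to_check, max_desc=MAX_DESC_PER_PKT):
    """Peek at the ring without consuming anything.

    Returns the number of descriptors the next packet spans if every one
    of them, including the EOP descriptor, has been written back.
    Returns 0 otherwise, so the caller can leave next_to_check unchanged
    and retry on a later poll.
    """
    n = len(status)
    for count in range(1, max_desc + 1):
        st = status[(next_to_check + count - 1) % n]
        if not (st & DD):
            return 0  # this descriptor is still pending in the hardware
        if st & EOP:
            return count  # whole packet is present
    return 0  # max_desc descriptors seen and still no EOP


if __name__ == "__main__":
    ring_status = [DD, DD, DD, DD, 0, 0, 0, 0]
    print(descriptors_ready(ring_status, 0))  # EOP descriptor still pending
    ring_status[4] = DD | EOP
    print(descriptors_ready(ring_status, 0))  # whole frame is available
```

One caveat with this shape: returning the same failure for "EOP not written back yet" and "more than 5 descriptors without EOP" would make the second case retry forever, so a real implementation would probably want to tell the two apart.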

Let me know if it sounds good.

Regards,
Kaushal




RE: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Hans Petter Selasky
Hi,

The USB code should re-attach the uhub driver to the root HUB and any other 
HUBs after resume. Part of the attach code is to set the power on.

See /sys/dev/usb/usb_hub.c

And:

grep -r UHF_PORT_POWER /sys/dev/usb/

--HPS
 
 


Re: status of autotuning freebsd for 9.2

2013-07-07 Thread Alfred Perlstein

On 7/7/13 1:34 AM, Andre Oppermann wrote:
 On 07.07.2013 08:32, Alfred Perlstein wrote:
  Andre,
  
  Are you going to have time to MFC things from -current for
  auto-tuning -stable before 9.2?
 
 I simply ran out of time on Friday and MFCing such a big change requires
 more testing.
 
  I fear (maybe unnecessarily?) that we are about to ship yet another
  release that can't do basic 10gigE when sufficient memory exists.
 
 There was some debate with myself whether such a behavior changing MFC
 would be appropriate for a mid-stream stable release.  I guess yes,
 though a number of people who currently set the parameters manually
 would have to remove their tuning settings.
 
  If you don't have time, then let me know and I'll see what I can do.
 
 Can you help me with testing?

Yes.  Please give me your proposed changes and I'll stand up a machine
and give feedback.


-Alfred


Re: XHCI umass support breaks between r248085 and r252560 on 9-STABLE

2013-07-07 Thread Alexandre Kovalenko
I do apologize for the typo below, which made my message unclear: I meant to 
say that I have reverted _/usr/src/sys/dev/usb/controller_ directory, 
specifically the following files:

root@twinhead:/usr/src/sys/dev/usb/controller # svn diff -r252560 | grep Index:
Index: xhci_pci.c
Index: ohci_pci.c
Index: xhci.c
Index: usb_controller.c
Index: xhcireg.h
root@twinhead:/usr/src/sys/dev/usb/controller # 

which (I think) are USB related and not CAM related. Please, let me know if I 
am wrong.

Side question (I have been off the lists for a while): is it now considered 
polite to top-post? It was frowned upon way back when… if it still is, I do 
apologize, but I can see no good way to fix it at this point.


Alexandre Sunny Kovalenko (Олександр Коваленко)


On Jul 7, 2013, at 3:36 AM, Hans Petter Selasky 
hans.petter.sela...@bitfrost.no wrote:

 Hi,
 
 Check for CAM/SCSI related changes. There have not been many USB changes 
 recently, so this is possibly not USB related.
 
 Thank you,
 
 --HPS
  
 -Original message-
  From:Alexandre Kovalenko bsd.gai...@gmail.com
  Sent: Thursday 4th July 2013 20:58
  To: freebsd-...@freebsd.org
  Cc: freebsd-stable@freebsd.org
  Subject: XHCI umass support breaks between r248085 and r252560 on 9-STABLE
  
  Three different external hard drives (Seagate, Western Digital and noname 
  USB 3.0 enclosure) refused to be recognized as the umass devices. Reverting 
  /usr/src/sys/dev/bsd/controller to r248085, building and loading just xhci 
  module makes drives appear again. Below are snippets from the log in both 
  cases:
  
  Non working:
  
  Jul  4 14:35:17 twinhead kernel: xhci0: XHCI (generic) USB 3.0 controller 
  mem 0xfddfe000-0xfddf irq 16 at device 0.0 on pci2
  Jul  4 14:35:17 twinhead kernel: xhci0: 64 byte context size.
  Jul  4 14:35:17 twinhead kernel: usbus0 on xhci0
  Jul  4 14:35:17 twinhead kernel: usbus0: 5.0Gbps Super Speed USB v3.0
  Jul  4 14:35:17 twinhead kernel: ugen0.1: 0x1912 at usbus0
  Jul  4 14:35:17 twinhead kernel: uhub0: 0x1912 XHCI root HUB, class 9/0, 
  rev 3.00/1.00, addr 1 on usbus0
  Jul  4 14:35:17 twinhead kernel: uhub0: 8 ports with 8 removable, self 
  powered
  Jul  4 14:35:24 twinhead kernel: ugen0.2: ASMedia at usbus0
  Jul  4 14:35:24 twinhead kernel: umass0: ASMedia AS2105, class 0/0, rev 
  3.00/0.01, addr 1 on usbus0
  Jul  4 14:35:29 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 
  12 00 00 00 24 00 
  Jul  4 14:35:29 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
  request completed with an error
  Jul  4 14:35:29 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
  Jul  4 14:35:30 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 
  12 00 00 00 24 00 
  Jul  4 14:35:30 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
  request completed with an error
  Jul  4 14:35:30 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
  Jul  4 14:35:35 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 
  12 00 00 00 24 00 
  Jul  4 14:35:35 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
  request completed with an error
  Jul  4 14:35:35 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
  Jul  4 14:35:36 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 
  12 00 00 00 24 00 
  Jul  4 14:35:36 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
  request completed with an error
  Jul  4 14:35:36 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
  Jul  4 14:35:41 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 
  12 00 00 00 24 00 
  Jul  4 14:35:41 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
  request completed with an error
  Jul  4 14:35:41 twinhead kernel: (probe0:umass-sim0:0:0:0): Error 5, 
  Retries exhausted
  
  Working:
  
  Jul  4 14:40:20 twinhead kernel: ugen0.2: ASMedia at usbus0 (disconnected)
  Jul  4 14:40:20 twinhead kernel: umass0: at uhub0, port 2, addr 1 
  (disconnected)
  Jul  4 14:40:27 twinhead kernel: ugen0.2: vendor 0x174c at usbus0
  Jul  4 14:40:27 twinhead kernel: umass0: vendor 0x174c product 0x5106, 
  class 0/0, rev 3.00/0.01, addr 1 on usbus0
  Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): REPORT LUNS. 
  CDB: a0 00 00 00 00 00 00 00 00 10 00 00 
  Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: 
  SCSI Status Error
  Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): SCSI status: 
  Check Condition
  Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): SCSI sense: 
  ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
  Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): Error 22, 
  Unretryable error
  Jul  4 14:40:27 twinhead kernel: da0 at umass-sim0 bus 0 scbus4 target 0 
  lun 0
  Jul  4 14:40:27 twinhead kernel: da0: Hitachi HTS542520K9SA00 BBDO Fixed 
  Direct Access SCSI-5 device 
  Jul  4 14:40:27 twinhead kernel: da0: 400.000MB/s transfers
  Jul  4 14:40:27 twinhead kernel: 

Re: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Lars Engels
On Sun, Jun 30, 2013 at 03:02:57PM -0700, Adrian Chadd wrote:
 On 30 June 2013 07:22, Ian Smith smi...@nimnet.asn.au wrote:
 
  After removing [numbers] (for WITNESS?), diff started making sense.
  The below is between the first and second suspend/resume cycles in
  dmesg-3.txt, encompassing the others.
 
 Cool!
 
  Nothing of note that I can see, if that usb hub-to-bus remapping is
  normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
  Maybe someone who knows might comment on that?
 
  Just checking: you've tried other USB devices apart from uftdi0?
 
 Yup, there's no 5v on the port.

Oh, BTW: can you check whether you have power on the ports after the first
resume, and no power after all subsequent resumes until you reboot your
notebook?
That's the situation I had and maybe it can lead to something. ;)




Shutdown hangs on unmount of a gjournaled file system in 8-Stable

2013-07-07 Thread Andreas Longwitz
The problem occurs after an update of 8-stable from r248120 to r252111.
Sometimes shutdown hangs:

Waiting (max 60 seconds) for system process `vnlru' to stop...done
Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...0 0 done
All buffers synced.

From the kernel dump I see that the deadlock occurs on unmount of a
gjournaled file system. Two processes are involved:

db> ps
pid ppid pgrp uid state wmesg  wchan  cmd
  1   0   1   0  SLs  mount dr 0xff007f7e559c [init]
 18   0   0   0  SL   suspwt   0xff007f7e5364 [g_journal switcher]

(kgdb) info threads
 158 Thread 12 (PID=1: init)  sched_switch (td=0xff000235e8e0,
    newtd=<value optimized out>, flags=<value optimized out>)
    at /usr/src/sys/kern/sched_ule.c:1932
 217 Thread 100076 (PID=18: g_journal switcher)  sched_switch
    (td=0xff0002bd6000, newtd=<value optimized out>,
    flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1932


(kgdb) thread 158
[Switching to thread 158 (Thread 12)]#0  sched_switch
    (td=0xff000235e8e0, newtd=<value optimized out>,
    flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1932
1932            cpuid = PCPU_GET(cpuid);

(kgdb) bt
#0  sched_switch (td=0xff000235e8e0, newtd=value optimized out,
  flags=value optimized out)
  at /usr/src/sys/kern/sched_ule.c:1932
#1  0x80407836 in mi_switch (flags=260, newtd=0x0) at

  /usr/src/sys/kern/kern_synch.c:466
#2  0x8043e0e2 in sleepq_wait (wchan=0xff007f7e559c, pri=80)
  at /usr/src/sys/kern/subr_sleepqueue.c:613
#3  0x80407fc6 in _sleep (ident=0xff007f7e559c,
  lock=0xff007f7e52f0,
  priority=value optimized out,
  wmesg=0x8069f595 mount drain, timo=0)
  at /usr/src/sys/kern/kern_synch.c:250
#4  0x8048ee42 in dounmount (mp=0xff007f7e52f0,
  flags=524288, td=value optimized out)
  at /usr/src/sys/kern/vfs_mount.c:1266
#5  0x80493202 in vfs_unmountall () at
  /usr/src/sys/kern/vfs_subr.c:3321
#6  0x803fec69 in boot (howto=value optimized out) at
  /usr/src/sys/kern/kern_shutdown.c:428
#7  0x803fef86 in reboot (td=value optimized out,
  uap=0xff8000238bb0)
  at /usr/src/sys/kern/kern_shutdown.c:191
#8  0x805db1b4 in amd64_syscall (td=0xff000235e8e0,
  traced=0) at subr_syscall.c:114
#9  0x805c282c in Xfast_syscall () at
 /usr/src/sys/amd64/amd64/exception.S:387

(kgdb) f 5
#5  0x80493202 in vfs_unmountall ()
    at /usr/src/sys/kern/vfs_subr.c:3321
3321            error = dounmount(mp, MNT_FORCE, td);

(kgdb) p mp->mnt_lockref
$1 = 1

(kgdb) f 4
#4  0x8048ee42 in dounmount (mp=0xff007f7e52f0,
    flags=524288, td=<value optimized out>)
    at /usr/src/sys/kern/vfs_mount.c:1266
1266                    error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,

(kgdb) list
1261            if (flags & MNT_FORCE)
1262                    mp->mnt_kern_flag |= MNTK_UNMOUNTF;
1263            error = 0;
1264            if (mp->mnt_lockref) {
1265                    mp->mnt_kern_flag |= MNTK_DRAINING;
1266                    error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,
1267                        "mount drain", 0);
1268            }
1269            MNT_IUNLOCK(mp);
1270            KASSERT(mp->mnt_lockref == 0,

(kgdb) thread 217
[Switching to thread 217 (Thread 100076)]#0  sched_switch
    (td=0xff0002bd6000,
    newtd=<value optimized out>,
    flags=<value optimized out>)
    at /usr/src/sys/kern/sched_ule.c:1932
1932            cpuid = PCPU_GET(cpuid);

(kgdb) bt
#0  sched_switch (td=0xff0002bd6000, newtd=<value optimized out>,
    flags=<value optimized out>)
    at /usr/src/sys/kern/sched_ule.c:1932
#1  0x80407836 in mi_switch (flags=260, newtd=0x0)
    at /usr/src/sys/kern/kern_synch.c:466
#2  0x8043e0e2 in sleepq_wait
    (wchan=0xff007f7e5364, pri=159)
    at /usr/src/sys/kern/subr_sleepqueue.c:613
#3  0x80407fc6 in _sleep (ident=0xff007f7e5364,
    lock=0xff007f7e52f0,
    priority=<value optimized out>,
    wmesg=0x806a0813 "suspwt", timo=0)
    at /usr/src/sys/kern/kern_synch.c:250
#4  0x804a25f0 in vfs_write_suspend (mp=0xff007f7e52f0)
    at /usr/src/sys/kern/vfs_vnops.c:1277
#5  0x80c843bd in g_journal_switcher
    (arg=<value optimized out>)
    at /usr/src/sys/modules/geom/geom_journal/../../../geom/journal/g_journal.c:2968
#6  

Re: XHCI umass support breaks between r248085 and r252560 on 9-STABLE

2013-07-07 Thread Scot Hetzel
On Sun, Jul 7, 2013 at 3:09 PM, Alexandre Kovalenko
bsd.gai...@gmail.com wrote:

 Side question (I have been off the lists for a while): is it now considered
 polite to top-post? It was frowned upon way back when… if it still is not, I
 do apologize, but I can see no good way to fix it at this point.



Side Answer:  I believe that it is preferred that you don't top-post on
the lists.  To get around GMail's top-post issue:

- just delete the first 2 lines in the reply
- scroll thru the quoted message deleting what isn't relevant to your reply
- inline your responses to the relevant parts of the message

-- 
DISCLAIMER:

No electrons were maimed while sending this message. Only slightly bruised.


Re: Sanity Check on Mac Mini

2013-07-07 Thread Doug Hardie
As I previously indicated, I have tested a couple more Minis and updated the 
instructions with what I learned.  Here is the revised version:

2.12  Installing FreeBSD on an Apple Mac Mini

The Mac Mini is an attractive server platform.  It's small, runs cool, draws
little power, and is reasonably cheap.  There are a variety of configurations
available.  However, even the bottom of the line seems to be a powerful server.

There are a few issues with installing FreeBSD on the mini.  Mostly they derive 
from the newer hardware it uses and that it uses EFI rather than a BIOS for 
booting.  There is not a simple install that will get the unit working, but the 
additional steps required are quite simple.  The goal of these instructions is 
to get FreeBSD 9.1-Release running as a headless server on a Late 2012 Mini, 
Model No A1347.  It's probably possible to set up the mini as a workstation, but 
that would require some additional effort to test the display and mouse 
interfaces and find fixes for any issues with those.

The original intent was to have the server without system source so that it 
could be maintained using freebsd-update.  However, that will probably have to 
wait until 9.2-Release is available.  In the meantime, freebsd-update has to be 
used with care since I believe it will replace the modified bge files.


2.12.1  Preparing for the Install

2.12.1.1  Automatic Startup after Power is Restored

Generally servers need to be automatically restarted after a power failure.
Start up the Mini in OS-X.  If this is a new unit, I go through the
registration so that Apple has it on record for use with AppleCare.  Go to
System Preferences and select Energy Saver.  I set "Put hard disk to sleep
when possible", "Wake for network access", "Allow power button to put the
computer to sleep", and most importantly "Start up automatically after a
power failure".  Note that shutting down the computer at this time will not
permit it to come back on when power is applied.  You have to pull the power
plug.  Apparently this setting is a bit mislabeled.  It's more like "Return
the power to the last status".

These settings work properly with Mac OS-X.  I have not found a way to set the 
startup settings while running FreeBSD yet.  These settings do carry over to 
the FreeBSD install.  However, you may need to lock the energy saver 
preferences for that to happen.

Shut down the Mini.


2.12.1.2  Preparing FreeBSD for the installation

You can select either the i386 or the amd64 distributions.  Both have been 
tested with these procedures and yield a working server.  The bottom of the 
line mini comes with 4 GB of memory installed.  The i386 distribution will only 
use 2 GB.  The remainder will not be used.  The amd64 distribution builds 
larger binary modules, but it will use all the memory.

Download the 9.1 Release distribution Memstick Image.  You will need to copy 
that to a memstick.  There are instructions in section 2.3.5 for copying the 
image to the memstick.  Obtain a display and USB keyboard and connect them to 
the mini.

With a browser go to svnweb.freebsd.org/base/head/sys/dev.  Click on the bge 
folder.  Click on the name if_bge.c.  Find Revision 245931.  Click on the 
download link and save the file.

Go back to the bge page and click on if_bgereg.h.  Find Revision 243686. Click 
on the download link and save the file.  Edit the saved if_bgereg.h file and 
add the following to the end:

#define PCIER_DEVICE_CAP                0x4
#define PCIER_DEVICE_CTL                0x8
#define PCIEM_CAP_MAX_PAYLOAD           0x0007
#define PCIEM_CTL_RELAXED_ORD_ENABLE    0x0010
#define PCIEM_CTL_NOSNOOP_ENABLE        0x0800
#define PCIER_DEVICE_STA                0xa
#define PCIEM_STA_CORRECTABLE_ERROR     0x0001
#define PCIEM_STA_NON_FATAL_ERROR       0x0002
#define PCIEM_STA_FATAL_ERROR           0x0004
#define PCIEM_STA_UNSUPPORTED_REQ       0x0008

Some of the names in if_bgereg.h were changed after the 9.1 Release was
created, but before the corrections to the bge driver were included.  It
would be possible to grab the appropriate earlier version of if_bgereg.h;
however, other drivers that are rebuilt with the kernel use the new names.
Adding the defines seems to be the easiest approach.  Also, it worked.

Go back to the dev page and click on the mii folder.  Click on brgphy.c.  Find 
revision 244482.  Click on the download link and save the file.

Copy the saved files to another memstick.


2.12.2  Installing the 9.1 Release

Boot the mini using the memstick.  Hold down the Option key on the keyboard and 
power up the mini.  You will hear the hardware check beep and shortly 
thereafter the screen will show one or more boot icons.  Double click on the 
one named Windows.  It will have a USB icon.

Continue through the normal installation procedure as detailed earlier in this 
chapter.  If you are building a FreeBSD only server, use the entire disk.  
Also, be sure to install the system source.  You will need it later.

You will 

Re: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Adrian Chadd
Nope, no power after first resume if i have nothing plugged in.

Why?



-adrian

On 7 July 2013 13:49, Lars Engels lars.eng...@0x20.net wrote:
 On Sun, Jun 30, 2013 at 03:02:57PM -0700, Adrian Chadd wrote:
 On 30 June 2013 07:22, Ian Smith smi...@nimnet.asn.au wrote:

  After removing [numbers] (for WITNESS?), diff started making sense.
  The below is between the first and second suspend/resume cycles in
  dmesg-3.txt, encompassing the others.

 Cool!

  Nothing of note that I can see, if that usb hub-to-bus remapping is
  normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
  Maybe someone who knows might comment on that?
 
  Just checking: you've tried other USB devices apart from uftdi0?

 Yup, there's no 5v on the port.

 Oh, BTW: can you check if you have power on the ports after the first
 resume and no power after all next resumes until you reboot your
 notebook?
 That's the situation I had and maybe it can lead to something. ;)


Re: Sanity Check on Mac Mini

2013-07-07 Thread Yonghyeon PYUN
On Sun, Jul 07, 2013 at 05:56:09PM -0700, Doug Hardie wrote:
 As I previously indicated, I have tested a couple more Minis and updated the 
 instructions with what I learned.  Here is the revised version:
 

[...]

 2.12.3  Rebuilding the kernel to support the Ethernet Interface
 
 Once the system has been rebooted, you will notice that ifconfig may not show 
 the ethernet interface.  There are at least two different chips being used 
 for that interface.  Some of the units work right out of the box.  Others do 
 not.  I have two units and the only visible difference is the Part No.  Part 
 No. MC815LL/A appears to be the older unit and the bge interface worked on 
 install.  Part No. MD387LL/A is newer and has the newer chips that require the 
 driver update.
 
  If the bge interface does not show, then the bge driver needs to be updated 
 to recognize the NIC.  Mount the second memstick with the files retrieved 
 earlier and move them into the kernel source.  I used the following commands:
 
 cp -p brgphy.c /usr/src/sys/dev/mii
 cp -p if_bgereg.h /usr/src/sys/dev/bge
 cp -p if_bge.c /usr/src/sys/dev/bge
 
 then rebuild the kernel.  Note the instructions here are for GENERIC, but you 
 can use KERNCONF to specify a custom kernel.
 
 cd /usr/src
 make buildkernel
 make installkernel
 
 Reboot the server as before.  Now ifconfig will show bge0 and it will work.  
 The mini is now running a useable version of 9.1-Release.  There are still 
 some items remaining to be resolved:  Updating the kernel with the recent 
 security patches, Disabling Bluetooth and Wireless to save power, and 
 unattended rebooting.  These issues are still being addressed.
 

I'm not sure whether this bge(4) controller is sitting behind a
TB (Apple Thunderbolt) bridge. The Apple TB bridge has a known
performance issue and some BCM controllers have a work-around to
mitigate it. The work-around is not enabled by default, so I'm
interested in bge(4) performance numbers on your box. If you can't
get more than 920 ~ 930Mbps (950Mbps or higher with jumbo frames),
please let me know.
I haven't enabled the work-around yet since it will hurt other BCM
controllers when the TB bridge is absent.


Re: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Ian Smith
On Sun, 7 Jul 2013 18:47:03 -0700, Adrian Chadd wrote:
  On 7 July 2013 13:49, Lars Engels lars.eng...@0x20.net wrote:
   On Sun, Jun 30, 2013 at 03:02:57PM -0700, Adrian Chadd wrote:
   On 30 June 2013 07:22, Ian Smith smi...@nimnet.asn.au wrote:
  
After removing [numbers] (for WITNESS?), diff started making sense.
The below is between the first and second suspend/resume cycles in
dmesg-3.txt, encompassing the others.
  
   Cool!
  
Nothing of note that I can see, if that usb hub-to-bus remapping is
normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
Maybe someone who knows might comment on that?
   
Just checking: you've tried other USB devices apart from uftdi0?
  
   Yup, there's no 5v on the port.
  
   Oh, BTW: can you check if you have power on the ports after the first
   resume and no power after all next resumes until you reboot your
   notebook?
   That's the situation I had and maybe it can lead to something. ;)

  Nope, no power after first resume if i have nothing plugged in.
  
  Why?

Checking one more point .. do the USB ports come up ok if you originally 
boot with nothing plugged in?  If so (or if not), does that local APIC 
error message appear the same then too?

cheers, Ian

PS OT: finally found a USB keyboard but I'd forgotten that my friend's 
machine is an SL500, not T500.  Moreover, because its keyboard+trackpad 
etc is non-working (internally disconnected), I have no way to resume it 
without the kbd (and the 9.1-R memstick) plugged in.  Even with kbd and 
memstick left in and using acpiconf -s3 it suspends ok but is hung after 
resume by dabbing power button; no screen and kbd is dead too - sorry, 
no help there.  OTOH my son just bought a refurb T430 ('doze 7, beats 8 
anyway) which I should get to play with a bit this week.


Re: Shutdown hangs on unmount of a gjournaled file system in 8-Stable

2013-07-07 Thread Konstantin Belousov
On Mon, Jul 08, 2013 at 12:26:43AM +0200, Andreas Longwitz wrote:
 The deadlock can be explained now: pid 1 (init) sleeps on "mount drain"
 because mp->mnt_lockref was 1. This setting was done by pid 18 (gjournal
 switcher) by calling vfs_busy(). pid 18 now sleeps on "suspwt" because
 mp->mnt_writeopcount was 1. This setting was done by pid 1 before going
 to sleep by calling vn_start_write() in dounmount().
 
 I think the reason for this deadlock is the commit r249055 which seems
 not to be compatible with gjournal.

Thank you for the analysis. I think 'not compatible' is something of an
understatement. The situation clearly causes a deadlock, you are right.

The vfs_busy(); vfs_write_suspend(); call sequence is somewhat dubious,
in fact, exactly because unmount could start in between. I think that
vfs_write_suspend() must avoid setting MNTK_SUSPEND if unmount was
started. The patch below, for HEAD, should fix the problem by marking the
callers of vfs_write_suspend() that are not protected by the covered
vnode lock with the VS_SKIP_UNMOUNT flag.

I believe that the conflicts on stable/8 should be trivial, if any.

diff --git a/sys/geom/journal/g_journal.c b/sys/geom/journal/g_journal.c
index a3c996c..3ce2785 100644
--- a/sys/geom/journal/g_journal.c
+++ b/sys/geom/journal/g_journal.c
@@ -2960,7 +2960,7 @@ g_journal_do_switch(struct g_class *classp)
 		GJ_TIMER_STOP(1, &bt, "BIO_FLUSH time of %s", sc->sc_name);
 
 		GJ_TIMER_START(1, &bt);
-		error = vfs_write_suspend(mp);
+		error = vfs_write_suspend(mp, VS_SKIP_UNMOUNT);
 		GJ_TIMER_STOP(1, &bt, "Suspend time of %s", mountpoint);
 		if (error != 0) {
 			GJ_DEBUG(0, "Cannot suspend file system %s (error=%d).",
diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c
index 7eac0ef..06e59f9 100644
--- a/sys/kern/vfs_vnops.c
+++ b/sys/kern/vfs_vnops.c
@@ -1668,8 +1668,7 @@ vn_finished_secondary_write(mp)
  * Request a filesystem to suspend write operations.
  */
 int
-vfs_write_suspend(mp)
-	struct mount *mp;
+vfs_write_suspend(struct mount *mp, int flags)
 {
 	int error;
 
@@ -1680,6 +1679,21 @@ vfs_write_suspend(mp)
 	}
 	while (mp->mnt_kern_flag & MNTK_SUSPEND)
 		msleep(&mp->mnt_flag, MNT_MTX(mp), PUSER - 1, "wsuspfs", 0);
+
+	/*
+	 * Unmount holds a write reference on the mount point.  If we
+	 * own busy reference and drain for writers, we deadlock with
+	 * the reference draining in the unmount path.  Callers of
+	 * vfs_write_suspend() must specify VS_SKIP_UNMOUNT if
+	 * vfs_busy() reference is owned and caller is not in the
+	 * unmount context.
+	 */
+	if ((flags & VS_SKIP_UNMOUNT) != 0 &&
+	    (mp->mnt_kern_flag & MNTK_UNMOUNT) != 0) {
+		MNT_IUNLOCK(mp);
+		return (EBUSY);
+	}
+
 	mp->mnt_kern_flag |= MNTK_SUSPEND;
 	mp->mnt_susp_owner = curthread;
 	if (mp->mnt_writeopcount > 0)
diff --git a/sys/sys/vnode.h b/sys/sys/vnode.h
index 42bfb65..b0cbcc0 100644
--- a/sys/sys/vnode.h
+++ b/sys/sys/vnode.h
@@ -398,6 +398,9 @@ extern int		vttoif_tab[];
 #define	VR_START_WRITE	0x0001	/* vfs_write_resume: start write atomically */
 #define	VR_NO_SUSPCLR	0x0002	/* vfs_write_resume: do not clear suspension */
 
+#define	VS_SKIP_UNMOUNT	0x0001	/* vfs_write_suspend: fail if the
+				   filesystem is being unmounted */
+
 #define	VREF(vp)	vref(vp)
 
 #ifdef DIAGNOSTIC
@@ -711,7 +714,7 @@ int vn_io_fault_pgmove(vm_page_t ma[], vm_offset_t offset, int xfersize,
 int	vfs_cache_lookup(struct vop_lookup_args *ap);
 void	vfs_timestamp(struct timespec *);
 void	vfs_write_resume(struct mount *mp, int flags);
-int	vfs_write_suspend(struct mount *mp);
+int	vfs_write_suspend(struct mount *mp, int flags);
 int	vop_stdbmap(struct vop_bmap_args *);
 int	vop_stdfsync(struct vop_fsync_args *);
 int	vop_stdgetwritemount(struct vop_getwritemount_args *);
diff --git a/sys/ufs/ffs/ffs_snapshot.c b/sys/ufs/ffs/ffs_snapshot.c
index 9a9c88a..ad157aa 100644
--- a/sys/ufs/ffs/ffs_snapshot.c
+++ b/sys/ufs/ffs/ffs_snapshot.c
@@ -423,7 +423,7 @@ restart:
 	 */
 	for (;;) {
 		vn_finished_write(wrtmp);
-		if ((error = vfs_write_suspend(vp->v_mount)) != 0) {
+		if ((error = vfs_write_suspend(vp->v_mount, 0)) != 0) {
 			vn_start_write(NULL, &wrtmp, V_WAIT);
 			vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
 			goto out;
diff --git a/sys/ufs/ffs/ffs_suspend.c b/sys/ufs/ffs/ffs_suspend.c
index 3198c1a..a8c4578 100644
--- a/sys/ufs/ffs/ffs_suspend.c
+++ b/sys/ufs/ffs/ffs_suspend.c
@@ -206,7 +206,7 @@ ffs_susp_suspend(struct mount *mp)
 		return (EPERM);
 #endif
 
-	if ((error = vfs_write_suspend(mp)) != 0)
+	if ((error = vfs_write_suspend(mp, VS_SKIP_UNMOUNT)) != 0)