Re: Puzzling performance

2010-08-03 Thread John Baldwin
On Monday, August 02, 2010 6:13:50 pm Guy Helmer wrote:
 On a FreeBSD 7.1 SCHED_ULE kernel, I have a large number of files opened and
 mmapped (with MAP_NOSYNC option) for shared-memory communication between
 processes.  Normally, memcpy() copies data into these shared-memory buffers
 in a reasonable amount of time closely related to the size of the copy
 (roughly 10us per 10KB).  However, due to performance issues I've found that
 sometimes a memcpy() takes an abnormally long time (10ms for 40KB, and I
 suspect longer times occurring when I have not had monitoring enabled).  The
 system doesn't seem to be in memory overcommit -- there is just a minor
 amount of swap in use, and I've not seen page-ins or page-outs while
 watching systat or vmstat.
 
 Since I'm using MAP_NOSYNC, I would not expect the pager to flush dirty
 pages to disk and cause added delays.  Any ideas where to look?  Might it help
 to pin threads to CPUs in case a thread is getting moved to a different
 core?

Pinning might help yes.  You might also want to ensure there aren't any 
interrupts on that CPU.  Currently there isn't a good way to figure that out 
short of kgdb though. :(
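
One way to catch the slow copies without leaving full monitoring enabled is
to time each memcpy() and count the outliers.  A minimal sketch, assuming
POSIX clock_gettime(); the helper names (now_ns, timed_memcpy, count_slow)
and any threshold you pass in are made up for illustration and are not taken
from the original application:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Current time in nanoseconds from a monotonic clock. */
static uint64_t
now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ((uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec);
}

/* Copy len bytes and return how long the copy took, in nanoseconds. */
static uint64_t
timed_memcpy(void *dst, const void *src, size_t len)
{
	uint64_t start = now_ns();

	memcpy(dst, src, len);
	return (now_ns() - start);
}

/* Count the samples that took longer than limit_ns. */
static size_t
count_slow(const uint64_t *samples, size_t n, uint64_t limit_ns)
{
	size_t slow = 0;

	for (size_t i = 0; i < n; i++)
		if (samples[i] > limit_ns)
			slow++;
	return (slow);
}
```

Recording the CPU the thread ran on next to each slow sample would show
whether the outliers line up with a migration.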

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: PCI config space is not restored upon resume (macbook pro)

2010-08-03 Thread John Baldwin
On Tuesday, August 03, 2010 6:49:07 am Oleg Sharoyko wrote:
 Hi!
 
 I'm trying to make FreeBSD (9-Current, checkout on 2010-08-01) correctly
 suspend/resume on macbook pro. As of now I have two issues with resume:
 
 1. Display stays blank upon resume. Got 'vga0: failed to reload state'
  in dmesg, but I haven't looked into this  yet.
 
 2. Some hardware is missing upon resume, specifically ath, msk and firewire.
 These devices disappear because rather strange values are being
 read from pci config space (such as vendor id, device id and others).

I wonder if the bus numbers for PCI-PCI bridges need to be restored on resume?  
If they aren't then config transactions won't be routed properly.  You could 
add a pcib_resume() method that prints out the various bus register values 
after resume to see if they match what we print out during boot.
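
For reference, a PCI-PCI bridge keeps its primary, secondary and subordinate
bus numbers in the low three bytes of the config dword at offset 0x18.  A
minimal user-space sketch of the comparison, assuming you have captured that
32-bit value at boot and again after resume; the struct and function names
are hypothetical and not part of the pcib driver:

```c
#include <assert.h>
#include <stdint.h>

/* Bus numbers held in a PCI-PCI bridge's config dword at offset 0x18. */
struct bridge_buses {
	uint8_t primary;	/* byte 0x18 */
	uint8_t secondary;	/* byte 0x19 */
	uint8_t subordinate;	/* byte 0x1a */
};

/* Split the dword read from offset 0x18 into its bus-number fields. */
static struct bridge_buses
decode_bridge_buses(uint32_t reg18)
{
	struct bridge_buses b;

	b.primary = reg18 & 0xff;
	b.secondary = (reg18 >> 8) & 0xff;
	b.subordinate = (reg18 >> 16) & 0xff;
	return (b);
}

/*
 * Return 1 if the bus numbers survived suspend/resume.  The top byte
 * (secondary latency timer) is deliberately ignored.
 */
static int
bridge_buses_match(uint32_t at_boot, uint32_t after_resume)
{
	struct bridge_buses a = decode_bridge_buses(at_boot);
	struct bridge_buses b = decode_bridge_buses(after_resume);

	return (a.primary == b.primary && a.secondary == b.secondary &&
	    a.subordinate == b.subordinate);
}
```

If the values read back as zero after resume, config cycles for anything
behind that bridge have nowhere to go.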

-- 
John Baldwin


Re: sched_pin() versus PCPU_GET

2010-08-04 Thread John Baldwin
On Tuesday, August 03, 2010 9:46:16 pm m...@freebsd.org wrote:
 On Fri, Jul 30, 2010 at 2:31 PM, John Baldwin j...@freebsd.org wrote:
  On Friday, July 30, 2010 10:08:22 am John Baldwin wrote:
  On Thursday, July 29, 2010 7:39:02 pm m...@freebsd.org wrote:
   We've seen a few instances at work where witness_warn() in ast()
   indicates the sched lock is still held, but the place it claims it was
   held by is in fact sometimes not possible to keep the lock, like:
  
   thread_lock(td);
   td->td_flags &= ~TDF_SELECT;
   thread_unlock(td);
  
   What I was wondering is, even though the assembly I see in objdump -S
   for witness_warn has the increment of td_pinned before the PCPU_GET:
  
   802db210:   65 48 8b 1c 25 00 00    mov    %gs:0x0,%rbx
   802db217:   00 00
   802db219:   ff 83 04 01 00 00       incl   0x104(%rbx)
            * Pin the thread in order to avoid problems with thread migration.
            * Once that all verifies are passed about spinlocks ownership,
            * the thread is in a safe path and it can be unpinned.
            */
           sched_pin();
           lock_list = PCPU_GET(spinlocks);
   802db21f:   65 48 8b 04 25 48 00    mov    %gs:0x48,%rax
   802db226:   00 00
           if (lock_list != NULL && lock_list->ll_count != 0) {
   802db228:   48 85 c0                test   %rax,%rax
            * Pin the thread in order to avoid problems with thread migration.
            * Once that all verifies are passed about spinlocks ownership,
            * the thread is in a safe path and it can be unpinned.
            */
           sched_pin();
           lock_list = PCPU_GET(spinlocks);
   802db22b:   48 89 85 f0 fe ff ff    mov    %rax,-0x110(%rbp)
   802db232:   48 89 85 f8 fe ff ff    mov    %rax,-0x108(%rbp)
           if (lock_list != NULL && lock_list->ll_count != 0) {
   802db239:   0f 84 ff 00 00 00       je     802db33e <witness_warn+0x30e>
   802db23f:   44 8b 60 50             mov    0x50(%rax),%r12d
  
   is it possible for the hardware to do any re-ordering here?
  
   The reason I'm suspicious is not just that the code doesn't have a
   lock leak at the indicated point, but in one instance I can see in the
   dump that the lock_list local from witness_warn is from the pcpu
   structure for CPU 0 (and I was warned about sched lock 0), but the
   thread id in panic_cpu is 2.  So clearly the thread was being migrated
   right around panic time.
  
   This is the amd64 kernel on stable/7.  I'm not sure exactly what kind
   of hardware; it's a 4-way Intel chip from about 3 or 4 years ago IIRC.
  
   So... do we need some kind of barrier in the code for sched_pin() for
   it to really do what it claims?  Could the hardware have re-ordered
   the mov %gs:0x48,%rax PCPU_GET to before the sched_pin()
   increment?
 
  Hmmm, I think it might be able to because they refer to different 
  locations.
 
  Note this rule in section 8.2.2 of Volume 3A:
 
• Reads may be reordered with older writes to different locations but not
  with older writes to the same location.
 
  It is certainly true that sparc64 could reorder with RMO.  I believe ia64
  could reorder as well.  Since sched_pin/unpin are frequently used to provide
  this sort of synchronization, we could use memory barriers in pin/unpin
  like so:
 
  sched_pin()
  {
          td->td_pinned = atomic_load_acq_int(&td->td_pinned) + 1;
  }
 
  sched_unpin()
  {
          atomic_store_rel_int(&td->td_pinned, td->td_pinned - 1);
  }
 
  We could also just use atomic_add_acq_int() and atomic_sub_rel_int(), but
  they are slightly more heavyweight, though it would be more clear what is
  happening I think.
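 
  For comparison, the atomic_add_acq_int() / atomic_sub_rel_int() variant can
  be approximated outside the kernel with C11 atomics.  This is only a
  user-space analogue under stated assumptions: struct fake_thread and the
  fake_* functions are invented for illustration, they are not the kernel's
  sched_pin()/sched_unpin(), and the sketch does not settle whether a barrier
  is actually needed here:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the td_pinned field of struct thread. */
struct fake_thread {
	atomic_int pinned;
};

/* Increment the pin count with acquire semantics. */
static void
fake_pin(struct fake_thread *td)
{
	atomic_fetch_add_explicit(&td->pinned, 1, memory_order_acquire);
}

/* Decrement the pin count with release semantics. */
static void
fake_unpin(struct fake_thread *td)
{
	atomic_fetch_sub_explicit(&td->pinned, 1, memory_order_release);
}

/* A scheduler-like check: non-zero means "do not migrate". */
static int
fake_is_pinned(struct fake_thread *td)
{
	return (atomic_load_explicit(&td->pinned, memory_order_relaxed) != 0);
}
```

  The count nests, so a thread stays pinned until every fake_pin() has been
  matched by a fake_unpin().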
 
  However, to actually get a race you'd have to have an interrupt fire and
  migrate you so that the speculative read was from the other CPU.  However, I
  don't think the speculative read would be preserved in that case.  The CPU
  has to return to a specific PC when it returns from the interrupt and it has
  no way of storing the state for what speculative reordering it might be
  doing, so presumably it is thrown away?  I suppose it is possible that it
  actually retires both instructions (but reordered) and then returns to the
  PC value after the read of listlocks after the interrupt.  However, in that
  case the scheduler would not migrate as it would see td_pinned != 0.  To get
  the race you have to have the interrupt take effect prior to modifying
  td_pinned, so I think the processor would have to discard the reordered read
  of listlocks so it could safely resume execution at the 'incl' instruction.
 
  The other nit there on x86 at least is that the incl instruction is doing
  both a read and a write and another rule in the section 8.2.2 is this:
 
   • Reads are not reordered with other reads.
 
  That would seem to prevent the read of listlocks from passing the read of
  td_pinned in the incl instruction on x86.
 
 I wonder how that's interpreted

Re: PCI config space is not restored upon resume (macbook pro)

2010-08-04 Thread John Baldwin
 acpi_pcib_acpi_attach(device_t bus);
-static int acpi_pcib_acpi_resume(device_t bus);
 static int acpi_pcib_read_ivar(device_t dev, device_t child,
int which, uintptr_t *result);
 static int acpi_pcib_write_ivar(device_t dev, device_t child,
@@ -94,7 +93,7 @@
 DEVMETHOD(device_attach,   acpi_pcib_acpi_attach),
 DEVMETHOD(device_shutdown, bus_generic_shutdown),
 DEVMETHOD(device_suspend,  bus_generic_suspend),
-DEVMETHOD(device_resume,   acpi_pcib_acpi_resume),
+DEVMETHOD(device_resume,   bus_generic_resume),
 
 /* Bus interface */
 DEVMETHOD(bus_print_child, bus_generic_print_child),
@@ -257,13 +257,6 @@
 	return (acpi_pcib_attach(dev, &sc->ap_prt, sc->ap_bus));
 }
 
-static int
-acpi_pcib_acpi_resume(device_t dev)
-{
-
-return (acpi_pcib_resume(dev));
-}
-
 /*
  * Support for standard PCI bridge ivars.
  */
Index: dev/acpica/acpi_pcibvar.h
===
--- dev/acpica/acpi_pcibvar.h   (revision 210796)
+++ dev/acpica/acpi_pcibvar.h   (working copy)
@@ -31,13 +31,14 @@
 #define	_ACPI_PCIBVAR_H_
 
 #ifdef _KERNEL
+
 void	acpi_pci_link_add_reference(device_t dev, int index, device_t pcib,
 	    int slot, int pin);
 int	acpi_pci_link_route_interrupt(device_t dev, int index);
 int	acpi_pcib_attach(device_t bus, ACPI_BUFFER *prt, int busno);
 int	acpi_pcib_route_interrupt(device_t pcib, device_t dev, int pin,
 	    ACPI_BUFFER *prtbuf);
-int	acpi_pcib_resume(device_t dev);
+
 #endif /* _KERNEL */
 
 #endif /* !_ACPI_PCIBVAR_H_ */
Index: dev/pci/pcib_private.h
===
--- dev/pci/pcib_private.h  (revision 210796)
+++ dev/pci/pcib_private.h  (working copy)
@@ -37,6 +37,7 @@
  * Export portions of generic PCI:PCI bridge support so that it can be
  * used by subclasses.
  */
+DECLARE_CLASS(pcib_driver);
 
 /*
  * Bridge-specific data.

-- 
John Baldwin


Re: sched_pin() versus PCPU_GET

2010-08-04 Thread John Baldwin
On Wednesday, August 04, 2010 12:20:31 pm m...@freebsd.org wrote:
 On Wed, Aug 4, 2010 at 2:26 PM, John Baldwin j...@freebsd.org wrote:
  On Tuesday, August 03, 2010 9:46:16 pm m...@freebsd.org wrote:
  On Fri, Jul 30, 2010 at 2:31 PM, John Baldwin j...@freebsd.org wrote:
   On Friday, July 30, 2010 10:08:22 am John Baldwin wrote:
   On Thursday, July 29, 2010 7:39:02 pm m...@freebsd.org wrote:
We've seen a few instances at work where witness_warn() in ast()
indicates the sched lock is still held, but the place it claims it was
held by is in fact sometimes not possible to keep the lock, like:
   
thread_lock(td);
td->td_flags &= ~TDF_SELECT;
thread_unlock(td);
   
What I was wondering is, even though the assembly I see in objdump -S
for witness_warn has the increment of td_pinned before the PCPU_GET:
   
802db210:   65 48 8b 1c 25 00 00    mov    %gs:0x0,%rbx
802db217:   00 00
802db219:   ff 83 04 01 00 00       incl   0x104(%rbx)
         * Pin the thread in order to avoid problems with thread migration.
         * Once that all verifies are passed about spinlocks ownership,
         * the thread is in a safe path and it can be unpinned.
         */
        sched_pin();
        lock_list = PCPU_GET(spinlocks);
802db21f:   65 48 8b 04 25 48 00    mov    %gs:0x48,%rax
802db226:   00 00
        if (lock_list != NULL && lock_list->ll_count != 0) {
802db228:   48 85 c0                test   %rax,%rax
         * Pin the thread in order to avoid problems with thread migration.
         * Once that all verifies are passed about spinlocks ownership,
         * the thread is in a safe path and it can be unpinned.
         */
        sched_pin();
        lock_list = PCPU_GET(spinlocks);
802db22b:   48 89 85 f0 fe ff ff    mov    %rax,-0x110(%rbp)
802db232:   48 89 85 f8 fe ff ff    mov    %rax,-0x108(%rbp)
        if (lock_list != NULL && lock_list->ll_count != 0) {
802db239:   0f 84 ff 00 00 00       je     802db33e <witness_warn+0x30e>
802db23f:   44 8b 60 50             mov    0x50(%rax),%r12d
   
is it possible for the hardware to do any re-ordering here?
   
The reason I'm suspicious is not just that the code doesn't have a
lock leak at the indicated point, but in one instance I can see in the
dump that the lock_list local from witness_warn is from the pcpu
structure for CPU 0 (and I was warned about sched lock 0), but the
thread id in panic_cpu is 2.  So clearly the thread was being migrated
right around panic time.
   
This is the amd64 kernel on stable/7.  I'm not sure exactly what kind
of hardware; it's a 4-way Intel chip from about 3 or 4 years ago IIRC.
   
So... do we need some kind of barrier in the code for sched_pin() for
it to really do what it claims?  Could the hardware have re-ordered
the mov %gs:0x48,%rax PCPU_GET to before the sched_pin()
increment?
  
   Hmmm, I think it might be able to because they refer to different 
   locations.
  
   Note this rule in section 8.2.2 of Volume 3A:
  
     • Reads may be reordered with older writes to different locations but
       not with older writes to the same location.
  
   It is certainly true that sparc64 could reorder with RMO.  I believe ia64
   could reorder as well.  Since sched_pin/unpin are frequently used to
   provide this sort of synchronization, we could use memory barriers in
   pin/unpin like so:
  
   sched_pin()
   {
           td->td_pinned = atomic_load_acq_int(&td->td_pinned) + 1;
   }
  
   sched_unpin()
   {
           atomic_store_rel_int(&td->td_pinned, td->td_pinned - 1);
   }
  
   We could also just use atomic_add_acq_int() and atomic_sub_rel_int(), but
   they are slightly more heavyweight, though it would be more clear what is
   happening I think.
  
   However, to actually get a race you'd have to have an interrupt fire and
   migrate you so that the speculative read was from the other CPU.  However,
   I don't think the speculative read would be preserved in that case.  The
   CPU has to return to a specific PC when it returns from the interrupt and
   it has no way of storing the state for what speculative reordering it
   might be doing, so presumably it is thrown away?  I suppose it is possible
   that it actually retires both instructions (but reordered) and then
   returns to the PC value after the read of listlocks after the interrupt.
   However, in that case the scheduler would not migrate as it would see
   td_pinned != 0.  To get the race you have to have the interrupt take
   effect prior to modifying td_pinned, so I think the processor would have
   to discard the reordered read of listlocks so it could safely resume
   execution at the 'incl' instruction.
  
   The other nit there on x86 at least is that the incl instruction is doing
   both

Re: Not getting interrupts from PCI express slot

2010-08-04 Thread John Baldwin
On Wednesday, August 04, 2010 1:18:53 pm Hans Petter Selasky wrote:
 Hi,
 
 I'm not getting any interrupts from a PCI express slot. When I insert a
 device, no attach event is generated. If the device is present during boot
 the device is fully detected, but still no IRQs. Is there anything I can do
 or test?
 
 I'm running 8-stable on amd64.

In general FreeBSD doesn't support hotplug PCI currently.  Likely you'd need
some sort of hotplug bridge driver similar to cbb(4) for Cardbus slots that 
would catch whatever interrupt is generated when a card is inserted and add 
the device, etc.

-- 
John Baldwin


Re: PCI config space is not restored upon resume (macbook pro)

2010-08-05 Thread John Baldwin
On Thursday, August 05, 2010 11:30:23 am Oleg Sharoyko wrote:
 On 4 August 2010 19:12, John Baldwin j...@freebsd.org wrote:
 
  Cool, I actually think that the ACPI PCI-PCI driver can just use the
  stock PCI-PCI bridge driver's suspend and resume methods.  Can you try
  out this alternate patch instead?
 
 It works, and sure looks better than mine. I didn't know there's such a nice
 way to inherit methods.
 
  This sounds like the display just needs to be powered on via DPMS.
  You might be able to make this work via acpi_video and toggling the
  LCD status that way.  You could also try dpms.ko.
 
 I'm afraid things are not that simple. I have tried without success
 acpi_video.ko, dpms.ko, sysctl hw.acpi.reset_video and sysutils/vbetool.
 And what worries me, the X server cannot start on the resumed system.
 From Xorg.log:
 
 (EE) NV(0): Failed to determine the amount of available video memory
 
 It looks like the video card just ignores any requests.

Are you using the nvidia-driver or the nv driver from X?

-- 
John Baldwin


Re: sched_pin() versus PCPU_GET

2010-08-05 Thread John Baldwin
On Thursday, August 05, 2010 12:01:22 pm m...@freebsd.org wrote:
 On Wed, Aug 4, 2010 at 9:20 AM,  m...@freebsd.org wrote:
  On Wed, Aug 4, 2010 at 2:26 PM, John Baldwin j...@freebsd.org wrote:
  Actually, I would beg to differ in that case.  If PCPU_GET(spinlocks)
  returns non-NULL, then it means that you hold a spin lock,
 
  ll_count is 0 for the correct pc_spinlocks and non-zero for the
  wrong one, though.  So I think it can be non-NULL but the current
  thread/CPU doesn't hold a spinlock.
 
  I don't believe we have any code in the NMI handler.  I'm on vacation
  today so I'll check tomorrow.
 
 I checked and ipi_nmi_handler() doesn't appear to have any local
 changes.  I assume that's where I should look?

The tricky bits are all in the assembly rather than in C, probably in 
exception.S.  However, if %gs were corrupt I would not expect it to point to 
another CPU's data, but garbage from userland.

-- 
John Baldwin


Re: sched_pin() versus PCPU_GET

2010-08-05 Thread John Baldwin
On Thursday, August 05, 2010 11:59:37 am m...@freebsd.org wrote:
 On Wed, Aug 4, 2010 at 11:55 AM, John Baldwin j...@freebsd.org wrote:
  On Wednesday, August 04, 2010 12:20:31 pm m...@freebsd.org wrote:
  On Wed, Aug 4, 2010 at 2:26 PM, John Baldwin j...@freebsd.org wrote:
   On Tuesday, August 03, 2010 9:46:16 pm m...@freebsd.org wrote:
   On Fri, Jul 30, 2010 at 2:31 PM, John Baldwin j...@freebsd.org wrote:
On Friday, July 30, 2010 10:08:22 am John Baldwin wrote:
On Thursday, July 29, 2010 7:39:02 pm m...@freebsd.org wrote:
 We've seen a few instances at work where witness_warn() in ast()
 indicates the sched lock is still held, but the place it claims it was
 held by is in fact sometimes not possible to keep the lock, like:

 thread_lock(td);
 td->td_flags &= ~TDF_SELECT;
 thread_unlock(td);

 What I was wondering is, even though the assembly I see in objdump -S
 for witness_warn has the increment of td_pinned before the PCPU_GET:

 802db210:   65 48 8b 1c 25 00 00    mov    %gs:0x0,%rbx
 802db217:   00 00
 802db219:   ff 83 04 01 00 00       incl   0x104(%rbx)
          * Pin the thread in order to avoid problems with thread migration.
          * Once that all verifies are passed about spinlocks ownership,
          * the thread is in a safe path and it can be unpinned.
          */
         sched_pin();
         lock_list = PCPU_GET(spinlocks);
 802db21f:   65 48 8b 04 25 48 00    mov    %gs:0x48,%rax
 802db226:   00 00
         if (lock_list != NULL && lock_list->ll_count != 0) {
 802db228:   48 85 c0                test   %rax,%rax
          * Pin the thread in order to avoid problems with thread migration.
          * Once that all verifies are passed about spinlocks ownership,
          * the thread is in a safe path and it can be unpinned.
          */
         sched_pin();
         lock_list = PCPU_GET(spinlocks);
 802db22b:   48 89 85 f0 fe ff ff    mov    %rax,-0x110(%rbp)
 802db232:   48 89 85 f8 fe ff ff    mov    %rax,-0x108(%rbp)
         if (lock_list != NULL && lock_list->ll_count != 0) {
 802db239:   0f 84 ff 00 00 00       je     802db33e <witness_warn+0x30e>
 802db23f:   44 8b 60 50             mov    0x50(%rax),%r12d

 is it possible for the hardware to do any re-ordering here?

 The reason I'm suspicious is not just that the code doesn't have a
 lock leak at the indicated point, but in one instance I can see in the
 dump that the lock_list local from witness_warn is from the pcpu
 structure for CPU 0 (and I was warned about sched lock 0), but the
 thread id in panic_cpu is 2.  So clearly the thread was being migrated
 right around panic time.

 This is the amd64 kernel on stable/7.  I'm not sure exactly what kind
 of hardware; it's a 4-way Intel chip from about 3 or 4 years ago IIRC.

 So... do we need some kind of barrier in the code for sched_pin() for
 it to really do what it claims?  Could the hardware have re-ordered
 the mov %gs:0x48,%rax PCPU_GET to before the sched_pin()
 increment?
   
Hmmm, I think it might be able to because they refer to different 
locations.
   
Note this rule in section 8.2.2 of Volume 3A:
   
  • Reads may be reordered with older writes to different locations but not
    with older writes to the same location.

It is certainly true that sparc64 could reorder with RMO.  I believe ia64
could reorder as well.  Since sched_pin/unpin are frequently used to provide
this sort of synchronization, we could use memory barriers in pin/unpin
like so:

sched_pin()
{
        td->td_pinned = atomic_load_acq_int(&td->td_pinned) + 1;
}

sched_unpin()
{
        atomic_store_rel_int(&td->td_pinned, td->td_pinned - 1);
}

We could also just use atomic_add_acq_int() and atomic_sub_rel_int(), but
they are slightly more heavyweight, though it would be more clear what is
happening I think.

However, to actually get a race you'd have to have an interrupt fire and
migrate you so that the speculative read was from the other CPU.  However, I
don't think the speculative read would be preserved in that case.  The CPU
has to return to a specific PC when it returns from the interrupt and it has
no way of storing the state for what speculative reordering it might be
doing, so presumably it is thrown away?  I suppose it is possible that it
actually retires both instructions (but reordered) and then returns to the
PC value after the read of listlocks after the interrupt.  However, in that
case the scheduler would not migrate as it would see td_pinned != 0.  To get
the race you have to have the interrupt take effect prior to modifying
td_pinned, so I think the processor

Re: 8.1-STABLE amd64 machine check

2010-08-11 Thread John Baldwin

Dan Langille wrote:
I am encountering a situation similar to one reported by Andrew Heybey 
at http://docs.freebsd.org/cgi/mid.cgi?6E83197B-9DD5-4C7E-846D-AD176C25464D


This morning I found this in my /var/log/messages:

Aug 11 01:59:48 kraken kernel: MCA: Bank 4, Status 0x94614c62001c011b
Aug 11 01:59:48 kraken kernel: MCA: Global Cap 0x0106, Status 0x
Aug 11 01:59:48 kraken kernel: MCA: Vendor AuthenticAMD, ID 0x100f42, APIC ID 0
Aug 11 01:59:48 kraken kernel: MCA: CPU 0 COR GCACHE LG RD error
Aug 11 01:59:48 kraken kernel: MCA: Address 0x5d0fe8c


from /var/run/dmesg.boot

Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.1-STABLE #0: Sun Jul 25 19:18:56 EDT 2010
d...@kraken.example.org:/usr/obj/usr/src/sys/KRAKEN amd64
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: AMD Phenom(tm) II X4 945 Processor (3010.17-MHz K8-class CPU)
  Origin = AuthenticAMD  Id = 0x100f42  Family = 10  Model = 4  Stepping = 2
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x802009<SSE3,MON,CX16,POPCNT>
  AMD Features=0xee500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
  AMD Features2=0x37ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT>
  TSC: P-state invariant
real memory  = 4294967296 (4096 MB)
avail memory = 4100710400 (3910 MB)
ACPI APIC Table: <111909 APIC1708>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  2
 cpu3 (AP): APIC ID:  3


Andrew: You posted about this on July 14.  Anything new since then?

John: Is it time for me to get a new CPU?


Hmm, this is what mcelog says:

HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 4 northbridge
ADDR 5d0fe8c
  Northbridge NB Array Error
   bit33 = err cpu1
   bit42 = L3 subcache in error bit 0
   bit43 = L3 subcache in error bit 1
   bit46 = corrected ecc error
  memory/cache error 'generic read mem transaction, generic transaction, level generic'

STATUS 94614c62001c011b MCGSTATUS 0
MCGCAP 106 APICID 0 SOCKETID 0
CPUID Vendor AMD Family 16 Model 4

It was a corrected ECC error.  If you get more than one then perhaps the 
CPU is busted, but if you only get one, an isolated bit flip may not be 
worth worrying about.
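
The same conclusion can be read straight off the status word: bit 63 says
the entry is valid, bit 61 is clear for a corrected error, bit 58 says the
address register is valid, and the low 16 bits hold the MCA error code.  A
minimal sketch of that check; the macro and function names below are
illustrative and are not claimed to match any particular header:

```c
#include <assert.h>
#include <stdint.h>

/* Architectural bits in a machine-check bank status register. */
#define	EX_MC_STATUS_VAL	(1ULL << 63)	/* entry is valid */
#define	EX_MC_STATUS_OVER	(1ULL << 62)	/* an earlier error was overwritten */
#define	EX_MC_STATUS_UC		(1ULL << 61)	/* error was NOT corrected */
#define	EX_MC_STATUS_ADDRV	(1ULL << 58)	/* address register is valid */

/* Non-zero if the bank logged a valid, corrected error. */
static int
ex_mc_is_corrected(uint64_t status)
{
	return ((status & EX_MC_STATUS_VAL) != 0 &&
	    (status & EX_MC_STATUS_UC) == 0);
}

/* Non-zero if the accompanying address can be trusted. */
static int
ex_mc_addr_valid(uint64_t status)
{
	return ((status & EX_MC_STATUS_ADDRV) != 0);
}

/* Low 16 bits: the MCA error code. */
static unsigned int
ex_mc_error_code(uint64_t status)
{
	return ((unsigned int)(status & 0xffff));
}
```

For the status above, 0x94614c62001c011b, the top byte is 0x94, so the valid
and address-valid bits are set and the uncorrected bit is clear, which
matches the "COR" in the kernel message.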


--
John Baldwin


Re: real memory falsely reports 8G, BIOS avail memory reports 1G

2010-08-16 Thread John Baldwin
On Monday, August 09, 2010 8:13:03 am Julian H. Stacey wrote:
 Hi hack...@freebsd.org
 A laptop here emits a puzzling dmesg with both 8.1-RC2 & 8.1-RELEASE:
   real memory  = 8572108800 (8175 MB)
   avail memory = 1018789888 (971 MB)
 BIOS reckons it has 1G. No panel to unscrew to inspect memory. 
 I don't believe 8G.
 
 Is this a bug in FreeBSD detect code?
 I am ready to run test kernel patches against 8.1-RELEASE
 & report back if anyone has code.
 (I have room to install a current too if necessary)
 
 Full dmesg here:
 http://www.berklix.com/~jhs/hardware/laptops/novatech-8355/dmesg/
 
 Cheers,
 Julian

Hmm, do you have dmidecode output?

-- 
John Baldwin


Re: Why doesn't ppc(4) check non-ENXIO failures during probe?

2010-08-16 Thread John Baldwin
On Sunday, August 15, 2010 1:33:38 am Garrett Cooper wrote:
 One thing that's puzzling me about the ppc(4) driver's ISA
 routines is that it only checks to see whether or not the device has
 an IO error:

Your patch would break hinted ppc devices.  ENXIO means that the device_t 
being probed has an ISA PNP ID, but it does not match any of the IDs in the 
list.  ENOENT means that the device_t does not have an ISA ID at all.  For the 
isa bus that means it was explicitly created via a set of ppc.X hints.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Why doesn't ppc(4) check non-ENXIO failures during probe?

2010-08-17 Thread John Baldwin
On Monday, August 16, 2010 7:23:54 pm Garrett Cooper wrote:
 On Mon, Aug 16, 2010 at 1:19 PM, John Baldwin j...@freebsd.org wrote:
  On Sunday, August 15, 2010 1:33:38 am Garrett Cooper wrote:
  One thing that's puzzling me about the ppc(4) driver's ISA
  routines is that it only checks to see whether or not the device has
  an IO error:
 
  Your patch would break hinted ppc devices.  ENXIO means that the device_t
  being probed has an ISA PNP ID, but it does not match any of the IDs in the
  list.  ENOENT means that the device_t does not have an ISA ID at all.  For 
  the
  isa bus that means it was explicitly created via a set of ppc.X hints.
 
 Just clarifying some things because I don't know all of the details.
 
 If an ISA-based parallel port fails to probe with ENOENT, then it's
 assumed that the configuration details are incorrect, and it should
 reprobe the device with different configuration settings (irq, isa
 port, etc) a max of BIOS_MAX_PPC times before it finally bails failing
 to configure a device (ppc_probe in ppc.c)? What if all of the ISA
 details in the device.hints file are bogus and the only detail that's
 correct is in the puc driver, etc? Would it fail to connect the card
 if it reached the BIOS_MAX_PPC ISA-related failure limit (see
 ppc_probe again)?

ISA_PNP_PROBE() does not talk to the hardware, it just compares device IDs.
You have to realize that device_t objects on an ISA bus come from three
sources:

 1) Builtin devices are auto-enumerated via ACPI or PnP BIOS.  Any
 modern BIOS will do this for things like built in serial ports, ISA
 timers, PS/2 keyboard, etc.

 2) ISA PnP adapters in an ISA slot are enumerated via ISA PnP.

 3) Users indicate that specific ISA devices are present via hints.

Devices from 1) and 2) have an assigned device ID (HID) and zero-or-more
compatibility IDs (CID).  ISA_PNP_PROBE() accepts a list of HID IDs and
returns true (0) if the HID or any of the CIDs match any of the ids in
the list that is passed in.  If none of the IDs match it returns ENXIO.
Thus for devices from 1) and 2) ISA_PNP_PROBE() returns either 0 or
ENXIO.  For devices from 3), ISA_PNP_PROBE() will always return ENOENT.

Your change would break 3) since those devices would then never probe.
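
The three outcomes can be modelled in a few lines of ordinary C.  This is a
simulation under stated assumptions, not the real ISA_PNP_PROBE() or the ppc
driver: struct sim_dev, sim_pnp_probe() and sim_driver_probe() are invented
names, and the IDs are plain integers rather than encoded PnP IDs:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device: hid == 0 stands for "no PnP ID", i.e. a hinted device. */
struct sim_dev {
	uint32_t	 hid;	/* hardware ID, 0 if none */
	const uint32_t	*cids;	/* compatibility IDs, may be NULL */
	size_t		 ncids;
};

/*
 * Compare IDs only; never touches hardware.
 * 0 = an ID matched, ENXIO = has IDs but none matched, ENOENT = no IDs.
 */
static int
sim_pnp_probe(const struct sim_dev *dev, const uint32_t *ids, size_t nids)
{
	if (dev->hid == 0)
		return (ENOENT);
	for (size_t i = 0; i < nids; i++) {
		if (ids[i] == dev->hid)
			return (0);
		for (size_t j = 0; j < dev->ncids; j++)
			if (ids[i] == dev->cids[j])
				return (0);
	}
	return (ENXIO);
}

/*
 * Driver-side policy: only ENXIO rejects the device.  Both 0 and ENOENT
 * fall through to the point where a real driver would probe the hardware.
 */
static int
sim_driver_probe(const struct sim_dev *dev, const uint32_t *ids, size_t nids)
{
	int error = sim_pnp_probe(dev, ids, nids);

	if (error == ENXIO)
		return (ENXIO);
	return (0);
}
```

Rejecting every non-zero return instead of only ENXIO would turn the ENOENT
case into a failure, which is exactly the hinted-device breakage described
above.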

ppc_probe() is called to verify that the hardware truly exists at the
resources that are claimed.  In practice the loop you refer to never runs
now as the default hints for ppc always specify a port and ppc adapters
from 1) always include the port resource.  That loop should probably
belong in an identify routine instead of in the probe routine anyway.
It probably predates new-bus.

The waters are slightly muddied further by the fact that if the resources
specified in a hint match the resources from one of the devices found via
1) or 2), the device from 1) or 2) will actually subsume the hinted device
so you will not get a separate type 3) device.  For example, in the default
hints uart0 specifies an I/O port of 0x3f8.  If ACPI tells the OS about a
COM1 serial port with the default I/O port (0x3f8), then the hints cause
that device to be named uart0 and to use the flags from uart0 to enable
the serial console, etc.

-- 
John Baldwin


Re: real memory falsely reports 8G, BIOS avail memory reports 1G

2010-08-17 Thread John Baldwin
On Tuesday, August 17, 2010 7:49:22 am Julian H. Stacey wrote:
 John Baldwin wrote:
  On Monday, August 09, 2010 8:13:03 am Julian H. Stacey wrote:
   Hi hack...@freebsd.org
   A laptop here emits a puzzling dmesg with both 8.1-RC2 & 8.1-RELEASE:
     real memory  = 8572108800 (8175 MB)
     avail memory = 1018789888 (971 MB)
   BIOS reckons it has 1G. No panel to unscrew to inspect memory. 
   I don't believe 8G.
   
   Is this a bug in FreeBSD detect code?
   I am ready to run test kernel patches against 8.1-RELEASE
   & report back if anyone has code.
   (I have room to install a current too if necessary)
   
   Full dmesg here:
   http://www.berklix.com/~jhs/hardware/laptops/novatech-8355/dmesg/
   
   Cheers,
   Julian
  
  Hmm, do you have dmidecode output?
 
 Hi, Thanks for interest, Yes here

Yeah, I saw you post the details later in the thread and had forgotten to
delete my reply.  At one point the code to print out real memory was
changed to use the DMI/SMBIOS information as when it is correct it gives
a more accurate looking number (an even 8GB for example vs some number that
is slightly smaller than 8GB).  It looks like your DMI info is just very
wrong resulting in a bogus number in the printf.  However, it is only a
cosmetic failure, it doesn't affect how the kernel runs or uses memory.

-- 
John Baldwin


Re: Why doesn't ppc(4) check non-ENXIO failures during probe?

2010-08-17 Thread John Baldwin
On Tuesday, August 17, 2010 3:56:20 pm Garrett Cooper wrote:
 On Tue, Aug 17, 2010 at 6:07 AM, John Baldwin j...@freebsd.org wrote:
  On Monday, August 16, 2010 7:23:54 pm Garrett Cooper wrote:
  On Mon, Aug 16, 2010 at 1:19 PM, John Baldwin j...@freebsd.org wrote:
   On Sunday, August 15, 2010 1:33:38 am Garrett Cooper wrote:
   One thing that's puzzling me about the ppc(4) driver's ISA
   routines is that it only checks to see whether or not the device has
   an IO error:
  
   Your patch would break hinted ppc devices.  ENXIO means that the device_t
   being probed has an ISA PNP ID, but it does not match any of the IDs in
   the list.  ENOENT means that the device_t does not have an ISA ID at all.
   For the isa bus that means it was explicitly created via a set of ppc.X
   hints.
 
  Just clarifying some things because I don't know all of the details.
 
  If an ISA-based parallel port fails to probe with ENOENT, then it's
  assumed that the configuration details are incorrect, and it should
  reprobe the device with different configuration settings (irq, isa
  port, etc) a max of BIOS_MAX_PPC times before it finally bails failing
  to configure a device (ppc_probe in ppc.c)? What if all of the ISA
  details in the device.hints file are bogus and the only detail that's
  correct is in the puc driver, etc? Would it fail to connect the card
  if it reached the BIOS_MAX_PPC ISA-related failure limit (see
  ppc_probe again)?
 
  ISA_PNP_PROBE() does not talk to the hardware, it just compares device IDs.
  You have to realize that device_t objects on an ISA device come from three
  sources:
 
   1) Builtin devices are auto-enumerated via ACPI or PnP BIOS.  Any
   modern BIOS will do this for things like built in serial ports, ISA
   timers, PS/2 keyboard, etc.
 
   2) ISA PnP adapters in an ISA slot are enumerated via ISA PnP.
 
   3) Users indicate that specific ISA devices are present via hints.
 
  Devices from 1) and 2) have an assigned device ID (HID) and zero-or-more
  compatibility IDs (CID).  ISA_PNP_PROBE() accepts a list of HID IDs and
  returns true (0) if the HID or any of the CIDs match any of the ids in
  the list that is passed in.  If none of the IDs match it returns ENXIO.
  Thus for devices from 1) and 2) ISA_PNP_PROBE() returns either 0 or
  ENXIO.  For devices from 3), ISA_PNP_PROBE() will always return ENOENT.
 
  Your change would break 3) since those devices would then never probe.
 
  ppc_probe() is called to verify that the hardware truly exists at the
  resources that are claimed.  In practice the loop you refer to never runs
  now as the default hints for ppc always specify a port and ppc adapters
  from 1) always include the port resource.  That loop should probably
  belong in an identify routine instead of in the probe routine anyway.
  It probably predates new-bus.
 
  The waters are slightly muddied further by the fact that if the resources
  specified in a hint match the resources from one of the devices found via
  1) or 2), the device from 1) or 2) will actually subsume the hinted device
  so you will not get a separate type 3) device.  For example, in the default
  hints uart0 specifies an I/O port of 0x3f8.  If ACPI tells the OS about a
  COM1 serial port with the default I/O port (0x3f8), then the hints cause
  that device to be named uart0 and to use the flags from uart0 to enable
  the serial console, etc.
 
 So more or less it's for BIOSes with ISA that doesn't feature plug
 and play (286s, 386s, some 486s?)? Just trying to fill in the gap :).

Yes, it may still be useful for some x86 embedded systems, though it is
doubtful that those would use a ppc(4) device.
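
To make the three return values discussed earlier in the thread concrete,
here is a small userland model.  It is only a sketch: struct toy_dev,
toy_isa_pnp_probe() and toy_ppc_probe() are hypothetical stand-ins, not the
real new-bus API, and the two PnP IDs are merely examples of parallel port
IDs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device_t on the isa bus. */
struct toy_dev {
	const char *pnp_id;	/* HID, or NULL for a device created from hints */
	const char *compat_id;	/* optional CID, may be NULL */
};

/*
 * Model of the contract described in the thread: 0 on a match, ENXIO
 * when the device has IDs but none match, ENOENT when it has no ID.
 */
static int
toy_isa_pnp_probe(const struct toy_dev *dev, const char *const *ids)
{
	if (dev->pnp_id == NULL)
		return (ENOENT);
	for (; *ids != NULL; ids++) {
		if (strcmp(dev->pnp_id, *ids) == 0)
			return (0);
		if (dev->compat_id != NULL &&
		    strcmp(dev->compat_id, *ids) == 0)
			return (0);
	}
	return (ENXIO);
}

/* Reject only ENXIO, so that hinted devices still reach the hardware probe. */
static int
toy_ppc_probe(const struct toy_dev *dev)
{
	static const char *const ppc_ids[] = { "PNP0400", "PNP0401", NULL };
	int error;

	error = toy_isa_pnp_probe(dev, ppc_ids);
	if (error == ENXIO)
		return (ENXIO);
	return (0);	/* placeholder for the real hardware probe */
}
```

Treating every non-zero return as a failure would make the hinted case
(ENOENT) bail out before the hardware probe, which is the breakage described
above.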

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Modules and Buses

2010-08-19 Thread John Baldwin
On Thursday, August 19, 2010 8:38:05 am Alexandr Rybalko wrote:
 Hi all,
 
 Can someone say how `make` in the sys/modules dir can obtain the available
 buses?  I am trying to make a clean version of bfe that can be for the PCI
 bus or can be part of a SoC (like BCM5354) on the SSB bus.  So for proper
 module building I need to know which bus interface I must build:
 if_bfe_pci.c, or if_bfe_siba.c, or both?

You can always include both buses.  If a bus driver isn't present in the 
kernel the attachment will just never be invoked.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Converting from jiffies to ticks

2010-08-20 Thread John Baldwin
On Friday, August 20, 2010 9:14:23 am Jesse Smith wrote:
 I am currently trying to port a program from Linux to FreeBSD which
 detects how much processor time a process is using. The native Linux
 code does this (in part) by reading the number of jiffies a given
 process uses. This info is pulled from the /proc/PID/stat file.
 
 One function is failing on FreeBSD and it's obviously because FreeBSD
 does not have all the same files/data in the /proc directory.
 
 I've looked around and, as I understand it, FreeBSD uses ticks instead
 of jiffies to measure process usage. However, how to gather that data
 is a bit lost on me.
 
 This raises a question for me:
 Where can I find the equivalent information on FreeBSD? I assume
 there's a function call. Maybe in the kvm_* family? I need to be able to
 get the number of ticks a given PID is using, both in the kernel and
 userspace.
 
 
 The rest of the program measures everything in jiffies, so it would be
 ideal for me to get the ticks used on FreeBSD (based on PID), convert it
 to jiffies and pass it back to the main program.

FreeBSD saves the total runtime in an architecture-dependent ticker count
that is separate from ticks.  (ticks tends to run at hz, so by default
1000 times per second, whereas the 'ticker' on x86 is the TSC, which runs at
the clock speed of the CPU, throttling and turbo boost aside.)  You can look
at the calcru() function to see how the kernel converts the runtime ticker
count (saved in rux_runtime) into microseconds.
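
From userland the times are already exposed as struct timeval values, so
the conversion to a jiffies-style count is simple arithmetic.  The sketch
below uses getrusage(2), which only covers the calling process; for another
PID I believe the same ru_utime/ru_stime pair can be read from the
ki_rusage member of struct kinfo_proc (via kvm_getprocs(3) or the
kern.proc.pid sysctl), but that part is not shown.  The helper names are
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <unistd.h>

/* Convert a struct timeval to clock ticks at tick_hz ticks per second. */
static uint64_t
timeval_to_jiffies(const struct timeval *tv, long tick_hz)
{
	uint64_t usec;

	usec = (uint64_t)tv->tv_sec * 1000000ULL + (uint64_t)tv->tv_usec;
	return (usec * (uint64_t)tick_hz / 1000000ULL);
}

/* Fetch user and system CPU time of the calling process in clock ticks. */
static int
self_cpu_jiffies(uint64_t *user, uint64_t *sys)
{
	struct rusage ru;
	long hz;

	hz = sysconf(_SC_CLK_TCK);
	if (hz <= 0 || getrusage(RUSAGE_SELF, &ru) != 0)
		return (-1);
	*user = timeval_to_jiffies(&ru.ru_utime, hz);
	*sys = timeval_to_jiffies(&ru.ru_stime, hz);
	return (0);
}
```

For example, 1.5 seconds at 100 ticks per second comes out as 150.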

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Question about printcpuinfo in sys/amd64/amd64/indentcpu.c

2010-08-23 Thread John Baldwin
On Friday, August 20, 2010 10:14:46 am Garrett Cooper wrote:
 Hi,
 Currently the code in identcpu.c does a check for a specific cpu
 value extension. This is set to 0x80000004 (even though the
 corresponding code below iterates through 0x80000002:0x80000005):

It does not invoke 0x80000005 (<, not <=, is used as the loop terminator).

 /* Check for extended CPUID information and a processor name. */
 if (cpu_exthigh >= 0x80000004) {
         brand = cpu_brand;
         for (i = 0x80000002; i < 0x80000005; i++) {
                 do_cpuid(i, regs);
                 memcpy(brand, regs, sizeof(regs));
                 brand += sizeof(regs);
         }
 }
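
The same loop can be tried from userland.  This is a sketch under the
assumption that the compiler ships GCC/clang's <cpuid.h> (for
__get_cpuid_max() and __get_cpuid()); read_cpu_brand() is a made-up helper
name, and on non-x86 builds it simply reports failure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/*
 * Copy the CPU brand string into buf, which must hold at least 49 bytes
 * (3 leaves * 16 bytes + NUL).  Returns 0 on success, -1 if unavailable.
 */
static int
read_cpu_brand(char *buf, size_t len)
{
	if (len < 49)
		return (-1);
	memset(buf, 0, len);
#if defined(__x86_64__) || defined(__i386__)
	unsigned int regs[4];
	unsigned int leaf;
	char *p = buf;

	/* Same check as cpu_exthigh >= 0x80000004 in the kernel code. */
	if (__get_cpuid_max(0x80000000, NULL) < 0x80000004)
		return (-1);
	/* Leaves 0x80000002..0x80000004 inclusive; 0x80000005 is not read. */
	for (leaf = 0x80000002; leaf < 0x80000005; leaf++) {
		__get_cpuid(leaf, &regs[0], &regs[1], &regs[2], &regs[3]);
		memcpy(p, regs, sizeof(regs));
		p += sizeof(regs);
	}
	return (0);
#else
	return (-1);
#endif
}
```

Each leaf returns 16 bytes in eax/ebx/ecx/edx, so three leaves fill the
48-byte brand string.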

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: kld modules remain loaded if MOD_LOAD handler returns an error

2010-08-23 Thread John Baldwin
On Friday, August 20, 2010 1:13:53 pm Ryan Stone wrote:
 Consider the following modules:
 
 /* first.c */
 static int *test;
 
 int
 test_function(void)
 {
         return *test;
 }
 
 static int
 first_modevent(struct module *m, int what, void *arg)
 {
         int err = 0;
 
         switch (what) {
         case MOD_LOAD:          /* kldload */
                 test = malloc(sizeof(int), M_TEMP, M_NOWAIT | M_ZERO);
                 if (!test)
                         err = ENOMEM;
                 break;
         case MOD_UNLOAD:        /* kldunload */
                 break;
         default:
                 err = EINVAL;
                 break;
         }
         return(err);
 }
 
 static moduledata_t first_mod = {
         "first",
         first_modevent,
         NULL
 };
 
 DECLARE_MODULE(first, first_mod, SI_SUB_KLD, SI_ORDER_ANY);
 MODULE_VERSION(first, 1);
 
 
 /* second.c */
 static int
 second_modevent(struct module *m, int what, void *arg)
 {
         int err = 0;
 
         switch (what) {
         case MOD_LOAD:          /* kldload */
                 test_function();
                 break;
         case MOD_UNLOAD:        /* kldunload */
                 break;
         default:
                 err = EINVAL;
                 break;
         }
         return(err);
 }
 
 static moduledata_t second_mod = {
         "second",
         second_modevent,
         NULL
 };
 
 DECLARE_MODULE(second, second_mod, SI_SUB_KLD, SI_ORDER_ANY);
 MODULE_DEPEND(second, first, 1, 1, 1);
 
 
 Consider the case where malloc fails in first_modevent.
 first_modevent will return ENOMEM, but the module will remain loaded.
 Now when the second module goes and loads, it calls into the first
 module, which is not initialized properly, and promptly crashes when
 test_function() dereferences a null pointer.
 
 It seems to me that a module should be unloaded if it returns an error
 from its MOD_LOAD handler.  However, that's easier said than done.
 The MOD_LOAD handler is called from a SYSINIT, and there's no
 immediately obvious way to pass information about the failure from the
 SYSINIT to the kernel linker.  Anybody have any thoughts on this?

Yeah, it's not easy to fix.  Probably we could patch the kernel linker to 
notice if any of the modules for a given linker file had errors during
initialization and trigger an unload if that occurs.  I don't think this would 
be too hard to do.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: kld modules remain loaded if MOD_LOAD handler returns an error

2010-08-23 Thread John Baldwin
On Monday, August 23, 2010 11:04:20 am Andriy Gapon wrote:
 on 23/08/2010 15:10 John Baldwin said the following:
  On Friday, August 20, 2010 1:13:53 pm Ryan Stone wrote:
  Consider the following modules:
 
  /* first.c */
  static int *test;
 
  int
  test_function(void)
  {
  return *test;
  }
 
  static int
  first_modevent(struct module *m, int what, void *arg)
  {
  int err = 0;
 
  switch (what) {
  case MOD_LOAD:/* kldload */
  test = malloc(sizeof(int), M_TEMP, M_NOWAIT | M_ZERO);
  if (!test)
  err = ENOMEM;
  break;
  case MOD_UNLOAD:  /* kldunload */
  break;
  default:
  err = EINVAL;
  break;
  }
  return(err);
  }
 
  static moduledata_t first_mod = {
  "first",
  first_modevent,
  NULL
  };
 
  DECLARE_MODULE(first, first_mod, SI_SUB_KLD, SI_ORDER_ANY);
  MODULE_VERSION(first, 1);
 
 
  /* second.c */
  static int
  second_modevent(struct module *m, int what, void *arg)
  {
  int err = 0;
 
  switch (what) {
  case MOD_LOAD:/* kldload */
  test_function();
  break;
  case MOD_UNLOAD:  /* kldunload */
  break;
  default:
  err = EINVAL;
  break;
  }
  return(err);
  }
 
  static moduledata_t second_mod = {
  "second",
  second_modevent,
  NULL
  };
 
  DECLARE_MODULE(second, second_mod, SI_SUB_KLD, SI_ORDER_ANY);
  MODULE_DEPEND(second, first, 1, 1, 1);
 
 
  Consider the case where malloc fails in first_modevent.
  first_modevent will return ENOMEM, but the module will remain loaded.
  Now when the second module goes and loads, it calls into the first
  module, which is not initialized properly, and promptly crashes when
  test_function() dereferences a null pointer.
 
  It seems to me that a module should be unloaded if it returns an error
  from its MOD_LOAD handler.  However, that's easier said than done.
  The MOD_LOAD handler is called from a SYSINIT, and there's no
  immediately obvious way to pass information about the failure from the
  SYSINIT to the kernel linker.  Anybody have any thoughts on this?
  
  Yeah, it's not easy to fix.  Probably we could patch the kernel linker to 
  notice if any of the modules for a given linker file had errors during
  initialization and trigger an unload if that occurs.  I don't think this 
  would 
  be too hard to do.
 
 John,
 
 please note that for this testcase we would need to prevent second module's
 modevent from being executed at all.
 Perhaps a module shouldn't be considered as loaded until the modevent
 caller marks it as successfully initialized, but I haven't looked at the
 actual code.

Well, if these two event handlers are in the same module, I think that is a
bug in the module really.  I tend to collapse such things down to a single
event handler per kld just so I can really get the ordering correct anyway. :)
If they are in two separate .ko files then the other solution would work.
We could also hack the module code to mark a linker_file as 'broken' and have
the module_helper sysinit not call mod_load if the containing file is 'broken',
etc.
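
As a rough illustration of the 'broken' idea for handlers that live in the
same file, here is a toy userland model.  Nothing in it is kernel code: the
toy_* names, the structures, and the rollback step are all assumptions made
up for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userland model; none of these names exist in the kernel. */
typedef int (*toy_load_fn)(void);

struct toy_module {
	const char *name;
	toy_load_fn load;	/* stands in for the MOD_LOAD handler */
	bool loaded;
};

struct toy_file {
	struct toy_module *mods;
	size_t nmods;
	bool broken;		/* set once any handler fails */
};

static int toy_ok_calls;	/* counts calls to the succeeding handler */

static int toy_load_ok(void) { toy_ok_calls++; return (0); }
static int toy_load_enomem(void) { return (ENOMEM); }

/* Run each handler in order, skipping the rest once the file is broken. */
static int
toy_file_load(struct toy_file *f)
{
	int error, first_error = 0;
	size_t i;

	for (i = 0; i < f->nmods; i++) {
		if (f->broken)
			continue;	/* do not call into a broken file */
		error = f->mods[i].load();
		if (error != 0) {
			f->broken = true;
			first_error = error;
		} else
			f->mods[i].loaded = true;
	}
	if (f->broken) {
		/* Roll back: a broken file must not stay loaded. */
		for (i = 0; i < f->nmods; i++)
			f->mods[i].loaded = false;
	}
	return (first_error);
}
```

With a failing "first" handler followed by "second", the second handler is
never called and the caller gets ENOMEM back, which is the behavior the
testcase needs.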

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Why doesn't ppc(4) check non-ENXIO failures during probe?

2010-08-24 Thread John Baldwin
On Tuesday, August 24, 2010 12:09:45 am M. Warner Losh wrote:
 In message: 201008171615.21103@freebsd.org
 John Baldwin j...@freebsd.org writes:
 :  So more or less it's for BIOSes with ISA that doesn't feature plug
 :  and play (286s, 386s, some 486s?)? Just trying to fill in the gap :).
 : 
 : Yes, it may perhaps still be useful for some x86 embedded systems, though
 : it is doubtful that those would use a ppc(4) device perhaps.
 
 Many embedded x86 systems use ppc(4) as a DIO port.  ppi attaches to
 it and can be used to frob bits.
 
 These days, of course, almost all boards have ACPI, so that means they
 get enumerated that way.  Only boards that don't run windows might not
 have ACPI, in which case the devices are usually enumerated via
 PNPBIOS.  But not always, since those boards tend to have the buggiest
 BIOSes on the planet in this area.  Hints are needed on a few of these
 boards since nothing else will work.  And they have Atom processors on
 them...

The specific code I am referring to is the code in ppc_isa_probe() that tries 
to auto-identify a ppc port by poking at various I/O ports directly.  It is 
not enabled by default.  You'd have to have a ppc hint that did not include an 
I/O port for this code to be triggered I think as it only gets executed if a 
ppc(4) device does not have an I/O port resource from ACPI/PnPBIOS/hints.

I was mostly thinking of this in terms of ISA cards, and I doubt that even 
modern embedded systems have ISA slots. :)

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: disassembler

2010-08-27 Thread John Baldwin
On Thursday, August 26, 2010 11:42:25 pm Aryeh Friedman wrote:
 On Thu, Aug 26, 2010 at 11:36 PM, Aryeh Friedman
 aryeh.fried...@gmail.com wrote:
  On Thu, Aug 26, 2010 at 10:46 PM, Dirk Engling erdge...@erdgeist.org wrote:
  On 27.08.10 04:17, Aryeh Friedman wrote:
 
  Is there a disassembler in the base system if not what is a good
  option from ports?
 
  Try objdump -d,
 
   erdgeist
 
 
  flosoft# objdump -d /dev/da0
  objdump: Warning: '/dev/da0' is not an ordinary file

For a raw file of x86 instructions use ndisasm from the 'nasm' port.  Note 
that it assumes 16-bit code by default, but you can use ndisasm -u to parse 
32-bit instructions instead.  For a typical MBR boot loader, plain ndisasm 
should work fine:

# ndisasm /dev/twed0
00000000  FC                cld
00000001  31C0              xor ax,ax
00000003  8EC0              mov es,ax
00000005  8ED8              mov ds,ax
00000007  8ED0              mov ss,ax
00000009  BC007C            mov sp,0x7c00
0000000C  BE1A7C            mov si,0x7c1a
0000000F  BF1A06            mov di,0x61a
00000012  B9E601            mov cx,0x1e6
00000015  F3A4              rep movsb
00000017  E9008A            jmp word 0x8a1a
0000001A  31F6              xor si,si
...

etc.

I would dd the first sector of your disk off to a file and run ndisasm on that 
though rather than on the live disk.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Debugging Loadable Modules Using GDB

2010-08-30 Thread John Baldwin
On Friday, August 27, 2010 4:11:41 pm Alexander Fiveg wrote:
 Hi,
 from FreeBSD Developers' Handbook, 10.7 Debugging Loadable Modules Using
 GDB:
 ...
 (kgdb) add-symbol-file /sys/modules/linux/linux.ko 0xc0ae22d0
 ...
 
 Actually I couldn't debug my modules using the .ko file. Moreover, I've found
 out that .ko files do not contain sections with debugging info. With the
 .kld file debugging works. Am I doing something incorrectly, or is the info
 in the Developers' Handbook outdated?

With newer versions of kgdb you shouldn't need to manually invoke
'add-symbol-file'.  Kernel modules are treated as shared libraries and should
automatically be loaded.  Try using 'info sharedlibrary' to see the list of
kernel modules and whether symbols for them are already loaded.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: System freezes unexpectly

2010-08-30 Thread John Baldwin
On Sunday, August 29, 2010 10:18:48 am Davide Italiano wrote:
 Hi.
 I'm running 8.1 on my Sony Vaio laptop, with dwm as window manager on
 the latest Xorg from ports.
 When I'm trying to run firefox3, the system freezes unexpectedly. I
 know that "freezes" is a bit generic but I can't find a more specific
 term to describe the situation. Dmesg doesn't give useful info.
 
 I installed firefox using pkg_add -r; the only add-on/plugin
 installed is Xmarks.
 
 I'm ready to debug; any suggestion is appreciated.
 
 Thanks

Can you ssh into the machine or ping it when it is frozen?

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Debugging Loadable Modules Using GDB

2010-08-30 Thread John Baldwin
On Monday, August 30, 2010 12:12:50 pm Alexander Fiveg wrote:
 On Mon, Aug 30, 2010 at 08:16:11AM -0400, John Baldwin wrote:
   On Friday, August 27, 2010 4:11:41 pm Alexander Fiveg wrote:
   Hi,
   from FreeBSD Developers' Handbook, 10.7 Debugging Loadable Modules Using
   GDB:
   ...
   (kgdb) add-symbol-file /sys/modules/linux/linux.ko 0xc0ae22d0
   ...
   
   Actually I couldn't debug my modules using the .ko file. Moreover, I've
   found out that .ko files do not contain sections with debugging info.
   With the .kld file debugging works. Am I doing something incorrectly,
   or is the info in the Developers' Handbook outdated?
  
  With newer versions of kgdb you shouldn't need to manually invoke
  'add-symbol-file'.  Kernel modules are treated as shared libraries and
  should automatically be loaded.  Try using 'info sharedlibrary' to see
  the list of kernel modules and if symbols for them are loaded already.
 Yes, the .ko files are loaded automatically. The problem is that they do
 not contain debugging info. I have always to load the .kld file in order to 
 debug a module:
 
 (kgdb) f 9
 #9  0xc4dc558b in rm_8254_delayed_interrupt_per_packet () from
 /boot/kernel/if_ringmap.ko
 (kgdb) info locals
 No symbol table info available.
 
 (kgdb) add-symbol-file 
 /home/alexandre/p4/ringmap/current/sys/modules/ringmap/if_ringmap.kld 
 0xc4dafc70
 add symbol table from file 
 /home/alexandre/p4/ringmap/current/sys/modules/ringmap/if_ringmap.kld
 at
   .text_addr = 0xc4dafc70
 (y or n) y
 Reading symbols from 
 /home/alexandre/p4/ringmap/current/sys/modules/ringmap/if_ringmap.kld...done.
 
 (kgdb) f 9 
 #9  0xc4dc558b in rm_8254_delayed_interrupt_per_packet ()
 at 
 /home/alexandre/p4/ringmap/current/sys/modules/ringmap/../../dev/e1000/ringmap_8254.c:142
 142   co->ring->slot[slot_num].ts = co->ring->last_ts;
 
 (kgdb) info locals
 co = (struct capt_object *) 0xc4d68380
 adapter = (struct adapter *) 0xc4e77000
 __func__ = 
 e\000\000�\...@\000\000\211\203�e\000\000\017\206b\022\000\000\2039\000\213a\004\017\205�\f\000\000\001��1�
 
 
 Is there any way to get the all symbols and needed debug info without
 loading the .kld file ?

How are you compiling the kld?  If you are building it by hand, use
'make DEBUG_FLAGS=-g' when you build and install the kld.  That should build
with debug symbols enabled and install the ko.symbols file which kgdb will
find and use.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Debugging Loadable Modules Using GDB

2010-08-30 Thread John Baldwin
On Monday, August 30, 2010 4:34:04 pm Alexander Fiveg wrote:
 On Mon, Aug 30, 2010 at 01:10:37PM -0400, John Baldwin wrote:
   On Monday, August 30, 2010 12:12:50 pm Alexander Fiveg wrote:
   On Mon, Aug 30, 2010 at 08:16:11AM -0400, John Baldwin wrote:
    On Friday, August 27, 2010 4:11:41 pm Alexander Fiveg wrote:
     Hi,
     from FreeBSD Developers' Handbook, 10.7 Debugging Loadable Modules
     Using GDB:
     ...
     (kgdb) add-symbol-file /sys/modules/linux/linux.ko 0xc0ae22d0
     ...
     
     Actually I couldn't debug my modules using the .ko file. Moreover, I've
     found out that .ko files do not contain sections with debugging info.
     With the .kld file debugging works. Am I doing something incorrectly,
     or is the info in the Developers' Handbook outdated?
    
    With newer versions of kgdb you shouldn't need to manually invoke
    'add-symbol-file'.  Kernel modules are treated as shared libraries and
    should automatically be loaded.  Try using 'info sharedlibrary' to see
    the list of kernel modules and if symbols for them are loaded already.
   Yes, the .ko files are loaded automatically. The problem is that they do
   not contain debugging info. I have always to load the .kld file in order 
   to 
   debug a module:
   
   (kgdb) f 9
   #9  0xc4dc558b in rm_8254_delayed_interrupt_per_packet () from
   /boot/kernel/if_ringmap.ko
   (kgdb) info locals
   No symbol table info available.
   
   (kgdb) add-symbol-file 
   /home/alexandre/p4/ringmap/current/sys/modules/ringmap/if_ringmap.kld 
   0xc4dafc70
   add symbol table from file 
   /home/alexandre/p4/ringmap/current/sys/modules/ringmap/if_ringmap.kld
   at
 .text_addr = 0xc4dafc70
   (y or n) y
   Reading symbols from 
   /home/alexandre/p4/ringmap/current/sys/modules/ringmap/if_ringmap.kld...done.
   
   (kgdb) f 9 
   #9  0xc4dc558b in rm_8254_delayed_interrupt_per_packet ()
   at 
   /home/alexandre/p4/ringmap/current/sys/modules/ringmap/../../dev/e1000/ringmap_8254.c:142
   142   co->ring->slot[slot_num].ts = co->ring->last_ts;
   
   (kgdb) info locals
   co = (struct capt_object *) 0xc4d68380
   adapter = (struct adapter *) 0xc4e77000
   __func__ = 
   e\000\000�\...@\000\000\211\203�e\000\000\017\206b\022\000\000\2039\000\213a\004\017\205�\f\000\000\001��1�
   
   
   Is there any way to get the all symbols and needed debug info without
   loading the .kld file ?
  
  How are you compiling the kld?  If you are building it by hand, use
  'make DEBUG_FLAGS=-g' when you build and install the kld.  That should build
  with debug symbols enabled and install the ko.symbols file which kgdb will
  find and use.
 Thanks a lot!. That is what I want to know. But I think this option is not 
 mentioned anywhere. I could not find it in man make make.conf and also 
 no mention about it in FreeBSD Developers' Handbook. 

It's a bit of a feature of the bsd.*.mk files that if you define
'DEBUG_FLAGS' it is added to CFLAGS (and CXXFLAGS) and that any resulting
binaries are not stripped, etc.  The same trick can be used to build debug
versions of binaries and libraries.  It probably is underdocumented.  Not
sure make.conf(5) is the right place as the typical usage is on the command
line, not in /etc/make.conf or /etc/src.conf.  However, I can't think of a
better place.  Maybe src.conf(5)?

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: System freezes unexpectly

2010-08-31 Thread John Baldwin
On Monday, August 30, 2010 12:45:40 pm Garrett Cooper wrote:
 On Mon, Aug 30, 2010 at 9:24 AM, Davide Italiano
 davide.itali...@gmail.com wrote:
  Removing ~/.mozilla works fine. I think the problem is related to the
  Xmarks add-on I installed or to the Restore session functionality.
 
 It would have been interesting to capture what `froze' the machine, in
 particular because it could have been a valuable bug for either
 Mozilla to capture and fix, or for us to capture and fix. Unless your
 machine doesn't meet the hardware requirements, I don't see a reason
 why a userland application should lock up a system.
 
 There are other ways you can debug this further, using -safe-mode as a
 next step, then choose to not restore the last session (which is
 available from within the javascript settings file -- nsPrefs.js?).

If only firefox is frozen, then you can always ssh in from another machine and 
use top/ps, etc., or even gdb on the firefox process itself.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: MCE Decoding - MCA: Bank 8, Status 0xcc0031800001009f/0xc8000980000200cf

2010-09-13 Thread John Baldwin
On Saturday, September 11, 2010 1:40:28 am Simon wrote:
 Hello,
 
 Can someone please help me decode these two errors on FreeBSD 8.1-R:
 
 MCA: Bank 8, Status 0xcc0031800001009f
 MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
 MCA: Vendor GenuineIntel, ID 0x106a5, APIC ID 16
 MCA: CPU 0 COR (198) OVER RD channel ?? memory error
 MCA: Address 0x1b6188d80
 MCA: Misc 0x72ae24200084
 
 MCA: Bank 8, Status 0xc8000980000200cf
 MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
 MCA: Vendor GenuineIntel, ID 0x106a5, APIC ID 16
 MCA: CPU 0 COR (38) OVER MS channel ?? memory error
 MCA: Misc 0x72ae24200140

HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 BANK 8 
MISC 72ae24200084 ADDR 1b6188d80 
MCG status:
MCi status:
Error overflow
MCi_MISC register valid
MCi_ADDR register valid
MCA: MEMORY CONTROLLER RD_CHANNELunspecified_ERR
Transaction: Memory read error
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 198
Memory transaction Tracker ID (RTId): 84
Memory DIMM ID of error: 0
Memory channel ID of error: 0
Memory ECC syndrome: 72ae2420
STATUS cc0031800001009f MCGSTATUS 0
MCGCAP 1c09 APICID 10 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 26
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 BANK 8 
MISC 72ae24200140 
MCG status:
MCi status:
Error overflow
MCi_MISC register valid
MCA: MEMORY CONTROLLER MS_CHANNELunspecified_ERR
Transaction: Memory scrubbing error
Memory ECC error occurred during scrub
Memory corrected error count (CORE_ERR_CNT): 38
Memory transaction Tracker ID (RTId): 40
Memory DIMM ID of error: 0
Memory channel ID of error: 0
Memory ECC syndrome: 72ae2420
STATUS c8000980000200cf MCGSTATUS 0
MCGCAP 1c09 APICID 10 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 26

You have some corrected memory errors (198+38 = 236) in the first DIMM (on the 
SuperMicro boards we have at work, it would correspond to the DIMM slot 
labeled P1_DIMM1A).  In my experience I would just ignore them unless the 
count gets much higher (say 1+ / per hour).
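
If you want to pull the fields out of a status word yourself, the sketch
below decodes the flag bits and the corrected error count.  The bit
positions are from my reading of the Intel SDM layout for IA32_MCi_STATUS
(VAL 63, OVER 62, UC 61, MISCV 59, ADDRV 58, corrected count in bits 52:38,
MCA error code in bits 15:0); struct mc_decoded and mc_decode() are made-up
names, not kernel or mcelog code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MC_STATUS_VAL	(1ULL << 63)	/* register contents valid */
#define MC_STATUS_OVER	(1ULL << 62)	/* error overflow */
#define MC_STATUS_UC	(1ULL << 61)	/* uncorrected error */
#define MC_STATUS_MISCV	(1ULL << 59)	/* MCi_MISC register valid */
#define MC_STATUS_ADDRV	(1ULL << 58)	/* MCi_ADDR register valid */

struct mc_decoded {
	bool valid, overflow, uncorrected, miscv, addrv;
	unsigned int cor_count;		/* corrected error count, bits 52:38 */
	unsigned int mca_code;		/* MCA error code, bits 15:0 */
};

/* Split a raw MCi_STATUS value into its flag bits and counters. */
static struct mc_decoded
mc_decode(uint64_t status)
{
	struct mc_decoded d;

	d.valid = (status & MC_STATUS_VAL) != 0;
	d.overflow = (status & MC_STATUS_OVER) != 0;
	d.uncorrected = (status & MC_STATUS_UC) != 0;
	d.miscv = (status & MC_STATUS_MISCV) != 0;
	d.addrv = (status & MC_STATUS_ADDRV) != 0;
	d.cor_count = (unsigned int)((status >> 38) & 0x7fff);
	d.mca_code = (unsigned int)(status & 0xffff);
	return (d);
}
```

Running the two status values from the subject line through it gives
corrected counts of 198 and 38, with UC clear in both, which agrees with the
decoded output above.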

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: is vfs.lookup_shared unsafe in 7.3?

2010-09-15 Thread John Baldwin
On Monday, September 13, 2010 4:57:15 pm cronfy wrote:
 Hello,
 
 Trying to overtake high server load (sudden peaks of 15%us/85%sy, LA 
 40, very slow lstat() at these moments, looks like some kind of lock
 contention) I enabled vfs.lookup_shared=1 on two servers today. One is
 FreeBSD-7.3 kernel csup'ed and built Sep  9 2010 and other is
 FreeBSD-7.3 csup'ed and built Jul 16 2010.
 
 The server with the fresher kernel is running nicely and does not show
 high load anymore. But on the second server it did not help. Moreover,
 after a few hours of work with vfs.lookup_shared=1 I noticed processes
 stuck in the ufs state. I tried to kill them with no luck. Disabling
 vfs.lookup_shared froze the whole system.
 
 So, is vfs.lookup_shared=1 unsafe in 7.3? Did it become more stable
 between 16 Jul and 9 Sep (is it the reason why first system is still
 running?), or should I expect that it will freeze in a near time too?
 
 Thanks in advance!

No, 7.3 has a bug that can cause these hangs that is probably made worse by
vfs.lookup_shared=1, but can occur even if it is disabled.  You want
these fixes applied (in order, one of them reverts part of another):

Author: jhb
Date: Fri Jul 16 20:23:24 2010
New Revision: 210173
URL: http://svn.freebsd.org/changeset/base/210173

Log:
  When the MNTK_EXTENDED_SHARED mount option was added, some filesystems were
  changed to defer the setting of VN_LOCK_ASHARE() (which clears LK_NOSHARE
  in the vnode lock's flags) until after they had determined if the vnode was
  a FIFO.  This occurs after the vnode has been inserted into a VFS hash or
  some similar table, so it is possible for another thread to find this vnode
  via vget() on an i-node number and block on the vnode lock.  If the lockmgr
  interlock (vnode interlock for vnode locks) is not held when clearing the
  LK_NOSHARE flag, then the lk_flags field can be clobbered.  As a result
  the thread blocked on the vnode lock may never get woken up.  Fix this by
  holding the vnode interlock while modifying the lock flags in this case.
  
  The softupdates code also toggles LK_NOSHARE in one function to close a
  race with snapshots.  Fix this code to grab the interlock while fiddling
  with lk_flags.

Modified:
  stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
  stable/7/sys/fs/cd9660/cd9660_vfsops.c
  stable/7/sys/fs/udf/udf_vfsops.c
  stable/7/sys/ufs/ffs/ffs_softdep.c
  stable/7/sys/ufs/ffs/ffs_vfsops.c

Author: jhb
Date: Fri Aug 20 20:33:13 2010
New Revision: 211532
URL: http://svn.freebsd.org/changeset/base/211532

Log:
  MFC: Use VN_LOCK_AREC() and VN_LOCK_ASHARE() rather than manipulating
  lockmgr lock flags directly.

Modified:
  stable/7/sys/fs/nwfs/nwfs_node.c
  stable/7/sys/fs/pseudofs/pseudofs_vncache.c
  stable/7/sys/fs/smbfs/smbfs_node.c
  stable/7/sys/gnu/fs/xfs/FreeBSD/xfs_freebsd_iget.c
  stable/7/sys/kern/vfs_lookup.c

Author: jhb
Date: Fri Aug 20 20:58:57 2010
New Revision: 211533
URL: http://svn.freebsd.org/changeset/base/211533

Log:
  Revert 210173 as it did not properly fix the bug.  It assumed that the
  VI_LOCK() for a given vnode was used as the internal interlock for that
  vnode's v_lock lockmgr lock.  This is not the case.  Instead, add dedicated
  routines to toggle the LK_NOSHARE and LK_CANRECURSE flags.  These routines
  lock the lockmgr lock's internal interlock to synchronize the updates to
  the flags member with other threads attempting to acquire the lock.  The
  VN_LOCK_A*() macros now invoke these routines, and the softupdates code
  uses these routines to temporarly enable recursion on buffer locks.
  
  Reviewed by:  kib

Modified:
  stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
  stable/7/sys/fs/cd9660/cd9660_vfsops.c
  stable/7/sys/fs/udf/udf_vfsops.c
  stable/7/sys/kern/kern_lock.c
  stable/7/sys/sys/lockmgr.h
  stable/7/sys/sys/vnode.h
  stable/7/sys/ufs/ffs/ffs_softdep.c
  stable/7/sys/ufs/ffs/ffs_vfsops.c

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Questions about mutex implementation in kern/kern_mutex.c

2010-09-15 Thread John Baldwin
 there is no any memory barrier in mtx_init()?  If another thread
(on another CPU) finds that mutex is initialized using mtx_initialized()
then it can mtx_lock() it and mtx_lock() it second time, as a result
mtx_recurse field will be increased, but its value still can be
uninitialized on architecture with relaxed memory ordering model.
 
 It seems to me that it's generally a programming error to rely on the
 return of mtx_initialized(), as there is no serialization with e.g. a
 thread calling mtx_destroy().  A fully correct serialization model
 would require that a single thread initialize the mtx and then create
 any worker threads that will use the mtx.

Yes, it is the caller's job to not expose a mtx until after it has been 
initialized.  A memory barrier in mtx_init() can't solve all those races.  If 
you put an object containing a mutex on a global queue and only invoke 
mtx_init() after dropping the global lock protecting the global queue, no 
amount of memory barriers in mtx_init() will save you.
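
A minimal user-space sketch of the ordering described above, assuming
POSIX threads as a stand-in for the kernel primitives.  The names
struct obj, obj_publish() and obj_bump_first() are hypothetical and only
illustrate "initialize the embedded lock first, publish second":

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical object that embeds its own lock. */
struct obj {
	pthread_mutex_t	 lock;
	int		 value;
	struct obj	*next;
};

static pthread_mutex_t	 queue_lock = PTHREAD_MUTEX_INITIALIZER;
static struct obj	*queue_head;

/* Fully initialize the embedded lock, then make the object visible. */
static void
obj_publish(struct obj *o)
{
	assert(o != NULL);
	pthread_mutex_init(&o->lock, NULL);	/* plays the role of mtx_init() */
	o->value = 0;

	pthread_mutex_lock(&queue_lock);
	o->next = queue_head;
	queue_head = o;				/* other threads can find it from here on */
	pthread_mutex_unlock(&queue_lock);
}

/* A consumer locks whatever object it finds at the head of the queue. */
static int
obj_bump_first(void)
{
	struct obj *o;
	int v = -1;

	pthread_mutex_lock(&queue_lock);
	o = queue_head;
	pthread_mutex_unlock(&queue_lock);
	if (o != NULL) {
		pthread_mutex_lock(&o->lock);
		v = ++o->value;
		pthread_mutex_unlock(&o->lock);
	}
	return (v);
}
```

Moving the pthread_mutex_init() call to after the queue insertion would
reproduce the race: a consumer could lock an object whose lock has not
been set up yet, and no barrier inside the init routine could prevent it.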

-- 
John Baldwin


Re: is vfs.lookup_shared unsafe in 7.3?

2010-09-16 Thread John Baldwin
On Thursday, September 16, 2010 3:53:47 am cronfy wrote:
  Hello,
 
  Trying to overtake high server load (sudden peaks of 15%us/85%sy, LA 
  40, very slow lstat() at these moments, looks like some kind of lock
  contention) I enabled vfs.lookup_shared=1 on two servers today. One is
  FreeBSD-7.3 kernel csup'ed and built Sep  9 2010 and other is
  FreeBSD-7.3 csup'ed and built Jul 16 2010.
 
  The server with more fresh kernel is running nice and does not show
  high load anymore. But on the second server it did not help. More,
  after a few hours of work with vfs.lookup_shared=1 I noticed processes
  stuck in ufs state. I tried to kill them with no luck. Disabling
  vfs.lookup_shared froze the whole system.
 
  So, is vfs.lookup_shared=1 unsafe in 7.3? Did it become more stable
  between 16 Jul and 9 Sep (is it the reason why first system is still
  running?), or should I expect that it will freeze in a near time too?
 
  Thanks in advance!
 
  No, 7.3 has a bug that can cause these hangs that is probably made worse by
  vfs.lookup_shared=1, but can occur even if it is disabled.  You want
  these fixes applied (in order, one of them reverts part of another):
 
 Thank you for the fix and for the explanation, that's exactly what I
 wanted to know. Just to be sure: do these patches completely fix the
 bug with hangs (even without vfs.lookup_shared=1)?

Yes.

-- 
John Baldwin


Re: odd issues with DDB vs GDB

2010-09-16 Thread John Baldwin
On Wednesday, September 15, 2010 8:01:19 pm Patrick Mahan wrote:
 All,
 
 I am trying to debug a system hang occurring on my HP Proliant G6 running 
 some of our
 kernel software.  I am seeing that under certain test loads, the system will 
 hang-up
 complete, no keyboard, no console, etc.  I suspect it is some of the kernel 
 code that
 I have inherited that contains a lot of locking (lots of data structure, each 
 having
 their own mutex lock (sleepable)).

You need to use 'kgdb' rather than 'gdb' on kernel.debug.

-- 
John Baldwin


Re: traling whitespace in CFLAGS if make.conf:CPUTYPE is not defined/empty

2010-09-16 Thread John Baldwin
On Wednesday, September 15, 2010 9:01:20 pm Alexander Best wrote:
 hi there,
 
 after discovering PR #114082 i noticed that with CPUTYPE not being defined in
 make.conf, `make -VCFLAGS` reports a trailing whitespace for CFLAGS.
 the reason for this is that ${_CPUCFLAGS} gets added to CFLAGS even if it's
 empty.
 
 the following patch should take care of the problem. i also added the same
 logic to COPTFLAGS. although i wasn't able to trigger the trailing whitespace,
 it should still introduce a cleaner behaviour.

Does the trailing whitespace break anything?  In the past we have had a
non-empty default CPU CFLAGS (e.g. using '-mtune=pentiumpro' on i386 at one
point IIRC) which this change would break.  Unless the trailing whitespace
is causing non-cosmetic problems I'd probably just leave it as it is.

Also, if we were to go with this approach, I would not have changed
kern.pre.mk at all, but set both NO_CPU_CFLAGS and NO_CPU_COPTFLAGS in
bsd.cpu.mk when CPUTYPE was empty.

-- 
John Baldwin


Re: Questions about mutex implementation in kern/kern_mutex.c

2010-09-16 Thread John Baldwin
On Thursday, September 16, 2010 1:33:07 pm Andrey Simonenko wrote:
 On Wed, Sep 15, 2010 at 08:46:00AM -0700, Matthew Fleming wrote:
  I'll take a stab at answering these...
  
  On Wed, Sep 15, 2010 at 6:44 AM, Andrey Simonenko
  si...@comsys.ntu-kpi.kiev.ua wrote:
   Hello,
  
   I have questions about mutex implementation in kern/kern_mutex.c
   and sys/mutex.h files (current versions of these files):
  
   1. Is the following statement correct for a volatile pointer or integer
variable: if a volatile variable is updated by the compare-and-set
instruction (e.g. atomic_cmpset_ptr(&val, ...)), then the current
value of such variable can be read without any special instruction
(e.g. v = val)?
  
I checked Assembler code for a function with v = val and val = v
like statements generated for volatile variable and simple variable
and found differences: on ia64 v = val was implemented by ld.acq and
val = v was implemented by st.rel; on mips and sparc64 Assembler code
can have different order of lines for volatile and simple variable
(depends on the code of a function).
  
  I think this depends somewhat on the hardware and what you mean by
  current value.
 
 Current value means that the value of a variable read by one thread
 is equal to the value of this variable successfully updated by another
 thread by the compare-and-set instruction.  As I understand from the kernel
 source code, atomic_cmpset_ptr() allows to update a variable in a way that
 all other CPUs will invalidate corresponding cache lines that contain
 the value of this variable.

That is not true.  It is likely true on x86, but it is certainly not true on
other architectures such as sparc64 where a write may be held in a store 
buffer for an indeterminate amount of time (and note that some lock releases 
are simple stores with a rel memory barrier).  All that we require is that 
if the value is stale, the atomic_cmpset() that attempts to set MTX_CONTESTED 
will fail.

 The mtx_owned(9) macro uses this property, mtx_owned() does not use anything
 special to compare the value of m->mtx_lock (volatile) with current thread
 pointer, all other functions that update m->mtx_lock of unowned mutex use
 compare-and-set instruction.  Also I cannot find anything special in
 generated Assembler code for volatile variables (except for ia64 where
 acquire loads and release stores are used).

No, mtx_owned() is just not harmed by the races it loses.  You can certainly 
read a stale value of mtx_lock in mtx_owned() if some other thread owns the 
lock or has just released the lock.  However, we don't care, because in both 
of those cases, mtx_owned() returns false.  What does matter is that 
mtx_owned() can only return true if we currently hold the mutex.  This works 
because 1) the same thread cannot call mtx_unlock() and mtx_owned() at the 
same time, and 2) even CPUs that hold writes in store buffers will snoop their 
store buffer for local reads on that CPU.  That is, a given CPU will never 
read a stale value of a memory word that is older than a write it has 
performed to that word.
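
The argument can be modeled in user space with C11 atomics.  This is only
a sketch: struct toy_mtx and the toy_*() helpers are hypothetical names,
not the kernel's implementation, and the owner is represented by an
arbitrary nonzero id instead of a thread pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define	TOY_UNOWNED	((uintptr_t)0)

struct toy_mtx {
	_Atomic uintptr_t lock;		/* owner id, or TOY_UNOWNED */
};

/* Only a successful compare-and-set can install an owner. */
static bool
toy_trylock(struct toy_mtx *m, uintptr_t self)
{
	uintptr_t expected = TOY_UNOWNED;

	assert(self != TOY_UNOWNED);
	return (atomic_compare_exchange_strong_explicit(&m->lock, &expected,
	    self, memory_order_acquire, memory_order_relaxed));
}

/* Release is a plain store with release semantics. */
static void
toy_unlock(struct toy_mtx *m)
{
	atomic_store_explicit(&m->lock, TOY_UNOWNED, memory_order_release);
}

/*
 * Ownership test with a relaxed load.  A stale value can only be
 * "unowned" or "some other owner", and both compare unequal to self.
 * The value can equal self only after this thread's own store, which
 * the thread always observes.
 */
static bool
toy_owned(struct toy_mtx *m, uintptr_t self)
{
	return (atomic_load_explicit(&m->lock, memory_order_relaxed) == self);
}
```

The relaxed load in toy_owned() is the point of the sketch: the check
needs no barrier because every value it could lose a race against gives
the same answer, false.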

  If you want a value that is not in-flux, then something like
  atomic_cmpset_ptr() setting to the current value is needed, so that
  you force any other atomic_cmpset to fail.  However, since there is no
  explicit lock involved, there is no strong meaning for current value
  and a read that does not rely on a value cached in a register is
  likely sufficient.  While the volatile keyword in C has no explicit
  hardware meaning, it often means that a load from memory (or,
  presumably, L1-L3 cache) is required.
 
 The volatile keyword here and all questions are related to the base C
 compiler, current version and currently supported architectures in FreeBSD.
 Yes, here under volatile I want to say that the value of a variable is
 not cached in a register and it is referenced by its address in all
 commands.
 
 There are some places in the kernel where a variable is updated in
 something like do { v = value; } while (!atomic_cmpset_int(&value, ...));
 and that variable is not volatile, but the compiler generates correct
 Assembler code.  So volatile is not a requirement for all cases.

Hmm, I suspect that many of those places actually do use volatile.  The 
various lock cookies (mtx_lock, etc.) are declared volatile in the structure.  
Otherwise the compiler would be free to conclude that 'v = value;' is a loop 
invariant and move it out of the loop which would break.  Given that, the 
construct you referred to does in fact require 'value' to be volatile.
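
A small sketch of the retry loop with the volatile qualifier in place.
toy_cmpset_int() is a hypothetical stand-in for atomic_cmpset_int(), built
here on the GCC/Clang __atomic_compare_exchange_n builtin (an assumption
about the compiler, not something taken from the kernel sources):

```c
#include <assert.h>

/* Hypothetical stand-in for atomic_cmpset_int(): nonzero on success. */
static int
toy_cmpset_int(volatile unsigned int *p, unsigned int oldval,
    unsigned int newval)
{
	return (__atomic_compare_exchange_n(p, &oldval, newval, 0,
	    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST));
}

/* Set flag bits in a shared word and return its previous contents. */
static unsigned int
toy_set_flags(volatile unsigned int *value, unsigned int flags)
{
	unsigned int v;

	assert(value != 0);
	do {
		/* volatile forces a fresh load on every pass of the loop */
		v = *value;
	} while (!toy_cmpset_int(value, v, v | flags));
	return (v);
}
```

Because the pointer target is volatile, the compiler may not treat
'v = *value' as loop invariant, which is the hoisting hazard described
above.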

-- 
John Baldwin


Re: Questions about mutex implementation in kern/kern_mutex.c

2010-09-17 Thread John Baldwin
On Thursday, September 16, 2010 11:24:29 pm Benjamin Kaduk wrote:
 On Thu, 16 Sep 2010, John Baldwin wrote:
 
  On Thursday, September 16, 2010 1:33:07 pm Andrey Simonenko wrote:
 
  The mtx_owned(9) macro uses this property, mtx_owned() does not use 
  anything
  special to compare the value of m->mtx_lock (volatile) with current thread
  pointer, all other functions that update m->mtx_lock of unowned mutex use
  compare-and-set instruction.  Also I cannot find anything special in
  generated Assembler code for volatile variables (except for ia64 where
  acquire loads and release stores are used).
 
  No, mtx_owned() is just not harmed by the races it loses.  You can certainly
  read a stale value of mtx_lock in mtx_owned() if some other thread owns the
  lock or has just released the lock.  However, we don't care, because in both
  of those cases, mtx_owned() returns false.  What does matter is that
  mtx_owned() can only return true if we currently hold the mutex.  This works
  because 1) the same thread cannot call mtx_unlock() and mtx_owned() at the
  same time, and 2) even CPUs that hold writes in store buffers will snoop 
  their
  store buffer for local reads on that CPU.  That is, a given CPU will never
  read a stale value of a memory word that is older than a write it has
  performed to that word.
 
 Sorry for the naive question, but would you mind expounding a bit on what 
 keeps the thread from migrating to a different CPU and getting a stale 
 value there?  (I can imagine a couple possible mechanisms, but don't know 
 enough to know which one(s) are the real ones.)

The memory barriers in the thread_lock() / thread_unlock() pair of a context
switch ensure that any writes posted by the thread before it performs a context
switch will be visible on the new CPU before the thread resumes execution.

-- 
John Baldwin


Re: Questions about mutex implementation in kern/kern_mutex.c

2010-09-17 Thread John Baldwin
On Friday, September 17, 2010 1:42:44 pm Andrey Simonenko wrote:
 On Thu, Sep 16, 2010 at 02:16:05PM -0400, John Baldwin wrote:
  On Thursday, September 16, 2010 1:33:07 pm Andrey Simonenko wrote:
   The mtx_owned(9) macro uses this property, mtx_owned() does not use 
   anything
   special to compare the value of m->mtx_lock (volatile) with current thread
   pointer, all other functions that update m->mtx_lock of unowned mutex use
   compare-and-set instruction.  Also I cannot find anything special in
   generated Assembler code for volatile variables (except for ia64 where
   acquire loads and release stores are used).
  
  No, mtx_owned() is just not harmed by the races it loses.  You can 
  certainly 
  read a stale value of mtx_lock in mtx_owned() if some other thread owns the 
  lock or has just released the lock.  However, we don't care, because in 
  both 
  of those cases, mtx_owned() returns false.  What does matter is that 
  mtx_owned() can only return true if we currently hold the mutex.  This 
  works 
  because 1) the same thread cannot call mtx_unlock() and mtx_owned() at the 
  same time, and 2) even CPUs that hold writes in store buffers will snoop 
  their 
  store buffer for local reads on that CPU.  That is, a given CPU will never 
  read a stale value of a memory word that is older than a write it has 
  performed to that word.
 
 Looks like I understand the logic why mtx_owned() works correctly when
 mtx_lock is present in CPU cache or is absent in CPU cache.  The mtx_lock
 value definitely can say whether lock is held by the current thread, but
 it cannot say whether it is unowned or is owned by another thread.
 
 Let me ask another one question about memory barriers and thread migration.
 
 Let a thread locked a mutex, modified shared data protected by this mutex
 and was migrated from CPU1 to CPU2 (mutex is still locked).  In this scenario
 just migrated thread will not see stale data for a mutex itself (the
 m->mtx_lock value) and for shared data on CPU2 because when it was migrated
 from CPU1 there was at least one unlock call for some other mutex that had
 release semantics and appropriate memory barrier instruction was run
 implicitly or explicitly.  As a result this rel memory barrier made all
 modifications from CPU1 visible on other CPUs.  When CPU2 switched to just
 migrated thread there was at least one lock call for some other mutex with
 acquire semantics, so rel/acq memory barriers pair works here together.
 (Also I consider case when CPU2 did not work with that mutex, but worked
 with its memory before.  Some thread on CPU2 could allocate some memory,
 worked with it and freed it.  Later the same part of memory was allocated
 by a thread on CPU1 for mutex).
 
 Is the above written description correct?

Yes.

   There are some places in the kernel where a variable is updated in
   something like do { v = value; } while (!atomic_cmpset_int(&value, ...));
   and that variable is not volatile, but the compiler generates correct
   Assembler code.  So volatile is not a requirement for all cases.
  
  Hmm, I suspect that many of those places actually do use volatile.  The 
  various lock cookies (mtx_lock, etc.) are declared volatile in the 
  structure.  
  Otherwise the compiler would be free to conclude that 'v = value;' is a 
  loop 
  invariant and move it out of the loop which would break.  Given that, the 
  construct you referred to does in fact require 'value' to be volatile.
 
 I checked Assembler code for these functions:
 
 kern/subr_msgbuf.c:msgbuf_addchar()
 vm/vm_map.c:vmspace_free()

They may happen to accidentally work because atomic_cmpset() clobbers all of
memory, but these should be marked volatile.

Index: vm/vm_map.c
===
--- vm/vm_map.c (revision 212801)
+++ vm/vm_map.c (working copy)
@@ -343,10 +343,7 @@
        if (vm->vm_refcnt == 0)
                panic("vmspace_free: attempt to free already freed vmspace");
 
-       do
-               refcnt = vm->vm_refcnt;
-       while (!atomic_cmpset_int(&vm->vm_refcnt, refcnt, refcnt - 1));
-       if (refcnt == 1)
+       if (atomic_fetchadd_int(&vm->vm_refcnt, -1) == 1)
                vmspace_dofree(vm);
 }
 
Index: vm/vm_map.h
===
--- vm/vm_map.h (revision 212801)
+++ vm/vm_map.h (working copy)
@@ -237,7 +237,7 @@
        caddr_t vm_taddr;       /* (c) user virtual address of text */
        caddr_t vm_daddr;       /* (c) user virtual address of data */
        caddr_t vm_maxsaddr;    /* user VA at max stack growth */
-       int vm_refcnt;          /* number of references */
+       volatile int vm_refcnt; /* number of references */
        /*
         * Keep the PMAP last, so that CPU-specific variations of that
         * structure on a single architecture don't result in offset
Index: sys/msgbuf.h
===
--- sys
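
For comparison, the two forms of the reference drop touched by the
vm_map.c hunk can be sketched in user space with C11 atomics.  struct
toy_space and both helpers are hypothetical; atomic_fetch_add() stands in
for the kernel's atomic_fetchadd_int():

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_space {
	atomic_int refcnt;
};

/* Loop form: retry the compare-and-set until it succeeds. */
static bool
toy_release_cas(struct toy_space *s)
{
	int old = atomic_load(&s->refcnt);

	assert(old > 0);
	while (!atomic_compare_exchange_weak(&s->refcnt, &old, old - 1))
		;	/* on failure 'old' now holds the current value */
	return (old == 1);	/* true if this call dropped the last reference */
}

/* Fetch-and-add form: one atomic operation, no explicit re-read. */
static bool
toy_release_fetchadd(struct toy_space *s)
{
	return (atomic_fetch_add(&s->refcnt, -1) == 1);
}
```

Both return true exactly when the caller released the last reference.
The second form has no separate plain read of the counter for the
compiler to hoist.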

Re: Bumping MAXCPU on amd64?

2010-09-22 Thread John Baldwin
On Wednesday, September 22, 2010 6:36:56 am Maxim Sobolev wrote:
 Hi,
 
 Is there any reason to keep MAXCPU at 16 in the default kernel config? 
 There are quite few servers on the market today that have 24 or even 32 
 physical cores. With hyper-threading this can even go as high as 48 or 
 64 virtual cpus. People who buy such hardware might get very 
 disappointed finding out that the FreeBSD is not going to use such 
 hardware to its full potential.
 
 Does anybody object if I'd bump MAXCPU to 32, which is still low but 
 might be a more reasonable default these days, or at least make it a 
 kernel configuration option documented in the NOTES?

?

% grep MAXCPU ~/work/freebsd/svn/head/sys/amd64/include/param.h 
#define MAXCPU  32
#define MAXCPU  1

In fact:

% grep MAXCPU ~/work/freebsd/svn/stable/8/sys/amd64/include/param.h 
#define MAXCPU  32
#define MAXCPU  1

Unfortunately this can't be MFC'd to 7 as it would destroy the ABI for 
existing klds. 

-- 
John Baldwin


Re: Bumping MAXCPU on amd64?

2010-09-22 Thread John Baldwin
On Wednesday, September 22, 2010 1:08:30 pm Curtis Penner wrote:
 MAXCPU at 32 has been good in the 32bit days.  Soon there will be (if 
 not already) systems that will have 16cores/socket or more, and 
 motherboards that have 4 sockets or more.  Combining this with 
 hyper-threading, you have gone significantly beyond the limits of 
 feasible server.

My point was in response to Maxim's mail about bumping it from 16.  Going 
higher than 32 is a bigger project (but in progress-ish) as it involves 
transitioning away from a simple int to hold CPU ID bitmasks (cpumask_t) and 
using cpuset_t instead.
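
To illustrate why the mask type is the limiting factor, here is a sketch
contrasting a single-word mask with an array-backed set.  The toy_* names
and the 128-CPU limit are hypothetical and do not reproduce the real
cpumask_t or cpuset_t definitions:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define	TOY_MAXCPU	128
#define	TOY_WORDBITS	(sizeof(unsigned long) * CHAR_BIT)
#define	TOY_NWORDS	((TOY_MAXCPU + TOY_WORDBITS - 1) / TOY_WORDBITS)

/* One-word mask: cannot name more CPUs than the word has bits. */
typedef unsigned int toy_cpumask_t;

/* Array-backed set: grows with TOY_MAXCPU instead of the word size. */
typedef struct {
	unsigned long bits[TOY_NWORDS];
} toy_cpuset_t;

static void
toy_cpuset_zero(toy_cpuset_t *s)
{
	memset(s, 0, sizeof(*s));
}

static void
toy_cpuset_set(toy_cpuset_t *s, unsigned int cpu)
{
	assert(cpu < TOY_MAXCPU);
	s->bits[cpu / TOY_WORDBITS] |= 1UL << (cpu % TOY_WORDBITS);
}

static bool
toy_cpuset_isset(const toy_cpuset_t *s, unsigned int cpu)
{
	assert(cpu < TOY_MAXCPU);
	return (((s->bits[cpu / TOY_WORDBITS] >> (cpu % TOY_WORDBITS)) & 1UL) != 0);
}
```

Every expression of the form 'mask & (1 << cpu)' has to become a call
like toy_cpuset_isset(), which is why the change is larger than bumping a
constant.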

-- 
John Baldwin


Re: PATCH: fix bogus error message bus_dmamem_alloc failed to align memory properly

2010-09-27 Thread John Baldwin
On Friday, September 24, 2010 9:00:44 pm Neel Natu wrote:
 Hi,
 
 This patch fixes the bogus error message from bus_dmamem_alloc() about
 the buffer not being aligned properly.
 
 The problem is that the check is against a virtual address as opposed
 to the physical address. contigmalloc() makes guarantees about
 the alignment of physical addresses but not the virtual address
 mapping it.
 
 Any objections if I commit this patch?

Hmmm, I guess you are doing super-page alignment rather than sub-page 
alignment?  In general I thought the busdma code only handled sub-page 
alignment and doesn't fully handle requests for super-page alignment.

For example, since it insists on walking individual pages at a time, if you 
had an alignment setting of 4 pages and passed in a single, aligned 4-page 
buffer, bus_dma would actually bounce the last 3 pages so that each individual 
page is 4-page aligned.  At least, I think that is what would happen.

For sub-page alignment, the virtual and physical address alignments should be 
the same.
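
A worked sketch of that last point, assuming 4096-byte pages and a
made-up translation helper (toy_vtophys() is hypothetical and simply
swaps the page frame while keeping the page offset, which is what any
page-granular mapping does):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define	TOY_PAGE_SIZE	4096u
#define	TOY_PAGE_MASK	(TOY_PAGE_SIZE - 1)

/* Hypothetical translation: new page frame, same offset within the page. */
static uint64_t
toy_vtophys(uint64_t va, uint64_t phys_frame)
{
	return ((phys_frame * TOY_PAGE_SIZE) | (va & TOY_PAGE_MASK));
}

/* Power-of-two alignment test, as in the checks being patched. */
static bool
toy_is_aligned(uint64_t addr, uint64_t alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	return ((addr & (alignment - 1)) == 0);
}
```

For any alignment up to TOY_PAGE_SIZE the test only looks at offset bits,
so the virtual and physical answers agree.  For a 4-page alignment the
frame bits matter and the two can differ, for example virtual 0x10000
mapped to physical frame 5.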

 best
 Neel
 
 Index: sys/powerpc/powerpc/busdma_machdep.c
 ===
 --- sys/powerpc/powerpc/busdma_machdep.c  (revision 213113)
 +++ sys/powerpc/powerpc/busdma_machdep.c  (working copy)
 @@ -529,7 +529,7 @@
   CTR4(KTR_BUSDMA, "%s: tag %p tag flags 0x%x error %d",
   __func__, dmat, dmat->flags, ENOMEM);
   return (ENOMEM);
 - } else if ((uintptr_t)*vaddr & (dmat->alignment - 1)) {
 + } else if (vtophys(*vaddr) & (dmat->alignment - 1)) {
   printf("bus_dmamem_alloc failed to align memory properly.\n");
   }
  #ifdef NOTYET
 Index: sys/sparc64/sparc64/bus_machdep.c
 ===
 --- sys/sparc64/sparc64/bus_machdep.c (revision 213113)
 +++ sys/sparc64/sparc64/bus_machdep.c (working copy)
 @@ -652,7 +652,7 @@
   }
   if (*vaddr == NULL)
   return (ENOMEM);
 - if ((uintptr_t)*vaddr % dmat->dt_alignment)
 + if (vtophys(*vaddr) % dmat->dt_alignment)
   printf("%s: failed to align memory properly.\n", __func__);
   return (0);
  }
 Index: sys/ia64/ia64/busdma_machdep.c
 ===
 --- sys/ia64/ia64/busdma_machdep.c(revision 213113)
 +++ sys/ia64/ia64/busdma_machdep.c(working copy)
 @@ -455,7 +455,7 @@
   }
   if (*vaddr == NULL)
   return (ENOMEM);
 - else if ((uintptr_t)*vaddr & (dmat->alignment - 1))
 + else if (vtophys(*vaddr) & (dmat->alignment - 1))
   printf("bus_dmamem_alloc failed to align memory properly.\n");
   return (0);
  }
 Index: sys/i386/i386/busdma_machdep.c
 ===
 --- sys/i386/i386/busdma_machdep.c(revision 213113)
 +++ sys/i386/i386/busdma_machdep.c(working copy)
 @@ -540,7 +540,7 @@
   CTR4(KTR_BUSDMA, "%s: tag %p tag flags 0x%x error %d",
   __func__, dmat, dmat->flags, ENOMEM);
   return (ENOMEM);
 - } else if ((uintptr_t)*vaddr & (dmat->alignment - 1)) {
 + } else if (vtophys(*vaddr) & (dmat->alignment - 1)) {
   printf("bus_dmamem_alloc failed to align memory properly.\n");
   }
   if (flags & BUS_DMA_NOCACHE)
 Index: sys/amd64/amd64/busdma_machdep.c
 ===
 --- sys/amd64/amd64/busdma_machdep.c  (revision 213113)
 +++ sys/amd64/amd64/busdma_machdep.c  (working copy)
 @@ -526,7 +526,7 @@
   CTR4(KTR_BUSDMA, "%s: tag %p tag flags 0x%x error %d",
   __func__, dmat, dmat->flags, ENOMEM);
   return (ENOMEM);
 - } else if ((uintptr_t)*vaddr & (dmat->alignment - 1)) {
 + } else if (vtophys(*vaddr) & (dmat->alignment - 1)) {
   printf("bus_dmamem_alloc failed to align memory properly.\n");
   }
   if (flags & BUS_DMA_NOCACHE)
 

-- 
John Baldwin


Re: PATCH: fix bogus error message bus_dmamem_alloc failed to align memory properly

2010-09-28 Thread John Baldwin
On Monday, September 27, 2010 5:13:03 pm Neel Natu wrote:
 Hi John,
 
 Thanks for reviewing this.
 
 On Mon, Sep 27, 2010 at 8:04 AM, John Baldwin j...@freebsd.org wrote:
  On Friday, September 24, 2010 9:00:44 pm Neel Natu wrote:
  Hi,
 
  This patch fixes the bogus error message from bus_dmamem_alloc() about
  the buffer not being aligned properly.
 
  The problem is that the check is against a virtual address as opposed
  to the physical address. contigmalloc() makes guarantees about
  the alignment of physical addresses but not the virtual address
  mapping it.
 
  Any objections if I commit this patch?
 
  Hmmm, I guess you are doing super-page alignment rather than sub-page
  alignment?  In general I thought the busdma code only handled sub-page
  alignment and doesn't fully handle requests for super-page alignment.
 
 
 Yes, this is for allocations with sizes greater than PAGE_SIZE and
 alignment requirements also greater than a PAGE_SIZE.
 
  For example, since it insists on walking individual pages at a time, if you
  had an alignment setting of 4 pages and passed in a single, aligned 4-page
  buffer, bus_dma would actually bounce the last 3 pages so that each 
  individual
  page is 4-page aligned.  At least, I think that is what would happen.
 
 
 I think you are referring to bus_dmamap_load() operation that would
 follow the bus_dmamem_alloc(), right? The memory allocated by
 bus_dmamem_alloc() does not need to be bounced. In fact, the dmamap
 pointer returned by bus_dmamem_alloc() is NULL.
 
 At least for the amd64 implementation there is code in
 _bus_dmamap_load_buffer() which will coalesce individual dma segments
 if they satisfy 'boundary' and 'segsize' constraints.

So the problem is earlier in the routine where it does this:

        /*
         * Get the physical address for this segment.
         */
        if (pmap)
                curaddr = pmap_extract(pmap, vaddr);
        else
                curaddr = pmap_kextract(vaddr);

        /*
         * Compute the segment size, and adjust counts.
         */
        max_sgsize = MIN(buflen, dmat->maxsegsz);
        sgsize = PAGE_SIZE - ((vm_offset_t)curaddr & PAGE_MASK);
        if (map->pagesneeded != 0 && run_filter(dmat, curaddr)) {
                sgsize = roundup2(sgsize, dmat->alignment);
                sgsize = MIN(sgsize, max_sgsize);
                curaddr = add_bounce_page(dmat, map, vaddr, sgsize);
        } else {
                sgsize = MIN(sgsize, max_sgsize);
        }

If you have a map that does need bouncing, then it will split up the pages.
It happens to work for bus_dmamem_alloc() because that returns a NULL map
which doesn't bounce.  But if you had a PCI device which supported only
32-bit addresses on a 64-bit machine with an aligned, 4 page buffer above
4GB and did a bus_dmamap_load() on that buffer, it would get split up into
4 separate 4 page-aligned pages.
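
A simplified model of the per-page walk, assuming 4096-byte pages.  It
deliberately ignores the roundup2() adjustment and any later coalescing
of segments, and only shows that each step covers at most the remainder
of the current page.  toy_count_segments() is a hypothetical helper, not
part of bus_dma:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	TOY_PAGE_SIZE	4096u
#define	TOY_PAGE_MASK	(TOY_PAGE_SIZE - 1)
#define	TOY_MIN(a, b)	((a) < (b) ? (a) : (b))

/* Count the pieces produced when a buffer is walked one page at a time. */
static size_t
toy_count_segments(uint64_t curaddr, size_t buflen, size_t maxsegsz)
{
	size_t nsegs = 0;

	assert(maxsegsz > 0);
	while (buflen > 0) {
		size_t max_sgsize = TOY_MIN(buflen, maxsegsz);
		/* bytes left in the page that contains curaddr */
		size_t sgsize = TOY_PAGE_SIZE - (curaddr & TOY_PAGE_MASK);

		sgsize = TOY_MIN(sgsize, max_sgsize);
		curaddr += sgsize;
		buflen -= sgsize;
		nsegs++;
	}
	return (nsegs);
}
```

With a page-aligned 4-page buffer and a large maxsegsz the loop takes
four steps, one per page, regardless of how well the buffer as a whole is
aligned.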

-- 
John Baldwin


Re: PATCH: fix bogus error message bus_dmamem_alloc failed to align memory properly

2010-09-29 Thread John Baldwin
On Tuesday, September 28, 2010 4:02:08 pm Neel Natu wrote:
 Hi John,
 
 On Tue, Sep 28, 2010 at 6:36 AM, John Baldwin j...@freebsd.org wrote:
  On Monday, September 27, 2010 5:13:03 pm Neel Natu wrote:
  Hi John,
 
  Thanks for reviewing this.
 
  On Mon, Sep 27, 2010 at 8:04 AM, John Baldwin j...@freebsd.org wrote:
   On Friday, September 24, 2010 9:00:44 pm Neel Natu wrote:
   Hi,
  
   This patch fixes the bogus error message from bus_dmamem_alloc() about
   the buffer not being aligned properly.
  
   The problem is that the check is against a virtual address as opposed
   to the physical address. contigmalloc() makes guarantees about
   the alignment of physical addresses but not the virtual address
   mapping it.
  
   Any objections if I commit this patch?
  
   Hmmm, I guess you are doing super-page alignment rather than sub-page
   alignment?  In general I thought the busdma code only handled sub-page
   alignment and doesn't fully handle requests for super-page alignment.
  
 
  Yes, this is for allocations with sizes greater than PAGE_SIZE and
  alignment requirements also greater than a PAGE_SIZE.
 
   For example, since it insists on walking individual pages at a time, if 
   you
   had an alignment setting of 4 pages and passed in a single, aligned 
   4-page
   buffer, bus_dma would actually bounce the last 3 pages so that each 
   individual
   page is 4-page aligned.  At least, I think that is what would happen.
  
 
  I think you are referring to bus_dmamap_load() operation that would
  follow the bus_dmamem_alloc(), right? The memory allocated by
  bus_dmamem_alloc() does not need to be bounced. In fact, the dmamap
  pointer returned by bus_dmamem_alloc() is NULL.
 
  At least for the amd64 implementation there is code in
  _bus_dmamap_load_buffer() which will coalesce individual dma segments
  if they satisfy 'boundary' and 'segsize' constraints.
 
  So the problem is earlier in the routine where it does this:
 
         /*
          * Get the physical address for this segment.
          */
         if (pmap)
                 curaddr = pmap_extract(pmap, vaddr);
         else
                 curaddr = pmap_kextract(vaddr);
 
         /*
          * Compute the segment size, and adjust counts.
          */
         max_sgsize = MIN(buflen, dmat->maxsegsz);
         sgsize = PAGE_SIZE - ((vm_offset_t)curaddr & PAGE_MASK);
         if (map->pagesneeded != 0 && run_filter(dmat, curaddr)) {
                 sgsize = roundup2(sgsize, dmat->alignment);
                 sgsize = MIN(sgsize, max_sgsize);
                 curaddr = add_bounce_page(dmat, map, vaddr, sgsize);
         } else {
                 sgsize = MIN(sgsize, max_sgsize);
         }
 
  If you have a map that does need bouncing, then it will split up the pages.
  It happens to work for bus_dmamem_alloc() because that returns a NULL map
  which doesn't bounce.  But if you had a PCI device which supported only
  32-bit addresses on a 64-bit machine with an aligned, 4 page buffer above
  4GB and did a bus_dmamap_load() on that buffer, it would get split up into
  4 separate 4 page-aligned pages.
 
 
 You are right.
 
 I assume that you are ok with the patch and the discussion above was
 an FYI, right?

I think the patch is ok, but my point is that super-page alignment isn't
really part of the design of the current bus_dma and only works for
bus_dmamem_alloc() by accident.

-- 
John Baldwin


Re: Fix mfiutil compile with -DDEBUG

2010-10-08 Thread John Baldwin
On Sunday, October 03, 2010 10:33:17 pm Garrett Cooper wrote:
 make -DDEBUG is broken in mfiutil:
 
 $ make -DDEBUG
 cc -O2 -pipe -fno-strict-aliasing -pipe -O2 -march=nocona
 -fno-builtin-strftime -DDEBUG -Wall -Wcast-align -Woverflow
 -Wsign-compare -Wunused -std=gnu99 -fstack-protector -Wsystem-headers
 -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter
 -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith
 -Wno-uninitialized -Wno-pointer-sign -c
 /usr/src/usr.sbin/mfiutil/mfi_config.c
 /usr/src/usr.sbin/mfiutil/mfi_config.c: In function 'dump_config':
 /usr/src/usr.sbin/mfiutil/mfi_config.c:1027: error: 'union mfi_pd_ref'
 has no member named 'device_id'
 /usr/src/usr.sbin/mfiutil/mfi_config.c:1083: error: 'union mfi_pd_ref'
 has no member named 'device_id'
 *** Error code 1
 
 Stop in /usr/src/usr.sbin/mfiutil.
 $
 
 device_id is a field in the v field in the mfi_pd_ref union
 (/sys/dev/mfi/mfireg.h):
 
 union mfi_pd_ref {
         struct {
                 uint16_t        device_id;
                 uint16_t        seq_num;
         } v;
         uint32_t        ref;
 } __packed;

Yes, there were different versions of these definitions in mfireg.h at one 
point.  Your patch is fine.
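
For reference, a user-space sketch of the access path the compiler is
asking for.  The union is copied from the quoted definition, with the
kernel's __packed spelled as the GCC/Clang attribute (an assumption for
building outside the kernel); toy_pd_device_id() is a hypothetical
helper:

```c
#include <assert.h>
#include <stdint.h>

union mfi_pd_ref {
	struct {
		uint16_t	device_id;
		uint16_t	seq_num;
	} v;
	uint32_t	ref;
} __attribute__((packed));

/* device_id lives inside the 'v' member, so it must be reached through it. */
static uint16_t
toy_pd_device_id(const union mfi_pd_ref *r)
{
	assert(r != 0);
	return (r->v.device_id);	/* r->device_id does not compile */
}
```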

-- 
John Baldwin


Re: panic_cpu should be volatile

2010-10-08 Thread John Baldwin
On Thursday, October 07, 2010 1:40:49 pm Andriy Gapon wrote:
 
 panic_cpu variable in kern_shutdown.c should be volatile otherwise it's 
 cached in
 a register in the innermost while-loop in this code (observed on amd64 with 
 base
 gcc and -O2):
         if (panic_cpu != PCPU_GET(cpuid))
                 while (atomic_cmpset_int(&panic_cpu, NOCPU,
                     PCPU_GET(cpuid)) == 0)
                         while (panic_cpu != NOCPU)
                                 ; /* nothing */
 
 The patch is here:
 http://people.freebsd.org/~avg/panic_cpu.diff
 
 I also took a liberty to move the variable into the scope of panic() 
 functions as
 it doesn't seem to be useful outside of it.  But this is not necessary, of 
 course.

Looks fine to me.

-- 
John Baldwin


Re: Make mfiutil(8) more robust

2010-10-08 Thread John Baldwin
On Sunday, October 03, 2010 11:32:01 pm Garrett Cooper wrote:
 On Sun, Oct 3, 2010 at 8:30 PM, Garrett Cooper yaneg...@gmail.com wrote:
 As discussed offlist with some of the Yahoo! FreeBSD folks,
  mfiutil catches errors, but doesn't communicate it back up to the
  executing process. Examples follow...
 Before:

I think these are both fine.

-- 
John Baldwin


Re: generic_stop_cpus: prevent parallel execution

2010-10-11 Thread John Baldwin
On Thursday, October 07, 2010 1:53:46 pm Andriy Gapon wrote:
 
 Here is a patch that applies the technique from panic() to
 generic_stop_cpus() to prevent its parallel execution on multiple CPUs:
 http://people.freebsd.org/~avg/generic_stop_cpus.diff
 
 In theory, parallel execution could lead to two CPUs stopping each other and
 everyone else, and thus a total system halt.
 
 Also, in theory, we should have some smarter locking here, because two (or
 more) CPUs could be stopping unrelated sets of CPUs.  But in practice, it
 seems, this function is only used to stop all other CPUs.  Unless I
 overlooked other usages, that is.
 
 Additionally, I took this opportunity to make the amd64-specific
 suspend_cpus() function use generic_stop_cpus() instead of rolling out
 essentially duplicate code.  I couldn't see any reason not to consolidate,
 but perhaps I missed something.
 
 Big thanks to Matthew and his employer for the idea and example.

One note.  Use 'cpu_spinwait()' in the inner loop waiting for 'stopping_cpu'
to change.
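
A rough userland sketch of what a spin-wait hint in the inner loop looks like.
cpu_spinwait() itself is a kernel macro; the sketch assumes that on x86 it
amounts to a PAUSE instruction and uses the compiler builtin
__builtin_ia32_pause() as a stand-in.  The spin_hint() macro, the
wait_for_stop_slot() helper, the -1 "nobody" value and the max_spins bound are
all invented here so the example terminates; the kernel loop has no such
bound.

```c
#if defined(__i386__) || defined(__x86_64__)
#define spin_hint()     __builtin_ia32_pause()
#else
#define spin_hint()     ((void)0)       /* no hint on other targets here */
#endif

/* -1 means no CPU is currently stopping the others (value assumed). */
static volatile int stopping_cpu = -1;

/*
 * Hypothetical helper: spin until 'stopping_cpu' goes back to -1, issuing
 * the spin hint on every pass.  Returns 0 once the slot is free, or -1
 * after 'max_spins' passes.
 */
static int
wait_for_stop_slot(unsigned long max_spins)
{
        unsigned long i;

        for (i = 0; stopping_cpu != -1; i++) {
                if (i >= max_spins)
                        return (-1);
                spin_hint();
        }
        return (0);
}
```

The hint does not change the result of the loop; it only tells the processor
that the thread is busy-waiting.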

-- 
John Baldwin


Re: anyone got advice on sendmail and TLS on 8.1?

2010-10-11 Thread John Baldwin
On Sunday, October 10, 2010 5:22:01 pm Julian Elischer wrote:
   When I last did sendmail there wasn't any TLS/SSL stuff.
 
 has anyone got an exact howto as to how to enable a simple sendmail server?
 
 all I want is:
 
 TLS and authenticated email submission by me and my family
 able to forward the email anywhere (maybe just to my ISP but who knows)
 (outgoing)
 non TLS submission from outside to reject all mail not to elischer.{org,com}
 and deliver our mail to mailboxes or gmail (or where-ever /etc/aliases says.).
 
 This is probably ALMOST a default configuration
 but I can't be sure what is needed.. there are several
 howtos on the net but they are generally old and differ in details.

Your best bet is probably to look at the docs on sendmail.org.  You need to 
recompile the sendmail in base against SASL and need to install cyrus-sasl2 
from ports to manage your authentication database.

-- 
John Baldwin


Re: [PATCH] Fix /bin/sh compilation with CFLAGS += -DDEBUG 1

2010-10-12 Thread John Baldwin
On Tuesday, October 12, 2010 6:47:49 am Garrett Cooper wrote:
 Hi,
 It looks like the format strings are broken on 64-bit archs in
 /bin/sh's TRACE functionality (can be enabled by uncommenting -DDEBUG
  1 in bin/sh/Makefile). The attached patch fixes this functionality
 again so one can trace sh's calls with TRACE, which may or may not be
 helpful to those debugging /bin/sh.
 Tested build and execution on amd64; tested build on i386.
 Thanks!
 -Garrett

I don't think the Makefile bits are needed, you can just use
'make DEBUG_FLAGS="-g -DDEBUG=2"' instead.

Also, if you plan on using -g you should generally set DEBUG_FLAGS anyway so 
binaries are not stripped.

The use of things like PRIoMAX is not done in FreeBSD as it is ugly.  You can
use things like '%t' to print ptrdiff_t types instead.  So for example, for
the first hunk, I would change the type of 'startloc' to ptrdiff_t and use 
this:

        TRACE(("evalbackq: size=%td: \"%.*s\"\n",
            (dest - stackblock()) - startloc,
            (int)((dest - stackblock()) - startloc),
            stackblock() + startloc));

Also, in your change here, you used %j to print a size_t.  That will break on 
i386.  You should use %z to print size_t's, but even better is to just use %t 
to print a ptrdiff_t (which is the type that holds the difference of two 
pointers).

The various changes in jobs.c should use '%td' as well rather than (int) 
casts.
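
A small standalone sketch of the two length modifiers mentioned above.  The
format_span() helper and its output text are made up for this example; only
the %td (ptrdiff_t) and %zu (size_t) conversions come from standard C.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the distance between two pointers and the
 * buffer capacity into 'buf'.  The difference of two pointers already has
 * type ptrdiff_t, so it matches %td without a cast on both 32-bit and
 * 64-bit targets; the size_t argument matches %zu.
 */
static int
format_span(char *buf, size_t len, const char *start, const char *end)
{

        return (snprintf(buf, len, "span=%td cap=%zu", end - start, len));
}
```

Because neither argument is cast, the same source builds without format
warnings on i386 and amd64.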

-- 
John Baldwin


Re: [PATCH] Fix /bin/sh compilation with CFLAGS += -DDEBUG 1

2010-10-12 Thread John Baldwin
On Tuesday, October 12, 2010 2:31:36 pm Garrett Cooper wrote:
 On Tue, Oct 12, 2010 at 5:30 AM, John Baldwin j...@freebsd.org wrote:
  You should use %z to print size_t's, but even better is to just use %t
  to print a ptrdiff_t (which is the type that holds the difference of two
  pointers).
 
 Ok. The overall temperature of using PRI* from POSIX seems like
 it's undesirable; is it just POSIX cruft that FreeBSD conforms to in
 theory only and doesn't really use in practice, or is there an example
 of real practical application where it's used in the sourcebase?

PRI* are ugly.  FreeBSD provides them so that we are compliant and so that
portable code can use them, but we do not use them in our source tree because
they are unreadable.

  The various changes in jobs.c should use '%td' as well rather than (int)
  casts.
 
 Ok. Tested build and runtime on amd64 and tested build-only with i386.

Hmm, jobs.c shouldn't need any of the (ptrdiff_t) casts as the expression
being printed is already a ptrdiff_t.  See this non-debug code in jobs.c
for example:

int
bgcmd(int argc, char **argv)
{
        char s[64];
        struct job *jp;

        ...
        do {
                ...
                fmtstr(s, 64, "[%td] ", jp - jobtab + 1);


-- 
John Baldwin


Re: [PATCH] Bug with powerof2 macro in sys/param.h

2010-10-14 Thread John Baldwin
On Thursday, October 14, 2010 7:58:32 am Andriy Gapon wrote:
 on 14/10/2010 00:30 Garrett Cooper said the following:
  I was talking to someone today about this macro, and he noted that
  the algorithm is incorrect -- it fails the base case with ((x) == 0 --
  which makes sense because 2^(x) cannot equal 0 (mathematically
  impossible, unless you consider the limit as x goes to negative
  infinity as log (0) / log(2) is undefined). I tested out his claim and
  he was right:
 
 That's kind of obvious given the code.
 I think that this might be an intentional optimization.
 I guess that it doesn't really make sense to apply powerof2 to zero and the
 users of the macro should do the check on their own if they expect zero as
 input (many places in the code do not allow that).

I agree, the current macro is this way on purpose (and straight out of
Hacker's Delight).
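
To make the zero case concrete, a short sketch.  The macro below is written to
match the powerof2() definition in sys/param.h; the is_nonzero_powerof2()
wrapper is hypothetical and only shows what a caller that may see zero would
do for itself.

```c
/* Same bit trick as powerof2() in sys/param.h. */
#define powerof2(x)     ((((x) - 1) & (x)) == 0)

/*
 * Hypothetical wrapper for callers that can be handed zero: the macro
 * reports zero as a power of two, so the caller checks for it first.
 */
static int
is_nonzero_powerof2(unsigned int x)
{

        return (x != 0 && powerof2(x));
}
```

With this, powerof2(0) is true while is_nonzero_powerof2(0) is false; for any
non-zero input the two agree.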

Of the existing calls you weren't sure of:

sys/dev/cxgb/cxgb_sge.c:   while (!powerof2(fl_q_size))
sys/dev/cxgb/cxgb_sge.c:   while (!powerof2(jumbo_q_size))

These are fine, will not be zero.

sys/x86/x86/local_apic.c:  KASSERT(powerof2(count), ("bad count"));
sys/x86/x86/local_apic.c:  KASSERT(powerof2(align), ("bad align"));

These are fine.  No code allocates zero IDT vectors.  We never allocate IDT
vectors for unallocated MSI or MSI-X IRQs.

sys/net/flowtable.c:   ft->ft_lock_count =
2*(powerof2(mp_maxid + 1) ? (mp_maxid + 1):

Clearly, 'mp_maxid + 1' will not be zero (barring a bizarre overflow case
which will not happen until we support 2^32 CPUs), so this is fine.

sys/i386/pci/pci_pir.c:if (error && !powerof2(pci_link->pl_irqmask)) {

This is fine.  Earlier in the function if pl_irqmask is zero, then all of the
pci_pir_choose_irq() calls will fail, so this is only invoked if pl_irqmask
is non-zero.  In practice pl_irqmask is never zero anyway.

I suspect the GEOM ones are also generally safe.

-- 
John Baldwin


Re: [PATCH] Bug with powerof2 macro in sys/param.h

2010-10-15 Thread John Baldwin
On Thursday, October 14, 2010 11:49:23 pm Garrett Cooper wrote:
 On Thu, Oct 14, 2010 at 6:37 AM, John Baldwin j...@freebsd.org wrote:
  On Thursday, October 14, 2010 7:58:32 am Andriy Gapon wrote:
  on 14/10/2010 00:30 Garrett Cooper said the following:
   I was talking to someone today about this macro, and he noted that
   the algorithm is incorrect -- it fails the base case with ((x) == 0 --
   which makes sense because 2^(x) cannot equal 0 (mathematically
   impossible, unless you consider the limit as x goes to negative
   infinity as log (0) / log(2) is undefined). I tested out his claim and
   he was right:
 
  That's kind of obvious given the code.
  I think that this might be an intentional optimization.
  I guess that it doesn't really make sense to apply powerof2 to zero and the
  users of the macro should do the check on their own if they expect zero as
  input (many places in the code do not allow that).
 
 But the point is that this could be micro-optimizing things
 incorrectly. I'm running simple iteration tests to see what the
 performance is like, but the runtime is going to take a while to
 produce stable results.
 
 Mathematically there is a conflict with the definition of the macro,
 so it might confuse folks who pay attention to the math as opposed to
 the details (if you want I'll gladly add a comment around the macro in
 a patch to note the caveats of using powerof2).

We aren't dealing with mathematicians, but programmers.
 
  I agree, the current macro is this way on purpose (and straight out of
  Hacker's Delight).

And this book trumps you on that case.  Using the powerof2() macro as it
currently stands is a widely-used practice among folks who write
systems-level code.  If you were writing a powerof2() function for a higher
level language where performance doesn't matter and bit twiddling isn't
common, then a super-safe variant of powerof2() might be appropriate.
 
However, this is C, and C programmers are expected to know how this stuff
works.

  sys/net/flowtable.c:   ft->ft_lock_count =
  2*(powerof2(mp_maxid + 1) ? (mp_maxid + 1):
 
  Clearly, 'mp_maxid + 1' will not be zero (barring a bizarre overflow case
  which will not happen until we support 2^32 CPUs), so this is fine.
 
 But that should be caught by the mp_machdep code, correct?

Yes, hence bizarre.   It is also way unrealistic and not worth excessive
pessimizations scattered throughout the tree.

 What about the other patches? The mfiutil and mptutil ones at least
 get the two aforementioned tools in sync with sys/param.h,
 so I see some degree of value in the patches (even if they're just
 cleanup).

No, powerof2() should not change.  It would most likely be a POLA violation
to change how it works given 1) its historical behavior, and 2) its
underlying idiom's common (and well-understood) use in the software
world.

-- 
John Baldwin


Re: SCSI_DELAY cleanup

2010-10-19 Thread John Baldwin
On Tuesday, October 19, 2010 10:31:10 am Alexander Best wrote:
 On Tue Oct 19 10, Matthew Jacob wrote:
  It would be an effective behavioral change for those of us who remove
  that line.
  Personally, I think 5 seconds is too long- even 2 seconds is more than
  adequate even for moderately old 'other' hardware like scanners.
  
  For -current, why don't you simply remove all of the config lines and
  leave the default at 2000ms?
 
 hmmm...i can only test the delay value on amd64. i was under the impression
 that archs like arm and mips need the longer delay.
 
 also at some locations in the code SCSI_DELAY is being set to 15000. i believe
 this is the case when certain drivers (cam, ahb, aha) get loaded as a kernel
 module, but i'm not sure. it looks like this:
 
 .if !defined(KERNBUILDDIR)
 opt_scsi.h:
         echo "#define SCSI_DELAY 15000" > ${.TARGET}
 .endif

I believe this is all old history.  SCSI_DELAY used to be set to 15000 in
GENERIC many years ago and was lowered to 5000.  Most likely these Makefiles
were simply not updated at the time.

-- 
John Baldwin


Re: SCSI_DELAY cleanup

2010-10-19 Thread John Baldwin
On Tuesday, October 19, 2010 3:14:46 pm Alexander Best wrote:
 On Tue Oct 19 10, John Baldwin wrote:
  On Tuesday, October 19, 2010 10:31:10 am Alexander Best wrote:
   On Tue Oct 19 10, Matthew Jacob wrote:
    It would be an effective behavioral change for those of us who remove
    that line.
    Personally, I think 5 seconds is too long- even 2 seconds is more than
    adequate even for moderately old 'other' hardware like scanners.
    
    For -current, why don't you simply remove all of the config lines and
    leave the default at 2000ms?
   
   hmmm...i can only test the delay value on amd64. i was under the impression
   that archs like arm and mips need the longer delay.
   
   also at some locations in the code SCSI_DELAY is being set to 15000. i
   believe this is the case when certain drivers (cam, ahb, aha) get loaded as
   a kernel module, but i'm not sure. it looks like this:
   
   .if !defined(KERNBUILDDIR)
   opt_scsi.h:
           echo "#define SCSI_DELAY 15000" > ${.TARGET}
   .endif
  
  I believe this is all old history.  SCSI_DELAY used to be set to 15000 in
  GENERIC many years ago and was lowered to 5000.  Most likely these Makefiles
  were simply not updated at the time.
 
 oh i see. maybe this revised patch would be better suited then.

I think so, but you should post this to scsi@ for the best review.

-- 
John Baldwin


Re: fix pnpinfo on arch=amd64

2010-10-25 Thread John Baldwin
On Saturday, October 23, 2010 8:22:48 pm Alexander Best wrote:
 this tiny patch will fix pnpinfo so it doesn't core dump (bus error) any
 longer on arch=amd64.

This utility isn't really useful on amd64 though.  No amd64 machines have ISA 
slots in which to place an ISA PnP adapter.

-- 
John Baldwin


Re: fix pnpinfo on arch=amd64

2010-10-25 Thread John Baldwin
On Monday, October 25, 2010 9:34:37 am Erik Trulsson wrote:
 On Mon, Oct 25, 2010 at 08:45:47AM -0400, John Baldwin wrote:
  On Saturday, October 23, 2010 8:22:48 pm Alexander Best wrote:
   this tiny patch will fix pnpinfo so it doesn't core dump (bus error) any
   longer on arch=amd64.
  
  This utility isn't really useful on amd64 though.  No amd64 machines have
  ISA slots in which to place an ISA PnP adapter.
 
 Are you really sure about that?
 
 See  http://www.ibase.com.tw/2009/mb945.htmL  or
 http://www.adek.com/ATX-motherboards.html  for what certainly looks like
 counter-examples.

Hmm, well, I suspect in this case these boards exist to support really
ancient custom hardware.  If you are stuck with one of these, then manually
needing to fix up pnpinfo.c is probably the least of your problems.  However,
I strongly doubt that FreeBSD users are lining up to buy these motherboards
so they can use an ISA SB16 adapter with FreeBSD/amd64.

I was not aware of these boards previously, but I still doubt that pnpinfo is
relevant to any FreeBSD/amd64 users.

-- 
John Baldwin


Re: SYSCALL_MODULE() macro and modfind() issues

2010-10-26 Thread John Baldwin
On Tuesday, October 26, 2010 3:28:10 am Andriy Gapon wrote:
 on 26/10/2010 01:01 Selphie Keller said the following:
  hi fbsd-hackers,
  
  Noticed an issue in 8.1-release, 8.1p1-release and 8.1-stable
  amd64/i386, to where modfind() will no longer find pmap_helper for the
  /usr/ports/sysutils/pmap port, or other syscall modules using
  SYSCALL_MODULE() macro.
  The issue is that modfind() function no longer finds any modules using
  SYSCALL_MODULE() macro to register the kernel module. Making it
  difficult for userland apps to call the syscall provided. modfind()
  always returns -1 which prevents modstat() from getting the required
  information to perform the syscall.
  
  Also tested, the demo syscall module:
 
 After commit r205320 and, apparently, its MFC you need to prefix the module
 name with "sys/".  For example:
 modstat(modfind("sys/syscall"), &stat);
 
 P.S.
 Perhaps a KPI breakage in a stable branch?

Ugh, it was a breakage though it's too late to back it out at this point.

-- 
John Baldwin


Re: SYSCALL_MODULE() macro and modfind() issues

2010-10-26 Thread John Baldwin
On Tuesday, October 26, 2010 4:00:14 am Selphie Keller wrote:
 Thanks Andriy,
 
 Took a look at the change to src/sys/sys/sysent.h
 
 @@ -149,7 +149,7 @@ static struct syscall_module_data name##
  }; \
  \
  static moduledata_t name##_mod = { \
 -       #name, \
 +       "sys/" #name, \
         syscall_module_handler, \
         &name##_syscall_mod \
  }; \
 
 applied the MFC prefix to pmap port:
 
 --- /usr/ports/sysutils/pmap/work/pmap/pmap/pmap.c.orig 2010-10-26 00:55:32.0 -0700
 +++ /usr/ports/sysutils/pmap/work/pmap/pmap/pmap.c  2010-10-26 00:56:10.0 -0700
 @@ -86,12 +86,12 @@ main(int argc, char **argv)
          struct kinfo_proc *kp;
          int pmap_helper_syscall;
 
 -        if ((modid = modfind("pmap_helper")) == -1) {
 +        if ((modid = modfind("sys/pmap_helper")) == -1) {
                  /* module not found, try to load */
                  modid = kldload("pmap_helper.ko");
                  if (modid == -1)
                          err(1, "unable to load pmap_helper module");
 -                modid = modfind("pmap_helper");
 +                modid = modfind("sys/pmap_helper");
                  if (modid == -1)
                          err(1, "pmap_helper module loaded but not found");
          }
 
 which restored functionality on freebsd 8.1.

The best approach might be to have something like this:

static int
pmap_find(void)
{
        int modid;

        /* Try the old (unprefixed) name first, then the "sys/" name. */
        modid = modfind("pmap_helper");
        if (modid == -1)
                modid = modfind("sys/pmap_helper");
        return (modid);
}

then in the original main() routine use this:

        if ((modid = pmap_find()) == -1) {
                /* module not found, try to load */
                modid = kldload("pmap_helper.ko");
                if (modid == -1)
                        err(1, "unable to load pmap_helper module");
                modid = pmap_find();
                if (modid == -1)
                        err(1, "pmap_helper module loaded but not found");
        }

This would make the code work for both old and new versions.

 -Estella Mystagic (Selphie)
 

-- 
John Baldwin


Re: stock gdb bug: DWARF2 with DWARF_OFFSET_SIZE == 8

2010-10-26 Thread John Baldwin
On Monday, October 25, 2010 7:39:17 pm Oleksandr Tymoshenko wrote:
  gdb on MIPS64 does not read DWARF2 line information correctly if
 gcc was configured with  DWARF_OFFSET_SIZE == 8.
 
 .debug_line starts with a total length field which could be 12 bytes
 long or 4 bytes long. If it starts with 0x - it's 12 bytes
 long. Depending on its size one of the following fields is either 8
 bytes or 4 bytes. This one-line patch fixes this issue for MIPS64
 but I'm not 100% sure that it doesn't break something else. So
 I'd appreciate input from someone with a better grip on ELF/DWARF
 stuff than me.
 
 Patch:
 http://people.freebsd.org/~gonzo/patches/mips64gdb.diff

I looked at GDB 6.6's source and it does pass in &cu->header instead of NULL 
at the same place, so I think your fix is correct.

-- 
John Baldwin


Re: [PATCH] mfiutil(8) - capture errors and percolate up to caller

2010-10-26 Thread John Baldwin
On Tuesday, October 26, 2010 2:09:53 pm Garrett Cooper wrote:
 Because a number of places in the mfiutil(8) code immediately call
 warn(3) after an error from an API occurred, and because warn(3) employs
 printf, et al. (multiple times) in libc, there's an off-chance that
 the errno value can get stomped on by the warn(3) calls, which could
 lead to confusing results for anyone depending on the value being
 returned from the mfiutil APIs. Thus, the attached patch I'm providing
 fixes those cases, as well as converts an existing internal API
 (display_pending_firmware) to a non-void return mechanism. I also
 made a few stack variable alignment changes to match style(9) as well
 as got rid of the ad hoc powerof2 call in favor of the value in
 sys/param.h.
 I've run a small number of unit tests on my desktop at home with
 my mfi(4) card, but will test out other failing cases with equipment I
 have access to at work.

Just a few nits:

1) The include of sys/param.h should replace sys/types.h (there's a note 
about these two headers in style(9), FYI).

2) patrol_get_props() should return 'error' on failure rather than 'errno'.

3) mfi_get_time() failing isn't fatal.  The code already handles this case by 
not printing out a 'next run time' if at is zero.  I think you can remove the 
check for at == 0.  If all the other commands work and just that command fails 
I don't think it should be fatal.
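
For reference, a minimal sketch of the save-errno-before-warn(3) pattern the
patch is about, and of returning the saved 'error' rather than 'errno' as in
nit 2.  The probe_path() helper is hypothetical and is not taken from
mfiutil; it only assumes the BSD err.h warn(3) interface.

```c
#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open 'path' read-only.  Returns 0 on
 * success or an errno value on failure.
 */
static int
probe_path(const char *path)
{
        int error, fd;

        fd = open(path, O_RDONLY);
        if (fd < 0) {
                error = errno;  /* save before warn(3) can clobber it */
                warn("open(%s)", path);
                return (error); /* the saved copy, not errno */
        }
        close(fd);
        return (0);
}
```

The caller then sees the error from open(2) even if something inside warn(3)
happens to modify errno.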

-- 
John Baldwin


Re: [PATCH] Fix 'implicit declaration' warning and update vgone(9)

2010-10-27 Thread John Baldwin
On Wednesday, October 27, 2010 7:33:13 am Sergey Kandaurov wrote:
 On 27 October 2010 10:23, Lars Hartmann l...@chaotika.org wrote:
  The vgonel function isn't declared in any header, and the vgonel prototype
  in vgone(9) isn't correct - found by Ben Kaduk ka...@mit.edu
 
 Hi.
 
 I'm afraid it's just an overlooked man page after many VFS changes in 5.x.
 As vgonel() is a static (i.e. private and not visible from outside) function
 IMO it should be removed from vgone(9) man page.

Agreed.  It certainly should not be added to vnode.h.  I'm curious how the 
reporter is getting a warning since there is a static prototype for vgonel() 
in vfs_subr.c.

-- 
John Baldwin


Re: [PATCH] hwpmc(4) syscall arguments fix

2010-11-01 Thread John Baldwin
On Friday, October 29, 2010 8:12:06 pm Oleksandr Tymoshenko wrote:
  I ran into problems trying to get hwpmc to work on 64-bit MIPS
 system with big endian byte order. Turned out hwpmc syscall handler
 is byte-order and register_t size agnostic unlike the rest of syscalls.
 The best solution I have so far is a copy sys/sysproto.h approach:
 http://people.freebsd.org/~gonzo/patches/hwpmc-syscall.diff
 
 Any other ideas how to get it fixed in more clean way?

Yes, a better way would be to add pmc_syscall() to sys/kern/syscalls.master as 
a NOSTD system call.  Then its arguments would be included in sysproto.h 
directly.

-- 
John Baldwin


Re: Fileops in file.h

2010-11-08 Thread John Baldwin
On Sunday, November 07, 2010 10:08:08 am Fernando Apesteguía wrote:
 Hi,
 
 I'm trying to understand  some pieces of the FreeBSD kernel.
 Having a look at struct fileops in file.h I was wondering why other
 file related functions don't have an entry in the function vector. I
 was thinking of mmap, fsync or sendfile.
 
 Can anyone tell me the reason?

Mostly that it hasn't been done yet.  If there was a clean way to do an 
f_mmap() and get some of the type-specific knowledge out of vm_mmap.c I'd 
really like it.

-- 
John Baldwin


Re: [PATCH] mptutil(8) - capture errors and percolate up to caller

2010-11-08 Thread John Baldwin
On Saturday, November 06, 2010 4:13:23 am Garrett Cooper wrote:
 Similar to r214396, this patch deals with properly capturing error
 and passing it up to the caller in mptutil just in case the errno
 value gets stomped on by warn*(3); this patch deals with an improper
 use of warn(3), and also some malloc(3) errors, as well as shrink down
 some static buffers to fit the data being output.
 If someone could review and help me commit this patch it would be
 much appreciated; all I could do is run negative tests on my local box
 and minor positive tests on my vmware fusion instance because it
 doesn't fully emulate a fully working mpt(4) device (the vmware
 instance consistently crashed with a warning about the mpt
 controller's unimplemented features after I poked at it enough).
 I'll submit another patch to fix up style(9) in this app if requested.
 Thanks!

The explicit 'return (ENOMEM)' calls are fine as-is.  I do not think they need 
changing.

Having static char arrays of '15' rather than '16' is probably pointless.  The 
stack is already at least 4-byte aligned on all the architectures we support, 
so a 15-byte char array will actually occupy 16 bytes.  It was chosen to be a
good enough value, not an exact fit.  An exact fit is not important here.

Moving the 'buf' in mpt_raid_level() is a style bug.  It should stay where it 
is.  Same with 'buf' in mpt_volstate() and mpt_pdstate().

IOC_STATUS_SUCCESS() returns a boolean, it is appropriate to test it with ! 
rather than == 0.  It is also easier for a person to read the code that way.

-- 
John Baldwin


Re: libkvm: consumers of kvm_getprocs for non-live kernels?

2010-11-11 Thread John Baldwin
On Wednesday, November 10, 2010 3:41:52 pm Ulrich Spörlein wrote:
 Hi,
 
 I have this cleanup of libkvm sitting in my tree and it needs a little
 bit of testing, especially the function kvm_proclist, which is only
 called from kvm_deadprocs which is only called from kvm_getprocs when kd
 is not ALIVE.
 
 The only consumer in our tree that I can make out is *probably* kgdb, as
 ps(1), top(1), w(1), pkill(1), fstat(1), systat(1), pmcstat(8) and
 bsnmpd don't really work on coredumps

ps and fstat certainly work fine on crashdumps.  w did before devfs (it 
doesn't have a good way to map the device entries from the crashed kernel to 
the entries in wtmp IIRC).  kvm_getprocs() is certainly actively used by 
various programs on crashdumps and works.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread John Baldwin
On Friday, November 12, 2010 1:40:00 pm Paul LeoNerd Evans wrote:
 I'm trying to build a high-level language wrapper around kqueue/kevent,
 specifically, a Perl wrapper.
 
 (In fact I am trying to fix this bug:
   http://rt.cpan.org/Public/Bug/Display.html?id=61481
 )
 
 My plan is to use the  void *udata  field of a kevent watcher to store a
 pointer to some user-provided Perl data structure (an SV*), to associate
 with the event. Typically this could be a code reference for an event
 callback or similar, but the exact nature doesn't matter. It's a pointer
 to a reference-counted data structure. SvREFCNT_dec(sv) is the function
 used to decrement the reference counter.
 
 To account for the fact that the kernel stores a pointer here, I'm
 artificially increasing the reference count on the object, so that it
 still remains alive even if the rest of the Perl code drops it, to rely
 on getting it back out of the kernel in an individual kevent. At some
 point when the kernel has finished looking after the event, this count
 needs to be decreased again, so the structure can be freed.
 
 I am having trouble trying to work out how to do this, or rather, when.
 I have the following problems:
 
  * If the event was registered using EV_ONESHOT, when it gets fired the
flags that come back in the event structure do not include EV_ONESHOT.
 
  * Some events can only happen once, such as watching for EVFILT_PROC
NOTE_EXIT events.
 
  * The kernel can silently drop watches, such as when the process calls
close() on a filehandle with an EVFILT_READ or EVFILT_WRITE watch.
 
  * There doesn't seem to be a way to query that pointer back out of the
kernel, in case the user code wants to EV_DELETE the watch.
 
 These problems all mean that I never quite know when I ought to call
 SvREFCNT_dec() on that pointer.
 
 My current best-attack plan looks like the following:
 
  a) Store a structure in the  void *udata  that contains the actual SV*
 pointer and a flag to remember if the event had been installed as
 EV_ONESHOT (or remember if it was one of the event types that is
 oneshot anyway)
 
  b) Store an entire mapping in userland from filter+identity to pointer,
 so that if userland wants to EV_DELETE the watch early, it has the
 pointer to be able to drop it.
 
 I can't think of a solution to the close() problem at all, though.
 
 Part a of my solution seems OK (though I'd wonder why the flags back
 from the kernel don't contain EV_ONESHOT), but part b confuses me. I had
 thought the point of kqueue/kevent is the O(1) nature of it, which is
 among why the kernel is storing that  void *udata  pointer in the first
 place. If I have to store a mapping from every filter+identity back to
 my data pointer, why does the kernel store one at all? I could just
 ignore the udata field and use my mapping for my own purposes.
 
 Have I missed something here, then? I was hoping there'd be a nice way
 for kernel to give me back those pointers so I can just decrement a
 refcount on it, and have it reclaimed. 

I think the assumption is that userland actually maintains a reference on the 
specified object (e.g. a file descriptor) and will know to drop the associated 
data when the file descriptor is closed.  That is, think of the kevent as a 
member of an eventable object rather than a separate object that has a 
reference to the eventable object.  When the eventable object's reference 
count drops to zero in userland, then the kevent should be deleted, either via 
EV_DELETE, or implicitly (e.g. by closing the associated file descriptor).

I think in your case you should not give the kevent a reference to your 
object, but instead remove the associated event for a given object when an 
object's refcount drops to zero.

-- 
John Baldwin


Re: Phantom sysctl

2010-11-15 Thread John Baldwin
On Monday, November 15, 2010 12:53:57 pm Garrett Cooper wrote:
 According to SYSCTL_INT(9):
 
  The SYSCTL kernel interfaces allow code to statically declare sysctl(8)
  MIB entries, which will be initialized when the kernel module containing
  the declaration is initialized.  When the module is unloaded, the sysctl
  will be automatically destroyed.
 
 The sysctl should be reaped when the module is unloaded. My dumb
 test kernel module [1] doesn't seem to do that though (please note
 that the OID test_int_sysctl is created, and not reaped... FWIW it's
 kind of bizarre that test_int_sysctl is created in the first place,
 given what I've seen when SYSCTL_* gets executed):

I believe I have seen this work properly before.  Look for 'sysctl' in
sys/kern/kern_linker.c to see the sysctl hooks invoked on kldload and
kldunload to manage these sysctls.  You will probably want to start your
debugging in the unload hook as it sounds like the node is not being
fully deregistered.

-- 
John Baldwin


Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread John Baldwin
On Monday, November 15, 2010 1:12:11 pm Paul LeoNerd Evans wrote:
 On Mon, Nov 15, 2010 at 11:25:42AM -0500, John Baldwin wrote:
  I think the assumption is that userland actually maintains a reference on
  the specified object (e.g. a file descriptor) and will know to drop the
  associated data when the file descriptor is closed.  That is, think of the
  kevent as a member of an eventable object rather than a separate object
  that has a reference to the eventable object.  When the eventable object's
  reference count drops to zero in userland, then the kevent should be
  deleted, either via EV_DELETE, or implicitly (e.g. by closing the
  associated file descriptor).
 
 Ah. Well, that could be considered a bit more awkward for the use case I
 wanted to apply. The idea was that the  udata  would refer effectively
 to a closure, to invoke when the event happens. The idea being you can
 just add an event watcher by, say:
 
   $ev->EV_SET( $pid, EVFILT_PROC, 0, NOTE_EXIT, 0, sub {
      print STDERR "The child process $pid has now exited\n";
   } );
 
 So, the kernel's udata pointer effectively holds the only reference to
 this anonymous closure. It's much more flexible this way, especially for
 oneshot events like that.
 
 The beauty is also that the kevents() loop can simply know that the
 udata is always a code reference so just has to invoke it to do whatever
 the original caller wanted to do.
 
 Keep in mind my use-case here; I'm not trying to be one specific
 application, it's a general-purpose kevent-wrapping library.

So is GCD (Apple's libdispatch).  It also implements closures on top of
kevent.  However, the way it works is that it doesn't expose kevent()
directly, instead it uses kevent to implement asynchronous I/O on a 
socket for example, and since it is logically managing the life cycle
of a socket, it knows when the socket is closed and cleans up then.

  I think in your case you should not give the kevent a reference to your 
  object, but instead remove the associated event for a given object when an 
  object's refcount drops to zero.
 
 Well that's certainly doable in longrunning watches, but I don't think
 it sounds very convenient for a oneshot event; see the above example for
 justification.

For the above case, if you know an event is one shot, you should either
use EV_ONESHOT, or use a wrapper around the closure that clears the event
after the closure runs (or possibly before the closure runs?)
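
A minimal sketch of the wrapper idea, assuming the library stores a heap-allocated closure in udata: `struct closure`, `closure_new()`, `closure_run_once()` and `count_hit()` are hypothetical names used only for illustration. The event loop would pass the udata of a returned EV_ONESHOT event to `closure_run_once()`, which runs the callback and then reclaims the closure, so no separate filter+identity map is needed for the one-shot case.

```c
#include <assert.h>
#include <stdlib.h>

/* A callback plus its argument, stored in the kevent's udata field. */
struct closure {
	void	(*fn)(void *arg);
	void	*arg;
};

static struct closure *
closure_new(void (*fn)(void *), void *arg)
{
	struct closure *c = malloc(sizeof(*c));

	if (c == NULL)
		return (NULL);
	c->fn = fn;
	c->arg = arg;
	return (c);
}

/*
 * Call with the udata of an event registered with EV_ONESHOT.  The kernel
 * has already dropped the event, so the closure can be freed after it runs.
 */
static void
closure_run_once(void *udata)
{
	struct closure *c = udata;

	assert(c != NULL && c->fn != NULL);
	c->fn(c->arg);
	free(c);
}

/* Example callback: count how many times it fired. */
static void
count_hit(void *arg)
{
	(*(int *)arg)++;
}
```

This only covers events that actually fire; a watch that is cancelled early, or whose descriptor is closed, still has to be released by whoever cancels or closes it.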

 Also it again begs my question, worth repeating here:
 
 On Friday, November 12, 2010 1:40:00 pm Paul LeoNerd Evans wrote:
  I had
  thought the point of kqueue/kevent is the O(1) nature of it, which is
  among why the kernel is storing that  void *udata  pointer in the first
  place. If I have to store a mapping from every filter+identity back to
  my data pointer, why does the kernel store one at all? I could just
  ignore the udata field and use my mapping for my own purposes.
 
 If you're saying that in my not-so-rare use case, I don't want to be
 using udata, and instead keeping my own mapping, why does the kernel
 provide this udata field at all?

Your use case is rare.  Almost all consumers of kevent() that I've seen
use kevent() as one part of a system that maintains the lifecycle of objects.
Those objects are only accessed within the system, so the system knows when
an object is closed and can release the resources at the same time.

-- 
John Baldwin


Re: breaking the crunchgen logic into a share/mk file

2010-11-16 Thread John Baldwin
On Tuesday, November 16, 2010 8:01:43 am Andrey V. Elsukov wrote:
 On 08.11.2010 15:31, Adrian Chadd wrote:
  I've broken out the crunchgen logic from src/rescue/rescue into a
  share/mk file - that way it can be reused in other areas.
 
  The diff is here: http://people.freebsd.org/~adrian/crunchgen-mk.diff
 
  This bsd.crunchgen.mk file is generic enough to use in my
  busybox-style thing as well as for src/rescue/rescue/.
 
  Comments, feedback, etc welcome!
 
 It seems this broke usage of livefs from sysinstall.
 sysinstall does check for /rescue/ldconfig and can not find it there.
 I think attached patch can fix this issue (not tested).

Err, are there no longer hard links to all of the frontends for a given 
crunch?  If so, that is a problem as it will make rescue much harder to use.

-- 
John Baldwin


Re: Software interrupts; where to start

2010-11-16 Thread John Baldwin
On Tuesday, November 16, 2010 12:08:51 pm Nathan Vidican wrote:
 What I would like to do, is replace the above scenario with one wherein the
 program writing to the serial port is always connected and running, but not
 polling; ideally having some sort of interrupt or signal triggered from
 within memcached when a value is altered. Sort of a 're-sync' request
 asserting that the program sending data out the serial port should 'loop
 once'. I'd like to continue with the use of memcached as it provides a
 simple way for multiple systems to query the values in the array as well,
 (ie: some devices need not change the data, but only view it; given the
 latency requirements memcached operates ideally). This trigger should be
 asynchronous in that it should be fired and forgotten by memcached (by
 nature of the hardware designed, no error-checking nor receipt would be
 needed).
 
 I'm just not sure where to start? Could someone send me the right RTFM link
 to start from, or perhaps suggest a better way to look at solving this
 problem? Ideally any example code to look at with a simple signal or
 interrupt type of handler would be great. What I'm leaning towards is
 modifying memcached daemon to send a signal or trigger an interrupt of some
 sort to tell the other program communicating with the device to re-poll
 once. It would also be nice to have a way to trigger from other programs
 too.

A simple solution would be to create a pipe shared between memcached and the 
process that writes to the serial port.  memcached would write a dummy byte to 
the pipe each time it updates the values.  Your app could either use 
select/poll/kqueue or a blocking read on the pipe to sleep until memcached 
does an update.  That requires modifying memcached though.  I'm not familiar 
enough with memcached to know if it already has some sort of signalling 
facility that you could use directly.
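
A minimal sketch of the pipe approach, under the assumption that both processes already share the two pipe descriptors (for unrelated processes a named FIFO would play the same role). `notify_update()` and `wait_for_update()` are hypothetical helper names; the writer side stands in for whatever hook would be added to memcached.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Writer side (e.g. a hypothetical hook in memcached): signal one update. */
static int
notify_update(int wfd)
{
	char dummy = 'u';
	ssize_t n;

	assert(wfd >= 0);
	do {
		n = write(wfd, &dummy, 1);
	} while (n == -1 && errno == EINTR);
	return (n == 1 ? 0 : -1);
}

/*
 * Reader side: block until at least one update was signalled.
 * Returns 1 on wakeup, 0 if the writer closed the pipe, -1 on error.
 * Reading several bytes at once coalesces a burst of updates into one pass.
 */
static int
wait_for_update(int rfd)
{
	char buf[64];
	ssize_t n;

	assert(rfd >= 0);
	do {
		n = read(rfd, buf, sizeof(buf));
	} while (n == -1 && errno == EINTR);
	if (n > 0)
		return (1);
	return (n == 0 ? 0 : -1);
}
```

The serial-port program would call `wait_for_update()` in its main loop and push the current values out each time it returns 1. A writer that must never block would also want the write end set non-blocking.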

-- 
John Baldwin


Re: breaking the crunchgen logic into a share/mk file

2010-11-20 Thread John Baldwin
On Tuesday, November 16, 2010 8:45:08 am Andrey V. Elsukov wrote:
 On 16.11.2010 16:29, John Baldwin wrote:
  Err, are there no longer hard links to all of the frontends for a given 
  crunch?  If so, that is a problem as it will make rescue much harder to use.
 
 Yes, probably this patch is not needed and it should be fixed somewhere in
 makefiles. But currently rescue does not have any hardlinks:
 http://pub.allbsd.org/FreeBSD-snapshots/i386-i386/9.0-HEAD-20101116-JPSNAP/cdrom/livefs/rescue/
 
 And what it was before:
 http://pub.allbsd.org/FreeBSD-snapshots/i386-i386/9.0-HEAD-20101112-JPSNAP/cdrom/livefs/rescue/

That definitely needs to be fixed.

-- 
John Baldwin


Re: new cpuid bits

2010-11-22 Thread John Baldwin
On Friday, November 19, 2010 10:39:53 am Andriy Gapon wrote:
 
 Guys,
 
 I would like to add definitions for couple more useful CPUID bits, but I am
 greatly confused about how to name them.
 I failed to deduce the naming convention from the existing definitions and I
 am not sure how to make the names proper and descriptive.
 
 The bits in question are returned by CPUID.6 in EAX and ECX.
 CPUID.6 block is described by both AMD and Intel as Thermal and Power
 Management (Leaf).  Bits in EAX are defined only for Intel at present, the
 bit in ECX is defined for both.
 
 Description/naming of the bits from the specifications:
 EAX[0]: Digital temperature sensor is supported if set
 EAX[1]: Intel Turbo Boost Technology Available
 EAX[2]: ARAT. APIC-Timer-always-running feature is supported if set.
 ECX[0]:
   Intel: Hardware Coordination Feedback Capability (Presence of Bits MCNT and
          ACNT MSRs).
   AMD:  EffFreq: effective frequency interface.
 
 How does the following look to you?
 I will appreciate suggestions/comments.

Looks fine to me.

-- 
John Baldwin


Re: Quick i386 question...

2010-11-22 Thread John Baldwin
On Saturday, November 20, 2010 3:38:58 pm Sergio Andrés Gómez del Real wrote:
 If an interrupt is received in protected mode with paging enabled,
 is the linear address of the IDT stored in the IDTR translated using the
 paging-hierarchy structures?
 I have looked at the interrupt/exception chapter in the corresponding
 Intel manual but can't find the answer. Maybe I overlooked it.

Yes.  A linear address is the flat virtual address after segments are taken 
into account.  It is the address used as an input to the paging support in the 
MMU.
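
The two steps can be sketched in C, assuming 32-bit non-PAE paging with 4 KB pages (10-bit page-directory index, 10-bit page-table index, 12-bit offset). `linear_address()`, `split_linear()` and `struct lin_parts` are hypothetical names for illustration; the base held in the IDTR is already a linear address, so it goes through the second step like any other.

```c
#include <assert.h>
#include <stdint.h>

/* The three fields the paging unit extracts from a 32-bit linear address. */
struct lin_parts {
	uint32_t	pde_index;	/* bits 31..22: page-directory entry */
	uint32_t	pte_index;	/* bits 21..12: page-table entry */
	uint32_t	offset;		/* bits 11..0:  byte within the page */
};

/* Segmentation step: linear = segment base + offset (wraps at 4 GB). */
static uint32_t
linear_address(uint32_t seg_base, uint32_t offset)
{
	return (seg_base + offset);
}

/* Paging step: split the linear address into its lookup indices. */
static struct lin_parts
split_linear(uint32_t linear)
{
	struct lin_parts p;

	p.pde_index = linear >> 22;
	p.pte_index = (linear >> 12) & 0x3ff;
	p.offset = linear & 0xfff;
	return (p);
}
```

With flat segments (base 0) the linear address equals the virtual address the code uses, which is why the distinction rarely shows up in practice.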

-- 
John Baldwin


Re: Best way to determine if an IRQ is present

2010-11-22 Thread John Baldwin
On Saturday, November 20, 2010 4:58:02 pm Garrett Cooper wrote:
 Trying to do a complete solution for kern/145385, Andriy has
 raised concerns about IRQ mapping to CPUs; while I've put
 together more pieces of the puzzle, I'm a bit confused how I determine
 whether or not an IRQ is available for use.
 Sure, I could linear probe a series of IRQs, but that would
 probably be expensive, and different architectures treat IRQs
 differently, so building assumptions based on the fact that IRQ
 hierarchy is done in a particular order is probably not the best thing
 to do.
 I've poked around kern/kern_cpuset.c and kern/kern_intr.c a bit
 but I may have missed something important...

Well, the real solution is actually larger than described in the PR.  What you 
really want to do is take the logical CPUs offline when they are halted.  
Taking a CPU offline should trigger an EVENTHANDLER that various bits of code 
could invoke.  In the case of platforms that support binding interrupts to 
CPUs (x86 and sparc64 at least), they would install an event handler that 
searches the MD interrupt tables (e.g. the interrupt_sources[] array on x86) 
and move bound interrupts to other CPUs.  However, I think all the interrupt
bits will be MD, not MI.

-- 
John Baldwin


Re: Building my own release ISOs

2010-11-22 Thread John Baldwin
On Sunday, November 21, 2010 8:31:22 pm Sean Bruno wrote:
 Does this look about right to build from a test branch?
 
 sudo make release SVNROOT=ssh+svn://svn.freebsd.org/base
 SVNBRANCH=projects/sbruno_64cpus MAKE_ISOS=y MAKE_DVD=y NO_FLOPPIES=y
 NODOC=y NOPORTSATALL=y WORLD_FLAGS=-j32 KERNEL_FLAGS=-j32
 BUILDNAME=sbruno CHROOTDIR=/new_release

Sure.  Note, though, that you don't have to create a branch just to build a 
release with a patch.  You can always use LOCAL_PATCHES to apply patches to 
the source tree you build a release against.

-- 
John Baldwin


Re: Best way to determine if an IRQ is present

2010-11-25 Thread John Baldwin

Andriy Gapon wrote:
 on 22/11/2010 16:24 John Baldwin said the following:
  Well, the real solution is actually larger than described in the PR.  What you 
  really want to do is take the logical CPUs offline when they are halted.  
  Taking a CPU offline should trigger an EVENTHANDLER that various bits of code 
  could invoke.  In the case of platforms that support binding interrupts to 
  CPUs (x86 and sparc64 at least), they would install an event handler that 
  searches the MD interrupt tables (e.g. the interrupt_sources[] array on x86) 
  and move bound interrupts to other CPUs.  However, I think all the interrupt
  bits will be MD, not MI.
 
 That's a good idea and a comprehensive approach.
 One minor technical detail - should an offlined CPU be removed from all_cpus 
 mask/set?

That's tricky.  In other e-mails I've had on this topic, the idea has 
been to have a new online_cpus mask and maybe a CPU_ONLINE() test macro 
similar to CPU_ABSENT().  In that case, an offline CPU should still be 
in all_cpus, but many places that use all_cpus would need to use 
online_cpus instead.
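
A userland model of the proposal, purely as a sketch: `online_cpus`, `CPU_ONLINE()` and `demo_cpu_offline()` did not exist in the kernel at the time of this thread and are hypothetical here, as is the `demo_cpumask_t` type. The model only shows the intended relationship between the two masks.

```c
#include <assert.h>

typedef unsigned int demo_cpumask_t;	/* hypothetical mask type */

static demo_cpumask_t all_cpus;		/* CPUs that exist in the system */
static demo_cpumask_t online_cpus;	/* proposed: CPUs currently usable */

/* Modelled on the existing test: true if the CPU does not exist at all. */
#define	CPU_ABSENT(x)	((all_cpus & (1u << (x))) == 0)
/* Proposed companion test: true if the CPU exists and is not offlined. */
#define	CPU_ONLINE(x)	((online_cpus & (1u << (x))) != 0)

/* Taking a CPU offline clears it from online_cpus only. */
static void
demo_cpu_offline(int cpu)
{
	assert(!CPU_ABSENT(cpu));
	online_cpus &= ~(1u << cpu);
}
```

Code that iterates over CPUs to do work (binding interrupts, scheduling) would test `CPU_ONLINE()`, while code that sizes per-CPU data would keep testing `CPU_ABSENT()`.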


--
John Baldwin


Re: How to debug BTX loader?

2010-11-30 Thread John Baldwin
On Monday, November 29, 2010 1:01:27 pm Darmawan Salihun wrote:
 Hi guys, 
 
 I'm currently working on a BIOS for a custom Single Board Computer (SBC). 
 I have the required BIOS source code and tools at hand. 
 However, the boot process always stuck in the BTX loader 
 (the infamous ACPI autoload failed) when I booted out of USB stick 
 (with the FreeBSD 8.1 USB stick image). 
 
 I could get the system to boot into FreeBSD 8.1 
 (by keeping the CDROM tray open and close it when the board looks for 
 boot device, otherwise BTX will reboot instantly). 

Are you getting an actual BTX error message or a freeze?  BTX is just a 
minikernel written all in assembly.  It doesn't handle loading the kernel, 
etc.  All that work is done by the /boot/loader program (which is written in 
C).  You can find all the source to the boot code in src/sys/boot.  The BTX 
kernel is in src/sys/boot/i386/btx/btx/.

However, to debug this further we would need more info such as what exactly 
you are seeing (a hang, a BTX fault with a register dump, etc.).

-- 
John Baldwin


Re: 8.1-RELEASE hangs on reboot

2010-12-01 Thread John Baldwin
On Tuesday, November 30, 2010 8:23:19 pm Ondřej Majerech wrote:
 Hello,
 
 my 8.1-R system has just started hanging on reboot. Specifically after
 I svn up'd my source and updated from 8.1-R-p1 to -p2.
 
 Some kind of hang occurs on every reboot attempt. Usually it hangs at
 the Rebooting... message, but sometimes the thing just locks up
 before it even syncs disks. shutdown -p now seems to shutdown the
 system successfully each time.
 
 So I booted into single-user mode, executed reboot and during the
 Syncing disks I pressed Ctrl-Alt-Escape to break into the debugger.
 There I single-stepped with the s command until the thing simply
 stopped doing anything. (Even if I pressed NumLock, the LED on the
 keyboard wouldn't turn off.)
 
 The screen content at the moment of hang is (dutifully typed over as
 the thing is dead and I don't have a serial cable):
 
 [thread pid 12 tid 100017 ]
 Stopped at sckbdevent+0x5f: call _mtx_unlock_flags
 db>
 [thread pid 12 tid 100017 ]
 Stopped at _mtx_unlock_flags: pushq %rbp
 db>
 [thread pid 12 tid 100017 ]
 Stopped at _mtx_unlock_flags+0x1: movq %rsp,%rbp
 db>
 [thread pid 12 tid 100017 ]
 Stopped at _mtx_unlock_flags+0x4: subq $0x20,%rsp
 db>
 [thread pid 12 tid 100017 ]
 Stopped at _mtx_unlock_flags+0x8: movq %rbx,(%rsp)
 db>
 [thread pid 12 tid 100017 ]
 Stopped at _mtx_unlock_flags+0xc: movq %r12,0x8(%rsp)
 db>
 [thread pid 12 tid 100017 ]
 Stopped at _mtx_unlock_flags+0x11: movq %rdi,%rbx
 db>
 [thread pid 12 tid 100017 ]
 Stopped at _mtx_unlock_flags+0x14: movq %r13,0x10(%rsp)
 db>
 E
 
 Including that E at the end.

No good ideas here, though I think we just turned off PSL_T by
accident so it ran for a while before hanging after this.  'E' must be
the start of a message on the console.

 As I said, it's 8.1-RELEASE-p2; it's on AMD64. I'm using custom kernel
 which only differs from GENERIC by addition of the debugging options:
 
 options INVARIANTS
 options INVARIANT_SUPPORT
 options WITNESS
 options DEBUG_LOCKS
 options DEBUG_VFS_LOCKS
 options DIAGNOSTIC
 
 I tried rebooting with ACPI disabled, but the thing panicked on boot with
 
 panic: Duplicate free of item 0xff00025e from zone
 0xff00bfdcc2a0(1024)
 
 cpuid = 0
 KDB: enter: panic
 [thread pid 0 tid 10 ]
 Stopped at kdb_enter+0x3d: movq $0, 0x6b2d20(%rip)
 db> bt
 Tracing pid 0 tid 10 td 0x80c63fc0
 kdb_enter() at kdb_enter+0x3d
 panic() at panic+0x17b
 uma_dbg_free() at uma_dbg_free+0x171
 uma_zfree_arg() at uma_zfree_arg+0x68
 free() at free+0xcd
 device_set_driver() at device_set_driver+0x7c
 device_attach() at device_attach+0x19b
 bus_generic_attach() at bus_generic_attach+0x1a
 pci_attach() at pci_attach+0xf1

The free() should be the free to free the softc but that implies it had a 
previous driver and softc.  Maybe add some debug info to device_set_driver() 
to print out the previous driver's name (and maybe the value of the pointer)
before free'ing the softc.  You could use gdb on the kernel.debug and the 
pointer value to figure out exactly which driver was the previous one and look 
to see if its probe routine does something funky with the softc pointer.

-- 
John Baldwin


Re: How to debug BTX loader?

2010-12-02 Thread John Baldwin
On Wednesday, December 01, 2010 4:09:42 pm Darmawan Salihun wrote:
 Hi John, 
 
 --- On Tue, 11/30/10, John Baldwin j...@freebsd.org wrote:
 
  From: John Baldwin j...@freebsd.org
  Subject: Re: How to debug BTX loader?
  To: freebsd-hackers@freebsd.org
  Cc: Darmawan Salihun darmawan_sali...@yahoo.com
  Date: Tuesday, November 30, 2010, 9:38 AM
  On Monday, November 29, 2010 1:01:27 pm Darmawan Salihun wrote:
   Hi guys, 
   
   I'm currently working on a BIOS for a custom Single Board Computer (SBC). 
   I have the required BIOS source code and tools at hand. 
   However, the boot process always stuck in the BTX loader 
   (the infamous ACPI autoload failed) when I booted out of USB stick 
   (with the FreeBSD 8.1 USB stick image). 
   
   I could get the system to boot into FreeBSD 8.1 
   (by keeping the CDROM tray open and close it when the board looks for 
   boot device, otherwise BTX will reboot instantly). 
  
  Are you getting an actual BTX error message or a freeze?  BTX is just a 
  minikernel written all in assembly.  It doesn't handle loading the kernel, 
  etc.  All that work is done by the /boot/loader program (which is written in 
  C).  You can find all the source to the boot code in src/sys/boot.  The BTX 
  kernel is in src/sys/boot/i386/btx/btx/.
  
  However, to debug this further we would need more info such as what exactly 
  you are seeing (a hang, a BTX fault with a register dump, etc.).
 
 One of the BTX fault shows the register dump in the attachment. 
 I hope this could help. Anyway, If I were to try to interpret 
 such register dump, where should I start? I understand x86/x86_64 
 assembly pretty much, but I'm not quite well versed with the 
 FreeBSD code using it. 

Looks like the mailing list stripped the attachment.  Can you post the 
attachment at a URL?

-- 
John Baldwin


Re: How to debug BTX loader?

2010-12-02 Thread John Baldwin
On Thursday, December 02, 2010 2:12:04 pm Darmawan Salihun wrote:
 Hi John, 
 
 --- On Thu, 12/2/10, John Baldwin j...@freebsd.org wrote:
 
  From: John Baldwin j...@freebsd.org
  Subject: Re: How to debug BTX loader?
  To: freebsd-hackers@freebsd.org
  Cc: Darmawan Salihun darmawan_sali...@yahoo.com
  Date: Thursday, December 2, 2010, 8:58 AM
  On Wednesday, December 01, 2010 4:09:42 pm Darmawan Salihun wrote:
   
   One of the BTX fault shows the register dump in the attachment. 
   I hope this could help. Anyway, If I were to try to interpret 
   such register dump, where should I start? I understand x86/x86_64 
   assembly pretty much, but I'm not quite well versed with the 
   FreeBSD code using it. 
  
  Looks like the mailing list stripped the attachment.  Can you post the 
  attachment at a URL?
 
 
 The BTX crash message is in the attachment.

Ok, so clearly the instruction pointer has jumped off into the weeds given 
that the instruction stream is all 0xff.  The instruction pointer value 
(0xc09d3600) implies that this is in the kernel already during early kernel 
startup (before the kernel installs its own IDT with its own fault and 
exception handlers).  It might be helpful to pull up gdb on your kernel.debug 
file and do 'l *0xc09d3600' to see what you get.  Looking at the stack 
'0xc1830188' might be another address in the kernel.

-- 
John Baldwin


Re: coretemp(4)/amdtemp(4) and sysctl nodes

2010-12-06 Thread John Baldwin
On Friday, December 03, 2010 1:05:02 pm m...@freebsd.org wrote:
 There are very few uses in FreeBSD mainline code of
 sysctl_remove_oid(), and I was looking at potentially removing them.
 However, the use in coretemp/amdtemp has me slightly stumped.
 
 Each device provides a device_get_sysctl_ctx sysctl_ctx that is
 automatically cleaned up when the device goes away.  Yet the sysctl
 nodes for both amdtemp and coretemp use the context of other devices,
 rather than their own.  I can't quite figure out why, though the two
 are slightly different enough that they may have different reasons.
 
 For coretemp(4) I don't see how the parent device can be removed first,
 since we are a child device.  So from my understanding it makes no
 sense to have an explicit sysctl_remove_oid() and attach in the
 parent's sysctl_ctx.

Well, you would want 'kldunload coretemp.ko' to remove the sysctl node even 
though the parent device is still around.  I suspect the same case is true
for amdtemp.  Probably these drivers should use a separate sysctl context.
I'm not sure how the sysctl code handles removing a node that has an active 
context though.

-- 
John Baldwin


Re: small dtrace patch for review

2010-12-06 Thread John Baldwin
On Friday, December 03, 2010 11:57:42 am Andriy Gapon wrote:
 
 The patch is not about DTrace functionality, but about infrastructure use in
 one particular place.
 http://people.freebsd.org/~avg/dtrace_gethrtime_init.diff
 I believe that sched_pin() is needed there to make sure that host/base CPU
 stays the same for all calls to smp_rendezvous_cpus().
 The pc_cpumask should just be a cosmetic change.

Looks good to me.

-- 
John Baldwin


Re: atomic_set_xxx(x, 0)

2010-12-07 Thread John Baldwin
On Tuesday, December 07, 2010 12:58:43 pm Andriy Gapon wrote:
 
 $ glimpse atomic_set_ | fgrep -w 0
 /usr/src/sys/dev/arcmsr/arcmsr.c: atomic_set_int(&acb->srboutstandingcount, 0);
 /usr/src/sys/dev/arcmsr/arcmsr.c: atomic_set_int(&acb->srboutstandingcount, 0);
 /usr/src/sys/dev/jme/if_jme.c: atomic_set_int(&sc->jme_morework, 0);
 /usr/src/sys/dev/jme/if_jme.c: atomic_set_int(&sc->jme_morework, 0);
 /usr/src/sys/dev/ale/if_ale.c: atomic_set_int(&sc->ale_morework, 0);
 /usr/src/sys/mips/rmi/dev/xlr/rge.c: atomic_set_int(&(priv->frin_to_be_sent[i]), 0);
 /usr/src/sys/dev/drm/drm_irq.c: atomic_set_rel_32(&dev->vblank[i].count, 0);
 /usr/src/sys/dev/cxgb/ulp/tom/cxgb_tom.c: atomic_set_int(&t->tids_in_use, 0);
 
 I wonder if these are all bugs and atomic_store_xxx() was actually intended?

They are most likely bugs.  You can probably ask yongari@ about jme(4) and
ale(4) and np@ about cxgb(4).  drm_irq looks to want to be an
atomic_store_rel().  Not sure who to ask about arcmsr(4).  I'm not sure
arcmsr(4) really needs the atomic ops at all, but it should be using
atomic_fetchadd() and atomic_readandclear() instead of some of the current
atomic ops.
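
Why these look like bugs: per atomic(9), atomic_set_int(p, v) atomically ORs v into *p, so passing 0 changes nothing, whereas atomic_store_rel_int(p, v) assigns v.  The sketch below models the two operations in userland with C11 <stdatomic.h>; the `model_*` function names are hypothetical and this is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>

/* Model of atomic_set_int(p, v): atomically performs *p |= v. */
static void
model_atomic_set_int(atomic_uint *p, unsigned int v)
{
	atomic_fetch_or(p, v);
}

/* Model of atomic_store_rel_int(p, v): *p = v with release semantics. */
static void
model_atomic_store_rel_int(atomic_uint *p, unsigned int v)
{
	atomic_store_explicit(p, v, memory_order_release);
}
```

With a counter holding 5, the "set" with 0 leaves it at 5, while the store resets it to 0, which is presumably what the drivers intended.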

-- 
John Baldwin


Re: getting rid of some -mno-* flags under sys/boot

2010-12-20 Thread John Baldwin
On Sunday, December 19, 2010 12:42:01 pm Garrett Cooper wrote:
 On Sun, Dec 19, 2010 at 3:23 AM, Alexander Best arun...@freebsd.org wrote:
  hi there,
 
  i think some of the -mno-* flags in sys/boot/* can be scrubbed, since
  they're already being included from ../Makefile.inc.
 
 Looks good.
 
  also TARGET cleandir leaves some files behind in i386/gptboot which should
  be fixed by this patch.
 
 AHA. This might fix the issue I've seen rebuilding stuff with
 gptzfsboot for a good while now where I have to (on mostly rare
 occasions with -j24, etc typically after updating my source tree)
 rebuild it manually. gptzfsboot and zfsboot also need the fix, BTW.
 The only thing is that these files live under the common directory, so
 shouldn't common clean them up (I see that common doesn't have a
 Makefile though, only a Makefile.inc -- ouch)?
 FWIW though, wouldn't it be better to avoid this accidental bug
 and unnecessary duplication by doing something like the following?
 
 # ...
 
 OBJS=zfsboot.o sio.o gpt.o drv.o cons.o util.o
 CLEANFILES+= gptzfsboot.out ${OBJS}
 
 gptzfsboot.out: ${BTXCRT} ${OBJS}
 # ...

Yes, an OBJS would be good.  Also, gptboot.c was recently changed to not
#include ufsread.c, so that explicit dependency can be removed, as can the
GPTBOOT_UFS variable.

Similar fixes probably apply to gptzfsboot.

BTW, the code in common/ is not built into a library, but specific boot
programs (typically /boot/loader on different platforms) include specific
objects.

-- 
John Baldwin


Re: PCI IDE Controller Base Address Register setting

2010-12-28 Thread John Baldwin
On Monday, December 27, 2010 6:07:35 am Darmawan Salihun wrote:
 Hi, 
 
 I'm trying to install FreeBSD 8.0 on AMD Geode LX800 (CS5536 southbridge). 
 However, it cannot detect the IDE controller (in the CS5536) correctly. It 
 says something similar to this: 
 IDE controller not present

Hmm, I can't find a message like that anywhere.  Can you get the exact message 
you are seeing?

 I did lspci in Linux (BackTrack 3) 
 and I saw that the IDE controller Base Address Registers (BARs) 
 are all disabled (only contains zeros), 
 except for one of them (BAR4). 
 BAR4 decodes 16-bytes I/O ports (FFF0h-FFFFh). 
 The decoded ports seems to conform to the PCI IDE specification 
 for native-PCI IDE controller (relocatable within the 
 16-bit I/O address space). 
 
 I did cat /proc/ioports and I found that 
 the following I/O port address ranges decoded correctly 
 to the IDE controller in the CS5536 southbridge:
 
 1F0h-1F7h 
 3F6h 
 170h-177h
 FFF0h-FFFFh
 
 My question: 
 Does FreeBSD require the IDE controller BARs 
 to be programmed to also decode 
 legacy I/O ports ranges (1F0h-1F7h,3F6h and 170h-177h)? 

No.  We hardcode the ISA ranges for BARs 0 through 3 if a PCI IDE controller 
has the Primary or Secondary bits set in its programming interface 
register and don't even look at the BARs.  We do always examine BARs 4 and 5 
using the normal probing scheme of writing all 1's, etc.  The code in question 
looks like this:

/*
 * For ATA devices we need to decide early what addressing mode to use.
 * Legacy demands that the primary and secondary ATA ports sits on the
 * same addresses that old ISA hardware did. This dictates that we use
 * those addresses and ignore the BAR's if we cannot set PCI native
 * addressing mode.
 */
static void
pci_ata_maps(device_t bus, device_t dev, struct resource_list *rl, int force,
    uint32_t prefetchmask)
{
	struct resource *r;
	int rid, type, progif;
#if 0
	/* if this device supports PCI native addressing use it */
	progif = pci_read_config(dev, PCIR_PROGIF, 1);
	if ((progif & 0x8a) == 0x8a) {
		if (pci_mapbase(pci_read_config(dev, PCIR_BAR(0), 4)) &&
		    pci_mapbase(pci_read_config(dev, PCIR_BAR(2), 4))) {
			printf("Trying ATA native PCI addressing mode\n");
			pci_write_config(dev, PCIR_PROGIF, progif | 0x05, 1);
		}
	}
#endif
	progif = pci_read_config(dev, PCIR_PROGIF, 1);
	type = SYS_RES_IOPORT;
	if (progif & PCIP_STORAGE_IDE_MODEPRIM) {
		pci_add_map(bus, dev, PCIR_BAR(0), rl, force,
		    prefetchmask & (1 << 0));
		pci_add_map(bus, dev, PCIR_BAR(1), rl, force,
		    prefetchmask & (1 << 1));
	} else {
		rid = PCIR_BAR(0);
		resource_list_add(rl, type, rid, 0x1f0, 0x1f7, 8);
		r = resource_list_reserve(rl, bus, dev, type, &rid, 0x1f0,
		    0x1f7, 8, 0);
		rid = PCIR_BAR(1);
		resource_list_add(rl, type, rid, 0x3f6, 0x3f6, 1);
		r = resource_list_reserve(rl, bus, dev, type, &rid, 0x3f6,
		    0x3f6, 1, 0);
	}
	if (progif & PCIP_STORAGE_IDE_MODESEC) {
		pci_add_map(bus, dev, PCIR_BAR(2), rl, force,
		    prefetchmask & (1 << 2));
		pci_add_map(bus, dev, PCIR_BAR(3), rl, force,
		    prefetchmask & (1 << 3));
	} else {
		rid = PCIR_BAR(2);
		resource_list_add(rl, type, rid, 0x170, 0x177, 8);
		r = resource_list_reserve(rl, bus, dev, type, &rid, 0x170,
		    0x177, 8, 0);
		rid = PCIR_BAR(3);
		resource_list_add(rl, type, rid, 0x376, 0x376, 1);
		r = resource_list_reserve(rl, bus, dev, type, &rid, 0x376,
		    0x376, 1, 0);
	}
	pci_add_map(bus, dev, PCIR_BAR(4), rl, force,
	    prefetchmask & (1 << 4));
	pci_add_map(bus, dev, PCIR_BAR(5), rl, force,
	    prefetchmask & (1 << 5));
}
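
The per-channel decision made by that function can be tried out in userland.
What follows is only a minimal sketch: the names ide_uses_legacy_ports(),
ide_legacy_cmd_base(), IDE_MODEPRIM and IDE_MODESEC are made up for
illustration (the kernel uses the PCIP_STORAGE_IDE_* constants), and the bit
values 0x01 and 0x04 are assumed to be the primary and secondary native-mode
bits of the programming interface byte.

```c
#include <stdint.h>

/* Assumed native-mode bits of the PCI IDE programming interface byte. */
#define IDE_MODEPRIM	0x01	/* primary channel is in native-PCI mode */
#define IDE_MODESEC	0x04	/* secondary channel is in native-PCI mode */

enum ide_channel { IDE_PRIMARY, IDE_SECONDARY };

/*
 * Hypothetical helper: returns 1 when the channel is in compatibility
 * mode, i.e. when the fixed ISA port ranges would be used instead of
 * the BARs.
 */
static int
ide_uses_legacy_ports(uint8_t progif, enum ide_channel ch)
{
	uint8_t native_bit;

	native_bit = (ch == IDE_PRIMARY) ? IDE_MODEPRIM : IDE_MODESEC;
	return ((progif & native_bit) == 0);
}

/* Hypothetical helper: legacy command block base for a channel. */
static uint16_t
ide_legacy_cmd_base(enum ide_channel ch)
{
	return (ch == IDE_PRIMARY ? 0x1f0 : 0x170);
}
```

With a programming interface value of 80h both channels come out as legacy,
so BARs 0 through 3 would not be consulted for either of them.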


-- 
John Baldwin


Re: PCI IDE Controller Base Address Register setting

2010-12-28 Thread John Baldwin
On Tuesday, December 28, 2010 1:38:05 pm Darmawan Salihun wrote:
 Hi,
 
 --- On Tue, 12/28/10, John Baldwin j...@freebsd.org wrote:
 
  From: John Baldwin j...@freebsd.org
  Subject: Re: PCI IDE Controller Base Address Register setting
  To: freebsd-hackers@freebsd.org
  Cc: Darmawan Salihun darmawan_sali...@yahoo.com
  Date: Tuesday, December 28, 2010, 10:20 AM
  On Monday, December 27, 2010 6:07:35
  am Darmawan Salihun wrote:
   Hi, 
   
   I'm trying to install FreeBSD 8.0 on AMD Geode LX800
  (CS5536 southbridge). 
  However, it cannot detect the IDE controller (in the
  CS5536) correctly. It 
  says something similar to this: 
   IDE controller not present
  
  Hmm, I can't find a message like that anywhere.  Can
  you get the exact message 
  you are seeing?
  
 
 It says: 
 
 No disks found! Please verify that your disk controller is being properly
 probed at boot time.

Oh, so this is a message from the installer.  Can you capture a verbose dmesg
via a serial console perhaps?  Or at least the kernel probe messages for your
ATA controller?

-- 
John Baldwin


Re: PCI IDE Controller Base Address Register setting

2010-12-28 Thread John Baldwin
On Tuesday, December 28, 2010 2:10:59 pm Darmawan Salihun wrote:
 Hi, 
 
 --- On Tue, 12/28/10, John Baldwin j...@freebsd.org wrote:
 
  From: John Baldwin j...@freebsd.org
  Subject: Re: PCI IDE Controller Base Address Register setting
  To: Darmawan Salihun darmawan_sali...@yahoo.com
  Cc: freebsd-hackers@freebsd.org
  Date: Tuesday, December 28, 2010, 1:52 PM
  On Tuesday, December 28, 2010 1:38:05
  pm Darmawan Salihun wrote:
   Hi,
   
   --- On Tue, 12/28/10, John Baldwin j...@freebsd.org
  wrote:
   
From: John Baldwin j...@freebsd.org
Subject: Re: PCI IDE Controller Base Address
  Register setting
To: freebsd-hackers@freebsd.org
Cc: Darmawan Salihun darmawan_sali...@yahoo.com
Date: Tuesday, December 28, 2010, 10:20 AM
On Monday, December 27, 2010 6:07:35
am Darmawan Salihun wrote:
 Hi, 
 
 I'm trying to install FreeBSD 8.0 on AMD
  Geode LX800
(CS5536 southbridge). 
However, it cannot detect the IDE controller (in
  the
CS5536) correctly. It 
says something similar to this: 
 IDE controller not present

Hmm, I can't find a message like that
  anywhere.  Can
you get the exact message 
you are seeing?

   
   It says: 
   
   No disks found! Please verify that your disk
  controller is being properly
   probed at boot time.
  
  Oh, so this is a message from the installer.  Can you
  capture a verbose dmesg
  via a serial console perhaps?  
 
 I'm not sure if I can do this because I've tried a couple of times 
 but nothing comes out of the serial console. Perhaps a wrong baud rate 
 setting? 
 I set it to 9600bps and 8-N-1 back then. Is that correct? 

Yes, that should be correct.  You have to turn the console on however (it is
not enabled by default).  The simplest way to do this is probably to hit the
key option to break into the loader prompt when you see the boot menu (I think
it is option '6').  Then enter 'boot -D' at the 'OK' prompt.  This should boot
with both the video and serial consoles enabled with the video console as the
primary console.  For a verbose boot, use 'boot -Dv'

If you want to test out the serial console before you boot, you can instead
enter 'set console=vidconsole,comconsole' at the prompt.  You should then
see an OK prompt on both the screen and the serial port.

Note that the serial console is hardcoded to use the default I/O ports for
COM1.

  Or at least the kernel
  probe messages for your
  ATA controller?
  
 
 I recall that pressing Alt+F2 during the installation would open-up 
 another console, full with log messages. Would that be enough? 

Actually, the kernel probe messages are on the main console, but you can hit
scroll lock to freeze the console and then use page up to go back in history
and find the messages.

-- 
John Baldwin


Re: PCI IDE Controller Base Address Register setting

2011-01-03 Thread John Baldwin
On Saturday, January 01, 2011 2:58:12 pm Darmawan Salihun wrote:
 
 --- On Thu, 12/30/10, Darmawan Salihun darmawan_sali...@yahoo.com wrote:
 
  From: Darmawan Salihun darmawan_sali...@yahoo.com
  Subject: Re: PCI IDE Controller Base Address Register setting
  To: John Baldwin j...@freebsd.org
  Cc: freebsd-hackers@freebsd.org
  Date: Thursday, December 30, 2010, 3:28 PM
  --- On Tue, 12/28/10, John Baldwin
  j...@freebsd.org
  wrote:
  
   From: John Baldwin j...@freebsd.org
   Subject: Re: PCI IDE Controller Base Address Register
  setting
   To: Darmawan Salihun darmawan_sali...@yahoo.com
   Cc: freebsd-hackers@freebsd.org
   Date: Tuesday, December 28, 2010, 2:22 PM
   On Tuesday, December 28, 2010 2:10:59
   pm Darmawan Salihun wrote:
Hi, 

--- On Tue, 12/28/10, John Baldwin j...@freebsd.org
   wrote:

 From: John Baldwin j...@freebsd.org
 Subject: Re: PCI IDE Controller Base
  Address
   Register setting
 To: Darmawan Salihun darmawan_sali...@yahoo.com
 Cc: freebsd-hackers@freebsd.org
 Date: Tuesday, December 28, 2010, 1:52 PM
 On Tuesday, December 28, 2010 1:38:05
 pm Darmawan Salihun wrote:
  Hi,
  
  --- On Tue, 12/28/10, John Baldwin
  j...@freebsd.org
 wrote:
  
   From: John Baldwin j...@freebsd.org
   Subject: Re: PCI IDE Controller
  Base
   Address
 Register setting
   To: freebsd-hackers@freebsd.org
   Cc: Darmawan Salihun darmawan_sali...@yahoo.com
   Date: Tuesday, December 28, 2010,
  10:20
   AM
   On Monday, December 27, 2010
  6:07:35
   am Darmawan Salihun wrote:
Hi, 

I'm trying to install FreeBSD
  8.0
   on AMD
 Geode LX800
   (CS5536 southbridge). 
   However, it cannot detect the IDE
   controller (in
 the
   CS5536) correctly. It 
   says something similar to this: 
IDE controller not present
   
   Hmm, I can't find a message like
  that
 anywhere.  Can
   you get the exact message 
   you are seeing?
   
  
  It says: 
  
  No disks found! Please verify that
  your
   disk
 controller is being properly
  probed at boot time.
 
 Oh, so this is a message from the
   installer.  Can you
 capture a verbose dmesg
 via a serial console perhaps?  

I'm not sure if I can do this because I've tried
  a
   couple of times 
but nothing comes out of the serial console.
  Perhaps a
   wrong baud rate setting? 
I set it to 9600bps and 8-N-1 back then. Is that
   correct? 
   
   Yes, that should be correct.  You have to turn the
   console on however (it is
   not enabled by default).  The simplest way to do
  this
   is probably to hit the
   key option to break into the loader prompt when you
  see the
   boot menu (I think
   it is option '6').  Then enter 'boot -D' at the 'OK'
   prompt.  This should boot
   with both the video and serial consoles enabled with
  the
   video console as the
   primary console.  For a verbose boot, use 'boot -Dv'
   
  
  Thanks, I tested this option and it worked. 
  I could see the debugging messages. 
  
  FreeBSD cannot detect the disk in all of the IDE
  interfaces.  
  (The AMDCS5536 only implemented the primary channel)
  
  Anyway, I manage to change the mapping in BAR4 of the IDE
  controller. 
  However, I'm confused as to how to force FreeBSD to
  recognize the 
  IDE controller to work only in compatibility mode. 
  Because, I'm not sure if the physical IDE controller chip
  supports 
  Native-PCI mode correctly at all. 
  If I set BAR4 to disabled(i.e. not decoding any I/O
  addresses at all), 
  would FreeBSD use compatibility mode? or would it consider
  the 
  IDE controller not present?
  
  Here's some notes about the IDE controller PCI
  configuration registers:
  1. The Programming Interface register contains 80h (which
  means _only_ 
  compatibility mode supported). I have yet to be able to
  write new values 
  into this register. That's the state of the register right
  now. 
  I noticed in your previous reply that for FreeBSD to be
  forced to use 
  compatibility mode, the programming interface register bits
  in the PCI configuration register must be set accordingly 
  (I suppose the bits in the lower nibble).
  
  2. BAR0-BAR3 cannot be changed and contains 00h. 
  I have yet to experiment with BAR5.The default value is
  00h
  
 
 Silly me that I didn't know about the SFF-8038i standard 
 (PCI IDE Bus mastering). So, I found out that it seems the 
 allocation of I/O ports for the IDE controller is just fine. 
 However, the primary IDE channel is shared between 
 an IDE interface  and a CF card. Moreover, Linux detects 
 DMA bug, because all drives connected to the interface would be 
 in PIO mode :-/
 If all drives on the primary channel are forced to PIO mode, then 
 shouldn't the IDE PCI bus master register (offset 20h per SFF-8038i)
 along with the command register (offset 4h), are set

Re: PANIC: thread_exit: Last thread exiting on its own.

2011-01-03 Thread John Baldwin
On Friday, December 31, 2010 4:22:36 am Lev Serebryakov wrote:
 Hello, Giovanni.
 You wrote on December 31, 2010, 1:56:20:
 
   I've  got  this  panic on reboot from geom_raid5.
  Could you please provide some backtrace? Have you got a core?
   The backtrace was simple (I've reproduced it from memory, but it
   really was that simple):
 
   all debugger-related stuff
   panic()
   thread_exit()
   kthread_exit()
   g_raid5_worker()
   fork_trampoline()
   ...
 
   No core, because I didn't have dumpdev configured :(
 
  Which revision of -STABLE are you running(or when last src update)?
   uname shows:
 
 FreeBSD 8.2-PRERELEASE #2: Tue Dec 21 01:17:16 MSK 2010
 
   I've  rebuilt  kernel  RIGHT after `csup', so difference is no more
  than several hours.

Looks like 204087 needs to be MFC'd.

-- 
John Baldwin


Re: [patch] have rtprio check that arguments are numeric; change atoi to strtol

2011-01-04 Thread John Baldwin
On Tuesday, January 04, 2011 6:25:02 am Kostik Belousov wrote:
 On Tue, Jan 04, 2011 at 11:40:45AM +0100, Giorgos Keramidas wrote:
 @@ -123,12 +121,28 @@ main(argc, argv)
   }
   exit(0);
   }
 - exit (1);
 + exit(1);
 +}
 +
 +static int
 +parseint(const char *str, const char *errname)
 +{
 + char *endp;
 + long res;
 +
 + errno = 0;
 + res = strtol(str, &endp, 10);
 + if (errno != 0 || endp == str || *endp != '\0')
 + err(1, "%s shall be a number", errname);

Small nit, maybe use 'must' instead of 'shall'.
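
For reference, the strtol() checks in the quoted hunk can be exercised on
their own.  This is a minimal standalone sketch, not the rtprio code: the
name parse_int_strict() is made up, and it returns an error code instead of
calling err() so that the checks are easy to test.  The range check against
INT_MIN/INT_MAX is an addition that the quoted hunk does not show.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse str as a base-10 int.  Returns 0 and stores
 * the value in *out on success, -1 if str is empty, has trailing junk,
 * or does not fit in an int.
 */
static int
parse_int_strict(const char *str, int *out)
{
	char *endp;
	long res;

	errno = 0;
	res = strtol(str, &endp, 10);
	/* errno catches overflow, endp == str catches "no digits at all". */
	if (errno != 0 || endp == str || *endp != '\0')
		return (-1);
	if (res < INT_MIN || res > INT_MAX)
		return (-1);
	*out = (int)res;
	return (0);
}
```

Unlike atoi(), this rejects inputs such as "12abc" and the empty string
rather than silently returning a number.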

-- 
John Baldwin


Re: Building third-party modules for kernel with debug options?

2011-01-07 Thread John Baldwin
On Friday, January 07, 2011 7:15:59 am Lev Serebryakov wrote:
 Hello, Freebsd-hackers.
 
 
   I've found, that struct bio is depend on state of DIAGNOSTIC
 flag (options DIAGNOSTIC in kernel config). But when I build
 third-party GEOM (or any other) module with using of bsd.kmod.mk,
 there is no access to these options. So, module, built from ports, can
 fail on user's kernel, even if it built with proper kernel sources in
 /usr/src/sys. Is here any solution for this problem?
 
 P.S. NB: GEOM module is only example, question is about modules 
 kernel options in general, so I put this message on Hackers list.

In general we try to avoid having public kernel data structures change size 
when various kernel options are in use.  Some noticeable exceptions to this 
rule are PAE (i386-only) and LOCK_PROFILING (considered to be something users 
would not typically use).  DIAGNOSTIC might arguably be considered the same as 
LOCK_PROFILING, but I am surprised it affects bio.  It should only affect a 
GEOM module that uses bio_pblockno however in this case since you should be 
using kernel routines to allocate bio structures rather than malloc'ing one 
directly.  Perhaps phk@ would ok moving bio_pblockno up above the optional 
diagnostic fields.

-- 
John Baldwin


Re: [rfc] allow to boot with >= 256GB physmem

2011-01-21 Thread John Baldwin
On Friday, January 21, 2011 11:09:10 am Sergey Kandaurov wrote:
 Hello.
 
 Some time ago I faced with a problem booting with 400GB physmem.
 The problem is that vm.max_proc_mmap type overflows with
 such high value, and that results in a broken mmap() syscall.
 The max_proc_mmap value is a signed int and roughly calculated
 at vmmapentry_rsrc_init() as u_long vm_kmem_size quotient:
 vm_kmem_size / sizeof(struct vm_map_entry) / 100.
 
 Although at the time it was introduced at svn r57263 the value
 was quite low (f.e. the related commit log stands:
 The value defaults to around 9000 for a 128MB machine.),
 the problem is observed on amd64 where KVA space after
 r212784 is factually bound to the only physical memory size.
 
 With INT_MAX here being 0x7fffffff, and sizeof(struct vm_map_entry)
 being 120, it's enough to have slightly less than 256GB to be able
 to reproduce the problem.
 
 I rewrote vmmapentry_rsrc_init() to set large enough limit for
 max_proc_mmap just to protect from integer type overflow.
 As it's also possible to live tune this value, I also added a
 simple anti-shoot constraint to its sysctl handler.
 I'm not sure though if it's worth to commit the second part.
 
 As this patch may cause some bikeshedding,
 I'd like to hear your comments before I will commit it.
 
 http://plukky.net/~pluknet/patches/max_proc_mmap.diff

Is there any reason we can't just make this variable and sysctl a long?
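
The arithmetic behind the report can be checked with a short standalone
sketch.  It assumes a 32-bit int, takes the entry size of 120 bytes from the
message above, and assumes that the quotient vm_kmem_size / sizeof(struct
vm_map_entry) is what first lands in the int; the function names are made up
for illustration.

```c
#include <limits.h>
#include <stdint.h>

/* sizeof(struct vm_map_entry) as quoted in the report. */
#define MAP_ENTRY_SIZE	120u

/*
 * Smallest kmem size in bytes whose quotient by the entry size no
 * longer fits in a signed int.
 */
static uint64_t
overflow_threshold_bytes(void)
{
	return (((uint64_t)INT_MAX + 1) * MAP_ENTRY_SIZE);
}

/* The same limit calculation carried out in a 64-bit type. */
static int64_t
max_proc_mmap_wide(uint64_t kmem_size)
{
	return ((int64_t)(kmem_size / MAP_ENTRY_SIZE / 100));
}
```

Under those assumptions the threshold works out to 240GB, which matches
"slightly less than 256GB", and a wider type holds the intermediate value
for the 400GB machine without trouble.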

-- 
John Baldwin


Re: pci_suspend/pci_resume of custom pcie board

2011-01-25 Thread John Baldwin
On Tuesday, January 25, 2011 9:47:35 am Philip Soeberg wrote:
 Hi,
 
 I'm in a particular problem where I need to set my custom pcie adapter 
 into d3hot power-mode and a couple of seconds later reset it back to d0.
 The board has an FPGA directly attached to the pcie interface, and as I 
 need to re-configure the FPGA on the fly, I have to ensure the datalink 
 layer between the upstream bridge and my device is idle to prevent any 
 hiccups.
 
 On linux I simply do a pci_save_state(device) followed by 
 pci_set_power_state(device, d3hot), then after my magic on my board, I 
 do the reverse: pci_set_power_state(device, d0) followed by 
 pci_restore_state(device).
 
 On FreeBSD, say 8, I've found the pci_set_powerstate function, which is 
 documented in PCI(9), but that function does not save nor restore the 
 config space.
 
 I've tried, just for the fun of it, to go via pci_cfg_save(device, 
 dinfo, 0) with dinfo being device_get_ivars(device) and then 
 subsequently restoring the config space back via pci_cfg_restore(), but 
 since both those functions are declared in dev/pci/pci_private.h I'm 
 not sure if I'm supposed to use those directly or not.. Besides, I'm not 
 really having any luck with that approach.
 
 Reading high and low on the net suggest that not all too many driver 
 devs are concerned with suspend/resume operation of their device, and if 
 they are, leave it to user-space to decide when to suspend/resume a 
 device.. I would like to be able to save off my device' config space, 
 put it to sleep (d3hot), wake it back up (d0) and restore the device' 
 config space directly from the device' own driver..
 
 Anyone who can help me with this?

Use this:

pci_cfg_save(dev, dinfo, 0);
pci_set_powerstate(dev, PCI_POWERSTATE_D3);

/* do stuff */

/* Will set state to D0. */
pci_cfg_restore(dev, dinfo);

We probably should create some wrapper routines (pci_save_state() and 
pci_restore_state() would be fine) that hide the 'dinfo' detail as that isn't 
something device drivers should have to know.

-- 
John Baldwin


Re: rtld optimizations

2011-01-26 Thread John Baldwin
On Wednesday, January 26, 2011 10:25:27 am Mark Felder wrote:
 On Tue, 25 Jan 2011 22:49:11 -0600, Alexander Kabaev kab...@gmail.com  
 wrote:
 
   The only extra quirk that said commit
  does is an optimization of a dlsym() call, which is hardly ever in
  critical performance path.
 
 It's really not my place to say, but it seems strange that if an  
 optimization is available people would ignore it because they don't think  
 it's important enough. I don't understand this mentality; if it's not  
 going to break anything and it obviously can improve performance in  
 certain use cases, why not merge it and make FreeBSD even better?

Many things that seem obvious aren't actually true, hence the need for
actual testing and benchmarks.

-- 
John Baldwin


Re: Namecache lock contention?

2011-01-28 Thread John Baldwin
On Friday, January 28, 2011 8:46:07 am Ivan Voras wrote:
 I have this situation on a PHP server:
 
 36623 www 1  760   237M 30600K *Name   6   0:14 47.27% php-cgi
 36638 www 1  760   237M 30600K *Name   3   0:14 46.97% php-cgi
 36628 www 1 1050   237M 30600K *Name   2   0:14 46.88% php-cgi
 36627 www 1 1050   237M 30600K *Name   0   0:14 46.78% php-cgi
 36639 www 1 1050   237M 30600K *Name   5   0:14 46.58% php-cgi
 36643 www 1 1050   237M 30600K *Name   7   0:14 46.39% php-cgi
 36629 www 1  760   237M 30600K *Name   1   0:14 46.39% php-cgi
 36642 www 1 1050   237M 30600K *Name   2   0:14 46.39% php-cgi
 36626 www 1 1050   237M 30600K *Name   5   0:14 46.19% php-cgi
 36654 www 1 1050   237M 30600K *Name   7   0:13 46.19% php-cgi
 36645 www 1 1050   237M 30600K *Name   1   0:14 45.75% php-cgi
 36625 www 1 1050   237M 30600K *Name   0   0:14 45.56% php-cgi
 36624 www 1 1050   237M 30600K *Name   6   0:14 45.56% php-cgi
 36630 www 1  760   237M 30600K *Name   7   0:14 45.17% php-cgi
 36631 www 1 1050   237M 30600K RUN 4   0:14 45.17% php-cgi
 36636 www 1 1050   237M 30600K *Name   3   0:14 44.87% php-cgi
 
 It looks like periodically most or all of the php-cgi processes are 
 blocked in *Name for long enough that top notices, then continue, 
 probably in a thundering herd way. From grepping inside /sys the most 
 likely suspect seems to be something in the namecache, but I can't find 
 exactly a symbol named Name or string beginning with Name that would 
 be connected to a lock.

In vfs_cache.c:

static struct rwlock cache_lock;
RW_SYSINIT(vfscache, &cache_lock, "Name Cache");

What are the php scripts doing?  Do they all try to create and delete files at 
the same time (or do renames)?

-- 
John Baldwin


Re: Divide-by-zero in loader

2011-01-28 Thread John Baldwin
On Friday, January 28, 2011 12:41:08 pm Matthew Fleming wrote:
 I spent a few days chasing down a bug and I'm wondering if a loader
 change would be appropriate.
 
 So we have these new front-panel LCDs, and like everything these days
 it's a SoC.  Normally it presents to FreeBSD as a USB communications
 device (ucom), but when the SoC is sitting in its own boot loader, it
 presents as storage (umass).  If the box is rebooted in this state,
 the reboot gets into /boot/loader and then reboots itself.  (It took a
 few days just to figure out I was getting into /boot/loader, since the
 only prompt I could definitively stop at was boot2).
 
 Anyways, I eventually debugged it to the device somehow presenting
 itself to /boot/loader with a geometry of 1024/256/0, and since od_sec
 is 0 that causes a divide-by-zero error in bd_io() while the loader is
 trying to figure out if this is GPT or MBR formatted.  We're still
 trying to figure out why the loader sees this incorrect geometry.
 
 But meanwhile, this patch fixes the issue, and I wonder if it would be
 a useful safety-belt for other devices where an incorrect geometry can
 be seen?

That's probably fine.  A sector count of zero is invalid for CHS.  However, 
we probably should not be using C/H/S at all if the device claims to support 
EDD.  We already use raw LBAs if it supports EDD, so we should probably just 
ignore C/H/S altogether in that case.
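
To illustrate why a zero sector count is fatal, here is a minimal sketch of
an LBA to C/H/S conversion with the kind of guard being discussed.  It is not
the loader's bd_io() code; struct chs and lba_to_chs() are made-up names.

```c
#include <stdint.h>

struct chs {
	uint32_t cyl;
	uint32_t head;
	uint32_t sec;	/* sectors are numbered from 1 in C/H/S */
};

/*
 * Hypothetical helper: convert an LBA to C/H/S for the given geometry.
 * Returns -1 instead of dividing when the geometry is unusable.
 */
static int
lba_to_chs(uint64_t lba, uint32_t heads, uint32_t spt, struct chs *out)
{
	if (heads == 0 || spt == 0)
		return (-1);
	out->cyl = (uint32_t)(lba / ((uint64_t)heads * spt));
	out->head = (uint32_t)((lba / spt) % heads);
	out->sec = (uint32_t)(lba % spt) + 1;
	return (0);
}
```

With the reported geometry of 1024/256/0 the guard returns an error before
any division by the sector count takes place.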

-- 
John Baldwin


Re: Divide-by-zero in loader

2011-01-28 Thread John Baldwin
On Friday, January 28, 2011 2:14:45 pm Matthew Fleming wrote:
 On Fri, Jan 28, 2011 at 11:00 AM, John Baldwin j...@freebsd.org wrote:
  On Friday, January 28, 2011 12:41:08 pm Matthew Fleming wrote:
  I spent a few days chasing down a bug and I'm wondering if a loader
  change would be appropriate.
 
  So we have these new front-panel LCDs, and like everything these days
  it's a SoC.  Normally it presents to FreeBSD as a USB communications
  device (ucom), but when the SoC is sitting in its own boot loader, it
  presents as storage (umass).  If the box is rebooted in this state,
  the reboot gets into /boot/loader and then reboots itself.  (It took a
  few days just to figure out I was getting into /boot/loader, since the
  only prompt I could definitively stop at was boot2).
 
  Anyways, I eventually debugged it to the device somehow presenting
  itself to /boot/loader with a geometry of 1024/256/0, and since od_sec
  is 0 that causes a divide-by-zero error in bd_io() while the loader is
  trying to figure out if this is GPT or MBR formatted.  We're still
  trying to figure out why the loader sees this incorrect geometry.
 
  But meanwhile, this patch fixes the issue, and I wonder if it would be
  a useful safety-belt for other devices where an incorrect geometry can
  be seen?
 
  That's probably fine.  A sector count of zero is invalid for CHS.  However,
  probably we should not even be using C/H/S at all if the device claims to
  support EDD.  We already use raw LBAs if it supports EDD, and we should
  probably just ignore C/H/S altogether if it supports EDD.
 
 This is all almost entirely outside my knowledge, but at the moment
 bd_eddprobe() requires a geometry of 1023/255/63 before it attempts to
 check if EDD can be used.  Is that check incorrect?

Well, it is very conservative in that it only uses EDD if it thinks it can't
use C/H/S.  It would be interesting to see if simply checking for a sector
count of 0 there would avoid the divide-by-zero and let your device work.

However, it might actually be useful to always use EDD if possible, esp.
EDD3 since that lets you avoid bounce buffers in the low 1MB.

 In my specific case I know there's no bootable stuff on this disk; the
 earlier layers bypassed it correctly without a problem.
 
 Thanks,
 matthew
 

-- 
John Baldwin


Re: NVIDIA (port) driver fails to create /dev/nvidactl; 8.2Prerelease

2011-01-31 Thread John Baldwin
On Friday, January 28, 2011 3:43:12 pm Duane H. Hesser wrote:
 I am attempting to replace the 'nv' X11 driver with the official
 nvidia driver from the x11/nvidia-driver port, in order to handle
 the AVCHD video files from my Canon HF S20.
 
 I have been trying for several days now, having read the nvidia
 README file in /usr/local/share and everything Google has to offer.
 
 Unfortunately devilfs is smarter and meaner than I.
 
 The 'xorg.conf' file is created by nvidia-xconfig.  The console
 output when calling 'startx' to begin the frustration is
 
 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 X.Org X Server 1.7.5
 Release Date: 2010-02-16
 X Protocol Version 11, Revision 0
 Build Operating System: FreeBSD 8.1-RELEASE i386 
 Current Operating System: FreeBSD belinda.androcles.org 8.2-PRERELEASE 
FreeBSD 8.2-PRERELEASE #3: Thu Jan 27 13:45:06 PST 2011 
r...@belinda.androcles.org:/usr/obj/usr/src/sys/BELINDA i386
 Build Date: 08 January 2011  05:52:50PM
  
 Current version of pixman: 0.18.4
 Before reporting problems, check http://wiki.x.org
 to make sure that you have the latest version.
 Markers: (--) probed, (**) from config file, (==) default setting,
 (++) from command line, (!!) notice, (II) informational,
 (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
 (==) Log file: /var/log/Xorg.0.log, Time: Fri Jan 28 11:32:46 2011
 (==) Using config file: /etc/X11/xorg.conf
 NVIDIA: could not open the device file /dev/nvidiactl (No such file or 
directory).
 (EE) Jan 28 11:32:46 NVIDIA(0): Failed to initialize the NVIDIA kernel 
module. Please see the
 (EE) Jan 28 11:32:46 NVIDIA(0): system's kernel log for additional error 
messages and
 (EE) Jan 28 11:32:46 NVIDIA(0): consult the NVIDIA README for details.
 (EE) NVIDIA(0):  *** Aborting ***
 (EE) Screen(s) found, but none have a usable configuration.
 
 Fatal server error:
 no screens found

You don't have an nvidia0 device attached to vgapci0.  I would suggest adding 
printfs to the nvidia driver's probe routine to find out why it failed to 
probe.

-- 
John Baldwin


Re: weird characters in top(1) output

2011-02-01 Thread John Baldwin
On Tuesday, February 01, 2011 8:11:54 am Alexander Best wrote:
 On Tue Feb  1 11, Sergey Kandaurov wrote:
  On 1 February 2011 15:24, Alexander Best arun...@freebsd.org wrote:
   hi there,
  
   i was doing the following:
  
   top inf > ~/output
  
   when i noticed that this was missing the overall statistics line. so i 
   went
   ahead and did:
  
   top -d2 inf > ~/output
  
   funny thing is that for the second output some weird characters seem to 
   get
   spammed into the overall statistics line:
  
   last pid: 14320;  load averages:  0.42,  0.44,  0.37  up 1+14:02:02
   13:21:05
   249 processes: 1 running, 248 sleeping
   CPU: ^[[3;6H 7.8% user,  0.0% nice, 10.6% system,  0.6% interrupt, 81.0% 
   idle
   Mem: 1271M Active, 205M Inact, 402M Wired, 67M Cache, 212M Buf, 18M Free
   Swap: 18G Total, 782M Used, 17G Free, 4% Inuse
  
   this only seems to happen when i redirect the top(1) output to a file. if 
   i do:
  
   top -d2 inf
  
   ...everything works fine. i verified the issue under zsh(1) and sh(1).
  
  My quick check shows that this is a regression between 7.2 and 7.3.
  Reverting r196382 fixes this bug for me.
 
 thanks for the help. indeed reverting r196382 fixes the issue.

Hmm, you need more than 10 CPUs to understand the reason for that fix.
Without it all of the updated per-CPU states are off by one column so you
get weird screen effects.  The garbage characters are actually just a
terminal sequence to move the cursor.  top uses these things a _lot_ to
move the cursor around.

You can try this instead though, it figures out the appropriate number of
spaces rather than using Move_to() for these two routines:

Index: display.c
===
--- display.c   (revision 218032)
+++ display.c   (working copy)
@@ -447,12 +447,14 @@
 /* print tag and bump lastline */
 if (num_cpus == 1)
 	printf("\nCPU: ");
-else
-	printf("\nCPU %d: ", cpu);
+else {
+	value = printf("\nCPU %d: ", cpu);
+	while (value++ <= cpustates_column)
+		printf(" ");
+}
 lastline++;
 
 /* now walk thru the names and print the line */
-Move_to(cpustates_column, y_cpustates + cpu);
 while ((thisname = *names++) != NULL)
 {
if (*thisname != '\0')
@@ -532,7 +534,7 @@
 register char **names;
 register char *thisname;
 register int *lp;
-int cpu;
+int cpu, value;
 
 for (cpu = 0; cpu < num_cpus; cpu++) {
 names = cpustate_names;
@@ -540,11 +542,13 @@
 /* show tag and bump lastline */
 if (num_cpus == 1)
 	printf("\nCPU: ");
-else
-	printf("\nCPU %d: ", cpu);
+else {
+	value = printf("\nCPU %d: ", cpu);
+	while (value++ <= cpustates_column)
+		printf(" ");
+}
 lastline++;
 
-Move_to(cpustates_column, y_cpustates + cpu);
 while ((thisname = *names++) != NULL)
 {
if (*thisname != '\0')
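
The padding idea is easy to try outside of top(1).  This is a minimal sketch
that formats into a buffer rather than printing; format_cpu_tag() is a
made-up name, not a function in top, and unlike the patch it does not emit
the leading newline, so it pads until the tag is exactly 'column' characters
long.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "CPU <n>: " into buf and pad it with spaces
 * up to 'column' characters, so no cursor-movement escape sequence is
 * needed.  Returns the resulting length, or -1 if the tag does not fit.
 */
static int
format_cpu_tag(char *buf, size_t len, int cpu, int column)
{
	int n;

	n = snprintf(buf, len, "CPU %d: ", cpu);
	if (n < 0 || (size_t)n >= len)
		return (-1);
	/* Pad with plain spaces; keep one byte free for the terminator. */
	while (n < column && (size_t)n + 1 < len)
		buf[n++] = ' ';
	buf[n] = '\0';
	return (n);
}
```

Because only ordinary characters are written, output redirected to a file
contains no escape sequences, and tags for CPU numbers with one or two digits
still end in the same column.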

-- 
John Baldwin


Re: Strange problems in the old libc malloc routines

2011-02-02 Thread John Baldwin
On Wednesday, February 02, 2011 01:04:15 pm Andrew Duane wrote:
 We are still using the FreeBSD 6 malloc routines, and are rather suddenly
 having a large number of problems with one or two of our programs. Before
 I dig into the 100+ crash dumps I have, I thought I'd see if anyone else
 has ever encountered this.
 
 The problems all seem to stem from some case of malloc returning the
 pointer 1 instead of either NULL or a valid pointer. Always exactly 1.
 Where this goes bad depends on where it happens (in the program or inside
 malloc itself), but that pointer value of 1 is always involved. Some of
 the structures like page_dir look corrupted too. It seems as if maybe the
 1 is coming from sbrk(0) which is just returning the value of curbrk
 (which is correct, and not even close to 1).

Could it be related to calls to malloc(0) perhaps?  phkmalloc uses a constant 
for those that defaults to the last byte in a page (e.g. 4095 on x86).  I'm 
not sure what platform you are using malloc on, but is it possible that you 
have ZEROSIZEPTR set to 1 somehow?  Even so, if that is true free() should 
just ignore that pointer and not corrupt its internal state.
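
When going through the crash dumps it may help to flag pointers that look
like a small sentinel rather than a real allocation.  This is a minimal
sketch under that assumption; is_low_sentinel() is a made-up helper and is
not part of phkmalloc.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if p is non-NULL but lies inside the
 * first page of the address space, where no real allocation is expected
 * to live, and 0 otherwise.
 */
static int
is_low_sentinel(const void *p, size_t pagesize)
{
	uintptr_t v;

	v = (uintptr_t)p;
	return (v != 0 && v < pagesize);
}
```

A pointer value of exactly 1 is flagged by this check, as is any other
constant below the page size, while NULL and ordinary heap addresses are not.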

-- 
John Baldwin


Re: Analyzing wired memory?

2011-02-08 Thread John Baldwin
()
ff80798ea000 - ff8079949000 kmem_alloc_nofault() (kstack/mapdev)
ff8079949000 - ff807994a000 kmem_alloc() / contigmalloc()
ff807994a000 - ff807994b000 object 0xff0060568af8
ff807994b000 - ff8079969000 kmem_alloc_nofault() (kstack/mapdev)
ff8079969000 - ff807996b000 
ff807996b000 - ff80799b kmem_alloc() / contigmalloc()
ff80799b - ff80799b1000 object 0xff00606caca8
ff80799b1000 - ff80799b2000 object 0xff00606caca8
ff80799b2000 - ff80799b6000 kmem_alloc() / contigmalloc()
ff80799b6000 - ff80799b7000 object 0xff0060568af8
ff80799b7000 - ff80799b8000 object 0xff0060568af8
ff80799b8000 - ff8079cbc000 kmem_alloc() / contigmalloc()
ff8079cbc000 - ff807aa0e000 kmem_alloc_nofault() (kstack/mapdev)
ff807aa0e000 - 8000 
8000 - 808164e8 text/data/bss
808164e8 - 81822000 bootstrap data

(The various objects inserted directly into the kernel_map are likely from
the nvidia driver.)

The 'kvm' command in my gdb script is mostly MI, but some bits are MD such as
the code to handle the 'AP stacks' region.

-- 
John Baldwin


Re: CFR: FEATURE macros for AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/PMC/SYSV/...

2011-02-11 Thread John Baldwin
On Friday, February 11, 2011 4:30:28 am Alexander Leidinger wrote:
 Hi,
 
 during the last GSoC various FEATURE macros were added to the system.  
 Before committing them, I would like to get some review (like if macro  
 is in the correct file, and for those FEATURES where the description  
 was not taken from NOTES if the description is OK).
 
 If nobody complains, I would like to commit this in 1-2 weeks. If you  
 need more time to review, just tell me.
 
 Here is the list of affected files (for those impatient ones which do  
 not want to look at the attached patch before noticing that they are  
 not interested to look at it):

Hmm, so what is the rationale for adding FEATURE() macros?  Do we just want to 
add them for everything or do we want to add them on-demand as use cases for 
each knob arrive?  Some features can already be inferred (e.g. if KTR is 
compiled in, then the debug.ktr.mask sysctl will exist).  Also, in the case of 
KTR, I'm not sure that any userland programs need to alter their behavior 
based on whether or not that feature was present.
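
As a rough sketch of what I mean by "inferred": a userland program can already probe for a sysctl node with a size-only sysctlbyname(3) query. The helper name sysctl_node_exists() is made up for this example, and the non-FreeBSD branch exists only so the snippet compiles elsewhere. It treats any failure as "not present", which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/*
 * Hypothetical helper: returns 1 if the named sysctl node can be
 * queried, 0 if the query fails (e.g. the node does not exist), and
 * -1 when built on a system where this sketch has no sysctlbyname().
 */
int
sysctl_node_exists(const char *name)
{
	assert(name != NULL);
#ifdef __FreeBSD__
	size_t len = 0;

	/* Size-only query: no buffer for the old value, no new value. */
	return (sysctlbyname(name, NULL, &len, NULL, 0) == 0);
#else
	return (-1);
#endif
}
```

For example, sysctl_node_exists("debug.ktr.mask") should return 1 on a kernel built with KTR and 0 otherwise, without needing a separate FEATURE() entry.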

-- 
John Baldwin

