Re: The imperfect beauty of NetBSD [Was: NetBSD vs. FreeBSD]

2010-01-05 Thread Martin Husemann
On Tue, Jan 05, 2010 at 01:24:42AM -0500, Alex Goncharov wrote:
> Can something be done about the ACPI errors popping up every (roughly)
> minute?  Will filing a PR help?

Please file one and let's hope it helps (at first glance it looks harmless
and easy to fix, but I'm in no way an ACPI expert).

Martin


Re: The imperfect beauty of NetBSD [Was: NetBSD vs. FreeBSD]

2010-01-05 Thread Martin Husemann
On Tue, Jan 05, 2010 at 01:11:23PM +0100, Joerg Sonnenberger wrote:
> Unless the polling for battery and TZ is disabled, it won't.

Do they deliver reliable data with these errors? If not, disabling sounds
like a valid option.

Martin


Re: disklabel and solaris vtoc

2010-01-05 Thread Martin Husemann
On Wed, Jan 06, 2010 at 12:01:18AM +0100, Christoph Egger wrote:
> fdisk recognizes the solaris boot partition.
> disklabel does not know solaris vtoc and hence doesn't recognize the zfs
> file system.

disklabel is the wrong tool for this.

You could add a dkscan_vtoc command (see dkscan_bsdlabel(8)). Should
be very simple to write.

Martin


Re: PR 42583: ACPI errors, HP DV6-1334US [Was: The imperfect beauty of NetBSD]

2010-01-06 Thread Martin Husemann
On Wed, Jan 06, 2010 at 08:42:12PM +, Christos Zoulas wrote:
> Try 'shutdown -p now'. I've had problems with deadlocks when using halt -p
> due to processes not dying on time and the kernel getting stuck on their
> resources (or even panicking).

Another test is:

  sysctl -w machdep.sleep_state=5

(better not do that while running multiuser).

This powers off directly, w/o running the power management handlers.

Martin


Re: Fastest dump device

2010-01-11 Thread Martin Husemann
On Mon, Jan 11, 2010 at 07:39:15AM -0800, Paul Goyette wrote:
> Wondering:  would it make sense for savecore to run asynchronously,
> rather than waiting for it to complete before continuing with the rest 
> of system startup?

Only for a dedicated dump device that is not also used for swap. You
risk losing important parts of the dump otherwise.

Martin


Re: memory leak in USB stack

2010-01-12 Thread Martin Husemann
On Tue, Jan 12, 2010 at 05:24:46PM +0100, Manuel Bouyer wrote:
> Comments ?

Good catch, patch looks ok to me.

Martin


Yet another direct config proposal for i2c busses

2010-02-03 Thread Martin Husemann
Folks,

we still have to solve the problem of "scanning" i2c busses, especially
on machines where no scan is needed since the firmware happily tells
us everything we might want to know.

In the past (as far as I remember) two proposals were presented and
both shot down. Now I have a machine where I really needed the i2c
based fan control to make the noise bearable - so I put on my asbestos
suit and wrote another implementation, which I'd like to present here.

The previous proposals:

 - use the OpenBSD way: an optional "scan" callback provided by the
   i2c controller driver. Downsides: needs changes to ~every i2c controller
   driver in-tree
 - the macppc way: see macppc/dev/kiic* - basically a slightly different
   bus, needs frontend/backend split of i2c device drivers and a lot
   of additional frontends to be written

I may misremember details and the criticism raised against either of those.

Goals I tried to achieve:

 - Allow both direct and indirect config at the i2c bus layer,
   depending on availability of firmware provided locators
 - Allow unmodified i2c device drivers to continue working
 - Keep MD changes as simple and small as possible
 - No changes to MI i2c bus controllers
 - Allow MD i2c bus controllers to easily override the generic
   behaviour (i.e.: provide additional locators or modify firmware
   provided ones)

Seems like it worked out, and the changes are pretty small in the end.

Quick overview how it works:

 - If we are doing direct config, MD code (via generic support routines,
   or by overriding those) adds a prop_array to the device properties
   of the i2c bus controller (the parent(!) of the i2c bus). This array
   contains a dictionary for each i2c device on the bus. Entries in
   this dictionary are:

     "name"          -> string, device name
     "address"       -> uint32, i2c address
     "size"          -> uint32 (optional)
     "compatibility" -> a list of names, i.e. the chip used, used for
                        matching a hardware driver (think: alternative
                        "name" props)

 - When the i2c bus attaches, it queries the device properties of its
   parent device and checks the "i2c-child-devices" property (the array
   described above), and if it is present, iterates over the array
   creating i2c_attach_arg from it. To allow direct config matches,
   the i2c_attach_args structure has been extended.

   If the device property is not available, indirect config is done.

 - An i2c device driver for a proper device will need no changes, but
   for e.g. write-only devices matching based on strings can be added.
   A generic helper function to match the "compatible" string list against
   a driver specific list is provided.
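
A rough userland model of the flow above may help; it is a sketch, not the
patch. The types child_desc and attach_args and the function
build_attach_args() are invented for illustration (the real code uses
proplib and struct i2c_attach_args), and the address range scan merely
stands in for whatever indirect config does:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one dictionary in "i2c-child-devices". */
struct child_desc {
	const char *name;	/* "name" entry */
	uint32_t address;	/* "address" entry */
	const char **compat;	/* list of compatible names, may be NULL */
	int ncompat;		/* number of entries in compat */
};

/* Simplified model of the extended attach args, not the real struct. */
struct attach_args {
	uint32_t ia_addr;
	const char *ia_name;	/* NULL when indirect config is used */
	const char **ia_compat;
	int ia_ncompat;
};

/*
 * Fill out[] (capacity max) with one attach_args per child.
 * With a child list (direct config) every entry carries a name.
 * Without one (indirect config) fall back to walking an address range
 * and leave ia_name NULL, so drivers know they have to probe.
 * Returns the number of entries written.
 */
static size_t
build_attach_args(const struct child_desc *children, size_t nchildren,
    uint32_t scan_first, uint32_t scan_last,
    struct attach_args *out, size_t max)
{
	size_t n = 0;

	if (children != NULL) {
		for (size_t i = 0; i < nchildren && n < max; i++, n++) {
			out[n].ia_addr = children[i].address;
			out[n].ia_name = children[i].name;
			out[n].ia_compat = children[i].compat;
			out[n].ia_ncompat = children[i].ncompat;
		}
		return n;
	}

	for (uint32_t a = scan_first; a <= scan_last && n < max; a++, n++) {
		out[n].ia_addr = a;
		out[n].ia_name = NULL;
		out[n].ia_compat = NULL;
		out[n].ia_ncompat = 0;
	}
	return n;
}
```

The point of the model: a driver only has to look at ia_name to tell the
two cases apart, and nothing in the controller driver is involved.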

A few more details:

Let's start with setting up the device property. In the attached sparc64 MD
code the setup is done inside device_register() whenever a "iic" device
attaches and there is not yet a "i2c-child-devices" property at the parent.
This check is needed to allow MD i2c controller drivers to override
the generic behaviour. For OpenFirmware based machines, a convenience
function "of_enter_i2c_devs()" to do the device property setup is provided.

Next step: the i2c bus attaches and checks for the device property. This is
all done in the i2c bus code, no i2c controller driver needs modifications.

Last part of the puzzle: the i2c device drivers can check for the (new)
ia_name pointer in the i2c_attach_args structure to find out if direct
config is available. For example the spdmem driver does a (nasty/stupid?)
check for certain address values - which does not make any sense in the
direct config case:

@@ -164,8 +165,17 @@ spdmem_match(device_t parent, cfdata_t m
    int spd_len, spd_crc_cover;
    uint16_t crc_calc, crc_spd;
 
-   if ((ia->ia_addr & SPDMEM_ADDRMASK) != SPDMEM_ADDR)
-       return 0;
+   if (ia->ia_name) {
+       /* add other names as we find more firmware variations */
+       if (strcmp(ia->ia_name, "dimm-spd"))
+           return 0;
+   }
+
+   /* only do this lame test when not using direct config */
+   if (ia->ia_name == NULL) {
+       if ((ia->ia_addr & SPDMEM_ADDRMASK) != SPDMEM_ADDR)
+           return 0;
+   }
 
    sc.sc_tag = ia->ia_tag;
    sc.sc_addr = ia->ia_addr;

If the firmware name is not a good indicator of the driver to use, the
"compatible" list can be used, via the generic iic_compat_match()
function.
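
As an illustration only, matching a firmware provided name list against a
driver's own list could look like the following. compat_list_match() and
its signature are assumptions made for this sketch; it is not the
iic_compat_match() from the patch, and any chip names a caller passes in
are up to that caller:

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if any of the ncompat firmware provided names in fw_compat
 * appears in the NULL terminated driver list drv_names, 0 otherwise.
 */
static int
compat_list_match(const char **fw_compat, int ncompat,
    const char **drv_names)
{
	if (fw_compat == NULL || drv_names == NULL)
		return 0;

	for (int i = 0; i < ncompat; i++) {
		if (fw_compat[i] == NULL)
			continue;
		for (const char **d = drv_names; *d != NULL; d++) {
			if (strcmp(fw_compat[i], *d) == 0)
				return 1;
		}
	}
	return 0;
}
```

A match function would call this only when direct config is in use (i.e.
when ia_name is set) and keep its old probing for the indirect case.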

I have tested this on a few sparc64 machines, it works for me.
I won't mind if we decide not to use this but go with one of the older
proposals instead - but we need to move on.

Comments?

Martin
Index: dev/ofw/openfirm.h
===
RCS file: /cvsroot/src/sys/dev/ofw/openfirm.h,v
retrieving revision 1.27
diff -c -u -p -r1.27 openfirm.h
--- dev/ofw/openfirm.h  11 Nov 2009 16:56:52 -  1.27
+++ dev/ofw/openfirm.h  3 Feb 2010 19:46:14 -
@@ -115,4 +115,6 @@ boolean_t   of_to_dataprop(prop_dicti

Re: Yet another direct config proposal for i2c busses

2010-02-04 Thread Martin Husemann
On Thu, Feb 04, 2010 at 06:27:47PM +1100, matthew green wrote:
> have you tested it on anything besides sparc64 yet?

No - my non-sparc64 machines seem to be mostly i2c-less (have to double
check).

Martin


Re: Device page

2010-02-07 Thread Martin Husemann
On Sat, Feb 06, 2010 at 10:55:02PM +, Eduardo Horvath wrote:
> ISTR sparc64 used bits in the physaddr_t for frambuffer mappings.

If so it is well hidden (bus_space_map with LINEAR is used, but AFAICT that
does nothing special to the uvm page).

Martin


Re: uticom(4)

2010-02-07 Thread Martin Husemann
On Sun, Feb 07, 2010 at 12:46:02AM +, Jonathan A. Kollasch wrote:
> Oh, that is an issue.  And as the Catweasel thread shows,
> this firmware is trivially small.  Maybe this issue with
> firmload can be fixed in the future.

Maybe we need a config_defer_root() ?

Martin


Re: Cannot list a particular directory through NFS with UDP

2010-02-07 Thread Martin Husemann
On Sun, Feb 07, 2010 at 02:53:51PM +0100, Jeremie Le Hen wrote:
> FWIW, I've tried changing rsize to 1024 and I could read the entire
> tree.

I bet a tcp mount (-T option) would make it work too.
What network interface do you use on the NetBSD client and what type of
network is this (e.g. gigE with jumbo frames enabled)?

I see something similar on some networks with some machines - while other
machines just work (but I never got around to doing further diagnostics, just
made the mount use TCP and forgot about it).

Martin


Re: kthread with kpause or callout

2010-02-08 Thread Martin Husemann
On Mon, Feb 08, 2010 at 11:11:04AM +0100, Frank Wille wrote:
> - Running a kthread and calling kpause() between the polls.
> - Using a callout which reschedules itself after the poll.

The thread is quite a bit more heavyweight, but you have full freedom
to do what you want. The callout is pretty lightweight but you are
still subject to "interrupt context" rules.

I ran into a similar case recently where I didn't want full thread context
but needed it sometimes - I ended up using a rescheduled callout plus
sometimes adding a job to the sysmon workqueue (sysmon_task_queue_sched(),
this task was in envsys scope, so it was an easy way out).
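
To make that pattern concrete, here is a small userland model of a poll
that always re-arms itself and hands the occasional heavy job to a queue.
All names (job_queue, poll_tick(), slow_job(), ...) are hypothetical; in
the kernel the tick would be a callout handler and the queue would be the
sysmon task queue mentioned above:

```c
#include <stddef.h>

#define MAX_JOBS 8

typedef void (*job_fn)(void *);

/* Tiny FIFO standing in for a kernel work queue (model only). */
struct job_queue {
	job_fn fn[MAX_JOBS];
	void *arg[MAX_JOBS];
	size_t head, count;
};

/* Queue a job; returns 0 on success, -1 if the queue is full. */
static int
job_queue_sched(struct job_queue *q, job_fn fn, void *arg)
{
	size_t slot;

	if (q->count == MAX_JOBS)
		return -1;
	slot = (q->head + q->count) % MAX_JOBS;
	q->fn[slot] = fn;
	q->arg[slot] = arg;
	q->count++;
	return 0;
}

/* Runs in "thread context": drains everything that was deferred. */
static size_t
job_queue_run(struct job_queue *q)
{
	size_t ran = 0;

	while (q->count > 0) {
		q->fn[q->head](q->arg[q->head]);
		q->head = (q->head + 1) % MAX_JOBS;
		q->count--;
		ran++;
	}
	return ran;
}

struct poller {
	struct job_queue *queue;
	int polls;	/* cheap polls done in "callout context" */
	int slow_runs;	/* expensive jobs done in "thread context" */
};

/* The part that would be allowed to sleep in the real thing. */
static void
slow_job(void *arg)
{
	struct poller *p = arg;

	p->slow_runs++;
}

/*
 * The periodic tick: do the cheap poll, defer anything that needs
 * thread context, and tell the caller to reschedule (always 1 here).
 */
static int
poll_tick(struct poller *p, int needs_thread_context)
{
	p->polls++;
	if (needs_thread_context)
		(void)job_queue_sched(p->queue, slow_job, p);
	return 1;
}
```

The tick itself never does anything that could block; it only decides
whether a job has to be queued.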

Martin


Re: kthread with kpause or callout

2010-02-08 Thread Martin Husemann
On Mon, Feb 08, 2010 at 03:04:07PM +0100, Frank Wille wrote:
> I'm just unsure about using mutexes during the callout. I have an
> IPL_NONE-kmutex which locks register access (my chip supports several
> register banks, so I need to make sure they are not switched). May I
> acquire this mutex during a callout (which is a softint, as I understand)?
> Will the softint sleep or busy-wait?

Depends on the mutex type, from mutex(9):

   IPL_NONE, or one of the IPL_SOFT* constants

 An adaptive mutex will be returned.  Adaptive mutexes provide
 mutual exclusion between LWPs, and between LWPs and soft
 interrupt handlers.

 Adaptive mutexes cannot be acquired from a hardware interrupt
 handler.  An LWP may either sleep or busy-wait when attempt-
 ing to acquire an adaptive mutex that is already held.

   IPL_VM, IPL_SCHED, IPL_HIGH

 A spin mutex will be returned.  Spin mutexes provide mutual
 exclusion between LWPs, and between LWPs and interrupt han-
 dlers.


The wording is not explicit, but a softint is not allowed to block on
an adaptive mutex, you need a spin mutex for that (usually such mutexes
are used by interrupt handlers, so you have a "natural" IPL to use
here).

A caller will always busy wait trying to acquire a spin mutex, but it might
fall back to sleep on an adaptive mutex.
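
The mutex(9) excerpt quoted above boils down to a small table, modelled
here in plain C. The enum names and values are invented for this sketch
(the real IPL_* constants live in the kernel), and the helpers encode only
what the excerpt and the sentence above say, nothing about callouts:

```c
/* Invented stand-ins for a few of the kernel's IPL levels. */
enum model_ipl {
	MODEL_IPL_NONE,
	MODEL_IPL_SOFTCLOCK,
	MODEL_IPL_SOFTNET,
	MODEL_IPL_VM,
	MODEL_IPL_SCHED,
	MODEL_IPL_HIGH
};

enum model_mutex_type { MODEL_MUTEX_ADAPTIVE, MODEL_MUTEX_SPIN };

/* IPL_NONE and IPL_SOFT* give an adaptive mutex, the rest a spin mutex. */
static enum model_mutex_type
model_mutex_type_for_ipl(enum model_ipl ipl)
{
	switch (ipl) {
	case MODEL_IPL_NONE:
	case MODEL_IPL_SOFTCLOCK:
	case MODEL_IPL_SOFTNET:
		return MODEL_MUTEX_ADAPTIVE;
	case MODEL_IPL_VM:
	case MODEL_IPL_SCHED:
	case MODEL_IPL_HIGH:
		return MODEL_MUTEX_SPIN;
	}
	return MODEL_MUTEX_SPIN;	/* not reached for valid input */
}

/* Adaptive mutexes cannot be acquired from a hardware interrupt handler. */
static int
model_usable_from_hardintr(enum model_mutex_type t)
{
	return t == MODEL_MUTEX_SPIN;
}

/* Spin mutexes always busy-wait; an adaptive one may put the waiter to sleep. */
static int
model_waiter_may_sleep(enum model_mutex_type t)
{
	return t == MODEL_MUTEX_ADAPTIVE;
}
```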

Martin


Re: kthread with kpause or callout

2010-02-08 Thread Martin Husemann
On Mon, Feb 08, 2010 at 03:09:55PM +0100, Martin Husemann wrote:
> The wording is not explicit, but a softint is not allowed to block on

s/softint/callout/ of course, sorry for the confusion.

Martin


Re: kthread with kpause or callout

2010-02-08 Thread Martin Husemann
On Mon, Feb 08, 2010 at 03:35:35PM +0100, Frank Wille wrote:
> IMHO that would allow my callout to sleep on acquiring the mutex?

A softint can sleep, a callout can not.

Martin


Re: btuart and SOCKET Bluetooth CF

2010-02-18 Thread Martin Husemann
On Thu, Feb 18, 2010 at 11:53:50AM +, Iain Hibbert wrote:
> Yes, except C89 and C++ do not permit that, so gcc (also lint?) makes
> complaints about it. It is not "reasonable" alas.

Nitpick: C++ and C99 do allow it, gcc does not complain (at least I couldn't
get it to complain with any -std= and -Wall value I tried).

We should fix lint.

Martin


Re: Yet another direct config proposal for i2c busses

2010-02-21 Thread Martin Husemann
I had only positive feedback on the proposal so far.
Attached is an updated version of the MI parts.

I did a bit of cleanup and added a way to pass device dependent "cookies"
through to the i2c devices - I need this in a machine specific driver, and it
seems to be the simplest way to do this (actually I haven't found any other
sensible way). For machines using OpenFirmware this would be the OF node of
the i2c device, for others it could be a pointer to some ACPI table entry or
whatever.

I intend to commit this sometime later next week, it is blocking the addition
of a few drivers urgently needed to make some machines useful.

Martin
Index: dev/ofw/openfirm.h
===
RCS file: /cvsroot/src/sys/dev/ofw/openfirm.h,v
retrieving revision 1.27
diff -c -u -p -r1.27 openfirm.h
--- dev/ofw/openfirm.h  11 Nov 2009 16:56:52 -  1.27
+++ dev/ofw/openfirm.h  21 Feb 2010 13:25:53 -
@@ -115,4 +115,6 @@ boolean_t   of_to_dataprop(prop_dictionary
 int    *of_network_decode_media(int, int *, int *);
 char   *of_get_mode_string(char *, int);
 
+void   of_enter_i2c_devs(prop_dictionary_t, int);
+
 #endif /*_OPENFIRM_H_*/
Index: dev/ofw/ofw_subr.c
===
RCS file: /cvsroot/src/sys/dev/ofw/ofw_subr.c,v
retrieving revision 1.16
diff -c -u -p -r1.16 ofw_subr.c
--- dev/ofw/ofw_subr.c  21 Jan 2010 15:56:08 -  1.16
+++ dev/ofw/ofw_subr.c  21 Feb 2010 13:25:53 -
@@ -325,3 +325,60 @@ of_get_mode_string(char *buffer, int len
    strncpy(buffer, pos + 2, len);
    return buffer;
 }
+
+/*
+ * Iterate over the subtree of an i2c controller node.
+ * Add all sub-devices into an array as part of the controller's
+ * device properties.
+ * This is used by the i2c bus attach code to do direct configuration.
+ */
+void
+of_enter_i2c_devs(prop_dictionary_t props, int ofnode)
+{
+   int node, len;
+   char name[32];
+   uint64_t r64;
+   uint32_t r32;
+   uint8_t smr[24];
+   prop_array_t array;
+   prop_dictionary_t dev;
+
+   array = prop_array_create();
+
+   for (node = OF_child(ofnode); node; node = OF_peer(node)) {
+       if (OF_getprop(node, "name", name, sizeof(name)) <= 0)
+           continue;
+       len = OF_getproplen(node, "reg");
+       if (len == sizeof(r64)) {
+           if (OF_getprop(node, "reg", &r64, sizeof(r64))
+               != sizeof(r64))
+               continue;
+           r32 = r64;
+       } else if (len == sizeof(r32)) {
+           if (OF_getprop(node, "reg", &r32, sizeof(r32))
+               != sizeof(r32))
+               continue;
+       } else if (len == 24) {
+           if (OF_getprop(node, "reg", smr, sizeof(smr))
+               != sizeof(smr))
+               continue;
+           /* smbus reg property */
+           r32 = smr[7];
+       } else {
+           panic("unexpected \"reg\" size %d for \"%s\", "
+               "parent %x, node %x",
+               len, name, ofnode, node);
+       }
+
+       dev = prop_dictionary_create();
+       prop_dictionary_set_cstring(dev, "name", name);
+       prop_dictionary_set_uint32(dev, "addr", r32 >> 1);
+       prop_dictionary_set_uint64(dev, "cookie", node);
+       of_to_dataprop(dev, node, "compatible", "compatible");
+       prop_array_add(array, dev);
+       prop_object_release(dev);
+   }
+
+   prop_dictionary_set(props, "i2c-child-devices", array);
+   prop_object_release(array);
+}
Index: dev/i2c/i2cvar.h
===
RCS file: /cvsroot/src/sys/dev/i2c/i2cvar.h,v
retrieving revision 1.6
diff -c -u -p -r1.6 i2cvar.h
--- dev/i2c/i2cvar.h    9 Jul 2007 21:00:33 -   1.6
+++ dev/i2c/i2cvar.h    21 Feb 2010 13:25:53 -
@@ -118,12 +118,29 @@ struct i2c_attach_args {
    i2c_addr_t  ia_addr;    /* address of device */
    int         ia_size;    /* size (for EEPROMs) */
    int         ia_type;    /* bus type */
+   /* only set if using direct config */
+   const char *ia_name;    /* name of the device */
+   int         ia_ncompat; /* number of pointers in the
+                              ia_compat array */
+   const char **ia_compat; /* chip names */
+   /*
+    * The following is of limited usefulness and should only be used
+    * in rare cases where we really know what we are doing. Example:
+    * a machine dependent i2c driver (located in sys/arch/$arch/dev)
+    * needing to access some firmware properties.
+* Depending on the firmware in use, an identifier f

Re: apm(4) fixes

2010-03-07 Thread Martin Husemann
On Sun, Mar 07, 2010 at 11:08:12PM +0100, Manuel Bouyer wrote:
> I don't know, I didn't try then one by one.
> It would be usefull to have a list of handlers which are run, to
> ease debugging this sort of issue.

doesn't "boot -v" provide you with that sort of output?

Martin


Re: apm(4) fixes

2010-03-07 Thread Martin Husemann
On Mon, Mar 08, 2010 at 01:16:04AM +0100, Martin Husemann wrote:
> doesn't "boot -v" provide you with that sort of output?

I guess I meant -x

Martin


Re: MI overrides of bus_dma(9), bus_space(9), pci(9)

2010-03-10 Thread Martin Husemann
On Wed, Mar 10, 2010 at 09:46:03PM +0900, Masao Uebayashi wrote:
> [..] For example,
> bus_addr_t of a device instance should be taught to struct device as
> an attribute (or property or whatever you call).

What is *the* "bus_addr_t of a device instance"?

> Then we can have a
> unified way to calculate the physical address of the device, by
> summing up its parents' bus_addr_t.

I'm neither sure this could be done on all archs, nor why you would want to
do this at all.

Martin


Re: MI overrides of bus_dma(9), bus_space(9), pci(9)

2010-03-10 Thread Martin Husemann
On Wed, Mar 10, 2010 at 11:11:38PM +0900, Masao Uebayashi wrote:
> > What is *the* "bus_addr_t of a device instance"?
> 
> That depends.

On what?

> And this answer has never been answered.  This is why we have all the
> messy MD bus code like rbus...

I disagree (not on messy, MD, rbus, but on above question being unanswerable
having anything to do with it).

> > nor why you would want to do this at all.
> 
> Because NetBSD/arm (imx31) had no bus_space_mmap(9).  See

Ah, for mmap - but on architectures with multiple MMUs there might not
be a single answer, and if there is, it might not stay constant indefinitely.

Martin


Re: MI overrides of bus_dma(9), bus_space(9), pci(9)

2010-03-10 Thread Martin Husemann
On Wed, Mar 10, 2010 at 11:30:17PM +0900, Masao Uebayashi wrote:
> Details of architectures and buses.

And also on the device - there might be multiple addresses needed.

> I don't know how multiple MMUs work.

It just means that there is no global view of the VA <-> PA mapping,
especially the device itself might see completely different VAs for the
same PAs as the kernel sees.

I still don't understand how you want to derive that by summing up properties
of parent devices, and what you would do with the sum - you would still
need to create a mapping for the result (and in case of dma convert it to
the proper address the device sees, which is inherently MD).

Martin


Re: config(5) break down

2010-03-16 Thread Martin Husemann
On Tue, Mar 16, 2010 at 05:26:20AM +, David Holland wrote:
> After a fashion. Check how our LOCKDEBUG works. :-/

You mean "crawls"?

Martin


Re: uhmodem crashes netbsd-5/i386 on attach

2010-04-11 Thread Martin Husemann
On Sun, Apr 11, 2010 at 09:17:28AM +0100, Iain Hibbert wrote:
> whereas it worked fine with u3g afterwards. I don't know if some of that
> work needs to be pulled up but it might be worth disabling uhmodem and
> trying the u3g driver..

There was some fallout, but this seems to be fixed now - so pulling up u3g
and disabling uhmodem might be an option; but we can also at least work
around the crash if a full pullup is not wanted to some branches - I'll ask
releng.

Martin


Re: KASSERT() in kern_timeout.c

2010-04-19 Thread Martin Husemann
On Mon, Apr 19, 2010 at 04:22:29PM +0100, Mindaugas Rasiukevicius wrote:
> Right, except callout_schedule() should be enough.  Also, you have removed
> callout_stop() from syn_cache_rm().  In such case it is unsafe, you need
> to keep it.

s/callout_stop()/callout_halt()/ ?

Martin



Re: allocating memory during kernel startup

2010-05-07 Thread Martin Husemann
On Fri, May 07, 2010 at 09:28:31AM +, Andrew Doran wrote:
> Right, on the face of it I don't see why this is something to be handled at
> runtime.

Move consinit() past uvm_init() ;-}

Martin


Re: allocating memory during kernel startup

2010-05-07 Thread Martin Husemann
On Fri, May 07, 2010 at 03:15:55PM -0400, Michael wrote:
> We get out of cold WAY later than that. In fact kmem is ready before  
> autoconfig starts and cold is only cleared after that.

Realistically, besides all the problems we already solved by rearranging
initialization order, it is only consinit(). So you just need to take
care what you call from there (and maybe pass some flags).

Maybe rasops can do minimal configuration early and complete at driver attach
time?

Martin


Re: allocating memory during kernel startup

2010-05-07 Thread Martin Husemann
On Fri, May 07, 2010 at 08:39:51PM +0100, Mindaugas Rasiukevicius wrote:
> Well, kmem(9) is initialised very early, just after pool(9), in uvm_init(),
> where I moved it couple years ago.  Yes, 'cold' is unset very late.  Since
> allocation gets postponed anyway, is that a problem?  You did not describe
> your use case. :)

consinit(), run just before uvm_init().

It may need to initialize rasops, which will allocate memory for fonts,
screens, ...

Martin


Re: allocating memory during kernel startup

2010-05-07 Thread Martin Husemann
On Fri, May 07, 2010 at 04:05:30PM -0400, Michael wrote:
> kernel output anyway. All it needs is a sane way to decide wether it  
> can allocate memory or not.

So split init into a minimal (optional) version called from the consinit
functions, and a full grown init called at driver attach time (and have the
latter skip the parts already done by the first if it ran already).
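
A minimal userland sketch of that split, assuming a made-up console_state
structure and a static buffer for the early path (none of this is rasops
code; calloc() stands in for the kernel allocator that is not yet usable
when the early part runs):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define EARLY_COLS 80
#define EARLY_ROWS 25

struct console_state {
	bool early_done;	/* minimal init already ran */
	bool full_done;		/* full init already ran */
	int basic_setups;	/* how often the basic setup was executed */
	int cols, rows;
	char *backing;		/* early_buf or heap memory */
	bool heap_backed;
};

/* Static storage, so the early path never has to allocate. */
static char early_buf[EARLY_COLS * EARLY_ROWS];

/* Minimal init: safe to call before any allocator is available. */
static void
console_init_early(struct console_state *cs)
{
	cs->cols = EARLY_COLS;
	cs->rows = EARLY_ROWS;
	cs->backing = early_buf;
	cs->heap_backed = false;
	cs->basic_setups++;
	cs->early_done = true;
}

/*
 * Full init at attach time. Skips the basic setup if the early variant
 * already ran, then switches to an allocated buffer of the real size.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int
console_init_full(struct console_state *cs, int cols, int rows)
{
	char *buf;

	if (!cs->early_done)
		console_init_early(cs);

	buf = calloc((size_t)cols * (size_t)rows, 1);
	if (buf == NULL)
		return -1;

	cs->cols = cols;
	cs->rows = rows;
	cs->backing = buf;
	cs->heap_backed = true;
	cs->full_done = true;
	return 0;
}
```

Either order of calls ends in the same state; the early call is purely
optional, which is the property asked for above.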

Martin


Re: PAT support

2010-05-19 Thread Martin Husemann
Can we restrict the scope of this posting slightly and move it to
port-x86?

Thanks,

Martin


Re: xxxVERBOSE module?

2010-05-23 Thread Martin Husemann
On Sat, May 22, 2010 at 08:29:22PM -0700, Paul Goyette wrote:
> attempt to load from the filesystem.  I expected it to fail, but the 
> failure mode is rather ungraceful 

Maybe this would help?

Index: vfs_lookup.c
===
RCS file: /cvsroot/src/sys/kern/vfs_lookup.c,v
retrieving revision 1.121
diff -u -r1.121 vfs_lookup.c
--- vfs_lookup.c    8 Jan 2010 11:35:10 -   1.121
+++ vfs_lookup.c    23 May 2010 09:43:18 -
@@ -340,6 +340,11 @@
     * Get root directory for the translation.
     */
    cwdi = self->l_proc->p_cwdi;
+   if (__predict_false(cwdi == NULL)) {
+       PNBUF_PUT(cnp->cn_pnbuf);
+       ndp->ni_vp = NULL;
+       return ENOENT;
+   }
    rw_enter(&cwdi->cwdi_lock, RW_READER);
    state->namei_startdir = cwdi->cwdi_rdir;
    if (state->namei_startdir == NULL)

Or actually move that test up a few lines and use the existing if (error) exit.

Martin


Re: __read_mostly and __mp_friendly annotations

2010-05-23 Thread Martin Husemann
Looks good, but the __mp_friendly name is not quite obvious. Why not call
it __cacheline_aligned?
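
For illustration, this is roughly what such an annotation buys, written
with C11 alignas instead of the kernel macro. The 64 byte line size, the
MODEL_* names and the per-CPU counter are assumptions made for this
sketch, not part of the proposal:

```c
#include <stdalign.h>
#include <stddef.h>

#define MODEL_CACHE_LINE 64	/* assumed cache line size */
#define MODEL_NCPU 4		/* assumed number of CPUs */

/*
 * Aligning the member forces the whole struct to cache line alignment
 * and pads its size to a full line, so array elements never share one.
 */
struct percpu_counter {
	alignas(MODEL_CACHE_LINE) unsigned long count;
};

static struct percpu_counter counters[MODEL_NCPU];

/* Each CPU only touches its own line. */
static void
counter_inc(unsigned int cpu)
{
	counters[cpu % MODEL_NCPU].count++;
}

/* Readers sum over all per-CPU slots. */
static unsigned long
counter_sum(void)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < MODEL_NCPU; i++)
		sum += counters[i].count;
	return sum;
}
```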

Martin


Re: bus_space(9) overrides & resource reservations

2010-05-27 Thread Martin Husemann
What are the arguments to bus_space_tag_create()?

I'm looking for a flag to tell it the "bus endianness" of the resulting tag,
as that would help to sort out an abstraction violation in SBUS <-> pcmcia
adapters. Support for that would be optional, of course.

Martin


Re: bus_space(9) overrides & resource reservations

2010-05-27 Thread Martin Husemann
On Thu, May 27, 2010 at 10:18:53PM +0900, Izumi Tsutsui wrote:
> The real "create" function, especially for access primitives,
> should stay in MD implementation.
> (See atari/dev/if_ne_mb.c for awful examples)

The example I had in mind is sys/dev/sbus/stp4020.c, see for example
lines 342ff (for sparc specific modifications of the opaque bus_space
handle/tag) and line 859ff (for a sparc64 specific one).

But when reading this again, I found other "MD" quirks in there, and it's
probably ok since SBUS is not really an MI bus.


Martin


Re: bus_space(9) overrides & resource reservations

2010-05-27 Thread Martin Husemann
On Thu, May 27, 2010 at 12:06:53PM -0500, David Young wrote:
> You could also override bus_space_map() or whichever routine is most
> suitable.

Yes, and for the sparc case it would work, but sparc64 can do it a
lot more elegantly, and I'm still looking for a way to request the MD
bus_space_* implementation to provide this trick to MI code (in stp4020.c
these are the places where the ASI used for bus access is replaced by
its little endian equivalent).

However, this is a tangent we don't need to follow in this thread (if a
solution for this doesn't pop up by the suggested changes "by luck").

Martin


Re: MI commpage?

2010-06-06 Thread Martin Husemann
On Sun, Jun 06, 2010 at 01:15:54AM -0500, David Young wrote:
> Where in the kernel do I plug in an MI common page?  I mean a
> (read-only) page shared by the kernel and each user process.

Check kern/kern_lwp.c:lwp_ctl_alloc

Martin


Select broken in current?

2010-07-10 Thread Martin Husemann
I just updated my machine to latest -current as of yesterday and noticed
mpg123 from pkgsrc stopped working. It uses a child process to buffer audio
output (19007 in the ktrace below) and when that is ready, it tells the 
parent to proceed with the decoding (the writing of 1 byte). The parent
(18830), however, returns from the select and immediately goes into error
mode, cleans up and exits - w/o even looking at the 1 byte the buffer process
has sent:

 19007  1 mpg123   CALL  write(4,0xb738,1)
 19007  1 mpg123   GIO   fd 4 wrote 1 bytes
   "\^B"
 19007  1 mpg123   RET   write 1, -18632/0xb738
 18830  1 mpg123   RET   __select50 1, -18632/0xb738
 18830  1 mpg123   CALL  write(2,0xae00,0x37)
 18830  1 mpg123   GIO   fd 2 wrote 55 bytes
   "[audio.c:538] error: Buffer process didn't initialize!\n"

I haven't recompiled the mpg123 pkg yet (not sure that would help, and don't
want to kill this testing environment).
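
For anyone who wants to poke at this without mpg123, the handshake
described above reduces to something like the following standalone test.
This is my reading of the ktrace, not mpg123's code; the function name
wait_for_child_byte() is made up:

```c
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that writes one status byte to a pipe, wait for it with
 * select(2), then read the byte. Returns the byte (0..255), or -1 if
 * anything along the way fails.
 */
static int
wait_for_child_byte(unsigned char status)
{
	int fds[2];
	pid_t pid;
	fd_set rfds;
	unsigned char byte;
	int result = -1;

	if (pipe(fds) == -1)
		return -1;

	pid = fork();
	if (pid == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		/* child: the "buffer process", reports that it is ready */
		close(fds[0]);
		_exit(write(fds[1], &status, 1) == 1 ? 0 : 1);
	}

	/* parent: block until the pipe is readable, then fetch the byte */
	close(fds[1]);
	FD_ZERO(&rfds);
	FD_SET(fds[0], &rfds);
	if (select(fds[0] + 1, &rfds, NULL, NULL, NULL) == 1 &&
	    FD_ISSET(fds[0], &rfds) &&
	    read(fds[0], &byte, 1) == 1)
		result = byte;

	close(fds[0]);
	(void)waitpid(pid, NULL, 0);
	return result;
}
```

Calling it with 2 mirrors the "\^B" byte in the trace; a return of -1 on
an affected kernel would point at select rather than at mpg123.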

Any ideas?

Martin


Re: RFC: device flavours

2010-07-27 Thread Martin Husemann
On Tue, Jul 27, 2010 at 01:56:23AM +, Quentin Garnier wrote:
> "For free" is a subjective thing.  I don't think using device_register()
> --which is a MD callback--to pass information between two MI drivers is
> free.

Well, using an MD callback to attach MD information from ACPI somehow
makes sense to me.

FWIW, what David outlines is pretty close to the way i2c direct config
works in -current.

Martin


Re: RFC: device flavours

2010-07-28 Thread Martin Husemann
On Wed, Jul 28, 2010 at 04:41:20PM +, Quentin Garnier wrote:
> [..]  My intent was to provide a new tool, and to show an
> example of a situation where said tool is less of a kludge, more module-
> friendly and so on.

Sorry to be one of the folks who don't get it, but your new tool looks
very weird to me, and I still fail to see how it is an advantage in your
concrete example. Now I trust you on technical designs and especially
autoconf(9), so I'm sure it is my fault to be blind here.

Going to re-read your proposal from scratch...

Martin


Re: Fixes for kern/40018 -- any chance of getting these pulled into the -current and 5.x trees?

2010-10-06 Thread Martin Husemann
I'm testing the patch in -current with this hardware:

bge0 at pci2 dev 0 function 0: Broadcom BCM57780 Fast Ethernet
bge0: interrupting at ioapic0 pin 16
adjust device control 0x192000 -> 0x195000
bge0: ASIC BCM57780 A1 (0x57780001), Ethernet address 00:26:2d:90:46:d1
bge0: setting short Tx thresholds
ukphy0 at bge0 phy 1: OUI 0x001be9, model 0x0019, rev. 1
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

and:

bge0 at pci0 dev 2 function 0: Broadcom BCM5704C Dual Gigabit Ethernet
bge0: interrupting at ivec 37c8
bge0: ASIC BCM5704 A3 (0x2003), Ethernet address 00:03:ba:45:d5:ed
brgphy0 at bge0 phy 1: BCM5704 1000BASE-T media interface, rev. 0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

I found no problems so far.

Martin


Re: SLIP coexisting with serial data?

2010-10-10 Thread Martin Husemann
I wonder if you would need this kind of console even for ddb work. If not,
it probably would be simple to grab (the local) console in userland and serve
it via a special telnetd (or sshd, or whatever) variant.

I.e. you get pure serial console until userland takes over and attaches
slip (or ppp, or what have you) and then continue to use console via telnet.

If you need console for ddb, things get messy.

Martin


Re: acpivga(4) v. MI display controls

2010-10-15 Thread Martin Husemann
On Fri, Oct 15, 2010 at 08:26:34AM +0300, Jukka Ruohonen wrote:
> This was discussed during the development process.

Where?

Martin


Re: acpivga(4) v. MI display controls

2010-10-16 Thread Martin Husemann
On Sat, Oct 16, 2010 at 11:28:33AM +0300, Jukka Ruohonen wrote:
> I do not know OF well, but my impression is that it is much, much less
> invasive than what we have nowadays on x86 where close interaction between
> the firmware and drivers are expected.

Indeed. Quentin once tried to explain to me why autoconfig is so much harder
with ACPI than with OF, but I failed to get the big picture.

The main difference that I understood seems to be what you call virtual
and natural device trees: in OF world we guide the whole autoconfig tree
along the OF device tree, with differences close to the leaves (i.e. the
scsibus der Mouse mentioned). At every point during autoconfig we can
make sure to have enough OF information already available during the
device_register() call. The only problem we ran into so far, IIRC, is the
id of FC disks for boot device detection, but we worked around that pretty
easily.

I don't think the auto-config time and in/out distinction you draw
really is that relevant. With OF we still can call firmware methods any
time later, and we could take callbacks (though I don't think there are
any relevant ones). ACPI seems to do more in that area, but I fail to see the
fundamental problem, assuming you manage to get ACPI device tree traversal
and autoconfig tree building "synchronized" somehow (i.e. have all needed
ACPI information available for device_register()).

Martin


Re: kernel module loading vs securelevel

2010-10-16 Thread Martin Husemann
On Sat, Oct 16, 2010 at 12:35:02PM +0900, Izumi Tsutsui wrote:
> Hmm, what do you think about this feature?
> Only available in INSECURE environment?

I think it makes sense once we have lots of device drivers as modules.
Boot minimal kernel, autoload all needed device drivers, lock system
state. At least in configurations where you want to lock it.

Martin


Re: acpivga(4) v. MI display controls

2010-10-17 Thread Martin Husemann
On Sat, Oct 16, 2010 at 05:45:51PM -0500, David Young wrote:
> I think that ACPI should definitely guide autoconfiguration, if ACPI is
> available.

This is the other main difference to OF: on ports using OF, it is always
available. ACPI on i386 is not (yet).


Martin


Re: acpivga(4) v. MI display controls

2010-10-17 Thread Martin Husemann
On Sun, Oct 17, 2010 at 06:47:35AM -0400, der Mouse wrote:
> For example, while I have personal experience with only the one unit,
> http://www.netbsd.org/ports/sparc/javastation.html#mrcoffe implies
> fairly strongly that some JavaStation-1s have OB and others have OF.

Yes, indeed, but the difference between OBP and OF are abstracted away in
promlib, which seems to be mostly impossible in the i386 and no-ACPI case.

Martin


Re: Fixes for kern/40018 -- any chance of getting these pulled into the -current and 5.x trees?

2010-10-29 Thread Martin Husemann
On Wed, Oct 06, 2010 at 05:04:06PM +0200, Martin Husemann wrote:
> I'm testing the patch in -current with these hardware:
> 
> bge0 at pci2 dev 0 function 0: Broadcom BCM57780 Fast Ethernet
> bge0: interrupting at ioapic0 pin 16
> adjust device control 0x192000 -> 0x195000
> bge0: ASIC BCM57780 A1 (0x57780001), Ethernet address 00:26:2d:90:46:d1
> bge0: setting short Tx thresholds
> ukphy0 at bge0 phy 1: OUI 0x001be9, model 0x0019, rev. 1
> ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> 1000baseT-FDX, auto

[...]

> I found no problems so far.

Bah, after quite some testing I now got into situations where heavy
NFS traffic made the system hang - dropping all the offloading options
made it work again. This is on a notebook, and I don't regularly use
the server where it happens - no idea why the same traffic to a similar
machine works fine.

Martin


Re: Fixes for kern/40018 -- any chance of getting these pulled into the -current and 5.x trees?

2010-10-30 Thread Martin Husemann
On Fri, Oct 29, 2010 at 04:32:28PM -0700, Brian Buhrow wrote:
>   Hello.  Is this something you can reproduce easily?  I wonder if the
> ethernet chip is just hanging, and the rest of the machine is fine?

The machine was fine otherwise. I didn't have time last night to experiment,
will reproduce and narrow it down, but this will take a few days...

I'm not yet sure what details are important to reproduce it, I was surprised
to hit this state after a few weeks w/o problems - but let me double check
that I used the patched kernel at all before we continue to worry.

Martin


Re: RFC: ppath(3): property list paths library

2010-11-03 Thread Martin Husemann
Let me play devil's advocate for a minute:

If we create a library with such a weird API that we need another library
to make use of that library "easy" - maybe we are abusing that library or
we should reconsider its API?

This is one of the occasions where I would love to use C++ and templates
in the kernel ;-}

Martin


Re: Fixes for kern/40018 -- any chance of getting these pulled into the -current and 5.x trees?

2010-11-10 Thread Martin Husemann
Sorry, have been travelling too much lately. I verified the kernel that
showed the problem did indeed have the patch applied, but didn't get around
to doing any further diagnostics yet.

Martin


Re: Fixes for kern/40018 -- any chance of getting these pulled into the -current and 5.x trees?

2010-11-10 Thread Martin Husemann
I just checked that at least my problem with the patch is NOT a regression:
it fails the same way w/o the patch.

So: something else to investigate, no reason to delay this.

Martin


Re: mutexes, locks and so on...

2010-11-12 Thread Martin Husemann
On Fri, Nov 12, 2010 at 02:35:59PM +0100, Johnny Billquist wrote:
> What I am observing is how slow NetBSD have become, which is very
> obvious on a platform like the VAX. Running something like Ultrix
> runs circles around NetBSD nowadays. And I'm trying to find where
> all the CPU cycles are going, and mutexes and locks are one place
> I've decided to focus on at the moment. Other suggestions are
> welcome.

Did you do any benchmarks comparing -current to NetBSD-4 yet?
It would be interesting to see numbers.

Martin


Re: Heads up: moving some uvmexp stat to being per-cpu

2010-12-15 Thread Martin Husemann
I have one stupid question: why can't we leave the size of the counters
at 32bit on a per arch basis?

At a quick glance the sparc code looked v9 only, so will need some work.

Martin


Re: cngetc and watchdogs

2010-12-21 Thread Martin Husemann
Will this allow time to proceed while at the ddb prompt?
I considered using the feature to keep the fan control loop active on
a SB1000 while in ddb (the alternative is to make fans run full speed on
ddb entry, which is a real nuisance if you are anywhere near the machine).

Martin


Bogus KASSERT() in LFS?

2011-01-05 Thread Martin Husemann
Disclaimer: I know nothing about LFS, but it seems to me that there is no
guarantee for "curpg" to not be NULL in the following code from
src/sys/ufs/lfs/lfs_vnops.c:

	while (by_list || soff < MIN(blkeof, endoffset)) {
		if (by_list) {
			/*
			 * Find the first page in a block.  Skip
			 * blocks outside our area of interest or beyond
			 * the end of file.
			 */
			KASSERT(curpg == NULL ||
			    (curpg->flags & PG_MARKER) == 0);


and actually some ATF tests die for me with SIGSEGV inside the KASSERT.
So, would this patch be ok?

Index: lfs_vnops.c
===
RCS file: /cvsroot/src/sys/ufs/lfs/lfs_vnops.c,v
retrieving revision 1.233
diff -u -r1.233 lfs_vnops.c
--- lfs_vnops.c 2 Jan 2011 05:09:32 -   1.233
+++ lfs_vnops.c 5 Jan 2011 15:07:00 -
@@ -1860,7 +1860,8 @@
 * blocks outside our area of interest or beyond
 * the end of file.
 */
-   KASSERT((curpg->flags & PG_MARKER) == 0);
+   KASSERT(curpg == NULL ||
+   (curpg->flags & PG_MARKER) == 0);
if (pages_per_block > 1) {
while (curpg &&
((curpg->offset & fs->lfs_bmask) ||



Martin


Re: Bogus KASSERT() in LFS?

2011-01-05 Thread Martin Husemann
On Wed, Jan 05, 2011 at 04:25:09PM +, Eduardo Horvath wrote:
> I think you're right.  While I'm pretty sure that curpg won't be NULL on 
> the first iteration, I think it can be NULL on subsequent iterations.  I'd 
> commit that change.

It shouldn't get there on subsequent iterations if it pulled a NULL out
of the TAILQ because it explicitly breaks out of the loop in that case.

Why do you think it can't happen initially?

Martin


Re: Bogus KASSERT() in LFS?

2011-01-05 Thread Martin Husemann
On Wed, Jan 05, 2011 at 05:03:15PM +, Eduardo Horvath wrote:
> If by_list is set we'll always get here, and I don't think we'd be called 
> if the vnode had no pages at all

Ok, I'll add a KASSERT() to check for that and re-run the tests.

Martin


Re: Bogus KASSERT() in LFS?

2011-01-05 Thread Martin Husemann
On Wed, Jan 05, 2011 at 06:06:17PM +0100, Martin Husemann wrote:
> Ok, I'll add a KASSERT() to check for that and re-run the tests.

Indeed, it never happens on first entry to the loop, but via the goto top
later.

I'll commit the original patch. Unfortunately I run into locking issues
later, so this still does not fix the full test.
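
For anyone following along, here is a minimal userland sketch of why the
patched assertion is safe on the "goto top" path. It uses a stand-in
struct page, a made-up PG_MARKER value and assert() in place of KASSERT();
none of this is the real LFS code. The only point is that || short-circuits,
so curpg is not dereferenced when it is NULL:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's page structure and PG_MARKER. */
#define PG_MARKER 0x01

struct page {
	unsigned flags;
};

/*
 * Condition from the patch: either there is no current page, or the
 * current page is not a marker.  The left operand is evaluated first,
 * so a NULL curpg never reaches the flags access.
 */
int
curpg_ok(const struct page *curpg)
{
	return curpg == NULL || (curpg->flags & PG_MARKER) == 0;
}

/* Userland analogue of the KASSERT() in the loop. */
void
check_curpg(const struct page *curpg)
{
	assert(curpg_ok(curpg));
}
```

With the unpatched condition, a call equivalent to check_curpg(NULL) would
dereference a NULL pointer inside the assertion itself, which fits the
SIGSEGV seen in the ATF run.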

Martin


Re: Bogus KASSERT() in LFS?

2011-01-05 Thread Martin Husemann
On Wed, Jan 05, 2011 at 07:35:53PM +, Eduardo Horvath wrote:
> Really?  Last time I tried (about a month or two ago) I wasn't able to 
> hang LFS.  OTOH, looks like there's been quite some churn since then.  
> 
> What's your setup and what tests are you running?

I run src/tests/fs/vfs/t_full with argument "lfs_fillfs", unfortunately
gdb doesn't like me:

(gdb) run lfs_fillfs
Starting program: /usr/obj/tests/fs/vfs/t_full lfs_fillfs
Segment size 1048576 is too large; trying smaller sizes.
WARNING: the log-structured file system is experimental
WARNING: it may cause system crashes and/or corrupt data
lfs_cleanerd[5658]: /mnt: attaching cleaner
lfs_cleanerd[5658]: /mnt: detaching cleaner
panic: rumpuser fatal failure 11 (Resource deadlock avoided)

Program received signal SIGABRT, Aborted.
0x42a09720 in ?? ()
(gdb) bt
#0  0x42a09720 in ?? ()
#1  0x42a09720 in ?? ()
Previous frame identical to this frame (corrupt stack?)

On a live kernel this probably would be a "locking against myself". Have
you tried filling lfs with a LOCKDEBUG kernel recently?

Martin


Re: modules_enabled in kernel ELF note section

2011-01-12 Thread Martin Husemann
On Wed, Jan 12, 2011 at 01:59:42PM +1100, matthew green wrote:
> modular kernels don't *have* to have modules loaded via the boot
> loader.  so what i think you're really after is a flag that says
> "we want to try to load modules in the loader".

We could have a note section that makes it easy for the boot loader to
find out (a) whether the kernel is modular and (b) the list of
modules it has built in.

The loader could then drop any module requested in its configuration file
if it is already part of the kernel, or drop the whole list if the kernel
is non-modular.

If no module list is configured, the boot loader could also construct one
automatically, containing as its only item the file system it found the
kernel on.
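
A rough sketch of the filtering step described above, assuming the loader
has already parsed the note section into a "modular" flag and a list of
built-in module names. The function name and the in-place filtering are
made up for illustration; this is not existing boot loader code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Filter the configured module list in place and return how many
 * entries remain.  A non-modular kernel drops the whole list; a
 * modular one drops only the modules it already has built in.
 */
size_t
filter_modules(const char **requested, size_t nreq,
    const char *const *builtin, size_t nbuiltin, int kernel_is_modular)
{
	size_t kept = 0;

	assert(requested != NULL || nreq == 0);
	assert(builtin != NULL || nbuiltin == 0);

	if (!kernel_is_modular)
		return 0;

	for (size_t i = 0; i < nreq; i++) {
		int found = 0;

		for (size_t j = 0; j < nbuiltin; j++) {
			if (strcmp(requested[i], builtin[j]) == 0) {
				found = 1;
				break;
			}
		}
		if (!found)
			requested[kept++] = requested[i];
	}
	return kept;
}
```

The automatic list for the unconfigured case would then simply be a
one-element list passed through the same filter.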

Martin


Softfloat userland needing to properly deliver SIGFPE traps

2011-01-14 Thread Martin Husemann
After Christos recently added sigqueue and friends to -current, I tried to
use them to fix a long-standing softfloat userland problem. One variant of
the problem shows up in sparc64 atf test runs (sparc64 uses softfloat
for 128 bit long double): the userland software, and our atf tests, expect
to get proper details about the SIGFPE it caught, but softfloat only did a
raise(SIGFPE), so no siginfo is available.

The userland change looked straightforward:

Index: softfloat-specialize
===
RCS file: /cvsroot/src/lib/libc/softfloat/softfloat-specialize,v
retrieving revision 1.4
diff -c -u -p -r1.4 softfloat-specialize
--- softfloat-specialize	26 Sep 2004 21:13:27 -  1.4
+++ softfloat-specialize	14 Jan 2011 09:40:15 -
@@ -56,11 +59,26 @@ should be simply `float_exception_flags 
 fp_except float_exception_mask = 0;
 void float_raise( fp_except flags )
 {
+siginfo_t info;
 
 float_exception_flags |= flags;
 
 if ( flags & float_exception_mask ) {
-   raise( SIGFPE );
+   memset(&info, 0, sizeof info);
+   info.si_signo = SIGFPE;
+   info.si_pid = getpid();
+   info.si_uid = geteuid();
+   if (flags & float_flag_underflow)
+   info.si_code = FPE_FLTUND;
+   else if (flags & float_flag_overflow)
+   info.si_code = FPE_FLTOVF;
+   else if (flags & float_flag_divbyzero)
+   info.si_code = FPE_FLTDIV;
+   else if (flags & float_flag_invalid)
+   info.si_code = FPE_FLTINV;
+   else if (flags & float_flag_inexact)
+   info.si_code = FPE_FLTRES;
+   sigqueueinfo(getpid(), &info);
 }
 }


But: looking at the kernel code, this is not allowed. Only SI_USER and
SI_QUEUE requests are passed through. For testing, I disabled this
security check like this:
 
Index: sys_sig.c
===
RCS file: /cvsroot/src/sys/kern/sys_sig.c,v
retrieving revision 1.30
diff -c -u -p -r1.30 sys_sig.c
--- sys_sig.c   10 Jan 2011 04:39:18 -  1.30
+++ sys_sig.c   14 Jan 2011 09:40:38 -
@@ -229,14 +229,16 @@ kill1(struct lwp *l, pid_t pid, ksiginfo
if (ksi->ksi_uid != kauth_cred_geteuid(l->l_cred))
return EPERM;
 
-   switch (ksi->ksi_code) {
-   case SI_USER:
-   case SI_QUEUE:
-   break;
-   default:
-   return EPERM;
+   if (ksi->ksi_signo != SIGFPE) {
+   switch (ksi->ksi_code) {
+   case SI_USER:
+   case SI_QUEUE:
+   break;
+   default:
+   return EPERM;
+   }
}
-   
+
if (pid > 0) {
/* kill single process */
mutex_enter(proc_lock);


This change makes it work for me, but of course it is a horrible hack, not
acceptable for commit. We need to have the possibility to SI_USER/
SI_QUEUE a SIGFPE, but given the mostly union content of ksiginfo, I
don't see an obvious way how to do it properly.

I'd like to suggest a special SI_SELFSIGFPE ksi_code, which overrides
the pid/uid tests, only allows sending to the same process, and encodes
the final (kernel internal) ksi_code as ksi_signo, while passing on the
rest of ksiginfo untouched (so userland could, if possible, fill in
struct fault).

Userland call would look like this:

+   memset(&info, 0, sizeof info);
+   info.si_code = SI_SELFSIGFPE;
+   if (flags & float_flag_underflow)
+   info.si_signo = FPE_FLTUND;
+   else if (flags & float_flag_overflow)
+   info.si_signo = FPE_FLTOVF;
+   else if (flags & float_flag_divbyzero)
+   info.si_signo = FPE_FLTDIV;
+   else if (flags & float_flag_invalid)
+   info.si_signo = FPE_FLTINV;
+   else if (flags & float_flag_inexact)
+   info.si_signo = FPE_FLTRES;
+   sigqueueinfo(getpid(), &info);

and kernel would move si_signo to ksi_code, set signo to SIGFPE, and pass
the remaining siginfo on untouched.

Still a hack, but I don't have better ideas (besides creating another special
purpose syscall for this).
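
To make the flag-to-code mapping from the userland diffs easier to look at
in isolation, here is a small standalone sketch. The FLAG_* bits are
placeholders (the real float_flag_* values are defined by softfloat and
differ per architecture), and the siginfo is filled in the way the first
diff does it (si_signo = SIGFPE, si_code = FPE_*), not the SI_SELFSIGFPE
encoding. It deliberately stops short of sending anything:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Placeholder flag bits, for this sketch only. */
enum {
	FLAG_INEXACT   = 0x01,
	FLAG_UNDERFLOW = 0x02,
	FLAG_OVERFLOW  = 0x04,
	FLAG_DIVBYZERO = 0x08,
	FLAG_INVALID   = 0x10
};

/*
 * Pick one FPE_* code for a set of raised flags, testing them in the
 * same order as the patch.  Returning 0 for "no known flag" is an
 * assumption of this sketch.
 */
int
fpe_code_for(int flags)
{
	if (flags & FLAG_UNDERFLOW)
		return FPE_FLTUND;
	if (flags & FLAG_OVERFLOW)
		return FPE_FLTOVF;
	if (flags & FLAG_DIVBYZERO)
		return FPE_FLTDIV;
	if (flags & FLAG_INVALID)
		return FPE_FLTINV;
	if (flags & FLAG_INEXACT)
		return FPE_FLTRES;
	return 0;
}

/* Prepare the siginfo that float_raise() would hand to the kernel. */
void
fill_fpe_siginfo(siginfo_t *info, int flags)
{
	assert(info != NULL);
	memset(info, 0, sizeof(*info));
	info->si_signo = SIGFPE;
	info->si_code = fpe_code_for(flags);
}
```

Note that when several flags are set at once only the first match is
reported, e.g. underflow wins over inexact.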

Martin


Re: Softfloat userland needing to properly deliver SIGFPE traps

2011-01-14 Thread Martin Husemann
On Fri, Jan 14, 2011 at 07:14:43PM +0100, Matthias Drochner wrote:
> If you have some basic FPU available, you could just cause
> matching SIGFPEs using eg doubles. Eg assign
> HUGE_VAL*HUGE_VAL to a double, or divide a double by zero.
> The sun libm code does this in some cases.

That's certainly an option - but we still need to solve this generally
(and also making float_raise MD overridable is a bit of a pain due to
all the renaming and include magic, but of course it is possible).

Martin


Re: xorg pci probing

2011-01-16 Thread Martin Husemann
On Sat, Jan 15, 2011 at 10:26:13PM +0100, Christoph Egger wrote:
> 
> Hi!
> 
> I have a machine with two PCI graphic cards:
> 1x Radeon HD 4200
> 1x Radeon HD 5600
> 
> Starting X fails with the error message
> "Primary device is not PCI"
> 
> Per discussion with macallen@ I implemented
> pci_device_is_boot_vga() in libpciaccess
> which uses a new ioctl().

Can you please explain what X is trying to tell you with that message
and how "this is the vga device used as console by the os" is related to
the answer? I wonder if the question should rather be "has this vga device
been initialized by the firmware already", and maybe your patch only
happens to work on your machine by luck.

What happens if you boot your machine with a serial console (assuming for
the sake of argument it would have a serial port)?

Martin


Re: Dates in boot loaders on !x86

2011-01-16 Thread Martin Husemann
On Sun, Jan 16, 2011 at 08:59:15PM +0100, Joerg Sonnenberger wrote:
> That's what the version number is supposed to tell you.

Just thinking out loud: could we make the embedded date conditional
on some /etc/mk.conf variable, unset for standard builds?
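
Roughly what I have in mind on the C side, assuming the mk.conf variable
ends up as a -D on the compiler command line. BOOT_EMBED_DATE, the banner
text and format_banner() are all made-up names for this sketch, not
anything in the tree:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Only embed a build timestamp when explicitly asked to.  Standard
 * builds leave BOOT_EMBED_DATE undefined and get an empty stamp.
 */
#ifdef BOOT_EMBED_DATE
#define BOOT_STAMP " (built " __DATE__ " " __TIME__ ")"
#else
#define BOOT_STAMP ""
#endif

/* Format the banner into buf; returns what snprintf() returns. */
int
format_banner(char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "example bootloader 1.0%s", BOOT_STAMP);
}
```

Without the define, two builds of the same source produce the same banner
string, and the version number alone identifies the loader.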

Martin


Re: xorg pci probing

2011-01-18 Thread Martin Husemann
On Tue, Jan 18, 2011 at 09:47:29AM +0100, Christoph Egger wrote:
> In /var/log/X.org both Radeon devices are enlisted but none has been
> selected as the primary one. X then quits with the message
> "Primary device is not PCI"

I still don't understand what this means, and why it doesn't choose
one of them arbitrarily.

If you edit your config file to select a primary one, it would work?
If you select the other one as primary, would it still work?

> Per discussion with macallen@ I implemented pci_device_is_boot_vga()
> and that works.

Can you please explain why the OS console device is relevant here? You
said it is not about initialization - I don't quite get it.

> > What happens if you boot your machine with a serial console (assuming for
> > the sake of argument it would have a serial port)?
> 
> That machine is a laptop and has no serial console.

Yes, but the same situation could happen on a desktop. Assume I would
have such a machine and boot it with serial console. According to your
logic (as far as I understood it so far), I wouldn't be able to use X.
That sucks and would clearly be a serious regression.

I'm sure I'm missing something, so please give more details.

Martin


Re: xorg pci probing

2011-01-18 Thread Martin Husemann
On Tue, Jan 18, 2011 at 11:39:35AM +0100, Christoph Egger wrote:
> The function pci_device_is_boot_vga() is supposed to return 
> 'true' for the pci vga device the firmware uses which is not
> neccessarily the OS console device (it can be a serial console).

Ok. But your function answers "this is the vga device where wsdisplay0
attached" instead - is that a safe bet, especially in case you boot with
a serial console?

I still completely fail to see why X would like to know this detail and
what a wrong answer would mean. If it's just a wrong default display
but fixable via config file, it's fine.

> > Yes, but the same situation could happen on a desktop. Assume I would
> > have such a machine and boot it with serial console. According to your
> > logic (as far as I understood it so far), I wouldn't be able to use X.
> 
> Right. The error message "Primary device is not PCI" comes from

This is unacceptable. I have an amd64 machine with serial console that I
use to run X, luckily it has only a single vga card right now.

Martin


Re: xorg pci probing

2011-01-18 Thread Martin Husemann
On Tue, Jan 18, 2011 at 12:19:50PM +0100, Christoph Egger wrote:
> According to macallan@  ttyE* is the first graphic device,
> ttyF* the second, ttyG* the third, etc.

Yes, counting in NetBSD autoconfig probe order.

> So yes, it is safe.

Depends on the exact question, which has been explained out of band to
me (thanks Jared!), and a "wrong" answer is non-fatal, as well as there
will be a meaningful (but kind of arbitrary) answer in the serial
console case.

So everything is fine, nothing to see here - carry on!

Martin


Re: Dates in boot loaders on !x86

2011-01-18 Thread Martin Husemann
On Tue, Jan 18, 2011 at 03:03:24PM +, Julio Merino wrote:
> If that's the only use case, having an option to enable timestamps is 
> enough: enable it *while you are hacking* but leave it off for the 
> benefit of the rest of the world.

Yes, that was all we asked for: leave a switch in place, disable it for
all standard builds.

Martin



Re: xorg pci probing

2011-01-18 Thread Martin Husemann
To summarize what goes on: on x86 archs (and AFAICT for NetBSD only there)
the X startup code enumerates all pci buses the kernel found, searching for
vga devices. If multiple are found, it prefers the one that the new function
marks. That's basically all there is to it.

If the new function says "no" to all devices, we have the same state as right
now. This, however, will not happen, even with serial console, as the function
really only finds the first wsdisplay device's parent vga.
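
As a sketch of the selection logic as I understand it (this is my reading
of the behaviour, not the actual X server code; struct vga_dev and
pick_primary() are invented for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* One enumerated VGA device; fields are illustrative only. */
struct vga_dev {
	int bus, dev, func;	/* PCI location */
	int is_boot_vga;	/* answer of the new function for this device */
};

/*
 * Return the index of the device to use as primary, or -1 if the
 * choice cannot be made.  A single device is taken as is; with
 * several, the one marked as boot VGA is preferred.
 */
int
pick_primary(const struct vga_dev *devs, size_t n)
{
	assert(devs != NULL || n == 0);

	if (n == 1)
		return 0;
	for (size_t i = 0; i < n; i++) {
		if (devs[i].is_boot_vga)
			return (int)i;
	}
	return -1;
}
```

The -1 case corresponds to today's "Primary device is not PCI" situation
with two cards and no device marked.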

On sane ports we do not enumerate via /dev/pci, but straight open ttyE0 (i.e.
the first display, the same that the new function will mark) or ttyF0 etc for
multiple servers, and are there right away.

I would prefer to drop the stupid enumeration code on x86 too, but I'm 
definitely not going to touch that code myself - so no right to complain.

Martin


Re: Dates in boot loaders on !x86

2011-01-18 Thread Martin Husemann
On Tue, Jan 18, 2011 at 05:02:13PM +0100, Johnny Billquist wrote:
> Ok. And that is good because...?

Various, the most prominent benefits:

 - you can easily do binary patches
 - you can verify code you touched that should have no effect really did not
   have any effect
 - you can verify toolchain changes did not cause unexpected changes


Martin


Re: Softfloat userland needing to properly deliver SIGFPE traps

2011-01-22 Thread Martin Husemann
On Fri, Jan 14, 2011 at 10:06:09AM -0800, Matt Thomas wrote:
> I'm thinking the check should go away.

How about the variant below - it allows a process to deliver arbitrary
signals to itself, but only SI_USER/SI_QUEUE variants to foreign processes.

Perhaps the whole part inside the new if () block should be hidden in
kauth instead, but I don't feel like touching that mess right now.

> One thing I'm concerned about is whether this signal will directed to 
> the queueing lwp.  Seems to me that for a synthetic fault like this,
> that would need to be true.

Yes, do we have means to do that already?

Martin



Re: Softfloat userland needing to properly deliver SIGFPE traps

2011-01-22 Thread Martin Husemann
On Sat, Jan 22, 2011 at 10:11:48AM +0100, Martin Husemann wrote:
> How about the variant below - it allows a process to deliver arbitrary
> signals to itself, but only SI_USER/SI_QUEUE variants to foreign processes.

Ooops, here is the patch...

Martin

Index: sys_sig.c
===
RCS file: /cvsroot/src/sys/kern/sys_sig.c,v
retrieving revision 1.30
diff -c -u -r1.30 sys_sig.c
--- sys_sig.c   10 Jan 2011 04:39:18 -  1.30
+++ sys_sig.c   22 Jan 2011 09:08:53 -
@@ -223,20 +223,22 @@
if ((u_int)ksi->ksi_signo >= NSIG)
return EINVAL;
 
-   if (ksi->ksi_pid != l->l_proc->p_pid)
-   return EPERM;
-
-   if (ksi->ksi_uid != kauth_cred_geteuid(l->l_cred))
-   return EPERM;
-
-   switch (ksi->ksi_code) {
-   case SI_USER:
-   case SI_QUEUE:
-   break;
-   default:
-   return EPERM;
+   if (pid != l->l_proc->p_pid) {
+   if (ksi->ksi_pid != l->l_proc->p_pid)
+   return EPERM;
+
+   if (ksi->ksi_uid != kauth_cred_geteuid(l->l_cred))
+   return EPERM;
+
+   switch (ksi->ksi_code) {
+   case SI_USER:
+   case SI_QUEUE:
+   break;
+   default:
+   return EPERM;
+   }
}
-   
+
if (pid > 0) {
/* kill single process */
mutex_enter(proc_lock);


Re: Per-CPU Unit (PCU) interface

2011-01-25 Thread Martin Husemann
On Mon, Jan 24, 2011 at 03:27:58PM +, Mindaugas Rasiukevicius wrote:
> While looking at the bugs on still work-in-progress mips64 FPU code on
> Matt's branch, it occurred to me that we could abstract SMP handling
> complexity into MI interface.  Basically, all architectures are using
> similar logic for FPU handling on SMP, but each have own variations,
> confusions, and therefore each fall into the bugs.  Hence, PCU:

I can't make up my mind if this is a complication or proper abstraction.

Assuming it is only used for lazy FPU saving and an arch does not have
other PCU needs, it overall does not save a lot of work. On the other
hand it does not allow MD optimizations (obvious example are the fpu
handling IPIs on sparc64 where we do not bother to create a full C runtime
environment in the IPI handler).

> - There can be multiple PCUs, so this can be re-used not only for FPU,
> but any similar MD functionality, e.g. PowerPC AltiVec.

Are there other examples of this?

> - Once there is MI IPI support, it is ~trivial to convert the code to
> use them by: 1) splsoftclock() -> splhigh() 2) replacing xc_unicast()
> calls with cpu_send_ipi() and moving them *before* splx().

I can not parse this paragraph - and what "MI IPI support" are you talking
about? How does it differ from xcall(9)?

Martin


Re: Softfloat userland needing to properly deliver SIGFPE traps

2011-01-25 Thread Martin Husemann
Any objections?
If not, I'm going to commit this sometime later this week.

Martin


Index: sys_sig.c
===
RCS file: /cvsroot/src/sys/kern/sys_sig.c,v
retrieving revision 1.30
diff -c -u -r1.30 sys_sig.c
--- sys_sig.c   10 Jan 2011 04:39:18 -  1.30
+++ sys_sig.c   22 Jan 2011 09:08:53 -
@@ -223,20 +223,22 @@
if ((u_int)ksi->ksi_signo >= NSIG)
return EINVAL;
 
-   if (ksi->ksi_pid != l->l_proc->p_pid)
-   return EPERM;
-
-   if (ksi->ksi_uid != kauth_cred_geteuid(l->l_cred))
-   return EPERM;
-
-   switch (ksi->ksi_code) {
-   case SI_USER:
-   case SI_QUEUE:
-   break;
-   default:
-   return EPERM;
+   if (pid != l->l_proc->p_pid) {
+   if (ksi->ksi_pid != l->l_proc->p_pid)
+   return EPERM;
+
+   if (ksi->ksi_uid != kauth_cred_geteuid(l->l_cred))
+   return EPERM;
+
+   switch (ksi->ksi_code) {
+   case SI_USER:
+   case SI_QUEUE:
+   break;
+   default:
+   return EPERM;
+   }
}
-   
+
if (pid > 0) {
/* kill single process */
mutex_enter(proc_lock);


Re: mpt Serious performance issues

2011-01-28 Thread Martin Husemann
On Fri, Jan 28, 2011 at 11:14:44AM +0100, Stephan wrote:
> What is going on here? Does the struct scsipi_xfer_mode *xm come from
> the scsipi layer so it tells the driver what to do?

Yes, it is set from the target discovery code, so you should get a 1 bit
there for every target device that understands tagging.

Martin


Re: mpt Serious performance issues

2011-01-28 Thread Martin Husemann
On Fri, Jan 28, 2011 at 11:30:16AM +0100, Stephan wrote:
> The discovery code you spoke from is part of every device driver,
> isn't it? The scsipi man page tells about a XS_CTL_DISCOVERY flag in
> xs_control, which i can't find in the mpt driver. So how does e.g. the
> mpt driver tell the scsipi layer what capabilities a given device has?

It doesn't, the scsipi layer scans the bus for devices and if it finds
some it queries their mode pages (depending on supported scsi version)
and sets the ability flags.

See scsiconf.c:scsi_probe_device for details:

	/*
	 * Determine the operating mode capabilities of the device.
	 */
	if (periph->periph_version >= 2) {
		if ((inqbuf.flags3 & SID_CmdQue) != 0 &&
		    (quirks & PQUIRK_NOTAG) == 0)
			periph->periph_cap |= PERIPH_CAP_TQING;
		if ((inqbuf.flags3 & SID_Linked) != 0)
			periph->periph_cap |= PERIPH_CAP_LINKCMDS;
	...
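
If you want to play with that logic outside the kernel, a standalone
version could look like the sketch below. All bit values and the struct
layout are placeholders picked for this example; the real definitions
live in the scsipi headers and are not reproduced here:

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bit values, for this sketch only. */
#define SID_CMDQUE	0x02	/* inquiry: device supports command queueing */
#define SID_LINKED	0x08	/* inquiry: device supports linked commands */
#define QUIRK_NOTAG	0x01	/* quirk: do not use tagged queueing */
#define CAP_TQING	0x10
#define CAP_LINKCMDS	0x20

/* Reduced stand-in for the per-device state kept by the midlayer. */
struct periph {
	int version;		/* SCSI version reported by the device */
	unsigned quirks;
	unsigned cap;
};

/*
 * Derive capability flags from the inquiry data.  The driver is not
 * involved: everything comes from what the device itself reported.
 */
void
set_caps(struct periph *p, unsigned inq_flags3)
{
	assert(p != NULL);

	if (p->version >= 2) {
		if ((inq_flags3 & SID_CMDQUE) != 0 &&
		    (p->quirks & QUIRK_NOTAG) == 0)
			p->cap |= CAP_TQING;
		if ((inq_flags3 & SID_LINKED) != 0)
			p->cap |= CAP_LINKCMDS;
	}
}
```

A device that advertises queueing but has the no-tag quirk set ends up
without the tagged queueing capability, which is one thing worth checking
when tagging seems to be missing.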

Martin


Re: turning off COMPAT_386BSD_MBRPART in disklabel

2011-02-08 Thread Martin Husemann
On Mon, Feb 07, 2011 at 06:35:49PM +, David Laight wrote:
> x86/amd64 sysinst uses it own code for fdisk and installboot.
> I can't quite remember whether it runs disklabel - but it will request
> it writes a label from a temporary file...

What David says:

int
write_disklabel (void)
{

#ifdef DISKLABEL_CMD
	/* disklabel the disk */
	return run_program(RUN_DISPLAY, "%s -f /tmp/disktab %s '%s'",
	    DISKLABEL_CMD, diskdev, bsddiskname);
#else
	return 0;
#endif
}

Martin


Re: USB printing panic

2011-02-10 Thread Martin Husemann
Can you try something like this?

Martin
Index: ulpt.c
===
RCS file: /cvsroot/src/sys/dev/usb/ulpt.c,v
retrieving revision 1.85
diff -u -p -r1.85 ulpt.c
--- ulpt.c  3 Nov 2010 22:34:24 -   1.85
+++ ulpt.c  10 Feb 2011 09:33:25 -
@@ -656,7 +656,7 @@ ulptclose(dev_t dev, int flag, int mode,
 
if (sc->sc_has_callout) {
DPRINTFN(2, ("ulptclose: stopping read callout\n"));
-   callout_stop(&sc->sc_read_callout);
+   callout_halt(&sc->sc_read_callout, NULL);
sc->sc_has_callout = 0;
}
 


Re: bouyer-quota2: fsck_ffs crash

2011-03-11 Thread Martin Husemann
On Thu, Mar 10, 2011 at 07:45:20PM +0100, Manuel Bouyer wrote:
> Can you see if tests/sbin/fsck_ffs completes fine on sparc64
> (atf-run|atf-report in this directory) ?

All quota tests pass on sparc64 (see http://www.netbsd.org/~martin/sparc64-atf)

Martin


Re: sysmon_pswitch_event(): provide a sleep routine when powerd(8) is not running

2011-03-28 Thread Martin Husemann
On Mon, Mar 28, 2011 at 03:50:46PM +0300, Jukka Ruohonen wrote:
> I would go for (3), perhaps with a "-s" flag to halt(8). This would also solve
> the user interface issue that remains unresolved in options (1) and (2).
> Extending halt(8) has been discussed also previously (cf. e.g. [1]).

Don't mix it with the user interface or "halt", it is completely unrelated.

I would go for (2), since there already is "graceful shutdown" code in there,
but it needs MD extensions for this case.

Martin


Re: GSoC project idea: posix_spawn

2011-03-28 Thread Martin Husemann
On Mon, Mar 28, 2011 at 08:16:09PM +0200, Joerg Sonnenberger wrote:
> Implementing a real posix_spawn system calls could do that. Measuring
> the impact for different work loads makes a nice research paper as side
> effect. This includes both estimate the temporary memory used and the
> performance implications.
> 
> Difficulty: medium-hard
> Reference:
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_spawn.html

I like the proposal.

Martin


Re: extent-patch and overview of what is supposed to follow

2011-04-02 Thread Martin Husemann
On Sat, Apr 02, 2011 at 11:30:16AM +0200, Manuel Bouyer wrote:
> AFAIK dtrace doesn't work on non-modular kernels ...

Nor on most of our archs, and AFAICT there is not even a document 
describing the (maybe nontrivial amount of) work needed to make it
work there.

Martin


Re: diff: add show proc command to ddb

2011-04-06 Thread Martin Husemann
On Wed, Apr 06, 2011 at 02:07:18AM +0300, Vladimir Kirillov wrote:
> Hello, tech-kern@!
> 
> I really wanted a show proc command to avoid looking up process
> information by running ps with all flags and intensive scrolling.
> 
> The show proc output mostly combines the outputs of all switches
> in ps.

I like the idea. Note that some archs already have a 

   mach proc 

command (defaulting to curlwp, and the man page needs an update, it talks
about procs).

Maybe this can be integrated in the MI variant with an option?
Should the "mach" variants go away later?

Martin


Re: diff: add show proc command to ddb

2011-04-06 Thread Martin Husemann
On Wed, Apr 06, 2011 at 11:03:47PM +1000, matthew green wrote:
> since it works on lwps...can we call the MI version "show lwp "?

Ok for the address based variants:

  show lwp
   (shows curlwp)

  show lwp 0xffce
   (shows lwp at that address)

but with a pid/lid it gets "strange":
  show lwp /p 0t25 
 (show pid 25 first lwp?)


Martin


New boothowto flag to prevent raid auto-root-configuration

2011-04-17 Thread Martin Husemann
Hi folks,

as described in PR 44774 (see
http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44774), it is
currently not possible to use a standard NetBSD install CD on a system
which normally boots from raid (at least on i386, amd64 or sparc64, where
a stock GENERIC kernel is used).

To fix this, I'd like to introduce a new boothowto flag, that turns off
all magic to override the root device (for now: turns off the root part
of raidframe autoconfiguration, but could do similar things in the future
with LVM or whatever).

The patch attached does this. It will be accompanied by bootloader changes to
set this flag if a new keyword is present in /boot.cfg.
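
Reduced to a standalone sketch, the decision the raidframe code would make
looks like this. RB_NO_ROOT_OVERRIDE uses the value proposed in the patch;
struct boot_state and the function name are invented for the example:

```c
#include <assert.h>
#include <stddef.h>

#define RB_NO_ROOT_OVERRIDE	0x2000	/* value proposed in the patch */

/* The two inputs the check looks at, bundled for the sketch. */
struct boot_state {
	const char *rootspec;	/* user-specified root device, or NULL */
	int boothowto;		/* flags passed in by the boot loader */
};

/*
 * Returns 1 if raid autoconfiguration may replace the booted device
 * as root, 0 if it has to keep its hands off.
 */
int
raid_may_override_root(const struct boot_state *bs)
{
	assert(bs != NULL);

	if (bs->rootspec != NULL ||
	    (bs->boothowto & RB_NO_ROOT_OVERRIDE) != 0)
		return 0;
	return 1;
}
```

An install CD would boot with the flag set and therefore keep its own root
even if an auto-root raid set is found on the machine's disks.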

What do you think? Better naming suggestions are also welcome.

Martin

Index: sys/reboot.h
===
RCS file: /cvsroot/src/sys/sys/reboot.h,v
retrieving revision 1.25
diff -c -u -r1.25 reboot.h
--- sys/reboot.h	25 Dec 2007 18:33:48 -  1.25
+++ sys/reboot.h	18 Apr 2011 05:34:01 -
@@ -53,6 +53,9 @@
 #define	RB_STRING	0x0400	/* use provided bootstr */
 #define	RB_POWERDOWN	(RB_HALT|0x800)	/* turn power off (or at least halt) */
 #define RB_USERCONF	0x1000	/* change configured devices */
+#define	RB_NO_ROOT_OVERRIDE	0x2000	/* no automatic override of the booted
+					 * device, like raidframes auto
+					 * root configuration */
 
 /*
  * Extra autoboot flags (passed by boot prog to kernel). See also
Index: dev/raidframe/rf_netbsdkintf.c
===
RCS file: /cvsroot/src/sys/dev/raidframe/rf_netbsdkintf.c,v
retrieving revision 1.284
diff -c -u -r1.284 rf_netbsdkintf.c
--- dev/raidframe/rf_netbsdkintf.c  18 Mar 2011 23:53:26 -  1.284
+++ dev/raidframe/rf_netbsdkintf.c  18 Apr 2011 05:34:01 -
@@ -465,7 +465,7 @@
/* if the user has specified what the root device should be
   then we don't touch booted_device or boothowto... */
 
-   if (rootspec != NULL)
+   if ((rootspec != NULL) || (boothowto & RB_NO_ROOT_OVERRIDE))
return;
 
/* we found something bootable... */


Re: New boothowto flag to prevent raid auto-root-configuration

2011-04-18 Thread Martin Husemann
On Mon, Apr 18, 2011 at 01:06:23PM +0200, Klaus . Heinz wrote:
> Instead of providing RB_NO_ROOT_OVERRIDE I would prefer something that
> actually _lets_ me override everything else from boot.cfg.

Yes, I understand this wish (and it is not that hard to implement).
However, I think both are quite orthogonal and should be discussed separately.

Note that especially for the install CD setup, a rootstring passed from
boot.cfg is *not* possible, as we do not know which CD drive is used
for booting.

Martin


Re: New boothowto flag to prevent raid auto-root-configuration

2011-04-18 Thread Martin Husemann
On Mon, Apr 18, 2011 at 11:06:21PM +0900, Izumi Tsutsui wrote:
> I.e. currently only root on cd0a works on x86 GENERIC anyway.

One more bug to fix, but unrelated to this thread, isn't it?
See also PR 43012.

Martin


Re: New boothowto flag to prevent raid auto-root-configuration

2011-04-18 Thread Martin Husemann
On Mon, Apr 18, 2011 at 11:59:32PM +0900, Izumi Tsutsui wrote:
> I thought passing "root on cd0a" from boot.cfg just worked on x86..

Maybe, but I'm not talking about x86 only.

Martin


Re: New boothowto flag to prevent raid auto-root-configuration

2011-04-18 Thread Martin Husemann
On Mon, Apr 18, 2011 at 10:32:59AM -0700, Brian Buhrow wrote:
>   Hello Martin.  Doesn't boot -a  already do this by allowing you to
> select the root filesystem and the init path?  I'm certain I've booted
> systems running with raid roots off of cdroms for repair purposes.

Yes, it does, but it requires user interaction (which I do not want to
enforce on the standard install CDs).

Martin


Re: sys/dev/isa/fd.c FDUNIT/FDTYPE

2011-05-03 Thread Martin Husemann
On Tue, May 03, 2011 at 08:06:56PM +0200, Edgar Fuß wrote:
> sys/dev/isa/fd.c defines FDUNIT and FDTYPE as DIV/MOD 8.
> etc/MAKEDEV uses makedisk_p16 for fd*.
> 
> Who's right?
> As I'm just adding a ninth (ten-sector) fd_type, I prefer the 16 version.

FDUNIT and FDTYPE both derive from minor(dev).
MAKEDEV creates 16 minors, so you get unit 0 and 1, and 8 types each.
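
Spelled out as a sketch (function names are mine, and the divisor is a
parameter so both conventions can be compared; the kernel uses macros on
minor(dev)):

```c
#include <assert.h>

/* Unit number encoded in a floppy minor, for a given types-per-unit. */
int
fd_unit(int minor, int types_per_unit)
{
	assert(minor >= 0 && types_per_unit > 0);
	return minor / types_per_unit;
}

/* Format type encoded in a floppy minor, for a given types-per-unit. */
int
fd_type(int minor, int types_per_unit)
{
	assert(minor >= 0 && types_per_unit > 0);
	return minor % types_per_unit;
}
```

With the kernel's div/mod 8, minor 8 is unit 1, type 0, and minors 0..15
cover two units with eight types each. With div/mod 16, the same minor 8
is unit 0, type 8, which is where a ninth type would have to live.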

What is FDUNIT good for?

Martin


Re: sys/dev/isa/fd.c FDUNIT/FDTYPE

2011-05-04 Thread Martin Husemann
On Wed, May 04, 2011 at 12:52:30PM +0200, Edgar Fuß wrote:
> > What is FDUNIT good for?
> Well, ehm, I suppose that's supposed to be, ehm, the unit number, e.g. 0 or 1 
> for fd0x/fd1x.

Oh, duh, of course ;-)
Unfortunately, ultimately the kernel is correct (as always).
But you can easily change it to work div/mod 16 of course. It is all just a
convention between MAKEDEV and the kernel (which should match).

Martin


Re: sys/dev/isa/fd.c FDUNIT/FDTYPE

2011-05-04 Thread Martin Husemann
On Wed, May 04, 2011 at 08:50:10PM +0900, Izumi Tsutsui wrote:
> The problem is that there might be some ports whose MAXPARTITIONS is still 8
> and such ports can't use type 8.

Why not? It is not used as a partition of fd*.
MAKEDEV is already wrong for those ports, the fd nodes probably should have
special case handling.

Martin


Re: sys/dev/isa/fd.c FDUNIT/FDTYPE

2011-05-04 Thread Martin Husemann
On Thu, May 05, 2011 at 01:27:50AM +0900, Izumi Tsutsui wrote:
> I'm afraid few developers will maintain MAKEDEV script properly,
> and few users will rerun /dev/MAKEDEV on upgrade.

Nothing that usr.sbin/postinstall can't fix (or at least warn about).

> Nowadays floppy is almost dead, so we don't have to care about
> compatibility, though...

Indeed - but this also means nobody will be bitten by the change when
we fix it (IIUC only machines with > 1 floppy would be affected anyway,
unless you use the new formats - I have a really strange collection of
old machines, but I am pretty sure none of them has more than one floppy
drive, actually most of them have only broken drives).

Martin


Re: sys/dev/isa/fd.c FDUNIT/FDTYPE

2011-05-05 Thread Martin Husemann
On Fri, May 06, 2011 at 04:53:41AM +0900, Izumi Tsutsui wrote:
> If we "fix" kernels to use DISKUNIT() and DISKPART() macro
> for FDUNIT() and FDTYPE(), we can bump a number of fd types
> to MAXPARTITIONS with no further changes.
> Nothing needs to be done by users in that case.

Oh, now I see what you meant - didn't get it with the previous explanation -
nice trick.

Still, if postinstall can warn users, I don't see a big deal in fixing it
in other ways (though I can live with both, and Edgar can do his changes
with your solution too, as he seems to use amd64).

Martin


Re: pmf(9) vs sysmon for power events (especially sleep when powerd(8) is not running)

2011-05-07 Thread Martin Husemann
On Fri, May 06, 2011 at 04:45:55PM +0100, Jean-Yves Migeon wrote:
> 1 - I shall patch sysmon_pswitch_event and add a callback for sleep 
> that MD code can register,

Yes (or a list of callbacks, even, maybe not only MD code but various
subsystems might need this later).
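
A list of callbacks could look roughly like this. This is a userland sketch
only; none of these names (sleep_cb_register, sleep_cb_run_all, count_cb)
exist in sysmon or pmf(9), and the fixed-size table is just an assumption to
keep it short:

```c
#include <stddef.h>

#define MAX_SLEEP_CALLBACKS 8	/* arbitrary limit for the sketch */

typedef int (*sleep_cb_t)(void *arg);

struct sleep_cb {
	sleep_cb_t fn;		/* handler to call on a sleep event */
	void *arg;		/* opaque argument passed back to fn */
};

static struct sleep_cb sleep_cbs[MAX_SLEEP_CALLBACKS];
static size_t n_sleep_cbs;

/* Add a handler; returns 0 on success, -1 if fn is NULL or the table is full. */
static int
sleep_cb_register(sleep_cb_t fn, void *arg)
{
	if (fn == NULL || n_sleep_cbs == MAX_SLEEP_CALLBACKS)
		return -1;
	sleep_cbs[n_sleep_cbs].fn = fn;
	sleep_cbs[n_sleep_cbs].arg = arg;
	n_sleep_cbs++;
	return 0;
}

/* Call all handlers in registration order; returns how many reported failure. */
static int
sleep_cb_run_all(void)
{
	int failed = 0;

	for (size_t i = 0; i < n_sleep_cbs; i++)
		if (sleep_cbs[i].fn(sleep_cbs[i].arg) != 0)
			failed++;
	return failed;
}

/* Demo handler: counts how often it was invoked. */
static int
count_cb(void *arg)
{
	++*(int *)arg;
	return 0;
}
```

MD code and other subsystems would each register their own handler, and the
sleep event path would only need to call the run-all function.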

Martin


Re: statvfs() sleeps forever on tstile

2011-05-08 Thread Martin Husemann
On Sun, May 08, 2011 at 07:24:10PM +0200, Emmanuel Dreyfus wrote:
> As I understand, FFS sits on top of UFS. We migrated from UFS1 to UFS2
> some time ago (remember the thing about the superblock that was
> converted?), and now we can have FFS v1 or FFS v2 over UFS2. You choose
> FFS v2 by formatting with newfs -O 2

I think this is incorrect. The (IMHO) correct explanation of the versions
can be found in newfs(8):

-O filesystem-format
        Select the filesystem-format.
              0    4.3BSD; This option is primarily used to build
                   root file systems that can be understood by older
                   boot ROMs.
              1    FFSv1; normal fast-filesystem (default).  This is
                   also known as `FFS', `UFS', or `UFS1'.
              2    FFSv2; enhanced fast-filesystem (suited for more
                   than 1 Terabyte capacity, access control lists).
                   This is also known as `UFS2'.

Another part of the confusion is the superblock format, which changed at
the same time as the inode format, but you can build FFS1/UFS1 filesystems
with FFSv2 superblocks. For example this is from an i386 5.1 machine:

file system: /dev/rraid0a
endian  little-endian
magic   11954 (UFS1)    time    Sun May  8 22:40:07 2011
superblock location 8192        id      [ 40b918f4 632cf0d8 ]
cylgrp  dynamic inodes  4.4BSD  sblock  FFSv2   fslevel 4
nbfree  16189026        ndir    182278  nifree  44922709        nffree  571925

This is a FFSv1 file system (called UFS1 in the dumpfs output), but the
superblock is in FFSv2 format.

Compare to this:

file system: /dev/rwd0a
format  FFSv2
endian  little-endian
location 65536  (-b 128)
magic   19540119        time    Sun May  8 22:42:43 2011
superblock location 65536       id      [ 4c2cc116 47d65068 ]
cylgrp  dynamic inodes  FFSv2   sblock  FFSv2   fslevel 5

Here we have a FFSv2 file system, and are using the FFSv2 superblock as well.
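
The magic values in the two dumps are printed in hex and are what tells the
inode formats apart, independent of the superblock layout. A small sketch of
that mapping (the two constants are the FFS magic numbers; the helper name
ffs_format_name is made up for illustration):

```c
#include <stdint.h>

/* FFS superblock magic numbers; dumpfs shows them in hex. */
#define FS_UFS1_MAGIC	0x011954
#define FS_UFS2_MAGIC	0x19540119

/* Map a superblock magic to the name used in this thread. */
static const char *
ffs_format_name(uint32_t magic)
{
	switch (magic) {
	case FS_UFS1_MAGIC:
		return "FFSv1 (UFS1)";
	case FS_UFS2_MAGIC:
		return "FFSv2 (UFS2)";
	default:
		return "unknown";
	}
}
```

So "magic 11954" in the first dump is FFSv1 even though its "sblock" field
says FFSv2, and "magic 19540119" in the second is FFSv2.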


Martin


Re: statvfs() sleeps forever on tstile

2011-05-14 Thread Martin Husemann
On Sat, May 14, 2011 at 07:42:14PM +, David Holland wrote:
> Should we change the dumpfs output to be more consistent with the
> terminology we're trying to use?

Yes


Re: add DIAGNOSTIC back to GENERIC/INSTALL

2011-06-16 Thread Martin Husemann
On Thu, Jun 16, 2011 at 09:29:36AM +0200, Manuel Bouyer wrote:
> So here's the formal question: would someone object if I add back
> 'options DIAGNOSTIC' to i386 and amd64 GENERIC and INSTALL kernels,
> with a comment saying this should be disabled on release branch
> (it would be up to releng to comment it out as part of the release process) ?

I am in favour of this proposal (and would add sparc64) - but in case we
do not agree, we should at least make sure that all regular automatic test
runs are using kernels with DIAGNOSTIC enabled.

Martin


Re: error adding a new system call

2011-06-16 Thread Martin Husemann
On Thu, Jun 16, 2011 at 12:39:11AM +0800, Charles Zhang wrote:
>- add the syscall to the src/sys/kern/syscall.master list

Use forward declarations in this version, like "struct whatever *"
instead of a typedef'd name, so the signature is valid all by itself.
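
A minimal illustration in plain C (example_call and struct whatever are
placeholder names, not a real syscall): the prototype compiles without any
header that defines the struct, because a pointer to an incomplete struct
type is itself a complete type.

```c
struct whatever;	/* forward declaration, no members known here */

/* Valid all by itself: no typedef or struct definition is needed. */
int example_call(struct whatever *arg, int flags);

/* The pointer can be tested and passed on without knowing the layout. */
int
example_call(struct whatever *arg, int flags)
{
	return (arg == 0) ? -1 : flags;
}
```

A typedef'd name in the same place would only compile if the header
providing the typedef had been included first.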

Martin


Re: add DIAGNOSTIC back to GENERIC/INSTALL

2011-06-16 Thread Martin Husemann
On Thu, Jun 16, 2011 at 04:30:18PM +0100, Mindaugas Rasiukevicius wrote:
> > - Since performance is degraded and -current users concerned about it
> >   will need to compile their own kernels anyway - I believe LOCKDEBUG
> >   should be enabled as well.  Perhaps LOCKDEBUG should become a part
> >   of DEBUG - it is at least clearly a "heavier check". :)

Well, a DEBUG kernel is usable (and, IIRC a DIAGNOSTIC+LOCKDEBUG kernel
as well), but a DEBUG+LOCKDEBUG kernel is absolutely unusable for anything
besides debugging.

A simple solution (but with a gotcha): have releng builds produce both
GENERIC and GENERIC-DEBUG (or whatever you call it), then run automatic
test runs on both and compare.

Encourage users to run the -DEBUG version on test machines, but point to
GENERIC for performance critical things.


Martin

