Re: Yet another dirct config proposal for i2c busses

2010-02-03 Thread Paul Goyette



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: regression (crash) in sysmon/acpiacad

2010-02-04 Thread Paul Goyette

On Thu, 4 Feb 2010, Matthias Drochner wrote:



Hi -

my laptop (running -current) now crashes as soon as the
battery reaches warning level.
This is due to a jump to NULL in sysmon_envsys_events.c,
line 891 - an indirect call to sme-sme_refresh.
This is indeed 0 for acpiacad now, since the refresh
method was removed recently.
How is this intended to work?


Ooops.

The refresh routine was removed because we're now relying on the notify 
code to actually work.  But unfortunately we forgot to set the flag to 
avoid the call-back.


Either Jukka or I will get this fixed up shortly.


-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: regression (crash) in sysmon/acpiacad

2010-02-04 Thread Paul Goyette

On Thu, 4 Feb 2010, Paul Goyette wrote:


On Thu, 4 Feb 2010, Matthias Drochner wrote:

my laptop (running -current) now crashes as soon as the
battery reaches warning level.
This is due to a jump to NULL in sysmon_envsys_events.c,
line 891 - an indirect call to sme-sme_refresh.
This is indeed 0 for acpiacad now, since the refresh
method was removed recently.
How is this intended to work?


Ooops.

The refresh routine was removed because we're now relying on the notify code 
to actually work.  But unfortunately we forgot to set the flag to avoid the 
call-back.


Either Jukka or I will get this fixed up shortly.


Actually, Jukka _did_ set the SME_DISABLE_REFRESH flag.  Unfortunately, 
it was not being checked in one of the call-backs.  I have just added 
the missing check.


Please update to rev 1.78 of src/sys/dev/sysmon/sysmon_envsys_events.c 
and try again.  Thanks for finding this, and please accept my apologies 
for the breakage.




-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: regression (crash) in sysmon/acpiacad

2010-02-04 Thread Paul Goyette

On Thu, 4 Feb 2010, Matthias Drochner wrote:



p...@whooppee.com said:

Please update to rev 1.78 of src/sys/dev/sysmon/sysmon_envsys_events.c
 and try again.


Thanks -- I just managed to run low on battery with a new kernel.
I can confirm that the problem is fixed, the box runs well until
graceful shutdown by the sensor_battery script.

But -- since my battery was low anyway I did another test, whether
it behaves well if the userland powerd is not running. It does not.
I'd expect it to hit the cpu_reboot() call in sysmon_penvsys_event()
eventually, but it did not. It didn't even update the charge
value visible in envstat(8) -- this stayed at 6.00% (this is where
I killed powerd) even when the red battery LED on my (Dell)
laptop started to light permanently, which happens below 3% (the
low cap limit).
This used to work, so there is another regression.


OK, we'll look into it.  Since the charge value was not updating, it 
might be that the ACPI Notify isn't working here.  Perhaps we should not 
rely on this, and put the refresh() routine back...  Since I don't have 
any battery-powered devices, I'll defer to Jukka to see what we can do.



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: regression (crash) in sysmon/acpiacad

2010-02-05 Thread Paul Goyette

On Fri, 5 Feb 2010, Matthias Drochner wrote:



p...@whooppee.com said:

Regarding the earlier comment about your battery charge level not
changing...  Is this still true?


Just checked again and found that the charge level is
updated now, but only every 30s. (The code in acpibat_refresh()
suggests that cached values are kept only for 5s, and I
believe that is how it did behave formerly.)


The 30-second timer is the default poll interval for sysmon_envsys, so 
we're calling the refresh routine on that basis.  I'm not totally up to 
speed on the acpibat internal stuff, so I don't know if it should be 
getting updated more frequently.



I definitely watched the battery state for a couple of
minutes yesterday, so something else must have happened.

Looking at acpibat_refresh()... the 5-seconds check uses
microtime() which is not necessarily monotonic. And, iirc,
I had booted Windows before, which sets the RTC to GMT+1,
so I had to force the clock back by a manual ntpdate.
This might have killed status updating.
Well, just a hypothesis, but I'd suggest to use
getmicrouptime() for these checks.


Yeah, good point.


-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: regression (crash) in sysmon/acpiacad

2010-02-07 Thread Paul Goyette

On Sun, 7 Feb 2010, Jukka Ruohonen wrote:


On Sun, Feb 07, 2010 at 02:24:37PM +0100, Joerg Sonnenberger wrote:

The point is that they have specific meaning for the hardware and as I
said, are generally not changeable. For example, on my laptop, the power
LED turns orange when warning cap is reached and shuts down hard on
critical.


You can already alter the low and warning capacity, as documented in
envsys.conf(5). I am not arguing that it would be a good idea, quite the
contrary.


You can't necessarily change the values that the hardware monitors, but 
you can set the limits that are used to control the sending of event 
notifications to powerd.  (A sensor can also set the ENVSYS_MONNOTSUPP 
flag to disable changing those limits.)



If sme_set_limits() is omitted but sme_get_limits() is introduced, the
visible change would merely imply that from

$ envstat -d acpibat0
 Current  CritMax  WarnMax  WarnMin  CritMin Unit
 warn cap: 2.511 Wh ( 5.00%)
  low cap: 0.200 Wh ( 0.40%)
   charge:50.230 Wh (100.00%)

we would move to

$ envstat -d acpibat0 | grep cap
   Current  CritMax  WarnMax  WarnMin  CritMin Unit
...
 charge:50.230   2.5110.200   Wh (100.00%)


At least currently, sensors with the ENVSYS_FPERCENT flag display all of 
their limit values as percentages.  So the output would be more like


   charge:50.230 5.000%   0.400%   Wh (100.00%)


Similar thing is already done in acpitz(4), sdtemp(4), and ipmi(4). The
argumentation in favor of the percentages goes in similar vein.

As I understand it, these callbacks were introduced specifically for those
sensors that are capable of checking the limits in hardware. Supposedly
this should be the case here.


Yes, the primary rationale for these callbacks is to provide a means for 
userland to obtain (and possibly modify) the values used by the hardware 
device to raise alarms.  At the time I was writing the sdtemp(4) driver, 
it seemed silly to always compare current-vs-limit in software when the 
chip was more than happy to do that task for us, if we would only set a 
threshold value for it to use!



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: need help with kern/42758 - correctly initialize W83627HF hwmon

2010-02-07 Thread Paul Goyette

I'll have a look.

On Sun, 7 Feb 2010, Michael Stapelberg wrote:


Hi,

Yesterday I debugged why the temperature values seen in envstat on my mainboard
looked very odd (70 degC while the bios hardware monitor shows 40 degC). As I
described in the bugreport, it is a missing initialization.

I also attached a patch to the bugreport, which fixes the problem to me.
However, I am not a NetBSD kernel developer, so the patch very likely needs
some polishing. It would be great if someone could have a look at it so that
it can go into the next NetBSD release.

Best regards,
Michael

PS: kern/42759 is related to this and a rather simple patch - it needs to be
merged before debugging kern/42758 (so that the i2c is accessible on this
board).

PS: Please CC me, as I am not on this list.



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: need help with kern/42758 - correctly initialize W83627HF hwmon

2010-02-07 Thread Paul Goyette

On Sun, 7 Feb 2010, Michael Stapelberg wrote:


Hi Paul,

Thanks for your help so far.

Excerpts from Paul Goyette's message of So Feb 07 22:06:23 +0100 2010:

 a) all W83627HF use temp sensors of type 3904, or
 b) the W83627HF on this board uses 3904.
Any chance you can do some research in the datasheet to see if there is
any mention of this?

The datasheet itself does not mention any preference in regard to the type.

In the wiki on lm-sensors.org I could find only one configuration which
actually uses the temperature sensors of the W83627HF. This configuration
also configures it to use the 3904 mode.


Yes, I went through the datasheet, too, and there seems to be no way to 
tell what type of sensor is actually used.


I think it would be OK to add a 'flags' to the config syntax, so you 
would say


lm* at iic? addr 0xXX flags 1

to enable 2N3904 bipolar temp sensors.  The default would be to keep 
current behavior which expects a normal thermistor.


Now, I just need to figure out where the flags value gets passed into 
the driver!



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: need help with kern/42758 - correctly initialize W83627HF hwmon

2010-02-07 Thread Paul Goyette

On Sun, 7 Feb 2010, Paul Goyette wrote:


In the wiki on lm-sensors.org I could find only one configuration which
actually uses the temperature sensors of the W83627HF. This configuration
also configures it to use the 3904 mode.


Yes, I went through the datasheet, too, and there seems to be no way to tell 
what type of sensor is actually used.


I think it would be OK to add a 'flags' to the config syntax, so you would 
say


lm* at iic? addr 0xXX flags 1

to enable 2N3904 bipolar temp sensors.  The default would be to keep current 
behavior which expects a normal thermistor.


Now, I just need to figure out where the flags value gets passed into the 
driver!


Can you try the attached diff, and set 'flags 1' in your config file?



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: nslm7x.c
===
RCS file: /cvsroot/src/sys/dev/ic/nslm7x.c,v
retrieving revision 1.49
diff -u -p -r1.49 nslm7x.c
--- nslm7x.c13 Oct 2008 12:44:46 -  1.49
+++ nslm7x.c7 Feb 2010 23:04:42 -
@@ -1766,7 +1766,7 @@ static int
 wb_match(struct lm_softc *sc)
 {
const char *model = NULL;
-   int banksel, vendid, devid;
+   int banksel, vendid, devid, regval;
 
aprint_normal(\n);
/* Read vendor ID */
@@ -1790,6 +1790,16 @@ wb_match(struct lm_softc *sc)
case WB_CHIPID_W83627HF:
model = W83627HF;
lm_setup_sensors(sc, w83627hf_sensors);
+
+   if (device_cfdata(sc-sc_dev)-cf_flags  1) {
+   /* Switch to 2N3904 mode for the temperature sensors */
+   lm_generic_banksel(sc, WB_BANKSEL_B0);
+   (*sc-lm_writereg)(sc, WB_BANK0_RESVD1, 0x0);
+   regval = (*sc-lm_readreg)(sc, 0x5d);
+   regval |= 0xe;
+   (*sc-lm_writereg)(sc, WB_BANK0_VBAT, regval);
+   lm_generic_banksel(sc, banksel);
+   }
break;
case WB_CHIPID_W83627THF:
model = W83627THF;
Index: nslm7xvar.h
===
RCS file: /cvsroot/src/sys/dev/ic/nslm7xvar.h,v
retrieving revision 1.26
diff -u -p -r1.26 nslm7xvar.h
--- nslm7xvar.h 12 Oct 2008 13:17:28 -  1.26
+++ nslm7xvar.h 7 Feb 2010 23:04:42 -
@@ -87,6 +87,7 @@
 
 /* Bank 0 regs */
 #define WB_BANK0_CHIPID0x58/* Chip ID */
+#define WB_BANK0_RESVD10x59/* Resvd, bits 6-4 select temp sensor 
mode */
 #define WB_BANK0_FAN45 0x5c/* Fan 4/5 Divisor Control (W83791D only) */
 #define WB_BANK0_VBAT  0x5d/* VBAT Monitor Control */
 #define WB_BANK0_FAN4  0xba/* Fan 4 reading (W83791D only) */


Re: need help with kern/42758 - correctly initialize W83627HF hwmon

2010-02-08 Thread Paul Goyette

On Mon, 8 Feb 2010, Michael Stapelberg wrote:


Hi Paul,

Excerpts from Paul Goyette's message of Mo Feb 08 00:14:44 +0100 2010:

Can you try the attached diff, and set 'flags 1' in your config file?

The patch works fine. I would suggest to use flag 2, however, to be consistent
with the linux driver.


The Linux driver seems to do something a little bit strange...  From 
http://oss.sgi.com/cgi-bin/cvsweb.cgi/linux-2.6-xfs/Documentation/hwmon/w83781d?annotate=1.1


The W83782D and W83783S temperature conversion machine
understands about several kinds of temperature probes. You can
program the so-called beta value in the sensor files. '1' is
the PII/Celeron diode, '2' is the TN3904 transistor, and 3435
the default thermistor value. Other values are (not yet)
supported.

The datasheet for the sensor says

Reg. 0x59 bits [654]
Temperature sensor diode [321].  Set to 1, select Pentium II
compatible Diode. Set to 0 to select 2N3904 Bipolar mode.

Power-On default is 1.


Reg. 0x5d bits [321]
Temperature sensor [321].  Set to 1, select bipolar sensor.
Set to 0, select thermistor sensor.

Power-On default is 0.

So, from the datasheet, there seem to be only two sensor types, with 
Pentium II Diode being the default, while the Linux docs claim three 
sensor types and a default of some non-Pentium thermistor.  Rather 
confusing.


For now, I'm inclined to do things according to the datasheet.  :)

Unless anyone objects, I'll commit my patch later today.  (And I'll add 
comments on the flag value in all of the config files that contain the 
lm(4) device.)



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: wbsio(4): Winbond Super I/O driver for lm(4) over wbsio(4) attachment

2010-02-17 Thread Paul Goyette

On Wed, 17 Feb 2010, Constantine Aleksandrovich Murenin wrote:


1. Does it make sense to extend this to any other functions of the
  super-IO chip?  Are any others likely to ever show up on non-
  standard ISA ports?


Yes, wbsio(4) could later provide Watchdog Timer support, but such
timer is accessed directly through the Super I/O ports, not through a
separate set of ports as is the case with Hardware Monitor.


OK.


2. In the man page, please change the statement

       +driver provides support for the Winbond LPC Super I/O ICs.

  to

       +driver provides configuration support for the Winbond LPC
       +Super I/O ICs.


Interesting point, but perhaps this is being too specific?
Specifically since it could later even support the watchdog timer? :-)


OK.


3. As written, wbsio_print() assumes that the ir_size of the lm(4)'s
  isa_attach_args is the same as the size of the wbsio(4)'s ir_size.
  Is this really correct?  Or should we explicitly set the ir_size
  member before calling config_found()?


I don't see such assumption anywhere in wbsio(4).  In any case,
ir_size is always set in the match procedure of the respective
matching driver, e.g. it is still set to 8 for lm(4) in
lm_isa_match():

wbsio0 at isa0 port 0x2e-0x2f: Winbond LPC Super I/O W83627DHG rev 0x23
lm0 at wbsio0 port 0x290-0x297
lm0: Using default temp sensors
lm0: Winbond W83627DHG Hardware monitor


OK, I was not aware that each device's *_match() routine would set the 
ir_size.  Thanks for clarifying.



Curiosity:
I will confess that I'm writing this section without datasheets in
front of me...  But I notice that there are some differences in the
lists of devices between wbsio(4) and lm(4).  A few IDs are the same,
but most are different.

I also notice that wbsio(4) uses the device-ID, while lm(4) uses the
chip-ID.

Question:
4. So, would it make sense to be consistent and use the same xxx-ID in
  both drivers?  (In which case, wbsio(4) could simply include
 nslm7xvar.h rather than having its own ID definitions.)


Your observation is correct- the namespace for IDs is different, so
it would not make much sense to unify this with nslm7xvar.h in any
way.


OK


Question:
5. And, does the list of devices in wbsio(4) deliberately exclude the
  rest of the lm(4) devices?  Or could some/all of the remaining
  chips from lm(4) be added to wbsio(4)?


I think all supported devices were actually already included, and
since wbsio(4) was written, no new devices were introduced to lm(4).
Obviously, not all of those supported by lm(4) could ever show up on
the isa bus - for example, W83792D and a few others are SMBus-only.


OK.

You've addressed all my concerns.  Let's give a couple days for others 
to comment.



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

Proposed changes to the dbcool(4) driver

2010-02-18 Thread Paul Goyette

Folks,

When the dbcool(4) driver was commited, sysmon_envsys(9) did not have 
the ability to retrieve and set sensor limit values in the hardware 
device.  As a result, dbcool had a separate user interface, using 
sysctl(8), for this purpose.


Now that sysmon_envsys has grown the required capability (and it is 
actually working!), I'd like to remove the bulk of the sysctl interface 
from the dbcool driver, and provide that functionality using sysmon's
capabilities.  (I'll leave intact the ability to control the fan-speed 
PWM controllers, since there's still no viable alternative.)


Does anyone see any reason to leave the sysctl interface in place?  If 
we do leave it, can/should it be enabled/disabled using


#define DBCOOL_SYSCTL
...
#endif

My preference would be to remove the sensor limit sysctl completely.


-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Exposing internal sysmon_envsys(8) structures to userland

2010-02-28 Thread Paul Goyette
It has always bothered me that src/sys/sys/envsys.h exposes several 
internal data structures to userland.  It seems to me that the only 
items from this header file that should really be exported are


* the definitions for the various IOCTLs
* the definitions for the sensor units
* the definitions for the sensor states/values
* the definitions for the legacy struct envsys_tre_data

The struct sysmon_envsys_lim and struct envsys_data should be moved 
into src/sys/dev/sysmon/sysmon_envsysvar.h along with the related 
internal flags and properties definitions.


Comments?


-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Unsafe GENERIC? - Re: (unknown)

2010-03-16 Thread Paul Goyette

On Tue, 16 Mar 2010, Aleksej Saushev wrote:


Well... Could we arrange it so that we have safe monolithic GENERIC
until issues are resolved somehow?


For i386, use MONOLITHIC instad of GENERIC

For amd64, the default is still MONOLITHIC, if I remember correctly.


-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Dead ports [Re: config(5) break down]

2010-03-19 Thread Paul Goyette

On Fri, 19 Mar 2010, Thor Lancelot Simon wrote:


FYI -- and I doubt you could really buy just one of these at this price --
http://avnetexpress.avnet.com/store/em/EMController/Network-Processor/Cavium-Networks/CN5220-500BG729-SCP-Y-G/_/R-8960980/A-8960980/An-0?action=partcatalogId=500201langId=-1storeId=500201


I tried to get Avnet to provide a quote on an eval board for an Octeon 
and they never got back to me.  (The eval board didn't have a Buy_Now 
button, only a Request_Quote button.)



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Soliciting opinions on desired envstat -S behavior

2010-03-19 Thread Paul Goyette

Currently, the -S option for envstat(8) is documented to

...restore defaults to all devices registered with the framework.
This will remove all properties that were set in the configuration
file to the setting that the drivers use by default.

This was written this way long ago, before sysmon_envsys(9) learned how 
to import/export limit setting with hardware drivers.


As a result, the current behavior is sub-optimal, as it removes _ALL_ 
sensor limits, regardless of whether those limits were learned from the 
hardware driver or provided by the configuration file.  Even worse, the 
limits are removed from within the sysmon_envsys framework, but the 
hardware driver is not notified of the removal;  thus the hardware may 
still be checking the limits, and with the recent introduction of 
non-polled event delivery, the driver could actually report a limits 
event for a sensor with no limits!


I'm soliciting opinions on what the correct behavior should be, for each 
of the following sensor categories:


A. Sensors that have no hardware knowledge of limit values, and rely
   only on the sysmon_envsys(9) framework.

   I believe that the current behavior is correct - all knowledge of
   limit values is discarded.

B. Sensors that can provide initial limit values, but which cannnot
   handle changes to those values in hardware.  All limit processing is
   actually handled in the sysmon_envsys(9) framework and not in the
   driver or hardware.

   The current behavior works, but I believe that we should really
   re-import the initial values from the driver/hardware.  This is
   closest to the documented behavior.

C. Sensors that can actually process limit checking in the hardware (or
   in the driver) and that can accept new limit values to replace any
   initial (or factory default) values.

   I see at least three options here.

C1. In addition to current behavior (removing all limits from
the sysmon_envsys(9) framework), we should update the
driver/hardware to let it remove its own limits.

   or   C2. Behave as proposed in (B) above, and simply reimport the
current values from the hardware/driver after clearing
the framework.

   or   C3. Remember the initial values from when the sensor was first
registered, and after clearing the framework, reset both
the hardware/driver and the framework to the remembered
values.

I'd really appreciate comments/feedback/suggestions/etc.


-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Soliciting opinions on desired envstat -S behavior

2010-03-20 Thread Paul Goyette

On Sun, 21 Mar 2010, Jukka Ruohonen wrote:


This part certainly gets complicated, and a C3 variation may sound
like the right thing to do.  I think it would be best if the drivers
could reset the limits internally with an appropriate reset command
issued directly to the underlying hardware as opposed to the
bookkeeping of the values in the software (otherwise, what would
happen after a soft reboot?), but I'm unaware regarding whether or not
such hardware support is usually present.


Additionally, the limit values can be in read-only registers, etc.


If the registers are read-only, the driver should not provide a 
xxx_set_limits() callback.  Furthermore, it should either prevent the 
user from changing the limits values (with the FMONNOTSUPP flag) or 
leave value/limit checking to the software framework.  In either case, 
if the registers are read-only, the driver falls into category A or B.




-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Using proplist ioctl's from within the kernel

2010-03-25 Thread Paul Goyette

On Thu, 25 Mar 2010, Matthias Drochner wrote:



jeanyves.mig...@free.fr said:

There should be some way to serialize/deserialize prop/plistref
objects  within the kernel, but I never found a solution. I suppose
you will have  to dig deeper than I did :/


I think it should be avoided at all costs. With proplists, you give up type
safety
and you create a lot of overhead. This is acceptable for kernel-user interfaces
in some cases but not within the kernel.


So would it be appropriate for me to expose sysmon_envsys's global 
mutex and its device list in order for acpi_apm to scan for battery 
sensors and A/C adapters?



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Using proplist ioctl's from within the kernel

2010-03-26 Thread Paul Goyette

On Fri, 26 Mar 2010, haad wrote:


I think it should be avoided at all costs. With proplists, you give up
type
safety
and you create a lot of overhead. This is acceptable for kernel-user
interfaces
in some cases but not within the kernel.


So would it be appropriate for me to expose sysmon_envsys's global mutex
and its device list in order for acpi_apm to scan for battery sensors and
A/C adapters?



This is wrong and should be avoided IMHO.


Well, as it turns out, both the global mutex and the device list are 
already exposed, so it is a rather trivial task for acpi_apm.c to scan 
the device list.



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


xxxVERBOSE module?

2010-05-19 Thread Paul Goyette
From the comments in the GENERIC config files, the primary reason for 
omitting the various xxxVERBOSE options is to avoid including large text 
tables in the resulting kernel.  And I vaguely recall some spirited 
discussion back when the change was made to exclude these options by 
default.


Now that we have MODULAR kernels (at least on some architectures), I've 
been wondering if it might make sense to create a mod_verbose that could 
be loaded during start-up time and then unloaded after the machine is up 
and running.  (For plug-and-play situations, such as USB, the module 
could be reloaded and unloaded whenever a new device is added.)


Is this something that would be useful?


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: xxxVERBOSE module?

2010-05-20 Thread Paul Goyette

On Thu, 20 May 2010, Manuel Bouyer wrote:


On Thu, May 20, 2010 at 08:13:15AM +0200, Marc Balmer wrote:

To my understanding most of the VERBOSE options are compile-time
switches which are used to conditionally compile debug code using
#ifdefs.  I am not sure how such a module should work then, changing
#ifdef'ed code into code that is always compiled, but only used under
certain circumstances?


No, this was about PCIVERBOSE, USBVERBOSE and SCSIVERBOSE, which
adds in the kernel table of strings for vendor/device names,
extended error codes, etc ...


Correct - that was my intention.


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: xxxVERBOSE module?

2010-05-20 Thread Paul Goyette

On Thu, 20 May 2010, David Young wrote:


On Wed, May 19, 2010 at 08:52:52PM -0700, Paul Goyette wrote:

From the comments in the GENERIC config files, the primary reason
for omitting the various xxxVERBOSE options is to avoid including
large text tables in the resulting kernel.  And I vaguely recall
some spirited discussion back when the change was made to exclude
these options by default.

Now that we have MODULAR kernels (at least on some architectures),
I've been wondering if it might make sense to create a mod_verbose
that could be loaded during start-up time and then unloaded after
the machine is up and running.  (For plug-and-play situations, such
as USB, the module could be reloaded and unloaded whenever a new
device is added.)

Is this something that would be useful?


Yes, it would be.  ...


Kewl - I plan on starting with PCIVERBOSE, followed closely with USB.


...  Just an idea: we may get a lot of mileage out of a
generic method for adding or augmenting, and removing kernel property
lists from a kernel, either at run- or compile-time.

Consider that much of the VERBOSE info is in tabular form, thus
expressible by property list.  So are PCI/USB/... vendor/product tables
that we use to match drivers with devices---I wish we could change
those tables without recompiling.  I anticipate using property lists
to express device locators.  So it seems to me that linking property
lists into the kernel could be very useful.


Great idea, but I think I'll save that for a future time when I have a 
bit more free time!



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Kernel modules - documentation?

2010-05-21 Thread Paul Goyette
Is there any documentation on the modules interface or API?  There does 
not seem to be anything in the man pages...


My specific questions:

What actually triggers an autoload of a module?  (There seem to be very 
few places where module_autoload() is called.)


What is the semantic difference between module_autoload() and a normal 
module_load()?


Does the code which calls either of these routines need to be concerned 
with whether the module has been previously loaded?  Is it OK to load a 
module that has already been loaded?


Given that there is a kernel thread that runs around and attempts to 
unload any unreferenced modules that have been loaded for a while, is 
it ever necessary or desirable to explicitly unload a module?


What happens if a global symbol referenced by a module doesn't exist? 
Does the module get loaded anyway, leaving the reference unresolved?



Thanks in advance!


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Kernel modules - documentation?

2010-05-21 Thread Paul Goyette

On Sat, 22 May 2010, Adam Hamsik wrote:


My specific questions:

What actually triggers an autoload of a module?  (There seem to be 
very few places where module_autoload() is called.)


Device module_autoloading is done in specfs_open.


OK.  So I guess there's no generic Gee, the kernel just tried to 
reference something that's not here, so maybe we can find a module to 
resolve that reference.


What is the semantic difference between module_autoload() and a 
normal module_load()?


module_autoload happens automatically when you try to open device node 
which doesn't have device driver in kernel.


My question was more along the lines of what is the _difference_ 
between module_load() vs module_autoload()?  Or even simpler, Why do 
both routines exist?  :)


Does the code which calls either of these routines need to be 
concerned with whether the module has been previously loaded?  Is it 
OK to load a module that has already been loaded?




I think that this is not possible and these routines return an error 
when you try to do that.


That's OK, I'd expect the error.  I can ignore that.  I just need to be 
sure that the previously-loaded module doesn't get screwed up from the 
attempt to load it the second time.


Given that there is a kernel thread that runs around and attempts to 
unload any unreferenced modules that have been loaded for a while, 

is it ever necessary or desirable to explicitly unload a module?

It doesn't work this way there is a thread(workqueue) which try to 
unload module in first 300 seconds or mili seconds I can't remember 
now. But this thread or whatever is it doesn't unload modules older 
that set limit AFAIK.


Hmmm, I must have misunderstood this code - time to go look again.

What happens if a global symbol referenced by a module doesn't exist? 

Does the module get loaded anyway, leaving the reference unresolved?

If you have a module which uses foo routine and it is not defined in 
kernel or loaded modules your module will simply not load.


This is actually what I was hoping would happen.


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: xxxVERBOSE module?

2010-05-21 Thread Paul Goyette

On Fri, 21 May 2010, John Nemeth wrote:


} be loaded during start-up time and then unloaded after the machine is up

It would have to be loaded by the boot loader then.  As far as I
know, only the x86 boot loader is capable of loading modules.


I'm almost finished with the PCIVERBOSE stuff...

My current approach is to load the module right before the first pcibus 
is enumerated, and unload when finished.  So we can use the in-kernel 
loader/linker for whichever platforms it supports.  For other platforms 
it will still be possible to set 'options PCIVERBOSE' to generate a 
built-in module.


The fun part is making sure that the shared code still plays nicely with 
src/lib/libpci :)



Yes, I think it would be very useful.  I was thinking of looking
at this myself.  Thanks for taking on this task.  It's one thing I can
cross off my TODO list.  :-)


No worries!  When I finish up with PCI, I'll start in on USB.


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: xxxVERBOSE module?

2010-05-21 Thread Paul Goyette

On Fri, 21 May 2010, John Nemeth wrote:


} My current approach is to load the module right before the first pcibus
} is enumerated, and unload when finished.  So we can use the in-kernel

   File systems aren't initialised during autoconf when the system is
being cold booted, thus it isn't possible for the kernel to load a
module at that point in time.  Also consider that on most platforms
with PCI, prior to the first pcibus being enumerated, the kernel
doesn't know anything about any disk drives that may be attached to the
system.


Ah, yeah!  Ooppss!  :)

I guess I can probably remove those changes.  So we'll have to rely on 
either the module being built-in or established by the boot loader.



} The fun part is making sure that the shared code still plays nicely with
} src/lib/libpci :)

I don't know anything about libpci, but have fun with that.


The only thing I find that uses libpci is /usr/sbin/pcictl and that 
seems to be working fine.



} No worries!  When I finish up with PCI, I'll start in on USB.

I imagine that will be mostly copy/paste.


Mostly.


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: xxxVERBOSE module?

2010-05-22 Thread Paul Goyette

On Sat, 22 May 2010, John Nemeth wrote:


} (referring to loading the pciverbose module...)
} I guess I can probably remove those changes.  So we'll have to rely on

Although, you can't test them, possibly keep them.  We do want to
support hotswap PCI at some point.


It turns out that I need to keep them after all!  Without an explicit 
load, the modules don't get initalized until much later in system 
startup, and this module beomces rather useless.  :)


There is still one problem, though.  As you (John) pointed out earlier, 
we don't have any accessible filesystems from which to load modules. 
But unfortunately, the kern_module subsystem doesn't seem to realize 
this.  So, if the pci code requests the module to be loaded, and the 
module is neither built-in nor provided by the boot loader, it _will_ 
attempt to load from the filesystem.  I expected it to fail, but the 
failure mode is rather ungraceful - the kernel panics with backtrace


_atomic_inc_32
namei + x109
vn_open + x84
kobj_load_vfs + x82
module_load_vfs + x2cf
module_do_load + 3e9
module_load + x68

It would seem to me that somewhere in this mess we should simply return 
ENOENT (or some other appropriate errno).



Also, dyoung@ is in the process of
merging cardbus into PCI.  I don't know if your changes would cause
mod_verbose to be autoloaded for cardbus, which uses PCIVERBOSE, but
cardbus insertion should be handled at some point.  And, of course,
there is PCIe and expresscard on the horizon.  expresscard is basically
hotswap PCIe along with USB (i.e.  an expresscard slot brings out
signals for both and the card determines which one it is going to
use).


Both of these will be reasonable.  I've already updated all the cardbus 
references to the pci_find{vendor,product} routines, so it would be a 
simple matter of making the pciverbose module gets (re)loaded.



What about X?  What does it use for scanning the PCI bus?


I suspect it uses libpci - I'll be doing a full-tree search for libpci 
shortly.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Enabling built-in modules earlier in init

2010-06-16 Thread Paul Goyette

On Wed, 16 Jun 2010, Antti Kantee wrote:


On Wed Jun 16 2010 at 04:13:54 -0700, Paul Goyette wrote:

With the current ways of secmodel register, I'd be damn careful to not
push it around.  The effect is that if it's called 0 times, you have a
system which allows everything.  So if your suggestion is implemented
and you're testing a new secmodel which buggily omits register alongside
another correctly registering secmodel, things will appear to work fine,
But if in some scenario the buggy one is loaded alone, well ... welcome
to the wishing well.


I had some concern about this as well, wondering if I would be able to
be sure I'd found all the secmodel modules that might exist.


Especially ones which aren't in src!


Perhaps it would be best to retain MODULE_CLASS_SECMODEL and also add
the suggested MODULE_CLASS_EARLY?


That would be my vote.

But, early is a little vague.  What if in the future we want
modules which are initialized even earlier.  Will those be called
MODULE_CLASS_EARLIER_THAN_EARLY?  If the class means intialized before
autoconf, why not use that in the name?


Would MODULE_CLASS_AUTOCONF work?


Also, the modclass id is exported to userland and used as an index to
a table in modstat.  I think I filed a PR about this being suboptimal.


Yeah, I was planning to update modstat(8) as well.


The better choice is to update modctl(2) to pass down the information
as a proplist.  That way even module classes are pluggable and other
information is easy to add if necessary.  I'm secretly hoping someone
will do this before 6.0 ... ;)


I might do it, but on a second pass.  :)



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Enabling built-in modules earlier in init

2010-06-16 Thread Paul Goyette

On Wed, 16 Jun 2010, Antti Kantee wrote:


I have to admit I haven't been following your work too closely, but
builtin modules are initialized either when all of them are initialized
per class or when their initialization is explicitly requested.  So if
whatever uses PCIVERBOSE requests the load of the PCIVERBOSE module,
it should be initialized and you should be fine (see module_do_load()).

The only but is that explicit loads must be accompanied by
MODCTL_LOAD_FORCE.  I wrote it that way because of the security use case:
if you disable a builtin module due to a security hole, you don't want
it to get autoloaded later.  For file system modules you can always use
rm, but for builtins you don't have that luxury.  So if that is actually
what you're chocking on, I suggest adding some flag to determine if the
module has ever been loaded and ignore the need for -F if it hasn't.


The attached diffs add a new mod_disabled member to the module_t 
structure, and set the value to false in each place that a new entry is 
created.  (Since all of the allocations of module_t structures are done 
with kmem_zalloc() I could probably avoid the explicit setting of the 
value to false.)


The value is set to true whenever a module is removed from active duty 
and returned to the module_builtin list.  (I specifically did NOT mark a 
module disabled if its modcmd(INIT) failed, under the assumption that it 
might succeed in a later retry.)


Loading of an entry from the module_builtin list no longer requires the 
-f flag if the entry has not been disabled.


This approach avoids the need for defining new a new module class, and 
it avoids needing to mess around with the current secmodel_register() 
stuff.





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: sys/module.h
===
RCS file: /cvsroot/src/sys/sys/module.h,v
retrieving revision 1.23
diff -u -p -r1.23 module.h
--- sys/module.h24 May 2010 03:50:25 -  1.23
+++ sys/module.h16 Jun 2010 12:42:42 -
@@ -90,6 +90,7 @@ typedef struct module {
time_t  mod_autotime;
void*mod_ctf;
u_int   mod_fbtentries; /* DTrace FBT entrie count */
+   boolmod_disabled;
 } module_t;
 
 /*
Index: kern/kern_module.c
===
RCS file: /cvsroot/src/sys/kern/kern_module.c,v
retrieving revision 1.69
diff -u -p -r1.69 kern_module.c
--- kern/kern_module.c  26 May 2010 23:53:21 -  1.69
+++ kern/kern_module.c  16 Jun 2010 12:42:42 -
@@ -195,6 +195,7 @@ module_builtin_add(modinfo_t *const *mip
modp[i] = kmem_zalloc(sizeof(*modp[i]), KM_SLEEP);
modp[i]-mod_info = mip[i+mipskip];
modp[i]-mod_source = MODULE_SOURCE_KERNEL;
+   modp[i]-mod_disabled = false;
}
mutex_enter(module_lock);
 
@@ -416,6 +417,7 @@ module_init_class(modclass_t class)
 * init.
 */
if (module_do_builtin(mi-mi_name, NULL) != 0) {
+   mod-mod_disabled = true;
TAILQ_REMOVE(module_builtins, mod, mod_chain);
TAILQ_INSERT_TAIL(bi_fail, mod, mod_chain);
}
@@ -794,7 +796,7 @@ module_do_load(const char *name, bool is
}
}
if (mod) {
-   if ((flags  MODCTL_LOAD_FORCE) == 0) {
+   if (mod-mod_disabled  (flags  MODCTL_LOAD_FORCE) == 0) {
if (!autoload) {
module_error(use -f to reinstate 
builtin module \%s\, name);
@@ -854,6 +856,7 @@ module_do_load(const char *name, bool is
return error;
}
mod-mod_source = MODULE_SOURCE_FILESYS;
+   mod-mod_disabled = false;
TAILQ_INSERT_TAIL(pending, mod, mod_chain);
 
error = module_fetch_info(mod);
@@ -1086,6 +1089,7 @@ module_do_unload(const char *name)
}
if (mod-mod_source == MODULE_SOURCE_KERNEL) {
mod-mod_nrequired = 0; /* will be re-parsed */
+   mod-mod_disabled = true;
TAILQ_INSERT_TAIL(module_builtins, mod, mod_chain);
module_builtinlist++;
} else {
@@ -1129,6 +1133,7 @@ module_prime(void *base, size_t size)
}
 
TAILQ_INSERT_TAIL(module_bootlist, mod

Re: Enabling built-in modules earlier in init

2010-06-17 Thread Paul Goyette

On Thu, 17 Jun 2010, Antti Kantee wrote:


On Wed Jun 16 2010 at 15:36:30 -0700, Paul Goyette wrote:

The attached diffs add one more routine, module_init3() which gets
called from init_main() right after module_class_init(MODULE_CLASS_ANY).
module_init3() walks the list of builtin modules that have not already
been init'd and marks them disabled.

Tested briefly on my home systems and appears to work.

Any objections to committing this?


I'd still hook it to the end of module_class_init(MODULE_CLASS_ANY)
instead of adding more randomly numbered module_initn() calls.
The other benefit from doing so is that you get it done atomically,
which is always worthwhile, and doubly so when it's a low hanging fruit
like here.


I dislike adding another special-case-overload.  In my mind, having 
module_class_init(MODULE_CLASS_ANY) have the additional effect of 
disabling un-init'd modules is not much different from automatically 
registering modules in MODULE_CLASS_SECMODEL.  This thread has already 
noted that such registration belongs in each module's modcmd().



@@ -416,6 +434,7 @@ module_init_class(modclass_t class)
 * init.
 */
if (module_do_builtin(mi-mi_name, NULL) != 0) {
+   mod-mod_disabled = true;
TAILQ_REMOVE(module_builtins, mod, mod_chain);
TAILQ_INSERT_TAIL(bi_fail, mod, mod_chain);
}


Why do you mark it as disabled?  Doesn't this conflict with the it
might succeed in a later module_init_class() idea you presented earlier?


Yes, it does conflict.  I've removed this line, and updated the comment 
block above.



module_disabled = true/false in multiple places looks a little
error-prone.  Now that struct module is growing more and more members,
maybe we can just have an object allocator which initializes the value and
afterwards the only acceptable mutation for module_disabled is setting
it to true (might make sense to rename the variable to something like
module_virgin and flip the polarity, though).


HeHe - module_virgin would seem to be a bit obscure to me if I hadn't 
written the code.  Perhaps module_autoload_ok would be acceptable (with 
a state flip)?  Or module_require_force (with no flip)?


The attached diffs use module_autoload_ok instead of module_disabled 
(and state flip), provide an object allocator, and a single mutation 
function.  It also avoids disabling the module if it is unloaded by the 
auto-unload thread;  only modules that are explicitly unloaded by name 
should be prevented from future autoloads.


I've renamed both module_init2() and module_init3() to more descriptive 
routine names, but for now I've kept _init3() as a separate function and 
not made it part of module_init_class().  I'll be doing some testing of 
this over the next day or two before committing.


As always, review, comments, and suggestions (and yes, even criticisms!) 
are welcomed and solicited.   :)




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: sys/sys/module.h
===
RCS file: /cvsroot/src/sys/sys/module.h,v
retrieving revision 1.23
diff -u -p -r1.23 module.h
--- sys/sys/module.h24 May 2010 03:50:25 -  1.23
+++ sys/sys/module.h17 Jun 2010 12:53:24 -
@@ -90,6 +90,7 @@ typedef struct module {
time_t  mod_autotime;
void*mod_ctf;
u_int   mod_fbtentries; /* DTrace FBT entrie count */
+   boolmod_autoload_ok;
 } module_t;
 
 /*
@@ -120,7 +121,8 @@ extern struct modlist   module_builtins;
 extern u_int   module_gen;
 
 void   module_init(void);
-void   module_init2(void);
+void   module_start_unload_thread(void);
+void   module_builtin_no_autoload(void);
 void   module_init_md(void);
 void   module_init_class(modclass_t);
 intmodule_prime(void *, size_t);
Index: sys/kern/init_main.c
===
RCS file: /cvsroot/src/sys/kern/init_main.c,v
retrieving revision 1.420
diff -u -p -r1.420 init_main.c
--- sys/kern/init_main.c10 Jun 2010 20:54:53 -  1.420
+++ sys/kern/init_main.c17 Jun 2010 12:53:24 -
@@ -430,7 +430,7 @@ main(void)
loginit();
 
/* Second part of module system initialization. */
-   module_init2();
+   module_start_unload_thread();
 
/* Initialize the file systems. */
 #ifdef NVNODE_IMPLICIT
@@ -594,9

Preserving early console output (pre-Copyright stuff)

2010-07-01 Thread Paul Goyette
I'm trying to debug some stuff on a new server, and there are some 
messages of interest that get printf'd on the console very early in 
the system's life, before the Copyright (c)... lines.  There is a 
large amount of info being printed, way more than a screenful and way 
more than can be captured with an eyeball snapshot.  Unfortunately, it 
seems that this information is NOT retained in the kernel's message 
buffer, so it never gets collected by syslog.  (For those who want to 
know details, some of the messages in question are generated by 
add_mem_cluster() in src/sys/arch/x86/x86/x86_machdep.c)


So, is there either

a) a way to capture these messages in the normal message buffer for 
later collection by syslog, or


b) a way to pause long enough to manually transcribe the output?  (A 
simple timed delay would work, although a Press any key to continue 
would be easier!)


Thanks in advance!


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Preserving early console output (pre-Copyright stuff)

2010-07-01 Thread Paul Goyette

On Thu, 1 Jul 2010, Allen Briggs wrote:


Heh.  You do need a serial cable and another computer, but it's not too
hard to set up.  You don't need to find any options in the BIOS if you
install the right boot blocks.


Yeah, OK, I got a cable, plugged it in, run cu(1) on one end, and added 
a new entry to the /boot.cfg on the other end:


menu=Serial/Single/Verbose/Debug:consdev com0; boot -svx

Then just boot, and things magically work!

Thanks for the clue-by-four


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Add NULL power handler to x86/ipmi(4)

2010-07-16 Thread Paul Goyette
Is there any reason anyone can think of to not add a NULL power handler 
to the ipmi(4) driver?  I can't see any reason for anything special to 
happen either at suspend or resume, and the lack of a power handler 
prevents the system from going to sleep at all.


Proposed patch is attached.

-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: ipmi.c
===
RCS file: /cvsroot/src/sys/arch/x86/x86/ipmi.c,v
retrieving revision 1.46
diff -u -p -r1.46 ipmi.c
--- ipmi.c  10 Apr 2010 19:02:39 -  1.46
+++ ipmi.c  16 Jul 2010 12:19:09 -
@@ -1934,6 +1934,10 @@ ipmi_thread(void *cookie)
sc-sc_wdog.smw_tickle = ipmi_watchdog_tickle;
sysmon_wdog_register(sc-sc_wdog);
 
+   /* Set up a NULL power handler so we can possibly sleep */
+   if (!pmf_device_register(self, NULL, NULL))
+aprint_error_dev(self, couldn't establish a power handler\n);
+
mutex_enter(sc-sc_poll_mtx);
while (sc-sc_thread_running) {
ipmi_refresh_sensors(sc);


Re: Add NULL power handler to x86/ipmi(4)

2010-07-17 Thread Paul Goyette

On Sat, 17 Jul 2010, David Young wrote:


ipmi(4) should probably not suspend if its watchdog timer is active.


Would it be sufficient for ipmi(4) to refuse to suspend (return
false from the suspend method) if the watchdog is active?


Yes.  I think that's the right thing to do for now.


This is actually fairly easy to do.


Or should it make some sort of effort to disable the watchdog, and
attempt to reenable the watchdog during the subsequent resume?


That defeats the purpose of a watchdog timer, at least for my purposes.
This is a local policy choice that should probably not be hard-coded.
Maybe we need additional watchdog modes?

1 Watchdog countdown stops during suspension, restarts after


Probably a bit more difficult than #3 below, and it assumes that the 
remaining time is available.



2 Watchdog keeps counting down during suspend (we'll wake periodically
 to tickle it)


This might be rather tricky.  Depending on the type of suspend, the time 
for a wake-up to occur could be faily long and/or variable.  And there 
might not even be a way for the system to schedule its own wake event.



3 Watchdog prevents suspension


As I indicated, this is fairly easy to do, simply check if the w-dog is 
armed or not.


The attached patch provides an ipmi_suspend() method to implement #3. 
But I suspect that this is really a more generic watchdog capability, 
and it should be implemented for the entire class of watchdogs rather 
than only for ipmi(4)?


Comments?


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: ipmi.c
===
RCS file: /cvsroot/src/sys/arch/x86/x86/ipmi.c,v
retrieving revision 1.46
diff -u -p -r1.46 ipmi.c
--- ipmi.c  10 Apr 2010 19:02:39 -  1.46
+++ ipmi.c  17 Jul 2010 02:40:27 -
@@ -225,6 +225,8 @@ int ipmi_sensor_status(struct ipmi_softc
 int add_child_sensors(struct ipmi_softc *, uint8_t *, int, int, int,
 int, int, int, const char *);
 
+bool   ipmi_suspend(device_t, const pmf_qual_t *);
+
 struct ipmi_if kcs_if = {
KCS,
IPMI_IF_KCS_NREGS,
@@ -1934,6 +1936,10 @@ ipmi_thread(void *cookie)
sc-sc_wdog.smw_tickle = ipmi_watchdog_tickle;
sysmon_wdog_register(sc-sc_wdog);
 
+   /* Set up a power handler so we can possibly sleep */
+   if (!pmf_device_register(self, ipmi_suspend, NULL))
+aprint_error_dev(self, couldn't establish a power handler\n);
+
mutex_enter(sc-sc_poll_mtx);
while (sc-sc_thread_running) {
ipmi_refresh_sensors(sc);
@@ -2093,3 +2099,14 @@ ipmi_dotickle(struct ipmi_softc *sc)
device_xname(sc-sc_dev), rc);
}
 }
+
+bool
+ipmi_suspend(device_t dev, const pmf_qual_t *qual)
+{
+   struct ipmi_softc *sc = device_private(dev);
+
+   /* Don't allow suspend if watchdog is armed */
+   if ((sc-sc_wdog.smw_mode  WDOG_MODE_MASK) != WDOG_MODE_DISARMED)
+   return false;
+   return true;
+}


Re: Add NULL power handler to x86/ipmi(4)

2010-07-17 Thread Paul Goyette

On Sat, 17 Jul 2010, David Young wrote:


It is a generic capability.  ...


That's the conclusion I came to.


... ISTM watchdog timers should eventually be
refactored in this way: each watchdog timer in the system should have a
corresponding pseudo-device, an instance of wdog(4). wdog(4) provides
its own suspension/resumption/detachment routines.  Let wdog_suspend()
return false if the watchdog is active (for now). ipmi0 should attach
wdog0 using a wdogbus interface attribute.  Let struct sysmon_wdog
become the watchdog attachment arguments, wdog_attach_args.


That's probably a bit more work than I want to commit to right now.  I 
was hoping there would be an easier way, since the sysmon pseudo-device 
already exists (with exactly one device-minor for wdogs).  But I could 
not find anywhere a device_t, so no way to register a power handler.


A also looked at just creating a new pmf device class for watchdogs, 
since most watchdogs are actually associated with some other hardware 
device (and therefore there's a device_t to use when registering the 
handler).  But there's pseudo-device swwdog(4) that has no hardware 
behind it, so I ended up back with the same problem!



Your patch looks fine.


Thanks.  I will commit this for now, with a big XXX in the commit log.

I've also got a patch for itesio(4).  Currently, itesio registers a NULL 
suspend handler, whether or not the particular device includes a wdog. 
My patch will register a real suspend handler if the wdog is present.




I'll add watchdog-factoring to my To-Do list, but for now I'll just 
commit the changes for itesio(4) and ipmi(4).




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: power management and pseudo-devices

2010-07-19 Thread Paul Goyette

On Mon, 19 Jul 2010, Antti Kantee wrote:


Include ioconf.h I think.


Tried that.  It works for compiling the kernel.  Unfortunately,
swwdog is included in rump, and there doesn't seem to be an ioconf.h
available for the librump build.


Well, whatever.  I don't think I want to look at that.


:)


Actually, I still think this is wrong.  It might make librump compile
and even link, that doesn't mean it will be usable if nothing ever
creates that extern symbol.  You'll have to check with pooka or explore
the code, but there have to be some components of rump that have partial
configuration files.  After all, he created the ioconf directive for
that purpose.


First of all, let's take a step up from the trenches and try to understand
the problem we're dealing with instead of trying to arbitrarily guess how
to fix the build.  We have two ways of building kernel code: monolithic
(./build.sh kernel=CONF) and modular (kernel modules, rump components).
The latter ones currently always do config stuff on-demand, so changes
cause breakage.  Should a swwdog kernel module exist (why isn't there
one?), it would run into the same problem.

Now, the ioconf keyword for config(1) is meant to help build modular
kernel code by allowing to specify a partial config file.  Currently,
as the name implies, it takes care of creating ioconf.[hc], namely in
this case struct cfdriver swwdog_cd.


Hmmm, I don't seem to see any mention of the ioconf keyword in the current 
config(1) or config(5) man pages.



Adding a SYSMON.ioconf will solve the problem:
=== snip ===
ioconf sysmon

include conf/files

pseudo-device swwdog
=== snip ===

The good news is that if some day a sysmon kernel module is added,
the exact same ioconf can be used without having to once again run
into trouble.


Yay - it appears to work!  Revised diffs are attached to this E-mail.



Then let's view the broader scale.  I think acpibat is currently the
only kernel module using ioconf, and I haven't bothered converting others
since I have realized that the scope of ioconf was a little too narrow.
I'm planning to change the keyword to module and add support for source
files.  This way the default build for every modular kernel component
goes through config, and we can avoid issues due to config changes.
IIRC I have the config(1) part for this done already, but being able to
use an autogenerated SRCS list requires some bsd.subdir.mk style advanced
make hackery which I haven't been in the mood for.  Plus, there's of
course the mess with file some.c foo | bar  (baz ^ xyzzy)  !!!frobnitz.

Finally, on the eternal someone should astral plane, someone should
fix the kernel build to consist of building a set of modules and linking
them together, so we don't have more than one way to skin a kernel.


I'm pretty sure someone != me  :)



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: sys/rump/dev/lib/libsysmon/SYSMON.ioconf
===
RCS file: sys/rump/dev/lib/libsysmon/SYSMON.ioconf
diff -N sys/rump/dev/lib/libsysmon/SYSMON.ioconf
--- /dev/null   1 Jan 1970 00:00:00 -
+++ sys/rump/dev/lib/libsysmon/SYSMON.ioconf19 Jul 2010 12:50:54 -
@@ -0,0 +1,8 @@
+#  $NetBSD$
+#
+
+ioconf swwdog
+
+include conf/files
+
+pseudo-device  swwdog
Index: sys/rump/dev/lib/libsysmon/Makefile
===
RCS file: /cvsroot/src/sys/rump/dev/lib/libsysmon/Makefile,v
retrieving revision 1.2
diff -u -p -r1.2 Makefile
--- sys/rump/dev/lib/libsysmon/Makefile 16 Feb 2010 20:42:45 -  1.2
+++ sys/rump/dev/lib/libsysmon/Makefile 19 Jul 2010 12:50:54 -
@@ -4,6 +4,7 @@
 .PATH: ${.CURDIR}/../../../../dev/sysmon
 
 LIB=   rumpdev_sysmon
+IOCONF=SYSMON.ioconf
 
 SRCS=  sysmon_taskq.c sysmon_power.c sysmon_envsys.c sysmon_envsys_events.c \
sysmon_envsys_tables.c sysmon_envsys_util.c sysmon_wdog.c sysmon.c \
Index: sys/dev/sysmon/files.sysmon
===
RCS file: /cvsroot/src/sys/dev/sysmon/files.sysmon,v
retrieving revision 1.11
diff -u -p -r1.11 files.sysmon
--- sys/dev/sysmon/files.sysmon 30 Jan 2010 21:55:28 -  1.11
+++ sys/dev/sysmon/files.sysmon 19 Jul 2010 12:50:54 -
@@ -18,5 +18,5 @@ file  dev/sysmon/sysmon_wdog.csysmon_wdo
 file   dev/sysmon/sysmon.c sysmon_envsys | sysmon_wdog |
sysmon_power
 
-defpseudo swwdog: sysmon_wdog
+defpseudodev swwdog: sysmon_wdog
 filedev/sysmon/swwdog.cswwdog
Index: sys/dev

Modules loading modules?

2010-07-25 Thread Paul Goyette


Consider the following scenario:

We have a modular device driver, let's call it xxxmod.  The driver's
xxx_modcmd(MODULE_CMD_INIT, ...) routine handles all of its initialization,
including interaction with autoconf(9).  It therefore might attempt to use
an optional module (e.g., zzzverbose) to print some device attachment
messages.

(Another possible scenario would be to have one modular driver xxxmod
that calls config_found_xxx() which in turn requires another modular
driver yyymod.)

We have no problem if the zzzverbose module is already auto-loaded. But
if the module is not present, or if it has already been auto- unloaded,
the current code will call module_autoload() to attempt to (re)load it.
Since this call happens in xxx_modcmd(), we end calling mutex_enter()
while the mutex is still owned.

There is currently a requires mechanism that allows recursion within
the module loading process.  But it doesn't quite solve the problem, for
at least three reasons.  First, a required module cannot be optional. If
the desired module is not present, or if it is present but its own
xxx_modcmd(MODULE_CMD_INIT, ...) fails, the failure is propagated back
to the original outer call to module_load() which will also fail.

The second reason why this is not suitable is that the outer load will
add a reference to the module, preventing it from being auto-unloaded.
(A key benefit of having the xxx_verbose modules is that their large text
tables can be unloaded.)

A third reason that tends to make this option unuseable is the need to
identify ahead of time all of the possible optional modules and maintain
the dependency list.  Adding a new driver would thus require updating the
dependency list in the parent device's driver.

Any suggestions on how to resolve this conundrum?  Without a solution,
the zzz_verbose modules and modularized drivers that might use the 
zzz_verbose modules cannot co-exist, and we won't be able to have modular

drivers for devices whose children are served by other modular drivers.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: envsys issues [was Re: envstat wrong: who's at fault?]

2010-07-26 Thread Paul Goyette

On Mon, 26 Jul 2010, der Mouse wrote:


[moved from port-i386]

[me]

In passing, would it be appropriate and/or useful to suggest
improvements to [the envsys(4)] API?  When I was writing code, I
found the envsys(4) ioctls to be deficient for my purposes.

[Paul Goyette]

I'd be interested in knowing in which ways you found the current API
lacking.  [...]


Well, some things are underdocumented.  For example, it seems that most
sensors are not in the units specified by envsys_tre_data_t.units, but
rather in 1e-6 of that unit - for example ENVSYS_SWATTHOUR sensors are
not in watthours but in microwatthours - and, as far as I've been able
to find, this factor of 1e6 is not documented anywhere but the source
to envstat(8).  Except for temperature, which is specifically said to
be in microKelvins.  But others are said to be in volts, amps, etc.  To
quote both the envsys(4) and sys/envsys.h,

union { /* all data is given */
uint32_t data_us;   /* in microKelvins, */
int32_t data_s; /* rpms, volts, amps, */
} cur, min, max, avg;   /* ohms, watts, etc */
/* see units below */


This is greatly cleaned up in the envsysV2 implementation.




If the micro is supposed to apply to all those units, not just
Kelvins, then (a) rpms needs to be removed from the list and (b) the
wording needs to be improved.

But the one piece I've found so far that isn't just a lack of
documentation (well, unless there are totally undocumented calls) is
that there's no way to fetch multiple sensors' values without potential
for skew between them..  For example, to quote from the code I've been
developing,


In envsysV2, a single call is used to retrieve the whole set of sensors 
(in a prop dictionary) so you don't need to worry about skew caused by 
userland sampling.  Of course, there is still the possibility of skew 
caused by the driver's need to fiddle with hardware registers.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-07-26 Thread Paul Goyette

On Mon, 26 Jul 2010, der Mouse wrote:


We have a modular device driver, let's call it xxxmod.  [...]  It []
might attempt to use an optional module (e.g., zzzverbose) to print
some device attachment messages.



First, a required module cannot be optional.  If the desired module
is not present, or if it is present but its own
xxx_modcmd(MODULE_CMD_INIT, ...) fails, the failure is propagated
back to the original outer call to module_load() which will also
fail.



The second reason why this is not suitable is that the outer load
will add a reference to the module, preventing it from being
auto-unloaded.


Surely the right answer here is to provide a way to say refer to this
module, but it's ok for its load to fail, and it's ok for it to get
auto-unloaded, including passing up whatever information is necessary
for the calling module to do something useful in failure cases?


I actually considered adding a set of desired modules as well as the 
current required modules.  That would certainly take care of the 
immediate problem with the xxx_verbose modules.


But it doesn't address the need to have one module/driver loading 
another one for a child device, and it would be a maintenance nightmare 
to require updating the dependency-list every time a new child driver 
gets created.



It really seems to me that the module system is there to help us, not
to shackle us, and that if it has properties which are leading to
problems, one of the options we should at the very least be considering
is changing those properties.


Yes, the current restriction appears to be burdensome.  I'm just trying 
to get some consensus on that - so far I've seen only one negative 
opinion (from pooka@ IIRC).  If we can agree that it's desirable for one 
module's xxx_modcmd() to load another module, then we can move to step 2 
and figure out how to make it happen without breaking anything.


If we don't get consensus, then I'll most likely need to revert the 
xxx_verbose modules.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-07-29 Thread Paul Goyette

On Wed, 28 Jul 2010, Andrew Doran wrote:


If modload-from-modcmd is found necessary, sounds more like a case for
the infamous recursive lock.


Recursive lock is the way to go.  I think the same lock should also cover
all device configuration activites (i.e. autoconf) and any other
heavy lifting where we have chunks of the system coming and going.


OK, I'm willing to make an attempt to implement the recursive_lock if 
noone else wants to give it a try!  It will probably take some time...




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-07-31 Thread Paul Goyette

On Wed, 28 Jul 2010, Andrew Doran wrote:


it seems to me the root problem is that module_mutex is held while
calling into the module startup routines.

thus, the right solution is to remove this requirement.


Yes, that's what is needed.


I'm far from convinced that's a good idea.  First, it will probably
make the module code a nightmare -- what happens when you have multiple
interleaved loads, some of which fail at some point in their dependency
stack, and let's just throw in a manual modunload to mix up things
further.  Second, and pretty much related to number one, it goes against
one of the most fundamental principles of robust code: atomic actions.


This: atomicity.  The atomic behaviour is relied upon to give all or
nothing semantics upon load and unload, like transactions in a database.

Admittely some modules (not of my making) are sloppy about this and so
break that bit of the interface contract.


If modload-from-modcmd is found necessary, sounds more like a case for
the infamous recursive lock.


Recursive lock is the way to go.  I think the same lock should also cover
all device configuration activites (i.e. autoconf) and any other
heavy lifting where we have chunks of the system coming and going.


Well, folks, here is a first pass recursive locks!  The attached diffs 
are against -current as of a few minutes ago.


Some caveats:

1. This has only been compile-tested for x86 (amd64  i386).

2. Other architectures have not even been compiled yet.  It should
   work but you never know.

3. HPPA is an exceptional exception here, since it is the only in-tree
   architecture I found that cannot use SIMPLE_LOCKS.

4. There is only one known use case for this so far:  modules loading
   other modules from within their xxx_modcmd() routine.  The specific
   use case we have involves loading the acpicpu driver/module which
   eventually results in an attempt to load acpiverbose.

It would be really nice if the community could

A. Compile-test on additional architectures, especially HPPA
B. Test to see that existing mutex operations still work correctly
C. Exercise the known use case if possible
D. Identify additional use cases


There is one place in the code (marked with great big XXX) where I am 
simply unsure if I need to use a CAS-type operation for updating the 
reference count on the recursive mutex.  I _think_ it is save to simply 
auto-increment the field, but I am totally willing for any of you 
experts to tell my why this might not be safe.   :)


I will be updating one of my home machines (amd64) to use this code in 
the next few days.  I don't plan on making any commits until I've had 
some positive testing feedback, and hopefully some expert commentary on 
the XXX.  (Some additional concensus that this is the right thing to 
do would also be appreciated!)




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: sys/arch/alpha/include/mutex.h
===
RCS file: /cvsroot/src/sys/arch/alpha/include/mutex.h,v
retrieving revision 1.4
diff -u -p -r1.4 mutex.h
--- sys/arch/alpha/include/mutex.h  28 Apr 2008 20:23:11 -  1.4
+++ sys/arch/alpha/include/mutex.h  31 Jul 2010 22:34:30 -
@@ -43,7 +43,11 @@ struct kmutex {
 
 struct kmutex {
union {
-   volatile uintptr_t  mtxa_owner;
+   struct {
+   volatile uintptr_t  mtxa_owner;
+   volatile uint8_tmtxa_recurse;
+   volatile uint8_tmtxa_unused[3];
+   } a;
struct {
volatile uint8_tmtxs_flags;
ipl_cookie_tmtxs_ipl;
@@ -53,7 +57,8 @@ struct kmutex {
__cpu_simple_lock_t mtx_lock;
 };
 
-#definemtx_owner   u.mtxa_owner
+#definemtx_owner   u.a.mtxa_owner
+#definemtx_recurse u.a.mtxa_recurse
 #definemtx_flags   u.s.mtxs_flags
 #definemtx_ipl u.s.mtxs_ipl
 
Index: sys/arch/arm/include/mutex.h
===
RCS file: /cvsroot/src/sys/arch/arm/include/mutex.h,v
retrieving revision 1.10
diff -u -p -r1.10 mutex.h
--- sys/arch/arm/include/mutex.h28 Apr 2008 20:23:14 -  1.10
+++ sys/arch/arm/include/mutex.h31 Jul 2010 22:34:31 -
@@ -57,7 +57,11 @@ struct kmutex {
 struct kmutex {
union {
/* Adaptive mutex

Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Sun, 1 Aug 2010, Antti Kantee wrote:


Well, folks, here is a first pass recursive locks!  The attached diffs
are against -current as of a few minutes ago.


Oh, heh, I thought we have recursive lock support.  But with that gone
from the vfs locks, I guess not (apart from the kernel lock ;).


:)



I'm not sure if it's a good idea to change the size of kmutex_t.  I guess
plenty of data structures have carefully been adjusted by hand to its
size and I don't know of any automatic way to recalculate that stuff.

Even if not, since this is the only user and we probably won't have
that many of them even in the future, why not just define a new type
``rmutex'' which contains a kmutex, an owner and the counter?  It feels
wrong to punish all the normal kmutex users for just one use.  It'll also
make the implementation a lot simpler to test, since it's purely MI.

separate normal case and worst case


Good point, and it will be a lot less work, too!  :)  And it solves the 
problem of not permitting a rmutex being used with condvars.


One question:  Since an adaptive kmutex_t already includes an owner 
field, would we really need to have another copy of it in the rmutex_t 
structure?



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Sun, 1 Aug 2010, Antti Kantee wrote:


I'm not sure if it's a good idea to change the size of kmutex_t.  I
guess plenty of data structures have carefully been adjusted by hand
to its size and I don't know of any automatic way to recalculate that
stuff.

Even if not, since this is the only user and we probably won't have
that many of them even in the future, why not just define a new type
``rmutex'' which contains a kmutex, an owner and the counter?  It 
feels wrong to punish all the normal kmutex users for just one use.

It'll also make the implementation a lot simpler to test, since it's
purely MI.

separate normal case and worst case


Round two!  Taking pooka's suggestion, this version is built on top of 
(rather than beside) the existing non-recursive mutex.  As such, it does 
not affect any MD code.


Attached is a set of diffs that

1. Adds sys/sys/rmutex.h and sys/kern/kern_rmutex.c to implement
   recursive adaptive mutexes.  (Conspicuously missing is an rmutex(9)
   man page...  It will happen before this gets committed.)

2. Converts the existing module_lock from a normal kmutex_t to an
   rmutex_t

3. Updates all of the (surprisingly many) places where module_lock
   is acquired.

Compile-tested on port-amd64 (including rumptest).  Since there are no 
MD-changes in this version, there shouldn't be any issues with 
building on other ports.



As previously noted, there is only one known use case for this so far: 
modules loading other modules from within their xxx_modcmd() routine. 
The specific use case we have involves loading the acpicpu driver/module 
which eventually results in an attempt to load acpiverbose.


It would be really nice if the community could

A. Compile-test on additional architectures
B. Test to see that existing mutex operations still work correctly
C. Exercise the known use case if possible
D. Identify additional use cases


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: sys/conf/files
===
RCS file: /cvsroot/src/sys/conf/files,v
retrieving revision 1.992
diff -u -p -r1.992 files
--- sys/conf/files  7 Jul 2010 01:09:39 -   1.992
+++ sys/conf/files  2 Aug 2010 00:08:26 -
@@ -1468,6 +1468,7 @@ file  kern/kern_prot.c
 file   kern/kern_ras.c
 file   kern/kern_rate.c
 file   kern/kern_resource.c
+file   kern/kern_rmutex.c
 file   kern/kern_runq.c
 file   kern/kern_rwlock.c
 file   kern/kern_rwlock_obj.c
Index: sys/sys/rmutex.h
===
RCS file: sys/sys/rmutex.h
diff -N sys/sys/rmutex.h
--- /dev/null   1 Jan 1970 00:00:00 -
+++ sys/sys/rmutex.h2 Aug 2010 00:08:27 -
@@ -0,0 +1,53 @@
+/* $NetBSD$*/
+
+/*-
+ * Copyright (c) 2010 The NetBSD Foundation, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
+ * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef_SYS_RMUTEX_H_
+#define_SYS_RMUTEX_H_
+
+#include sys/mutex.h
+
+/*
+ * A recursive mutex is simply an adaptive mutex which can be acquired
+ * multiple times by a single owner.
+ */
+ 
+struct rmutex {
+   kmutex_trmtx_mtx;
+   uint32_trmtx_recurse;
+};
+
+typedef struct rmutex rmutex_t;
+
+void   rmutex_init(rmutex_t *);
+void   rmutex_destroy(rmutex_t *);
+void   rmutex_enter

Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Mon, 2 Aug 2010, Quentin Garnier wrote:


+int
+rmutex_tryenter(rmutex_t *rmtx)
+{
+   int rv = 1;
+
+   if (mutex_owned(rmtx-rmtx_mtx)) {
+   rmtx-rmtx_recurse++;
+   KASSERT(rmtx-rmtx_recurse != 0);
+   } else if ((rv = mutex_tryenter(rmtx-rmtx_mtx)) != 0) {
+   rmtx-rmtx_recurse++;
+   KASSERT(rmtx-rmtx_recurse != 0);
+   }
+   return rv;
+}


I am probably not getting the idea, but I fail to see how this qualifies
as a mutex.  My reading of this piece of code is that rmutex_enter() is
always going to succeed immediately.


According to the mutex(9) man page:

mutex_owned(mtx)

   For adaptive mutexes, return non-zero if the current LWP holds
   the mutex. ...

The rmutex_init() code ensures that we have an adaptive mutex.  So what 
this says is


If _we_ already own the mutex,
increment counter
(rv is still initialized to 1)
Else
if we can acquire the mutex (updating rv)
increment the counter

Return



Please correct me if I'm wrong--and I am very probably wrong--, but
isn't the point of a recursive mutex to allow the thread initially
taking the mutex to take it again, but not other threads?  The code
would have to be a lot more complicated than that.

All I see here is a fancy no-op, but maybe it's just me.


It would be a no-op for a recursive spin mutex.  That's why I didn't
allow them!  :)


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Sun, 1 Aug 2010, John Nemeth wrote:


I'm thinking the acquisition of module_lock should be pushed into
module_autoload() instead of having the caller acquire it for this very
reason.  It makes it hard to change the way locking works in the
MODULAR code if you expect the caller to acquire the lock.  I don't
know why it was done this way originally, or what the consequences (if
any) would be for making the change.  Andrew, any thoughts on this?


I was thinking the same thing.

There is one case in kern/kern_exec.c where the lock was held while the 
code loops through a list of possible modules.  Looking closer, that 
code actually releases the lock and yield()s before re-acquiring it 
within the loop, so it probably could be modified without any adverse 
effects.


This would actually simplify the interface, since ALL of the routines 
would be doing their own locking, and noone in the rest of the kernel 
would need to know about the lock.





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Mon, 2 Aug 2010, Masao Uebayashi wrote:


mutex_owned(mtx)

  ...

  mutex_owned() is provided for making diagnostic checks to verify
  that a lock is held.  For example:

  KASSERT(mutex_owned(driver_lock));

  It should not be used to make locking decisions at run time, or to
  verify that a lock is not held.

Thus you can't use it for your decision.


Yes, I did read the mutex(9) man page.

However, your assertion is at odds with pooka's suggestion, from
http://mail-index.netbsd.org/tech-kern/2010/08/01/msg008632.html


Good point.  I think it's ok to do:

  if (mutex_owned(mtx))
cnt++
  else
lock




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-02 Thread Paul Goyette

On Sun, 1 Aug 2010, Eduardo Horvath wrote:


Why is there a need to hold a mutex across module loading instead of, say,
having a reference count tht's protected by a mutex?  I thought holding a
mutex across a potentially blocking operation such as a module load is
considered bad practice.


Please note that I am not introducing the holding of the lock over the 
potentially blocking operation.  This is already done.


Several others have already indicated that the atomicity of the module 
load (including the loading of any required modules) is a desirable 
property.  I'm simply looking for a mechanism that allows us to keep 
this property while allowing modules to load other modules.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Locking for module_autoload() (was: Modules loading modules?)

2010-08-02 Thread Paul Goyette

On Mon, 2 Aug 2010, Adam Hamsik wrote:



On Aug,Monday 2 2010, at 4:27 AM, Paul Goyette wrote:


On Sun, 1 Aug 2010, John Nemeth wrote:


   I'm thinking the acquisition of module_lock should be pushed into
module_autoload() instead of having the caller acquire it for this very
reason.  It makes it hard to change the way locking works in the
MODULAR code if you expect the caller to acquire the lock.  I don't
know why it was done this way originally, or what the consequences (if
any) would be for making the change.  Andrew, any thoughts on this?


Attached are diffs for moving the locking out of each caller and into 
module_autoload() itself.  Most of these changes are fairly trivial, but a 
couple of them should be looked at more closely in case I've missed any 
subtleties in the original code:

kern/kern_syscall.c
net/if_ppp.c


I think that this patch has same problem as original code you can call 
module_autoload twice in same lwp. (you are holding module_lock while 
calling module_do_load).


This patch wasn;t intended to solve the nested-call problem.  This patch 
only purpose was to modify the existing semantics of the module_autoload 
routine to match that of the other module routines.  In particular, 
module_autoload is the only routine that requires its caller to acquire 
the mutex first;  all the other routines do all the required locking 
internally to each routine.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-02 Thread Paul Goyette

On Mon, 2 Aug 2010, Antti Kantee wrote:


On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote:

this is an incomplete reading of the manual page, and you can not
use mutex_owned() the way you are trying to (regardless of what
pooka posted.) you can't even using it in various forms of assertions
safely.  from the man page:

It should not be used to make locking decisions at run time, or to
verify that a lock is not held.


That's the mantra, yes.


ie, you can not even KASSERT(!mutex_owned()).


Strictly speaking you can in a case where you have two locks always
are taken as l1,l2 and released as l2,l1 provided you're not dealing
with a spin mutex.  Does it make any sense?  no (l2 is useless).
Is it possible?  yes.

Now, on to sensible stuff.  I'm quite certain that warning was written
to make people avoid writing bad code without understanding locking --
if you need to used mutex_owned() to decide if you should lock a normal
mutex, your code is broken.  The other less likely possibility is that
someone plans to change mutex_owned in the future.

Further data point: the same warning is in rw_held, yet it was used to
implement recursive locking in vlockmgr until its very recent demise.

Ignoring man page mantras and focusing on how the code works, I do not
see anything wrong with Paul's use of mutex_owned().


but i'm still not sure why we're going to such lengths to hold a
lock across such a heavy weight operation like loading a module.
that may involve disk speeds, and so you're looking at waiting
millions of cycles for the lock.  aka, what eeh posted.


It's held for millions of cycles already now and nobody has pointed out
measurable problems.  But if it is deemed necessary, you can certainly
hide a cv underneath.  The efficiency requirements for the module lock
are probably anyway in the who cares wins ... not spectrum.  At least
I'm not aware of any fastpath wanting to use it.

Anyway, no real opinion there.  A cv most likely is the safe, no-brainer
choice.


Yes, the cv mechanism, coupled with changing the semantics of 
module_autoload() (to not require the caller to obtain the lock) makes 
good sense here.  I'm working on it and I should be able to post new 
diffs tomorrow.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-02 Thread Paul Goyette

On Mon, 2 Aug 2010, Paul Goyette wrote:

Yes, the cv mechanism, coupled with changing the semantics of 
module_autoload() (to not require the caller to obtain the lock) makes good 
sense here.  I'm working on it and I should be able to post new diffs 
tomorrow.



OK, it's time for Round #3!

Attached are diffs that implement Adam's suggested cv-based solution,
along with moving the locking for module_autolock() into the routine 
itself (for consistency with other routines).  I've also updated the 
module(4) man page this time around.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

MODULE_LOCKING.diff.gz
Description: Binary data


re: Modules loading modules?

2010-08-02 Thread Paul Goyette
The most recent verion of this patch set was recently posted, and it 
uses the cv_wait() mechanism suggested by Adam Hamsik.


This discussion is still very intresting, but perhaps we could start a 
new thread (or at least change the subject line)?  :)



On Tue, 3 Aug 2010, matthew green wrote:




On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote:

this is an incomplete reading of the manual page, and you can not
use mutex_owned() the way you are trying to (regardless of what
pooka posted.) you can't even using it in various forms of assertions
safely.  from the man page:

It should not be used to make locking decisions at run time, or to
verify that a lock is not held.


That's the mantra, yes.


ie, you can not even KASSERT(!mutex_owned()).


Strictly speaking you can in a case where you have two locks always
are taken as l1,l2 and released as l2,l1 provided you're not dealing
with a spin mutex.  Does it make any sense?  no (l2 is useless).
Is it possible?  yes.


what is not possible is to safely to what the manual says.  you will
not have the right thing happen.  i've seen this when i've been
working on the sparc64 pmap.


Now, on to sensible stuff.  I'm quite certain that warning was written
to make people avoid writing bad code without understanding locking --
if you need to used mutex_owned() to decide if you should lock a normal
mutex, your code is broken.  The other less likely possibility is that
someone plans to change mutex_owned in the future.

Further data point: the same warning is in rw_held, yet it was used to
implement recursive locking in vlockmgr until its very recent demise.

Ignoring man page mantras and focusing on how the code works, I do not
see anything wrong with Paul's use of mutex_owned().


this just does not match my actual experience in the kernel.  i had
weird pmap-style problems and asserts firing wrongly while i did not
obey the rules in the manual directly.


.mrg.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-03 Thread Paul Goyette

On Tue, 3 Aug 2010, haad wrote:

Well, still no hurry, but it would be nice if some additional eyes were 
pointed this way.  I've got the recursive-module-load test case added

to the existing tests/modules/ stuff (and it works)!

BTW, since this is changing the kernel ABI, I guess it will need a 
version bump.  Do we have any other bumps coming soon that we can 
share?


I talked with Andrew today and he said that basicaly we can use
mutex_owned for this particular case. That should make your code much
simpler :) sorry for any additional work :)


Well, we've come full circle!  :)

It seems to me that, even if it is OK to use mutex_owned() for this 
particular case, perhaps we should not.  Most everyone seemed to agree 
that your condvar solution was technically correct, and if we did use 
the mutex_owned() solution it would only serve as a temptation for 
someone else in the future!


We now have two solutions, both of which actually work (tested on my 
home machine).  One solution (mutex_owned) might be slightly simpler and 
slightly more efficient than the other (condvar), but this section of 
code is certainly not performance-critical.  Therefore, is there any 
reason for preferring the mutex_owned solution?


I'm happy to go either way based on concensus here, but I'm leaning 
towards the condvar solution.  It doesn't require any exemptions or 
long explanations of why it is OK to use mutex_owned in ways that 
would appear to violate our own documentation.  And it also addresses 
eeh's (and mrg's) concerns of keeping the lock across potentially 
long-duration  operations.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-04 Thread Paul Goyette

On Wed, 4 Aug 2010, Andrew Doran wrote:


Sorry for not replying sooner.

Please don't add a generic recursive lock facility, it would be abused.


Yeah!  My current patches have no new generic facilities at all.


I'd like something roughly like the following.  I think this should
also cover major configuration tasks such as device attach and detach,
so it wouldn't really be module specific.  The naming is a poor suggestion.

void
sysconfig_lock(void)
{

if (mutex_owned(sysconfig_mutex)) {
/*
 * At this point it's OK for the thread to hold other
 * locks, since the system configuration lock was the
 * first taken on the way in.
 */
sysconfig_recurse++;
return;
}
/*
 * I don't remember the correct argument sequence to the
 * following, but basically we need to assert that NO locks,
 * sleep or spin, other than kernel_lock are held,
* as the system configuration lock would be the outermost
 * lock in the system.
 */
LOCKDEBUG_BARRIER(...);
mutex_enter(sysconfig_mutex);
KASSERT(sysconfig_recurse == 0);
}

For the boot autoconfig path where you have things being driven by the
kernel, this would work OK.  Situations where there's a kthread or
something attaching and detaching devices on the fly are a bit different,
since they likely have local atomicity requirements (need to take device
private locks).  There you'd probably need the caller to take the lock.
USB comes to mind (along with some hardware RAID drivers that do hotplug
attach/detach).

Thoughts?


Well, at first glance, this proposal makes sense.  But it is certainly a 
much larger task than dealing with the immediate problem - modules which 
want to load other modules.


So, I really have three questions:

1. In keeping with earlier concerns about holding locks for long periods
   of time (eeh and mrg both commented on this), the approach described
   above would seem to have an even longer hold-time than the current
   module_lock.  Would not something similar to haad's condvar approach
   be appropriate for this proposal, as well as for the more-limited-in-
   scope recursive module lock?

2. Would it be reasonable to solve the immediate problem as previously
   proposed, rather than waiting for this ultimate solution?  I think
   it would be a long time before we could find and resolve all of the
   issues that might be created in the various threaded situations
   that may exist.  I know I'm certainly not sufficiently qualified to
   identify all those situations, and I also know that I don't have
   sufficient available work-hours to do this in any reasonable time-
   frame.

   I'd still like to move forward with the most recent solution to the
   module_lock problem.

3. Since we're still technically abusing the mutex(9) man-page caveat
   about using mutex_owned() for making locking decisions, it would
   be very much appreciated (at least by myself) to have an explanation
   of WHY it is OK in this case to do it here, but not somewhere else.





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-05 Thread Paul Goyette

OK, got it.

I'll have another set of patches in a few days.

On Thu, 5 Aug 2010, Andrew Doran wrote:


On Wed, Aug 04, 2010 at 10:28:56AM -0700, Paul Goyette wrote:

On Wed, 4 Aug 2010, Andrew Doran wrote:


Sorry for not replying sooner.

Please don't add a generic recursive lock facility, it would be abused.


Yeah!  My current patches have no new generic facilities at all.


I'd like something roughly like the following.  I think this should
also cover major configuration tasks such as device attach and detach,
so it wouldn't really be module specific.  The naming is a poor suggestion.

void
sysconfig_lock(void)
{

if (mutex_owned(sysconfig_mutex)) {
/*
 * At this point it's OK for the thread to hold other
 * locks, since the system configuration lock was the
 * first taken on the way in.
 */
sysconfig_recurse++;
return;
}
/*
 * I don't remember the correct argument sequence to the
 * following, but basically we need to assert that NO locks,
 * sleep or spin, other than kernel_lock are held,
   * as the system configuration lock would be the outermost
 * lock in the system.
 */
LOCKDEBUG_BARRIER(...);
mutex_enter(sysconfig_mutex);
KASSERT(sysconfig_recurse == 0);
}

For the boot autoconfig path where you have things being driven by the
kernel, this would work OK.  Situations where there's a kthread or
something attaching and detaching devices on the fly are a bit different,
since they likely have local atomicity requirements (need to take device
private locks).  There you'd probably need the caller to take the lock.
USB comes to mind (along with some hardware RAID drivers that do hotplug
attach/detach).

Thoughts?


Well, at first glance, this proposal makes sense.  But it is
certainly a much larger task than dealing with the immediate problem
- modules which want to load other modules.

So, I really have three questions:

1. In keeping with earlier concerns about holding locks for long periods
   of time (eeh and mrg both commented on this), the approach described


For module lock the wait time isn't really a big concern.  Module load
and unload is a heavyweight operation so the wait time is expected.
There is nothing intrinsically wrong with holding a mutex for a long time
provided that you've got a good handle on the potential side effects.
It's a different story for other locks in the system like say proc_lock,
which can't be held for a long time without disrupting the user experience
and/or deadlocking the system.


   above would seem to have an even longer hold-time than the current
   module_lock.  Would not something similar to haad's condvar approach
   be appropriate for this proposal, as well as for the more-limited-in-
   scope recursive module lock?


condvar gives you another lock in a roundabout way, without priority
inheritance to help move things along if something has clogged the
pipework. :-).  So I don't see benefit over a mutex.


2. Would it be reasonable to solve the immediate problem as previously
   proposed, rather than waiting for this ultimate solution?  I think
   it would be a long time before we could find and resolve all of the
   issues that might be created in the various threaded situations
   that may exist.  I know I'm certainly not sufficiently qualified to
   identify all those situations, and I also know that I don't have
   sufficient available work-hours to do this in any reasonable time-
   frame.

   I'd still like to move forward with the most recent solution to the
   module_lock problem.


I'm not suggesting that you need to tackle autoconfiguration locking at
the same time.. What I'm suggesting is to put the primitives in place
(say in kern_lock.c or something) and then use these for the benefit of
the modules code. It would provide a statment of direction and avoid
re-hashing things later when somebody decides to tackle autoconf.
Sorry for being unclear.


3. Since we're still technically abusing the mutex(9) man-page caveat
   about using mutex_owned() for making locking decisions, it would
   be very much appreciated (at least by myself) to have an explanation
   of WHY it is OK in this case to do it here, but not somewhere else.


Ok well I think the simplest course of action would be to replace the
mutex_owned() with a global variable.  It would do much the same thing
but for the price of 4 or 8 bytes there is no question of contradicting
the mantra. So like:

/*
* Ok to check this unlocked, as for the value to equal curlwp it must
* have been set by the current thread of execution (i.e. not interrupt
* context or another LWP).
*/
l = curlwp;
if (sysconfig_lwp == l) {
/* recurse */
} else {
mutex_enter(sysconfig_lock);
sysconfig_lwp = l;
/* etc etc

Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Paul Goyette

On Thu, 5 Aug 2010, Paul Goyette wrote:


OK, got it.

I'll have another set of patches in a few days.


Well, it was a slow day at the office today, so I found some time to 
work on this!  :)


Attached is the latest version of this change.  For simplicity, I have 
broken the patches up into five separate groups:


Part 1 defines a set of new kernel locking primitives in file
kern/kern_lock.c to implement the recursive kernconfig_mutex.  (I'm not 
really stuck on the name, so open to alternative suggestions!)  All of 
the locking in sys/kern_module.c has been updated to use this instead of 
the internal module_lock mutex.  Additionally, the module_autoload() 
routine now does its own locking, rather than requiring its caller to do 
it.  And the description of locking in the module(9) man page is 
updated.


Part 2 removes all of the explicit module_{,un}lock() calls from the 
various xxx_verbose modules that I'd previous worked on.


Part 3 removes all of the explicit module_{,un}lock() calls from other 
bits and pieces of the kernel.  In a couple of places, new calls to

kernconfig_{,un}lock() are inserted.

Part 4 updates rump to provide equivalent locking routines.  (pooka, I 
would appreciate you looking at this.)


Part 5 adds a new atf test-case to the existing module tests to verify 
that recursion actually works!




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: sys/sys/param.h
===
RCS file: /cvsroot/src/sys/sys/param.h,v
retrieving revision 1.373
diff -u -p -r1.373 param.h
--- sys/sys/param.h 28 Jul 2010 11:03:47 -  1.373
+++ sys/sys/param.h 6 Aug 2010 00:46:05 -
@@ -63,7 +63,7 @@
  * 2.99.9  (299000900)
  */
 
-#define__NetBSD_Version__  599003800   /* NetBSD 5.99.38 */
+#define__NetBSD_Version__  599003900   /* NetBSD 5.99.39 */
 
 #define __NetBSD_Prereq__(M,m,p) (M) * 1) + \
 (m) * 100) + (p) * 100) = __NetBSD_Version__)
Index: sys/sys/systm.h
===
RCS file: /cvsroot/src/sys/sys/systm.h,v
retrieving revision 1.239
diff -u -p -r1.239 systm.h
--- sys/sys/systm.h 31 Jan 2010 02:04:43 -  1.239
+++ sys/sys/systm.h 6 Aug 2010 00:46:06 -
@@ -492,6 +492,11 @@ void   kernel_lock_init(void);
 void   _kernel_lock(int);
 void   _kernel_unlock(int, int *);
 
+void   kernconfig_lock_init(void);
+void   kernconfig_lock(void);
+void   kernconfig_unlock(void);
+bool   kernconfig_is_held(void);
+
 #if defined(MULTIPROCESSOR) || defined(_MODULE)
 #defineKERNEL_LOCK(count, lwp) \
 do {   \
Index: sys/sys/module.h
===
RCS file: /cvsroot/src/sys/sys/module.h,v
retrieving revision 1.24
diff -u -p -r1.24 module.h
--- sys/sys/module.h26 Jun 2010 07:23:57 -  1.24
+++ sys/sys/module.h6 Aug 2010 00:46:07 -
@@ -115,7 +115,6 @@ __link_set_add_rodata(modules, name##_mo
 TAILQ_HEAD(modlist, module);
 
 extern struct vm_map   *module_map;
-extern kmutex_tmodule_lock;
 extern u_int   module_count;
 extern u_int   module_builtinlist;
 extern struct modlist  module_list;
Index: sys/kern/kern_lock.c
===
RCS file: /cvsroot/src/sys/kern/kern_lock.c,v
retrieving revision 1.150
diff -u -p -r1.150 kern_lock.c
--- sys/kern/kern_lock.c20 Dec 2009 20:42:23 -  1.150
+++ sys/kern/kern_lock.c6 Aug 2010 00:46:07 -
@@ -39,6 +39,7 @@ __KERNEL_RCSID(0, $NetBSD: kern_lock.c,
 #include sys/systm.h
 #include sys/kernel.h
 #include sys/lockdebug.h
+#include sys/mutex.h
 #include sys/cpu.h
 #include sys/syslog.h
 #include sys/atomic.h
@@ -56,6 +57,10 @@ bool kernel_lock_dodebug;
 __cpu_simple_lock_t kernel_lock[CACHE_LINE_SIZE / sizeof(__cpu_simple_lock_t)]
 __aligned(CACHE_LINE_SIZE);
 
+static kmutex_t kernconfig_mutex;
+static lwp_t *kernconfig_lwp;
+static int kernconfig_recurse;
+
 void
 assert_sleepable(void)
 {
@@ -306,3 +311,59 @@ _kernel_unlock(int nlocks, int *countp)
if (countp != NULL)
*countp = olocks;
 }
+
+/*
+ * Functions for manipulating the kernel configuration lock.  This
+ * recursive lock should be used to protect all additions and removals
+ * of of kernel functionality, such as device configuration and loading
+ * of modular kernel components.
+ */
+
+void
+kernconfig_lock_init(void

Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Paul Goyette

On Fri, 6 Aug 2010, Mindaugas Rasiukevicius wrote:


Hello,

Paul Goyette p...@whooppee.com wrote:


Well, it was a slow day at the office today, so I found some time to
work on this!  :)

Attached is the latest version of this change.  For simplicity, I have
broken the patches up into five separate groups:

...


Few comments on the patch:


-#define__NetBSD_Version__  599003800   /* NetBSD 5.99.38 */
+#define__NetBSD_Version__  599003900   /* NetBSD 5.99.39 */


Why bumping?


Seemed like the right thing to do, since the locking protocol is 
changing, as well as the global module_lock mutex disappearing.  But I 
really can't hink of any ill effects, unless someone has a module that 
suddenly fails link-and-load because it references module_lock.  It's 
not very likely.


I'll defer, or I'll wait to ride someone else's bump just to be safe.




+ * of of kernel functionality, such as device configuration and loading


One of too much.


Fixed.


+void
+kernconfig_lock(void)
+{
+   lwp_t   *l;
+
+   /*
+* OK to check this unlocked, since it could only be set to
+* curlwp by the current thread itself, and not by an interrupt
+* or any other LWP.
+*/


Please add KASSERT(!cpu_intr_p());


Done.  I was worried about this, and forgot to ask.  Thanks for picking 
it up.





+   l = curlwp;
+   if (kernconfig_lwp == l) {
+   kernconfig_recurse++;
+   KASSERT(kernconfig_recurse != 0);


Such (++ then != 0) assert is a bit useless.  How about:

KASSERT(kernconfig_recurse  1);


The intent here was to ensure that we didn't overflow the counter back 
to zero.  It might have made sense when the counter was a uint8_t, but 
probably not with an int!  It can probably be removed completely.





+   } else {
+   mutex_enter(kernconfig_mutex);
+   kernconfig_lwp = l;
+   kernconfig_recurse = 1;
+   }
+}


Somebody will confuse l with 1. :)


This came straight from ad@'s suggestions, and there seems to be a lot 
of precedent in the sys/kern/ code to use l.  But I will select a better 
variable name.





+   if (--kernconfig_recurse == 0) {
+   kernconfig_lwp = 0;
+   mutex_exit(kernconfig_mutex);


kernconfig_lwp = NULL;


Ooops!  :)




+bool
+kernconfig_is_held(void)
+{
+
+   return (mutex_owned(kernconfig_mutex));


No need for brackets around function.


Got it.




+   KASSERT(kernconfig_is_held);


But missing brackets in few cases.


Yeah, you're the second one to point that out.  I fixed them once, not 
sure how they crept back in.  But they've been fixed again.



Otherwise seems OK to me.  Thanks.



Thank you for the detailed look.


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Paul Goyette

On Thu, 5 Aug 2010, Paul Goyette wrote:


On Thu, 5 Aug 2010, Paul Goyette wrote:


OK, got it.

I'll have another set of patches in a few days.


Well, it was a slow day at the office today, so I found some time to work on 
this!  :)


Attached is the latest version of this change.  For simplicity, I have broken 
the patches up into five separate groups:


Part 1 defines a set of new kernel locking primitives in file
kern/kern_lock.c to implement the recursive kernconfig_mutex.  (I'm not 
really stuck on the name, so open to alternative suggestions!)  All of the 
locking in sys/kern_module.c has been updated to use this instead of the 
internal module_lock mutex.  Additionally, the module_autoload() routine now 
does its own locking, rather than requiring its caller to do it.  And the 
description of locking in the module(9) man page is updated.


Part 2 removes all of the explicit module_{,un}lock() calls from the various 
xxx_verbose modules that I'd previous worked on.


Part 3 removes all of the explicit module_{,un}lock() calls from other bits 
and pieces of the kernel.  In a couple of places, new calls to

kernconfig_{,un}lock() are inserted.

Part 4 updates rump to provide equivalent locking routines.  (pooka, I would 
appreciate you looking at this.)


Part 5 adds a new atf test-case to the existing module tests to verify that 
recursion actually works!


Following up on all the various comments from jmcneil@, pooka@, and 
rmind@, I've attached a revised set of diffs.  The most significant 
change between this and the previous revision is the separation of the 
kernconfig_lock_*() stuff into a separate kern_cfg.c source file.  This 
allows inclusion of the kernel code directly into rump, rather than 
having to re-implement it in rump/klock.c




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

ML.diff.tgz
Description: Binary data


Length of wmesg for condvar?

2010-08-08 Thread Paul Goyette
The condvar(9) man page states The wmesg argument specifies a string of 
no more than 8 characters... yet there is at least one instance in the 
kernel sources where we have a wmesg long than 8 characters!  In 
src/sys/kern/kern_module.c in routine module_init() we have


cv_init(module_thread_cv, modunload);

There appear to be several other similar violations in various places:

src/sys/arch/x86/x86/ipmi.c
src/sys/arch/xen/xen/balloon.c
src/sys/dev/isa/fd.c
src/sys/dev/tprof/tprof.c
src/sys/dist/ipf/netinet/ip_auth.c
src/sys/dist/ipf/netinet/ip_log.c
src/sys/dist/ipf/netinet/ip_sync.c
src/sys/fs/nilfs/nilfs_subr.c
src/sys/fs/nilfs/nilfs_vfsops.c
src/sys/kern/sys_pipe.c
src/sys/net/agr/if_agr.c
src/sys/opencrypto/crypto.c
src/sys/rump/librump/rumpkern/sysproxy_socket.c

Should these be changed?  Are there any adverse effects from having a 
wmesg longer than 8 characters?


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Length of wmesg for condvar?

2010-08-09 Thread Paul Goyette

On Mon, 9 Aug 2010, Matthew Mondor wrote:


On Sun, 8 Aug 2010 17:23:23 -0700 (PDT)
Paul Goyette p...@whooppee.com wrote:


Should these be changed?  Are there any adverse effects from having a
wmesg longer than 8 characters?


It seems to me that the exporter of those use strncpy() (i.e.
kern/init_sysctl.c) and that the structures use WMESGLEN and
KI_WMESGLEN both defined as 8.  So other than inadvertently truncated
names it at least should not cause corruption, but I think that
truncated names could also be problematic when trying to distinguish
two strings starting with the same 8 characters (is that likely now)?
Especially when the only thing that differs between two states is some
suffix like rd and rw...  After all, those are intended for
humans :)



Does anyone object to my going through and coming up with shorter names 
(= 8 chars) for these condvars?


I'll leave changing the exporters from strncpy() to strlcpy() for 
another day.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-15 Thread Paul Goyette

Updating the status on these changes:

One comment questioned whether or not a version bump was required, and 
I've more-or-less convinced myself it is at least desired.  While 
properly-working modules from the pre-update will continue to work on a 
post-update kernel, the reverse is not necessarily true.  A module 
written for a post-update kernel and which takes advantage of the 
changed locking protocol will fail on a pre-update kernel.


Another comment suggested that the name of newly-created file 
kern/kern_cfg.c should be changed to more closely match its contents, 
while an earlier comment had suggested generalizing the filename!  I've 
taken the more-specific path and the file is now called kern_cfglock.c
If other stuff gets added to this file later, we can come up with a more 
generic name at that time.


Some additional changes have been included in this latest set of diffs. 
Mostly, there were some KASSERT()s sprinkled throughout that tried to 
ensure that the code had not been called recursively.  Since recursion 
is now explicitly supported, all of the module_active stuff has been 
reworked.


Additionally, there was a statically-allocated list of pending modules 
that needed to have their initialization completed.  This has been 
changed to keep multiple stack-allocated lists, one for each depth.



I'm planning to commit these changes next weekend, unless there are any 
significant objections.  If anyone else needs to coordinate changes in 
order to ride-the-version-bump, let me know.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

ML.diff.tgz
Description: Binary data


Re: kernel module loading vs securelevel

2010-10-15 Thread Paul Goyette
If I remember correctly, autoload bypasses the authorization calls 
within the module subsystem.  Autoload also works ONLY for modules that 
are either built-in, passed by the bootloader, or located in the 
official directory;  autoloading from the filesystem is prohibited if 
the pathname contains any '/'.




On Sat, 16 Oct 2010, Izumi Tsutsui wrote:


XXX: module files can be loaded only on single user?


It looks kernel modules can't be loaded on multi user,
i.e. if kernel securelevel is 1, unless options INSECURE is specified.

i386 has options INSECURE by default so it just works,
but is it intended feature?


It would seem to be intentional.  After all, kernel modules can
do all sorts of nasty things if they want to.


In that case, module autoload/autounload is not functional at all and
we have to specify all possible necessary modules explicitly
during boot time??

---
Izumi Tsutsui

!DSPAM:4cb91b7e2433872420069!





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: kernel module loading vs securelevel

2010-10-16 Thread Paul Goyette

On Sat, 16 Oct 2010, Izumi Tsutsui wrote:


autoload/autounload does NOT perform any authorization checks - please
look at the code!  No checking of securelevel occurs, as far as I can
see.  For autoload, the module name must not contain a '/', so if the
module is being loaded from the file system it must be loaded from the
blessed /stand/${ARCH}/${VERSION}/modules directory.  Including the
INSECURE option will have no effect on autoloading of modules.


Hmm.

I built MODULAR kernels on news68k and sun3 (which didn't have INSECURE)
but I couldn't use TMPFS or execute a.out binaries on multiuser
though they worked after shutdown(8) or on single user.

The code doesn't work as intended and just we should fix it?


Hmmm.  Maybe I am reading the code wrong.  But the intent of the code 
seems to be quite clear.  The manual load explicitly calls kauth_...() 
while the auto-load path does not make any such call.


Perhaps there is an additional authorization call in kobj_load_vfs() 
(which does the actual loading).  A quick grep of subr_kobj*.c for 
kauth_ does not reveal anything obvious.


Could you rerun your testing after setting sysctl kern.module.verbose? 
This should provide extra kernel debug printf() messages...




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: kernel module loading vs securelevel

2010-10-16 Thread Paul Goyette

Hmmm.

OK, I will build an i386 kernel _without_ INSECURE and see if I can 
reproduce the problem.



On Sat, 16 Oct 2010, Izumi Tsutsui wrote:


Could you rerun your testing after setting sysctl kern.module.verbose?
This should provide extra kernel debug printf() messages...


There are some info during boot (before securelevel is set to 1):

---
:
Mounting all filesystems...
DEBUG: module: plist load returned error 2 for 
`/stand/sun3/5.99.39/modules/mfs/mfs.kmod'
DEBUG: module: plist load returned error 2 for 
`/stand/sun3/5.99.39/modules/kernfs/kernfs.kmod'
DEBUG: module: plist load returned error 2 for 
`/stand/sun3/5.99.39/modules/procfs/procfs.kmod'
DEBUG: module: plist load returned error 2 for 
`/stand/sun3/5.99.39/modules/ptyfs/ptyfs.kmod'
DEBUG: module: cannot unload module `mfs' error=16
DEBUG: module: cannot unload module `kernfs' error=16
DEBUG: module: cannot unload module `procfs' error=16
Clearing temporary files.
DEBUG: module: cannot unload module `ptyfs' error=16
Starting amd.
Creating a.out runtime link editor directory cache.
Checking quotas: done.
Setting securelevel: kern.securelevel: 0 - 1
:
---

All file systems (mfs, kernfs, procfs, and ptyfs) are
mounted properly, though.

On multi user, there is no particular info:

---
# mount -t tmpfs tmpfs /mnt
mount_tmpfs: tmpfs on /mnt: Operation not supported by device
# shutdown now
Shutdown NOW!

:

Enter pathname of shell or RETURN for /bin/sh:
Terminal type is kterm.
We recommend creating a non-root account and using su(1) for root access.
# mount -t tmpfs tmpfs /mnt
DEBUG: module: plist load returned error 2 for 
`/stand/sun3/5.99.39/modules/tmpfs/tmpfs.kmod'
chariot# df /mnt
Filesystem   1K-blocks   Used  Avail %Cap Mounted on
tmpfs 49480  8  49472   0% /mnt
# DEBUG: module: cannot unload module `tmpfs' error=16

---
Izumi Tsutsui

!DSPAM:4cb9a96b2437545078506!





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: kernel module loading vs securelevel

2010-10-16 Thread Paul Goyette

On Sat, 16 Oct 2010, John Nemeth wrote:


sys/kern/kern_module.c:module_autoload() makes almost the exact
same call to kauth_authorize_system as does module_load().  The
difference is that the second last parm is (void *)(uintptr_t)1.  What
difference this makes is going to be buried in the bowels of kauth, and
I'm not going to dig through that at this moment.


Hmmm, missed that one.  You're right.

The arg is, I think, the one that gets passsed to the kauth listener as 
arg2, which should cause us to return KAUTH_RESULT_ALLOW for autoload:


static int
module_listener_cb(kauth_cred_t cred, kauth_action_t action, void *cookie,
void *arg0, void *arg1, void *arg2, void *arg3)
{
int result;

result = KAUTH_RESULT_DEFER;

if (action != KAUTH_SYSTEM_MODULE)
return result;

if ((uintptr_t)arg2 != 0)   /* autoload */
result = KAUTH_RESULT_ALLOW;

return result;
}





}-- End of excerpt from Paul Goyette

!DSPAM:4cb9cb6e2431088414278!





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: kernel module loading vs securelevel

2010-10-16 Thread Paul Goyette

On Sun, 17 Oct 2010, Masao Uebayashi wrote:


On Sat, Oct 16, 2010 at 08:57:04AM -0700, John Nemeth wrote:

On Jan 31,  5:14pm, Paul Goyette wrote:
} On Sat, 16 Oct 2010, Izumi Tsutsui wrote:
}
}  Hmm, what do you think about this feature?
}  Only available in INSECURE environment?
} 
}  We trust modules at the time when they're installed into the trusted
}  place, same as kernel itself.  I think prohibiting module load  at
}  run-time is rather pointless.
} 
}  Well I think the point is whether we should require INSECURE or not
}  to use module autoload/autounload after multiuser.
} 
}  If we should I'll enable options INSECURE by default on ports
}  that require options MODULAR (to save kernel file size).
}
} autoload/autounload does NOT perform any authorization checks - please
} look at the code!  No checking of securelevel occurs, as far as I can

 I just did and autoload most certainly does do authorization
checks.

} see.  For autoload, the module name must not contain a '/', so if the
} module is being loaded from the file system it must be loaded from the
} blessed /stand/${ARCH}/${VERSION}/modules directory.  Including the
} INSECURE option will have no effect on autoloading of modules.
}
} Manual loading and unloading of modules does involve calls to
} kauth_authorize_system() which will check securelevel.

 sys/kern/kern_module.c:module_autoload() makes almost the exact
same call to kauth_authorize_system as does module_load().  The
difference is that the second last parm is (void *)(uintptr_t)1.  What
difference this makes is going to be buried in the bowels of kauth, and
I'm not going to dig through that at this moment.


I read it as:

a) module_listener_cb() checks the 2nd arg, if autoload, judge the auth
   as allow

b) secmodel_securelevel_system_cb() denies the auth if (securelevel 
   0) regardless of autoload

Kauth disallows the auth, because 0 listner(s) denied.



Ah, I see.  That definitely explains it.


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: kernel module loading vs securelevel

2010-10-16 Thread Paul Goyette

On Sat, 16 Oct 2010, David Holland wrote:


 And also make the blessed directory itself immutable?  :)

As I recall the semantics of immutable are such that this isn't
necessary to protect modules that are present at boot time (that is,
they can't be unlinked/renamed/etc.), and if there are autoloadable
modules whose names aren't present at boot time, they'll fail the
check.


I've already misread the code here once, but...

As far as I can tell, each time a module_autoload call is made, if the 
module is neither built-in nor passed in by the boot loader, the code 
will attempt to load it via a call to kobj_load_vfs() which has path as 
an argument.  It doesn't appear to me that there is any pre-approved 
list of acceptable objects that can be loaded from the file system.


BTW, does the immutable flag prevent one from using an immutable 
directory as the mount-point for some other file system?  Hmmm...




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: kernel module loading vs securelevel

2010-10-16 Thread Paul Goyette

On Sun, 17 Oct 2010, Izumi Tsutsui wrote:


}  I'm just asking if options INSECURE is mandaory to use autoloading,
}  not module/autoloading is secure/silly/boo or not.
}
} No.  As far as I can tell, there's a bug in the relevant kauth listener,
} at least in terms of the original intent of the author of the autoloading
} code; the system scope kauth listener should return DEFER, not DENY.

 module_listener_cb() was added to kern_module.c in revision 1.51
by elad.  The kauth_authorize_system() calls were added to
kern_module.c by ad, but the respective commit log messages doesn't say
anything about them, so the original intent of the author of the
autoloading code (ad) is unclear.


The following patch makes autoload works even on securelevel  0,
but I'm not sure if it's correct and acceptable.
If not, options INSECURE is the only way to enable it..


Based on the discussion regarding the numerous ways in which this could 
be abused, I would personally vote for requiring INSECURE.  At least 
that way, things are pretty clear.  If the proposed patch were used, 
then you would have only options MODULAR in the kernel config file, 
which is not at all clear about the security of the resulting kernel.





Index: secmodel/securelevel/secmodel_securelevel.c
===
RCS file: /cvsroot/src/sys/secmodel/securelevel/secmodel_securelevel.c,v
retrieving revision 1.20
diff -u -p -r1.20 secmodel_securelevel.c
--- secmodel/securelevel/secmodel_securelevel.c 7 Oct 2009 01:06:57 -   
1.20
+++ secmodel/securelevel/secmodel_securelevel.c 16 Oct 2010 22:15:11 -
@@ -254,7 +254,7 @@ secmodel_securelevel_system_cb(kauth_cre
break;

case KAUTH_SYSTEM_MODULE:
-   if (securelevel  0)
+   if ((uintptr_t)arg2 == 0  securelevel  0)
result = KAUTH_RESULT_DENY;
break;


---
Izumi Tsutsui

!DSPAM:4cba24f22438312397761!





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Getting rid of standard boiler plate in kernel man pages CODE REFERENCES

2010-11-08 Thread Paul Goyette

On Mon, 8 Nov 2010, David Holland wrote:


  Not from me! Although rather than /usr/src it might say the top level
  of the source tree.

 How about:
 [...]
 +Any paths are relative to the top level of the source tree (usually
 +.Pa /usr/src ) .

Sounds fine.

 Or perhaps historically instead of usually :)

Or often might work, too. No opinion on that :-)


typically or frequently are two additional possibilities!


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


ioctl(2) vs sys/ioctl.h

2010-12-18 Thread Paul Goyette
Is there some reason why there is a discrepancy in the definition of 
ioctl()?



From man page ioctl(2)


SYNOPSIS
 #include sys/ioctl.h

 int
 ioctl(int d, unsigned long request, void *argp);


Yet, from sys/ioctl.h we have

__BEGIN_DECLS
int ioctl(int, unsigned long, ...);
__END_DECLS


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: ioctl(2) vs sys/ioctl.h

2010-12-19 Thread Paul Goyette

On Sun, 19 Dec 2010, Christos Zoulas wrote:


In article pine.neb.4.64.1012181604050.27...@quicky.whooppee.com,
Paul Goyette  p...@whooppee.com wrote:

Is there some reason why there is a discrepancy in the definition of
ioctl()?

From man page ioctl(2)

SYNOPSIS
 #include sys/ioctl.h

 int
 ioctl(int d, unsigned long request, void *argp);


Yet, from sys/ioctl.h we have

__BEGIN_DECLS
int ioctl(int, unsigned long, ...);
__END_DECLS



Most of our ioctl's take pointer arguments. Some streams ioctls though
take int arguments (ioctl(fd, I_FLUSH, FLUSHR) for example) and using
void * as the argument would not compile cleanly. I think that we
should not have void * in the man page either.


Should the man page be updated to match reality?


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: ps_strings refactoring compat32 fix

2011-03-02 Thread Paul Goyette

On Wed, 2 Mar 2011, Joerg Sonnenberger wrote:


Hi all,
all programs get the argument and environment placed onto the initial
stack in a structure called ps_strings. NetBSD currently doesn't export
the correct version of this structure for 32bit binaries as in it
doesn't use the 32bit version of it. This breaks setproctitle(3) and
other users of it. The attached patch refactors the access in procfs and
kern.proc as well as making it honour 32bit. The Darwin code likely is
still broken, if someone wants to use that, it should be easy to adopt
the changes from copyin_procargs.

Also attached are two programs to test the consistency of the structure
and updating it (so that ps(1) can be used to ensure it works), but I
will leave it to someone with more interest in atf(7) to turn them into
full test cases.


If the patch gets committed, I'll take on the job of atf-ifying the 
tests.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: covering set of PCI kernels?

2011-03-31 Thread Paul Goyette
I would be surprised if there are not some drivers in i386/conf/ALL that 
are not in GENERIC...


On Thu, 31 Mar 2011, David Young wrote:


Today I am trying to answer the question, what is the smallest
set of kernel configurations to build, in order to compile all
machine-dependent PCI code and all PCI drivers at least once?

The answer I have come up with so far is, All GENERIC kernels for
architectures having a pci/ directory, that is, these GENERIC kernels:

% ls -d sys/arch/*/pci | sed 's/\/pci$//' | sed 's/$/\/conf\/GENERIC/' | \

xargs ls -d

ls: sys/arch/algor/conf/GENERIC: No such file or directory
ls: sys/arch/atari/conf/GENERIC: No such file or directory
ls: sys/arch/powerpc/conf/GENERIC: No such file or directory
ls: sys/arch/sgimips/conf/GENERIC: No such file or directory
ls: sys/arch/x86/conf/GENERIC: No such file or directory
sys/arch/alpha/conf/GENERIC sys/arch/ibmnws/conf/GENERIC
sys/arch/arc/conf/GENERIC   sys/arch/macppc/conf/GENERIC
sys/arch/bebox/conf/GENERIC sys/arch/mvmeppc/conf/GENERIC
sys/arch/cats/conf/GENERIC  sys/arch/netwinder/conf/GENERIC
sys/arch/cobalt/conf/GENERICsys/arch/ofppc/conf/GENERIC
sys/arch/i386/conf/GENERIC  sys/arch/prep/conf/GENERIC
sys/arch/ia64/conf/GENERIC  sys/arch/sandpoint/conf/GENERIC

Plus some subset of the kernel configuration files in

sys/arch/algor/conf/
sys/arch/atari/conf/
sys/arch/sgimips/conf/

(Which ones?)

And an unknown number of powerpc configurations: evbppc, rs6000, what
else?

Dave

--
David Young OJC Technologies
dyo...@ojctech.com  Urbana, IL * (217) 344-0444 x24

!DSPAM:4d94ed852444056616303!





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Interrupt issues with amd64 and an IXSystems 1U server (5.1)

2011-05-12 Thread Paul Goyette
I had a similar issue on a SuperMicro motherboard.  The attached patch 
is for -current but should not be too hard to adapt to 5.1




On Thu, 12 May 2011, Brian Buhrow wrote:


Hello.  I did try doing a diff of working and non-working kernels, but
the differences weren't obvious to me.  However, I think the problem is
that the network chips are interrupting on ioapic1, where as everything
else is interrupting on ioapic0.  It looks like even though ioapic1
interrupts are showing up in vmstat -i output, they're not actually causing
the drivers to get triggered.  It looks like this is a potentially known
issue, but I can't find any particular patches which address the problem.
Anyone else ever seen times when interrupts from ioapic1 or higher don't
get routed properly?
-thanks
-Brian
On May 12, 11:54pm, Manuel Bouyer wrote:
} Subject: Re: Interrupt issues with amd64 and an IXSystems 1U server (5.1)
} On Thu, May 12, 2011 at 12:12:54PM -0700, Brian Buhrow wrote:
}   Hello.  I've got anew system I'm building where I have interrupt
}  issues with with the GENERIC kernel. This is an amd64 system with
}  NetBSD-5.1 with sources as of about 3 weeks ago.
}   The INSTALL kernel works fine, but when I boot the GENERIC kernel, the
}  ethernet doesn't work.  I get a bunch of device timeout messages.  Both the
}  INSTALL and GENERIC kernels are  built from the same 5.1 sources.
}   Anyone have ideas on why the INSTALL kernel works fine, but the
}  GENERIC kernel does not?  They look to be the same to me in so far as  the
}  INSTALL config includes the GENERIC config.  I rebuilt the GENERIC kernel
}  to turn off MTRR  support, as the INSTALL kernel does, but that doesn't
}  help, though it does remove the FIXME message about there being too many
}  MTRR devices on the system.
}
} Did you try to diff the dmesg of working and non-working kernels ?
}
} --
} Manuel Bouyer bou...@antioche.eu.org
}  NetBSD: 26 ans d'experience feront toujours la difference
} --

-- End of excerpt from Manuel Bouyer




!DSPAM:4dcc62302402595018236!





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: src/sys/arch/x86/x86/mpacpi.c
===
RCS file: /cvsroot/src/sys/arch/x86/x86/mpacpi.c,v
retrieving revision 1.91
diff -u -p -r1.91 mpacpi.c
--- src/sys/arch/x86/x86/mpacpi.c   5 Apr 2011 13:17:04 -   1.91
+++ src/sys/arch/x86/x86/mpacpi.c   12 May 2011 23:04:06 -
@@ -625,7 +625,9 @@ mpacpi_derive_bus(ACPI_HANDLE handle, st
if (ACPI_FAILURE(rv))
goto out;
 
-   if (acpi_match_hid(devinfo, pciroot_hid)) {
+   if (acpi_match_hid(devinfo, pciroot_hid) 
+   ((devinfo-Valid  ACPI_VALID_STA) == 0 ||
+   (devinfo-CurrentStatus  ACPI_STA_OK) == ACPI_STA_OK)) {
rv = mpacpi_get_bbn(acpi, parent, bus);
if (ACPI_FAILURE(rv))
bus = 0;


Build and include hdaudio hdafg as modules

2011-05-28 Thread Paul Goyette
Now that Jared has modularized the hdaudio and hdafg drivers, is there 
any reason that we should not be building the modules by default?


And is there any reason why they should not be removed from i386/GENERIC 
and amd64/GENERIC (and, of course, added to i386/MONOLITHIC)?


Patch attached.

-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: distrib/sets/lists/modules/mi
===
RCS file: /cvsroot/src/distrib/sets/lists/modules/mi,v
retrieving revision 1.21
diff -u -p -r1.21 mi
--- distrib/sets/lists/modules/mi   26 Feb 2011 18:07:15 -  1.21
+++ distrib/sets/lists/modules/mi   28 May 2011 13:18:18 -
@@ -52,6 +52,10 @@
 ./@MODULEDIR@/flash/flash.kmod base-kernel-modules kmod
 ./@MODULEDIR@/fss  base-kernel-modules kmod
 ./@MODULEDIR@/fss/fss.kmod base-kernel-modules kmod
+./@MODULEDIR@/hdafgbase-kernel-modules kmod
+./@MODULEDIR@/hdafg/hdafg.kmod base-kernel-modules kmod
+./@MODULEDIR@/hdaudio  base-kernel-modules kmod
+./@MODULEDIR@/hdaudio/hdaudio.kmod base-kernel-modules kmod
 ./@MODULEDIR@/hfs  base-kernel-modules kmod
 ./@MODULEDIR@/hfs/hfs.kmod base-kernel-modules kmod
 ./@MODULEDIR@/kernfs   base-kernel-modules kmod
Index: sys/arch/amd64/conf/GENERIC
===
RCS file: /cvsroot/src/sys/arch/amd64/conf/GENERIC,v
retrieving revision 1.318
diff -u -p -r1.318 GENERIC
--- sys/arch/amd64/conf/GENERIC 1 Apr 2011 12:11:17 -   1.318
+++ sys/arch/amd64/conf/GENERIC 28 May 2011 13:18:30 -
@@ -992,8 +992,8 @@ opl*at fms?
 opl*   at sv?
 
 # High Definition Audio
-hdaudio*   at pci? dev ? function ?# High Definition Audio
-hdafg* at hdaudiobus?
+#hdaudio*  at pci? dev ? function ?# High Definition Audio
+#hdafg*at hdaudiobus?
 
 # Audio support
 audio* at audiobus?
Index: sys/arch/i386/conf/GENERIC
===
RCS file: /cvsroot/src/sys/arch/i386/conf/GENERIC,v
retrieving revision 1.1032
diff -u -p -r1.1032 GENERIC
--- sys/arch/i386/conf/GENERIC  28 May 2011 13:01:49 -  1.1032
+++ sys/arch/i386/conf/GENERIC  28 May 2011 13:18:38 -
@@ -1369,8 +1369,8 @@ opl*  at yds?
 opl*   at ym?
 
 # High Definition Audio
-hdaudio*   at pci? dev ? function ?# High Definition Audio
-hdafg* at hdaudiobus?
+#hdaudio*  at pci? dev ? function ?# High Definition Audio
+#hdafg*at hdaudiobus?
 
 # Audio support
 audio* at audiobus?
Index: sys/arch/i386/conf/MONOLITHIC
===
RCS file: /cvsroot/src/sys/arch/i386/conf/MONOLITHIC,v
retrieving revision 1.15
diff -u -p -r1.15 MONOLITHIC
--- sys/arch/i386/conf/MONOLITHIC   6 Mar 2011 17:08:26 -   1.15
+++ sys/arch/i386/conf/MONOLITHIC   28 May 2011 13:18:45 -
@@ -66,3 +66,7 @@ pseudo-device putter  # for puffs and pu
 pseudo-device  pad # pseudo audio device driver
 
 pseudo-device  dm  # device-mapper device driver
+
+# High Definition Audio
+hdaudio*   at pci? dev ? function ?# High Definition Audio  
+hdafg* at hdaudiobus?
Index: sys/modules/Makefile
===
RCS file: /cvsroot/src/sys/modules/Makefile,v
retrieving revision 1.70
diff -u -p -r1.70 Makefile
--- sys/modules/Makefile14 Apr 2011 15:45:27 -  1.70
+++ sys/modules/Makefile28 May 2011 13:18:50 -
@@ -23,6 +23,8 @@ SUBDIR+=  ffs
 SUBDIR+=   filecore
 SUBDIR+=   flash
 SUBDIR+=   fss
+SUBDIR+=   hdafg
+SUBDIR+=   hdaudio
 SUBDIR+=   hfs
 SUBDIR+=   kernfs
 SUBDIR+=   ksem


Strange behaviour

2011-05-29 Thread Paul Goyette
The system: dual-12-core AMD Opteron-6172 (total 24 cores at 3.2GHz), 
and 32GB RAM.  (The motherboard is a ASUS KGPE-D16).


While running a 'build.sh -j48 release' I notice that there seems to be 
a rather excessive amount of System Time.  Running systat(1) I can see 
long periods of 60% to 70% (or even more), while at no time do I ever 
get more than 15% to 20% of user time.  The only thing that seems to be 
running (according to vmstat) at many thousands of Global TLB IPI 
interrupts, although Interrupt time is mostly invisible.  (It never 
shows up more than 1% or 2% on systat, and a xosview never shows any 
interrupt time at all.)  Other than the 5K to 10K (with occassional 
bursts of 18K) TLB IPI, the only other interrupt activity is a few 
hundred disk interrupts, and maybe a few dozen network interrupts.


Is this expected behavior?


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Strange behaviour

2011-05-29 Thread Paul Goyette
BTW, it was definitely a typo in the original - my system is 24 cores at 
2.1GHz, not 3.2GHz.


On Sun, 29 May 2011, Thor Lancelot Simon wrote:


On Sun, May 29, 2011 at 06:24:06PM -0700, Paul Goyette wrote:

The system: dual-12-core AMD Opteron-6172 (total 24 cores at
3.2GHz), and 32GB RAM.  (The motherboard is a ASUS KGPE-D16).

While running a 'build.sh -j48 release' I notice that there seems to
be a rather excessive amount of System Time.  Running systat(1) I

[...]

Is this expected behavior?


Yes.  We do a little better with 16 build jobs than 12 and considerably
worse with 20 than 16.

On my 24-core 2Ghz Opteron, the only thing I've tried that really can
do an effective job building NetBSD with more than 12 CPUs in use is
Linux.  I should probably try Solaris.


Despite the level of System time, I was able to run a complete 
build.sh -j48 release in less than 45 minutes.  And afterwards, I was 
able to build a complete new kernel in 44 seconds.  (I guess most all of 
the source directory was cached somewhere in that 32GB of RAM!)



Linux under Xen does even worse than NetBSD, probably because Xen has
no way to express to Linux the system's memory architecture.  Xen has
NUMA support in the sense that if you have, say, 6 cores assigned
to a virtual and those 6 can be mapped onto physical cores on the same
NUMA node, thus providing the virtual a uniform memory hierarchy, it
will do so; but if you have a 24-core machine built from two pairs
of dual-6-core chips, like the machines you and I have, it really doesn't
do a good job telling the guest what it needs to schedule effectively.






-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: mount -o extattr (extended attributes)

2011-06-16 Thread Paul Goyette
It looks to me like we might also need a rather extensive set of atf 
regression tests to make sure that future maintenance doesn't break 
anything...  :)


On Thu, 16 Jun 2011, Thor Lancelot Simon wrote:


On Thu, Jun 16, 2011 at 03:47:27PM +, Emmanuel Dreyfus wrote:

Hello

Here is a patch that adds mount -o extattr for auto-enabling UFS1
extended attributes.
http://ftp.espci.fr/shadow/manu/mount_extattr.patch

Any comment on it before I commit?

The next step is to add UFS_EXTATTR to GENERIC, XEN3_DOM0 and XEN3_DOMU
kernels (and others? which ones?). If there a smart way, or should I
do it for all ports?


The more I think about this code -- not yours, but the original
implementation -- the less comfortable I am with it.  It seems to be
something of a conceptual mess: it keeps metadata in plain files and
treats the metadata writes like file data writes, going around our entire
mechanism for preserving metadata integrity.

As you've observed, this poses an increased risk of filesystem corruption,
and this code is not sufficiently robust to handle such corruption on
system startup without crashing the kernel.

And there are locking issues with magically creating file Y when an
operation is done on file X, some of which you've found/fixed, but I
bet there are more.

Given the way this code works, creating a shadow tree of files
elsewhere in the filesystem namespace, it seems like it is almost the
canonical example of something that should be a layered filesystem
rather than internally implemented in FFS.   Then it would be possible,
for example, to mount the upper layer sync so that attribute writes
became synchronous and there were much less risk of corruption on crash.

Since extended attributes are often used to store data that are used for
authentication purposes, it seems to me there will potentially be a
dramatic change in system behavior when they are turned on as the system
goes to multiuser mode -- far from ideal though I suppose acceptable if
this is very well documented and the usual security level restrictions
are applied (can't remount the filesystem without -o extattr if
security level  0, etc).

I think this code is a very good example of why disabled-by-default
options are a really bad idea in our kernel.  If this code had not spent
a decade bit-rotting, I believe its serious problems would have been much
more obvious and by now it would have likely been removed from the tree.

I think that -- at least -- fsck needs to learn to put the extended
attribute database into a state in which it cannot crash the kernel,
before this code should be turned on by default.  However, given that
the extended attributes are treated like file data, that poses a risk
of truncation, which could have serious security consequences.  So,
I think that before this code is turned on by default, it really should
to:

1) Be modified to perform all extattr writes synchronously,
   arranging to flush the disk cache after each.  Alternately
   it might be possible to use WAPBL to preserve the ordering
   of the extattr and other metadata writes -- but have we ever
   used WAPBL for file-data-block writes like this?

   Note that there are tricky corner cases when attributes hold
   authorization data.  For example, what if an application
   changes the extended attributes, then writes sensitive data
   to a file, thinking it is protected?  It is necessary to order
   the extattr write and the subsequent data write properly or
   this is not safe.

2) Be able to repair damaged attribute databases, stopping and
   demanding manual intervention (manual fsck?) when it detects
   corruption that might have security or stability consequences.

3) In an ideal world, be reimplemented as a layer rather than a
   bunch of code inside FFS.

Thor

!DSPAM:4dfa379f2622059781524!





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Multiple device attachments

2011-07-21 Thread Paul Goyette

On Thu, 21 Jul 2011, Michael wrote:


The examples you site seem to indicate that for example the le device may 
attach to many
alternative devices (e.g. pci, tc, ?), but only one attachment is made when 
autoconf is complete. I may have
read the code examples incorrectly -- please pardon me if I did; but what I 
want to know is --  can a
device have multiple attachments (more than one parent device) when autoconf is 
complete.


What do you mean - several instances of the same driver with different bus 
backends in the same kernel? Sure, there is absolutely nothing to prevent it.


Yes, there can be multiple instances of a driver, with each instance 
attached to a different attribute.


However, each single instance can be attached to only one parent.


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: scsictl equivalent for SATA

2011-08-01 Thread Paul Goyette
drvctl should work for you - this was added in the last couple of 
months.


On Mon, 1 Aug 2011, Manuel Bouyer wrote:


On Mon, Aug 01, 2011 at 12:43:03PM +0200, Edgar Fuß wrote:

I've got a hot-pluggable SATA drive in a RAID1 that failed.
I've never been into this with SATA, only with SCA.
What do I do after physically replacing the drive to make the new one known to 
the kernel?
I do know how to re-build the RAID, but what's the analogous to scsictl 
detach/scan?


There is none at this time (unless something changed recently I didn't
notice). But if you plugged the new drive in the same slot as the old
one, you should be able to use it without extra steps.

--
Manuel Bouyer bou...@antioche.eu.org
NetBSD: 26 ans d'experience feront toujours la difference
--

!DSPAM:4e3696682351153699719!





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

Where are the specific WARNS=n defined?

2011-08-22 Thread Paul Goyette
I'm trying to modularize a couple of drivers, and one of them is 
generating some gcc errors due to comparison of signed and unsigned 
values.


The driver module is currently being compiled with WARNS=4 (just picked 
that up from another Makefile).  Is there a more appropriate WARNS=n to 
use to permit the comparison?  Or should I simply add


CCOPTS+= -Wno-sign-compare

(which is used when the driver is being compiled as part of a kernel)?



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Where are the specific WARNS=n defined?

2011-08-23 Thread Paul Goyette

On Tue, 23 Aug 2011, Christos Zoulas wrote:


I'm trying to modularize a couple of drivers, and one of them is
generating some gcc errors due to comparison of signed and unsigned
values.

The driver module is currently being compiled with WARNS=4 (just picked
that up from another Makefile).  Is there a more appropriate WARNS=n to
use to permit the comparison?  Or should I simply add

CCOPTS+= -Wno-sign-compare

(which is used when the driver is being compiled as part of a kernel)?


It is best to fix the errors.


Yeah, OK!   :)

Fixing, will commit shortly.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


wapbl module

2011-09-22 Thread Paul Goyette
It would appear that wapbl is only relevant for ffs file systems (and in 
particular, only for ffs filesystems with a V2 superblock format).


Yet the current modularization of wapbl is not dependant on the ffs 
module.  (wapbl's required-list is empty.)


I realize that even though wapbl registers itself as a module, it is not 
built as a loadable module - ie, it must be built-in.  I intend to try 
to change this, so I would like to know if wapbl is ever intended for 
non-ffs file systems.


FWIW, I also intend to have the ffs code auto-load wapbl whenever it is 
needed.


Thanks in advance for any comments.


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: wapbl module

2011-09-22 Thread Paul Goyette

On Thu, 22 Sep 2011, Joerg Sonnenberger wrote:


On Thu, Sep 22, 2011 at 08:24:48AM -0700, Paul Goyette wrote:

I realize that even though wapbl registers itself as a module, it is
not built as a loadable module - ie, it must be built-in.  I
intend to try to change this, so I would like to know if wapbl is
ever intended for non-ffs file systems.


The WAPBL option changes the buffer subsystem, that's why it can't
easily be modularized.


Ah, OK.  Then I guess I'll not dive any further into this!

Thanks.


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


prop_dictionary_iterator

2011-10-05 Thread Paul Goyette

In sys/dev/sysmon/sysmon_power.c we have

...
while ((obj = prop_object_iterator_next(iter)) != NULL) {
prop_dictionary_remove(ped-dict,
prop_dictionary_keysym_cstring_nocopy(obj));
prop_object_iterator_reset(iter);
}
...

The man-page for prop_dictionary_iterator says in part

... Storing to or removing from the dictionary invalidates
any active iterators for the dictionary.


Doesn't the code-snippet above do exactly what the man-page says should 
not be done?



I've got a machine that lately has decided to panic in the middle of the 
night.  I finally got a traceback which casts great suspicion on the 
above code.  (Unfortunately I managed to reset the machine before I was
able to copy the whole traceback;  I have now attached a PS2 keyboard, 
anticipating the next occurrence.)




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Return status ENXIO / ESRCH in kern_drvctl.c

2012-01-16 Thread Paul Goyette
While browsing the code for other stuff, I ran into what appears to be 
an inconsistency in drvctl(4).


In most cases, when the device specified cannot be found, we return 
ENXIO - Device not configured.


But in routine drvctl_command_get_properties(), if the device is not 
found, we return ESRCH - No such process.  (This is in file 
sys/kern/kern_drvctl.c rev 1.32 at line 462)


It seems to me that this is incorrect, and we should return ENXIO here.


As a side note, it would be nice if we had a drvctl(4) man page.   :)



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


rw_lock vs mutex

2012-02-02 Thread Paul Goyette
While digging around looking into another problem, I noticed that the 
piixpm(4) driver uses an rw_lock for its ic_acquire_bus/ic_release_bus 
routines.  ic_acquire_bus() uses rw_enter(..., RW_WRITER) and there 
doesn't appear to be any use anywhere of RW_READER for that lock.


The man page for rw_lock implies that it is a superset of a mutex.  So 
I'm wondering if it makes any sense to use the simpler mutex instead?



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


RE: rw_lock vs mutex

2012-02-02 Thread Paul Goyette

On Thu, 2 Feb 2012, paul_kon...@dell.com wrote:

A rw_lock allows multiple readers, correct?  If there's a non-trivial 
probability of concurrent reads that would make a difference.  If not, 
then a mutex would be just as good especially if that is lower 
overhead.


The rwlock in question is contained with the driver's softc.  The only 
exported accessors are i2c_bus_acquire() (which grabs a RW_WRITER lock) 
and i2c_bus_release().


A mutex makes much more sense.


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Enable KMEM_GUARD without all the rest of DEBUG ?

2012-03-08 Thread Paul Goyette
On another thread discussing random kernel crashes (likely due to memory 
corruption) [1], it was suggested that KMEM_GUARD might be a useful tool 
for debugging...


By adding KMEM_GUARD itself to a kernel config file, I discovered that 
KMEM_GUARD cannot be enabled individually - it is currently required as 
part of a set of tools that are collectively enabled by 'options DEBUG'


/build/netbsd-test/src/sys/kern/subr_kmem.c: In function 'kmem_init': 
/build/netbsd-test/src/sys/kern/subr_kmem.c:344:20: error: 'kmem_guard' undeclared (first use in this function) 
/build/netbsd-test/src/sys/kern/subr_kmem.c:344:33: error: 'kmem_guard_depth' undeclared (first use in this function) 
/build/netbsd-test/src/sys/kern/subr_kmem.c:344:52: error: 'kmem_guard_size' undeclared (first use in this function)


The following patches allow building of a kernel with only KMEM_GUARD. 
The first patch makes the variable allocations depend on the KMEM_GUARD 
whether or not DEBUG was specified, and the second patch includes the 
necessary uvm_kmguard.c when either option is specified.



Index: subr_kmem.c
===
RCS file: /cvsroot/src/sys/kern/subr_kmem.c,v
retrieving revision 1.42
diff -u -p -r1.42 subr_kmem.c
--- sys/kern/subr_kmem.c 5 Feb 2012 03:40:08 -   1.42
+++ sys/kern/subr_kmem.c 8 Mar 2012 01:53:32 -
@@ -122,9 +122,6 @@ static pool_cache_t kmem_cache[KMEM_CACH
 static size_t kmem_cache_maxidx __read_mostly;

 #if defined(DEBUG)
-int kmem_guard_depth = 0;
-size_t kmem_guard_size;
-static struct uvm_kmguard kmem_guard;
 static void *kmem_freecheck;
 #defineKMEM_POISON
 #defineKMEM_REDZONE
@@ -132,6 +129,12 @@ static void *kmem_freecheck;
 #defineKMEM_GUARD
 #endif /* defined(DEBUG) */

+#if defined (KMEM_GUARD)
+int kmem_guard_depth = 0;
+size_t kmem_guard_size;
+static struct uvm_kmguard kmem_guard;
+#endif
+
 #if defined(KMEM_POISON)
 static int kmem_poison_ctor(void *, void *, int);
 static void kmem_poison_fill(void *, size_t);
Index: uvm/files.uvm
===
RCS file: /cvsroot/src/sys/uvm/files.uvm,v
retrieving revision 1.20
diff -u -p -r1.20 files.uvm
--- sys/uvm/files.uvm   17 May 2011 05:32:31 -  1.20
+++ sys/uvm/files.uvm   8 Mar 2012 02:18:55 -
@@ -25,7 +25,7 @@ file  uvm/uvm_glue.c
 file   uvm/uvm_init.c
 file   uvm/uvm_io.c
 file   uvm/uvm_km.c
-file   uvm/uvm_kmguard.c   debug
+file   uvm/uvm_kmguard.c   debug | kmem_guard
 file   uvm/uvm_loan.c
 file   uvm/uvm_map.c
 file   uvm/uvm_meter.c


Does anyone have any objections to committing these fixes?


[1] http://mail-index.netbsd.org/current-users/2012/03/07/msg019419.html


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Adding SMBus block transfers to iic(4)

2012-03-11 Thread Paul Goyette
There are apparently some devices out there that really need to have i2c 
block-mode transfers, and it appears that none of our current drivers 
(at least, none of the mainstream pci drivers) supports this.


At first glance, it might appear that we could simply check the transfer
requested byte-count, and automatically operate in block mode if the 
count is greater than 2;  the implication is that we could fall back to 
byte or word transfers for smaller byte counts.  But it looks like at 
least one device out there [1] doesn't support this.


So I propose adding a new flag bit that can be passed to iic_exec() to 
request block-transfer protocols.  In src/sys/dev/i2c/i2c_io.h


#define I2C_F_BLOCK 0x20

Comments?  Suggestions?  Alternatives?



[1] http://eeepc-linux.googlecode.com/files/9LPR426A.pdf

-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Adding SMBus block transfers to iic(4)

2012-03-11 Thread Paul Goyette

Good point.

On Sun, 11 Mar 2012, Mouse wrote:


[...] i2c block-mode transfers [...]



#define I2C_F_BLOCK 0x20



Comments?  Suggestions?  Alternatives?


BLOCK has a second, quite different, meaning (as in, blocking I/O).
It may not apply here, but defining a bit in the interface that can be
misunderstood as indicating it does could, at the very least, be
confusing.

Might I suggest BLKMODE instead of BLOCK?  At least to my eye, that's a
lot less ambiguous.

/~\ The ASCII Mouse
\ / Ribbon Campaign
X  Against HTML mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

!DSPAM:4f5d315b1981260811860!





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Adding SMBus block transfers to iic(4)

2012-03-13 Thread Paul Goyette

Just an update here.

In order to enable requesting block-mode transfers using the 
I2C_IOCTL_EXEC method from userland, I'm now proposing to add two new 
high-level operations to the i2c_op_t enum, along with a new macro 
predicate:


typedef enum {
I2C_OP_READ_WITH_STOP   = 1,
I2C_OP_WRITE= 2,
I2C_OP_WRITE_WITH_STOP  = 3,
+   I2C_OP_READ_BLOCK   = 5,/* All block operations imply STOP */
+   I2C_OP_WRITE_BLOCK  = 7
 } i2c_op_t;

 #defineI2C_OP_READ_P(x)(((int)(x)  2) == 0)
 #defineI2C_OP_WRITE_P(x)   (! I2C_OP_READ_P(x))
 #defineI2C_OP_STOP_P(x)(((int)(x)  1) != 0)
+#defineI2C_OP_BLKMODE_P(x) (((int)(x)  4) != 0)



On Sun, 11 Mar 2012, Paul Goyette wrote:

There are apparently some devices out there that really need to have i2c 
block-mode transfers, and it appears that none of our current drivers (at 
least, none of the mainstream pci drivers) supports this.


At first glance, it might appear that we could simply check the transfer
requested byte-count, and automatically operate in block mode if the count is 
greater than 2;  the implication is that we could fall back to byte or word 
transfers for smaller byte counts.  But it looks like at least one device out 
there [1] doesn't support this.


So I propose adding a new flag bit that can be passed to iic_exec() to 
request block-transfer protocols.  In src/sys/dev/i2c/i2c_io.h


#define I2C_F_BLOCK 0x20

Comments?  Suggestions?  Alternatives?



[1] http://eeepc-linux.googlecode.com/files/9LPR426A.pdf

-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

!DSPAM:4f5d2d7a1985496917264!





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: CVS commit: src

2012-03-14 Thread Paul Goyette

On Mar 14, 2012, at 10:55 AM, Martin Husemann wrote:


On Tue, Mar 13, 2012 at 06:41:18PM +, Elad Efrat wrote:

Log Message:
Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

   http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
   http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
   http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


This seems to cause deadlocks in the *fs_rename_dir tests.
See recent (new) test failures both on i386 and amd64 in fs/vfs/t_vnops.


Hmmm, they didn't fail when I tested Elad's changes.  Either that, or I 
missed something in the test output.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: CVS commit: src/tests/modules

2012-03-21 Thread Paul Goyette

looks good to me...

On Wed, 21 Mar 2012, Martin Husemann wrote:


On Wed, Mar 21, 2012 at 12:25:25PM +, Martin Husemann wrote:


This (untested) would add one (sysctl kern.module.modular would say 0 for
non-modular kernels, or 1 otherwise).


I had to fix other parts of the kernel, but the patch now works as expected.

Any objections?

Martin

!DSPAM:4f69f0a61987340033397!





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


USB 3.0 ?

2012-03-29 Thread Paul Goyette
Just curious - is anyone out there working on a USB 3.0 driver for 
NetBSD?



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: introduce device_is_attached()

2012-04-16 Thread Paul Goyette
Along these same lines, it might be nice if there were a way to 
determine if _any_ instances of a device were attached.  This is 
particularly useful for device-driver modules, which should prevent 
themselves from being unloaded if there are active instances.


A quick check of some device-driver modules show that they simply call 
config_fini_component() which updates autoconfig tables, but doesn't 
seem to check if a driver instance exists...





On Mon, 16 Apr 2012, Christoph Egger wrote:



Hi,

I want to introduce a new function to sys/devices.h:

bool device_is_attached(device_t parent, cfdata_t cf);

The purpose is for bus drivers who wants to attach children
and ensure that only one instance of it will attach.

'parent' is the bus driver and 'cf' is the child device
as passed to the submatch callback via config_search_loc().

The return value is true if the child is already attached.

I implemented a reference usage of it in amdnb_misc.c to ensure
that amdtemp only attaches once on rescan.

Any comments to the patch?

Christoph


!DSPAM:4f8c4e7e1986566716057!



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Module name - recommendations/opinions sought

2012-04-25 Thread Paul Goyette
I'm in the process of modularizing the ieee80211 (Wireless LAN) code, 
and would like some feedback on what the module's name should be.  I can 
think of at least three or four likely candidates:


net80211
ieee80211
wlan
wireless-lan

I'm leaning towards the simplest one - wlan - but would like to know 
what others think before I commit.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


link_set info needed

2012-04-25 Thread Paul Goyette
I've been working on modularizing the ieee80211 code, and I've 
discovered that the code using a link_set to collect various one-time 
initialization routines.  There is a RUN_ONCE() invocation that uses 
__link_set_for_each() to iterate over these init routines.


This all works just fine when including net80211 in a monolithic kernel, 
but for some reason, when building as a module, the link_set sections 
don't seem to get created (they're not listed by objdump -h).


__link_set_for_each tries to loop through the items in 
{start,stop}_link_set_set-name - these items are contributed by 
other source files.  But the symbols that should identify the start and 
stop of the link set are undefined.


Any suggestions on an appropriate alternate to link sets to collect the 
contributions?



If anyone wants to look at the actual code, it is in 
src/sys/net80211/ieee80211_netbsd.[ch]



Thanks!



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Module name - recommendations/opinions sought

2012-04-25 Thread Paul Goyette
Yes, wlan was my first choice, cimply because that is the name of the 
config attribute.


But it seems a consensus is emerging that the module name should be 
net80211 so that's what I'm going to do (assuming I figure out the 
link_set issue).


I'll leave the rototilling of the config files for another day.



On Thu, 26 Apr 2012, Masao Uebayashi wrote:


I thought module names alway match what's define'ed in config(1) files...

... and now I've found it's wlan, which should be changed to a saner
name like net80211.

On Thu, Apr 26, 2012 at 9:52 AM, Paul Goyette p...@whooppee.com wrote:

I'm in the process of modularizing the ieee80211 (Wireless LAN) code, and
would like some feedback on what the module's name should be.  I can think
of at least three or four likely candidates:

       net80211
       ieee80211
       wlan
       wireless-lan

I'm leaning towards the simplest one - wlan - but would like to know what
others think before I commit.



-
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:       |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com    |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |                          | pgoyette at netbsd.org  |
-


!DSPAM:4f98c4322402014054294!





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

Modularizing net80211 (was: link_set info needed)

2012-04-27 Thread Paul Goyette

Folks,

I would like to commit the attached patches to modularize the net80211 
code.  The changes include:


1. Protecting the inclusion of opt_inet.h with #ifdef _KERNEL
2. Removing CTLFLAG_PERMANENT from all of the sysctl(8) variables
3. Replacing the use of link_set with direct calls to all of the
   crypto/auth initializers.

At a future time, I propose to split the crypto/auth stuff out into 
their own modules, but that will require a bit more planning and a lot 
more testing.  :)


I have tested the attached changes on my home system, using a modular 
if_rum(4) driver (which I plan to commit separately).



Comments/feedback welcomed...



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: ieee80211.c
===
RCS file: /cvsroot/src/sys/net80211/ieee80211.c,v
retrieving revision 1.53
diff -u -p -r1.53 ieee80211.c
--- ieee80211.c 5 Apr 2010 07:22:24 -   1.53
+++ ieee80211.c 27 Apr 2012 12:34:49 -
@@ -43,7 +43,9 @@ __KERNEL_RCSID(0, $NetBSD: ieee80211.c,
  * IEEE 802.11 generic handler
  */
 
+#if defined(_KERNEL_OPT)
 #include opt_inet.h
+#endif
 
 #include sys/param.h
 #include sys/systm.h 
Index: ieee80211_crypto.c
===
RCS file: /cvsroot/src/sys/net80211/ieee80211_crypto.c,v
retrieving revision 1.15
diff -u -p -r1.15 ieee80211_crypto.c
--- ieee80211_crypto.c  23 May 2011 15:37:36 -  1.15
+++ ieee80211_crypto.c  27 Apr 2012 12:34:49 -
@@ -39,7 +39,9 @@ __FBSDID($FreeBSD: src/sys/net80211/iee
 __KERNEL_RCSID(0, $NetBSD: ieee80211_crypto.c,v 1.15 2011/05/23 15:37:36 
drochner Exp $);
 #endif
 
+#if defined(_KERNEL_OPT)
 #include opt_inet.h
+#endif
 
 /*
  * IEEE 802.11 generic crypto support.
Index: ieee80211_input.c
===
RCS file: /cvsroot/src/sys/net80211/ieee80211_input.c,v
retrieving revision 1.72
diff -u -p -r1.72 ieee80211_input.c
--- ieee80211_input.c   31 Dec 2011 20:41:58 -  1.72
+++ ieee80211_input.c   27 Apr 2012 12:34:49 -
@@ -39,7 +39,9 @@ __FBSDID($FreeBSD: src/sys/net80211/iee
 __KERNEL_RCSID(0, $NetBSD: ieee80211_input.c,v 1.72 2011/12/31 20:41:58 
christos Exp $);
 #endif
 
+#if defined(_KERNEL_OPT)
 #include opt_inet.h
+#endif
 
 #ifdef __NetBSD__
 #endif /* __NetBSD__ */
Index: ieee80211_ioctl.c
===
RCS file: /cvsroot/src/sys/net80211/ieee80211_ioctl.c,v
retrieving revision 1.57
diff -u -p -r1.57 ieee80211_ioctl.c
--- ieee80211_ioctl.c   31 Dec 2011 20:41:58 -  1.57
+++ ieee80211_ioctl.c   27 Apr 2012 12:34:50 -
@@ -43,8 +43,10 @@ __KERNEL_RCSID(0, $NetBSD: ieee80211_io
  * IEEE 802.11 ioctl support (FreeBSD-specific)
  */
 
+#if defined(_KERNEL_OPT)
 #include opt_inet.h
 #include opt_compat_netbsd.h
+#endif
 
 #include sys/endian.h
 #include sys/param.h
Index: ieee80211_netbsd.c
===
RCS file: /cvsroot/src/sys/net80211/ieee80211_netbsd.c,v
retrieving revision 1.20
diff -u -p -r1.20 ieee80211_netbsd.c
--- ieee80211_netbsd.c  19 Nov 2011 22:51:25 -  1.20
+++ ieee80211_netbsd.c  27 Apr 2012 12:34:50 -
@@ -40,6 +40,7 @@ __KERNEL_RCSID(0, $NetBSD: ieee80211_ne
 #include sys/kernel.h
 #include sys/systm.h 
 #include sys/mbuf.h   
+#include sys/module.h
 #include sys/proc.h
 #include sys/sysctl.h
 #include sys/once.h
@@ -72,6 +73,8 @@ static int ieee80211_sysctl_node(SYSCTLF
 intieee80211_debug = 0;
 #endif
 
+#ifdef NOTYET /* Currently we can't use link sets in modules */
+
 typedef void (*ieee80211_setup_func)(void);
 
 __link_set_decl(ieee80211_funcs, ieee80211_setup_func);
@@ -81,7 +84,7 @@ ieee80211_init0(void)
 {
ieee80211_setup_func * const *ieee80211_setup, f;
 
-__link_set_foreach(ieee80211_setup, ieee80211_funcs) {
+   __link_set_foreach(ieee80211_setup, ieee80211_funcs) {
f = (void*)*ieee80211_setup;
(*f)();
}
@@ -96,6 +99,29 @@ ieee80211_init(void)
 
RUN_ONCE(ieee80211_init_once, ieee80211_init0);
 }
+#else
+/*
+ * One-time initialization
+ *
+ * XXX Make sure you update the list of CYPTO initializers if new
+ * XXX mechanisms are added!
+ */
+
+void ccmp_register(void);
+void tkip_register(void);
+void wep_register(void);
+void ieee80211_external_auth_setup(void);
+
+void
+ieee80211_init(void)
+{
+
+   ccmp_register();
+   tkip_register();
+   wep_register();
+   ieee80211_external_auth_setup();
+}
+#endif /* NOT_YET */
 
 static

Re: Modularizing net80211 (was: link_set info needed)

2012-04-27 Thread Paul Goyette

On Fri, 27 Apr 2012, Joerg Sonnenberger wrote:


On Fri, Apr 27, 2012 at 05:51:30AM -0700, Paul Goyette wrote:

3. Replacing the use of link_set with direct calls to all of the
   crypto/auth initializers.


I believe this to a noticable regression due to bugs in the module
framework. It should support either link sets or _init sections.
If the latter is supposed to be the way forward, current linkset usage
should be adjusted and an appropiate place in the kernel for calling all
static initializers be found.


Thanks for the feedback.  I am not familiar with the _init sections 
stuff - can you provide a reference or example?



Please also note that separating the crypto/auth stuff into their own 
modules later on will remove these direct calls to the initializers.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


  1   2   3   4   5   6   >