Re: Booting NetBSD 8 install image on PC Engines apu2d4 via serial console goes blank

2019-01-01 Thread Don NetBSD

On 12/31/2018 7:35 AM, Mike Pumford wrote:

On 30/12/2018 15:59, J. Lewis Muir wrote:

I'm trying to install NetBSD 8 on a PC Engines apu2d4
   https://www.pcengines.ch/apu2d4.htm
via a USB thumb drive with a NetBSD 8 install image
   https://cdn.netbsd.org/pub/NetBSD/NetBSD-8.0/images/NetBSD-8.0-amd64-install.img.gz


Yes, but you need to reconfigure the bootloader (which can be done on an
existing NetBSD system) so that the kernel uses the serial console.
The other alternative (assuming you are able to do keyboard input) is to set
the console to com0 using the interactive bootloader.


For installboot, assuming the USB install image is detected as sd0 on another
NetBSD system, you can do:


installboot -e -o console=com0 /dev/rsd0a

You should be able to do the same thing when installing as well. One thing
I found is that this stopped the keyboard from working in the bootloader,
but I didn't figure out why.
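
As a rough sketch, the installboot step above could be wrapped in a small
helper. The function name set_serial_console and the INSTALLBOOT override
are made up for illustration, and the speed=115200 option is an assumption
that is not part of the command quoted above (the apu2 firmware console
normally runs at 115200 baud; adjust or drop it to suit):

```sh
#!/bin/sh
# Sketch only: point the bootloader on an already-written install image
# at the first serial port.  Run as root on another NetBSD system.
#
# set_serial_console DEVICE [SPEED]
#   DEVICE  raw partition of the USB stick, e.g. /dev/rsd0a
#   SPEED   baud rate (assumption: 115200, the usual apu2 setting)
set_serial_console() {
    dev=$1
    speed=${2:-115200}
    # INSTALLBOOT is a hypothetical override so the command can be
    # previewed without touching a disk, e.g. INSTALLBOOT='printf %s\n'.
    ${INSTALLBOOT:-installboot} -e -o "console=com0,speed=${speed}" "$dev"
}

# Example (not run here):
#   set_serial_console /dev/rsd0a
```

If keyboard input does work at the boot prompt, the interactive route
mentioned above should amount to typing "consdev com0" there before
booting.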


One other thing: I'd recommend a recent (last 2 weeks or newer) 8.0 snapshot,
as this contains the change that makes NetBSD recognise the APU SATA controller
properly, although I've found it works pretty well with a high-speed SD card as
the disk.


ACPI and SMP both work perfectly.


IME, Pascal is pretty supportive of FOSS efforts.  You should try to keep him
in the loop and perhaps gain some additional insights that he/they can offer.


Re: Machinfo struct?

2018-12-21 Thread Don NetBSD

On 12/19/2018 10:58 AM, Rocky Hotas wrote:


Is there a struct/XML somewhere that describes the hardware
in the machine much like parsing dmesg might?  I.e., to
ensure the "current" state hasn't changed from a "desired"
state?  (boards removed/added, memory complement altered,
etc.)


You almost surely already know it, but anyway, from a root
shell:

drvctl -lt mainbus0

will produce a tree based on the configuration shown in dmesg.
It is very elementary information, plain text and not XML.
However, maybe you could do some scripting to compare this
tree with a previously stored one, and then detect the
possible changes between them.
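
A minimal sketch of that comparison, assuming a baseline listing was
captured on a known-good boot. The function name hw_check and the file
locations in the comments are hypothetical:

```sh
#!/bin/sh
# hw_check BASELINE CURRENT
# Compare a stored device-tree listing against a freshly captured one.
# Returns 0 if they are identical; otherwise prints a unified diff and
# returns 1.
hw_check() {
    baseline=$1
    current=$2
    if diff -u "$baseline" "$current"; then
        echo "hardware unchanged"
        return 0
    fi
    echo "hardware CHANGED" >&2
    return 1
}

# Typical use on a NetBSD box, as root (paths are placeholders):
#   drvctl -lt mainbus0 > /var/db/hwtree.baseline   # once, known-good
#   drvctl -lt mainbus0 > /tmp/hwtree.now
#   hw_check /var/db/hwtree.baseline /tmp/hwtree.now || echo "investigate"
```

This only runs once userland is up, so it detects a change after the
fact rather than preventing the boot.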


Yes, but I'm looking to do this before init(8) is even invoked...
sometime just after the kernel has finished probing everything
but before it passes control to init(8) -- so the boot will be
unconditionally (and unavoidably) *aborted* if the hardware isn't
"as it should be".


openssl.cnf(5)

2018-12-10 Thread Don NetBSD

What's the expected location for the "default" openssl.cnf?


man.conf(5)

2018-12-08 Thread Don NetBSD

I'm looking at 7.1/i386...

Man page for man.conf(5) claims "_whatdb" to be the name of the apropos db.
However, /etc/man.conf actually includes a reference to "_mandb".

ISTR _whatdb was working on 6.1 but haven't yet tested _whatdb vs. _mandb
on a 7.1 system.  Can someone more knowledgeable comment and indicate
whether or not man page needs to be updated vs. the configuration file,
itself?


Machinfo struct?

2018-12-01 Thread Don NetBSD

Is there a struct/XML somewhere that describes the hardware
in the machine much like parsing dmesg might?  I.e., to
ensure the "current" state hasn't changed from a "desired"
state?  (boards removed/added, memory complement altered,
etc.)


Re: Sun widget

2018-11-30 Thread Don NetBSD

On 11/30/2018 1:46 PM, Brett Lymn wrote:

On Fri, Nov 30, 2018 at 12:21:46PM -0700, Don NetBSD wrote:


It is *exactly* one-to-one.  I.e., it's just a 3 inch "extender cable".

Perhaps to make the tightly spaced network connectors on the rear of the
box more accessible to folks with fat fingers?  (That seems ludicrous).


I have seen that sort of thing used in test environments where connectors
are plugged/unplugged frequently.  You don't want to wear out the on board
connectors which are hard to replace so you make a sacrificial cable. When
the connector wears out then you can replace it with a new one. Is it
possible the netra was part of some automated test system?


Dunno.  I just pulled it (and some other items) out of a pile of stuff
that was being scrapped.  No idea who the original owner was nor how
it was used (and I'm not supposed to look at the contents of the disk
drives -- before wiping them -- to determine any of that  :< ).

The little widgets were still in their original poly bags -- along with
other odds-and-ends.  So, *they* hadn't seen any prior use...


Re: Sun widget

2018-11-30 Thread Don NetBSD

On 11/30/2018 4:34 AM, Julian Coleman wrote:

Hi,


I have some little (3 inch) widgets with an RJ45 plug on one end and
jack on the other.  Carrying a (Sun?) part number of 422764100011.  I
suspect they may have been part of the Netra T5220 I recently acquired.

First thought was perhaps a crossover adapter?  But, the Netra's inet
ports are GbE so that shouldn't be necessary.  Unless it is intended for
use with the (100Mb?) "management port"?

Or, perhaps a means of converting the serial management port's DB9<->RJ45
adapter to DCE/DTE pinout??


The last.  If I remember correctly, this stems from Sun's interpretation of
the serial standard, particularly which end was the computer and which end
was the terminal.  Sun serial ports were wired as a computer, but PC's
(maybe everyone copied the IBM PC?) were wired as a terminal.  When RJ45
serial connectors replaced DB25/DB9 connectors, Sun continued to have the
different wiring (also shared by Cisco) for its console connections.  So,
if you connect a terminal server to a Sun or Cisco console, you'll need a
crossover adapter.  The Avocent/Cyclades ADB0039 is the same, e.g.:

   https://www.kvm-switches-online.com/adb0039.html


Another respondent (offlist) suggested similarly.

However, I finally sat down with an ohm-meter (my cable checker has
a faulty display... should get around to fixing that, one of these days!)
and a magnifying glass (eyes don't work as well as they used to) to
sort out the pin mapping.

It is *exactly* one-to-one.  I.e., it's just a 3 inch "extender cable".

Perhaps to make the tightly spaced network connectors on the rear of
the box more accessible to folks with fat fingers?  (That seems
ludicrous).

On closer inspection (now that I have the magnifying glass in hand), the
RJ45S end of the adapter is a *true* RJ45S and not just an 8P8C.  So,
while likely not something for which I'll have a frequent need, they're
still worth tossing in the box of "miscellaneous widgets" as *when*
needed, there won't be any practical alternatives (short of replacing
the cable)!


Sun widget

2018-11-29 Thread Don NetBSD

I have some little (3 inch) widgets with an RJ45 plug on one end and
jack on the other.  Carrying a (Sun?) part number of 422764100011.  I
suspect they may have been part of the Netra T5220 I recently acquired.

First thought was perhaps a crossover adapter?  But, the Netra's inet
ports are GbE so that shouldn't be necessary.  Unless it is intended for
use with the (100Mb?) "management port"?

Or, perhaps a means of converting the serial management port's DB9<->RJ45
adapter to DCE/DTE pinout??

[Google has offered up nothing meaningful]


PnP? device identification

2018-11-29 Thread Don NetBSD

I have various "not configured" (PnP?) devices attached at acpi0 showing
up on a 7.1/i386 kernel:  {MCH, COPR, RMSC, OMSC, PCIE, RMEM}.  How can I
sort out which devices they pertain to in order to determine if they are
being serviced by another driver attachment?

And, for those found not to be handled at all, to sort out which drivers
to include in a new kernel configuration?


Re: Netra T5220

2018-11-19 Thread Don NetBSD

On 11/19/2018 2:21 PM, Brett Lymn wrote:

On Sun, Nov 18, 2018 at 03:57:53PM -0700, Don NetBSD wrote:

Think of it as having a similar function wrt the ILOM as the OBP has to
the OS in older Sun boxen.


No, this is a totally different processor, the SP is, effectively, a
separate computer that has hooks into the main machine's hardware for
monitoring and control.  Changing things at the u-boot level has no real
effect on the main machine, just the SP.  You can still access the OBP
when you start the host console.


That's not what I said.  I made the analogy that u-boot is to the SP
as OBP is to SunOS (on an "older", no-SP box).

I.e., if you look at the SP as a product in itself, u-boot is the
preboot environment -- in much the same way that OBP provides a
"preboot environment" for Solaris.


Of course, at $WORK, you're not trying to get INTO a box that someone
has locked up -- as YOU are the party who likely locked it up in the first
place!


Indeed, what I really meant is that I have never seen any official
Oracle documentation for the SP boot.  It is not something that they
encourage you to poke at.


Actually, they do!  Just not for THIS product!  I've been grep-ing
documentation for other (Sun) products with SP equivalents and
taking my cue from what I find, there, to decide what to poke at,
here.


OTOH, when a system falls into your lap, you don't always have that sort
of access.  So, you need to rely on mechanisms that the designers put in
place to make this sort of thing possible!


Yes, it sounds like a lot of it is aimed at disaster recovery, when the
machine has cratered.  At $WORK that usually means somebody's services
are down, which they normally get agitated about, at which point we
usually either have an Oracle field engineer out or have support
guiding us.


With rescued kit, I don't have the luxury (or expense!) of a support contract.
So, the more I can learn about a box, on my own, the better off I will be in
the long run.

I create elaborate sets of notes for the stuff that I uncover/discover as
it will likely be "a long time" before I find myself staring at the
same box in some degraded capacity (I don't trust my meatware to hold onto
all of those sorts of details)



Re: Netra T5220

2018-11-18 Thread Don NetBSD

On 11/18/2018 2:09 PM, Brett Lymn wrote:

On Sat, Nov 17, 2018 at 06:53:51PM -0700, Don NetBSD wrote:

On 11/17/2018 1:52 PM, Brett Lymn wrote:

On Fri, Nov 16, 2018 at 02:11:26PM -0700, Don NetBSD wrote:


Yes, but what's the prompt BEFORE that (u-boot>)?  And, where do I
find the capabilities, there, documented?


As someone else mentioned that is the Service Processor boot, it is a
cut down linux image, IIRC running on powerpc.  I doubt if you will find
much publicly available information on the guts of the SP... even if
you have access to Oracle support, it is not something that Oracle
customers are meant to mess with.


No.  The prompt BEFORE the service processor starts Linux.


Yes, I know what you are talking about.  I deal with Sun/Oracle
equipment at $WORK.  I have seen that prompt.  I have never seen any
documentation as to what you can do there.  I would be surprised if
there is anything available at all outside Oracle - I think the attitude
is that the customers don't need to mess with the SP at all and should
be treated just as a firmware blob (which is, in fact, how the updates
are provided - a blob for the linux image plus OFW update)


With it, you can:
- reconfigure the serial port parameters
- adjust the delay before Linux boots (give you more time to interrupt
  that process)
- reset the password (that Linux will request at its "login:" prompt)
- upgrade the ILOM firmware (without ILOM *or* OS being functional!)
- reset the default ILOM parameters (e.g., for network settings)
- configure the ILOM network parameters
- test the ILOM's network connection (e.g., ping other hosts)
- indicate whether or not physical presence is required to break autoboot
- connect to the "system's" serial port (bypassing ILOM)
- add/delete "users"
- examine SP settings
- enable/disable the front panel power button
- power down the host
- reset the SP and/or host
- run diagnostics on the SP

and, of course:

- boot the ILOM

[Of course, I suspect I'll uncover additional uses as I tinker more with it!]

Think of it as having a similar function wrt the ILOM as the OBP has to
the OS in older Sun boxen.

Of course, at $WORK, you're not trying to get INTO a box that someone
has locked up -- as YOU are the party who likely locked it up in the first
place!

OTOH, when a system falls into your lap, you don't always have that sort
of access.  So, you need to rely on mechanisms that the designers put in
place to make this sort of thing possible!


Re: Netra T5220

2018-11-18 Thread Don NetBSD

On 11/18/2018 2:01 AM, Sad Clouds wrote:

On Sat, 17 Nov 2018 18:53:51 -0700
Don NetBSD  wrote:


The earlier "u-boot" prompt is significant as it lets you tinker with
the "pre-Linux" environment.  Among other things, it lets you erase
the Linux password so you CAN log into the SP (if you'd lost that
information)

I want to know what else it is useful for (besides exploring "help" at
that prompt)


U-Boot is just a bootloader, like GRUB; the only useful thing it does
is boot embedded Linux. Not sure why you'd want to tinker with that,
because if you misconfigure/damage it, you may find your hardware no
longer boots.


Actually, it does a fair bit more than "just boot Linux" -- hence the
reason to tinker with it!  :>  (why would it have a command interpreter if
the only thing it could do was "boot"?)

[Hint: ask yourself what you'd do if you didn't have the password to the ILOM;
or, if the Linux image had been corrupted/wouldn't boot; or if you wanted to
reflash that image (e.g., to support 11.4); or, if the serial port wasn't
"connected" to the ILOM]

For folks like me who acquire these devices without being able to
speak to the previous owner ("what's root's password?"), it's an
essential tool for getting into a box that may typically have been
locked up to prevent casual access (esp as you can't "pull the SP's
disk" to alter its contents off-line)

The hooks have been placed there, for a reason.  Silly NOT to understand
them and use them!


But anyway, good luck with your investigations


Re: Netra T5220

2018-11-17 Thread Don NetBSD

On 11/17/2018 1:52 PM, Brett Lymn wrote:

On Fri, Nov 16, 2018 at 02:11:26PM -0700, Don NetBSD wrote:


Yes, but what's the prompt BEFORE that (u-boot>)?  And, where do I
find the capabilities, there, documented?


As someone else mentioned that is the Service Processor boot, it is a
cut down linux image, IIRC running on powerpc.  I doubt if you will find
much publicly available information on the guts of the SP... even if
you have access to Oracle support, it is not something that Oracle
customers are meant to mess with.


No.  The prompt BEFORE the service processor starts Linux.

Apply power...

   U-Boot 1.1.1

   custom Sun Microsystems U-Boot 1.3 (Dec  6 2011 - 11:01:09) r61032

   CPU:   MPC885ZPnn at 133 MHz: 8 kB I-Cache 8 kB D-Cache FEC present
   Board: SPARC885
  Watchdog enabled
   I2C:   ready
   DRAM:
   trying 128 MBytes
   (128 MB SDRAM) 128 MB
   Memory Tests: DA A1 A2 00 FF 55 AA T2 T3 T4
   POST memory PASSED
   FLASH: 32 MB
   In:    serial
   Out:   serial
   Err:   serial
   Net:   FEC ETHERNET
   POST i2c  c  d 14 18 2a 2d 2e 30 40 43 46 51 53 54 56 59 68 69 6a 6b 70 71 


   PASSED
   POST cpu PASSED
   POST ethernet PASSED
   Booting linux in 30 seconds...

At this point, Linux hasn't booted.  You can abort the process (by asserting
your physical presence or with magic keystrokes).  You end up with a "preboot"
prompt:

   u-boot> version

   U-Boot 1.1.1

   custom Sun Microsystems U-Boot 1.3 (Dec  6 2011 - 11:01:09) r61032
   u-boot> boot

Now, you've issued the boot command to boot Linux on the SP:

   ## Booting image at fe08 ...
  Image Name:   Linux-2.4.22
  Image Type:   PowerPC Linux Kernel Image (gzip compressed)
  Data Size:    815088 Bytes = 796 kB
  Load Address: 
  Entry Point:  
  Verifying Checksum ... OK
  Uncompressing Kernel Image ... OK
   do_bootm_linux():
 argv[0]=bootm
 argv[1]=0xfe08
   ## Current stack ends at 0x07D388B8 => set upper limit to 0x0080
   No initrd
   ## cmdline at 0x007FFF00 ... 0x007FFF80

yada yada yada... eventually, you get a login prompt:

   login: root
   Password: changeme
   Waiting for daemons to initialize...
   .
   Timed out waiting for daemons to start
   sccd daemon has shutdown

   Oracle(R) Integrated Lights Out Manager

   Version 3.0.10.4 r61032

   Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.

   Warning: password is set to factory default.

   ->

And now you have the SP prompt.

The earlier "u-boot" prompt is significant as it lets you tinker with
the "pre-Linux" environment.  Among other things, it lets you erase the
Linux password so you CAN log into the SP (if you'd lost that information)

I want to know what else it is useful for (besides exploring "help" at
that prompt)


Re: /var on tmpfs

2018-11-17 Thread Don NetBSD

On 11/16/2018 3:47 PM, David Young wrote:

I added a line to /etc/fstab,

swap /mfs tmpfs rw,-s8M 0 0


If memory is limited, doesn't that make creating a MFS swap just useless
overhead?  I.e., if you need to page to swap, you're just consuming memory
set aside FOR swap instead of "main memory"/buffer pool?


I modified my rc.conf to 1) indicate that /etc, /var, temporary and
home directories should be on (ephemeral!) memory filesystems, and 2)


Doesn't mfs have a higher overhead than tmpfs?


ensure that the prerequisite filesystems (/usr) were mounted before
mountcritmem ran.



If this works for you, too, maybe mountcritmem should go into the base
system.


With memory also being in short supply, I think the approach I should
take is to create a "volatile" partition backed with tmpfs mounted at,
for example, /volatile.

Then, *selectively* add symlinks from the ro portion of the filesystem
into that.  E.g., /tmp can point at /volatile/tmp to ensure ALL of
/tmp is backed with a volatile store.

This allows a finer-grained use of that (precious) resource -- memory.
E.g., printcap(5) can remain in the ro backed /etc... but, /etc/myname
can move to the volatile store.
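
A sketch of the selective-symlink step, under the assumption that it is
run once while / is still writable. The function name relocate and its
argument order are made up for illustration:

```sh
#!/bin/sh
# relocate ROOT VOLROOT RELPATH
# Move ROOT/RELPATH to VOLROOT/RELPATH and leave a symlink in its place,
# so the read-only tree points into the volatile store.
relocate() {
    root=$1
    volroot=$2
    rel=$3
    mkdir -p "$(dirname "$volroot/$rel")" || return 1
    mv "$root/$rel" "$volroot/$rel" || return 1
    ln -s "$volroot/$rel" "$root/$rel"
}

# Examples (not run here):
#   relocate / /volatile tmp           # all of /tmp becomes volatile
#   relocate / /volatile etc/myname    # a single file from /etc
```

Anything relocated this way exists only in the volatile store, so it has
to be recreated there on every boot before the symlink is first used.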

[Aren't there some parts of /etc that various daemons update and, thus,
need to be writable?]

Finally, after instantiating everything that needs to reside in /volatile
(proper filenames, contents, ACLs, etc.) -- along with any required symlinks
from the ro portion of the filesystem -- build a tarball and stash it on
the persistent portion of the filesystem.

Then, reboot just needs to create /volatile and mount it; then populate it
with the contents of that tarball.

One caveat:  I'd have to create a /volatile hierarchy on the persistent
medium and mount the tmpfs on top of it.  This ensures that the parts
of the filesystem that will eventually be volatile are present even before
the system moves to multiuser!

Have I missed anything?

Finally, besides /var and /tmp, which *files* must be mutable -- acknowledging
that this may depend on the actual set of services in use, at the time (DHCP
client, BIND, etc.)?  Does /dev need to be mutable -- aren't owners and
perms changed dynamically for some devices by the system/services?


Re: Netra T5220

2018-11-16 Thread Don NetBSD

On 11/16/2018 2:53 PM, Sad Clouds wrote:

On Fri, 16 Nov 2018 13:22:19 -0700
Don NetBSD  wrote:


So, it seems like there are a boat load of prompts -- "u-boot>",
"->", "ok" ...  And, nothing that seems to summarize ALL of
the pertinent environments in which you can be interacting with
the box.


You're getting confused between various consoles. This is what I do to
log in

I have a Linux laptop connected to T5220 via serial cable


Here I use 'cu' to initiate connection:

# chmod 666 /dev/ttyUSB0
# cu -l /dev/ttyUSB0


Fine, I use tip(1)...


First login takes me to service processor console:

SUNSP00212824CA7D login: root
Password:
Waiting for daemons to initialize...

Daemons ready

Sun(TM) Integrated Lights Out Manager

Version 2.0.4.27.g

Warning: password is set to factory default.


This is where we differ.  I get a U-Boot banner (actually, TWO of them)
followed by a "u-boot>" prompt.  If I issue the "boot" command at that
prompt, the service processor boots (lots of diagnostic output before
finally offering up the "->" prompt)

Once at the SP prompt, I can progress to a console, as below.  I just
have this "extra step" BEFORE the SP prompt is available.

And, am unsure of what I can do, there (e.g., I was able to reset
the password for the SP using commands from the "u-boot>" prompt)


The following command takes me to system console. After I type 'y' and
hit Enter key, Solaris login prompt appears:

-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y

Serial console started.  To stop, type #.

t5220 console login:




I don't know if your system is bootable and if it has Solaris running.
If not, then download ISO, burn to DVD and try booting from that.


The system boots (5.10) -- if I want to go that far.  Right now, I want
to sort out how to get the box into a configuration that I can at least
document.  Then, figure out how I might want to CHANGE that to suit my
specific needs.

Starting at the "u-boot level" seems the most prudent...



Re: /var on tmpfs

2018-11-16 Thread Don NetBSD

On 11/16/2018 12:35 PM, Rhialto wrote:

I once made a little script to make a bootable ISO9660 live file system,
given the distribution tarballs. It has to be able to live on a
read-only medium, hence it uses a tmpfs for /var. For initializing it,
it installs a script in /etc/rc.d. I basically used trial and error;
everything that produced an error message while booting was reason for
adding an extra directory or empty file.

https://www.falu.nl/~rhialto/mkiso

I just gave it a quick try, and qemu seemed a looot slower than
previously (when I last tried was under 7.0.2 I think)...


Thanks, I'll have a look.


Re: Netra T5220

2018-11-16 Thread Don NetBSD

On 11/16/2018 1:56 PM, Brett Lymn wrote:

On Fri, Nov 16, 2018 at 01:22:19PM -0700, Don NetBSD wrote:

Version 3.0.10.4 r61032

Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.

Warning: password is set to factory default.

-> version
SP firmware 3.0.10.4
SP firmware build number: 61032
SP firmware date: Tue Dec  6 10:59:21 PST 2011
SP filesystem version: 0.1.22

->


ok, that is the SP prompt. Get a console using:

start /HOST/console


Yes, but what's the prompt BEFORE that (u-boot>)?  And, where do I
find the capabilities, there, documented?

[I'm sure that a console won't let me into whatever OS is installed
as I've no idea what the root password is likely to be]

BTW, examining some of the logs suggests it is (was?) running
Solaris 5.10



Re: Netra T5220

2018-11-16 Thread Don NetBSD

On 11/16/2018 1:27 AM, Sad Clouds wrote:

On Fri, 16 Nov 2018 01:01:18 -0700
Don NetBSD  wrote:


[probably best to take this off list?]


I am hoping to make some time to play with this over the weekend
(or, over the holiday).  Right now, it's just "in the way"  :-/


Not sure about your case specifically, but on my system there is an
ILOM SP (service processor), this is separate from UltraSPARC T2
processor.


Yes, this was my first source of confusion (I was expecting the OFW
to more resemble my Voyager/U60/SB2000).


They use embedded Linux that boots into SP, which is what you see on
the banner. This allows you to ssh into the system when it is not
running and configure/upgrade firmware, start/stop OS, etc. By default,
ILOM uses DHCP to acquire IP address and the default login/password is
root/changeme. There is a special management port that you need to plug
to the rest of your network. Alternatively you can use serial-to-usb
cable, which I guess is what you're doing since you can see SP boot
messages.


I'm using tip(1) over a regular serial port.


I think the SP is some kind of embedded IBM Power processor.

https://docs.oracle.com/cd/E19350-01/820-3010-12/820-3010-12.pdf


Note, however, that neither "U-Boot" (which is part of the banner)
NOR "u-boot>" (which is the prompt that appears) exists anywhere in
this text!

So, it seems like there are a boat load of prompts -- "u-boot>",
"->", "ok" ...  And, nothing that seems to summarize ALL of
the pertinent environments in which you can be interacting with
the box.

Note, for example, the different responses to the "version" command
(no doubt, this is "old" -- but HOW old?  Which versions of Slowaris
might it support -- without a firmware upgrade?):
---
U-Boot 1.1.1

custom Sun Microsystems U-Boot 1.3 (Dec  6 2011 - 11:01:09) r61032

CPU:   MPC885ZPnn at 133 MHz: 8 kB I-Cache 8 kB D-Cache FEC present
Board: SPARC885
   Watchdog enabled
I2C:   ready
DRAM:
trying 128 MBytes
(128 MB SDRAM) 128 MB
Memory Tests: DA A1 A2 00 FF 55 AA T2 T3 T4
POST memory PASSED
FLASH: 32 MB
In:    serial
Out:   serial
Err:   serial
Net:   FEC ETHERNET
POST i2c  c  d 14 18 2a 2d 2e 30 40 43 46 51 53 54 56 59 68 69 6a 6b 70 71 
PASSED
POST cpu PASSED
POST ethernet PASSED
Booting linux in 30 seconds...

(*** abort boot ***)

u-boot> version

U-Boot 1.1.1

custom Sun Microsystems U-Boot 1.3 (Dec  6 2011 - 11:01:09) r61032
u-boot> boot
## Booting image at fe08 ...
   Image Name:   Linux-2.4.22
   Image Type:   PowerPC Linux Kernel Image (gzip compressed)
   Data Size:    815088 Bytes = 796 kB
   Load Address: 
   Entry Point:  
   Verifying Checksum ... OK
   Uncompressing Kernel Image ... OK
do_bootm_linux():
  argv[0]=bootm
  argv[1]=0xfe08
## Current stack ends at 0x07D388B8 => set upper limit to 0x0080
No initrd
## cmdline at 0x007FFF00 ... 0x007FFF80
...yada yada yada

login: root
Password: changeme
Waiting for daemons to initialize...
.
Timed out waiting for daemons to start
sccd daemon has shutdown

Oracle(R) Integrated Lights Out Manager

Version 3.0.10.4 r61032

Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.

Warning: password is set to factory default.

-> version
SP firmware 3.0.10.4
SP firmware build number: 61032
SP firmware date: Tue Dec  6 10:59:21 PST 2011
SP filesystem version: 0.1.22

->


Re: /var on tmpfs

2018-11-16 Thread Don NetBSD

On 11/16/2018 11:12 AM, Jeremy C. Reed wrote:

On Thu, 15 Nov 2018, Don NetBSD wrote:


I've a box with a DoM.  I'd like to mount / as ro and create a
tmpfs for /var (and /tmp).  I don't think anything else NEEDS to
be rw (the infrequent changes to /etc can be made by unlocking /
to make those changes).

I imagine I can just make a tarball of a skeletal /var and
unpack this over /var, once mounted?

Is there a preexisting mechanism for this sort of thing?
Or, do I roll my own?


Have a look at the /etc/mtree/ specifications. Many /var/ entries in
there.  You could use it to create your own spec file for your required
files and directories with correct ownership and permissions and then
run mtree to generate them.


Ah, that would be a clever approach -- and, add little/nothing to the
image size as the entries would already exist in the existing specs
(I'd just be "moving" them into another spec).

But, it won't let me create *files*.

So, if I wanted to symlink all or part of /etc to, for example, /var/etc
(to eliminate the need for creating a second tmpfs -- and incurring a
second "overhead"), I'd still need a mechanism to instantiate those
files under /var.


Or (looking at my notes from 2002), I used a /var.copy directory
pre-populated as needed and, after /var was mounted, ran "cp -R -p
/var.copy/* /var" to copy it in.


I'd thought:

# mount_tmpfs tmpfs /var

-- populate /var, as needed

# mount -u /
# tar czpf /somewhere/var.tgz /var

Then, just unpack the tarball onto the newly mounted /var in rc(5).
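
Roughly, the save and restore halves might look like the sketch below.
The names save_skel and restore_skel and the tarball location are
hypothetical, and the archive is left uncompressed here for simplicity
(add -z as in the tar command above if space matters). Paths are stored
relative to the source directory so the archive can be unpacked onto any
mount point:

```sh
#!/bin/sh
# save_skel SRCDIR TARBALL
# Archive the contents of SRCDIR, with paths relative to SRCDIR.
save_skel() {
    tar -cpf "$2" -C "$1" .
}

# restore_skel TARBALL DESTDIR
# Unpack a skeleton archive into DESTDIR, preserving permissions.
restore_skel() {
    tar -xpf "$1" -C "$2"
}

# Build time, with / writable and /var populated (not run here):
#   save_skel /var /somewhere/var.tar
# Boot time, from an rc script, after the tmpfs is mounted:
#   mount_tmpfs tmpfs /var
#   restore_skel /somewhere/var.tar /var
```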

But, regardless, the point is that there is no preexisting mechanism
in place for this sort of thing?  E.g., FBSD had an rc.diskless
(a bit of overkill) that could be modified to achieve these sorts of
results.


Re: Netra T5220

2018-11-16 Thread Don NetBSD

On 11/16/2018 12:05 AM, Sad Clouds wrote:

On Thu, 15 Nov 2018 22:10:26 -0700
Don NetBSD  wrote:


This may well be the killer.  Someone appears to have flashed a
custom OFW image -- which I'll have to rid the machine of before I
can do ANYTHING with it.


If someone put the latest firmware version there, then keep it. There
is probably a way to reset passwords without flashing firmware, could
be a jumper on the mainboard.


The banner says:
U-Boot 1.1.1


custom Sun Microsystems U-Boot 1.3
^^

And, autoboots a Linux 2.4.22 kernel (which must reside on internal FLASH
as it boots even with the drives pulled!)

I am hoping to make some time to play with this over the weekend
(or, over the holiday).  Right now, it's just "in the way"  :-/


/var on tmpfs

2018-11-15 Thread Don NetBSD

I've a box with a DoM.  I'd like to mount / as ro and create a
tmpfs for /var (and /tmp).  I don't think anything else NEEDS to
be rw (the infrequent changes to /etc can be made by unlocking /
to make those changes).

I imagine I can just make a tarball of a skeletal /var and
unpack this over /var, once mounted?

Is there a preexisting mechanism for this sort of thing?
Or, do I roll my own?


Re: Netra T5220

2018-11-15 Thread Don NetBSD

On 11/15/2018 3:46 AM, Sad Clouds wrote:

On Wed, 14 Nov 2018 17:00:06 -0700
Don NetBSD  wrote:


I've rescued a Netra T5220 (haven't attached a console, yet).



2. Solaris 11.3 is pretty good and has many features not available
in NetBSD, such as LDOMs, Zones, ZFS, good multithreading in kernel. It
also has its native pkg, so you can install additional software, or you
can use pkgsrc. I've not tried it yet, but I think you could run
Solaris as a primary LDOM and then a number of Solaris/OpenBSD
instances in guest LDOMs.


These (and the 20 inch depth) are what originally attracted me to the box.
I was hoping I could add a pair of dual channel SAS controller PCIe
cards to attach external arrays (I see *some* support for these under 11).


3. Not sure how Solaris is licensed these days, but if you're going to
use it for commercial purposes, you will probably need to pay. It may
also need the latest firmware upgrades, which you cannot get without an
Oracle support contract.


This may well be the killer.  Someone appears to have flashed a custom OFW
image -- which I'll have to rid the machine of before I can do ANYTHING
with it.

I have a colleague who worked for Sun before the Oracle buyout who may still
have access to current (or even previous!) patches.  If not, I'll probably swap
the box for something more convenient.


4. Future versions of Solaris have a tendency of removing support for
older hardware, so your upgrade path is rather limited. However if
you're not using hardware in a production environment, then you may not
need all the latest features and bug fixes.


Exactly.  OTOH, NetBSD support would have left that door open for me...  :-/


Netra T5220

2018-11-14 Thread Don NetBSD

I've rescued a Netra T5220 (haven't attached a console, yet).

I'm soliciting comments as to whether I should leave/install
Slowaris on it or NetBSD.  I think the Slowaris option gives
me more "out-of-the-box" functionality (without having to
build/install the apps I might want)


Re: Trimming a diskless distribution

2018-10-16 Thread Don NetBSD

On 10/16/2018 2:23 PM, Brett Lymn wrote:

On Mon, Oct 15, 2018 at 06:44:30PM -0700, Don NetBSD wrote:


You're used to dealing with "computers" where you CAN change a piece of
software AFTER release.  I deal with devices/appliances where the cost of
upgrading the device far exceeds the cost of the device (and comes at
a huge "reputation cost" in the eyes of the user:  "You mean, this device
has been BROKEN all of this time?")


Yes, I figured that was what you were doing but if there is any chance
that your product will be featured in the technology news channels for
having vulnerabilities that allow it to be used by bot herders or crypto
currency miners, it would possibly be more embarrassing...


IME, that happens when folks embrace some (large) piece of software (e.g.,
a Linux kernel) that they don't completely understand -- because it is never
formally defined, in its entirety, in a way that those deploying it can
grok.

OTOH, when you develop a codebase specifically FOR a particular product, you
avoid the risk of adding "cruft" to cover features and mechanisms that you
aren't using.

I'm looking to combine the best of both worlds -- use NetBSD to give me a
flexible hardware platform that I can morph to suit the needs of proposed
products (e.g., USB peripherals instead of having those same devices "on
board" in a production version) at a prototype/proof-of-concept level;
but the established (and "understood") codebase that we're already supporting
for an eventual product deployment to free us from having to "support"
a NetBSD implementation.


Re: Trimming a diskless distribution

2018-10-15 Thread Don NetBSD

On 10/14/2018 2:36 PM, Brett Lymn wrote:

I've been "manually" invoking everything that I want/need to run and
capturing any errors logged to sort out what might be missing.


Right - atf can automate this bit for you.  If you are doing a
customised build then you will want to do this again if/when you update
to make sure things are not broken afterwards.  It means you can
validate things in a consistent and repeatable manner.


I think that's more than I need.  I'm going to pick *a* release
and stick with it (for a very long time -- updates will be difficult).
The bigger concern is deciding that I need to add some particular
binary to this "distribution".  Adding shouldn't break anything
that already works but could require additional dependencies that
haven't been present, prior to that point.


OK, if you say so but manually bashing through all the tests time after
time just to track down some obscure crash is pretty tedious in my mind.

You probably should think seriously about maintaining the image even if
it is for back-porting security updates.


You're used to dealing with "computers" where you CAN change a piece of
software AFTER release.  I deal with devices/appliances where the cost of
upgrading the device far exceeds the cost of the device (and comes at
a huge "reputation cost" in the eyes of the user:  "You mean, this device
has been BROKEN all of this time?")

I'm exploiting the fact that I can throw a NetBSD system onto some COTS
hardware (without requiring it to be a "PC"), embellish it with outboard
devices (so I don't have to get monies approved to build prototypes) and
pitch a proof of concept prototype to Management -- to fund REAL
development efforts.

The NetBSD variant will then be unceremoniously scrapped (it would be silly
to deploy something as big and bulky and likely to need continued updates
in the future when we have our own codebase that addresses our needs far
more precisely).  I just have to ensure that the NetBSD-variant of the
device "appears" to perform all of the basic functions (so Management can
actually "tickle" the prototype and elicit the expected responses/behaviors)
and have a valid explanation for those that the device can't perform.


Re: sysinst(8), `Installing from an unmounted filesystem'

2018-10-12 Thread Don NetBSD

On 10/11/2018 7:26 PM, Robert Elz wrote:

   | For me, I address that with additional *drives* -- typically external.

Different problem/issue.   I don't care much about space any more, drives
have oodles of it, and it has become cheap (whatever connection method).
What matters is mount attributes, filesystem config (block sizes, etc) - those
are typically not all the same (pkgsrc has lots of little files, so does better
with small block/frag sizes - they're not written all that often, whereas other
srcs usually have bigger files, and perform better with slightly bigger blocks,
and distfiles tend to be huge, so work best with big blocks (and usually no
frags at all)).   For mount options, some want to be read only, some log, and
/usr/obj is typically async (if the system crashes, it can just be newfs'd if it
gets mangled).


"Space is cheap".  And, modern hardware -- even older hardware -- is still
faster than my meatware.  So, no real need to try to eke out extra performance.
There's always something ELSE that can be done while waiting for a "make world"
to complete -- updating build notes, formal documentation, tweaking some other
sources, laying out a PCB, etc.  When the "make" finishes, it can sit there and
wait for *me* to get around to checking up on it!  (computers are much more
patient than people)

So, it's not worth my time trying to optimize a filesystem for a particular
use (space/speed).  I have (literally) hundreds of 500GB+ drives.  Keeping
track of where they are physically stored is the most pressing practical
issue (I already track their individual contents with a RDBMS but that doesn't
help me find "disk #74")

For me, the issue is one of consistency -- making sure I can access a medium
on different systems, etc.  (e.g., no ZFS, RAID, etc. or other FS that might
not be supported in every kernel)


Re: sysinst(8), `Installing from an unmounted filesystem'

2018-10-11 Thread Don NetBSD

On 10/11/2018 4:55 PM, Robert Elz wrote:

   | Mounting /Sources and symlinking /Sources/src at /usr/src, /Sources/pkgsrc at
   | /usr/pkgsrc, /Sources/xsrc at /usr/xsrc, etc. covers all the bases.  Where
   | would you put xsrc if you mount a partition at /usr/src?  Ditto pkgsrc?

/usr/xsrc and /usr/pkgsrc both work, those are the standard places (they can
be separate partitions) though I use /usr/src/xsrc (and it remains part of
/usr/src) and /usr/src/pkgsrc (and is mounted separately).


Yes, my point was that a partition for /usr/src does nothing to address
/usr/xsrc or /usr/pkgsrc.  You either stuff them into /usr (which means
the size of /usr varies from machine to machine based on whether or not
those sources are present -- and, how many "packages" you are in the
process of building) or add them as separate partitions of their own.

As I'm unlikely to build packages and the system at the same time, I
can have a smaller amount of "working space" (for .o's) if all of the
sources share a partition (/Sources).


Distfiles (and packages) are (for me) separate partitions, mounted on /local
(not with symlinks, but using pkgsrc var settings to alter their locations) but
they could also be mounted under /usr/src/pkgsrc.


I have /usr/pkgsrc/distfiles point to /Sources/distfiles.  Then, I can put some
few sources there *or* mount a partition OVER /Sources/distfiles with a more
complete set of sources.


  "mysources" go in ~kre/src  but there's also
a /local/$hostname (mounted) which can have a src subdir if a host has a need
for sources all of its own, and if there was going to be a lot of that, it
would be a separate partition as well.

You might gather that I like separating data into multiple partitions - lots
of them...   Unlike some, I do not regard partitions as merely a mechanism
to get around issues with drives not being big enough, but as first class
objects with a whole set of properties of their own, which should be used
more, not less.


For me, I address that with additional *drives* -- typically external.  This
allows me to have a box with a 16MB (that's an M, not a G) DoM build its own
system -- by (temporarily) mounting the sources on an external medium.
And, use the same procedure for a 1G DoM, 20G disk or TB drive.

[I SneakerNet lots of stuff as I have /beaucoup des disques/ -- considerably
more than 100T]


   | [Note that the device mounted at /Sources may be an external drive that may or
   | may not be present at all times]

That's fine, any of this can be "noauto" in fstab.


Exactly.  The only partitions that mount, normally, are /, /var, /usr, /usr/pkg
and /home.  Everything else is a just a mountpoint.


   | It lets me keep sources off of systems that don't need them while keeping the
   | file hierarchy basically consistent -- the "basic" partitions are always the
   | same size and have the same content.

That's reasonable, I was not suggesting loading everything into /usr ...  just
not mounting anything (long term; /mnt or /cdrom are OK for short term use)
(except /usr /var and /tmp ... all of which are more or less critical, and so
matter less ... that is, if the drive holding /usr dies, you're screwed,
wherever it is mounted, whereas if /usr/src dies, the system should normally
never notice, you don't want things to hang because they're scanning /)

That, and that everyone has their own setup - proclaiming any as being
the way anyone else should do things is rarely a good idea.


Exactly.  I find a way that works for my hosts and my media.  One
that time has taught me will address the sorts of issues I'm likely
to encounter /with the way I use those systems/.


Re: sysinst(8), `Installing from an unmounted filesystem'

2018-10-11 Thread Don NetBSD

On 10/10/2018 7:28 PM, Robert Elz wrote:

   | I've typically installed them under /usr/src,

That's the standard path, though they can go anywhere.

   | (though usually symlinked to a separate partition mounted as /Sources).

You can certainly do it that way, but why not just  mount the
partition as /usr/src and do away with the symlink?   That's cleaner,
and generally better - mounting things in the root directory (including /mnt)
can cause even worse problems than should occur should the device
holding the mounted filesystem ever hang (refuse to do anything, either
due to software or hardware issues).


Mounting /Sources and symlinking /Sources/src at /usr/src, /Sources/pkgsrc at
/usr/pkgsrc, /Sources/xsrc at /usr/xsrc, etc. covers all the bases.  Where
would you put xsrc if you mount a partition at /usr/src?  Ditto pkgsrc?
Or, /Sources/distfiles?  Or, /Sources/mysources?

[Note that the device mounted at /Sources may be an external drive that may or
may not be present at all times]

It lets me keep sources off of systems that don't need them while keeping the
file hierarchy basically consistent -- the "basic" partitions are always the
same size and have the same content.
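
The symlink arrangement described above can be sketched in a few lines.
This is only an illustration: the LINKS table and the make_source_links
helper are hypothetical names, and the targets are allowed to dangle
when the external drive holding /Sources is not attached.

```python
import os
import tempfile

# Hypothetical mapping taken from the layout described above:
# link location (relative to the system root) -> target under /Sources.
LINKS = {
    "usr/src": "/Sources/src",
    "usr/xsrc": "/Sources/xsrc",
    "usr/pkgsrc": "/Sources/pkgsrc",
}


def make_source_links(root, links=LINKS):
    """Create the symlinks under `root`; return the link paths created."""
    created = []
    for rel, target in links.items():
        link = os.path.join(root, rel)
        os.makedirs(os.path.dirname(link), exist_ok=True)
        if os.path.islink(link):
            os.remove(link)  # replace a stale link
        elif os.path.exists(link):
            raise FileExistsError(f"{link} exists and is not a symlink")
        os.symlink(target, link)  # target may be absent (drive unplugged)
        created.append(link)
    return created


if __name__ == "__main__":
    # Demonstrate in a scratch directory rather than the real root.
    with tempfile.TemporaryDirectory() as root:
        for link in make_source_links(root):
            print(link, "->", os.readlink(link))
```

Running it against a scratch directory first (as the demo does) avoids
touching a live /usr until the mapping looks right.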


Re: Trimming a diskless distribution

2018-10-10 Thread Don NetBSD

On 10/8/2018 2:10 PM, Brett Lymn wrote:

On Sun, Oct 07, 2018 at 06:47:58PM -0700, Don NetBSD wrote:


I don't think so.  E.g., anything that relies on (or supports) remote
clients/services would require some explicit action at a remote node to
exercise those services.


For your purposes surely either just testing on loopback or on the local
interface from the machine would be sufficient?  You just wanted to make
sure things start.


Yes, but you have to arrange for all of those "things" to actually GET
started.  And, the bigger problem, if they fail when invoked, there's
very little to help you figure out what they might be missing.

You can look at the .so's that are linked to a (non-static) binary
and arrange for them to be present.  But, if they exec something else,
in turn, then you have to arrange for those binaries (and their
dependencies) to be present.
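
A rough sketch of the first half of that (collecting the .so's a binary
is linked against) follows.  It only parses text in the "name => path"
shape that ldd(1) prints; the exact layout is an assumption and should
be checked on the target system, and SAMPLE is made-up input.  It does
nothing for programs that are exec'd or libraries opened at run time.

```python
import re

# Matches the "name => /path/to/lib" lines that ldd(1) prints; the exact
# layout is an assumption and should be checked against the target system.
_LDD_LINE = re.compile(r"=>\s*(/\S+)")


def parse_ldd(output):
    """Return the sorted list of library paths named in ldd output."""
    libs = set()
    for line in output.splitlines():
        m = _LDD_LINE.search(line)
        if m:
            libs.add(m.group(1))
    return sorted(libs)


# Made-up sample text, for illustration only.
SAMPLE = """\
/bin/ls:
\t-lutil.7 => /lib/libutil.so.7
\t-lc.12 => /lib/libc.so.12
"""

if __name__ == "__main__":
    for lib in parse_ldd(SAMPLE):
        print(lib)
```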

Ideally, I'd like something like the "required to run" explicit dependency
specification in pkgsrc.

[Yes, I realize this is asking a lot.  So, I'm looking for a reliable way
of MANUALLY ascertaining these dependencies without having to rely on an
intimate knowledge of "how things work"]


E.g., Will ftpd(8) get "tested" if it is invoked via inetd.conf(5)?
(Ditto everything else therein).


ftp localhost

though you probably want something a bit more robust that fails within a
timeout so the tests move on.
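
Something along these lines would give the "fails within a timeout"
behaviour.  It is a minimal sketch: service_answers is a hypothetical
helper, and all it proves is that something accepted a TCP connection,
not that the daemon behind it works.

```python
import socket


def service_answers(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # 21 is the standard FTP control port; the host is only an example.
    print("ftp:", service_answers("127.0.0.1", 21, timeout=2.0))
```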

If you want to really test externally then you could use qemu to build a
test virtual machine and use the host as the "external" machine.


I'm only looking to "test" to the extent that nothing SIGSEGV's or
otherwise crashes because I failed to include something that it needed
in the target filesystem.  I assume that if I've got everything in
place, it "will work".


I've been "manually" invoking everything that I want/need to run and
capturing any errors logged to sort out what might be missing.


Right - atf can automate this bit for you.  If you are doing a
customised build then you will want to do this again if/when you update
to make sure things are not broken afterwards.  It means you can
validate things in a consistent and repeatable manner.


I think that's more than I need.  I'm going to pick *a* release
and stick with it (for a very long time -- updates will be difficult).
The bigger concern is deciding that I need to add some particular
binary to this "distribution".  Adding shouldn't break anything
that already works but could require additional dependencies that
haven't been present, prior to that point.


  The
errors alone don't tell you *what* you need to add -- they just tell you
that something is apparently "missing" or "not working properly".  It
then boils down to either having a familiarity with how each piece of
code works *or* digging through the sources to see what MIGHT be the
problem.


Right, this bit will still need to be done manually but at least you can
automate the running of the tests.

Anyway, up to you how you do this.  Just saying that this is the sort of
stuff that atf is meant for.

To repeat/summarize, my approach, thus far, has been to PXE boot a kernel
and let it NFS mount a remote filesystem that is initially empty.  Then,
examine the errors that are generated and start adding stuff to that
file system to eliminate each error -- which inevitably exposes another
error, etc.

Once the system made it to single user, I continued the process after telling
it to go multiuser.

Once it could correctly get to multiuser, THEN I started looking at the
additional programs that I wanted present -- adding each, individually;
invoking them, manually; then pacifying the errors that were thrown.

Because the order that I resolve errors has an impact on which NEW errors
might (or might not!) arise, what I learn from creating one "distribution"
(bad choice of terms) doesn't directly translate to creating a different
distribution.

E.g., an error thrown by PROGRAMA may be fixed by dragging libc into the
target filesystem.  Some time later, PROGRAMB may need something else -- but
it ALSO needed libc and didn't complain about that because libc was already
in place from PROGRAMA's needs!  If the "next" distribution omits PROGRAMA,
then my notes regarding PROGRAMB's needs (part of the new distribution)
will be incomplete.
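
One way around that, sketched under the assumption that the notes are
kept per program rather than per "error fixed": record the COMPLETE set
of things each program needs, even when an item was already present, and
take the union for whatever set of programs a given distribution carries.
The names and paths below are hypothetical.

```python
def files_needed(programs, requirements):
    """Union of the complete requirement sets of the chosen programs."""
    needed = set()
    for prog in programs:
        needed |= requirements[prog]
    return needed


# Hypothetical notes: each program lists everything it needs, including
# items that were already present when the program was first tried.
REQUIREMENTS = {
    "PROGRAMA": {"/lib/libc.so.12"},
    "PROGRAMB": {"/lib/libc.so.12", "/lib/libutil.so.7"},
}

if __name__ == "__main__":
    # A distribution that omits PROGRAMA still gets libc via PROGRAMB.
    print(sorted(files_needed(["PROGRAMB"], REQUIREMENTS)))
```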


Re: sysinst(8), `Installing from an unmounted filesystem'

2018-10-10 Thread Don NetBSD

What about the `Source set directories'? I found it neither in the
sparc64 nor in the amd64 ISO.

In this case, the installer (obviously, when you selected some source set to
be installed in a Custom installation) looks for




There is one set of sources that are used for ALL of the various ports.
Look for ./source/sets/{gnusrc,sharesrc,src,syssrc,xsrc}.tgz

I've typically installed them under /usr/src, et al. (though usually symlinked
to a separate partition mounted as /Sources).

E.g., my /Sources contains:
./src
./xsrc
./pkgsrc
...


Re: Trimming a diskless distribution

2018-10-07 Thread Don NetBSD

On 10/7/2018 2:07 PM, Brett Lymn wrote:

On Thu, Oct 04, 2018 at 12:16:39PM -0700, Don NetBSD wrote:


Of course, late bindings are a potential SNAFU in this approach -- unless
I inherently know how to precipitate those events!


Why not use the automated test framework to run everything you want to
run?  That should catch any errors for you.


I don't think so.  E.g., anything that relies on (or supports) remote
clients/services would require some explicit action at a remote node to
exercise those services.

E.g., Will ftpd(8) get "tested" if it is invoked via inetd.conf(5)?
(Ditto everything else therein).

I've been "manually" invoking everything that I want/need to run and
capturing any errors logged to sort out what might be missing.  The
errors alone don't tell you *what* you need to add -- they just tell you
that something is apparently "missing" or "not working properly".  It
then boils down to either having a familiarity with how each piece of
code works *or* digging through the sources to see what MIGHT be the
problem.


Trimming a diskless distribution

2018-10-04 Thread Don NetBSD

A question similar, in spirit, to that of Cág ("Correct way to trim the
distribution?")

What's a "good" way to trim the exported filesystem for a diskless
system?  (The goal being to eventually crunchgen a local system image
and eliminate the NFS mount entirely)

To date, I've been exporting an *empty* filesystem and incrementally adding
stuff to it, based on errors detected in the process of going to singleuser...
and, likewise, proceeding on to multiuser.

Of course, late bindings are a potential SNAFU in this approach -- unless
I inherently know how to precipitate those events!


Re: BSD disklabel partition letters in NetBSD

2018-10-04 Thread Don NetBSD

So, in a degenerate example, put 2 partitions on a disk that each represent
an entire root filesystem TO THE OS THAT IS BOOTED.


This is exactly the degenerate example I wanted to refer to. Let's consider
a BSD disklabel in the first sector of a hard disk (so, without MBR) with
the following partitions defined:

a: /, root partition of system A
b: unused
c: unused (in some cases, it represents the whole disk)
d: unused


In those cases where c is not the whole disk, then d would be.


e: /homeA, home partition of system A
f: swap partition of system A
g: /, root partition of system B
h: /homeB, home partition of system B
i: swap partition of system B

Let both system A and B be NetBSD, for example 8.0 and 7.1, so we are sure
that they are both fully-compatible with the BSD disklabel layout.


I've never tinkered with moving swap out of 'b' -- but imagine it could be
done, reliably.

I'm not sure why you would need two DIFFERENT swap partitions as only one
would be in use (based on which OS was booted).  But, let someone else
argue that point.

You can specify which NetBSD partition to boot at the boot prompt.  Or,
build a "menu" that provides a simpler interface to this.

fstab(5) in each root partition (/etc being part of that, in this example)
would call out 'e' or 'h' as the partition to be mounted on /home in that
particular root file system.  The "other" home partition could then be
mounted somewhere else (assuming the filesystem type is supported by the
kernels built for A/B).

E.g., fstab(5) for A's system would contain (assuming an sd(4) device):
   /dev/sd0a   /            ffs   ...
   /dev/sd0f   none         swap  ...
   /dev/sd0e   /home        ffs   ...
   /dev/sd0h   /otherhome   ffs   ...
   /dev/sd0g   /otherroot   ffs   ...
while B's would be:
   /dev/sd0g   /            ffs   ...
   /dev/sd0i   none         swap  ...
   /dev/sd0h   /home        ffs   ...
   /dev/sd0e   /otherhome   ffs   ...
   /dev/sd0a   /otherroot   ffs   ...
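
The two tables above follow mechanically from which letters each system
uses for root, swap and home.  A small sketch of that derivation (the
SYSTEMS table and fstab_for are hypothetical names; the option, dump and
pass fields are left out, just as they are elided above):

```python
# Partition roles from the example disklabel: system -> (root, swap, home).
SYSTEMS = {
    "A": ("a", "f", "e"),
    "B": ("g", "i", "h"),
}


def fstab_for(system, disk="sd0", systems=SYSTEMS):
    """Return fstab-style (device, mountpoint, type) rows for `system`."""
    root, swap, home = systems[system]
    rows = [
        (f"/dev/{disk}{root}", "/", "ffs"),
        (f"/dev/{disk}{swap}", "none", "swap"),
        (f"/dev/{disk}{home}", "/home", "ffs"),
    ]
    # Mount the other system's home and root under distinct names.
    for other, (oroot, _oswap, ohome) in systems.items():
        if other != system:
            rows.append((f"/dev/{disk}{ohome}", "/otherhome", "ffs"))
            rows.append((f"/dev/{disk}{oroot}", "/otherroot", "ffs"))
    return rows


if __name__ == "__main__":
    for name in SYSTEMS:
        print(f"# fstab for system {name}")
        for dev, mnt, fstype in fstab_for(name):
            print(f"{dev:<12}{mnt:<13}{fstype}")
```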


Now, boot system A from partition `a'. First: is it possible to do so? Then,
how would partitions `g', `h' and `i' be detected? In other words, what
would the output of `disklabel wd0' be from system A?


The disklabel is the same for each (there's only one, in this case).  I am
assuming that these are non-overlapping regions of the medium.  I.e., the
physical sectors used by A's root partition differ from those used by B's.

So, set up those partitions (size+offset) as befitting your needs.  Then,
just elect which to boot (boot prompt) and where to mount the others (fstab).


Ideally, at least partitions `g' and `h' should be mounted, fully readable
and writeable. But as regards mountpoints, there is some confusion, given
that partition `g' should be mounted in the same place as the already
mounted partition `a'. Or does this only depend on how fstab(5) has been set
up?


The disklabel just cuts the medium into "pieces" (avoiding the term "slices").
It doesn't know where those will be mounted -- if at all!  fstab(5)'s role is
to specify these mount points (assuming you don't deliberately do something
"outside" the normal approach -- like running a special script to mount stuff)


Then, boot system B from partition `g'. Same questions as above.


Re: BSD disklabel partition letters in NetBSD

2018-09-28 Thread Don NetBSD

On 9/27/2018 3:32 AM, Rocky Hotas wrote:

For the sake of completeness, let's consider another case, hopefully
interesting to others. If you decide to install on the same disk two or more
BSD systems, all compatible with BSD disklabel (for example, two different
versions of NetBSD, or NetBSD and FreeBSD), would that unique BSD disklabel
in sector 1 of the disk be able to handle this?


Remember, the partitions have no knowledge of where they actually exist in a
FILESYSTEM!

So, in a degenerate example, put 2 partitions on a disk that each represent
an entire root filesystem TO THE OS THAT IS BOOTED.

Boot partition 1 and you get one "system" (OS and filesystem).  Boot partition
2 and you get another.

To be more exotic, fstab(5) in the filesystem represented in partition 1
can call for /home to be mounted from partition 6 with /oldhome mounted from
partition 7.  Meanwhile, the fstab(5) in the filesystem represented in
partition 2 swaps this order (/home from partition 7 and /oldhome from
partition 6).


In all the examples I've seen, this data structure is conceived to describe
only a single system, with one root partition (and then optional separate
partitions as /home, /var, /usr according to the administrator's choice, but
all referred to the same root). Multiple OSs would mean multiple root
partitions.


Yes.


However, it would be very odd if, in order to allow the existence of
multiple BSD OSs, a third-party partitioning scheme such as MBR were needed.
If these questions can be answered by reading some documentation or some
other source and you know the link, I would check it out (unfortunately I
found almost nothing about this).


I think you can actually get even messier if you use an MBR to create 4
different MBR "slices" -- and put NetBSD labels *in* each of those defining
4/8/16 different "NetBSD partitions".  Then boot some particular NetBSD
partition from that set of 4/8/16 available!

Sorry to further complicate the discussion by drawing a distinction between
MBR /slices/ (which the rest of the world calls a /partition/) and NetBSD
/partitions/ within said slice(s).

[Graphics would really make a lot of this easier to explain!  :< ]


Re: BSD disklabel partition letters in NetBSD

2018-09-28 Thread Don NetBSD

On 9/27/2018 3:41 AM, Rocky Hotas wrote:

Sent: Tuesday, September 25, 2018 at 11:51 PM
From: "Don NetBSD" 
To: netbsd-users@netbsd.org
Subject: Re: BSD disklabel partition letters in NetBSD


[...]


But it isn't placed in any of these places unless/until the system is
DIRECTED to do so.  I.e., you can access a brand-new, never-before-powered-on
disk drive via the fictitious IN-KERNEL disklabel STRUCTURE.  But, need
never actually write it to the medium to allow such access (on a NetBSD box).


IIUC, this is a subtle (but important) clarification.


There are some caveats, IIRC.  E.g., the fictitious disk label also "imagines"
(a fictitious) 'a' partition.  But, an attempt to "newfs /dev/rsd#a" will fail.

The only thing you can be sure of is the "whole disk" partition -- and ONLY if
you haven't altered the in-kernel instance of it WITHOUT writing that to the
medium (cuz there is no other instance!).

ISTR (many major releases ago) having to put ALL of the label information on
the medium (using disktab(5)).


Re: BSD disklabel partition letters in NetBSD

2018-09-25 Thread Don NetBSD

On 9/25/2018 10:52 AM, Michael van Elst wrote:

On Tue, Sep 25, 2018 at 05:05:44PM +0200, Rocky Hotas wrote:


only if none is written to disk,
a fictitious label is generated from other data like an MBR.


Sorry, I can't understand this. Maybe it's related to the following
description:


disklabel is a data structure. If there is none on the disk, it is
generated from other information.


The problem comes from the fact that the term is severely overloaded.
There is a "data structure", a "portion of the medium" and a "software
program" that all share the name "disklabel".

The (fictitious) data structure needs to be explicitly written to the
medium -- by a sysadm's deliberate actions (most typically, by disklabel(8)!).
Otherwise, it is a purely ephemeral concept.


The disklabel would be used and 'd' would still be the raw partition.
The disklabel would also be placed on sector 1.


So, two disklabels in total. But what would be the contents of the
disklabel in sector 1?


With MBR, the disklabel is usually placed on relative sector 1 of
the MBR partition tagged as 'NetBSD' (type 169).

Without MBR, it is placed on absolute sector 1 of the medium.

N.B. the sector number can also be platform dependent, most use sector 1
but some use other sectors.
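
As a rough illustration of that rule, the sketch below inspects the
first 512-byte sector of a medium and reports where the label would be
expected.  It assumes the conventional MBR layout (four 16-byte entries
at offset 446, signature 0x55AA at offset 510) and the "sector 1" case
only; treating the signature as proof of an MBR is a simplification, and
label_sector is a hypothetical helper, not anything the kernel exports.

```python
import struct

MBR_TYPE_NETBSD = 169      # 0xA9, the partition type named above
LABEL_SECTOR = 1           # "most use sector 1"; platform dependent


def label_sector(sector0):
    """Return the absolute sector expected to hold the disklabel."""
    if len(sector0) < 512:
        raise ValueError("need the full 512-byte first sector")
    if sector0[510:512] == b"\x55\xaa":            # MBR signature present
        for i in range(4):                          # four 16-byte entries
            entry = sector0[446 + 16 * i: 462 + 16 * i]
            ptype = entry[4]                        # partition type byte
            start = struct.unpack_from("<I", entry, 8)[0]  # starting LBA
            if ptype == MBR_TYPE_NETBSD:
                return start + LABEL_SECTOR         # relative to the partition
    return LABEL_SECTOR                             # no MBR: absolute sector


if __name__ == "__main__":
    # Synthetic sector with one NetBSD partition starting at LBA 2048.
    mbr = bytearray(512)
    mbr[510:512] = b"\x55\xaa"
    mbr[446 + 4] = MBR_TYPE_NETBSD
    struct.pack_into("<I", mbr, 446 + 8, 2048)
    print(label_sector(bytes(mbr)))
```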


But it isn't placed in any of these places unless/until the system is
DIRECTED to do so.  I.e., you can access a brand-new, never-before-powered-on
disk drive via the fictitious IN-KERNEL disklabel STRUCTURE.  But, need
never actually write it to the medium to allow such access (on a NetBSD box).

[I.e., write the entire medium -- except the label portion -- and mount it on
some foreign OS and they won't see ANY label in place!  But, the DATA that
you wrote will still be there!]


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-25 Thread Don NetBSD

On 9/25/2018 3:19 AM, David Brownlee wrote:

[attrs elided]


I have no idea whether this would actually map to your real
requirements, but a possible workflow could be:

Bringing up new appliance ("slot mapping")
- Assuming you have "ID" devices digitally and physically labelled 1..n.
- User is directed to insert as many ID devices as they have slots and
  switch on machine
- Appliance boots, detects it has devices attached, checks to see they
are ID devices, updates slots and records its slot mappings


I would just use N different (make/model) drives for that purpose and
examine dmesg on boot:  "OK, the 500G Seacrate is located in the
top left slot and that appears to have been probed as sd0.  The 320G WD
is in the slot to its right and that seems to have been probed as sd4.
etc."  As this is only done once, I can just grab any old drives and
stuff them into the machine, knowing their contents won't be altered
(unless I screw up).  Then, put them back once I've got
the slots marked.


Mmm finding and maintaining N different models of drives might be a


The point is NOT to be required to have "special disks" (e.g., your disks
with ID's written on the media).  Pick up N disks that differ in some way
from each other (size, manufacturer) and stuff them into slots.  Watch
the dmesg output as those are probed.  Return the disks to their original
homes.
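
Reading the probe results off dmesg can be scripted.  A minimal sketch,
assuming attach lines of the general shape shown in the comment (the
sample inquiry strings are invented, and the pattern would need checking
against real output from the box in question):

```python
import re

# Assumed shape of a SCSI disk attach line in dmesg, for example:
#   sd0 at scsibus0 target 0 lun 0: <SEAGATE, ST3500, 0001> disk fixed
_ATTACH = re.compile(
    r"^(sd\d+) at (scsibus\d+) target (\d+) lun (\d+): <([^>]*)>"
)


def probed_disks(dmesg_text):
    """Map each sdN unit to (bus, target, lun, inquiry string)."""
    disks = {}
    for line in dmesg_text.splitlines():
        m = _ATTACH.match(line)
        if m:
            unit, bus, target, lun, inquiry = m.groups()
            disks[unit] = (bus, int(target), int(lun), inquiry.strip())
    return disks


# Invented sample; make/model strings are placeholders.
SAMPLE = (
    "sd0 at scsibus0 target 0 lun 0: <SEAGATE, ST3500, 0001> disk fixed\n"
    "sd0: fabricating a geometry\n"
    "sd4 at scsibus1 target 2 lun 0: <WDC, WD3200, 0002> disk fixed\n"
)

if __name__ == "__main__":
    for unit, where in sorted(probed_disks(SAMPLE).items()):
        print(unit, where)
```

With one distinct drive per slot, the inquiry string tells you which
physical slot ended up as which sdN.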


I don't expect (nor want!) "them" to be able to bring up new boxes
unsupervised.  There are too many little details that could have
consequences.  E.g., any performance metrics reported for a drive
in appliance A might differ from (that same drive!) in appliance B.


Reasonable, but it's always nice to design what would be the full
robust system, and then decide what corners to cut :-p, plus from past
experience you invariably end up at some point needing to build a box
at the same time as your attention is split fielding something else
urgent.


The "right solution" is to use our existing product as the fixture!
- we already own the hardware design (and know how to troubleshoot it)
- can get replacement parts any time (what if the server shits the bed?)
- can build as many as we like (no hunting for identical/compatible servers)
- own all the sources (and know how they work!)
- don't have to worry about hot swap (just power the device down, remove
  the drive, insert another, power up -- near instantaneous boot)
- don't have to go "exploring" all of these issues

*This* approach is the result of someone with a superficial knowledge of
the issues spouting off to Management ears that were foolishly receptive
to the -- ahem -- "short cut".  I'll be able to prove that when I'm done.

My interest, now, lies in how I could exploit this approach for some
other organizations with which I'm affiliated (that DON'T have an existing
product line that they could repurpose for the task).


Normal use
- When a new sdX or wdX device is detected system determines its slot
mapping and uses it when talking to user
- If it can't determine slot mapping, it suggests a new slot mapping
pass (something strange has happened)

Optional extra credit ("Where is what slot")
- User is instructed to apply sticky number labels next to ID devices
when bringing up appliance


*I* would be that "user".  I imagine eventually having a "live (remote)
display" that  reports/summarizes the activities and status of each
drive slot.  Presently, that takes the form of a text display that
summarizes a single appliance on a single screen (curses).  That
could evolve into something graphical.


Usually a big fan of html in this case - can start by spitting out a
static html page with a table and 30 second meta refresh, and extend
to some simple javascript which refreshes within page...
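
A minimal sketch of that starting point: a static page with one table
and a meta refresh.  The column names and the status_page helper are
assumptions for illustration; the output would simply be written to a
file that any web server hands out.

```python
from html import escape


def status_page(rows, refresh_seconds=30):
    """Render (slot, device, status) rows as a self-refreshing HTML page."""
    body = "\n".join(
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            escape(str(slot)), escape(str(dev)), escape(str(status))
        )
        for slot, dev, status in rows
    )
    return (
        "<!DOCTYPE html>\n<html><head>\n"
        f'<meta http-equiv="refresh" content="{int(refresh_seconds)}">\n'
        "<title>Drive slots</title>\n</head><body>\n"
        '<table border="1">\n'
        "<tr><th>Slot</th><th>Device</th><th>Status</th></tr>\n"
        f"{body}\n</table>\n</body></html>\n"
    )


if __name__ == "__main__":
    # Placeholder rows; real values would come from the appliance.
    print(status_page([(1, "sd0", "idle"), (2, "sd4", "busy")]))
```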


With *no* experience in HTML, I've actually become VERY interested in
using it to make "platform independent" interfaces.  E.g., interfaces
that I could serve over a phone (via WiFi) without having to write
applications that run *in* the phone.

[Looking at designing a remote display for a pallet scale, presently, so
a forklift operator can just look at his phone to see how much the
pallet of goods weighs WITHOUT having to exit the vehicle and walk up
to the (small) display located indoors.]


  I do product design/development for a living, not "test fixture
design".


We all have to start somewhere :-p


I did my stint with production test several decades ago.  Considerably
more labor intensive, back then.  Hence my fear of letting this turn
into a "project" that THEY have to maintain.


So, I'm not too keen on embellishing this more than necessary
(and delaying the NEXT product's delivery!)


It sounds like you have all the right ideas - we're fascinated to hear
how it goes! :)


Presently, I'm more interested in what I can do for OTHER folks using
a similar approach (COTS hardware vs. something "owned" but proprietary).
But, let someone else pay me to learn what I want to learn...


Re: Simple way to securely access remote machine that's behind a NAT?

2018-09-25 Thread Don NetBSD

On 9/24/2018 6:04 PM, Michael Cheponis wrote:

I have a (linux raspberry pi) that's remotely located and NATted in such a
way that I cannot control that part of the infrastructure, although I do
have complete control of the machine otherwise (e.g. access to root).

What I'd like to do is access it from my local NetBSD system (which does
have a  'real' IPv4 address), something like this:

1) Start some 'daemon' on the remote linux box that attempts a connection
to ...
2) .. my local NetBSD box.

And

3) some local NetBSD program that allows me to get a shell prompt of the
remote linux machine here on my NetBSD machine via the connection set up in
(1) and (2).


I assume you can't be in two places at once -- i.e., you want to have the rPi
do  and SIT ("daemon"), possibly waiting *indefinitely*, for a
connection from the NetBSD box (which is "remote" to the rPi)?
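
For what it's worth, one common way to get steps (1) through (3) is a
reverse SSH tunnel; nothing in this thread says that is what was chosen,
so take it as one option.  The sketch only builds the command line the
rPi would run (-N and -R are standard ssh(1) options); the host name and
port numbers are placeholders.

```python
def reverse_tunnel_argv(server, remote_port=2222, local_port=22):
    """Build an ssh command that exposes this machine's sshd on `server`."""
    return [
        "ssh",
        "-N",                                           # no remote command
        "-R", f"{remote_port}:localhost:{local_port}",  # reverse forward
        server,
    ]


if __name__ == "__main__":
    # Placeholder account and host for the NetBSD box with the real address.
    print(" ".join(reverse_tunnel_argv("me@netbsd.example.org")))
```

While that runs on the rPi, "ssh -p 2222 localhost" on the NetBSD side
would land on the rPi's sshd.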


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-24 Thread Don NetBSD

On 9/24/2018 4:14 AM, David Brownlee wrote:

On Mon, 24 Sep 2018 at 11:08, Don NetBSD  wrote:


On 9/18/2018 3:54 AM, David Brownlee wrote:

Just some musing about handling drive mappings:

For sd devices you could use "scsictl sdX identify" to map back from
sdX to (scsibus, target, lun) numbers and then onto each drive's
physical location.


OK.  That would help me initially identify the "slots" in order to
hard-wire them in the kernel.  I.e., stuff every slot, boot, then
"identify" each disk (having made the contents of each disk unique
enough to map to the probed devices).

Presumably, once each slot is wired down, then it need not be
populated at boot -- yet the device will still exist for it when it
later "appears".


Yes, though if you can identify the slots for hardwiring into the
kernel you could also run the same process at runtime as run a GENERIC
kernel.

I have no idea whether this would actually map to your real
requirements, but a possible workflow could be:

Bringing up new appliance ("slot mapping")
- Assuming you have "ID" devices digitally and physically labelled 1..n.
- User is directed to insert as many ID devices as they have slots
and switch on machine
- Appliance boots, detects it has devices attached, checks to see they
are ID devices, updates slots and records its slot mappings


I would just use N different (make/model) drives for that purpose and
examine dmesg on boot:  "OK, the 500G Seagate is located in the
top left slot and that appears to have been probed as sd0.  The 320G WD
is in the slot to its right and that seems to have been probed as sd4.
etc."  As this is only done once, I can just grab any old drives and
stuff them into the machine, knowing their contents won't be altered
(unless I screw up).  Then, put them back  once I've got
the slots marked.
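
A rough sketch of that one-time "which drive was probed as which unit" pass, assuming the usual NetBSD attach-line shape ("sdN at scsibusM target T lun L: <vendor, product, revision> ..."). The sample text is made up for illustration and was not captured from a real machine; the function name is hypothetical.

```python
import re

# Matches attach lines such as:
#   sd0 at scsibus0 target 0 lun 0: <ATA, ST3500418AS, CC38> disk fixed
ATTACH_RE = re.compile(
    r"^(?P<unit>sd\d+) at (?P<bus>scsibus\d+) "
    r"target (?P<target>\d+) lun (?P<lun>\d+): <(?P<ident>[^>]*)>"
)

def probed_drives(dmesg_text):
    """Return {unit: (bus, target, lun, identity)} for each sd attach line."""
    drives = {}
    for line in dmesg_text.splitlines():
        m = ATTACH_RE.match(line.strip())
        if m:
            drives[m["unit"]] = (
                m["bus"], int(m["target"]), int(m["lun"]), m["ident"]
            )
    return drives

# Hypothetical sample, not real output.
SAMPLE = """\
sd0 at scsibus0 target 0 lun 0: <ATA, ST3500418AS, CC38> disk fixed
sd0: 465 GB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 976773168 sectors
sd4 at scsibus0 target 4 lun 0: <ATA, WDC WD3200AAKS-0, 01.0> disk fixed
"""

for unit, info in sorted(probed_drives(SAMPLE).items()):
    print(unit, info)
```

With N visibly different drives in the cage, the identity string is what ties a physical slot to the (scsibus, target, lun) triple that later gets wired down.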

I am expecting this to bear some logical relationship to how the
manufacturer designed the "drive cage" (the one server that I've
examined so far has them laid out in the order a casual observer
would expect -- no surprises, there).

I don't expect (nor want!) "them" to be able to bring up new boxes
unsupervised.  There are too many little details that could have
consequences.  E.g., any performance metrics reported for a drive
in appliance A might differ from (that same drive!) in appliance B.


Normal use
- When a new sdX or wdX device is detected system determines its slot
mapping and uses it when talking to user
- If it can't determine slot mapping, it suggests a new slot mapping
pass (something strange has happened)

Optional extra credit ("Where is what slot")
- User is instructed to apply sticky number labels next to ID devices
when bringing up appliance


*I* would be that "user".  I imagine eventually having a "live (remote)
display" that  reports/summarizes the activities and status of each
drive slot.  Presently, that takes the form of a text display that
summarizes a single appliance on a single screen (curses).  That
could evolve into something graphical.


Optional extra credit ("Where is what slot and sticky labels fall off")
- User directed to take photo of appliance with ID devices to record
where the slots were & upload to web server on appliance
- If user is confused on slot mapping web server on appliance can show
mapping picture

Optional extra credit ("Users mess with hardware/swap disks to other machines")
- At boot time system takes a copy of dmesg and notes the available
atabus/scsibus and device names
- If this ever changes it forces a new slot mapping pass


  I do product design/development for a living, not "test fixture
design".  So, I'm not too keen on embellishing this more than necessary
(and delaying the NEXT product's delivery!)


Re: BSD disklabel partition letters in NetBSD

2018-09-24 Thread Don NetBSD

On 9/24/2018 12:30 PM, Rocky Hotas wrote:

Now, of course, this all became moot on most modern machines with the move
to GPT, and this is good. As you probably know, we use full devices for
these (dk* and rdk*), so no limiting alphabet nor reserved letters.


Despite vaguely knowing that GPT is replacing MBR and similar, I never
installed NetBSD with GPT, so I didn't know this.


Also consider what OTHER systems might want to have a peek at that
disk.  Make sure your partitioning/GPT/MBR choices are compatible
there, as well!

[I physically move disks around a lot and want to be sure each machine
can access the entire contents of the drive.  If the drive is in an
external enclosure -- SCSI, FW, USB -- then doubly so!]


Re: BSD disklabel partition letters in NetBSD

2018-09-24 Thread Don NetBSD

On 9/24/2018 12:34 PM, Rocky Hotas wrote:

I like to mount /var on 'e', /usr on 'f', /usr/pkg on 'g' (picking up on
the g in pkG as a mnemonic), /home on 'h', /Sources on 'i', /Playpen on 'j'
and /Archive on 'k' (the hard ch as a mnemonic for the k) with /Leftovers
for 'l'.


This is a very clever way to remember the partitions, and also to make them
uniform across several different disks.


I don't want to have to keep notes as to how each machine is (was!)
configured.  So, it's easier for me to just standardize on an approach
and commit THAT to memory.

By putting the "extra" partitions up high, it lets me adapt the same
general layout to systems that only support 8 partitions (/home being
the last -- 'h' -- partition, in those cases).

Having a separate partition for /var helps if some process goes wonky
and starts to fill the /var "directory" (in the case when /var is part
of the / partition).  I can get to single user (which almost always
means the offending process is not running) and work with a / partition
that isn't overfull with all that /var cruft (which has been preserved
on the /var *partition*, if I want to examine it).

I don't know if any architectures (Sun?) still have requirements
as to where the / partition must reside "physically", on the medium.
I always lay the partitions out consecutively and contiguously
just cuz it makes the arithmetic more straightforward.

[There are (?) also some issues wrt sector alignment if you're using
drives with 4K sectors -- though I think those are only performance
related (??)]
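
The "consecutive and contiguous" arithmetic can be sketched as below. This is only an illustration: the function name is hypothetical, sizes are in sectors, and the align parameter is an assumption about how one might round partition starts for 4K-sector drives (8 when the label counts 512-byte sectors).

```python
def layout(sizes, start=0, align=1):
    """Lay partitions out back to back; offsets and sizes are in sectors.

    sizes: list of (letter, size_in_sectors)
    align: round each partition start up to a multiple of this many sectors
    Returns a list of (letter, offset, size).
    """
    result = []
    offset = start
    for letter, size in sizes:
        offset = -(-offset // align) * align  # round up to the alignment
        result.append((letter, offset, size))
        offset += size                        # next one starts where this ends
    return result

# Unaligned: each offset is simply the running sum of the previous sizes.
print(layout([("a", 1000), ("b", 2000), ("e", 500)]))

# Starting at sector 63 with 8-sector alignment pushes 'a' up to sector 64.
print(layout([("a", 1000), ("b", 2001), ("e", 5)], start=63, align=8))
```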


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-24 Thread Don NetBSD

On 9/18/2018 3:54 AM, David Brownlee wrote:

Just some musing about handling drive mappings:

For sd devices you could use "scsictl sdX identify" to map back from
sdX to (scsibus, target, lun) numbers and then onto each drive's
physical location.


OK.  That would help me initially identify the "slots" in order to
hard-wire them in the kernel.  I.e., stuff every slot, boot, then
"identify" each disk (having made the contents of each disk unique
enough to map to the probed devices).

Presumably, once each slot is wired down, then it need not be
populated at boot -- yet the device will still exist for it when it
later "appears".


The drives would need to be labelled via GPT and the software set to
mount via named slices for referencing data on each drive.
It would even mean someone could pull a set of drives from one machine
to another and as long as they get the boot drive right the order of
the others is irrelevant.


There's no "other (NetBSD) machine", here.  The drives "go their separate ways"
once I've finished with them.  Think of it as a "PROM programmer" that can
handle multiple devices at the same time.  The disks are the equivalent of
the PROMs.  "Program" them and then remove them from the fixture and use
them 


For indicating which drive to pull one thought would be to quiesce all
other drives then pulse activity on the drive to pull for 5 seconds.


Quiescing the other drives would be unfortunate.  The goal is to be able
to have a more-or-less continuous process whereby drives are inserted,
processed and extracted regardless of the state/activity of the other
drives in the appliance.

An "operator" will power up the appliance and insert the drives that
need to be "processed" (they will likely NOT be the same nor require the
same processing).  He will "Start" a particular slot and then move on
to some other activity.  When informed that a slot is finished (successfully),
he will "Eject" that disk, slap a label onto it that has been printed
for it (by the same appliance), place the disk in a "completed" pile and
insert another disk in the now vacant slot.

Lather, rinse, repeat.

When the last disk is finished, the appliance will power itself down
(having logged the results of each disk in the event the power-down
occurs after the close of business).  The process will repeat on the
next day/shift.
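
The per-slot workflow above might be modelled as a small state machine, one per slot, so that no slot ever waits on another. This is a sketch only; the state and event names are invented here, and the power-down rule (nothing loaded or running) is an assumption about what "the last disk is finished" means.

```python
from enum import Enum

class SlotState(Enum):
    EMPTY = "empty"
    LOADED = "loaded"      # disk inserted, waiting for "Start"
    RUNNING = "running"    # processing under way
    FINISHED = "finished"  # done, waiting for "Eject"

# Allowed operator/appliance events for each state.
TRANSITIONS = {
    (SlotState.EMPTY, "insert"): SlotState.LOADED,
    (SlotState.LOADED, "start"): SlotState.RUNNING,
    (SlotState.RUNNING, "done"): SlotState.FINISHED,
    (SlotState.FINISHED, "eject"): SlotState.EMPTY,
}

class Appliance:
    def __init__(self, n_slots):
        self.slots = [SlotState.EMPTY] * n_slots

    def event(self, slot, name):
        """Apply one event to one slot; other slots are never touched."""
        key = (self.slots[slot], name)
        if key not in TRANSITIONS:
            raise ValueError(
                f"slot {slot}: '{name}' not valid in state {self.slots[slot].name}"
            )
        self.slots[slot] = TRANSITIONS[key]

    def may_power_down(self):
        """True when no slot still has work pending (loaded or running)."""
        busy = (SlotState.LOADED, SlotState.RUNNING)
        return not any(s in busy for s in self.slots)
```

Rejecting an "eject" on a running slot is the software half of the "wait for the green light" rule; it cannot stop someone from physically pulling the drive.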


Re: BSD disklabel partition letters in NetBSD

2018-09-22 Thread Don NetBSD

On 9/22/2018 2:24 AM, Rocky Hotas wrote:

As regards NetBSD: this use of ‘a’ and ‘b’ is mandatory? Or is it
possible to arbitrarily change the letter assignments? (E.g. partition
/home to ‘a’ and root partition and swap to ‘e’, ‘f’, ‘g’ ...)

Any suggestion/information about this would be very useful.
Thank you anyway,


With the exception of the "whole disk" partition, the others can be used
as you wish.

However, I've tried to be consistent with my choices.

As 'c' is "whole disk" on only SOME ports -- with 'd' being the whole disk
on others -- I don't ever use these for anything but those roles:  consider
mounting a disk that started out in life on one architecture (port) and now
you want to mount it on another!

I use a small 'a' partition as "/" and 'b' as swap.

I like to mount /var on 'e', /usr on 'f', /usr/pkg on 'g' (picking up on
the g in pkG as a mnemonic), /home on 'h', /Sources on 'i', /Playpen on 'j'
and /Archive on 'k' (the hard ch as a mnemonic for the k) with /Leftovers
for 'l'.

But, that's because most of my boxes are development systems so the ijkl
are considerably different from machine to machine (whereas aefgh tend to
be very similar in content).
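
Written down as data, the scheme looks roughly like this. The helper name is hypothetical; the truncation reflects the point made elsewhere in this thread that some systems only support 8 partitions, in which case 'h' (/home) is the last one.

```python
# Letter -> role, as described above.  'c' and 'd' are left out on
# purpose: one or the other is the whole-disk partition, depending
# on the port.
LAYOUT = {
    "a": "/",
    "b": "swap",
    "e": "/var",
    "f": "/usr",
    "g": "/usr/pkg",
    "h": "/home",
    "i": "/Sources",
    "j": "/Playpen",
    "k": "/Archive",
    "l": "/Leftovers",
}

def layout_for(max_partitions):
    """Keep only the letters that exist with this many partitions."""
    letters = "abcdefghijklmnop"[:max_partitions]
    return {k: v for k, v in LAYOUT.items() if k in letters}

print(layout_for(8))   # stops at 'h'
print(layout_for(16))  # the full a..l scheme
```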

Just pick something that you can easily remember.  There WILL be a time
when your box won't go to multiuser and you'll be poking around fstab(5)
with ed(1) instead of vi(1).  It helps to know what to expect, there!


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-16 Thread Don NetBSD

On 9/16/2018 2:27 AM, Michael van Elst wrote:

netbsd-embed...@gmx.com (Don NetBSD) writes:


Ah!  So, the sd(4) driver won't pass "non-scsi" commands and
the sd(4) devices might not accept scsi commands.  (damned if you
do, damned if you don't)?


The sd driver just passes some byte sequence, some are intercepted by
mfi, some aren't. The mfi firmware might do more before it actually
reaches the disk.


The 2950 uses an mpt(4) -- though I suspect your points apply equally.


I'd have to make sure I numbered the targets on each scsibus as well.


Yes, you need to wire scsibus (if there could be more than one) and
you need to wire sd.


So, my kernel config should contain N+1 sd entries (the N+1th for a
wildcarded "sd?") each with specific unit numbers.  Ditto scsibus(4)'s
as well as the controller (mpt) to which they attach.  In this way, I
should be able to rely on the /dev mappings to specific hardware (even
in the absence of said hardware or portions thereof).
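
As a sketch of what those N+1 entries might look like, the snippet below just generates the config lines. The line syntax follows my reading of scsi(4) and should be checked against the man page and config(5); the target numbers 0..N-1 and lun 0 are assumptions that have to be confirmed on the actual hardware, and the function name is made up.

```python
def wired_config(controller, bus, n_slots, first_unit=0):
    """Return kernel config lines pinning sd units to targets on one bus.

    Assumes targets 0..n_slots-1 and lun 0; real target numbering
    depends on the controller and must be observed empirically.
    """
    lines = [f"{bus} at {controller}"]
    for i in range(n_slots):
        lines.append(f"sd{first_unit + i} at {bus} target {i} lun 0")
    # The "N+1th" entry: a wildcard for anything not wired above.
    lines.append("sd* at scsibus? target ? lun ?")
    return lines

print("\n".join(wired_config("mpt0", "scsibus0", 4)))
```

Note that the wildcard unit is spelled "sd*" in a config file; the "?" wildcards apply to the locators (bus, target, lun).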


Yes, that's what I'm trying to guard against.  I can almost guarantee
that someone will get the "bright idea" that they can hack together a
second appliance -- using DIFFERENT hardware (a computer is a computer,
right?) and just cloning the system disk.  Then, complain when it
doesn't work as expected.


You need to wire down scsibus to a specific controller. I.e. adding
another controller or replacing it with a different model makes your
kernel config void.


That's acceptable -- it's an *appliance*, not a "general purpose computer".
I just need to ensure that the software either knows that the configuration
has been changed (and can complain about it) *or* lock the system (hardware
and software) so it *can't* be changed.
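
One possible way to "know that the configuration has been changed" is to fingerprint the controller and bus attach lines at commissioning time and compare on every boot. This is only a sketch: the device-name prefixes and the function name are assumptions, and drive (sd) lines are deliberately excluded so that drives coming and going do not count as a change.

```python
import hashlib

def config_fingerprint(dmesg_text, prefixes=("mpt", "scsibus")):
    """Hash the attach lines for the wired-down controllers and buses.

    Only lines of the form "<dev> at <parent> ..." whose device name
    starts with one of the prefixes are included.
    """
    keep = set()
    for line in dmesg_text.splitlines():
        words = line.split()
        if len(words) >= 3 and words[1] == "at" and words[0].startswith(prefixes):
            keep.add(" ".join(words))
    return hashlib.sha256("\n".join(sorted(keep)).encode()).hexdigest()
```

The appliance would store the value once and complain (or refuse to run) if a later boot produces a different one.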

(sigh)  This would be *so* much easier if I just pulled product off the Line
and tweaked the firmware!  "Want another?  Go get one (and *I* know it will
be identical to the last!)"


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-16 Thread Don NetBSD

On 9/16/2018 12:21 AM, Michael van Elst wrote:

netbsd-embed...@gmx.com (Don NetBSD) writes:


But can't I walk back up the device tree and find the number of leaves
on a particular (physical) controller?


You could find out from the config file how many disks you have wired.
Still unrelated to real hardware.


That gives me a number for the "MAX_DRIVES_SUPPORTED" manifest constant
in my code (I have to put SOME limit on how many devices can be handled).
Said a more practical way, I can make sure I build a kernel that handles
at least as many devices as my code will handle (silly for my code to
handle more than the kernel supports)


SATA would be wd(4), not sd(4).

SATA on a SAS controller appear as sd(4) devices on scsibus's.
Not sure I have the option to attach them to atabus, instead
(nor why I would want to do that)


If the driver presents the disks as sd(4) devices, you get some
virtual unit that just happens to be usable to access the
physical one.

For example, the mfi(4) driver would present the disks to you as
sd(4). However, such an sd device would emulate the basic read/write
commands. It also passes through other commands, but your SATA devices
would have problems to understand SCSI commands.


Ah!  So, the sd(4) driver won't pass "non-scsi" commands and
the sd(4) devices might not accept scsi commands.  (damned if you
do, damned if you don't)?


But, I would have to rely on empirical observation to know which device
is which?


Yes. Fortunately we still enumerate devices serially, so the numbers don't
change.


I'd have to make sure I numbered the targets on each scsibus as well.
And, size each to handle the largest shelf that might be attached to it.
So, if I have a 15 drive shelf today, I'd number those slots 1-15.
The second shelf, 16-32.  Etc.

If, thereafter, the first shelf was replaced with an 8 drive shelf, then
devices 9-15 would just disappear -- 16 would continue to be the first
device on the second shelf.
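
The numbering idea, sketched with a fixed stride per shelf. The stride of 15 and both function names are assumptions for illustration; the point is only that a shelf's first slot number depends on its position, never on how many bays its neighbours happen to have.

```python
def slot_number(shelf, bay, stride=15):
    """Global slot number for a 1-based bay in a 0-based shelf.

    stride is the range reserved per shelf, i.e. the largest shelf
    that might ever be attached at that position.
    """
    if not 1 <= bay <= stride:
        raise ValueError("bay outside the range reserved for a shelf")
    return shelf * stride + bay

def present_slots(shelf_sizes, stride=15):
    """Slot numbers that actually exist for the installed shelf sizes."""
    return [slot_number(s, b, stride)
            for s, size in enumerate(shelf_sizes)
            for b in range(1, size + 1)]

# Swap the first 15-bay shelf for an 8-bay one: 9-15 vanish,
# but the second shelf still starts at 16.
print(present_slots([8, 15]))
```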


The kernel configuration of course will be specific to your machine then.
If you replace hardware you might need a new configuration.


Yes, that's what I'm trying to guard against.  I can almost guarantee
that someone will get the "bright idea" that they can hack together a
second appliance -- using DIFFERENT hardware (a computer is a computer,
right?) and just cloning the system disk.  Then, complain when it
doesn't work as expected.

The only ways I can think of to hard-wire the software (and kernel config)
to the machine is to examine the MAC(s) in the machine and compare against
hard-coded values.  Or, PXE serve *a* kernel based on the MAC of the
client requesting it.  (this latter lets me painlessly address future
needs assuming the MACs are immutable)
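
The first option (compare against hard-coded MACs) reduces to something like the sketch below. The address shown is a placeholder, the names are hypothetical, and how the observed addresses are collected (ifconfig output, a sysctl, etc.) is left out on purpose.

```python
def normalize_mac(mac):
    """Lower-case a MAC address and strip ':' or '-' separators."""
    digits = mac.lower().replace(":", "").replace("-", "")
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits

# Hypothetical value; a real build would list the appliance's own NIC(s).
ALLOWED_MACS = {normalize_mac("00:11:22:33:44:55")}

def is_blessed_machine(observed_macs, allowed=ALLOWED_MACS):
    """True if at least one observed interface address is on the list."""
    return any(normalize_mac(m) in allowed for m in observed_macs)
```

This only guards against casual cloning; anyone who can change the MAC (or the allowed list) can defeat it, which is part of why the PXE route may be the more flexible of the two.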


How can I configure a kernel to support a very large number of
(wired down) drives even if the hardware to support those drives
isn't present (I'm thinking about the case of having a couple
of disk shelfs which may/may not be present at any given time)?


Disk shelfs are irrelevant, controllers, channels, target and
lun ids are. The scsi and ata manpages give some examples about
possible kernel configurations to wire down disks.



The shelfs are relevant because they can be "removed"  in much
the same way that a drive can be removed.


Simple passive shelfs aren't even visible. It's like the disks in that
shelf are dead if the shelf is removed. If you wire down scsibus to
specific controllers/ports and sd devices to specific scsibusses and
target/lun ids, nothing will change when a shelf is removed.


The key, there, is to wire down the scsibus and ensure the SAS cables
aren't swapped/misplugged.  I'm obviously trying to avoid all the
potential screwups that can happen after I release the fixture to
Manufacturing.

OK, I guess I'll drag out a few shelfs and start poking at them
to see what they reveal.  Or, maybe wait until I have access to
the "real" kit so I don't make assumptions based on MY hardware
that prove to be incompatible with the boss's stuff.

Thanks!


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-15 Thread Don NetBSD

On 9/15/2018 11:27 PM, Michael van Elst wrote:

netbsd-embed...@gmx.com (Don NetBSD) writes:


How can I determine the number of /potential/ disk devices (sd(4))
that a system MIGHT support -- *if* the drives had been installed
prior to boot?


That would be difficult. sd(4) is used for several different kinds
of disks, including virtual ones and SCSI is a bus. You MIGHT
install hundreds of sd devices but that's unrelated to e.g. how
many drive slots the machine has.


But can't I walk back up the device tree and find the number of leaves
on a particular (physical) controller?  Note that I control the
kernel configuration so devices can't exist where i don't expect
them...  (e.g., remove the USB devices from the configuration
file and the possibility of a mass device being inserted goes away).


E.g., if I have a 15 slot backplane but only have
a drive installed in slot 13, then *that* appears as sd0 and there
is no mention of the potential for the other 14 drives.


A backplane might support a ses(4) enclosure device that could
be queried.


The backplane on the machine I'm currently using has no ses device
probed.  Yet, the kernel seems to know that there are 4 drives installed
(*IF* they are present when the machine is booted!)


A driver for a multiport controller usually knows how many ports
are available. But that's not exposed, and in case of a bus
topology, you still wouldn't know what is possible.


Presumably, I can wire down each sd(4) device to correspond to a
particular "slot" (SATA port) in the machine when I build a kernel
with that in mind.


SATA would be wd(4), not sd(4).


SATA on a SAS controller appear as sd(4) devices on scsibus's.
Not sure I have the option to attach them to atabus, instead
(nor why I would want to do that)


[This allows software to KNOW that sd0 is "the drive in the top
left slot" even if there is no drive present there when the machine
boots]


You could create a custom kernel that wires drive units to specific
locations. You may also need to wire the 'scsibus' instances.


But, I would have to rely on empirical observation to know which device
is which?


For SATA that would be wd devices and atabus instances.

USB might be an issue. You may need to remove the umass driver so
that no SCSI or ATA instances can attach.


Exactly.  The issue then becomes one of ensuring a particular slot/bay
in a particular shelf maps to a particular /dev/sd*.  I'd have to
label the slots, label the shelfs, label the SAS cables (and which
connectors they attach to).  But, so long as no one swaps cables,
things should stay as intended.

If a shelf is not powered on at boot, then I'd have to wire down the
associated controller/scsibus and make provisions to reprobe it when
it comes online.


How can I configure a kernel to support a very large number of
(wired down) drives even if the hardware to support those drives
isn't present (I'm thinking about the case of having a couple
of disk shelfs which may/may not be present at any given time)?


Disk shelfs are irrelevant, controllers, channels, target and
lun ids are. The scsi and ata manpages give some examples about
possible kernel configurations to wire down disks.


The shelfs are relevant because they can be "removed"  in much
the same way that a drive can be removed.  Not planning for that
possibility means an operator may be receiving directions regarding
"remove drive 5" when, in fact, the shelf in which "5" is installed
happens to have labeled it as "20".

Man pages indicate syntax for wiring down but give no guidance as
to how (other than empirical) to figure out which is which.  E.g.,
for PATA, you knew MASTER vs. SLAVE.  Does driver probe the controller
in the order the motherboard manufacturer has defined "SATA connections"?
(and, how might that relate to a backplane where there is nothing
besides OCD to force the slots to appear in a particular order)


Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-15 Thread Don NetBSD

How can I determine the number of /potential/ disk devices (sd(4))
that a system MIGHT support -- *if* the drives had been installed
prior to boot?  E.g., if I have a 15 slot backplane but only have
a drive installed in slot 13, then *that* appears as sd0 and there
is no mention of the potential for the other 14 drives.

Presumably, I can wire down each sd(4) device to correspond to a
particular "slot" (SATA port) in the machine when I build a kernel
with that in mind.

[This allows software to KNOW that sd0 is "the drive in the top
left slot" even if there is no drive present there when the machine
boots]

How can I configure a kernel to support a very large number of
(wired down) drives even if the hardware to support those drives
isn't present (I'm thinking about the case of having a couple
of disk shelfs which may/may not be present at any given time)?

Or, where does the probe actually take place (module name)
and I'll dig through the sources myself...


Re: disk geometry (i386/amd64)

2018-09-15 Thread Don NetBSD

On 9/13/2018 11:35 PM, Michael van Elst wrote:

netbsd-embed...@gmx.com (Don NetBSD) writes:


| > SMART [...] so it is clearly possible.
|
| I think only via wd(4)?

Oh, you mean, not sd(4) - yes, possibly.   Sorry, I have no idea how
one would access that kind of data over scsi.



I will have to keep poking through the manual pages.


SMART is something that only exists for ATA but modern SCSI devices may
provide similar information on some mode pages.


Yes, that's how our firmware gets identifying information for the drives.

And, how it tracks the GDT "in the field" to determine the relative health
of the drive (and, hence, the product).


Our atactl tool can query SMART status, but maybe not everything you
want to know. For SCSI there is nothing, you need to make your own
tool to issue the proper commands with SCIOCCOMMAND.

It's possible that the smartmontools in pkgsrc help.


Thanks, I'll look at that.

So, there's no equivalent of Linux's hdparm or lshw that brings all
of this together "under one roof"?


Devices attached to RAID controllers or something behind USB adapters
can be more of a problem. Passing through low-level commands is usually
not implemented.


I'll replace any RAID controllers with SAS controllers -- if, for no other
reason, than to get the RAID BIOS out of the way.  The goal is to run these
headless so can't afford for the BIOS to complain about something (and
not have an easy way to SEE the complaint)


Re: disk geometry (i386/amd64)

2018-09-13 Thread Don NetBSD

On 9/11/2018 11:16 AM, Mike Pumford wrote:

[attrs elided]

I've done a lot of work with SAS disk enclosures that support SES. They 
often have an SES command that can turn off the drive in a bay prior to 
removal (but support is optional).


Aren't standards *wonderful*?  What value to "optional" if folks can elect to
NOT implement??  


All SES control requests are optional as some simple enclosures can only report 
basic status. So it genuinely has to be optional rather than a mandatory bit 
of the standard everyone ignores. For power control the enclosure either has to 
have dedicated drive power control hardware or initiate the stop unit commands 
for itself which would have been challenging up until the SAS era. The 
enclosures I worked on did support SES drive power :).


If I *have* to rely on this functionality (instead of just commanding the drive
to power down), then I'll have to do something to ensure the software doesn't
run on "just any old" server (and, sooner or later, someone will think about
using some desktop machine, etc.).  Maybe hardcode a test for the MAC in the
software?  Or, just serve everything up via PXE and control which machine
gets which software from there!


My read of the SATA spec indicated that hot plugging was part of the SATA
standard (and, by extension, SAS).  But, that support for it on HBA's was
(ahem) "optional".  In particular, support for cold presence detect (but, I'm
relying on the operator to perform that function as HE will be the person
doing the plugging/unplugging!)


Anything AHCI will detect the removal. Not sure where netbsd is on 
removing/adding disk devices now. Last time I tested it with SATA ahci was 7.1 
and I couldn't force a freshly inserted disk to create a wd device but this 
could be down to lack of knowledge on my part. AHCI did report the drive link 
rate correctly when I plugged it in. drvctl looks like it ought to be able to 
do the removal insertion but I'd need help from someone on the lists to advise 
on how to make it work with disks ;).


I can impose some minimum constraints on the operator(s) who will be using
the fixture.  E.g., tell them to click on "spin down" before removing the
drive and "wait for the green light" before extracting.

[This last bit will be harder to count on as folks will get away with removing
the drive prematurely, once, and then infer that they can get away with it
ALWAYS -- as that helps them get more done in less time (so they can spend the
freed time playing with their telephone or...)  These aren't "IT" guys so
trying to impress upon them the *need* to do X, Y or Z is tough]


But, some bright-eyed dweeb uttered "Why not use an old server and write
something in Linux to do the job?!" -- because, to said dweeb, Linux is the
panacea for all problems technical (ignorance must be a wonderful state of
mind -- everything is "easy-peasy"!)


Linux does have hotplug disk support (at least with SAS HBAs). Haven't tried 
AHCI as all my AHCI systems are NetBSD. ;)


Ditto on the NetBSD point.  I'd rather stay out of the Linux swamp (at least in
my home lab).

I pulled a small SOHO server (PE840) out and put 7.1 on it so I can test on
that platform -- hopefully a bit closer to the servers at work.  It'll support
four drive sleds so I can try to get a feel for how/if all of this will work
without lugging the other server home (I hate rack-mounted servers around the
house).  Also give me a feel for how many drives I can process in parallel
and whether or not I can keep the operator "continuously busy".


Re: disk geometry (i386/amd64)

2018-09-13 Thread Don NetBSD

On 9/11/2018 6:28 AM, Robert Elz wrote:

 Date:        Tue, 11 Sep 2018 00:19:57 -0700
 From:        Don NetBSD
 Message-ID:  <3cedac34-90d8-78ff-b320-de2c5ac8c...@gmx.com>

   | [should I be "reply all" or just the list?  I guess a matter of personal
   | preferences?]

It usually makes little difference - it certainly makes no difference to me.
People who get annoyed by receiving 2 copies, usually ask to be
excluded.


Then I'll err on the side of just replying to the list.


| > The raw partition allows this.
| Again, as long as nothing else tinkers with the in-kernel copy of the disklabel
| before I look at it.

No, regardless of that.


   | Please reread the initial exchange (reproduced below, for convenience):

I know what you mean, you're just missing the point.   From the raw
partition you can discover everything that you need.  The label is
irrelevant for your use, ignore it.   You can use the new ioctl if it
works to get info from the drive, if not, you just do it the hard way.


You're ignoring:  the fact that the firmware for our existing products already
gets the device information directly from the device; the dweeb's suggested
Linux implementation would make all of this visible via hdparm or lshw; and
the politics involved in most design decisions.

   "Why all of this code when our products do it in one line of code?
   Is NetBSD *that* brain damaged??  Maybe we *should* have gone the
   Linux route... (?)"

[Silly to argue technical issues with managers that haven't written a
line of code in decades -- yet, strangely, feel AS IF they still
understand all the issues!]


   | Again, as long as nothing else tinkers with the in-kernel copy of the disklabel
   | before I look at it.

Just don't look at it at all.


The question was whether DIOCGDISKINFO looked at *something else* (as it
doesn't seem to be bothered by changes to the in-kernel label) and would
avoid the in-kernel disk label "risk" completely.


However, your "as long as nothing else tinkers" seems a peculiar worry
given the application you seem to have in mind.   What "something else"
could possibly be doing the tinkering in the environment you are describing?


If the approach "works", one can almost bet that it will be embellished to
perform other duties.  And, that the folks doing that embellishment are likely
not going to be "me" nor think that they should ask me if there are any
hidden gotchas to avoid.

E.g., the products contain code to "heal" corrupted disk images (to a certain
extent).  One easy way to *test* this code is to construct disk images that are
known to be corrupt.  Install those images on physical media (hey!  Maybe using
this very same appliance!!).  Then, install the media in "product" and let it
"fix" the image.  Verify it has done so, properly, by examining the NEW image
(hey!  Maybe using this very same appliance!)

Or, the shake-and-bake guys might want to use it to stress test the drive
components.  (Have they failed?  Have the number of remapped sectors increased?
Has seek time suffered?)

Or, ...

Relying on some set of cryptic rules like "don't alter the in-kernel disk label
lest something SILENTLY misbehave" is a recipe for disaster.  Especially when
the appliance is seen as the authority for the "disk imaging process" (i.e.,
this first application)


   | (which appears to be DIOCGDISKINFO, but not DIOCGDINFO)

Probably, yes.   If that (DIOCGDISKINFO) works, great, if not,
it is all still possible.


Yes -- discard the servers, grab existing product off the Line, modify
the firmware to use the information that *it* gathered from the attached
disk and then reproduce as many of these "test fixtures" as you need to
process N drives in parallel.


   | Ah, OK.  So, if I verify this for the sd(4) driver on a particular OS
   | version/port, then I need never concern myself that some particular *drive*
   | may fail to yield valid data?  I.e., if the ioctl fails, I can panic()?

You could, but that would not be my recommended action.   Certainly
drives have made their size available to the driver ever since we started
getting intelligent drives, I very much doubt you could find one still working
which did not support that - anywhere.   But is dealing with that case
so hard, compared with all else you are doing, that a panic is acceptable?
(Even given you just mean an application panic, as in "discard the drive
in slot 23" and not "crash the kernel")


My point was it should be a "can't happen" -- that really CAN'T happen.
E.g., when our products power up and bring the drive on-line, if it fails
to respond to all queries/actions as expected, we throw an error and the
product doesn't work.  There's no "user remedy" other than "return for

Re: disk geometry (i386/amd64)

2018-09-11 Thread Don NetBSD

On 9/10/2018 11:03 PM, Michael van Elst wrote:

On Mon, Sep 10, 2018 at 05:37:53PM -0700, Don NetBSD wrote:


So, I can use the 'd' partition to access the medium (after unlocking the label
portion).  But, I can't count on anything else "displayed" by disklabel.  And,
I can't count on even the displayed values for the d partition if something
tinkers with the in-kernel label before I get a peek at it.


The disklabel might be the only thing that you can count on.


The raw partition allows this.


Again, as long as nothing else tinkers with the in-kernel copy of the disklabel
before I look at it.


No. As I said, the disklabel is ignored for the raw partition.


Yes, but if I *examine* the label, there is nothing that "ignored by the raw
partition" can do to give me valid parameter values.  I.e., the rest of the
label BESIDES the 'd' partition.


I had planned on DIOCGDINFO -- with all the caveats mentioned
above.  Perhaps DIOCGDISKINFO would be a better choice?


Old (but not too old) code needed to:

- DIOCGDISKINFO (using proplib!)
- fall back to DIOCGDINFO if that failed


Why would it have failed?


The driver might not support that request.


OK.


DISKINFO appears to not be "confused" by changes to the in-kernel label

DIOCGDISKINFO is what 'drvctl -p' will show you and gets data from the driver.


Yes.


I've not yet tried this with the DINFO ioctl.

DIOCGDINFO gets you a copy of the in-kernel disklabel.


Which is what disklabel(8) reports!


DIOCGSECTORSIZE gets you the size of a disk sector from the driver.
DIOCGMEDIASIZE gets you the size of the medium in bytes from the driver.

these were added for NetBSD-8 but also pulled up to NetBSD-7.1 (not
NetBSD-7.0) for compatibility with FreeBSD and to simplify programs
to get these values.


Hmmm... my bad.  The machine I built at work runs 7.1 -- but the box I have
been using here, at home, to explore this stuff is still at 6.1.5.  I will
have to move my code over to the box at work.


There's no confusion as no one/nothing looks at the disk besides my software.


The OS will look at the disk as soon as it is attached.


Only to probe it.  The drive is never mounted.  Never labeled.  Never
"anything" -- other than the actions I will be taking (or, other staff
operating under similar constraints).


It is probed and scanned for things like the disklabel or other partitioning
information and the wedge driver attaches to allow access by filesystems
as the disklabel does not support large disks.


But all that will do is generate diagnostic messages, not interfere with my
use of the physical device via the 'd' partition.  I don't really care if my
actions leave the medium in a state that NetBSD can't understand (as a NORMAL
drive) as it will never be asked to "understand" it.

OK.  So:
- manually insert drive
- rescan the scsibus
- start the device in question
- DIOCGDISKINFO to get size
- DIOCWLABEL to unlock label portion
- read/write /dev/rsd#d
- DIOCCACHESYNC to flush pending writes
- stop the device
- manually remove drive

(I guess I'll have to see how I can "wait on start" and "wait on stop"...
synchronous calls?)
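
If the start/stop requests turn out not to be synchronous, one option is to poll for readiness with a deadline.  A minimal sketch, assuming the readiness test is supplied by the caller (for instance, "does a one-sector read succeed yet?"); `wait_until` and its parameters are hypothetical names, and nothing here is NetBSD-specific:

```python
import time


def wait_until(predicate, timeout, interval=0.1):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Returns True if the predicate succeeded, False if the deadline passed.
    The predicate is always tried at least once, even with timeout=0.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The same helper could enforce a minimum spin-down delay before the operator is told the drive may be removed.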

Thanks!  Maybe I'll bring the server home and code this over the weekend.


Re: disk geometry (i386/amd64)

2018-09-11 Thread Don NetBSD

[should I be "reply all" or just the list?  I guess a matter of personal
preferences?]

On 9/10/2018 10:08 PM, Robert Elz wrote:


   | So, I can use the 'd' partition to access the medium (after unlocking the
   | label portion).  But, I can't count on anything else "displayed" by
   | disklabel.  And, I can't count on even the displayed values for the d
   | partition if something tinkers with the in-kernel label before I get a
   | peek at it.

That's all correct.

   | > The raw partition allows this.
   |
   | Again, as long as nothing else tinkers with the in-kernel copy of the
   | disklabel before I look at it.

No, regardless of that.   Access to the raw partition ignores whatever is
in the label, always.   If not, there'd be no way to fix a badly broken label
(as in, if the label said that the size of the raw partition was 0 sectors.)


Please reread the initial exchange (reproduced below, for convenience):

8<
>> Further, I wanted to know that I could query the OS for details as to the
>> size of the medium (sector size and number of sectors).  Again, without
>> relying on the media to have been preinitialized *or* requiring it to be
>> "initialized" prior to my access.  (I *won't* be writing a "disklabel" onto
>> the medium but will be altering much/all of its contents, otherwise)
>
> The raw partition allows this.

Again, as long as nothing else tinkers with the in-kernel copy of the disklabel
before I look at it.
8<

I.e., if something has tinkered with the in-kernel copy of the disk label,
then disklabel(8) will report those tinkered values.  So, I could NOT use
disklabel(8)'s output -- or anything else that reports the in-kernel
values -- to get an accurate description of ANY portion of the ACTUAL
"size of the medium (sector size and number of sectors)".  In the absence of
other mechanisms (e.g., the DIOCGDISKINFO ioctl), my only way to determine
the actual size of the medium would be to explore the 'd' partition with
seeks to ever increasing offsets.
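
For what it's worth, the "seeks to ever increasing offsets" fallback can be done in a logarithmic number of reads rather than a linear scan.  A rough sketch, assuming whole-block reads at block-aligned offsets (as a raw device would want) and assuming a read past the end of the medium returns short or fails; `probe_block_count` is a hypothetical name and the block size is a caller-supplied guess:

```python
import os


def probe_block_count(fd, block_size=512):
    """Count the readable blocks behind fd by probing with reads.

    A trailing partial block (possible on a regular file) is not counted.
    """
    def readable(n):
        # Block n exists if a full-size read at its offset succeeds.
        try:
            return len(os.pread(fd, block_size, n * block_size)) == block_size
        except OSError:
            return False

    if not readable(0):
        return 0
    # Phase 1: double the probe index until a read fails.
    lo, hi = 0, 1
    while readable(hi):
        lo, hi = hi, hi * 2
    # Phase 2: binary search between the last good and first bad index.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if readable(mid):
            lo = mid
        else:
            hi = mid
    return lo + 1
```

This only says what the read path will hand back; it is no substitute for asking the driver.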

For example, I can readily create a situation where the contents of
the portion of the physical medium wherein the label normally resides,
the output of disklabel(8) and the actual characteristics of the physical
device are all *different* from each other.

By way of proof:  a drive that has been zeroed (so, the label portion
contains nothing but zeroes); a disklabel(8) that has been *edited* to
completely bogus values and "saved" (or, anything else that alters the
in-kernel label); the actual disk, itself.

My goal is a reliable way of exposing what the "disk, itself" says!

(which appears to be DIOCGDISKINFO, but not DIOCGDINFO)


   | > - DIOCGDISKINFO (using proplib!)
   | > - fall back to DIOCGDINFO if that failed
   |
   | Why would it have failed?

if it were not implemented I think was the intent there.   That is, so the
code could work on various different versions of the system (be reasonably
portable) - and possibly also because (this I am not sure about) not all
drivers might implement the new method (yet).
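
The "try the new method, fall back to the old one" pattern described above could be sketched roughly as follows.  The query callables are placeholders (hypothetical wrappers around whatever issues DIOCGDISKINFO or DIOCGDINFO); only the ordering and error handling are shown, and an unsupported request is assumed to surface as an OSError:

```python
def query_with_fallback(queries):
    """Try each (name, callable) pair in order.

    Returns (name, result) for the first callable that does not raise
    OSError.  Raises OSError if every method fails.
    """
    errors = []
    for name, query in queries:
        try:
            return name, query()
        except OSError as exc:
            # Remember why this method failed and move on to the next one.
            errors.append((name, str(exc)))
    raise OSError("no query method succeeded: %r" % (errors,))
```

Returning the name of the method that answered lets the caller tell driver-reported values apart from label-derived ones.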


Ah, OK.  So, if I verify this for the sd(4) driver on a particular OS
version/port, then I need never concern myself that some particular *drive*
may fail to yield valid data?  I.e., if the ioctl fails, I can panic()?


   | [I also need to query the SMART structures and other disk parameters
   | but I can workaround that]

SMART (what a name for something so simplistic) is available too.   No idea how,
but there are commands that get at it (and make changes) so it is clearly
possible.


I think only via wd(4)?


   | I want the same sort of access -- but to the disk controller.

Disks are a little more uniform and have a better structured command
set than the parallel port, so it is possible that there may be some
interference from the driver that simply knows that some things are
possible, and others are not - even if you disagree.

But just reading/writing arbitrary blocks on the drive should be possible


Re: disk geometry (i386/amd64)

2018-09-10 Thread Don NetBSD

On 9/9/2018 10:26 PM, Michael van Elst wrote:

On Sun, Sep 09, 2018 at 05:49:15PM -0700, Don NetBSD wrote:

On 9/9/2018 1:52 PM, Michael van Elst wrote:

On Sun, Sep 09, 2018 at 12:08:14PM -0700, Don NetBSD wrote:

Said another way, are these "in-kernel" values (which no longer reflect
the physical medium) ever reported in other system calls/ioctls/etc.
INSTEAD of the "real" values?


What is 'real' ? Start with the assumption that there is no way to know
the real data and that the disklabel is the method to record such
information. 'real' is then what you wrote into the disklabel.


The disk has a real size that can be queried from the actual device.


disklabel predates that. At that time you knew the data from looking at
the device with your eyes or reading the documentation. Then you wrote
the data as disklabel to a known disk sector.


Yes.  The early IDE drives required the controller to tell *them* what
their geometry was.


This didn't really change, except that something like the disklabel
tool will now get the data by querying the drive and then writing it
to the disklabel sector. Other software (mostly) gets the information
then from the disklabel.


But, the system knows the geometry (sectors + sector size) from the device
probe -- even if disklabel is never invoked.  This is the data that I want
without the "system" deciding to "color" it (or alter it!) for me.

E.g., in our products, we talk directly to the drive and don't require
something else to do this on our behalf.


I wanted assurance that I could access the 'd' partition and be assured
access to the entire medium, regardless as to what MIGHT have been encountered
in the sectors of the medium that NetBSD examines in search of a label.


That is always true. If there is no disklabel on-disk, you will get a
generated one, if there is a disklabel, you will get that. But for the
raw partition the offset and size are ignored.


So, I can use the 'd' partition to access the medium (after unlocking the label
portion).  But, I can't count on anything else "displayed" by disklabel.  And,
I can't count on even the displayed values for the d partition if something
tinkers with the in-kernel label before I get a peek at it.

(In theory, *I* should be the only thing dealing with that disk)


-> find the raw partition with sysctl
-> just access it.


Further, I wanted to know that I could query the OS for details as to the
size of the medium (sector size and number of sectors).  Again, without relying
on the media to have been preinitialized *or* requiring it to be "initialized"
prior to my access.  (I *won't* be writing a "disklabel" onto the medium but
will be altering much/all of its contents, otherwise)


The raw partition allows this.


Again, as long as nothing else tinkers with the in-kernel copy of the disklabel
before I look at it.


I had planned on DIOCGDINFO -- with all the caveats mentioned
above.  Perhaps DIOCGDISKINFO would be a better choice?


Old (but not too old) code needed to:

- DIOCGDISKINFO (using proplib!)
- fall back to DIOCGDINFO if that failed


Why would it have failed?

DISKINFO appears to not be "confused" by changes to the in-kernel label
(edit disk size, etc. using disklabel(8); verify the changes persist
beyond disklabel(8) invocations; run drvctl as a wrapper for the DISKINFO
ioctl to verify the original drive values remain intact).

I've not yet tried this with the DINFO ioctl.

(It may be easier for me to just start at the device probe and work my
way *up* into the system than to attempt to dig down from above!)


- optionally use DIOCGWEDGEINFO to support partition devices ("wedges")


No wedges.  No partitions.  No filesystem.  Disk is just "blocks of persistent
memory".  I just need to know how many, how large they are and be able to
access ALL of them -- regardless of what the disk may have "experienced"
prior to my "seeing" it.

[I also need to query the SMART structures and other disk parameters
but I can workaround that]


There's no confusion as no one/nothing looks at the disk besides my software.


The OS will look at the disk as soon as it is attached.


Only to probe it.  The drive is never mounted.  Never labeled.  Never
"anything" -- other than the actions I will be taking (or, other staff
operating under similar constraints).

E.g., if I attached something to a "printer port" and it wasn't a printer, the
OS wouldn't care.  Yet, would still let me use that hardware to talk to
said device.

I want the same sort of access -- but to the disk controller.


But, I'd be concerned about pulling a drive that was still *spinning*.
atactl(8) doesn't seem to work with sd(4) devices.  I'll have to see if
drvctl -d spins the drive down as it is disconnected (and maybe a timeout
to ensure the

Re: disk geometry (i386/amd64)

2018-09-10 Thread Don NetBSD

On 9/10/2018 11:33 AM, Mike Pumford wrote:

On 10/09/2018 01:49, Don NetBSD wrote:


I'm not concerned with automatically detecting insertion/removal; that's
the job that the operator performs (above) -- along with the tagging of
the media, etc.


I've done a lot of work with SAS disk enclosures that support SES. They often 
have an SES command that can turn off the drive in a bay prior to removal (but 
support is optional).


Aren't standards *wonderful*?  What value to "optional" if folks can elect to
NOT implement??  

My read of the SATA spec indicated that hot plugging was part of the SATA
standard (and, by extension, SAS).  But, that support for it on HBAs was
(ahem) "optional".  In particular, support for cold presence detect (but, I'm
relying on the operator to perform that function as HE will be the person
doing the plugging/unplugging!)

[None of the products involved allow the drive to be removed so this wasn't
an issue for the design]

OTOH, when trying to repurpose someone ELSE's hardware, those details get
to be important (and, often hard to discern, reliably!)


But, I'd be concerned about pulling a drive that was still *spinning*.
atactl(8) doesn't seem to work with sd(4) devices.  I'll have to see if
drvctl -d spins the drive down as it is disconnected (and maybe a timeout
to ensure the operator doesn't remove the drive before it's had a chance
to spin down sufficiently)


A SCSI stop unit command may spin down a SAS disk but it's not guaranteed as it 
depends on whether or not the SAS HBA or expander is sending the drive periodic 
NOTIFY primitives (which trigger drive spinup). What works with one SAS HBA may 
not work with a different one.


On a more practical note. As long as you disconnect the drive from the 
connector gently and give it 5-10 seconds to spin down before any significant 
movement SAS/SATA drives tend to survive removal even if they are spinning.


As with most enterprises, I'm having to deal with nontechnical "political"
issues, here.

I'd first suggested using existing product (with a firmware update) to perform
these tasks.  The advantages being:
- we have a virtually unlimited number of them ("build more")
- they use much less power per spindle than a repurposed server
- they can be scaled to process tens or even hundreds of concurrent drives
- eliminate all the noisy fans
- we know EVERYTHING about what's in the box (hardware & firmware)
- I can dumb-down the requirements for the operator (so he doesn't damage
  a test fixture *or* components that will be used in saleable product!)

But, some bright-eyed dweeb uttered "Why not use an old server and write
something in Linux to do the job?!" -- because, to said dweeb, Linux is the
panacea for all problems technical (ignorance must be a wonderful state of
mind -- everything is "easy-peasy"!)

So, *his* "solution" fell into my lap to implement.  Having no taste for Linux
(running NetBSD since 0.8), I picked a more familiar platform.  I'd like to
make a "legitimate" effort to make it work.  But, note that the more stuff
that I rely on in the design (e.g., the hot-pluggable HBA), the more
restricted the choice of repurposable hardware available:
"Why can't we make another one of these fixtures?  We've got several
other old servers in the scrap pile..."
I don't want to have to explain how all servers are not created equally.

[Ideally, I'd like to throw it back into *his* lap and watch him struggle
to "solve" the problem -- and separately, pursue a parallel approach of
repurposing our existing product (hardware & software) to be available
when he's publicly learned his lesson (Linux is NOT a panacea).  But,
that would be too petty.  Given time, he'll learn that solutions are
rarely as simple as they seem!]

In SAS hotplug is not optional. ALL SAS HBAs should support it regardless of 
whether the drive is directly plugged in or in some sort of drive carrier. I don't 
know if NetBSD will deal with these events though.


I'm using a Dell 2950 as a first pass.  The drive sleds don't add anything
to the mix -- the HBA can accept SATA drives as well (but no mix-n-match).
The SATA->SAS converters that would be used in a normal deployment can be
removed as they would just complicate the job performed by the "operator".

The "support" question is then one of whether NetBSD can issue the correct
commands to the drive to prepare it for removal (and make it ready after
insertion).  Without the hot plug capability, the operator would have to
wait for all drives to be processed and then power down the server to replace
the drives with the next set.  And, how much information I can extract from
the native drive (e.g., serial number, model, etc.)

[This is how I'd support changing drives using our own hardware -- but, we
could afford to power down ONE drive/system at a time and not impact N
other drives that happen to be sharing that repurposed server]


Re: disk geometry (i386/amd64)

2018-09-09 Thread Don NetBSD

On 9/9/2018 1:52 PM, Michael van Elst wrote:

On Sun, Sep 09, 2018 at 12:08:14PM -0700, Don NetBSD wrote:

Said another way, are these "in-kernel" values (which no longer reflect
the physical medium) ever reported in other system calls/ioctls/etc.
INSTEAD of the "real" values?


What is 'real' ? Start with the assumption that there is no way to know
the real data and that the disklabel is the method to record such
information. 'real' is then what you wrote into the disklabel.


The disk has a real size that can be queried from the actual device.
Admittedly, things like the actual "geometry" are a moot point.  But,
the size is something that the system and the drive have to agree on
(whether you care about overprovisioned sectors isn't important at
this point)


Obviously, if drvctl(8) can report the "real" values, then they must be
preserved somewhere besides the in-kernel disklabel.


Modern drives can be queried about the disk geometry. The driver does
this and drvctl can be used to query the values collected by the driver.

The in-kernel disklabel is either generated ("default label") or read
from the on-disk label sector and sanitized a bit.


With a "foreign" drive, you can't count on the media to contain any meaningful
data in any particular places.  E.g., imagine "something" has treated the drive
as N "blocks of memory" and used them however IT deemed fit -- with no concern
for NetBSD (or *any* OS).  I.e., there was no concept of a "disk label", MBR,
etc. prior to the disk coming to the NetBSD machine.

I wanted assurance that I could access the 'd' partition and be assured
access to the entire medium, regardless as to what MIGHT have been encountered
in the sectors of the medium that NetBSD examines in search of a label.

Further, I wanted to know that I could query the OS for details as to the
size of the medium (sector size and number of sectors).  Again, without relying
on the media to have been preinitialized *or* requiring it to be "initialized"
prior to my access.  (I *won't* be writing a "disklabel" onto the medium but 
will be altering much/all of its contents, otherwise)



Erasing an on-disk label is done with 'disklabel -D'. The kernel will then
use a generated default label.


Writing a few sectors of garbage to /dev/rsd#d (seeking to start of file)
won't overwrite the label.


The label sector is normally write-protected and only accessible through
special ioctls.


Yes, DIOCWLABEL.  My plan was to DIOCGDINFO to get the *information* from the
(fictitious) disk label and then DIOCWLABEL to enable access to the entire
medium.  Then, open() the character/raw device and seek/write to my heart's
content.

But, this counts on the DIOCGDINFO ioctl giving me unadulterated information
regardless of the previous contents of the disk -- or, any disk that may have
previously been probed at that device (i.e., a drive that has since been
ejected).


Verify this by copying one disk to another:
 dd if=/dev/rsd0d of=/dev/rsd1d bs=1024k
and verifying the label of the destination disk isn't altered (?)


If there is no label on the disk, then you will get an artificial
'default label'. If there is a label on the disk, it will be read
and used by the first opener of the device, so disklabel would indeed
show the altered values.


So, given that the previous contents of the disk are indeterminate, I
can't count on the information returned from the *medium* (hence I
need access to the information returned from the *drive* -- the parameters
that the drive's onboard controller uses and not "data" that happens to
be stored in some particular place on a platter)


Bottom line:  I'm trying to expose the "native" (avoiding the use of the
word "raw" to minimize the association with the "character device") disk
interface to my code so I can put what I want


The "native" disk interface is what you use and has its limits.
Fortunately this is only relevant to a few low-level tools and
several additional interfaces were added to address the shortcomings.

To access the raw disk nowadays you would:

- use opendisk() to get a file descriptor.
- use ioctl(...,DIOCGSECTORSIZE,...) to get the byte size of a disk sector.
- use ioctl(...,DIOCGMEDIASIZE,...) to get the size of the medium in bytes.


A quick grep of the sources (7.1/amd64) doesn't turn up any hits for these
ioctls.  I had planned on DIOCGDINFO -- with all the caveats mentioned
above.  Perhaps DIOCGDISKINFO would be a better choice?


- use lseek/read or pread to read sectors
- use lseek/write or pwrite to write sectors
- use ioctl(...,DIOCCACHESYNC,...) to force the disk to write cached data.
- close() the descriptor when done.
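
The read/write/flush portion of that sequence might look roughly like this.  It is only a sketch: the sector size is passed in rather than queried (the DIOCG* ioctls are not shown), `os.fsync` stands in for DIOCCACHESYNC and only flushes what the OS has buffered, and the function names are made up for illustration:

```python
import os


def read_sector(fd, lba, sector_size):
    """Read exactly one sector at logical block address `lba`."""
    data = os.pread(fd, sector_size, lba * sector_size)
    if len(data) != sector_size:
        raise OSError("short read at sector %d" % lba)
    return data


def write_sector(fd, lba, sector_size, payload):
    """Write exactly one sector at logical block address `lba`."""
    if len(payload) != sector_size:
        raise ValueError("payload must be exactly one sector long")
    written = os.pwrite(fd, payload, lba * sector_size)
    if written != sector_size:
        raise OSError("short write at sector %d" % lba)


def flush(fd):
    """Push buffered writes down; a stand-in for the cache-sync step."""
    os.fsync(fd)
```

Using pread/pwrite rather than lseek plus read/write keeps each transfer independent of the descriptor's current offset.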

But if you do this and change how the OS sees the disk (i.e. change
pa

Re: disk geometry (i386/amd64)

2018-09-09 Thread Don NetBSD

On 9/8/2018 11:55 PM, Michael van Elst wrote:

netbsd-embed...@gmx.com ("Don NetBSD") writes:


My understanding is that the 'd' partition is intended to reference the
entire medium.  But, a simple test (disklabel -e) indicates that I can
create an arbitrary (start,size) for that partition.  So, I could
potentially encounter a foreign disk that happens to *look* like it has a
'd' partition that doesn't map to the actual entire physical disk.


The 'd' partition is the raw partition (on x86, other architectures use
'c'). Accesses to the raw partition ignore the disklabel values.


Yet manually editing any of the parameters in the label via disklabel(8)
(d partition size/offset, total sectors, # cylinders, sectors/track, etc.)
*persists* across disklabel(8) invocations.  Apparently, these are only stored
in the kernel and not rewritten to the physical medium (?) -- i.e., they
don't persist across a reboot.

[I've verified both of these claims]

Which begs the question, why are these presumably immutable parameters
allowed to be altered in even the in-kernel copy of the label?  Why are
attempts at altering them (sectors/track, etc.) not ignored completely?

Said another way, are these "in-kernel" values (which no longer reflect
the physical medium) ever reported in other system calls/ioctls/etc.
INSTEAD of the "real" values?  (In my example, disklabel(8) is getting
the modified-to-be-incorrect values from *somewhere*...)


N.B. 'sysctl kern.rawpartition' shows what the raw partition is
(0='a',1='b',2='c',..).
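
A small sketch of turning that sysctl value into a raw device name, assuming the index has already been obtained (e.g. from the output of 'sysctl kern.rawpartition'); the helper names are hypothetical, and the upper bound used here is simply the alphabet, not the port's real partition limit:

```python
import string


def partition_letter(index):
    """Map a kern.rawpartition value (0='a', 1='b', ...) to its letter."""
    if not 0 <= index < len(string.ascii_lowercase):
        raise ValueError("partition index out of range: %r" % (index,))
    return string.ascii_lowercase[index]


def raw_device_path(disk, index):
    """Build the raw (character) device path for a disk and partition index."""
    return "/dev/r%s%s" % (disk, partition_letter(index))
```

With the x86 value of 3, `raw_device_path("sd0", 3)` gives `/dev/rsd0d`.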


How can I get the actual geometry and ensure any bogus label contents are
obliterated (i.e., overwriting the entire medium regardless of the
"protected" MBR/label space)?


The disklabel stores the physical geometry, but that's mostly what has been
stored in the on-disk disklabel. The driver also knows the geometry from the
drive itself. You can query this data using 'drvctl -p'.


I'm looking for programmatic ways of doing so (omit most/all of the filesystem
contents).

Obviously, if drvctl(8) can report the "real" values, then they must be
preserved somewhere besides the in-kernel disklabel.  (?)  And, somewhere
that can't be mucked with by altering the contents of the disklabel portion
of the medium!

[I'll grep the source to see...]


Erasing an on-disk label is done with 'disklabel -D'. The kernel will then
use a generated default label.


Writing a few sectors of garbage to /dev/rsd#d (seeking to start of file)
won't overwrite the label.  Verify this by copying one disk to another:
dd if=/dev/rsd0d of=/dev/rsd1d bs=1024k
and verifying the label of the destination disk isn't altered (?)
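
One way to check a claim like this programmatically is to hash the region of interest before and after the copy and compare.  A minimal sketch, assuming the caller knows which byte range to examine (the offset and length of the label area are not given in this thread and are left as parameters) and that the offset is sector-aligned when pointed at a raw device; `region_digest` is a made-up name:

```python
import hashlib


def region_digest(path, offset, length, chunk=1 << 16):
    """Return the SHA-256 hex digest of `length` bytes starting at `offset`."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as dev:
        dev.seek(offset)
        while remaining > 0:
            data = dev.read(min(chunk, remaining))
            if not data:
                raise OSError("region extends past end of %s" % path)
            digest.update(data)
            remaining -= len(data)
    return digest.hexdigest()
```

If the digest taken before the dd equals the one taken after, that region of the destination was not altered by the copy.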

Bottom line:  I'm trying to expose the "native" (avoiding the use of the
word "raw" to minimize the association with the "character device") disk
interface to my code so I can put what I want, where I want it (as long
as I don't expect NetBSD to "use" the disk in any way).  And, trying to
figure out what contractual guarantees I can get from the OS to stay out
of my way in doing so.

[The alternative is to resort to bare metal programming and elide all
of the muck that is associated with a "PC"]


Related but different question:  is there any support for hot-plugging
(unplugging) drives (other than USB) on NetBSD?  I know you can kludge
this with external SCSI drives if the i/f is quiescent and you then rescan
the appropriate scsibus.  But, how does SATA/SAS cope?


About the same. There is an autoconfig feature in the drivers to 'rescan'
which can be triggered with 'drvctl -r'. ATA needs an additional parameter,
i.e. 'drvctl -r -a ata_hl atabus1' will try to attach drives on 'atabus1'.
For hotplug you would first detach the old drive with 'drvctl -d' and then
rescan.


OK.  Again, I can look through the source to see what the actual mechanism
is (to reproduce it in my code).

Do you happen to know if SATA/SAS are *inherently* hotpluggable?  Or, do
they need *hardware* support on the motherboard/backplane to do this?
I.e., is "support for hotplugging" entirely a software ("BIOS") issue?

Thanks for your help!


disk geometry (i386/amd64)

2018-09-08 Thread Don NetBSD
I have to design an appliance to accept "foreign" disks (sd(4) and wd(4)) and 
massage them into a form/content suitable for deployment (in yet another 
appliance).

My understanding is that the 'd' partition is intended to reference the entire 
medium.  But, a simple test (disklabel -e) indicates that I can create an 
arbitrary (start,size) for that partition.  So, I could potentially encounter a 
foreign disk that happens to *look* like it has a 'd' partition that doesn't 
map to the actual entire physical disk.

How can I get the actual geometry and ensure any bogus label contents are 
obliterated (i.e., overwriting the entire medium regardless of the "protected" 
MBR/label space)?

Related but different question:  is there any support for hot-plugging 
(unplugging) drives (other than USB) on NetBSD?  I know you can kludge this 
with external SCSI drives if the i/f is quiescent and you then rescan the 
appropriate scsibus.  But, how does SATA/SAS cope?