Re: configuring remote headless servers

2016-09-01 Thread Manuel Bouyer
On Thu, Sep 01, 2016 at 06:02:33PM +0100, Steve Blinkhorn wrote:
> But for now my original question still stands: what about using
> /fastboot?

It means your system won't run fsck at boot and will mount the file
systems read/write even if they are marked unclean. This can cause a
panic/reboot loop.
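
As a rough illustration of the behaviour described above (a toy model in
Python, not the actual rc script; the function name and the returned strings
are invented for this sketch), the boot-time decision comes down to something
like this:

```python
def boot_fsck_plan(fastboot_exists: bool, fs_marked_clean: bool) -> str:
    """Toy model of what happens at boot with and without /fastboot.

    Hypothetical helper for illustration only; it does not touch any disk.
    """
    if fastboot_exists:
        # fsck is skipped entirely, so the clean/unclean flag is never
        # acted upon and the file systems are mounted read/write as they are.
        if fs_marked_clean:
            return "skip fsck, mount read/write"
        return "skip fsck, mount read/write UNCHECKED (panic/reboot loop possible)"
    if fs_marked_clean:
        return "fsck finds the file systems clean, mount read/write"
    return "fsck checks and repairs, or boot stops in single-user mode"


if __name__ == "__main__":
    for fastboot in (False, True):
        for clean in (True, False):
            print(f"fastboot={fastboot!s:5} clean={clean!s:5} -> "
                  f"{boot_fsck_plan(fastboot, clean)}")
```

The case to worry about is the one marked UNCHECKED: with /fastboot present,
a machine that lost power goes straight back into service on damaged file
systems.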

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--


Re: configuring remote headless servers

2016-09-01 Thread Steve Blinkhorn
I'm grateful for the sharing of wisdom and experience.  I have
worked out that the servers most likely do have IPMI (they are Fujitsu
Siemens Primergy RX100 GSO1), but given their age I suspect it will prove to be
an early version.

I saw something in the BIOS setup that looked related, but given the
urgent need to get them back into service I did not have time to
experiment at base and dared not set them into a novel configuration
(for me).   I have this problem of physical disability which prevents
me working on the machines directly in the machine room.

Perhaps if the ISP who provided them in the first place had thought to
configure IPMI then, my life would have been significantly easier these
past few weeks.

But for now my original question still stands: what about using
/fastboot?

I'm not ignoring the other suggestions, e.g. cross-connecting serial
ports, but at the moment they're not practical.

--
Steve Blinkhorn 

> 
> st...@prd.co.uk (Steve Blinkhorn) writes:
> 
> > Following on from the recent saga of upgrading from 2.0 to 7.0 which
> > assiduous readers may recall, the servers were re-installed in their
> > racks in the data centre.   All was well with one of them but the
> > other apparently failed.   It took three days for an engineer with
> > sufficiently developed skills to become available: He solved the
> > problem by switching the server on.
> >
> > But this led me to wonder how I would cope if, for instance, a server
> > came up in single-user mode requiring an fsck.   Once upon a time I
> > was able to assume that this would be a circumstance familiar to data
> > centre staff, but no longer.   What I would need would be a boot
> > sequence that started the network before any file system checking and
> > allowed remote login.   Alternatively, file system checking could be
> > disabled by default - even if the system went down by power cycling
> > the machine.
> >
> > I can see from the man pages for shutdown(8) and fastboot(8) that
> > there is provision related to this kind of circumstance.   Would it
> > simply be a matter of having an empty file named /fastboot in the root
> > directory?   If it matters, these are i386 machines.
> >
> > Any gotchas with this approach?
> 
> 
> Hello...  There have been several good responses to this, so I doubt that
> I will add much...  but anyway...
> 
> You will really want some sort of remote console, for real and true.
> This means either a serial console or some sort of internal or external
> console redirection.
> 
> For the serial console route, there is support in NetBSD to redirect to
> a serial port all of the console output when the kernel boots.  This
> would take care of your fsck example.  Couple this with a PDU that is
> network connected and can cycle plugs and you can power cycle the system
> and pretty much watch it boot up.  As for the device that is on the
> other end of the serial port, use your other system and cross connect
> them together.  This would require two serial ports per system and will
> work except when BOTH systems are down and nonfunctional.
> 
> Internal console redirection comes in the form of DRAC [Dell], iLO [HP]
> or IPMI [in some cases].  This works well and will provide total console
> redirection even of the BIOS boot process.  There may be an additional
> license required for advanced features, but you may not need those.
> Also, Amazon and ebay often sell the bits and pieces cheaply. This
> arrangement is, by far, the most functional.  DRAC and iLO will allow
> you to power cycle the systems without using an external PDU and you can
> pretty much see everything.
> 
> External console redirection is in the form of a network connected KVM
> box that sits on the video output and keyboard output of the system.  It
> is possible to get very cheap versions of these that MAY just work out
> for you, as long as you keep the arrangement simple [don't chain KVMs to
> KVMs, and the like].  Couple this with a network connected PDU and you
> can hard power cycle the systems pretty simply.
> 
> In a number of these cases it is required that the network connected
> device have Internet access of some form or that there be a jump box /
> VPN arrangement that will allow incoming connections to the PDUs and
> so on.
> 
> Someone mentioned the use of a thumb drive to boot up a minimal kernel
> with openssh running.  That was clever in a number of ways.  It would
> require, probably, someone who can place the thumb drive in the system,
> but they would not have to be any more talented than that.  You could
> probably tie the thumb drive to the system physically such that all
> someone would have to do is place it in a USB port.  Likewise you could
> do something with ANY external booting device such as a CD.  The down
> side is that it wouldn't help with network issues, but most of these
> solutions are dead in the water with that issue anyway.  The real trick
> with this is coming up with the boot, but 

Re: configuring remote headless servers

2016-08-31 Thread Lyndon Nerenberg
> But this led me to wonder how I would cope if, for instance, a server
> came up in single-user mode requiring an fsck.

The standard way to deal with this in DC deployments is to use IPMI:

1) Redirect the BIOS console to the IPMI virtual console.

2) Redirect the boot loader prompts to the IPMI virtual console device.

3) Spawn a getty on the virtual IPMI console device.

We do this on pretty much everything in our DCs.  We don't have any gear 
running NetBSD, but for the OpenBSD and FreeBSD machines, (2) involves one line 
in boot.conf, and (3) is an entry in gettytab.  (1) is OS agnostic, and 
involves configuring the machine's BIOS with IP addresses and login credentials.
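
Once the BMC has an address and credentials, the remote side of this can be
driven with the ipmitool utility. The following Python sketch only builds the
command lines; the BMC address and user name are placeholders, and it assumes
an IPMI 2.0 BMC (an older IPMI 1.5 BMC would need "-I lan" and has no standard
serial-over-LAN console).

```python
def ipmitool_cmd(host, user, *action):
    """Build an ipmitool command line for a BMC reachable over the LAN.

    -I lanplus selects the IPMI v2.0 LAN interface; -E tells ipmitool to
    read the password from the IPMI_PASSWORD environment variable, so it
    never appears on the command line.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *action]


# Placeholder values for illustration; substitute your own BMC details.
BMC_HOST = "192.0.2.10"
BMC_USER = "admin"

power_status = ipmitool_cmd(BMC_HOST, BMC_USER, "chassis", "power", "status")
power_cycle = ipmitool_cmd(BMC_HOST, BMC_USER, "chassis", "power", "cycle")
sol_console = ipmitool_cmd(BMC_HOST, BMC_USER, "sol", "activate")

if __name__ == "__main__":
    # Print rather than execute: power cycling a server is not something
    # an example should do on its own.
    for cmd in (power_status, power_cycle, sol_console):
        print(" ".join(cmd))
```

Each list can be handed to subprocess.run() as it stands. "sol activate"
attaches the serial-over-LAN console, which is where the BIOS, boot loader
and getty from steps (1) to (3) would show up.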

--lyndon




Re: configuring remote headless servers

2016-08-31 Thread Swift Griggs
On Wed, 31 Aug 2016, Steve Blinkhorn wrote:
> It took three days for an engineer with sufficiently developed skills to 
> become available: He solved the problem by switching the server on.

Having found no good way to truly address issues like this without some 
control of my own, I don't deal with an ISP that won't give me power 
control and console. HP iLOs are a good solution since they can be used 
for a hard power cycle and also give you a real remote console. If your 
console stays in text mode, you don't even have to license the iLO. Many 
server BIOSes have a mode whereby they can provide console support via a 
dedicated serial port (Tyan comes to mind as one of these). If you combine 
that functionality with something that can do remote power control (like a 
Baytech RPC or APC network PDUs) then you've got the same features. 

> But this led me to wonder how I would cope if, for instance, a server 
> came up in single-user mode requiring an fsck.

If you have true console access it wouldn't matter. You'd do the fsck then 
keep truckin'.

> I can see from the man pages for shutdown(8) and fastboot(8) that there 
> is provision related to this kind of circumstance.

I'll just apologize because I doubt my response is what you were looking 
for. I'll simply say this: when it comes to hosted systems, the faster the 
system can bring up the network and ssh with the absolute minimum of 
dependencies, the better. I've never seen an OS that really "gets" 
this, as evidenced by the fact that even though OSes *could* use their 
ramdisk/miniroots to launch OpenSSH (and statically link it), they rarely 
do (and there are some reasons, but I usually disagree with their 
importance). For a server without a decent console, having OpenSSH started 
is de facto the same thing as having a usable server, thus the strategy 
should place a categorically *premium value* on doing that as early in the 
boot process as possible with the fewest dependencies.

Also, NetBSD has the ability to redirect the console to a serial port as 
soon as the kernel starts booting. You'd need a serial console in place 
first before you can take advantage of that, but in your scenario of the 
system needing an fsck and stopping the boot process, it'd save you from 
having to call some data-center hands & eyes at your ISP.

-Swift


Re: configuring remote headless servers

2016-08-31 Thread Michael van Elst
st...@prd.co.uk (Steve Blinkhorn) writes:

>But this led me to wonder how I would cope if, for instance, a server
>came up in single-user mode requiring an fsck.

This is handled by using server hardware that has an out-of-band
management console, e.g. BMC, iLO, DRAC, iRMC, or just a serial
console with a terminal server and a remote power switch.

Configuring some kind of emergency network before doing fsck
is difficult as you need to run with read-only disks and it
wouldn't help with other types of errors. It's too late to
answer a kernel or even a boot loader prompt.


Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: configuring remote headless servers

2016-08-31 Thread David Brownlee
On 31 August 2016 at 11:34, Steve Blinkhorn  wrote:
> Following on from the recent saga of upgrading from 2.0 to 7.0 which
> assiduous readers may recall, the servers were re-installed in their
> racks in the data centre.   All was well with one of them but the
> other apparently failed.   It took three days for an engineer with
> sufficiently developed skills to become available: He solved the
> problem by switching the server on.
>
> But this led me to wonder how I would cope if, for instance, a server
> came up in single-user mode requiring an fsck.   Once upon a time I
> was able to assume that this would be a circumstance familiar to data
> centre staff, but no longer.   What I would need would be a boot
> sequence that started the network before any file system checking and
> allowed remote login.   Alternatively, file system checking could be
> disabled by default - even if the system went down by power cycling
> the machine.
>
> I can see from the man pages for shutdown(8) and fastboot(8) that
> there is provision related to this kind of circumstance.   Would it
> simply be a matter of having an empty file named /fastboot in the root
> directory?   If it matters, these are i386 machines.
>
> Any gotchas with this approach?

As a data point - I had a USB key set to boot up with dhcpcd and then
run openvpn and sshd, then set the server to boot from USB first.

In the event of a server issue the remote hands had to plug in the USB
key and hit the power switch (the OpenVPN was in case someone had
managed to bork the firewall as well such that inbound ssh was
disallowed - don't ask).

It was generic enough that they could plug it into most any box with
ethernet and have an expectation of it working :)
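
With a setup like this, the remote end mostly has to wait for sshd to start
answering after the power switch has been hit. A minimal Python sketch of that
wait, assuming the rescue system is reachable at some known address (directly
or over the VPN); the function name and defaults are made up for illustration:

```python
import socket
import time


def wait_for_port(host, port=22, timeout=600.0, interval=5.0):
    """Poll until a TCP connection to host:port succeeds.

    Returns True as soon as something accepts the connection, or False
    if nothing has answered by the time `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A completed TCP handshake is enough to say sshd is listening;
            # the connection is closed again immediately.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("192.0.2.20") (a placeholder address) would return
True once the rescue sshd accepts connections, or False after ten minutes.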