Re: Race condition with mdadm at boot [still mystifying]

2011-03-10 Thread Stephen John Smoogen
On Thu, Mar 10, 2011 at 20:24, Chuck Munro  wrote:
> This is a bit long-winded, but I wanted to share some info 
>
> Regarding my earlier message about a possible race condition with mdadm, I
> have been doing all sorts of poking around with the boot process. Thanks to
> a tip from Steven Yellin at Stanford, I found where to add a delay in the
> rc.sysinit script, which invokes mdadm to assemble the arrays.
>
> Unfortunately, it didn't help, so it likely wasn't a race condition after
> all.
>
> However, on close examination of dmesg, I found something very interesting.
>  There were missing 'bind' statements for one or the other hot spare
> drive (or sometimes both).  These drives are connected to the last PHYs in
> each SATA controller ... in other words they are the last devices probed by
> the driver for a particular controller.  It would appear that the drivers
> are bailing out before managing to enumerate all of the partitions on the
> last drive in a group, and missing partitions occur quite randomly.

Ok, this sounds similar to another problem set I heard about last week.
You need to make sure the drives in the array are "RAID compatible"
these days. Various green drives can take way too long to spin up, or
go to sleep quickly, causing them to get marked as bad by dmraid before
they are ready. However, if it's not that, then the next few issues tend
to be cable or BIOS related:

1) The cable isn't rated for the length. Sure, you can buy a 2-foot SATA
cable, but the controller timing may assume something much shorter.
2) The cable isn't rated for the drive capacities.
3) Other BIOS issues that require updates and playing around (oh wait,
the default is to spin everything down but I need it up).

> So it may or may not be a timing issue between the WD Caviar Black drives
> and both the LSI and Marvell SAS/SATA controller chips.
>
> So, I replaced the two drives (SATA-300) with two faster drives (SATA-600)
> on the off chance they might respond fast enough before the drivers move on
> to other duties.  That didn't help either.
>
> Each group of arrays uses unrelated drivers (mptsas and sata_mv) but both
> exhibit the same problem, so I'm mystified as to where the real issue lies.
>  Anyone care to offer suggestions?
>
> Chuck
>



-- 
Stephen J Smoogen.
"The core skill of innovators is error recovery, not failure avoidance."
Randy Nelson, President of Pixar University.
"Let us be kind, one to another, for most of us are fighting a hard
battle." -- Ian MacLaren


Race condition with mdadm at boot [still mystifying]

2011-03-10 Thread Chuck Munro

This is a bit long-winded, but I wanted to share some info 

Regarding my earlier message about a possible race condition with mdadm, 
I have been doing all sorts of poking around with the boot process. 
Thanks to a tip from Steven Yellin at Stanford, I found where to add a 
delay in the rc.sysinit script, which invokes mdadm to assemble the arrays.


Unfortunately, it didn't help, so it likely wasn't a race condition 
after all.


However, on close examination of dmesg, I found something very 
interesting.  There were missing 'bind' statements for one or the 
other hot spare drive (or sometimes both).  These drives are connected 
to the last PHYs in each SATA controller ... in other words they are the 
last devices probed by the driver for a particular controller.  It would 
appear that the drivers are bailing out before managing to enumerate all 
of the partitions on the last drive in a group, and missing partitions 
occur quite randomly.
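
One way to double-check which component partitions actually got bound at boot
is to grep dmesg for md's bind messages, e.g. (a sketch; the device names will
of course differ on any given system):

  dmesg | grep 'md: bind'
  cat /proc/mdstat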


So it may or may not be a timing issue between the WD Caviar Black 
drives and both the LSI and Marvell SAS/SATA controller chips.


So, I replaced the two drives (SATA-300) with two faster drives 
(SATA-600) on the off chance they might respond fast enough before the 
drivers move on to other duties.  That didn't help either.


Each group of arrays uses unrelated drivers (mptsas and sata_mv) but 
both exhibit the same problem, so I'm mystified as to where the real 
issue lies.  Anyone care to offer suggestions?


Chuck


Re[2]: Race condition with mdadm at bootup?

2011-03-10 Thread Chuck Munro
Well, I tried adding a 5-second sleep to the mdadm startup in the 
sysinit script, and 10 seconds in the mdmonitor script, but it made no 
difference.  I still got the spare partitions not included in two of the 
arrays.  What I find curious is that it's always the hot spares, never 
the active components.
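
For reference, the change amounted to something like this in
/etc/rc.d/rc.sysinit (a sketch only; the exact flags on the existing mdadm
line may differ):

  # /etc/rc.d/rc.sysinit (excerpt)
  sleep 5              # crude delay before array assembly
  /sbin/mdadm -A -s    # existing line that assembles the arrays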


The "No suitable drives" thing is a mystery, since all drives work for 
the other arrays, but I get things like:


  # mdadm -A -s
  mdadm: No suitable drives found for /dev/md/md_d23
  mdadm: No suitable drives found for /dev/md/md_d27

when I issue the command manually after the system is up.

d23 and d27 are the random arrays with missing spares this time around. 
 Next time I boot it'll be different arrays.
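
When this happens, the affected array can be inspected and the spare re-added
by hand after boot; a minimal sketch (the partition name /dev/sdl4 is purely
hypothetical):

  mdadm --detail /dev/md/md_d23          # shows which component/spare is missing
  mdadm /dev/md/md_d23 --add /dev/sdl4   # re-add the missing hot-spare partition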


Time to put on my thinking cap  :-)

Chuck


On 03/10/2011 10:53 AM, Steven J. Yellin wrote:

Maybe I missed it, but I didn't see any response to your request

"Does anyone know of a way to have mdadm delay its assembly until all
partitions are enumerated? Even if it's simply to insert a
several-second wait time, that would probably work. My knowledge of the
internal workings of the boot process isn't good enough to know where to
look."

I thought partition enumeration was done before init was started, but I
don't know much about such matters. That's why I was waiting for someone
else to reply. Anyway, here's what information I can contribute: In
/etc/rc.d/rc.sysinit is a line with "/sbin/mdadm -A -s", before which
you can insert a delay. And /etc/rc.d/init.d/mdmonitor contains
"#chkconfig: 2345 15 85", where the relatively low number "15" means
when chkconfig sets mdmonitor to start during boot, chkconfig will make
a symbolic link to mdmonitor named "S15mdmonitor", causing mdmonitor to
start relatively early. If you 'chkconfig mdmonitor off', it will not be
started at all during the boot, and you can do it later by hand with
"service mdmonitor start". That would let you see if an arbitrarily long
delay of it helps. Surely you can recover if mdmonitor is needed for
later parts of the boot, if only by 'chkconfig mdmonitor on' and reboot.


Steven Yellin

On Tue, 8 Mar 2011, Chuck Munro wrote:


Hello folks,

This is my first adventure with SL after many years of using CentOS.
I'm using SL-6 on a large-ish VM server, and have been quite happy
with it.

I am experiencing a weird problem at bootup with large RAID-6 arrays.
After Googling around (a lot) I find that others are having the same
issues with CentOS/RHEL/Ubuntu/whatever. In my case it's Scientific
Linux-6 which should behave the same way as CentOS-6. I had the same
problem with the RHEL-6 evaluation version. I'm posting this question
to the CentOS mailing list as well.

For some reason, each time I boot the server a random number of RAID
arrays will come up with the hot-spare missing. This occurs with
hot-spare components only, never with the active components. Once in a
while I'm lucky enough to have all components come up correctly when
the system boots. Which hot spares fail to be configured is completely
random.

I have 12 2TB drives, each divided into 4 primary partitions, and
configured as 8 partitionable MD arrays. All drives are partitioned
exactly the same way. Each R6 array consists of 5 components
(partitions) plus a hot-spare. The small RAID-1 host OS array never
has a problem with its hot spare.

The predominant theory via Google is that there's a race condition at
boot time between full enumeration of all disk partitions and mdadm
assembling the arrays.

Does anyone know of a way to have mdadm delay its assembly until all
partitions are enumerated? Even if it's simply to insert a
several-second wait time, that would probably work. My knowledge of
the internal workings of the boot process isn't good enough to know
where to look.

I tried to issue 'mdadm -A -s /dev/md/md_dXX' after booting, but all
it does is complain about "No suitable drives found for /dev."

Here is the mdadm.conf file:
-

MAILADDR root
PROGRAM /root/bin/record_md_events.sh

DEVICE partitions
##DEVICE /dev/sd* << this didn't help.
AUTO +imsm +1.x -all

## Host OS root arrays:
ARRAY /dev/md0
metadata=1.0 num-devices=2 spares=1
UUID=75941adb:33e8fa6a:095a70fd:6fe72c69
ARRAY /dev/md1
metadata=1.1 num-devices=2 spares=1
UUID=7a96d82d:bd6480a2:7433f1c2:947b84e9
ARRAY /dev/md2
metadata=1.1 num-devices=2 spares=1
UUID=ffc6070d:e57a675e:a1624e53:b88479d0

## Partitionable arrays on LSI controller:
ARRAY /dev/md/md_d10
metadata=1.2 num-devices=5 spares=1
UUID=135f0072:90551266:5d9a126a:011e3471
ARRAY /dev/md/md_d11
metadata=1.2 num-devices=5 spares=1
UUID=59e05755:5b3ec51e:e3002cfd:f0720c38
ARRAY /dev/md/md_d12
metadata=1.2 num-devices=5 spares=1
UUID=7916eb13:cd5063ba:a1404cd7:3b65a438
ARRAY /dev/md/md_d13
metadata=1.2 num-devices=5 spares=1
UUID=9a767e04:e4e56a9d:c369d25c:9d333760

## Partitionable arrays on Tempo controllers:
ARRAY /dev/md/md_d20
metadata=1.2 num-devices=5 spares=1
UUID=1d5a3c32:eb9374ac:eff41754:f8a176c1

Re: No success installing ATI Radeon HD5970 driver

2011-03-10 Thread Phil Perry

On 10/03/11 17:28, Wil Irwin wrote:

Hi-


I have tried multiple times to install the driver using the "GUI" installer
and the subsequent steps. Installation appears to proceed and I can finish
with "aticonfig --initial". However, the driver doesn't appear to be
applied. Scrolling down any webpage or document is very constipated, and
dragging windows across the screen is also extremely constipated. In
addition, the GUI for the Catalyst Control Panel will allow resolution
changes, etc., but they are not applied after a reboot. I have also tried the
command-prompt based install, with exactly the same results. I'm using the
11.2 driver released on 02/15/2011. I should also note that the same problem
(or at least a similar one) happened with SL5 and Ubuntu 10.x.

The errors from fgl_glxgears, fglrxinfo, and glxinfo, the output of uname -r,
and the xorg.conf file are listed below. I am running SL6 with all updates and
packages installed.

Any suggestions would be VERY MUCH appreciated.

Thanks,

Wil




Hi Wil,

Can I suggest you try the ATI driver package for EL6 from elrepo.org:

http://elrepo.org
http://elrepo.org/tiki/kmod-fglrx

I believe the elrepo.org repository might already be installed under SL6.

Once you have elrepo installed, you can install the ATI drivers with:

yum --enablerepo=elrepo install kmod-fglrx

and if you need 32-bit application support on x86_64 then you should 
also install the fglrx-x11-drv-32bit package.
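
For example (same repo switch as above):

  yum --enablerepo=elrepo install fglrx-x11-drv-32bit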


*Before* you install the elrepo packaged drivers, please uninstall the 
previous ATI installer drivers:


sh /usr/share/ati/fglrx-uninstall.sh

At the moment elrepo.org only has the 10.12 drivers for SL6 but I'll do 
my best to get those updated soon.


Re: No success installing ATI Radeon HD5970 driver

2011-03-10 Thread Lucian
On Thu, Mar 10, 2011 at 5:28 PM, Wil Irwin  wrote:
> Any suggestions would be VERY MUCH appreciated.

Yes, use the ATI driver in Elrepo; it works for me with SL6 x86_64.


Re: a few questions about SL admin best practises

2011-03-10 Thread Aaron van Meerten
Another note with regard to LVM: on our infrastructure we ran some basic 
IOzone and bonnie++ tests and found that using LVM caused up to a 10% 
performance hit for I/O operations compared with a native partition 
table.  The convenience did not seem worth the performance hit we measured.

-Aaron


On Mar 10, 2011, at 1:48 PM, Konstantin Olchanski wrote:

> On Thu, Mar 10, 2011 at 05:35:30AM -0500, Robert P. J. Day wrote:
>> 
>>  first question -- is there any sane reason not to use LVM these
>> days?  the manual opens (predictably) with showing the student how to
>> allocate fixed partitions during the install, and leaves LVM setup for
>> later in the week as an "advanced" topic.  i see it the other way
>> around -- LVM should be the norm nowadays.
>> 
> 
> 
> No reason to use LVM. The traditional "md" software raid is much simpler
> and easier to manage (only one tool to learn - mdadm, compared
> to the 100 lvm management programs). Historically, LVM is a knock-off
> of XLV which was the companion partitioning tool to SGI's XFS filesystem.
> 
> 
>>  thoughts?  i'll always allocate /boot as a regular partition but
>> unless there are compelling reasons not to, i always recommend LVM as
>> the standard.
> 
> 
> Your /boot partition has to be mirrored across both of your system disks.
> If it's only on one disk and it fails, you have an unbootable machine,
> regardless of what tool you used (lvm or md).
> 
> With "md" it is very simple, /dev/md0 is the system partition mirrored
> across /dev/sda1 and /dev/sdb1, there is no need for separate /boot
> partition, GRUB happily installs on both /dev/sda and /dev/sdb, and
> your machine happily boots if either disk explodes.
> 
> To do the same with LVM, you probably have to read a book and take
> an advanced sysadmin class; and forget about getting it to actually
> work without the help of this mailing list.
> 
> 
> -- 
> Konstantin Olchanski
> Data Acquisition Systems: The Bytes Must Flow!
> Email: olchansk-at-triumf-dot-ca
> Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada


Re: a few questions about SL admin best practises

2011-03-10 Thread Konstantin Olchanski
On Thu, Mar 10, 2011 at 05:35:30AM -0500, Robert P. J. Day wrote:
> 
>   first question -- is there any sane reason not to use LVM these
> days?  the manual opens (predictably) with showing the student how to
> allocate fixed partitions during the install, and leaves LVM setup for
> later in the week as an "advanced" topic.  i see it the other way
> around -- LVM should be the norm nowadays.
>


No reason to use LVM. The traditional "md" software raid is much simpler
and easier to manage (only one tool to learn - mdadm, compared
to the 100 lvm management programs). Historically, LVM is a knock-off
of XLV which was the companion partitioning tool to SGI's XFS filesystem.


>   thoughts?  i'll always allocate /boot as a regular partition but
> unless there are compelling reasons not to, i always recommend LVM as
> the standard.


Your /boot partition has to be mirrored across both of your system disks.
If it's only on one disk and it fails, you have an unbootable machine,
regardless of what tool you used (lvm or md).

With "md" it is very simple, /dev/md0 is the system partition mirrored
across /dev/sda1 and /dev/sdb1, there is no need for separate /boot
partition, GRUB happily installs on both /dev/sda and /dev/sdb, and
your machine happily boots if either disk explodes.
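
A minimal sketch of that setup (device names are examples, and metadata=1.0 is
an assumption here so the superblock sits at the end of the partition where
GRUB can still read the filesystem):

  mdadm --create /dev/md0 --level=1 --raid-devices=2 --metadata=1.0 /dev/sda1 /dev/sdb1
  mkfs.ext4 /dev/md0
  grub-install /dev/sda    # put GRUB in the MBR of each disk,
  grub-install /dev/sdb    # so the machine boots if either one dies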

To do the same with LVM, you probably have to read a book and take
an advanced sysadmin class; and forget about getting it to actually
work without the help of this mailing list.


-- 
Konstantin Olchanski
Data Acquisition Systems: The Bytes Must Flow!
Email: olchansk-at-triumf-dot-ca
Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada


Re: Race condition with mdadm at bootup?

2011-03-10 Thread Steven J. Yellin

Maybe I missed it, but I didn't see any response to your request

"Does anyone know of a way to have mdadm delay its assembly until all 
partitions are enumerated?  Even if it's simply to insert a several-second 
wait time, that would probably work.  My knowledge of the internal 
workings of the boot process isn't good enough to know where to look."


I thought partition enumeration was done before init was started, but 
I don't know much about such matters.  That's why I was waiting for 
someone else to reply.  Anyway, here's what information I can contribute: 
In /etc/rc.d/rc.sysinit is a line with "/sbin/mdadm -A -s", before which 
you can insert a delay.  And /etc/rc.d/init.d/mdmonitor contains 
"#chkconfig: 2345 15 85", where the relatively low number "15" means when 
chkconfig sets mdmonitor to start during boot, chkconfig will make a 
symbolic link to mdmonitor named "S15mdmonitor", causing mdmonitor to 
start relatively early.  If you 'chkconfig mdmonitor off', it will not be 
started at all during the boot, and you can do it later by hand with 
"service mdmonitor start".  That would let you see if an arbitrarily long 
delay of it helps.  Surely you can recover if mdmonitor is needed for 
later parts of the boot, if only by 'chkconfig mdmonitor on' and reboot.



Steven Yellin

On Tue, 8 Mar 2011, Chuck Munro wrote:


Hello folks,

This is my first adventure with SL after many years of using CentOS. I'm 
using SL-6 on a large-ish VM server, and have been quite happy with it.


I am experiencing a weird problem at bootup with large RAID-6 arrays. After 
Googling around (a lot) I find that others are having the same issues with 
CentOS/RHEL/Ubuntu/whatever.  In my case it's Scientific Linux-6 which should 
behave the same way as CentOS-6.  I had the same problem with the RHEL-6 
evaluation version.  I'm posting this question to the CentOS mailing list as 
well.


For some reason, each time I boot the server a random number of RAID arrays 
will come up with the hot-spare missing.  This occurs with hot-spare 
components only, never with the active components.  Once in a while I'm lucky 
enough to have all components come up correctly when the system boots.  Which 
hot spares fail to be configured is completely random.


I have 12 2TB drives, each divided into 4 primary partitions, and configured 
as 8 partitionable MD arrays.  All drives are partitioned exactly the same 
way.  Each R6 array consists of 5 components (partitions) plus a hot-spare. 
The small RAID-1 host OS array never has a problem with its hot spare.


The predominant theory via Google is that there's a race condition at boot 
time between full enumeration of all disk partitions and mdadm assembling the 
arrays.


Does anyone know of a way to have mdadm delay its assembly until all 
partitions are enumerated?  Even if it's simply to insert a several-second 
wait time, that would probably work.  My knowledge of the internal workings 
of the boot process isn't good enough to know where to look.


I tried to issue 'mdadm -A -s /dev/md/md_dXX' after booting, but all it does 
is complain about "No suitable drives found for /dev."


Here is the mdadm.conf file:
-

MAILADDR root
PROGRAM /root/bin/record_md_events.sh

DEVICE partitions
##DEVICE /dev/sd*   << this didn't help.
AUTO +imsm +1.x -all

## Host OS root arrays:
ARRAY /dev/md0
  metadata=1.0 num-devices=2 spares=1
  UUID=75941adb:33e8fa6a:095a70fd:6fe72c69
ARRAY /dev/md1
  metadata=1.1 num-devices=2 spares=1
  UUID=7a96d82d:bd6480a2:7433f1c2:947b84e9
ARRAY /dev/md2
  metadata=1.1 num-devices=2 spares=1
  UUID=ffc6070d:e57a675e:a1624e53:b88479d0

## Partitionable arrays on LSI controller:
ARRAY /dev/md/md_d10
  metadata=1.2 num-devices=5 spares=1
  UUID=135f0072:90551266:5d9a126a:011e3471
ARRAY /dev/md/md_d11
  metadata=1.2 num-devices=5 spares=1
  UUID=59e05755:5b3ec51e:e3002cfd:f0720c38
ARRAY /dev/md/md_d12
  metadata=1.2 num-devices=5 spares=1
  UUID=7916eb13:cd5063ba:a1404cd7:3b65a438
ARRAY /dev/md/md_d13
  metadata=1.2 num-devices=5 spares=1
  UUID=9a767e04:e4e56a9d:c369d25c:9d333760

## Partitionable arrays on Tempo controllers:
ARRAY /dev/md/md_d20
  metadata=1.2 num-devices=5 spares=1
  UUID=1d5a3c32:eb9374ac:eff41754:f8a176c1
ARRAY /dev/md/md_d21
  metadata=1.2 num-devices=5 spares=1
  UUID=38ffe8c9:f3922db9:60bb1522:80fea016
ARRAY /dev/md/md_d22
  metadata=1.2 num-devices=5 spares=1
  UUID=ebb4ea67:b31b2105:498d81af:9b4f45d3
ARRAY /dev/md/md_d23
  metadata=1.2 num-devices=5 spares=1
  UUID=da07407f:deeb8906:7a70ae82:6b1d8c4a

-

Your suggestions are most welcome ... thanks.

Chuck



Re: gdm and PreSession/Default

2011-03-10 Thread Ken Teh

I solved this problem.  For those of you who are required to put up the DOE 
warning banner or any other sort of disclaimer that can abort a GUI login, the 
solution is to put the attached file in /etc/X11/xinit/xinitrc.d.  The scripts 
in xinitrc.d are run from xinitrc-common.  The distro's first script is 
00-start-message-bus.sh.  Choose a name so that the DOE warning script is run 
before it.  There is no point starting up dbus if the user chooses to decline 
the disclaimer and abort the login.
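
The attached script itself isn't reproduced here, but a hypothetical sketch of
the idea, assuming the xinitrc.d files are sourced so that a non-zero exit
aborts the GUI login, would be something like:

  #!/bin/sh
  # Hypothetical reconstruction, not the actual attachment:
  # ask the user to accept the banner; abort the session if they decline.
  zenity --question --title="Notice to Users" \
         --text="This is a DOE computer system.  Do you accept the terms of use?" \
      || exit 1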



On 03/08/2011 12:41 PM, Ken Teh wrote:

SL6's gdm does not honor the exit code from /etc/gdm/PreSession/Default. In 
SL5x, it used to abort the login if the script returned a non-zero exit code. I 
used this feature to put up a zenity dialog which the user had to click yes in 
order to continue logging in. Now, the user logs in regardless of the script.

In fact, the distro's Default script is empty. It used to contain the xsetroot 
and sessreg bits.

Any workarounds?




No success installing ATI Radeon HD5970 driver

2011-03-10 Thread Wil Irwin
Hi-


I have tried multiple times to install the driver using the "GUI" installer
and the subsequent steps. Installation appears to proceed and I can finish
with "aticonfig --initial". However, the driver doesn't appear to be
applied. Scrolling down any webpage or document is very constipated, and
dragging windows across the screen is also extremely constipated. In
addition, the GUI for the Catalyst Control Panel will allow resolution
changes, etc., but they are not applied after a reboot. I have also tried the
command-prompt based install, with exactly the same results. I'm using the
11.2 driver released on 02/15/2011. I should also note that the same problem
(or at least a similar one) happened with SL5 and Ubuntu 10.x.

The errors from fgl_glxgears, fglrxinfo, and glxinfo, the output of uname -r,
and the xorg.conf file are listed below. I am running SL6 with all updates and
packages installed.

Any suggestions would be VERY MUCH appreciated.

Thanks,

Wil

---

[root@Cluster1 ~]# fglrxinfo

X Error of failed request:  BadRequest (invalid request code or no such
operation)

  Major opcode of failed request:  136 (GLX)

  Minor opcode of failed request:  19 (X_GLXQueryServerString)

  Serial number of failed request:  18

  Current serial number in output stream:  18


[root@Cluster1 ~]# glxinfo

name of display: :0.0

X Error of failed request:  BadRequest (invalid request code or no such
operation)

  Major opcode of failed request:  136 (GLX)

  Minor opcode of failed request:  19 (X_GLXQueryServerString)

  Serial number of failed request:  18

  Current serial number in output stream:  18


[root@Cluster1 ~]# fgl_glxgears

Using GLX_SGIX_pbuffer

X Error of failed request:  BadRequest (invalid request code or no such
operation)

  Major opcode of failed request:  136 (GLX)

  Minor opcode of failed request:  19 (X_GLXQueryServerString)

  Serial number of failed request:  18

  Current serial number in output stream:  18



[root@Cluster1 X11]# uname -r

2.6.32-71.18.1.el6.x86_64



[root@Cluster1 X11]# vi xorg.conf.fglrx-12

Section "ServerLayout"

Identifier "aticonfig Layout"

Screen  0  "aticonfig-Screen[0]-0" 0 0

EndSection



Section "Module"

EndSection



Section "Monitor"

Identifier   "aticonfig-Monitor[0]-0"

Option  "VendorName" "ATI Proprietary Driver"

Option  "ModelName" "Generic Autodetecting Monitor"

Option  "DPMS" "true"

EndSection



Section "Monitor"

Identifier   "0-DFP3"

Option  "VendorName" "ATI Proprietary Driver"

Option  "ModelName" "Generic Autodetecting Monitor"

Option  "DPMS" "true"

Option  "PreferredMode" "1920x1080"

Option  "TargetRefresh" "60"

Option  "Position" "0 0"

Option  "Rotate" "normal"

Option  "Disable" "false"

EndSection



Section "Device"

Identifier  "Videocard0"

Driver  "vesa"

EndSection



Section "Device"

Identifier  "aticonfig-Device[0]-0"

Driver  "fglrx"

Option  "Monitor-DFP3" "0-DFP3"


Re: nfs kickstart scripts

2011-03-10 Thread Connie Sieh

We will research this issue.

-Connie Sieh

On Wed, 9 Mar 2011, Ken Teh wrote:


Ok, I got it.  You were right.  The tab option did the trick.  The ks spec

ks=nfs::/path/to/kickstart

needs to be added to the kernel boot options.  It says so in the TUV docs, but 
it also says to type this in at the boot prompt.  Except, of course, there is 
no boot prompt, unlike in SL4x and SL5x.
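
On SL6 the equivalent is to hit <Tab> at the boot menu and append the option to
the kernel line, ending up with something like this (<server> is a placeholder,
and the rest of the line is whatever the menu entry already contains):

  vmlinuz initrd=initrd.img ks=nfs:<server>:/path/to/kickstart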



On 03/09/2011 06:02 PM, Bluejay Adametz wrote:

A little more detail:  The iso images for SL4 and SL5 used to stop with the
boot: prompt after loading the kernel. I would then specify

boot: linux ks=nfs::/path/to/kickstart


Have you tried hitting <Tab> when the boot screen comes up? That seems
to allow modifying the boot options and may do what you want.

There is a note to this effect on the boot screen, but it's not real
obvious. I missed it for a while.

  - Bluejay Adametz, CFII, A&P, AA-5B N45210

Be careful what you teach. You might have to learn it one day.
-Tunnell's Terse Transmogrification of
 Fido Fisher's Fortuitious Formulary





Re: what's the order of upstart processing in SL 6?

2011-03-10 Thread Robert P. J. Day
On Thu, 10 Mar 2011, Troy Dawson wrote:

> I haven't found precisely what you are looking for, but I have found
> the correct man pages, or at least some more man pages.
>
> Looking at the documentation here
> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Technical_Notes/deployment.html
> It says
> "Processes are known to Upstart as jobs and are defined by files in the
> /etc/init directory. Upstart is very well documented via man pages. Command
> overview is in init(8) and job syntax is described in init(5). "
>
> looking at
>   man 5 init
>   man 8 init
> It describes Upstart.
>
> As I said, it doesn't completely answer your question, but hopefully
> it points you in the right direction.

  i've been through those pages and i still haven't found an
unambiguous description of what happens when there is no
/etc/init/rc-sysinit.conf.  barring that, the best *guess* i can come
up with is that, if that particular file is missing, then upstart will
simply scan all of the /etc/init/*.conf files and start anything with
no dependencies.  or more technically, anything with the dependency:

  start on startup

which includes rcS.conf and a couple readahead files.  beyond that,
i'm not sure but i'll keep reading.
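
one quick way to check that guess is to see which job files actually declare
that dependency:

  grep -l 'start on startup' /etc/init/*.conf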

rday

-- 


Robert P. J. Day   Waterloo, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: a few questions about SL admin best practises

2011-03-10 Thread Robert P. J. Day
On Thu, 10 Mar 2011, Troy Dawson wrote:

> If these are beginning linux admins who don't know about partitions,
> or haven't done linux partitioning, then you shouldn't do LVM first.
> You should teach them about partitions, and the general layout of
> Linux partitions. Your general windows admin isn't going to know
> about /boot or swap partitions. Your general unix admin will know
> how his version of unix handles partitioning, and will appreciate
> knowing what partitions linux should have.  And if they aren't an
> admin, then they aren't going to know about partitions at all.
>
> If this is a bunch of Debian admins wanting to know RedHat, then go
> straight to LVM.

  i've decided i can combine the best of both worlds.  given that
/boot is still allocated as a regular primary partition, i can use
that to talk about partitions, while still using LVM for the remainder
of the disk layout.  i think that will solve the problem.
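
for the course, a kickstart-style sketch of that layout might look like this
(sizes and names are arbitrary examples):

  part /boot --fstype=ext4 --size=500
  part pv.01 --size=1 --grow
  volgroup vg0 pv.01
  logvol /    --vgname=vg0 --name=root --size=1 --grow
  logvol swap --vgname=vg0 --name=swap --recommended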

rday

-- 


Robert P. J. Day   Waterloo, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: a few questions about SL admin best practises

2011-03-10 Thread Adam Miller
On Thu, Mar 10, 2011 at 05:35:30AM -0500, Robert P. J. Day wrote:
>   hoping this isn't egregiously off-topic but i'm teaching a RH sys
> admin course next week and i'm using SL 6.0 as the vehicle.  i'm being
> handed the courseware to use and i'm pondering which parts are really
> out of date so that i can skip them or replace them with newer
> material on the fly.
> 
>   first question -- is there any sane reason not to use LVM these
> days?  the manual opens (predictably) with showing the student how to
> allocate fixed partitions during the install, and leaves LVM setup for
> later in the week as an "advanced" topic.  i see it the other way
> around -- LVM should be the norm nowadays.
> 
>   thoughts?  i'll always allocate /boot as a regular partition but
> unless there are compelling reasons not to, i always recommend LVM as
> the standard.
> 
> rday

First Answer:
There is currently no reason not to use LVM for everything other than
/boot, which should be referenced by UUID. 
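
For example, an /etc/fstab entry for /boot referenced by UUID (the UUID value
here is made up):

  UUID=6f2a1b3c-0d4e-4f5a-9b6c-7d8e9f0a1b2c  /boot  ext4  defaults  1 2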

Thoughts:
LVM is the de facto standard as it stands today in RHEL6, outlined as base
knowledge expected of a Red Hat Certified Systems Administrator as well as a
Red Hat Certified Systems Engineer (the RHCSA is a prereq to the RHCE ... so
it's somewhat redundant to list both, but I did so for clarity).

http://www.redhat.com/certification/rhcsa/objectives/
https://www.redhat.com/certification/rhce/objectives/

I think you're headed in the right direction.

-AdamM


Re: a few questions about SL admin best practises

2011-03-10 Thread Troy Dawson

On 03/10/2011 04:35 AM, Robert P. J. Day wrote:

   hoping this isn't egregiously off-topic but i'm teaching a RH sys
admin course next week and i'm using SL 6.0 as the vehicle.  i'm being
handed the courseware to use and i'm pondering which parts are really
out of date so that i can skip them or replace them with newer
material on the fly.

   first question -- is there any sane reason not to use LVM these
days?  the manual opens (predictably) with showing the student how to
allocate fixed partitions during the install, and leaves LVM setup for
later in the week as an "advanced" topic.  i see it the other way
around -- LVM should be the norm nowadays.

   thoughts?  i'll always allocate /boot as a regular partition but
unless there are compelling reasons not to, i always recommend LVM as
the standard.

rday



My two cents.  These are only opinions.

It depends on the level of the course.

If these are beginning linux admins who don't know about partitions, or 
haven't done linux partitioning, then you shouldn't do LVM first.  You 
should teach them about partitions, and the general layout of Linux 
partitions.
Your general windows admin isn't going to know about /boot or swap 
partitions.  Your general unix admin will know how his version of 
unix handles partitioning, and will appreciate knowing what partitions linux 
should have.  And if they aren't an admin, then they aren't going to 
know about partitions at all.


If this is a bunch of Debian admins wanting to know RedHat, then go 
straight to LVM.


Again, my opinion.
Troy
--
__
Troy Dawson  daw...@fnal.gov  (630)840-6468
Fermilab  ComputingDivision/SCF/FEF/SLSMS Group
__


Re: what's the order of upstart processing in SL 6?

2011-03-10 Thread Troy Dawson

On 03/10/2011 07:31 AM, Robert P. J. Day wrote:

   finally taking the time to dig into upstart, and i'm confused by one
issue.  on my ubuntu system, the documentation "man 7 startup" claims
that the primary task on startup is /etc/init/rc-sysinit.conf, which
exists on my ubuntu system.  but on SL 6, while the man page reads the
same, there is no such .conf file in /etc/init, so what is the boot
sequence?

   i do see the file /etc/init/rc.conf so i'm sure i can work through
the sequence but since i want to explain this to some students next
week, i'd like to find the appropriate doc/man page that explains
*precisely* what happens.  so what's the process when
/etc/init/rc-sysinit.conf doesn't exist?  and what man page explains
that?  thanks.

rday



I haven't found precisely what you are looking for, but I have found the 
correct man pages, or at least some more man pages.


Looking at the documentation here
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Technical_Notes/deployment.html
It says
"Processes are known to Upstart as jobs and are defined by files in the 
/etc/init directory. Upstart is very well documented via man pages. 
Command overview is in init(8) and job syntax is described in init(5). "


looking at
  man 5 init
  man 8 init
It describes Upstart.

As I said, it doesn't completely answer your question, but hopefully it 
points you in the right direction.


Troy
--
__
Troy Dawson  daw...@fnal.gov  (630)840-6468
Fermilab  ComputingDivision/SCF/FEF/SLSMS Group
__


Re: a few questions about SL admin best practises

2011-03-10 Thread Brett Serkez
This is a good point; I always do manual partitioning.  For instance, I see
no point in a separate /boot or swap partition, and I only use software RAID,
since hard drives fail and swapping a drive only requires a simple re-sync.

Perhaps I've gotten very lucky, but resizing or moving a partition has never
been an issue.  I deal with small businesses, and the only issues I occasionally
run into are MySQL maintenance related, plus an occasional re-indexing of an
ext3 file system.  The servers I've built with this methodology are
extremely stable; most are in Xen, and any upgrade is: copy the existing stable
VM, update and test, then swap to the new stable VM.

My view might be too biased; I am a minimalist and install all my systems at
the barest minimum possible, and tend to run in a very small memory and disk
footprint.  I am sure that LVM is mature and, thinking about it more, it
probably should be covered in a class, but I'd only start using it when I
see a need, as with any other technology.

Just one point of view for your consideration,

Brett

On Thu, Mar 10, 2011 at 8:26 AM, Robert P. J. Day wrote:

>
>   as i mentioned, one of the primary reasons i'm going with LVM from
> the start of the course is that it's the default layout with SL 6 --
> you have to explicitly choose *not* to use it.  and since i consider
> LVM to be a stable and mature technology, i don't see any compelling
> reason to avoid it.
>
> rday
>
> --
>
> 
> Robert P. J. Day   Waterloo, Ontario, CANADA
>http://crashcourse.ca
>
> Twitter:   http://twitter.com/rpjday
> LinkedIn:   http://ca.linkedin.com/in/rpjday
> 
>


Re: problem with hald

2011-03-10 Thread Charles G Waldman
 > On 09/03/11 15:57, Steven Timm wrote:
 > > any reason you can't just turn hald off?  Most servers don't need it.
 > >
 > > Steve Timm

Furthermore note that HAL is deprecated and scheduled for removal ...
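
If turning it off is the route you take, on SL5 that would be something like
(assuming the init script is named haldaemon, as on stock EL5):

  service haldaemon stop
  chkconfig haldaemon off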

 - Charles


Re: what's the order of upstart processing in SL 6?

2011-03-10 Thread Robert P. J. Day
On Thu, 10 Mar 2011, Robert P. J. Day wrote:

>   finally taking the time to dig into upstart, and i'm confused by
> one issue.  on my ubuntu system, the documentation "man 7 startup"
> claims that the primary task on startup is
> /etc/init/rc-sysinit.conf, which exists on my ubuntu system.  but on
> SL 6, while the man page reads the same, there is no such .conf file
> in /etc/init, so what is the boot sequence?
>
>   i do see the file /etc/init/rc.conf so i'm sure i can work through
> the sequence but since i want to explain this to some students next
> week, i'd like to find the appropriate doc/man page that explains
> *precisely* what happens.  so what's the process when
> /etc/init/rc-sysinit.conf doesn't exist?  and what man page explains
> that?  thanks.

  oh, wait, i just noticed this in /etc/inittab (whose only active
line is to define the default runlevel of 5):

# System initialization is started by /etc/init/rcS.conf

but, again, what man page or doc actually *states* that?  i'd like to
present to the students the sequence of man pages that walks them
through the phases of upstart, but i'm still not seeing what explains
that one step.  thanks.

rday


-- 


Robert P. J. Day   Waterloo, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



what's the order of upstart processing in SL 6?

2011-03-10 Thread Robert P. J. Day
  finally taking the time to dig into upstart, and i'm confused by one
issue.  on my ubuntu system, the documentation "man 7 startup" claims
that the primary task on startup is /etc/init/rc-sysinit.conf, which
exists on my ubuntu system.  but on SL 6, while the man page reads the
same, there is no such .conf file in /etc/init, so what is the boot
sequence?

  i do see the file /etc/init/rc.conf so i'm sure i can work through
the sequence but since i want to explain this to some students next
week, i'd like to find the appropriate doc/man page that explains
*precisely* what happens.  so what's the process when
/etc/init/rc-sysinit.conf doesn't exist?  and what man page explains
that?  thanks.

rday

-- 


Robert P. J. Day   Waterloo, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: a few questions about SL admin best practises

2011-03-10 Thread Robert P. J. Day
On Thu, 10 Mar 2011, Michael Tiernan wrote:

> On 3/10/11 6:42 AM, Robert P. J. Day wrote:
> > i'm fascinated that you use mirroring RAID as a matter of course,
> > but consider LVM to be an advanced topic.
> >
> I too find this interesting. One of the things I love about SA is if
> you ask one question of three SAs you'll get nine answers no two of
> which are alike. :)
>
> I view LVM on the same level as basic system security. If you grok
> the core workings of security, then LVM's not too difficult for you.
>
> Let me say up front, I don't question Mr Serkez's business choices; he
> knows better than I do in that regard.
>
> Regarding LVM in general, in *my* opinion, if the system's running
> hardware RAID controllers, LVM is a good thing to use to allow the
> flexibility of moving partitions around and changing sizes
> dynamically. It's one more thing that makes our lives easier.

  as i mentioned, one of the primary reasons i'm going with LVM from
the start of the course is that it's the default layout with SL 6 --
you have to explicitly choose *not* to use it.  and since i consider
LVM to be a stable and mature technology, i don't see any compelling
reason to avoid it.

rday

-- 


Robert P. J. Day   Waterloo, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: a few questions about SL admin best practises

2011-03-10 Thread Michael Tiernan

On 3/10/11 6:42 AM, Robert P. J. Day wrote:

i'm fascinated that you use mirroring RAID as a matter of course,
but consider LVM to be an advanced topic.
   
I too find this interesting. One of the things I love about SA is if you 
ask one question of three SAs you'll get nine answers no two of which 
are alike. :)


I view LVM on the same level as basic system security. If you grok the 
core workings of security, then LVM's not too difficult for you.


Let me say up front, I don't question Mr Serkez's business choices; he 
knows better than I do in that regard.


Regarding LVM in general, in *my* opinion, if the system's running 
hardware RAID controllers, LVM is a good thing to use to allow the 
flexibility of moving partitions around and changing sizes dynamically. 
It's one more thing that makes our lives easier.
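
As a concrete example of that flexibility (volume group and LV names here are
hypothetical), growing a logical volume online is a two-liner:

  lvextend -L +20G /dev/vg0/home    # grow the logical volume
  resize2fs /dev/vg0/home           # grow the ext3/ext4 filesystem to match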


Again, *my* opinion.

--
  <<  MCT>>Michael C Tiernan   Charter member of lopsa.org/
  MIT - Laboratory for Nuclear Science - http://www.lns.mit.edu
  High Perf Research Computing Facility at The Bates Linear Accelerator
  xmpp:mtier...@mit.edu   skype:mtiernan-mitcms   +1 (617) 324-9173


Re: a few questions about SL admin best practises

2011-03-10 Thread Robert P. J. Day
On Thu, 10 Mar 2011, Brett Serkez wrote:

> I have been using CentOS in small businesses since 2003 and never
> once found a reason to use LVM.   I've been using mirroring RAID,
> which has been very useful.   I tend to use the less is more adage,
> keeping my servers simple and lean and use XEN to break out
> functionality. This strategy has proven to be extremely reliable and
> easy to maintain, and I can focus the majority of my energy on managing
> WinDoze.
>
> Having taught Linux and Linux administration myself, I see LVM as an
> advanced topic.  Perhaps in larger organizations the perspective is
> different.

  i'm fascinated that you use mirroring RAID as a matter of course,
but consider LVM to be an advanced topic.  in any event, part of the
reason i would consider LVM standard these days is that if you take
the default install configuration with SL6, you get LVM automatically.
that's just part of my thinking.

rday

--


Robert P. J. Day   Waterloo, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: a few questions about SL admin best practises

2011-03-10 Thread Brett Serkez
I have been using CentOS in small businesses since 2003 and never once found
a reason to use LVM.   I've been using mirroring RAID, which has been very
useful.   I tend to use the "less is more" adage, keeping my servers simple
and lean, and use XEN to break out functionality. This strategy has proven to
be extremely reliable and easy to maintain, and I can focus the majority of my
energy on managing WinDoze.

Having taught Linux and Linux administration myself, I see LVM as an
advanced topic.  Perhaps in larger organizations the perspective is
different.

Brett

On Thu, Mar 10, 2011 at 5:35 AM, Robert P. J. Day wrote:

>  hoping this isn't egregiously off-topic but i'm teaching a RH sys
> admin course next week and i'm using SL 6.0 as the vehicle.  i'm being
> handed the courseware to use and i'm pondering which parts are really
> out of date so that i can skip them or replace them with newer
> material on the fly.
>
>  first question -- is there any sane reason not to use LVM these
> days?  the manual opens (predictably) with showing the student how to
> allocate fixed partitions during the install, and leaves LVM setup for
> later in the week as an "advanced" topic.  i see it the other way
> around -- LVM should be the norm nowadays.
>
>  thoughts?  i'll always allocate /boot as a regular partition but
> unless there are compelling reasons not to, i always recommend LVM as
> the standard.
>
> rday
>
> --
>
> 
> Robert P. J. Day   Waterloo, Ontario, CANADA
>http://crashcourse.ca
>
> Twitter:   http://twitter.com/rpjday
> LinkedIn:   http://ca.linkedin.com/in/rpjday
> 
>


a few questions about SL admin best practises

2011-03-10 Thread Robert P. J. Day
  hoping this isn't egregiously off-topic but i'm teaching a RH sys
admin course next week and i'm using SL 6.0 as the vehicle.  i'm being
handed the courseware to use and i'm pondering which parts are really
out of date so that i can skip them or replace them with newer
material on the fly.

  first question -- is there any sane reason not to use LVM these
days?  the manual opens (predictably) with showing the student how to
allocate fixed partitions during the install, and leaves LVM setup for
later in the week as an "advanced" topic.  i see it the other way
around -- LVM should be the norm nowadays.

  thoughts?  i'll always allocate /boot as a regular partition but
unless there are compelling reasons not to, i always recommend LVM as
the standard.

rday

-- 


Robert P. J. Day   Waterloo, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: problem with hald

2011-03-10 Thread Faye Gibbins

Hi,

 Yep, that's what I'm leaning towards.

Faye

On 09/03/11 15:57, Steven Timm wrote:

any reason you can't just turn hald off?  Most servers don't need it.

Steve Timm


On Wed, 9 Mar 2011, Faye Gibbins wrote:


Hi,

I've got one very special machine that has about 15,000 - 20,000
automount entries.

Things are fine until about 5000 mount points are mounted up then any
further adding (in a tight bash loop) slows dramatically and hald
starts taking more and more CPU.
Memory usage is well below physical RAM; there is no swapping.

Automount mounting slows to a few seconds per mount, the system
load goes up, and the machine slowly grinds to a halt.

Running hald in the foreground in verbose mode, I see lots and lots of
messages of this type:

12:46:01.254 [I] osspec.c:256: /proc/mounts tells, that the mount has
tree changed

(process:10306): GLib-CRITICAL **: g_hash_table_lookup_extended:
assertion `hash_table != NULL' failed

(process:10306): GLib-CRITICAL **: g_hash_table_lookup_extended:
assertion `hash_table != NULL' failed

(process:10306): GLib-CRITICAL **: g_hash_table_lookup_extended:
assertion `hash_table != NULL' failed

If anyone can shed some light on this it would make me very happy.

It feels like watching a search routine slow down as n increases; is
hald or dbus using a flat directory somewhere as a hash table?

Machine details:
SL5.5
2.6.18-194.32.1.el5

Faye








--
-
Faye Gibbins, Sys Admin.  GeoS KB.  Linux, Unix, Security
Beekeeper  - The Apiary Project, KB -   www.bees.ed.ac.uk
-
 (x(x_(X_x(O_o)x_x)_X)x)
  I grabbed at spannungsbogen before I knew I wanted it.
  Socrates: Question authority, question everything.
  Mermin:   If the maths works "Shut up and calculate!"

The University of Edinburgh is a charitable body,
registered in Scotland, with registration number SC005336.