Re: Race condition with mdadm at boot [still mystifying]
On Thu, Mar 10, 2011 at 20:24, Chuck Munro wrote:
> This is a bit long-winded, but I wanted to share some info
>
> Regarding my earlier message about a possible race condition with mdadm, I
> have been doing all sorts of poking around with the boot process. Thanks to
> a tip from Steven Yellin at Stanford, I found where to add a delay in the
> rc.sysinit script, which invokes mdadm to assemble the arrays.
>
> Unfortunately, it didn't help, so it likely wasn't a race condition after all.
>
> However, on close examination of dmesg, I found something very interesting.
> There were missing 'bind' statements for one or the other hot spare
> drive (or sometimes both). These drives are connected to the last PHYs in
> each SATA controller ... in other words they are the last devices probed by
> the driver for a particular controller. It would appear that the drivers
> are bailing out before managing to enumerate all of the partitions on the
> last drive in a group, and missing partitions occur quite randomly.

OK, this sounds similar to another problem set I heard about last week. You need to make sure the drives in the array are "raid compatible" these days. Various green drives can take far too long to spin up, or go to sleep so quickly that they get marked as bad by dmraid before they are ready.

However, if it's not that, then the next issues to check tend to be cable related:

1) The cable isn't rated for the length. Sure, you can buy a 2-foot SATA cable, but the controller's timing assumptions may expect something much shorter.
2) The cable isn't rated for the drive capacities.
3) Other BIOS issues that require updates and playing around (oh wait, the default is to spin everything down, but I need it up).

> So it may or may not be a timing issue between the WD Caviar Black drives
> and both the LSI and Marvell SAS/SATA controller chips.
>
> So, I replaced the two drives (SATA-300) with two faster drives (SATA-600)
> on the off chance they might respond fast enough before the drivers move on
> to other duties. That didn't help either.
>
> Each group of arrays uses unrelated drivers (mptsas and sata_mv) but both
> exhibit the same problem, so I'm mystified as to where the real issue lies.
> Anyone care to offer suggestions?
>
> Chuck

--
Stephen J Smoogen.
"The core skill of innovators is error recovery, not failure avoidance." Randy Nelson, President of Pixar University.
"Let us be kind, one to another, for most of us are fighting a hard battle." -- Ian MacLaren
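If the "raid compatible" angle is the culprit, one concrete thing to check is SCT error recovery control (the mechanism behind what WD markets as TLER); desktop and green drives often lack it or ship with very long error-recovery timeouts. A hedged example with smartmontools -- the device name is a placeholder and not every drive supports the command at all:

    smartctl -l scterc /dev/sda          # report current read/write ERC timeouts, if supported
    smartctl -l scterc,70,70 /dev/sda    # set both timeouts to 7.0 seconds (units are 100 ms)

Drives that don't support SCT ERC generally need either a different model or more generous timeouts on the controller/driver side.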
Race condition with mdadm at boot [still mystifying]
This is a bit long-winded, but I wanted to share some info.

Regarding my earlier message about a possible race condition with mdadm, I have been doing all sorts of poking around with the boot process. Thanks to a tip from Steven Yellin at Stanford, I found where to add a delay in the rc.sysinit script, which invokes mdadm to assemble the arrays.

Unfortunately, it didn't help, so it likely wasn't a race condition after all.

However, on close examination of dmesg, I found something very interesting. There were missing 'bind' statements for one or the other hot spare drive (or sometimes both). These drives are connected to the last PHYs in each SATA controller ... in other words they are the last devices probed by the driver for a particular controller. It would appear that the drivers are bailing out before managing to enumerate all of the partitions on the last drive in a group, and missing partitions occur quite randomly.

So it may or may not be a timing issue between the WD Caviar Black drives and both the LSI and Marvell SAS/SATA controller chips.

So, I replaced the two drives (SATA-300) with two faster drives (SATA-600) on the off chance they might respond fast enough before the drivers move on to other duties. That didn't help either.

Each group of arrays uses unrelated drivers (mptsas and sata_mv) but both exhibit the same problem, so I'm mystified as to where the real issue lies. Anyone care to offer suggestions?

Chuck
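For anyone comparing notes, the quickest way to see which components md actually grabbed on a given boot is to look for the kernel's 'md: bind' lines and compare them with what the kernel enumerated (array and device names below are just examples from this thread):

    dmesg | grep 'md: bind'     # one line per component md picked up, e.g. md: bind<sdl4>
    cat /proc/partitions        # did all 4 partitions on every drive get enumerated?
    cat /proc/mdstat            # which arrays are missing their spare this boot

If a partition never appears in /proc/partitions, the problem is below mdadm (driver/controller probing); if it appears but is never bound, the problem is in array assembly.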
Re[2]: Race condition with mdadm at bootup?
Well, I tried adding a 5-second sleep to the mdadm startup in the sysinit script, and 10 seconds in the mdmonitor script, but it made no difference. I still got the spare partitions not included in two of the arrays. What I find curious is that it's always the hot spares, never the active components.

The "No suitable drives" thing is a mystery, since all drives work for the other arrays, but I get things like:

    # mdadm -A -s
    mdadm: No suitable drives found for /dev/md/md_d23
    mdadm: No suitable drives found for /dev/md/md_d27

when I issue the command manually after the system is up. d23 and d27 are the random arrays with missing spares this time around. Next time I boot it'll be different arrays.

Time to put on my thinking cap :-)

Chuck

On 03/10/2011 10:53 AM, Steven J. Yellin wrote:

Maybe I missed it, but I didn't see any response to your request "Does anyone know of a way to have mdadm delay its assembly until all partitions are enumerated? Even if it's simply to insert a several-second wait time, that would probably work. My knowledge of the internal workings of the boot process isn't good enough to know where to look." I thought partition enumeration was done before init was started, but I don't know much about such matters. That's why I was waiting for someone else to reply. Anyway, here's what information I can contribute:

In /etc/rc.d/rc.sysinit is a line with "/sbin/mdadm -A -s", before which you can insert a delay. And /etc/rc.d/init.d/mdmonitor contains "#chkconfig: 2345 15 85", where the relatively low number "15" means when chkconfig sets mdmonitor to start during boot, chkconfig will make a symbolic link to mdmonitor named "S15mdmonitor", causing mdmonitor to start relatively early. If you 'chkconfig mdmonitor off', it will not be started at all during the boot, and you can do it later by hand with "service mdmonitor start". That would let you see if an arbitrarily long delay of it helps. Surely you can recover if mdmonitor is needed for later parts of the boot, if only by 'chkconfig mdmonitor on' and reboot.

Steven Yellin

On Tue, 8 Mar 2011, Chuck Munro wrote:

Hello folks,

This is my first adventure with SL after many years of using CentOS. I'm using SL-6 on a large-ish VM server, and have been quite happy with it. I am experiencing a weird problem at bootup with large RAID-6 arrays. After Googling around (a lot) I find that others are having the same issues with CentOS/RHEL/Ubuntu/whatever. In my case it's Scientific Linux-6 which should behave the same way as CentOS-6. I had the same problem with the RHEL-6 evaluation version. I'm posting this question to the CentOS mailing list as well.

For some reason, each time I boot the server a random number of RAID arrays will come up with the hot-spare missing. This occurs with hot-spare components only, never with the active components. Once in a while I'm lucky enough to have all components come up correctly when the system boots. Which hot spares fail to be configured is completely random.

I have 12 2TB drives, each divided into 4 primary partitions, and configured as 8 partitionable MD arrays. All drives are partitioned exactly the same way. Each R6 array consists of 5 components (partitions) plus a hot-spare. The small RAID-1 host OS array never has a problem with its hot spare.

The predominant theory via Google is that there's a race condition at boot time between full enumeration of all disk partitions and mdadm assembling the arrays. Does anyone know of a way to have mdadm delay its assembly until all partitions are enumerated? Even if it's simply to insert a several-second wait time, that would probably work. My knowledge of the internal workings of the boot process isn't good enough to know where to look.

I tried to issue 'mdadm -A -s /dev/md/md_dXX' after booting, but all it does is complain about "No suitable drives found for /dev." Here is the mdadm.conf file:

-
MAILADDR root
PROGRAM /root/bin/record_md_events.sh
DEVICE partitions
##DEVICE /dev/sd*   << this didn't help.
AUTO +imsm +1.x -all

## Host OS root arrays:
ARRAY /dev/md0 metadata=1.0 num-devices=2 spares=1 UUID=75941adb:33e8fa6a:095a70fd:6fe72c69
ARRAY /dev/md1 metadata=1.1 num-devices=2 spares=1 UUID=7a96d82d:bd6480a2:7433f1c2:947b84e9
ARRAY /dev/md2 metadata=1.1 num-devices=2 spares=1 UUID=ffc6070d:e57a675e:a1624e53:b88479d0

## Partitionable arrays on LSI controller:
ARRAY /dev/md/md_d10 metadata=1.2 num-devices=5 spares=1 UUID=135f0072:90551266:5d9a126a:011e3471
ARRAY /dev/md/md_d11 metadata=1.2 num-devices=5 spares=1 UUID=59e05755:5b3ec51e:e3002cfd:f0720c38
ARRAY /dev/md/md_d12 metadata=1.2 num-devices=5 spares=1 UUID=7916eb13:cd5063ba:a1404cd7:3b65a438
ARRAY /dev/md/md_d13 metadata=1.2 num-devices=5 spares=1 UUID=9a767e04:e4e56a9d:c369d25c:9d333760

## Partitionable arrays on Tempo controllers:
ARRAY /dev/md/md_d20 metadata=1.2 num-devices=5 spares=1 UUID=1d5a3c32:eb9374ac:eff41754:f8a17
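For the archives, here is a minimal sketch of the two experiments discussed in this thread (the 15-second value is arbitrary, and the exact spot in rc.sysinit may differ between builds):

    # In /etc/rc.d/rc.sysinit, immediately before the existing "/sbin/mdadm -A -s" line:
    sleep 15

    # Take mdmonitor out of the boot sequence and start it by hand once the system is up:
    chkconfig mdmonitor off
    service mdmonitor start     # run manually after boot
    chkconfig mdmonitor on      # restore the original behaviour later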
Re: No success installing ATI Radeon HD5970 driver
On 10/03/11 17:28, Wil Irwin wrote:

Hi- I have tried multiple times to install the driver using the "GUI" installer and the subsequent steps. Installation appears to proceed and I can finish with "aticonfig --initial". However, the driver doesn't appear to be applied. Scrolling down any webpage or document is very constipated, and dragging windows across the screen is also extremely constipated. In addition the GUI for Catalyst Control Panel will allow resolution, etc. changes, but they are not applied after a re-boot. I have also tried the command-prompt based install, with exactly the same results. I'm using the 11.2 driver released on 02/15/2011. I should also note the same problem (or at least similar) happened with SL5 and Ubuntu 10.x. The errors shown for fgl_glxgears, fglrxinfo, and glxinfo, the output of uname -r, and the xorg.conf file are listed below. I am running SL6 with all updates and packages installed. Any suggestions would be VERY MUCH appreciated. Thanks, Wil

Hi Wil,

Can I suggest you try the ATI driver package for EL6 from elrepo.org:

http://elrepo.org
http://elrepo.org/tiki/kmod-fglrx

I believe the elrepo.org repository might already be installed under SL6. Once you have elrepo installed, you can install the ATI drivers with:

    yum --enablerepo=elrepo install kmod-fglrx

and if you need 32-bit application support on x86_64 then you should also install the fglrx-x11-drv-32bit package.

*Before* you install the elrepo packaged drivers, please uninstall the previous ATI installer drivers:

    sh /usr/share/ati/fglrx-uninstall.sh

At the moment elrepo.org only has the 10.12 drivers for SL6 but I'll do my best to get those updated soon.
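Once the elrepo kmod-fglrx package is in place and X has been restarted, it should be possible to confirm the proprietary driver is really being used with something along these lines (output naturally varies by card):

    lsmod | grep fglrx                   # kernel module loaded?
    grep -i fglrx /var/log/Xorg.0.log    # X server actually loaded the fglrx driver?
    fglrxinfo                            # should report an ATI/AMD OpenGL renderer instead of
                                         # the GLX BadRequest errors quoted in the original post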
Re: No success installing ATI Radeon HD5970 driver
On Thu, Mar 10, 2011 at 5:28 PM, Wil Irwin wrote: > Any suggestions would be VERY MUCH appreciated. Yes, use the ATI driver in Elrepo; it works for me with SL6 x86_64.
Re: a few questions about SL admin best practises
Another note with regards to LVM: with our infrastructure we did some basic IOZone and bonnie++ tests and discovered that use of LVM causes up to a 10% performance hit for I/O operations compared to using a native partition table. For us, the convenience did not seem to be worth that performance hit.

-Aaron

On Mar 10, 2011, at 1:48 PM, Konstantin Olchanski wrote:

> On Thu, Mar 10, 2011 at 05:35:30AM -0500, Robert P. J. Day wrote:
>>
>> first question -- is there any sane reason not to use LVM these
>> days? the manual opens (predictably) with showing the student how to
>> allocate fixed partitions during the install, and leaves LVM setup for
>> later in the week as an "advanced" topic. i see it the other way
>> around -- LVM should be the norm nowadays.
>>
>
> No reason to use LVM. The traditional "md" software raid is much simpler
> and easier to manage (only one tool to learn - mdadm, compared
> to the 100 lvm management programs). Historically, LVM is a knock-off
> of XLV which was the companion partitioning tool to SGI's XFS filesystem.
>
>> thoughts? i'll always allocate /boot as a regular partition but
>> unless there are compelling reasons not to, i always recommend LVM as
>> the standard.
>
> Your /boot partition has to be mirrored across both of your system disks.
> If it's only on one disk and it fails, you have an unbootable machine,
> regardless of what tool you used (lvm or md).
>
> With "md" it is very simple, /dev/md0 is the system partition mirrored
> across /dev/sda1 and /dev/sdb1, there is no need for separate /boot
> partition, GRUB happily installs on both /dev/sda and /dev/sdb, and
> your machine happily boots if either disk explodes.
>
> To do the same with LVM, you probably have to read a book and take
> an advanced sysadmin class; and forget about getting it to actually
> work without the help of this mailing list.
>
> --
> Konstantin Olchanski
> Data Acquisition Systems: The Bytes Must Flow!
> Email: olchansk-at-triumf-dot-ca
> Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada
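For anyone who wants to reproduce that sort of comparison, a rough sketch (mount points and sizes are invented; run the identical command once on a filesystem sitting on a plain partition and once on one sitting on an LV, with a file size well beyond RAM so the page cache doesn't mask the difference):

    bonnie++ -d /mnt/plain -s 8192 -u nobody
    bonnie++ -d /mnt/lvm   -s 8192 -u nobody
    iozone -a -g 4G -f /mnt/lvm/testfile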
Re: a few questions about SL admin best practises
On Thu, Mar 10, 2011 at 05:35:30AM -0500, Robert P. J. Day wrote:
>
> first question -- is there any sane reason not to use LVM these
> days? the manual opens (predictably) with showing the student how to
> allocate fixed partitions during the install, and leaves LVM setup for
> later in the week as an "advanced" topic. i see it the other way
> around -- LVM should be the norm nowadays.
>

No reason to use LVM. The traditional "md" software raid is much simpler and easier to manage (only one tool to learn - mdadm, compared to the 100 lvm management programs). Historically, LVM is a knock-off of XLV which was the companion partitioning tool to SGI's XFS filesystem.

> thoughts? i'll always allocate /boot as a regular partition but
> unless there are compelling reasons not to, i always recommend LVM as
> the standard.

Your /boot partition has to be mirrored across both of your system disks. If it's only on one disk and it fails, you have an unbootable machine, regardless of what tool you used (lvm or md).

With "md" it is very simple, /dev/md0 is the system partition mirrored across /dev/sda1 and /dev/sdb1, there is no need for separate /boot partition, GRUB happily installs on both /dev/sda and /dev/sdb, and your machine happily boots if either disk explodes.

To do the same with LVM, you probably have to read a book and take an advanced sysadmin class; and forget about getting it to actually work without the help of this mailing list.

--
Konstantin Olchanski
Data Acquisition Systems: The Bytes Must Flow!
Email: olchansk-at-triumf-dot-ca
Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada
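To make that md layout concrete, a minimal sketch -- disk names are examples, and the relevant detail is using 0.90 or 1.0 metadata so the superblock lives at the end of the partition and GRUB can read the filesystem directly:

    mdadm --create /dev/md0 --metadata=1.0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
    mkfs.ext4 /dev/md0          # this becomes the root (and boot) filesystem
    grub-install /dev/sda       # install the boot loader on both disks so that
    grub-install /dev/sdb       # either one can boot the machine on its own

Depending on the GRUB version you may need to do the second install from the grub shell rather than grub-install, but the idea is the same.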
Re: Race condition with mdadm at bootup?
Maybe I missed it, but I didn't see any response to your request "Does anyone know of a way to have mdadm delay its assembly until all partitions are enumerated? Even if it's simply to insert a several-second wait time, that would probably work. My knowledge of the internal workings of the boot process isn't good enough to know where to look." I thought partition enumeration was done before init was started, but I don't know much about such matters. That's why I was waiting for someone else to reply. Anyway, here's what information I can contribute:

In /etc/rc.d/rc.sysinit is a line with "/sbin/mdadm -A -s", before which you can insert a delay. And /etc/rc.d/init.d/mdmonitor contains "#chkconfig: 2345 15 85", where the relatively low number "15" means when chkconfig sets mdmonitor to start during boot, chkconfig will make a symbolic link to mdmonitor named "S15mdmonitor", causing mdmonitor to start relatively early. If you 'chkconfig mdmonitor off', it will not be started at all during the boot, and you can do it later by hand with "service mdmonitor start". That would let you see if an arbitrarily long delay of it helps. Surely you can recover if mdmonitor is needed for later parts of the boot, if only by 'chkconfig mdmonitor on' and reboot.

Steven Yellin

On Tue, 8 Mar 2011, Chuck Munro wrote:

Hello folks,

This is my first adventure with SL after many years of using CentOS. I'm using SL-6 on a large-ish VM server, and have been quite happy with it. I am experiencing a weird problem at bootup with large RAID-6 arrays. After Googling around (a lot) I find that others are having the same issues with CentOS/RHEL/Ubuntu/whatever. In my case it's Scientific Linux-6 which should behave the same way as CentOS-6. I had the same problem with the RHEL-6 evaluation version. I'm posting this question to the CentOS mailing list as well.

For some reason, each time I boot the server a random number of RAID arrays will come up with the hot-spare missing. This occurs with hot-spare components only, never with the active components. Once in a while I'm lucky enough to have all components come up correctly when the system boots. Which hot spares fail to be configured is completely random.

I have 12 2TB drives, each divided into 4 primary partitions, and configured as 8 partitionable MD arrays. All drives are partitioned exactly the same way. Each R6 array consists of 5 components (partitions) plus a hot-spare. The small RAID-1 host OS array never has a problem with its hot spare.

The predominant theory via Google is that there's a race condition at boot time between full enumeration of all disk partitions and mdadm assembling the arrays. Does anyone know of a way to have mdadm delay its assembly until all partitions are enumerated? Even if it's simply to insert a several-second wait time, that would probably work. My knowledge of the internal workings of the boot process isn't good enough to know where to look.

I tried to issue 'mdadm -A -s /dev/md/md_dXX' after booting, but all it does is complain about "No suitable drives found for /dev." Here is the mdadm.conf file:

-
MAILADDR root
PROGRAM /root/bin/record_md_events.sh
DEVICE partitions
##DEVICE /dev/sd*   << this didn't help.
AUTO +imsm +1.x -all

## Host OS root arrays:
ARRAY /dev/md0 metadata=1.0 num-devices=2 spares=1 UUID=75941adb:33e8fa6a:095a70fd:6fe72c69
ARRAY /dev/md1 metadata=1.1 num-devices=2 spares=1 UUID=7a96d82d:bd6480a2:7433f1c2:947b84e9
ARRAY /dev/md2 metadata=1.1 num-devices=2 spares=1 UUID=ffc6070d:e57a675e:a1624e53:b88479d0

## Partitionable arrays on LSI controller:
ARRAY /dev/md/md_d10 metadata=1.2 num-devices=5 spares=1 UUID=135f0072:90551266:5d9a126a:011e3471
ARRAY /dev/md/md_d11 metadata=1.2 num-devices=5 spares=1 UUID=59e05755:5b3ec51e:e3002cfd:f0720c38
ARRAY /dev/md/md_d12 metadata=1.2 num-devices=5 spares=1 UUID=7916eb13:cd5063ba:a1404cd7:3b65a438
ARRAY /dev/md/md_d13 metadata=1.2 num-devices=5 spares=1 UUID=9a767e04:e4e56a9d:c369d25c:9d333760

## Partitionable arrays on Tempo controllers:
ARRAY /dev/md/md_d20 metadata=1.2 num-devices=5 spares=1 UUID=1d5a3c32:eb9374ac:eff41754:f8a176c1
ARRAY /dev/md/md_d21 metadata=1.2 num-devices=5 spares=1 UUID=38ffe8c9:f3922db9:60bb1522:80fea016
ARRAY /dev/md/md_d22 metadata=1.2 num-devices=5 spares=1 UUID=ebb4ea67:b31b2105:498d81af:9b4f45d3
ARRAY /dev/md/md_d23 metadata=1.2 num-devices=5 spares=1 UUID=da07407f:deeb8906:7a70ae82:6b1d8c4a
-

Your suggestions are most welcome ... thanks.

Chuck
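A side note for anyone who lands here with the same symptom: since a hot spare carries no data, the missing component can simply be added back by hand once the system is up, without any rebuild. Device names below are placeholders; check which partition is absent first:

    cat /proc/mdstat                        # which arrays came up without their spare
    mdadm --detail /dev/md/md_d23           # confirm which component is missing
    mdadm /dev/md/md_d23 --add /dev/sdl4    # put the spare partition back (example name)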
Re: gdm and PreSession/Default
I solved this problem. For those of you who are required to put up the DOE warning banner or any other sort of disclaimer that aborts a GUI login, the solution is to put the attached file in /etc/X11/xinit/xinitrc.d. The scripts in xinitrc.d are run from xinitrc-common. The distro's first script is 00-start-message-bus.sh. Choose a name so that the DOE warning script is run before it. No point starting up dbus if the user chooses to decline the disclaimer and abort the login.

On 03/08/2011 12:41 PM, Ken Teh wrote:

SL6's gdm does not honor the exit code from /etc/gdm/PreSession/Default. In SL5x, it used to abort the login if the script returns a non-zero exit code. I used this feature to put up a zenity dialog which the user had to click yes in order to continue logging in. Now, the user logs in regardless of the script. In fact, the distro's Default script is empty. It used to contain the xsetroot and sessreg bits. Any workarounds?

[Attachment: 000-doewarning.sh]
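The attachment itself isn't reproduced in the archive, but a minimal sketch of the approach described above might look like the following. Everything except the directory and the general idea is an assumption -- in particular it relies on xinitrc-common sourcing the scripts in xinitrc.d, so that an 'exit' here aborts the whole GUI login; check how your xinitrc-common actually runs them:

    #!/bin/sh
    # e.g. /etc/X11/xinit/xinitrc.d/000-doewarning.sh
    # (pick a name that runs before 00-start-message-bus.sh, per the note above)
    zenity --question --title="NOTICE TO USERS" \
           --text="(warning banner text here) Do you accept these conditions?" \
           || exit 1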
No success installing ATI Radeon HD5970 driver
Hi- I have tried multiple times to install the driver using the "GUI" installer and the subsequent steps. Installation appears to proceed and I can finish with "aticonfig --initial". However, the driver doesn't appear to be applied. Scrolling down any webpage or document is very constipated, and dragging windows across the screen is also extremely constipated. In addition the GUI for Catalyst Control Panel will allow resolution, etc. changes, but they are not applied after a re-boot. I have also tried the command-prompt based install, with exactly the same results. I'm using the 11.2 driver released on 02/15/2011. I should also note the same problem (or at least similar) happened with SL5 and Ubuntu 10.x. The errors shown for fgl_glxgears, fglrxinfo, and glxinfo, the output of uname -r, and the xorg.conf file are listed below. I am running SL6 with all updates and packages installed. Any suggestions would be VERY MUCH appreciated.

Thanks, Wil

---

[root@Cluster1 ~]# fglrxinfo
X Error of failed request:  BadRequest (invalid request code or no such operation)
  Major opcode of failed request:  136 (GLX)
  Minor opcode of failed request:  19 (X_GLXQueryServerString)
  Serial number of failed request:  18
  Current serial number in output stream:  18

[root@Cluster1 ~]# glxinfo
name of display: :0.0
X Error of failed request:  BadRequest (invalid request code or no such operation)
  Major opcode of failed request:  136 (GLX)
  Minor opcode of failed request:  19 (X_GLXQueryServerString)
  Serial number of failed request:  18
  Current serial number in output stream:  18

[root@Cluster1 ~]# fgl_glxgears
Using GLX_SGIX_pbuffer
X Error of failed request:  BadRequest (invalid request code or no such operation)
  Major opcode of failed request:  136 (GLX)
  Minor opcode of failed request:  19 (X_GLXQueryServerString)
  Serial number of failed request:  18
  Current serial number in output stream:  18

[root@Cluster1 X11]# uname -r
2.6.32-71.18.1.el6.x86_64

[root@Cluster1 X11]# vi xorg.conf.fglrx-12

Section "ServerLayout"
    Identifier     "aticonfig Layout"
    Screen      0  "aticonfig-Screen[0]-0" 0 0
EndSection

Section "Module"
EndSection

Section "Monitor"
    Identifier  "aticonfig-Monitor[0]-0"
    Option      "VendorName" "ATI Proprietary Driver"
    Option      "ModelName" "Generic Autodetecting Monitor"
    Option      "DPMS" "true"
EndSection

Section "Monitor"
    Identifier  "0-DFP3"
    Option      "VendorName" "ATI Proprietary Driver"
    Option      "ModelName" "Generic Autodetecting Monitor"
    Option      "DPMS" "true"
    Option      "PreferredMode" "1920x1080"
    Option      "TargetRefresh" "60"
    Option      "Position" "0 0"
    Option      "Rotate" "normal"
    Option      "Disable" "false"
EndSection

Section "Device"
    Identifier  "Videocard0"
    Driver      "vesa"
EndSection

Section "Device"
    Identifier  "aticonfig-Device[0]-0"
    Driver      "fglrx"
    Option      "Monitor-DFP3" "0-DFP3"
Re: nfs kickstart scripts
We will research this issue.

-Connie Sieh

On Wed, 9 Mar 2011, Ken Teh wrote:

Ok, I got it. You were right. The tab option did the trick. The ks spec ks=nfs::/path/to/kickstart needs to be added to the kernel boot options. It says so in the TUV docs, but it also says to type this in at the boot prompt. Except, of course, there is no boot prompt, unlike SL4x and SL5x.

On 03/09/2011 06:02 PM, Bluejay Adametz wrote:

A little more detail: The iso images for SL4 and SL5 used to stop with the boot: prompt after loading the kernel. I would then specify boot: linux=nfs::/path/to/kickstart

Have you tried hitting Tab when the boot screen comes up? That seems to allow modifying the boot options and may do what you want. There is a note to this effect on the boot screen, but it's not real obvious. I missed it for a while.

- Bluejay Adametz, CFII, A&P, AA-5B N45210

Be careful what you teach. You might have to learn it one day. -Tunnell's Terse Transmogrification of Fido Fisher's Fortuitious Formulary
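Spelled out for the archives, with the server and path as placeholders: at the SL6 boot menu, highlight the install entry, press Tab to edit the kernel line, and append the ks= option to whatever is already there, e.g.:

    vmlinuz initrd=initrd.img ks=nfs:installserver.example.com:/exports/ks/sl6-ks.cfg

The ks=nfs:<server>:/<path> form is the same one the TUV docs describe for the old boot: prompt; pressing Tab at the menu is just the new way of getting at that prompt.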
Re: what's the order of upstart processing in SL 6?
On Thu, 10 Mar 2011, Troy Dawson wrote: > I haven't found precisely what you are looking for, but I have found > the correct man pages, or at least some more man pages. > > Looking at the documentation here > http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Technical_Notes/deployment.html > It says > "Processes are known to Upstart as jobs and are defined by files in the > /etc/init directory. Upstart is very well documented via man pages. Command > overview is in init(8) and job syntax is described in init(5). " > > looking at > man 5 init > man 8 init > It describes Upstart. > > As I said, it doesn't completely answer your question, but hopefully > it points you in the right direction. i've been through those pages and i still haven't found an unambiguous description of what happens when there is no /etc/init/rc-sysinit.conf. barring that, the best *guess* i can come up with is that, if that particular file is missing, then upstart will simply scan all of the /etc/init/*.conf files and start anything with no dependencies. or more technically, anything with the dependency: start on startup which includes rcS.conf and a couple readahead files. beyond that, i'm not sure but i'll keep reading. rday -- Robert P. J. Day Waterloo, Ontario, CANADA http://crashcourse.ca Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday
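One way to check that guess on a live SL6 box, rather than hunting for the man page, is to ask upstart itself:

    grep -l 'start on startup' /etc/init/*.conf    # jobs that fire unconditionally at startup
    initctl list                                   # what upstart currently knows about each job
    man 5 init                                     # stanza syntax, including 'start on'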
Re: a few questions about SL admin best practises
On Thu, 10 Mar 2011, Troy Dawson wrote: > If these are beginning linux admins who don't know about partitions, > or haven't done linux partitioning, then you shouldn't do LVM first. > You should teach them about partitions, and the general layout of > Linux partitions. Your general windows admin isn't going to know > about /boot or swap partitions. Your general unix admin will know > about how his version of unix partitioning, and will appreciate > knowing what partitions linux should have. And if they aren't an > admin, then they aren't going to know about partitions at all. > > If this is a bunch of Debian admins wanting to know RedHat, then go > straight to LVM. i've decided i can combine the best of both worlds. given that /boot is still allocated as a regular primary partition, i can use that to talk about partitions, while still using LVM for the remainder of the disk layout. i think that will solve the problem. rday -- Robert P. J. Day Waterloo, Ontario, CANADA http://crashcourse.ca Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday
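A sketch of that combined layout in kickstart form, in case it's useful as a handout -- sizes and names are arbitrary:

    part /boot --fstype=ext4 --size=500 --asprimary
    part pv.01 --size=1 --grow
    volgroup vg_root pv.01
    logvol swap --vgname=vg_root --name=lv_swap --size=2048
    logvol /    --vgname=vg_root --name=lv_root --fstype=ext4 --size=8192 --grow

/boot stays a plain primary partition, which carries the "what is a partition" discussion, while everything else lives in LVM, matching the SL6 default layout.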
Re: a few questions about SL admin best practises
On Thu, Mar 10, 2011 at 05:35:30AM -0500, Robert P. J. Day wrote:
> hoping this isn't egregiously off-topic but i'm teaching a RH sys
> admin course next week and i'm using SL 6.0 as the vehicle. i'm being
> handed the courseware to use and i'm pondering which parts are really
> out of date so that i can skip them or replace them with newer
> material on the fly.
>
> first question -- is there any sane reason not to use LVM these
> days? the manual opens (predictably) with showing the student how to
> allocate fixed partitions during the install, and leaves LVM setup for
> later in the week as an "advanced" topic. i see it the other way
> around -- LVM should be the norm nowadays.
>
> thoughts? i'll always allocate /boot as a regular partition but
> unless there are compelling reasons not to, i always recommend LVM as
> the standard.
>
> rday

First answer: There is currently no reason not to use LVM for everything other than /boot, which should be referenced by UUID.

Thoughts: LVM is the de facto standard as it stands today in RHEL6, outlined as base knowledge expected of a Red Hat Certified Systems Administrator as well as a Red Hat Certified Systems Engineer (the RHCSA is a prereq to the RHCE ... so it's somewhat redundant to list both, but I did so for clarity).

http://www.redhat.com/certification/rhcsa/objectives/
https://www.redhat.com/certification/rhce/objectives/

I think you're headed in the right direction.

-AdamM
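For what it's worth, the day-to-day LVM vocabulary those objectives expect is only a handful of commands, e.g. (device and volume names invented for the example):

    pvcreate /dev/sdb1
    vgcreate vg_data /dev/sdb1
    lvcreate -n lv_scratch -L 20G vg_data
    mkfs.ext4 /dev/vg_data/lv_scratch
    lvextend -L +10G /dev/vg_data/lv_scratch
    resize2fs /dev/vg_data/lv_scratch     # grow the ext4 filesystem to match, online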
Re: a few questions about SL admin best practises
On 03/10/2011 04:35 AM, Robert P. J. Day wrote: hoping this isn't egregiously off-topic but i'm teaching a RH sys admin course next week and i'm using SL 6.0 as the vehicle. i'm being handed the courseware to use and i'm pondering which parts are really out of date so that i can skip them or replace them with newer material on the fly. first question -- is there any sane reason not to use LVM these days? the manual opens (predictably) with showing the student how to allocate fixed partitions during the install, and leaves LVM setup for later in the week as an "advanced" topic. i see it the other way around -- LVM should be the norm nowadays. thoughts? i'll always allocate /boot as a regular partition but unless there are compelling reasons not to, i always recommend LVM as the standard. rday My two cents. These are only opinions. It depends on the level of the course. If these are beginning linux admins who don't know about partitions, or haven't done linux partitioning, then you shouldn't do LVM first. You should teach them about partitions, and the general layout of Linux partitions. Your general windows admin isn't going to know about /boot or swap partitions. Your general unix admin will know about how his version of unix partitioning, and will appreciate knowing what partitions linux should have. And if they aren't an admin, then they aren't going to know about partitions at all. If this is a bunch of Debian admins wanting to know RedHat, then go straight to LVM. Again, my opinion. Troy -- __ Troy Dawson daw...@fnal.gov (630)840-6468 Fermilab ComputingDivision/SCF/FEF/SLSMS Group __
Re: what's the order of upstart processing in SL 6?
On 03/10/2011 07:31 AM, Robert P. J. Day wrote: finally taking the time to dig into upstart, and i'm confused by one issue. on my ubuntu system, the documentation "man 7 startup" claims that the primary task on startup is /etc/init/rc-sysinit.conf, which exists on my ubuntu system. but on SL 6, while the man page reads the same, there is no such .conf file in /etc/init, so what is the boot sequence? i do see the file /etc/init/rc.conf so i'm sure i can work through the sequence but since i want to explain this to some students next week, i'd like to find the appropriate doc/man page that explains *precisely* what happens. so what's the process when /etc/init/rc-sysinit.conf doesn't exist? and what man page explains that? thanks. rday I haven't found precisely what you are looking for, but I have found the correct man pages, or at least some more man pages. Looking at the documentation here http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Technical_Notes/deployment.html It says "Processes are known to Upstart as jobs and are defined by files in the /etc/init directory. Upstart is very well documented via man pages. Command overview is in init(8) and job syntax is described in init(5). " looking at man 5 init man 8 init It describes Upstart. As I said, it doesn't completely answer your question, but hopefully it points you in the right direction. Troy -- __ Troy Dawson daw...@fnal.gov (630)840-6468 Fermilab ComputingDivision/SCF/FEF/SLSMS Group __
Re: a few questions about SL admin best practises
This is a good point; I always do manual partitioning. For instance, I see no point in separate /boot or swap partitions, and I only use software RAID, since hard drives fail and swapping a drive only requires a simple re-sync. Perhaps I've gotten very lucky, but resizing or moving a partition has never been an issue. I deal with small businesses; the only issues I occasionally run into are MySQL maintenance related and an occasional re-indexing of an ext3 file system.

The servers I've built with this methodology are extremely stable. Most are in Xen, and any upgrade is: copy the existing stable VM, update and test, swap to the new stable VM.

My view might be too biased; I am a minimalist and install all my systems at the barest minimum possible, and tend to run in a very small memory and disk footprint. I am sure that LVM is mature and, thinking about it more, it probably should be covered in a class, but I'd only start using it when I see a need, as with any other technology.

One point of view for your consideration,

Brett

On Thu, Mar 10, 2011 at 8:26 AM, Robert P. J. Day wrote:
>
> as i mentioned, one of the primary reasons i'm going with LVM from
> the start of the course is that it's the default layout with SL 6 --
> you have to explicitly choose *not* to use it. and since i consider
> LVM to be a stable and mature technology, i don't see any compelling
> reason to avoid it.
>
> rday
>
> --
>
> Robert P. J. Day Waterloo, Ontario, CANADA
> http://crashcourse.ca
>
> Twitter: http://twitter.com/rpjday
> LinkedIn: http://ca.linkedin.com/in/rpjday
Re: problem with hald
> On 09/03/11 15:57, Steven Timm wrote:
> > any reason you can't just turn hald off? Most servers don't need it.
> >
> > Steve Timm

Furthermore note that HAL is deprecated and scheduled for removal ...

- Charles
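For completeness, turning it off on an EL6-style box is the usual service/chkconfig dance; the init script is normally called haldaemon, but verify with 'chkconfig --list | grep -i hal' if your build differs:

    service haldaemon stop
    chkconfig haldaemon off
    chkconfig --list haldaemon    # confirm it stays off at the next boot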
Re: what's the order of upstart processing in SL 6?
On Thu, 10 Mar 2011, Robert P. J. Day wrote: > finally taking the time to dig into upstart, and i'm confused by > one issue. on my ubuntu system, the documentation "man 7 startup" > claims that the primary task on startup is > /etc/init/rc-sysinit.conf, which exists on my ubuntu system. but on > SL 6, while the man page reads the same, there is no such .conf file > in /etc/init, so what is the boot sequence? > > i do see the file /etc/init/rc.conf so i'm sure i can work through > the sequence but since i want to explain this to some students next > week, i'd like to find the appropriate doc/man page that explains > *precisely* what happens. so what's the process when > /etc/init/rc-sysinit.conf doesn't exist? and what man page explains > that? thanks. oh, wait, i just noticed this in /etc/inittab (whose only active line is to define the default runlevel of 5): # System initialization is started by /etc/init/rcS.conf but, again, what man page or doc actually *states* that? i'd like to present to the students the sequence of man pages that walks them through the phases of upstart, but i'm still not seeing what explains that one step. thanks. rday -- Robert P. J. Day Waterloo, Ontario, CANADA http://crashcourse.ca Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday
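There may not be a single man page that states it; the chain is easiest to see by reading the job files themselves. Roughly, on a stock EL6 install (summarized from such a box, so verify against your own files):

    grep -v '^#' /etc/inittab     # only the id:5:initdefault: line is active
    cat /etc/init/rcS.conf        # 'start on startup'; runs /etc/rc.d/rc.sysinit and then
                                  #  switches to the default runlevel taken from inittab
    cat /etc/init/rc.conf         # 'start on runlevel ...'; runs /etc/rc.d/rc for that
                                  #  runlevel, i.e. the familiar SysV init scripts

So: the kernel starts init (upstart), upstart starts everything with 'start on startup' (rcS.conf plus the readahead jobs), rcS.conf runs rc.sysinit and emits the default runlevel, and rc.conf then runs the rc scripts for it.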
what's the order of upstart processing in SL 6?
finally taking the time to dig into upstart, and i'm confused by one issue. on my ubuntu system, the documentation "man 7 startup" claims that the primary task on startup is /etc/init/rc-sysinit.conf, which exists on my ubuntu system. but on SL 6, while the man page reads the same, there is no such .conf file in /etc/init, so what is the boot sequence? i do see the file /etc/init/rc.conf so i'm sure i can work through the sequence but since i want to explain this to some students next week, i'd like to find the appropriate doc/man page that explains *precisely* what happens. so what's the process when /etc/init/rc-sysinit.conf doesn't exist? and what man page explains that? thanks. rday -- Robert P. J. Day Waterloo, Ontario, CANADA http://crashcourse.ca Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday
Re: a few questions about SL admin best practises
On Thu, 10 Mar 2011, Michael Tiernan wrote: > On 3/10/11 6:42 AM, Robert P. J. Day wrote: > > i'm fascinated that you use mirroring RAID as a matter of course, > > but consider LVM to be an advanced topic. > > > I too find this interesting. One of the things I love about SA is if > you ask one question of three SAs you'll get nine answers no two of > which are alike. :) > > I view LVM on the same level as basic system security. If you grok > the core workings of security, then LVM's not too difficult for you. > > Let me say up front, I don't question Mr Serkez business choices, he > knows better than I in that regard. > > Regarding LVM in general, in *my* opinion, if the system's running > hardware RAID controllers, LVM is a good thing to use to allow the > flexibility of moving partitions around and changing sizes > dynamically. It's one more thing that makes our lives easier. as i mentioned, one of the primary reasons i'm going with LVM from the start of the course is that it's the default layout with SL 6 -- you have to explicitly choose *not* to use it. and since i consider LVM to be a stable and mature technology, i don't see any compelling reason to avoid it. rday -- Robert P. J. Day Waterloo, Ontario, CANADA http://crashcourse.ca Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday
Re: a few questions about SL admin best practises
On 3/10/11 6:42 AM, Robert P. J. Day wrote: i'm fascinated that you use mirroring RAID as a matter of course, but consider LVM to be an advanced topic. I too find this interesting. One of the things I love about SA is if you ask one question of three SAs you'll get nine answers no two of which are alike. :) I view LVM on the same level as basic system security. If you grok the core workings of security, then LVM's not too difficult for you. Let me say up front, I don't question Mr Serkez business choices, he knows better than I in that regard. Regarding LVM in general, in *my* opinion, if the system's running hardware RAID controllers, LVM is a good thing to use to allow the flexibility of moving partitions around and changing sizes dynamically. It's one more thing that makes our lives easier. Again, *my* opinion. -- << MCT>>Michael C Tiernan Charter member of lopsa.org/ MIT - Laboratory for Nuclear Science - http://www.lns.mit.edu High Perf Research Computing Facility at The Bates Linear Accelerator xmpp:mtier...@mit.edu skype:mtiernan-mitcms +1 (617) 324-9173
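The "moving partitions around and changing sizes dynamically" part really is just a couple of commands once a volume group exists, e.g. (names invented; both operations work with the filesystems mounted):

    pvmove /dev/sdb1                      # migrate all extents off a disk you want to retire
    lvextend -L +50G /dev/vg00/lv_home    # grow a logical volume ...
    resize2fs /dev/vg00/lv_home           # ... and grow the ext3/ext4 filesystem on it, online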
Re: a few questions about SL admin best practises
On Thu, 10 Mar 2011, Brett Serkez wrote: > I have been using CentOS in small businesses since 2003 and never > once found a reason to use LVM. I've been using mirroring RAID, > which has been very useful. I tend to use the less is more adage, > keeping my servers simple and lean and use XEN to break out > functionality. This strategy has proven to be extremely reliable and > easy to maintain, I can focus the majority of my energy on managing > WinDoze. > > Having taught Linux and Linux administration myself, I see LVM as an > advanced topic. Perhaps in larger organizations the perspective is > different. i'm fascinated that you use mirroring RAID as a matter of course, but consider LVM to be an advanced topic. in any event, part of the reason i would consider LVM standard these days is that if you take the default install configuration with SL6, you get LVM automatically. that's just part of my thinking. rday -- Robert P. J. Day Waterloo, Ontario, CANADA http://crashcourse.ca Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday
Re: a few questions about SL admin best practises
I have been using CentOS in small businesses since 2003 and never once found a reason to use LVM. I've been using mirroring RAID, which has been very useful. I tend to use the less is more adage, keeping my servers simple and lean and use XEN to break out functionality. This strategy has proven to be extremely reliable and easy to maintain, I can focus the majority of my energy on managing WinDoze. Having taught Linux and Linux administration myself, I see LVM as an advanced topic. Perhaps in larger organizations the perspective is different. Brett On Thu, Mar 10, 2011 at 5:35 AM, Robert P. J. Day wrote: > hoping this isn't egregiously off-topic but i'm teaching a RH sys > admin course next week and i'm using SL 6.0 as the vehicle. i'm being > handed the courseware to use and i'm pondering which parts are really > out of date so that i can skip them or replace them with newer > material on the fly. > > first question -- is there any sane reason not to use LVM these > days? the manual opens (predictably) with showing the student how to > allocate fixed partitions during the install, and leaves LVM setup for > later in the week as an "advanced" topic. i see it the other way > around -- LVM should be the norm nowadays. > > thoughts? i'll always allocate /boot as a regular partition but > unless there are compelling reasons not to, i always recommend LVM as > the standard. > > rday > > -- > > > Robert P. J. Day Waterloo, Ontario, CANADA >http://crashcourse.ca > > Twitter: http://twitter.com/rpjday > LinkedIn: http://ca.linkedin.com/in/rpjday > >
a few questions about SL admin best practises
hoping this isn't egregiously off-topic but i'm teaching a RH sys admin course next week and i'm using SL 6.0 as the vehicle. i'm being handed the courseware to use and i'm pondering which parts are really out of date so that i can skip them or replace them with newer material on the fly. first question -- is there any sane reason not to use LVM these days? the manual opens (predictably) with showing the student how to allocate fixed partitions during the install, and leaves LVM setup for later in the week as an "advanced" topic. i see it the other way around -- LVM should be the norm nowadays. thoughts? i'll always allocate /boot as a regular partition but unless there are compelling reasons not to, i always recommend LVM as the standard. rday -- Robert P. J. Day Waterloo, Ontario, CANADA http://crashcourse.ca Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday
Re: problem with hald
Hi,

Yep, that's what I'm leaning towards.

Faye

On 09/03/11 15:57, Steven Timm wrote:

any reason you can't just turn hald off? Most servers don't need it.

Steve Timm

On Wed, 9 Mar 2011, Faye Gibbins wrote:

Hi, I've got one very special machine that has about 15,000 - 20,000 automount entries. Things are fine until about 5000 mount points are mounted up, then any further adding (in a tight bash loop) slows dramatically and hald starts taking more and more CPU. Memory usage is well below physical RAM, no swapping. Automount mounting slows to a few seconds per mount, the system load goes up, and the machine slowly grinds to a halt.

Running hald in the foreground and in verbose mode I see lots and lots of messages of this type:

12:46:01.254 [I] osspec.c:256: /proc/mounts tells, that the mount has tree changed
(process:10306): GLib-CRITICAL **: g_hash_table_lookup_extended: assertion `hash_table != NULL' failed
(process:10306): GLib-CRITICAL **: g_hash_table_lookup_extended: assertion `hash_table != NULL' failed
(process:10306): GLib-CRITICAL **: g_hash_table_lookup_extended: assertion `hash_table != NULL' failed

If anyone can shed some light on this it would make me very happy. It feels like watching a search routine slow down as n increases; is hald or dbus using a flat directory somewhere as a hash table?

Machine details: SL5.5 2.6.18-194.32.1.el5

Faye

--
- Faye Gibbins, Sys Admin. GeoS KB. Linux, Unix, Security
Beekeeper - The Apiary Project, KB - www.bees.ed.ac.uk -
(x(x_(X_x(O_o)x_x)_X)x)
I grabbed at spannungsbogen before I knew I wanted it.
Socrates: Question authority, question everything.
Mermin: If the maths works "Shut up and calculate!"

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
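Not an answer, but a small sketch of how to watch the behaviour while it happens; the automount path is a placeholder for whatever your maps actually export:

    # trigger mounts in a tight loop, then watch hald's CPU and the mount count together
    for i in $(seq 1 6000); do ls /net/fileserver/export$i > /dev/null; done &
    watch -n 5 'wc -l < /proc/mounts; ps -o pid,pcpu,rss,cmd -C hald'

If hald's CPU grows much faster than linearly with the /proc/mounts line count, that would fit the "search routine slowing as n increases" theory.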