Re: [CentOS] rescue - UEFI revert to BIOS boot - how?

2018-12-08 Thread Chris Murphy
Yeah, the correct response is to fix the underlying cause of the settings
vanishing, not to enable faux BIOS legacy boot, which literally means UEFI
plus BIOS hacked on top.

As much of a PITA as UEFI is, adding more crap on top is asking for even
more confusion.

An alternative is benevolent malware as the firmware+bootloader. E.g.
https://lwn.net/Articles/748586/


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Weird problems with CentOS 7.6 1810 installer

2018-12-08 Thread Chris Murphy
Command line options:
rd.debug rd.udev.debug systemd.log_level=debug

That will be incredibly verbose, and slows things down a lot, so on the
off chance there's a race, you might get different results. But if not, the
log should contain something useful.

I like the hypothesis about mdadm metadata version 0.9; however, that's
still really common on RHEL and CentOS. It was used for /boot until around
Fedora 24, maybe?

It could also be a dracut bug, since that's what's largely responsible for
assembly. And dracut can be confused by a change in udev rules. :-D

However, going back to mdadm 0.9 metadata: that's the only version the
kernel auto-detects, so theoretically it gets activated before dracut gets
involved. The 1.x versions have no kernel autodetect; instead, assembly
happens in dracut (by calling mdadm to assemble and run the array).
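
A quick way to check which metadata version an existing array member uses
(the device name here is just an assumption):

mdadm --examine /dev/sda2 | grep -i version
# "0.90" means kernel autodetect; "1.0"/"1.1"/"1.2" means dracut calls
# mdadm to assemble it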

Oh, is this really dm-cache, not lvmcache? That might be a source of
confusion if there isn't LVM metadata present to hint at LVM for proper
assembly. Of course, lvmcache still uses device mapper, but with LVM
metadata.

Anyway, it's quite an interesting, and concerning, problem.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] SAMBA Issues

2018-09-05 Thread Chris Murphy
I wouldn't recommend Samba for this use case. The way it does permissions
is like it's been grafted on from a Windows world. Fine for NAS stuff, but
for editing system files, I'd look into an SFTP or SSH GUI client for macOS.

Also, SELinux requires dirs/files labeled with samba_share_t which is not
how any system files should be labeled.

Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] LVM GUI in live CD

2018-05-21 Thread Chris Murphy
Another idea is Fedora 27 or 28 live media, and 'dnf install blivet-gui'


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] formating DVR-RW

2018-04-26 Thread Chris Murphy
On Wed, Apr 25, 2018, 12:44 PM Fred Smith <fre...@fcshome.stoneham.ma.us>
wrote:

> On Wed, Apr 25, 2018 at 11:20:33AM -0400, Scott Robbins wrote:
> > On Wed, Apr 25, 2018 at 11:07:58AM -0400, Fred Smith wrote:
> > > On Wed, Apr 25, 2018 at 09:30:46AM -0500, Bill Gee wrote:
> > > >
> > > > It is possible that the optical drive in your computer does not
> support DVD-RW
> > > > media.  The only way I know of to find what media are supported is
> to use K3B.
> > > > If you go to Settings - Devices, you should get a list of readable
> and
> > > > writable media for each device.
> > > >
> > > > Bill Gee
> > >
> > > potentially stupid question here: Why would one format a cd/dvd?
> > > I've never had to do that, I just write to 'em.
> > >
> > > for what purpose or need would one format one?
> >
> > If you have a rewriteable one.  I am assuming (and we all know what that
> > means :) ) that the OP has such a disk.
>
> I write to RW media all the time without formatting it.
> I might "blank" it, but that isn't the same thing.
> After posting I realized that one might want to use a UDF filesystem
> on RW media, and I suppose for that purpose one would need to format
> it, though I've not done that on a CD or DVD, only on USB.
>

Yep. Formatting but no burning.

https://github.com/pali/udftools/blob/master/doc/HOWTO.udf
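
Roughly, per that HOWTO, something like this (a sketch; assumes the drive
is /dev/sr0 and a recent udftools):

dvd+rw-format /dev/sr0                               # prepare the RW disc
mkudffs --media-type=dvdrw --label=backup /dev/sr0   # format with UDF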


---
Chris Murphy

>
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] xfs file system errors

2018-03-18 Thread Chris Murphy
On Thu, Mar 15, 2018, 6:25 AM Jerry Geis <jerry.g...@gmail.com> wrote:

> How do I fix an xfs file system error ?
>
> I searched and it says to run xfs_repair /dev/sda1 - did not work.
> I got an error on boot and the machine dropped into service mode after
> entering the PW. I entered the above command and it said it couldn't load
> a library...
>
> So I rebooted and dropped into rescue mode. Again I entered the command
> above and it said the same thing... something about could not load a library.
>
> What am I missing ?


Without any logs or screenshots, it's a guess.

My guess: it mounts the rootfs, runs into an error, and goes read-only, but
the volume is still mounted. Therefore xfs_repair is running off the problem
volume.

Boot using param rd.break=pre-mount

Now there is no mount of rootfs, and xfs_repair runs from the initramfs.
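
I.e. something like this (the root device name is an assumption; substitute
yours):

# append to the kernel command line in the GRUB menu:
rd.break=pre-mount
# then, at the dracut emergency shell, root is not yet mounted:
xfs_repair /dev/mapper/centos-root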

But 9 times out of 10, you're better off with the latest Fedora install
media (any) on a USB stick. Newer kernel and progs.

Ok maybe even 10 out of 10.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Install CentOS 7 on MBR hard disk

2018-02-16 Thread Chris Murphy
So is the end goal to have dual boot? You want to preserve the existing
CentOS installation on this drive and also install CentOS 7?

The biggest problem is that the installer is really not very smart when it
comes to this use case. It's friendly for Windows and macOS dual boot, but
fairly well faceplants with dual-boot Linux. So invariably you have manual
surgery to do pre- and post-install, or suffer.

Run 'efibootmgr' by itself: if you get boot entries, the system is
definitely UEFI booted. If you get an error, it's legacy/faux-BIOS booted.
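
For example (the exact error wording varies with the efibootmgr version):

efibootmgr
# UEFI booted: you'll see BootCurrent, BootOrder, and Boot entries
# legacy booted: an error like "EFI variables are not supported on this system"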

Certainly legacy boot is the easiest workaround, but it can have an effect
on various things, including drive and video modes, that might be different
than UEFI booting. E.g. one of my older systems, when booting legacy, brings
up the SSD in IDE mode rather than SATA, and the system is slower. And it
can only use the discrete GPU; the integrated GPU is unavailable. So I
advise testing before committing to legacy mode.

Also, though rare, not all UEFI systems come with a Compatibility Support
Module (fake BIOS), in which case you're stuck.



Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Install CentOS 7 on MBR hard disk

2018-02-16 Thread Chris Murphy
On Thu, Feb 15, 2018, 4:18 PM Stephen John Smoogen <smo...@gmail.com> wrote:

>
> The only other thing I can think of is that the disk was already
> formatted to GPT. In that case it has to be EFI. [I had a disk which
> was GPT partitioned and removing that was quite a challenge as I had
> done a 'dd if=/dev/zero of=/dev/sda bs=512 count=10' and it still kept
> coming up as GPT. I believe I had to run a different disk command to
> really clean it.]
>


GPT has a primary location at the drive start and a backup location at the
drive end. To remove it, use 'wipefs -a /dev/sdX'; it will remove the
signatures found in both the primary and backup locations.
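
For example (a sketch; /dev/sdb is an assumption, so double-check the
device first):

wipefs /dev/sdb      # with no options, it only lists the signatures found
wipefs -a /dev/sdb   # erases them all, including the backup GPT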



Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Install CentOS 7 on MBR hard disk

2018-02-16 Thread Chris Murphy
On Thu, Feb 15, 2018, 3:19 PM Yves Bellefeuille <y...@storm.ca> wrote:

> I have a UEFI system, but I want to install CentOS on a MBR (not GPT)
> hard disk.
>

Why?

While the UEFI spec permits using MBR for booting, it's confusing because
there's no actual single standard for MBR. There is for GPT.

Anyway, all OS installers I'm aware of on multiple platforms enforce GPT
for UEFI installations.


> The installation program keeps telling me that I must create an "EFI
> system partition on a GPT disk mounted at /boot/efi".
>
> I can't find a way to work around this. Is there a solution?
>

Yes, but it means giving bad advice. And that is to enable "legacy" OS
support to present a faux BIOS to the booting system instead of exposing
UEFI. It's bad advice because you have no good reason for wanting to use
MBR; it's an arbitrary request.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Re-enable grub boot in UEFI (Windows took over it)

2018-02-05 Thread Chris Murphy
On Mon, Feb 5, 2018 at 8:27 AM, Kay Diederichs
<kay.diederi...@uni-konstanz.de> wrote:

> grub-install /dev/nvme0n1


Running this on computers with UEFI firmware is not good advice; it's
an obsolete command. People should use the prebaked grubx64.efi binary
that comes in the grub2-efi package, which is a signed binary, so it can
support UEFI Secure Boot.

If you run grub2-install, a new unsigned grub binary is created,
replacing grubx64.efi. If you have Secure Boot enabled, you will not
be able to boot until you either reinstall the grub2-efi package, or
self-sign the grub2-install-created binary and then go through the
process of informing the firmware that it is a valid binary by using
mokutil (but I estimate maybe 1 in 50 people might do this).
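
Restoring the signed binary is just a package reinstall; roughly (package
names vary by point release, e.g. grub2-efi vs. grub2-efi-x64):

yum reinstall grub2-efi shim
# puts the signed grubx64.efi and shim back onto the EFI System partition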




-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Re-enable grub boot in UEFI (Windows took over it)

2018-02-01 Thread Chris Murphy
On Thu, Feb 1, 2018 at 10:13 AM, wwp <subscr...@free.fr> wrote:
> Hello Chris,
>
>
> On Thu, 01 Feb 2018 17:00:03 + Chris Murphy <li...@colorremedies.com> 
> wrote:
>
>> You can use efibootmgr for this. The NVRAM boot entry is what changed, not
>> the contents of the EFI System partition.
>>
>> efibootmgr -v
>>
>> Will list all entries and Boot Order. You need to use --bootorder to make
>> sure the CentOS entry is first.
>
> Interesting.. thanks for your reply!
>
> Too bad I never run this command when things were OK (in order to
> compare), 'cause now, what it says doesn't mention anything that seem
> related to the CentOS partition or I read wrong:
>
> BootCurrent: 0007
> Timeout: 0 seconds
> BootOrder: 0001,0002,0003,0004,0005,0006,0007
> Boot0000* Windows Boot Manager
> HD(1,GPT,a6b87338-9b9c-4a50-8fde-2447e8fdebb6,0x800,0xfa000)/File(\EFI\Microsoft\Boot\bootmgfw.efi)WINDOWS.x...B.C.D.O.B.J.E.C.T.=.{.9.d.e.a.8.6.2.c.-.5.c.d.d.-.4.e.7.0.-.a.c.c.1.-.f.3.2.b.3.4.4.d.4.7.9.5.}
> Boot0001* UEFI: A400 NVMe SanDisk 512GB, Partition 1
> HD(1,GPT,a6b87338-9b9c-4a50-8fde-2447e8fdebb6,0x800,0xfa000)/File(EFI\Microsoft\Boot\bootmgfw.efi)..BO
> Boot0002* Diskette DriveBBS(Floppy,Diskette Drive,0x0)..BO
> Boot0003* M.2 PCIe SSD  BBS(HD,P0: A400 NVMe SanDisk 512GB,0x0)..BO
> Boot0004* USB Storage DeviceBBS(USB,KingstonDataTraveler 3.0PMAP,0x0)..BO
> Boot0005* CD/DVD/CD-RW DriveBBS(CDROM,CD/DVD/CD-RW Drive,0x0)..BO
> Boot0006* Onboard NIC   BBS(Network,Onboard NIC,0x0)..BO
> Boot0007* UEFI: KingstonDataTraveler 3.0PMAP, Partition 1   
> PciRoot(0x0)/Pci(0x14,0x0)/USB(16,0)/HD(1,MBR,0x61f11812,0x800,0x737f800)..BO
>
> I don't know what 0001 and 0002 refer to exactly (there's only one SSD
> drive in this laptop).

For whatever reason the CentOS entry is missing.

Option 1:

A relatively easy cheat is to mount your root volume to /mnt and then search

grep efibootmgr /mnt/var/log/anaconda/program.log ##this is the
path and name on Fedora, not 100% certain on CentOS

And what you'll get back is a line that contains the efibootmgr
command that was used during the installation. So you'll need to
modify the forward slashes for it to work, something like this:

sudo efibootmgr -c -w -L CentOS -d /dev/sda -p 2 -l
\\EFI\\redhat\\grub\\shimx64.efi

Option 2:

At least on Fedora 27 + Windows 10, this is what my ESP contains:

├── EFI
│   ├── Boot
│   │   ├── bootx64.efi
│   │   ├── fallback.efi
│   │   └── fbx64.efi

Those are Fedora-installed default bootloaders. So if you wipe out all
the NVRAM boot entries, these get used first. And when fallback.efi
figures out that there isn't a proper NVRAM boot entry, it's supposed
to insert one, just like the Option 1 command above does. You'll use
'efibootmgr -b <entry number> -B' to delete them one by one; looks like you
might be able to get away with just deleting 0001 and 0000. Of course
it means the Windows boot entry is blown away, which might make you
nervous - but the way it's supposed to work is the GRUB menu should
have a Windows boot option in it, and you just pick that for booting
Windows.


I've mainly used option 1.



-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Re-enable grub boot in UEFI (Windows took over it)

2018-02-01 Thread Chris Murphy
You can use efibootmgr for this. The NVRAM boot entry is what changed, not
the contents of the EFI System partition.

efibootmgr -v

Will list all entries and Boot Order. You need to use --bootorder to make
sure the CentOS entry is first.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] /lib/firmware/microcode.dat update on CentOS 6

2018-01-24 Thread Chris Murphy
On Tue, Jan 23, 2018 at 4:26 AM, Johnny Hughes <joh...@centos.org> wrote:

>
> Here are a couple of posts for our reading pleasure:
>
> Intel recommends not installing the microcode now:
> http://intel.ly/2DsL9qz

Except this doesn't mention microcode at all. I can't even tell WTF
they're recommending not doing in this doc, it's that badly written.
You have to infer, by reading two prior docs, that they're referring
to microcode. And then you have to assume that's still what they're
referring to when they say:

"We recommend that OEMs, cloud service providers, system
manufacturers, software vendors and end users stop deployment of
current versions."  Current versions of what? Microcode?

But yes, indeed they appear to have pulled the 20180108 microcode,
which was previously set to latest at this link, and it is now
reverted to the 20171117 microcode.

https://downloadcenter.intel.com/download/27337/Linux-Processor-Microcode-Data-File?v=t

What this means for people whose CPUs were not crashing ("rebooting" being
the new euphemism for crashing), but who got the variant 2 Spectre
mitigation with the 20180108 microcode: they will lose full mitigation
until Intel gets its ducks in a row.
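
(To see which microcode revision a machine is actually running, the
standard Linux interfaces will tell you:)

grep -m1 microcode /proc/cpuinfo   # revision currently loaded
dmesg | grep -i microcode          # kernel messages about microcode loading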


*eye roll*



> Linus Torvalds agrees:
> http://tcrn.ch/2n2mEcA

His comments aren't about microcode though. And it also looks like he
got IBRS and IBPB confused. The better post on this front is

https://lkml.org/lkml/2018/1/22/598

As far as I know, there still is no mitigation for Spectre variant 1.



-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Centos 7 and btrfs

2017-12-28 Thread Chris Murphy
On Thu, Dec 28, 2017, 8:50 AM <m.r...@5-cent.us> wrote:

> Matt wrote:
> > I am setting up a new test server.  Doing a fresh install from CD onto
> > a couple 4TB drives.  Would like to try btrfs in a RAID 1 format.  Are
> > there any how to's on how to do that?
>
> I was under the impression that upstream was deprecating BTRFS.
>


Upstream being Red Hat, yes. Upstream Btrfs development continues
unaffected; Red Hat was not a major contributor to Btrfs development over
the last few years.

Chris Murphy



>   mark
>
> ___
> CentOS mailing list
> CentOS@centos.org
> https://lists.centos.org/mailman/listinfo/centos
>
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Centos 7 and btrfs

2017-12-28 Thread Chris Murphy
I don't recommend it for root filesystems. First, there are udev, systemd,
and Btrfs limitations that prevent automatic degraded boot. Second, any
bugs you find have a really good chance of already being fixed in upstream
kernels; Btrfs has had hundreds of thousands of lines of changes from a
CentOS 3.10-whatever kernel to a Fedora 4.14.8 kernel.

So if you were to test on CentOS, at the least you'd want to check any bugs
against a current elrepo or Fedora kernel and try to reproduce the problem.
And if it doesn't reproduce, now what? Might as well stick with the newer
kernel.

For data drives, raid1 is fine. Just be aware that Btrfs raid1 is not well
named and is not going to behave like mdadm or LVM raid. There are known
issues: no spares, no auto-resync if a drive is temporarily missing and
degraded writes happen to one drive, no faulting out (kicking out) of
drives when they misbehave, etc.

Anyway, Btrfs has been my primary filesystem for roots and data for years,
and I've experienced no unplanned data loss. But I still keep many backups
(up to seven copies; five are independent, several of which are Btrfs
based).


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Power Fail Protection Update

2017-08-16 Thread Chris Murphy
On Wed, Aug 16, 2017 at 8:49 AM, Chris Olson <chris_e_ol...@yahoo.com> wrote:
> Many thanks to those that responded to my original posting with
> information about Network UPS Tools and commercial UPS products.
>
> In our planning a path forward to implement UPS-based power fail
> protection, we have come across what appears to be an issue with
> the state of the CentOS 6 machines being UPS protected.  Most of
> these machines are desktop/deskside machines that are likely to
> be idle during non-work hours.  It is also likely that they will
> be hibernating or in a power save mode.
>
> In the power save mode, these machines do not respond to keyboard
> or mouse activity.  They also do not respond to network traffic
> such as a ping from other systems on the network.  The method we
> use to wake them up is a quick push on the power button when the
> hibernation state is indicated by the button's yellow LED display.
>
> This state of hibernation leaves us wondering if these systems will
> be able to respond to network messages sent by the UPS.  We have not
> yet made it all the way through the NUT and UPS documentation.
> The hibernation answer may very well be therein, but we have not
> found it so far.  Any help or direction regarding the hibernation
> issue as it relates to UPS power fail protection will be appreciated.

Suspend to RAM and suspend to disk both sync filesystems before the
system suspends, so what should be true is that the file system is
consistent. The log might be dirty, but it would be replayed at the next
boot if there's a power failure.

The only thing that would be lost is any unsaved work. The old school
answer is, save your files before you sleep the computer; i.e. the
burden is on the user.

My position is, this is a solved problem on mobile, where apps take
responsibility for saving state including user data. Some do this
locally, some sync it to the cloud. So far I'm not seeing this become
much of a thing on the desktop, other than macOS where it's fairly
standard at this point. Libreoffice by default saves autorecovery
information every 10 minutes, for example.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Kernel:[Hardware Error]:

2017-08-12 Thread Chris Murphy
On Sat, Aug 12, 2017 at 1:50 PM, Fred Smith
<fre...@fcshome.stoneham.ma.us> wrote:
> I had a series of kernel hardware error reports today while I was away
> from my computer:
>
> Message from syslogd@fcshome at Aug 12 10:12:24 ...
>  kernel:[Hardware Error]: MC2 Error: VB Data ECC or parity error.
>
> Message from syslogd@fcshome at Aug 12 10:12:24 ...
>  kernel:[Hardware Error]: Error Status: Corrected error, no action required.


Cosmic ray corrupted data in RAM, and ECC detected and corrected it?
Whatever it was, it's working as intended.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Btrfs going forward, was: Errors on an SSD drive

2017-08-11 Thread Chris Murphy
'btrfs device add'
'btrfs device delete'

That does the file system resize, including moving extents if necessary in
the delete case, and it removes the device from the volume/array and
wipes the signature from the device. Resize is always online, atomic,
and in theory crash-proof. There's also the seldom-discussed seed
device / overlay feature, useful for live media, that's substantially
simpler to implement and understand compared to dm - and it's also
much more reliable. The dm solution we currently have for live media will
eventually blow up without warning when it gets full and the overlay
is toast.
https://github.com/kdave/btrfs-wiki/wiki/Seed-device
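
For reference, the pair looks like this (a sketch; the mount point and
device names are assumptions):

btrfs device add /dev/sdc /mnt      # grows the file system online
btrfs device delete /dev/sdb /mnt   # migrates extents off, then removes it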



-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Btrfs going forward, was: Errors on an SSD drive

2017-08-11 Thread Chris Murphy
On Fri, Aug 11, 2017 at 11:37 AM, hw <h...@gc-24.de> wrote:

> I want to know when a drive has failed.  How can I monitor that?  I've begun
> to use btrfs only recently.

Maybe check out epylog and have it monitor for Btrfs messages. That's
your earliest warning, because Btrfs will complain about any csum
mismatch even if the hardware is not reporting problems. For impending
drive failures, your best bet is still smartd, even though the stats are
that it only predicts drive failures maybe 60% of the time.





>Chris Murphy wrote:
>> There's 1500 to 3000 line changes to Btrfs code per kernel release.
>> There's too much to backport most of it. Serious fixes do get
>> backported by upstream to longterm kernels, but to what degree, you
>> have to check the upstream changelogs to know about it.
>>
>> And right now most backports go to only 4.4 and 4.9. And I can't tell
>> you what kernel-3.10.0-514.10.2.el7.x86_64.rpm translates into, that
>> requires a secret decoder ring near as I can tell as it's a kernel
>> made from multiple branches,  and then also a bunch of separate
>> patches.
>
>
> So these kernels are a mess.  What's the point of backports when they aren't
> done correctly?

*sigh* Can we try to act rationally instead of emotionally?
Backporting is fucking hard. Have you bothered to look at kernel code
and how backporting is done? Or do you just assume that it's like
microwaving a hot pocket or something trivial? If it were easy, it
would be automated. It's not easy. A human has to look at the new
code (new fixes for old problems) and graft it onto the old way of doing
things, and very often the new code does not apply cleanly to old
kernels. It's just a fact. And now that person has to come up with
a fix using the old methods. That's a backport.

It is only messy to an outside observer, which includes me. People who
are doing the work at Red Hat very clearly understand it, the whole
point is to have a thoroughly understood stable conservative kernel.
They're very picky about taking on new features which tends to include
new regressions.



> This puts a big stamp "stay away from" on RHEL/Centos.

You have to pick your battles is what it comes down to. It is
completely legitimate to use CentOS for stability elsewhere, and use a
nearly-upstream kernel from elrepo.org or Fedora.

Offhand I'm not sure who is building CentOS-compatible kernel packages
based on upstream longterm. A really good compromise right now is the
4.9 series, so if someone has a 4.9.42 kernel somewhere, that'd be
neat. It's not difficult to build yourself either, for that matter. I
can't advise you with Nvidia stuff though.


>Chris Murphy wrote
>> Red Hat are working on a new user space wrapper and volume format
>> based on md, device mapper, LVM, and XFS.
>> http://stratis-storage.github.io/
>> https://stratis-storage.github.io/StratisSoftwareDesign.pdf
>>
>> It's an aggressive development schedule and as so much of it is
>> journaling and CoW based I have no way to assess whether it ends up
>
>
> So in another 15 or 20 years, some kind of RH file system might become
> usable.

Lovely more hyperbole...

Read the document. It talks about an initial production quality
release 1st half of next year. It admits they're behind, *and* it also
says they can't wait 10 more years. So maybe 3? Maybe 5? I have no
idea. File systems are hard. Backups are good.


>Chris Murphy wrote:
>> tested. But this is by far the most cross platform solution: FreeBSD,
>> Illumos, Linux, macOS. And ZoL has RHEL/CentOS specific packages.
>
>
> That can be an advantage.
>
> What is the state of ZFS for Centos?  I'm going to need it because I have
> data on some disks that were used for ZFS and now need to be read by a
> machine running Centos.
>
> Does it require a particular kernel version?

Well, not to be a jerk but RTFM:
http://zfsonlinux.org/

It's like - I can't answer your question without reading it myself. So
there you go. I think it's DKMS based, so it has some kernel
dependencies, but I think it's quite a bit more tolerant of different
kernel versions while maintaining the same relative ZFS feature/bug set
for that particular release - basically it's decoupled from Linux.



>> But I can't tell you for sure what ZoL's faulty device behavior is
>> either, whether it ejects faulty or flaky devices and when, or if like
>> Btrfs is just tolerates it.
>
>
> You can monitor the disks and see when one has failed.


That doesn't tell me anything about how it differs from anything else.
mdadm offers email notifications as an option; LVM has its own
notification system I haven't really looked at, but I don't think it
includes email notifications; smartd can do emails but also dumps
standard messages to dmesg.


>
>> The elrepo.org folks can still sanely set CONFIG_BTRFS_FS=m, but I
>> suspect if RHEL unsets that in RHEL 8 kernels, that CentOS will do the
>> same.

Re: [CentOS] Btrfs going forward, was: Errors on an SSD drive

2017-08-11 Thread Chris Murphy
On Fri, Aug 11, 2017 at 11:17 AM, Mark Haney <mark.ha...@neonova.net> wrote:
> On Fri, Aug 11, 2017 at 1:00 PM, Chris Murphy <li...@colorremedies.com>
> wrote:
>
>> Changing the subject since this is rather Btrfs specific now.
>>
>>
>>
>> >>
>> >> Sounds like a hardware problem. Btrfs is explicitly optimized for SSD,
>> the
>> >> maintainers worked for FusionIO for several years of its development. If
>> >> the drive is silently corrupting data, Btrfs will pretty much
>> immediately
>> >> start complaining where other filesystems will continue. Bad RAM can
>> also
>> >> result in scary warnings where you don't with other filesytems. And I've
>> >> been using it in numerous SSDs for years and NVMe for a year with zero
>> >> problems.
>> >
>> >
>>
>>
>> LMFAO. Trust me, I tried several SSDs with BTRFS over the last couple of
>> years and had trouble the entire time. I constantly had to scrub the drive,
>> had freezes under moderate load and general nastiness.  If that's
>> 'optimized for SSDs', then something is very wrong with the definition of
>> optimized.  Not to mention the fact that BTRFS is not production ready for
>> anything, and I'm done trying to use it and going with XFS or EXT4
>> depending on my need.


Could you get your quoting in proper order? The way you did this looks
like I wrote the above steaming pile rant.

Whoever did write it, it's ridiculous, meaning it's worthy of
ridicule. From the provably unscientific and non-technical, to
craptasticly snotty writing "not to mention the fact" and then
proceeding to mention it. That's just being an idiot, and then framing
it.

Where are your bug reports? That question is a trap if you haven't in
fact filed any bugs, in particular upstream.



> As for a hardware problem, the drives were ones purchased in Lenovo
> professional workstation laptops, and, while you do get lemons
> occasionally, I tried 4 different ones of the exact same model and had the
> exact same issues.  Its highly unlikely I'd get 4 of the same brand to have
> hardware issues.

In fact it's highly likely, because a.) it's a non-scientific sample
and b.) the hardware is intentionally identical.

For SSDs all the sauce is in the firmware. If the model and firmware
were all the same, it is more likely to be a firmware bug than it is
to be a Btrfs bug. There are absolutely cases where Btrfs runs into
problems that other file systems don't, because Btrfs is designed to
detect them and others aren't. There's a reason why XFS and ext4 have
added metadata checksumming in recent versions. Hardware lies.
Firmware has bugs and it causes problems. And it can be months before
it materializes into a noticeable problem.

https://lwn.net/Articles/698090/

Btrfs tends to complain early and often when it encounters confusion.
It also will go read-only sooner than other file systems, in order to
avoid corrupting the file system. Almost always a normal mount will
automatically fall back to the most recent consistent state. Sometimes
it needs to be mounted with the -o usebackuproot option. And in still
fewer cases it will need to be mounted read-only, where other file
systems won't even tolerate that in the same situation.

The top two complaints I have about Btrfs is a.) what to do when a
normal mount doesn't work, it's really non-obvious what you *should*
do and in what order because there are many specialized tools for
different problems, so if your file system doesn't mount normally you
are really best off going straight to the upstream list and asking for
help, which is sorta shitty but that's the reality; b.) there are
still some minority workloads where users have to micromanage the file
system with a filtered balance to avoid a particular variety of bogus
enospc. Most of the enospc problems are fixed with some changes in
kernel 4.1 and 4.8. The upstream expert users are discussing some sort
of one size fits all user space filtered (meaning partial) balance so
regular users don't have to micromanage. It's completely a legitimate
complaint that having to micromanage a file system is b.s. This has
been a particularly difficult problem, and it's been around for a long
enough time that I think a lot of normal workloads that would have run
into problems have been masked (no problem) because so many users have
gotten into the arguably bad habit of doing their own filtered
balances.

But as for Btrfs having some inherent flaw that results in corrupt
file systems, it's silly. There are thousands of users in many
production workloads using this file system and they'd have given up a
long time ago, including myself.


>Once I went back to ext4 on those systems I could run the
> devil out of them and not see any freezes under eve

Re: [CentOS] Errors on an SSD drive

2017-08-11 Thread Chris Murphy
On Fri, Aug 11, 2017 at 7:53 AM, Robert Nichols
<rnicholsnos...@comcast.net> wrote:
> On 08/10/2017 11:06 AM, Chris Murphy wrote:
>>
>> On Thu, Aug 10, 2017, 6:48 AM Robert Moskowitz <r...@htt-consult.com>
>> wrote:
>>
>>>
>>>
>>> On 08/09/2017 10:46 AM, Chris Murphy wrote:
>>>>
>>>> If it's a bad sector problem, you'd write to sector 17066160 and see if
>>>
>>> the
>>>>
>>>> drive complies or spits back a write error. It looks like a bad sector
>>>> in
>>>> that the same LBA is reported each time but I've only ever seen this
>>>> with
>>>> both a read error and a UNC error. So I'm not sure it's a bad sector.
>>>>
>>>> What is DID_BAD_TARGET?
>>>
>>>
>>> I have no experience on how to force a write to a specific sector and
>>> not cause other problems.  I suspect that this sector is in the /
>>> partition:
>>>
>>> Disk /dev/sda: 240.1 GB, 240057409536 bytes, 468862128 sectors
>>> Units = sectors of 1 * 512 = 512 bytes
>>> Sector size (logical/physical): 512 bytes / 512 bytes
>>> I/O size (minimum/optimal): 512 bytes / 512 bytes
>>> Disk label type: dos
>>> Disk identifier: 0xc89d
>>>
>>>  Device Boot  Start End  Blocks   Id  System
>>> /dev/sda12048 2099199 1048576   83  Linux
>>> /dev/sda2 2099200 4196351 1048576   82  Linux swap /
>>> Solaris
>>> /dev/sda3 4196352   468862127   232332888   83  Linux
>>>
>>
>> LBA 17066160 would be on sda3.
>>
>> dd if=/dev/sda skip=17066160 count=1 2>/dev/null | hexdump -C
>>
>> That'll read that sector and display hex and ascii. If you recognize the
>> contents, it's probably user data. Otherwise, it's file system metadata or
>> a system binary.
>>
>> If you get nothing but an I/O error, then it's lost so it doesn't matter
>> what it is, you can definitely overwrite it.
>>
>> dd if=/dev/zero of=/dev/sda seek=17066160 count=1
>
>
> You really don't want to do that without first finding out what file is
> using
> that block. You will convert a detected I/O error into silent corruption of
> that file, and that is a much worse situation.

Yeah, he'd want to do an fsck -f and see if repairs are made, and also
run rpm -Va. There *will* be legitimately modified files, so it's going to
be tedious to sort out exactly which ones are legitimately modified
vs. corrupt. If it's a configuration file, I'd say you could ignore it,
but any binary with changes other than permissions needs to be replaced,
and one of those is the likely culprit.
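
A way to cut the noise down (a sketch; in rpm -Va output, a '5' in the
third column flags a content/digest mismatch):

rpm -Va | grep '^..5'
# entries marked with a 'c' before the path are config files; anything
# else with changed contents is a candidate for reinstall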

The smartmontools page has hints on how to figure out what file is
affected by a particular corrupt sector, but the more layers are
involved, the more difficult that gets. I'm not sure there's an easy way
to do this with LVM in between the physical device and the file system.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Btrfs going forward, was: Errors on an SSD drive

2017-08-11 Thread Chris Murphy
Changing the subject since this is rather Btrfs specific now.



On Fri, Aug 11, 2017 at 5:41 AM, hw <h...@gc-24.de> wrote:
> Chris Murphy wrote:
>>
>> On Wed, Aug 9, 2017, 11:55 AM Mark Haney <mark.ha...@neonova.net> wrote:
>>
>>> To be honest, I'd not try a btrfs volume on a notebook SSD. I did that on
>>> a
>>> couple of systems and it corrupted pretty quickly. I'd stick with
>>> xfs/ext4
>>
>>
>> if you manage to get the drive working again.
>>>
>>>
>>
>> Sounds like a hardware problem. Btrfs is explicitly optimized for SSD, the
>> maintainers worked for FusionIO for several years of its development. If
>> the drive is silently corrupting data, Btrfs will pretty much immediately
>> start complaining where other filesystems will continue. Bad RAM can also
>> result in scary warnings where you don't with other filesytems. And I've
>> been using it in numerous SSDs for years and NVMe for a year with zero
>> problems.
>
>
> That's one thing I've been wondering about:  When using btrfs RAID, do you
> need to somehow monitor the disks to see if one has failed?

Yes.

The block layer has no faulty device handling, i.e. it just reports
whatever problems the device or the controller report, whereas mdadm
and LVM have implemented policies for ejecting a block device (setting
it to faulty). Btrfs does not do that; it'll just keep trying to use a
faulty device.

So you have to setup something that monitors for either physical
device errors, or btrfs errors or both, depending on what you want.




>
>> On CentOS though, I'd get newer btrfs-progs RPM from Fedora, and use
>> either
>> an elrepo.org kernel, a Fedora kernel, or build my own latest long-term
>> from kernel.org. There's just too much development that's happened since
>> the tree found in RHEL/CentOS kernels.
>
>
> I can't go with a more recent kernel version before NVIDIA has updated their
> drivers to no longer need fence.h (or what it was).
>
> And I thought stuff gets backported, especially things as important as file
> systems.

There are 1500 to 3000 lines of changes to Btrfs code per kernel release.
There's too much to backport most of it. Serious fixes do get
backported by upstream to longterm kernels, but to what degree, you
have to check the upstream changelogs to know.

And right now most backports go only to 4.4 and 4.9. And I can't tell
you what kernel-3.10.0-514.10.2.el7.x86_64.rpm translates into; that
requires a secret decoder ring near as I can tell, as it's a kernel
made from multiple branches, and then also a bunch of separate
patches.



>> Also FWIW Red Hat is deprecating Btrfs, in the RHEL 7.4 announcement.
>> Support will be removed probably in RHEL 8. I have no idea how it'll
>> affect
>> CentOS kernels though. It will remain in Fedora kernels.
>
>
> That would suck badly, to the point at which I'd have to look for yet another
> distribution.  The only one remaining is Arch.
>
> What do they suggest as a replacement?  The only other FS that comes close
> is ZFS, and removing btrfs altogether would be taking living in the past
> too many steps too far.

Red Hat are working on a new user space wrapper and volume format
based on md, device mapper, LVM, and XFS.
http://stratis-storage.github.io/
https://stratis-storage.github.io/StratisSoftwareDesign.pdf

It's an aggressive development schedule and as so much of it is
journaling and CoW based I have no way to assess whether it ends up
with its own set of problems, not dissimilar to Btrfs. We'll just have
to see. But if there are underlying guts in the device-mapper that do
things better/faster/easier than Btrfs, the Btrfs devs have said they
can hook into device-mapper for these things to consolidate code base,
in particular for the multiple device handling. By its own vague
timetable it will be years before it has "rough ZFS features"; and,
factoring in bootloader support and to what degree other distros pick
up on it, it very well could end up being widely adopted, or it could
be a Red Hat-only thing in practice.

Canonical appears to be charging ahead with OpenZFS included by
default out of the box (although not for rootfs yet I guess), and that
has an open ended and possibly long window before legal issues get
tested. But this is by far the most cross platform solution: FreeBSD,
Illumos, Linux, macOS. And ZoL has RHEL/CentOS specific packages.

But I can't tell you for sure what ZoL's faulty device behavior is
either, whether it ejects faulty or flaky devices and when, or if like
Btrfs is just tolerates it.

The elrepo.org folks can still sanely set CONFIG_BTRFS_FS=m, but I
suspect if RHEL unsets that in RHEL 8 kernels, that CentOS will do the
same.



-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Errors on an SSD drive

2017-08-10 Thread Chris Murphy
On Thu, Aug 10, 2017, 6:48 AM Robert Moskowitz <r...@htt-consult.com> wrote:

>
>
> On 08/09/2017 10:46 AM, Chris Murphy wrote:
> > If it's a bad sector problem, you'd write to sector 17066160 and see if
> the
> > drive complies or spits back a write error. It looks like a bad sector in
> > that the same LBA is reported each time but I've only ever seen this with
> > both a read error and a UNC error. So I'm not sure it's a bad sector.
> >
> > What is DID_BAD_TARGET?
>
> I have no experience on how to force a write to a specific sector and
> not cause other problems.  I suspect that this sector is in the /
> partition:
>
> Disk /dev/sda: 240.1 GB, 240057409536 bytes, 468862128 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk label type: dos
> Disk identifier: 0xc89d
>
> Device Boot  Start End  Blocks   Id  System
> /dev/sda12048 2099199 1048576   83  Linux
> /dev/sda2 2099200 4196351 1048576   82  Linux swap /
> Solaris
> /dev/sda3 4196352   468862127   232332888   83  Linux
>

LBA 17066160 would be on sda3.

dd if=/dev/sda skip=17066160 count=1 2>/dev/null | hexdump -C

That'll read that sector and display hex and ascii. If you recognize the
contents, it's probably user data. Otherwise, it's file system metadata or
a system binary.

If you get nothing but an I/O error, then it's lost so it doesn't matter
what it is, you can definitely overwrite it.

dd if=/dev/zero of=/dev/sda seek=17066160 count=1

If you want extra confirmation, you can first do 'smartctl -t long
/dev/sda' and then, after the prescribed testing time (which is listed),
check it again with 'smartctl -a /dev/sda' and see if the test completed,
or if, under the self-test log section, it shows it was aborted and lists
a number under the LBA_of_first_error column.



> But I don't know where it is in relation to the way the drive was
> formatted in my notebook.  I think it would have been in the / partition.
>




>
> > And what do you get for
> > smartctl -x 
>
> About 17KB of output?


Can you attach it as a file to the list? If the list won't accept the
attachment, put it up on fpaste.org or pastebin or something like that.
MUA's tend to nerf the output so don't paste it into an email.





> I don't know how to read what it is saying, but
> noted in the beginning:
>
> Write SCT (Get) XXX Error Recovery Control Command failed: scsi error
> badly formed scsi parameters
>
> Don't know what this means...
>
> BTW, the system is a Cubieboard2 armv7 SoC running Centos7-armv7hl. This
> is the first time I have used an SSD on a Cubie, but I know it is
> frequently done.  I would have to ask on the Cubie forum what others
> experience with SSDs have been.
>

It's very common. I think this is just an ordinary bad sector, if that LBA
value is consistent. If it's a new SSD it's slightly concerning. You can
either keep an eye on it, or put a little pressure on the manufacturer or
place of purchase that you have a bad sector and would like to swap out the
unit.

SSDs, and in particular SD cards (which you're not using; those show up
as /dev/mmcblk0...), store your data as a probabilistic representation,
and through a lot of magic, the probability of retrieving your data
correctly from an SSD is made very high. Almost deterministic.

The magic is in the firmware, and so there's some possibility any given SSD
problem is related to a firmware bug. So it's worth comparing the firmware
reported by smartctl and what the manufacturer has, and then their
changelog. Most have a way to update firmware without Windows, but don't
have images that will boot an arm board, usually the "universal" updater is
based on FreeDOS funny enough. You'd need to stick the SSD in an x86
computer to do this. Hilariously perverse, I did this with a Samsung 830
SSD a while back, sticking it into a Macbook Pro, and burned that firmware
ISO onto a DVD-RW, and it booted that Mac (using the firmware's BIOS
compatibility support module) and updated the SSD's firmware without a
problem.



Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Errors on an SSD drive

2017-08-10 Thread Chris Murphy
On Wed, Aug 9, 2017, 11:55 AM Mark Haney <mark.ha...@neonova.net> wrote:

> To be honest, I'd not try a btrfs volume on a notebook SSD. I did that on a
> couple of systems and it corrupted pretty quickly. I'd stick with xfs/ext4

if you manage to get the drive working again.
>

Sounds like a hardware problem. Btrfs is explicitly optimized for SSD, the
maintainers worked for FusionIO for several years of its development. If
the drive is silently corrupting data, Btrfs will pretty much immediately
start complaining where other filesystems will continue. Bad RAM can also
result in scary warnings where you don't with other filesytems. And I've
been using it in numerous SSDs for years and NVMe for a year with zero
problems.

On CentOS though, I'd get newer btrfs-progs RPM from Fedora, and use either
an elrepo.org kernel, a Fedora kernel, or build my own latest long-term
from kernel.org. There's just too much development that's happened since
the tree found in RHEL/CentOS kernels.

Also FWIW Red Hat is deprecating Btrfs, in the RHEL 7.4 announcement.
Support will be removed probably in RHEL 8. I have no idea how it'll affect
CentOS kernels though. It will remain in Fedora kernels.

Anyway, blkdiscard can be used on an SSD, whole device or a partition, to
zero it out. And at least recent ext4 and XFS mkfs will do a blkdiscard,
same as mkfs.btrfs.
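
E.g. (destructive, so double-check the device node; /dev/sdb here is an
assumption):

blkdiscard /dev/sdb    # discard every sector on the whole device
blkdiscard /dev/sdb1   # or just one partition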


Chris Murphy






>
> On Wed, Aug 9, 2017 at 1:48 PM, hw <h...@gc-24.de> wrote:
>
> > Robert Moskowitz wrote:
> >
> >> I am building a new system using an Kingston 240GB SSD drive I pulled
> >> from my notebook (when I had to upgrade to a 500GB SSD drive).  Centos
> >> install went fine and ran for a couple days then got errors on the
> >> console.  Here is an example:
> >>
> >> [168176.995064] sd 0:0:0:0: [sda] tag#14 FAILED Result:
> >> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> >> [168177.004050] sd 0:0:0:0: [sda] tag#14 CDB: Read(10) 28 00 01 04 68 b0
> >> 00 00 08 00
> >> [168177.011615] blk_update_request: I/O error, dev sda, sector 17066160
> >> [168487.534510] sd 0:0:0:0: [sda] tag#17 FAILED Result:
> >> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> >> [168487.543576] sd 0:0:0:0: [sda] tag#17 CDB: Read(10) 28 00 01 04 68 b0
> >> 00 00 08 00
> >> [168487.551206] blk_update_request: I/O error, dev sda, sector 17066160
> >> [168787.813941] sd 0:0:0:0: [sda] tag#20 FAILED Result:
> >> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> >> [168787.822951] sd 0:0:0:0: [sda] tag#20 CDB: Read(10) 28 00 01 04 68 b0
> >> 00 00 08 00
> >> [168787.830544] blk_update_request: I/O error, dev sda, sector 17066160
> >>
> >> Eventually, I could not do anything on the system.  Not even a 'reboot'.
> >> I had to do a cold power cycle to bring things back.
> >>
> >> Is there anything to do about this or trash the drive and start anew?
> >>
> >
> > Make sure the cables and power supply are ok.  Try the drive in another
> > machine
> > that has a different controller to see if there is an incompatibility
> > between
> > the drive and the controller.
> >
> > You could make a btrfs file system on the whole device: that should say
> > that
> > a trim operation is performed for the whole device.  Maybe that helps.
> >
> > If the errors persist, replace the drive.  I'd use Intel SSDs because they
> > seem to have the least problems with broken firmware.  Do not use SSDs
> > with hardware RAID controllers unless the SSDs were designed for this
> > application.
> >
> >
> > ___
> > CentOS mailing list
> > CentOS@centos.org
> > https://lists.centos.org/mailman/listinfo/centos
> >
> >
>
>
> --
> Mark Haney
> Network Engineer at NeoNova
> 919-460-3330 (opt 1) • mark.ha...@neonova.net
> www.neonova.net
> ___
> CentOS mailing list
> CentOS@centos.org
> https://lists.centos.org/mailman/listinfo/centos
>
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Errors on an SSD drive

2017-08-09 Thread Chris Murphy
If it's a bad sector problem, you'd write to sector 17066160 and see if the
drive complies or spits back a write error. It looks like a bad sector in
that the same LBA is reported each time but I've only ever seen this with
both a read error and a UNC error. So I'm not sure it's a bad sector.

What is DID_BAD_TARGET?

And what do you get for
smartctl -x 

Chris Murphy

On Wed, Aug 9, 2017, 8:03 AM Robert Moskowitz <r...@htt-consult.com> wrote:

> I am building a new system using an Kingston 240GB SSD drive I pulled
> from my notebook (when I had to upgrade to a 500GB SSD drive).  Centos
> install went fine and ran for a couple days then got errors on the
> console.  Here is an example:
>
> [168176.995064] sd 0:0:0:0: [sda] tag#14 FAILED Result:
> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> [168177.004050] sd 0:0:0:0: [sda] tag#14 CDB: Read(10) 28 00 01 04 68 b0
> 00 00 08 00
> [168177.011615] blk_update_request: I/O error, dev sda, sector 17066160
> [168487.534510] sd 0:0:0:0: [sda] tag#17 FAILED Result:
> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> [168487.543576] sd 0:0:0:0: [sda] tag#17 CDB: Read(10) 28 00 01 04 68 b0
> 00 00 08 00
> [168487.551206] blk_update_request: I/O error, dev sda, sector 17066160
> [168787.813941] sd 0:0:0:0: [sda] tag#20 FAILED Result:
> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> [168787.822951] sd 0:0:0:0: [sda] tag#20 CDB: Read(10) 28 00 01 04 68 b0
> 00 00 08 00
> [168787.830544] blk_update_request: I/O error, dev sda, sector 17066160
>
> Eventually, I could not do anything on the system.  Not even a
> 'reboot'.  I had to do a cold power cycle to bring things back.
>
> Is there anything to do about this or trash the drive and start anew?
>
> Thanks
>
> ___
> CentOS mailing list
> CentOS@centos.org
> https://lists.centos.org/mailman/listinfo/centos
>
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Centos 7 Samba - all shares read only

2017-05-04 Thread Chris Murphy
Pretty sure smb gets "control" of a directory via the group. For my
setup, each directory defined by a path in smb.conf has the group
smbusers and rwx group permissions. This is applied just to that
directory; it is not applied recursively. The files and folders in
that directory have the actual remote user's ownership and
permissions.

What is applied recursively is the SELinux label. I find it's better
to have a dedicated filesystem volume so you can use the mount option
context="system_u:object_r:samba_share_t:s0", which will apply that
context to the whole file system. If only part of a file system volume
is being shared, then you'll need to use chcon -R
"system_u:object_r:samba_share_t:s0" to apply that context to
everything. New files and directories will inherit this context (so
long as it's a copy and not a move; so if you move things behind the
scenes outside of Samba, you can run into label problems, since
inheritance doesn't apply to moving).
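
The dedicated-volume variant as an /etc/fstab line looks roughly like
this (device, mount point, and filesystem here are assumptions; adjust
to yours):

/dev/mapper/brick  /brick  xfs  defaults,context="system_u:object_r:samba_share_t:s0"  0 0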


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] OT: systemd Poll - So Long, and Thanks for All the fish.

2017-04-19 Thread Chris Murphy
On Wed, Apr 19, 2017 at 5:21 AM, James B. Byrne <byrn...@harte-lyne.ca> wrote:
>
> On Mon, April 17, 2017 17:13, Warren Young wrote:
>
>>
>> Also, I’ll remind the list that one of the *prior* times the systemd
>> topic came up, I was the one reminding people that most of our jobs
>> summarize as "Cope with change."
>>
>
> At some point 'coping with change' is discovered to consume a
> disproportionate amount of resources for the benefits obtained.  In my
> sole opinion the Linux community appears to have a
> change-for-change-sake fetish. This is entirely appropriate for an
> experimental project.  The mistake that I made many years ago was
> inferring that Linux was nonetheless suitable for business.
>
> To experimenters a ten year product cycle may seem an eternity. To
> many organisations ten years is barely time to work out all the kinks
> and adapt internal processes to automated equivalents.  And the
> smaller the business the more applicable that statement becomes.
>
> I do not have any strong opinion about systemd as I have virtually no
> experience with it.  But the regular infliction of massively
> disruptive changes to fundamental software has convinced us that Linux
> does not meet our business needs. Systemd and Upstart are not the
> cause of that.  They are symptoms of a fundamental difference of focus
> between what our firm needs and what the Linux community wants.

Apple has had massively disruptive changes on OS X and iOS. Windows
has had a fairly disruptive set of changes in Windows 10. About the
only things that don't change are industrial OS's.

When it comes to breaking user space, there are explicit rules against
that in Linux kernel development. And internally consistent API/ABI
stability is something you're getting in CentOS/RHEL kernels; it's one
of the reasons the distributions exist. But the idea that Windows and
OS X have better overall API stability is, I think, untrue, having
spoken to a very wide assortment of developers who build primarily
user space apps.

What does happen is that in-kernel ABI changes can break your driver, as
there's no upstream promise of ABI compatibility within the kernel
itself. The effect of this is very real on, say, Android, and might be
one of the reasons for Google's Fuchsia project, which puts most of the
drivers, including video drivers, into user space. And Microsoft also
rarely changes things in their kernel, so again drivers tend not to
break.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CentOS-6.8 fsck report Maximal Count

2017-03-19 Thread Chris Murphy
On Tue, Mar 14, 2017, 7:41 AM James B. Byrne <byrn...@harte-lyne.ca> wrote:

> On Fri, March 10, 2017 11:57, m.r...@5-cent.us wrote:
>
> >
> > Looks like only one sector's bad. Running badblocks should,
> > I think, mark that sector as bad, so the system doesn't try
> > to read or write there. I've got a user whose workstation has
> > had a bad sector running for over a year. However, if it
> > becomes two, or four, or 64 sectors, it's replacement
> > time, asap.
> > 
>
>
> Bear with me on this.  The last time I did anything like this I ended
> up having to boot into recovery mode from an install cd and do this by
> hand.  This is not an option in the present circumstance as the unit
> is a headless server in a remote location.
>
> If I do this:
>
> echo '-c' > /fsckoptions
> touch /forcefsck
> shutdown -r now
>
> Will this repair the bad block and bring the system back up? If not
> then what other options should I use?
>
> The bad block is located in an LV assigned to a libvirt pool
> associated with a single vm.  Can this be checked and corrected
> without having to deal with the base system? If so then how?



You'll need to search the smartmontools site for their doc on bad sectors.
There's a how-to for finding which file is affected by the bad sector, so
you can replace it. That's the only way to fix the problem.

This gets tricky going through LVM.
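
For an ext filesystem, the mapping goes roughly like this (a sketch
following the smartmontools bad-block how-to; device and numbers are
hypothetical, and with LVM you first have to translate the LBA through
the partition and LV offsets):

#   fs_block = (LBA - partition_start_LBA) * 512 / fs_block_size
debugfs -R "icheck 2269012" /dev/vg0/lv_vm   # block -> inode
debugfs -R "ncheck 1234567" /dev/vg0/lv_vm   # inode -> path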


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] usb 3.1 support in CentOS 7

2017-03-19 Thread Chris Murphy
On Tue, Mar 14, 2017, 6:11 PM Jerry Geis <jerry.g...@gmail.com> wrote:

> Hi All - Been trying to find out if USB 3.1 support is in CentOS 7 and
> kernel 3.10 ?
>
> I see its in the 4.X kernel - but what about CentOS 7?




USB 3.1 Gen 1 is the same thing as USB 3.0.

USB 3.1 Gen 2 is a different thing entirely, and I expect it needs a much
newer kernel. I don't think it strictly requires USB-C, but in practice
that's the only form factor I've seen it appear in so far.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] [OT] Network Attached Storage

2017-01-03 Thread Chris Murphy
I suggest going with whatever you are, or are willing to become, most
familiar with. The simpler the better. And expect that it will break;
therefore build in a contingency for that breakage.

In my case, I'm using an Intel NUC with a Pentium N3700, a Dyconn USB
3.0 hub, and a bunch of laptop drives in enclosures connected to that
hub. Fedora 25 Server is the OS; Btrfs is the file system. The primary
volume is just a single-disk volume, using Samba to share with
Mac/Windows/Linux; periodically the shared subvolume gets snapshotted,
and I use btrfs send/receive to replicate the incremental changes to
additional independent volumes (the USB 3.0 drives). There are five
copies of the data locally, three of which are independent copies.
An additional independent subset of that data is "in the cloud" so
that it's off site. It's a very basic setup, but it's also been stable.
But I'm also really prepared to lose any of the copies at any time.
Even if I were using XFS on LVM, I'd still keep this many copies.
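
The replication step itself is short (a sketch with hypothetical paths):

btrfs subvolume snapshot -r /data/share /data/snaps/share.new
btrfs send -p /data/snaps/share.prev /data/snaps/share.new | btrfs receive /backup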


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CentOS 7 samba server + mac client problems

2017-01-03 Thread Chris Murphy
This is my smb.conf, it's extremely basic.

https://paste.fedoraproject.org/519466/34678791/

Note that max version is commented out. I've been using some version
of Samba 4 for a little over a year, and with macOS versions 10.9
through 10.12.

Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CentOS 7 samba server + mac client problems

2017-01-03 Thread Chris Murphy
Only Mac clients are affected? Have you tested a Linux (e.g. Fedora 25
live OS would do) client?

It's necessary for all files to have selinux context
system_u:object_r:samba_share_t:s0. You can either user chcon -R to
apply it recursively to a particular directory you're sharing, or if
it's an entire (dedicated) volume, you can apply it volume wide with a
mount option, 'mount -o context="system_u:object_r:samba_share_t:s0"
/dev/mapper/brick /brick'
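
The chcon form, with an example path, would be:

chcon -R -t samba_share_t /srv/share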

I'm not doing anything different for my Mac client than for a Linux
one, at least not that I'm aware of; however I'm using samba-4.5.3
since I'm using Fedora 25 server. It may be that older versions need
to force a lower version of SMB depending on which Mac client you're
using.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] HP Envy EFI problem after BIOS update

2016-09-15 Thread Chris Murphy
> [root@gary ~]# efibootmgr -o 2002,2001,0003,0002

Try
efibootmgr -v -O -T   ## capital O, to delete the entire boot order and timeout
efibootmgr -v -o 3,2 -t 3


I think making the DVD or USB the default boot options is a bad idea,
I wouldn't do that. Use the boot manager on demand if you need to boot
DVD or USB. The things that should be listed first are the things you
want booted by default and then the next fallback.

The other thing to do is check HP's web site directly for an even
newer version of the firmware. It may be that the one you have fixes a
particular bug Microsoft is bothered by, but has an NVRAM garbage
collection bug that they don't care about or don't know about (yet).


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] HP Envy EFI problem after BIOS update

2016-09-14 Thread Chris Murphy
Exactly what efibootmgr command are you using?



Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] HP Envy EFI problem after BIOS update

2016-09-14 Thread Chris Murphy
Multiboot is basically a PITA, the Ux sucks. Almost invariably you're
better off figuring out how to navigate the firmware's boot manager to
choose Windows vs Fedora vs CentOS; where you can use efibootmgr -o to
get the bootorder such that the first entry at least is the one you
usually want to boot by default. And then use the firmware's boot
manager to pick something else as an exception. Although I sometimes
find efibootmgr -n neat for a one-time boot.

If you want to standardize on a single GRUB boot menu that has all
options available, this is basically total shit right now, you have to
do it manually because between grub2-mkconfig and os-prober, they
aren't smart enough to create the proper boot entries. It *should* be
true that it'll create the proper boot entry for Windows, but only
Fedora 24's GRUB supports Secure Boot chainloading the Windows
bootloader, the CentOS one can boot the kernel with Secure Boot
enabled, but can't chainload another bootloader with Secure Boot
enabled. So the first step is you have to standardize on the Fedora 24
version of GRUB if you want to use Secure Boot and I can only advise
that you do because generally I think it's bad advice to disable it.

Next, how to boot CentOS from the Fedora 24 GRUB menu? grub2-mkconfig
and os-prober conspire to create wholly new CentOS boot entries rather
than using the unique CentOS boot entries; and further if you do a
CentOS kernel update, that'll only update the CentOS grub.cfg not the
Fedora one. This is why this is such shit. What I do is disable
os-prober entirely by adding GRUB_DISABLE_OS_PROBER="true" to
/etc/default/grub and then take your pick between
/etc/grub.d/40_custom and /etc/grub.d/41_custom: the first one you
just add the entry directly in that file, and the 2nd one you create a
drop in cfg of your own called custom.cfg. It's probably a little more
reliable to use 41_custom because any of these files can be replaced
with a grub update.

So your menu entry should be something like this:

menuentry 'CentOS menu' {
configfile /EFI/centos/grub.cfg
}

Now grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg

And what you ought to have, if I didn't mess that up, is a CentOS item
in the Fedora menu, and when you click that, it'll read and use the
CentOS grub.cfg instead. Now you get all the custom CentOS boot
params, and whenever CentOS gets a kernel update and updates its
grub.cfg, you'll see those changes without having to do anything.
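
To be explicit about the 41_custom route: as far as I know, on a UEFI
install $prefix points at /EFI/fedora on the ESP (check with 'echo
$prefix' from the GRUB shell if unsure), so creating the drop-in looks
like:

cat > /boot/efi/EFI/fedora/custom.cfg <<'EOF'
menuentry 'CentOS menu' {
configfile /EFI/centos/grub.cfg
}
EOF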


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] HP Envy EFI problem after BIOS update

2016-09-14 Thread Chris Murphy
On Wed, Sep 14, 2016 at 9:06 AM, Gary Stainburn <g...@ringways.co.uk> wrote:
> I don't know if this additional information helps. Looking at he efibootmgr
> output, is the centos entry pointing to the right place?
>
> Also, any idea why the update isn't being made permanent?

Probably because the specified file wasn't found by the firmware on
partition 1, so it removed what it thinks is a bogus entry. Just a
guess.


>
> Number  Start   End    Size   File system  Name                  Flags
>  1      1049kB  683MB  682MB  ntfs         Basic data partition  hidden, diag
>  2      683MB   892MB  210MB  fat16        EFI System Partition  boot


The proper command in this case is

efibootmgr -c -d /dev/sda -p 2 -L CentOS -l \\EFI\\centos\\shim.efi

efibootmgr -v is more rewarding




> Boot0001* Fedora
> HD(2,145800,64000,14c4ac1d-abd8-4121-84ee-c05a825920de)File(\EFI\fedora\shim.efi)RC

Looks fine. But notice the boot order:
BootOrder: 0003,2002,0002,3002,0001,2001,2003


That puts Fedora in 5th place. You need to use efibootmgr -n 0001 to
do a one time boot of Fedora at next boot; or you need to use
efibootmgr -o and explicitly change the entire boot order, separated
by commas.
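
Using the entries above, that would be something like:

efibootmgr -n 0001                                # one-time boot of Fedora
efibootmgr -o 0001,0003,2002,0002,3002,2001,2003  # Fedora first, permanently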

I have no idea what RC at the end of these lines means though.



-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] group write permissions not being respected

2016-09-06 Thread Chris Murphy
On Tue, Sep 6, 2016, 8:08 PM Pat Haley <pha...@mit.edu> wrote:

>
> Trying the gluster client seems to fix the problem.
>


Hmm, suggests an NFS export issue then, rather than permissions issue?


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] hacking grub to control number of retained kernels.

2016-09-05 Thread Chris Murphy
On Fri, Sep 2, 2016, 8:52 PM Fred Smith <fre...@fcshome.stoneham.ma.us>
wrote:

> I've recently had this problem on two C7 systems, wherein when doing "yum
> update", I get a warning about /boot being low on space.
>
> both systems were installed using the partition size recommended by
> Anaconda, right now "df -h" shows /boot as 494M, with 79M free.
>
> I don't store unrelated crap on /boot, I assume that yum and/or grub
> will manage it for me. So, why, after over a year, is it running low
> on space on two different systems?
>
> Is there some location in /boot where junk piles up, but shouldn't,
> that I have to know about so I can clean it out?
>
> I see EIGHT initramfs files in /boot, two per kernel, same name but
> one has a kdump just before the .img suffix. do I need those for old
> kernels that I may or may not ever boot? (they're 30 to 50 MB each).
>

I think kdump using /boot is a bad idea. I wonder if that's really
necessary? Anyway, the long term solution from the anaconda list is increasing
/boot size to 1GiB.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] group write permissions not being respected

2016-09-01 Thread Chris Murphy
On Thu, Sep 1, 2016, 8:11 AM Pat Haley <pha...@mit.edu> wrote:

>
> For the enforcing=0, is that referring to SELinux?  If so, we are not
> running SELinux.
>


OK so neither that nor chcon nor context mount option apply. It's something
else.





>
> On 08/31/2016 11:38 PM, Chris Murphy wrote:
>
> 
> > Try booting with enforcing=0 and if that fixes it, you need to find out
> > what security label is needed for gluster.
>
> 
> For the enforcing=0, is that referring to SELinux?  If so, we are not
> running SELinux.
>
> -
>
> > Chances are it's easiest to use -o context= mount option on the brick,
> but
> > if the brick is not exclusive to gluster you'll need chcon -R.
>
> -
> We aren't sure exactly what you mean by this second paragraph, can
> you expand on this?  Are these two options exclusive?  We aren't
> sure what you mean by "exclusive to gluster"
>
> -
> > If that's not it, maybe try the gluster client instead of using NFS. See
> if
> > you get a different result that narrows down what's going on.
> >
> > My vague recollection is for Samba, without the correct SELinux label, I
> > could neither read nor write.
> >
> >
> > Chris Murphy
> > ___
> > CentOS mailing list
> > CentOS@centos.org
> > https://lists.centos.org/mailman/listinfo/centos
>
> --
>
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Pat Haley  Email:  pha...@mit.edu
> Center for Ocean Engineering   Phone:  (617) 253-6824
> Dept. of Mechanical EngineeringFax:(617) 253-8125
> MIT, Room 5-213http://web.mit.edu/phaley/www/
> 77 Massachusetts Avenue
> Cambridge, MA  02139-4301
>
> ___
> CentOS mailing list
> CentOS@centos.org
> https://lists.centos.org/mailman/listinfo/centos
>
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] group write permissions not being respected

2016-08-31 Thread Chris Murphy
Try booting with enforcing=0 and if that fixes it, you need to find out
what security label is needed for gluster.

Chances are it's easiest to use -o context= mount option on the brick, but
if the brick is not exclusive to gluster you'll need chcon -R.

If that's not it, maybe try the gluster client instead of using NFS. See if
you get a different result that narrows down what's going on.

My vague recollection is for Samba, without the correct SELinux label, I
could neither read nor write.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Kickstart issue with UEFi

2016-08-27 Thread Chris Murphy
On Fri, Aug 26, 2016 at 10:39 AM, Gordon Messmer
<gordon.mess...@gmail.com> wrote:
> On 08/25/2016 11:35 PM, Phil Manuel wrote:
>>
>> The relevant kickstart section is:-
>>
>> part /boot/efi --fstype efi --grow --maxsize=200 --size=20 --ondisk=sda
>> bootloader --append=" crashkernel=auto" --location=mbr --boot-drive=sda1
>> autopart --type=lvm
>
>
>
> A couple of things to consider:
> * The documentation for "autopart" states that "The autopart command can't
> be used with the logvol, part/partition, raid, reqpart, or volgroup in the
> same kickstart file," so your use of autopart and "part /boot/efi" appear to
> be incompatible.  Maybe drop the "part" line.
> * I specify partitions for kickstart, but my bootloader line is:
>   bootloader --location=mbr --append="net.ifnames=0 biosdevname=0"
>   No location is specified, the installer works it out.  Given the error you
> posted, I think sda1 might not be where anaconda is putting the EFI
> partition.

That appears to be basically correct. It'll put it on sda1 but it
doesn't want you to tell it to put it on sda1 when using autopart.
Pretty much autopart wants to be told very little, and Phil's
kickstart is being too explicit for autopart.
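
So a minimal autopart-friendly version would be roughly this (an
untested sketch, disk name taken from the original):

ignoredisk --only-use=sda
clearpart --all --initlabel
bootloader --append=" crashkernel=auto"
autopart --type=lvm

i.e. drop the part /boot/efi line and the --boot-drive=sda1 bit, and
let autopart create the ESP itself.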



-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Kickstart issue with UEFi

2016-08-27 Thread Chris Murphy
If you've sorted this out, skip this email; it might be confusing.


Your best bet might be to do this in a VM using custom partitioning
then look at the resulting /root/anaconda-ks.cfg for hints. I find the
nomenclature of anaconda kickstarts to be very confusing and
non-obvious. And I'm willing to bet this stuff changes a LOT between
versions of anaconda, especially in new areas like UEFI firmware;
during Fedora testing in the approximate lifecycle of RHEL 7, there
were lots of bugs and lots of fixes.

So I just tried a two disk autopartitioning with CentOS 7 (not 7.2)
which is using anaconda-19.31.79-1. The kickstart I get

#version=RHEL7
# System authorization information
auth --enableshadow --passalgo=sha512

# Use CDROM installation media
cdrom
# Run the Setup Agent on first boot
firstboot --enable
ignoredisk --only-use=vda,vdb
# Keyboard layouts
keyboard --vckeymap=us --xlayouts='us'
# System language
lang en_US.UTF-8

# Network information
network  --bootproto=dhcp --device=ens3 --onboot=off --ipv6=auto
network  --hostname=localhost.localdomain
# Root password
rootpw --iscrypted $6$
# System timezone
timezone America/New_York --isUtc
# System bootloader configuration
bootloader --location=mbr --boot-drive=vda
autopart --type=lvm
# Partition clearing information
clearpart --none --initlabel

%packages
@core

%end

-


So the screwy parts... bootloader --location=mbr makes zero sense to
me on UEFI systems. Clearly it should be on a partition, the EFI
system partition, and the installation that resulted in this kickstart
file did create an EFI system partition, a bootloader is on it, and
there is no bootloader code in the first 440 bytes of the PMBR. So,
yeah... pretty screwy that this kickstart does the right thing.

FWIW I don't highly recommend this layout because what it does is
create a linear/concat of vda and vdb primarily for LVM, and spreads
rootfs (and home) across both drives. One device failure means
complete data loss. The install will let you create an mdadm raid
level 1 (mirror) of the EFI system partition in custom partitioning on
two drives. It's debatable if this is a great idea for enterprise
software, but no one has bothered to come up with the kind of solution
you'd see on other platforms where the thing that modifies the ESP is
capable of modifying all ESPs, to keep them in sync, without using
software RAID. So we're sorta stuck with mdadm raid1, or you'd have to
create your own script that syncs a primary ESP to the secondary ESP
(primary being the one mounted at /boot/efi and the only one that'd
get updated bootloaders and bootloader config).
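
Such a sync script could be as dumb as this (device name is an example,
and the rsync flags are kept FAT-friendly):

#!/bin/sh
# mirror the primary ESP (mounted at /boot/efi) onto the secondary ESP
mount /dev/sdb1 /mnt/esp2
rsync -rt --delete /boot/efi/ /mnt/esp2/
umount /mnt/esp2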

Yada.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] GRUB 2 dumps to grub prompt when installed on >4TB disk

2016-08-23 Thread Chris Murphy
On Mon, Aug 22, 2016 at 1:24 PM, James A. Peltier <jpelt...@sfu.ca> wrote:
>
>
> When running grub2-install from within recovery mode I can assure you it is 
> not a user error because simply installing the grub2-efi-modules package 
> allows for grub2-install to work.

No, this logic is flawed. Running grub2-install is obsolete on UEFI,
it only applies for users who know exactly what they're getting
themselves into and have a use case for modules in grub2-efi-modules
that are not already in the grubx64.efi binary that's included in the
grub2-efi package. If you run grub2-install, it blows away that
grubx64.efi from the grub2-efi package in favor of a custom built one,
which has completely different and for the most part undocumented
behavior.

For example the grubx64.efi bootloader in grub2-efi expects to find
grub.cfg on the ESP in the same directory as the grubx64.efi binary.
If you run grub2-install, the resulting grubx64.efi expects to find
grub.cfg in /boot/grub2/ which is on your boot volume, not the EFI
System Partition. If this is UEFI system with Secure Boot enabled, the
grub2-install created grubx64.efi is not signed, so it'll fail Secure
Boot unless you go down the rabbit hole of signing it yourself.
Whereas the CentOS supplied grubx64.efi in the grub2-efi package is
already signed. And so on.

How are you booting the CentOS installation media? How was that media
created? This matters because it's possible to end up with a CSM-BIOS
boot inadvertently, and the installer will install a grub for BIOS
firmware, and not the entirely separate bootloader for UEFI. So it
might be worth booting from that install media, and get to a shell and
check if in fact this is a UEFI mode boot by running efibootmgr. If
you get an error message, it's not a UEFI mode boot, it's using
CSM-BIOS mode, and that would explain why the wrong bootloader is
being installed by the installer.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] GRUB 2 dumps to grub prompt when installed on >4TB disk

2016-08-21 Thread Chris Murphy
On Fri, Aug 19, 2016 at 4:59 PM, James A. Peltier <jpelt...@sfu.ca> wrote:
>
>
> - Original Message -
> | On Thu, Aug 18, 2016 at 11:57 AM, James A. Peltier <jpelt...@sfu.ca> wrote:
> | > Hi All,
> | >
> | > I have a Dell R710 that has 6x1TB in a RAID-5 configuration.
> |
> |
> | This is hardware RAID 5? Because it's pretty screwy how this ends up
> | working when using software RAID and might take additional
> | troubleshooting.
>
> Yes, it's a Dell R710XD
>
> | >  When installing CentOS 7 using the full disk capacity and booting in UEFI
> | >  mode the machine dumps me into a GRUB rescue mode prompt.
> | >   error: disk `,gpt2' not found
> | >   Entering rescue mode...
> | >   grub rescue>
> |
> |
> | This is confusing to me because there should be no such thing as grub
> | rescue on UEFI. On BIOS systems, there is boot.img (formerly stage 1)
> | and core.img in the MBR gap or on BIOS Boot if GPT disk (formerly
> | stage 1.5 and stage 2). The core.img is where grub rescue comes from
> | when it can't find grub modules, in particular normal.mod.
> |
> | But on UEFI, core.img, normal.mod, and a pile of other modules are all
> | baked into the grubx64.efi file found on the EFI system partition.
> |
> | I suspect two things that can cause normal.mod to not be found:
> | a. The system is not in fact booting in UEFI mode and there's been
> | some mistake in the installation of grub.
> | b. The system is in UEFI mode, but either the installer, or
> | post-install, grub2-install was run which obliterates the grub2-efi
> | package installed grubx64.efi, i.e. it's not really proper to run
> | grub2-install on UEFI systems.
>
> I suspect this is the case.  when attempting to run grub-install the system 
> claims that the grub2-efi-modules packages aren't installed, so this may be 
> an installer bug.

What is attempting to run grub-install? Or even grub2-install? If the
installer is doing this, it's an installer bug. If the user is doing
it, it's user error.

Also, you will need to check the NVRAM for stale values because
grub2-install also populates NVRAM with what will become the wrong
entry. You'll need to use 'efibootmgr -v' to get a listing to find the
bogus entry, which will be pointing to a path that includes
grubx64.efi, note the boot number and then do 'efibootmgr -b <bootnum>
-B', where <bootnum> is the four digit value for the bogus entry.

What should happen if there are no valid entries is shim.efi will work
with fallback.efi to create a proper NVRAM entry. The proper entry can
be found with the earlier grep efibootmgr command, and you can just
use that, while adding an additional \ for each \, so that it's \\.
NVRAM should point to shim.efi and it's shim.efi that loads the
prebaked grubx64.efi.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] GRUB 2 dumps to grub prompt when installed on >4TB disk

2016-08-18 Thread Chris Murphy
On Thu, Aug 18, 2016 at 11:57 AM, James A. Peltier <jpelt...@sfu.ca> wrote:
> Hi All,
>
> I have a Dell R710 that has 6x1TB in a RAID-5 configuration.


This is hardware RAID 5? Because it's pretty screwy how this ends up
working when using software RAID and might take additional
troubleshooting.



>  When installing CentOS 7 using the full disk capacity and booting in UEFI 
> mode the machine dumps me into a GRUB rescue mode prompt.
>   error: disk `,gpt2' not found
>   Entering rescue mode...
>   grub rescue>


This is confusing to me because there should be no such thing as grub
rescue on UEFI. On BIOS systems, there is boot.img (formerly stage 1)
and core.img in the MBR gap or on BIOS Boot if GPT disk (formerly
stage 1.5 and stage 2). The core.img is where grub rescue comes from
when it can't find grub modules, in particular normal.mod.

But on UEFI, core.img, normal.mod, and a pile of other modules are all
baked into the grubx64.efi file found on the EFI system partition.

I suspect two things that can cause normal.mod to not be found:
a. The system is not in fact booting in UEFI mode and there's been
some mistake in the installation of grub.
b. The system is in UEFI mode, but either the installer, or
post-install, grub2-install was run which obliterates the grub2-efi
package installed grubx64.efi, i.e. it's not really proper to run
grub2-install on UEFI systems.

Boot off install media with boot parameter inst.rescue and choose all
the default options; this ought to assemble the file system per fstab,
and you can

chroot /mnt/sysimage
yum reinstall grub2-efi
efibootmgr -v
grep efibootmgr /var/log/anaconda/program.log   ## I think that's
right; it might be anaconda.program.log though


It's really just reinstalling grub2-efi that should fix the problem,
the following two options are just information gathering in case the
reboot still doesn't work.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Software RAID and GRUB on CentOS 7

2016-08-13 Thread Chris Murphy
$ sudo grep grub2 /var/log/anaconda/program.log

This will get you the commands the installer used for installing the bootloader.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Getting hibernate to work on a new CentOS 7.2.1115 install

2016-06-13 Thread Chris Murphy
On Sun, Jun 12, 2016 at 10:46 AM, Ned Slider <n...@unixmail.co.uk> wrote:
>
>
> On 12/06/16 16:45, Globe Trotter wrote:
>>
>> Hi,
>> I am a new CentOS user (quite familiar with Fedora 1-23+) and I decided to
>> try a new install of CentOS on a ASUS R503U.
>>
>> However, I can not get hibernate to work. I try:
>> systemctl hibernate
>> and I get:
>> Failed to execute operation: sleep verb not supported
>> Btw, the problem does not go away with super-user.
>>
>> I was wondering how to get around this issue. I would like the ability to
>> hibernate and come back in the last state.
>> Thanks!

cat /sys/power/state
cat /sys/power/disk

The first should include 'disk' and the second should say enabled or
some such. Note that hibernation is probably not supported by the
CentOS kernel if this is on a UEFI computer with Secure Boot enabled
(it's not supported by Fedora kernels) as it's a possible vector to
defeat the point of Secure Boot.

systemd does check to see if there's enough unused swap to fit
Active(anon) mem into for hibernation, and if not then hibernation
won't be possible.

And yet another thing is that it's possible the initramfs isn't using
resume=<device>, which is currently a problem on
Fedora. So you might need to add this to the grub.cfg on the kernel
command line, something like resume=/dev/VG/swap or wherever it is. If
it's a /dev/sdXY, i.e. on a regular partition, then use UUID.
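
If editing grub.cfg by hand feels fragile, grubby can append it to
every entry (LV path reused from the example above):

grubby --update-kernel=ALL --args="resume=/dev/VG/swap"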



-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] HFSPlus Question

2016-06-07 Thread Chris Murphy
On Fri, Jun 3, 2016 at 12:24 AM, Ned Slider <n...@unixmail.co.uk> wrote:
>
>
> On 03/06/16 04:45, Chris Murphy wrote:
>>
>> On Tue, May 31, 2016, 7:59 PM Albert McCann <albert.mcc...@outlook.com>
>> wrote:
>>
>>> In CentOS 7.2.1511 does the 3.10.0-327.18.2.el7.centos.plus.x86_64 (Plus)
>>> kernel read HFSPlus iMac drives? I don't see any hfsplus modules
>>> installed
>>> anywhere, so I suspect not.
>>
>>
>>
>> It's in mainline so I don't know why it would not be built. It certainly
>> exists on Fedora. You could get Fedora live image, dd to a USB stick and
>> it
>> will boot the Mac.
>>
>
> There are a lot of modules in the equivalent mainline kernel that are not
> enabled / built in the RHEL kernel, reason being RH don't want the extra
> workload of maintaining (backporting fixes) those drivers for the 10 year
> lifespan of the product, long after upstream support at kernel.org has
> ended.
>
> In this case they probably determined it unlikely that a user would want to
> hook an HFSPlus volume up to a RHEL server. They also disabled a whole bunch
> of 100Mbit ethernet drivers commonly found on older desktop motherboards in
> RHEL7 for the same reason.

Fedora 24
[root@f24m mnt]# grep HFSPLUS /boot/config-4.5.6-300.fc24.x86_64
CONFIG_HFSPLUS_FS=m
# CONFIG_HFSPLUS_FS_POSIX_ACL is not set

CentOS 7
[root@localhost ~]# grep HFSPLUS /boot/config-3.10.0-123.20.1.el7.x86_64
# CONFIG_HFSPLUS_FS is not set

[root@localhost ~]# grep HFSPLUS /boot/config-4.6.1-1.el7.elrepo.x86_64
CONFIG_HFSPLUS_FS=m
# CONFIG_HFSPLUS_FS_POSIX_ACL is not set


So it looks like it's not enabled in the CentOS kernels, but is in the
elrepo and Fedora kernels.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] HFSPlus Question

2016-06-02 Thread Chris Murphy
On Tue, May 31, 2016, 7:59 PM Albert McCann <albert.mcc...@outlook.com>
wrote:

> In CentOS 7.2.1511 does the 3.10.0-327.18.2.el7.centos.plus.x86_64 (Plus)
> kernel read HFSPlus iMac drives? I don't see any hfsplus modules installed
> anywhere, so I suspect not.


It's in mainline so I don't know why it would not be built. It certainly
exists on Fedora. You could get Fedora live image, dd to a USB stick and it
will boot the Mac.

The much bigger problem is recent OS X versions default to, and convert on
updating prior versions to, Core Storage volumes. This is Apple's
equivalent of LVM. And there is no open source code for this. Upstream
libblkid doesn't even recognize it. It's actually a big problem as it
renders OS X HFS unreadable outside of OS X.

Microsoft's equivalent is Storage Spaces. But as yet it's not used by
default. Likewise no support on Linux still.

Chris Murphy


My sister's 17" iMac died, and I'm trying to
> recover the drive. If it spins up, I'd like to copy it with dd.
>
> I see that Elrepo has kmod-hfsplus and hfsplus-tools, will these work with
> the Plus kernel?
>
> I still have to pull the drive from that infernal iMac case, so can't test
> yet.
>
> Thank you for any clues, my Google-foo isn't finding anything on the Plus
> kernel and HFSPlus.
>
> ---
> I yam Popeye of the Borg. Prepares ta beez askimiligrated.
> ___
> CentOS mailing list
> CentOS@centos.org
> https://lists.centos.org/mailman/listinfo/centos
>
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Hard drives being renamed

2016-05-24 Thread Chris Murphy
On Tue, May 24, 2016, 3:08 PM Pat Haley <pha...@mit.edu> wrote:

>
> Our questions are
>
>   * Has anyone else experienced similar issues?
>   * What can we do to prevent such renaming in the future
>

Yes. You can't, it's non-deterministic. Use fs volume UUID instead, both in
grub.cfg command line for root= parameter and fstab.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Copying a live system

2016-05-03 Thread Chris Murphy
Anaconda live installs use this:

rsync -pogAXtlHrDx --exclude /dev/ --exclude /proc/ --exclude /sys/
--exclude /run/ --exclude /boot/*rescue* --exclude /etc/machine-id
/run/install/source/ /mnt/sysimage


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] systemd-journald corruption

2016-05-03 Thread Chris Murphy
On Fri, Apr 29, 2016 at 6:26 AM, Chris Adams <li...@cmadams.net> wrote:
> Once upon a time, Chris Adams <li...@cmadams.net> said:
>> So far, turning off the compression appears to have worked (but I'll
>> have to watch it for a day or two to really see).
>
> Just to follow up: turning off journald compression does appear to have
> fixed the corruption problem I was seeing.  I'll watch for an updated
> systemd package.

Does journalctl --verify no longer complain of corruption? Or is it
possible there's corruption but journalctl and rsyslogd now tolerate
it somehow?


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] systemd-journald corruption

2016-04-27 Thread Chris Murphy
On Wed, Apr 27, 2016 at 7:05 AM, Chris Adams <li...@cmadams.net> wrote:
> Once upon a time, Chris Murphy <li...@colorremedies.com> said:
>> On Tue, Apr 26, 2016 at 3:01 PM, Chris Adams <li...@cmadams.net> wrote:
>> > Once upon a time, Chris Murphy <li...@colorremedies.com> said:
>> >> On Tue, Apr 26, 2016, 2:09 PM Chris Adams <li...@cmadams.net> wrote:
>> >> > I have several recently-installed CentOS 7 servers that keep having
>> >> > systemd-journald corruption
>> >>
>> >> Determined with 'journalctl --verify' or another way?
>
> One system did get into this state overnight, and that said:
>
> [root@spamscan3 ~]# journalctl --verify
> 15bd478: invalid object
> File corruption detected at 
> /run/log/journal/f8ade260c5f84b8aa04095c233c041e0/system.journal:15bd478 (of 
> 25165824 bytes, 90%).
> FAIL: /run/log/journal/f8ade260c5f84b8aa04095c233c041e0/system.journal 
> (Cannot assign requested address)
> (and then a bunch of passes on the rest of the files)
>
>> There's also this patch as a suggested fix:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1292447#c9
>
> I'll take a look at that.
>
>> What version of systemd and rsyslog? systemd-219-19.el7_2.7 and
>> rsyslog-7.4.7-12 are current.
>
> Those are the versions I have.
>
>> If you're there already you could try editing
>> /etc/systemd/journald.conf and uncommenting Compress=yes and changing
>> it to no.
>
> Thanks, I'm trying that on these servers.

Also I wonder if merely restarting the journal daemon solves it:

systemctl restart systemd-journald

What should happen is it realizes its own logs are corrupt and ignores
them, and starts working on new copies. And journalctl should still
try to read the old ones but skips the corrupt entries.

If that works you could schedule a restart of the journal periodically
as a goofy hack workaround until it gets fixed. Clearly Red Hat knows
about this problem.
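
For example, a root cron entry along these lines (the interval is a
guess at what would keep ahead of the corruption):

# /etc/cron.d/journald-restart -- goofy hack, remove once fixed
0 */6 * * * root /usr/bin/systemctl restart systemd-journald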

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] systemd-journald corruption

2016-04-26 Thread Chris Murphy
On Tue, Apr 26, 2016 at 3:01 PM, Chris Adams <li...@cmadams.net> wrote:
> Once upon a time, Chris Murphy <li...@colorremedies.com> said:
>> On Tue, Apr 26, 2016, 2:09 PM Chris Adams <li...@cmadams.net> wrote:
>> > I have several recently-installed CentOS 7 servers that keep having
>> > systemd-journald corruption
>>
>> Determined with 'journalctl --verify' or another way?
>
> I get messages like this in dmesg:
>
> [4756650.489117] systemd-journald[21364]: Failed to write entry (21 items, 
> 637 bytes), ignoring: Cannot assign requested address

I haven't seen this. When I plug this text into a google search field,
no quotes, there are 360 results.

systemd-journald failed to write entry cannot assign requested address

There's also this patch as a suggested fix:
https://bugzilla.redhat.com/show_bug.cgi?id=1292447#c9

What version of systemd and rsyslog? systemd-219-19.el7_2.7 and
rsyslog-7.4.7-12 are current.

If you're there already you could try editing
/etc/systemd/journald.conf and uncommenting Compress=yes and changing
it to no.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] systemd-journald corruption

2016-04-26 Thread Chris Murphy
On Tue, Apr 26, 2016, 2:09 PM Chris Adams <li...@cmadams.net> wrote:

> I have several recently-installed CentOS 7 servers that keep having
> systemd-journald corruption



Determined with 'journalctl --verify' or another way?


> (which stops ALL logging, including syslog).
> Interestingly, they are all spam-scanning servers running amavisd-new
> (so could be some particular pattern is triggering it).
>
> Is there a "supported" way to just cut systemd-journald out of the
> picture and have log entries go straight to rsyslogd?
>

No. Everything reports to journald and rsyslog gets what it wants from
journald.

If you are referring to native journald logs corrupting, that should not
affect rsyslog. If you remove /var/log/journal then systemd-journald logs
will be stored volatile in /run.


> Has anyone else seen this?
>

Sort of, but not in a way that affects rsyslog. Usually journalctl just
skips over corrupt parts and systemd-journald will rotate logs when it
detects corruption to isolate corrupt files.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Dual boot C7 with Window 10

2016-04-22 Thread Chris Murphy
Oh and I can't stress enough to check for firmware updates. There's
metric tons of UEFI bugs. This little baby NUC has had 6 firmware
updates in 9 months. Some updates don't fix things I care about,
others do, and the changelogs aren't always really detailed when it
comes to things like user interface improvements or bug fixes.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Dual boot C7 with Window 10

2016-04-22 Thread Chris Murphy
On Fri, Apr 22, 2016 at 4:11 AM, Timothy Murphy <gayle...@eircom.net> wrote:
> Chris Murphy wrote:
>
>> What you should do is revert back to UEFI only, with Secure Boot enabled,
>> and reinstall CentOS, deleting the previous partition/mount points
>> including the BIOS Boot partition that was created for CentOS's
>> bootloader.
>
>> The gotcha is that with Secure Boot enabled, the CentOS GRUB-efi
>> package doesn't support chainloading the Windows bootloader. This is
>> getting fixed in Fedora 24 but I have no idea how long it'll take to
>> get to CentOS 7. You could either disable Secure Boot (which I don't
>> recommend) or you switch between CentOS and Windows using the
>> firmware's boot manager. You'll have to figure out which F key brings
>> up the boot manager. On my Intel NUC it's F10, *shrug*.
>
> May I ask a couple of questions which I'm afraid betray my ignorance.

It's much safer to betray ignorance and ask the question than end up
stuck in the mud. It's not your fault; we've kinda been betrayed with
these changes: a combination of overly complicated implementation,
massive piles of bugs, hideous documentation, and misleading
terminology reuse (mainly by the manufacturers).


>
> 1. Why is it advisable to "revert back to UEFI"?
> Is this just a safety measure?

Windows is already installed in UEFI mode. Mixed installations are
just a PITA to support. You'll get almost no help from anyone on a
list because how this works will be firmware dependent and chances are
no one else will have that same make/model and firmware revision.

And yes, I can't in good conscience recommend a setting that makes you
less safe. The computer came to you with Secure Boot enabled, and
you're best off leaving it in that condition. CentOS 7 supports UEFI
Secure Boot out the box. What it doesn't support is dual boot, but
that's technically true even if Secure Boot is disabled, or this were
a system with BIOS firmware. But the firmware boot manager can provide
you with a way to switch between the two. Firmware setup might even
have an option in there somewhere to present the boot manager by
default for each boot. This is true on my Intel NUC which uses
American Megatrends firmware.


> I would have thought that if an intruder had got in this far,
> enabling him to install unsigned modules,
> he would have you at his mercy anyway?

There are levels of compromise. The bootloader malware compromise
means you can reformat and still be owned. Secure Boot pretty much
assures that you're not compromised except in user space, which is why
you run with SELinux enabled, right?



>
> 2. I installed CentOS-7.2.1511 from a Live USB stick,
> and I have a Windows 10 partition that I can boot into.
> So I assume that UEFI is not used by default?
> Will it become so at some point?

If your firmware setup has an option for Secure Boot and/or "legacy"
anything, then it is UEFI firmware. Strictly speaking, UEFI != BIOS
but the manufacturers think we're all morons so they repurposed BIOS
to apply to a completely different behavior of firmware, completely
different discovery of the bootloader method, completely different
bootloader installation and location for the binaries. Anything that
comes with Windows 10 pre-installed has UEFI firmware, with Secure
Boot enabled and any legacy option disabled as a requirement of the
Windows hardware certification spec.

And CentOS can support that condition. The less you customize things
at a firmware level, the better off you are, not just security-wise but
in terms of getting support on lists. And changing to a hybrid UEFI
CSM-BIOS mode is a mess. If it works for you, great, and if some
expert wants to hand hold, fine, but it's not something I recommend.
It's already complicated enough, I think it's made worse by enabling
legacy stuff.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] [OT] disk utility showing message "the partition is misaligned by"

2016-04-22 Thread Chris Murphy
On Fri, Apr 22, 2016 at 4:40 AM, g <gel...@bellsouth.net> wrote:

>
> =+=+=
> $ sudo fdisk -l /dev/sdc
> [sudo] password for geo:
>
> Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
> 255 heads, 63 sectors/track, 121601 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Sector size (logical/physical): 512 bytes / 4096 bytes
> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> Disk identifier: 0x0009ede7
>
>Device Boot  Start End  Blocks   Id  System
> /dev/sdc1   *   1 103  819200   83  Linux
> Partition 1 does not end on cylinder boundary.
> /dev/sdc2 103 230 1024000   82  Linux swap / Solaris
> Partition 2 does not end on cylinder boundary.
> /dev/sdc3 230210715073280   83  Linux
> /dev/sdc42107  121602   9598443525  Extended
> /dev/sdc52108   42019   320587970+  83  Linux
> Partition 5 does not start on physical sector boundary.
> =+=+=

What are these units? Tracks? So 1 = 63, so the start of sdc1 is 1=63?
Annoying. I wish these tools would get updated to do sectors by
default, tracks are useless.

You can try

parted /dev/sdc u s p

That should be in sectors. If the start value is divisible by 8, it is
4KiB sector aligned, *assuming* the drive does not have a jumper
enabled for Windows XP compatibility. I'd like to believe those drives
are long gone by now but heck we keep running into ancient versions of
fdisk and parted with bad legacy behaviors.

Use gparted booted from say a Fedora 23 live workstation USB stick
(created with dd), and 'dnf install gparted'. There's an option to
move/resize. Just give it a new start value and keep the size the
same. Moving takes a long time, every sector for the chosen partition
has to be copied and moved forward or backward.

Or back it up, blow it away, and repartition. Any new tool should warn
or flat out prevent you from improperly aligning, but the simplest way
to do it is to always align on 1MiB boundaries. For example, partition
1 starts at LBA 2048, which is 1MiB aligned; now make all
partitions sized in MiB increments and they will all align.
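
Starting from scratch, that looks something like this (device is an
example, and mklabel destroys everything on it):

parted /dev/sdc mklabel gpt
parted /dev/sdc mkpart primary 1MiB 801MiB   # starts at LBA 2048
parted /dev/sdc u s p                        # starts should divide by 8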

Depending on the age of the file system, it's not a bad idea to just
start over every once in a while.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] tune2fs: Filesystem has unsupported feature(s) while trying to open

2016-04-21 Thread Chris Murphy
On Tue, Apr 19, 2016 at 10:51 AM, Matt Garman <matthew.gar...@gmail.com> wrote:


># rpm -qf `which tune2fs`
>e2fsprogs-1.41.12-18.el6.x86_64

That's in the CentOS 6.4 repo; I don't see a newer one through 6.7, but
I didn't do a thorough check, just a google site: filter search.


> # cat /etc/redhat-release
> CentOS release 6.5 (Final)

> # uname -a
> Linux lnxutil8 2.6.32-504.12.2.el6.x86_64 #1 SMP Wed Mar 11 22:03:14
> UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

And that's a centosplus kernel in the 6.6 repo; while the regular
kernel for 6.7 is currently kernel-2.6.32-573.22.1.el6.src.rpm. So I'm
going to guess you'd have this problem even if you weren't using the
centosplus kernel.

I suggest you do a yum upgrade anyway, 6.7 is current, clean it up,
test it, and then while chances are it's still a problem, then it's
probably a legit bug worth filing. In the meantime you'll have to
upgrade your e2fsprogs yourself.


> I did a little web searching on this, most of the hits were for much
> older systems, where (for example) the e2fsprogs only supported up to
> ext3, but the user had an ext4 filesystem.  Obviously that's not the
> case here.  In other words, the filesystem was created with the
> mkfs.ext4 binary from the same e2fsprogs package as the tune2fs binary
> I'm trying to use.
>
> Anyone ever seen anything like this?

Well the date of the kernel doesn't tell the whole story, so you need
a secret decoder ring to figure out what's been backported into these
distro kernels. There's far far less backporting happening in user
space tools. So it's not difficult for them to get stale when the
kernel is providing new features. But I'd say the kernel has newer
features than the progs supports and the progs are too far behind.

And yes, this happens on the XFS list and the Btrfs list too where
people are using old progs with new kernels and it can be a problem.
Sometimes new progs and old kernels are a problem too but that's less
common.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Dual boot C7 with Window 10

2016-04-21 Thread Chris Murphy
On Tue, Apr 19, 2016 at 10:53 AM,  <m.r...@5-cent.us> wrote:
> Jerry Geis wrote:
>> Thanks...
>> I added the "insmod ntfs" re-ran config no boot...
>> I change the hd1 to hd3 re-ran config no boot...
>> This is what my partition table looks like.
>> # Start      End         Size    Type             Name
>>  1 2048      534527      260M    EFI System       EFI system partition
>>  2 534528    567295      16M     Microsoft reser  Microsoft reserved partition
>>  3 567296    525326335   250.2G  Microsoft basic  Basic data partition
>>  4 998166528 1000214527  1000M   Windows recover  Basic data partition
>>  5 525326336 525330431   2M      BIOS boot parti
>>  6 525330432 965732351   210G    Microsoft basic
>>  7 965732352 982509567   8G      Linux swap
>> Thoughts?
>
> I haven't been following this, and perhaps I'm being dense... but I see
> BIOS boot partition, and I see 8G of Linux swap... where's the Linux /boot
> and / partitions?
>
>  mark

CentOS 7 has inherited an old bug/bad design choice by parted
developers, where they decided to use the partition type GUID for
"basic data" that Microsoft came up with, rather than following the
UEFI spec and creating their own partition type GUID for Linux
filesystems. Presumably partition 3 is Windows on NTFS, and partition
6 is a conventional partition with combined /boot, /, and /home. Just a
guess.

It's not a bad idea to get gdisk on the system, and change the type
code for the linux partition to gdisk code 8300, which translates to
partition type GUID 0FC63DAF-8483-4772-8E79-3D69D8477DE4. Windows 10
will ignore this, where at least Windows 8 and older invited the user
to format anything with the "basic data" GUID that had a file system
it didn't recognize.
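
With gdisk's scriptable sibling that's a one-liner (disk and partition
number taken from the table above):

sgdisk --typecode=6:8300 /dev/sda   # partition 6 -> Linux filesystem GUID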


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Dual boot C7 with Window 10

2016-04-21 Thread Chris Murphy
On Tue, Apr 19, 2016 at 5:42 AM, Jerry Geis <ge...@pagestation.com> wrote:
> I have a laptop with windows 10.
> I went into the Windows disk manager and shrunk the volume
> to make room for C7. That worked.
>
> I also changed the BIOS from secure boot to "both" (secure/legacy)

Both is a problem. There's no practical way for an installer to
support both. Basically it makes the computer UEFI for Windows and
BIOS for CentOS 7 instead of UEFI for both.

>
> I installed C7, went fine. About the time it was done I realized I never
> saw anything about "other" boot options (seems I saw that in the past).
>
> Anyway sure enough, got done and C7 boots fine - no option there for
> Windows.  I did searching and found I needed to add to the
> /etc/grub.d/40_custom the following:
> menuentry "Windows 10" {
> set root='(hd0,1)'
> chainloader +1
> }
>
> then re-run the grub2-mkconfig -o /boot/grub2/grub.cfg
>
> I then rebooted and sure enough I got the menu item for "Windows 10"
> however when I select it it does not boot.
>
> How do I get Windows 10 to boot again ?

You'll have to use the firmware's boot manager. The legacy mode
enables a compatibility support module (CSM) so that UEFI presents a
faux-BIOS to the operating system, CentOS in this case. So CentOS
thinks it's on a BIOS system, and installs a BIOS based bootloader. A
BIOS bootloader cannot chainload a UEFI bootloader.

What you should do is revert back to UEFI only, with Secure Boot enabled,
and reinstall CentOS, deleting the previous partition/mount points
including the BIOS Boot partition that was created for CentOS's
bootloader.

The gotcha is that with Secure Boot enabled, the CentOS GRUB-efi
package doesn't support chainloading the Windows bootloader. This is
getting fixed in Fedora 24 but I have no idea how long it'll take to
get to CentOS 7. You could either disable Secure Boot (which I don't
recommend) or you switch between CentOS and Windows using the
firmware's boot manager. You'll have to figure out which F key brings
up the boot manager. On my Intel NUC it's F10, *shrug*.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] C7 + UEFI + GPT + RAID1

2016-03-13 Thread Chris Murphy
>
>
> Pretty much the only correct way to do this is with firmware (imsm) or
hardware RAID.

If you have empty drives anaconda can raid1 everything including the EFI
system partitions using mdadm metadata 0.9. But since the firmware doesn't
know this ESP is an array, there is a possibility only one ESP gets modified,
which is effectively corruption.

What's really needed is a boot services daemon that manages boot and ESP
volumes. Instead of RAID 1, it keeps them synced. And instead of them
always being mounted persistently, they're mounted on demand only when
they're modified, and only modified by API via the daemon. Of course this
doesn't exist yet. But without it, we've regressed in functionality and
reliability.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] 6.7 netinstall fails at insert cd to continue

2016-03-06 Thread Chris Murphy
On Sat, Mar 5, 2016 at 9:27 PM, g <gel...@bellsouth.net> wrote:
>
>
> On 03/05/16 20:22, Fred Smith wrote:
>> On Sat, Mar 05, 2016 at 12:48:17PM -0600, g wrote:
>>>
>>>
>>> On 03/05/16 09:04, Chris Murphy wrote:
>>>> You don't say how you created the media.
>>>>
>>> --
>>>
>>> true, i did not say how i created cd's.
>>  
>>> so you are saying that netinstall is incorrectly written because it ask
>>> for a cd and not an internet connection? i would think that the dev's
>>> would have corrected the wording being that netinstall has been a part
>>> of last 2 or 3 versions.
>>>
>>> 'selection on default'? do not recall seeing anything related to such.
>>
>> The last time I used the netinstall CD (on Centos 6, not very many
>> months ago) it asked for a URL, not a CD.
>>
> --
>
> why am i not surprised. :-D
>
> what got bombed was originally installed as 4.5 via dvd. so i have
> not had 'joy' of knowing the problems of a fresh install in a while.
>
> i am glad to say that by chance, day before problems i did run a
> 'yum list installed'. i am about 40% of getting system back to where
> it was.
>
> is there an easy way of running yum from a list instead of entering package
> names? entering names in groups of 5 to avoid problems is a bit slow.

yum group list

and

yum group list hidden

and then

yum group install "group name goes here" "another group name goes here"


>
> something else interesting about fresh install, i installed 6.5 dvd1 on
> my laptop without any problems.

It contains the packages on the media. Netinstalls grab the latest
versions of the packages. If you do a netinstall, and then a yum
upgrade after rebooting, nothing needs to be updated. If you download
even CentOS 6.7 and do a yum upgrade a bunch of stuff will get
replaced.




-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] 6.7 netinstall fails at insert cd to continue

2016-03-06 Thread Chris Murphy
On Sat, Mar 5, 2016 at 11:48 AM, g <gel...@bellsouth.net> wrote:
>
>
> On 03/05/16 09:04, Chris Murphy wrote:
>> You don't say how you created the media.
>>
> --
>
> true, i did not say how i created cd's.
>
> i used k3b as it is easier, less to remember, than using command line.

OK.


> usb's sticks were created using unetbootin and fedora-liveusb-creator.
> yes, i did not mention that i tried with 2 usb sticks. failure was
> same, did not feel it mattered. failure is failure.

No, unetbootin is pretty unreliable. I've actually not had it work
reliably with Fedora ISOs since forever (I mainly use (U)EFI systems,
which is possibly why), and it doesn't appear to rewrite the
bootloader stuff correctly at all. At this point I've totally given up
on it.

Fedora liveusb-creator ought to work, but it's also currently
undergoing a rewrite. The most reliable way to create USB stick media
for CentOS and Fedora is dd.
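
Which looks like this (ISO name is an example; triple check the device
node, everything on the stick is destroyed):

dd if=CentOS-6.7-x86_64-netinstall.iso of=/dev/sdX bs=4M
sync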



>
>> Also netinstall used the network as source, not from CD/DVD. So you
>> should just leave the source selection on default.
>>
> --
>
> so you are saying that netinstall is incorrectly written because it ask
> for a cd and not an internet connection?

Seems suspicious to me yes. A netinstall uses a network source, there
are no packages on the netinstall media itself.


> i would think that the dev's
> would have corrected the wording being that netinstall has been a part
> of last 2 or 3 versions.
>
> 'selection on default'? do not recall seeing anything related to such.

OK I just ran the CentOS 6.7 netinstall ISO in gnome-boxes and it's
not the graphical anaconda that I'm used to with Fedora. There's an
"installation method" and it has Local CD/DVD selected at the top, but
that clearly needs to be set to URL or it's simply not a netinstall.
And then you need to give it a URL for a mirror, like this:
http://www.if-not-true-then-false.com/2011/centos-6-netinstall-network-installation/

This is preconfigured in Fedora for their netinstalls. I have no idea
how CentOS does it, but it doesn't appear to be ready to go.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] 6.7 netinstall fails at insert cd to continue

2016-03-05 Thread Chris Murphy
>
>
> You don't say how you created the media. Also netinstall used the network
as source, not from CD/DVD. So you should just leave the source selection
on default.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] heads up: /boot space on kernel upgrade

2016-02-11 Thread Chris Murphy
Default boot volume on Fedora is 500M, with a kernel installonly_limit
of 3. So far this seems sufficient, even accounting for the "rescue
kernel" (which is really a nohostonly initramfs, which is quite a bit
larger than the standard hostonly initramfs used for numbered
kernels).
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Measuring memory bandwidth utilization

2016-02-10 Thread Chris Murphy
On Tue, Feb 2, 2016 at 7:34 PM, Gordon Messmer <gordon.mess...@gmail.com> wrote:
> On 02/02/2016 05:34 PM, Benjamin Smith wrote:
>>
>> We've ruled out IOPs for the disks (~ 20%)
>
>
> How did you measure that?  What filesystem are you using?  What is the disk
> / array configuration?
> Which database?
>
> If you run "iostat -x 2" what does a representative summary look like?
>
>>   and raw CPU load (top shows perhaps
>> 1/2 of cores busy, but the system slows to a crawl.
>
>
> Define "busy"?

Yeah.

It'd be nice to see the output from top so we can see what is consuming
most of the cpu or anything consuming less than it should because it's
waiting for something else that's slower. It might be useful to see
'perf top' if perf is installed, and if not install it, reproduce the
problem and let perf top run for a minute, then post it on fpaste or
pastebin so the formatting stays semisane.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Utility to zero unused blocks on disk

2016-02-09 Thread Chris Murphy
On Mon, Feb 8, 2016 at 11:18 PM, John R Pierce <pie...@hogranch.com> wrote:
> On 2/8/2016 9:54 PM, Chris Murphy wrote:
>>
>> Secure erase is really the only thing to use on SSDs. Writing a pile
>> of zeros just increases wear (minor negative) but also doesn't
>> actually set the cells to the state required to accept a new write, so
>> you've just added a lot more work for the SSD's garbage collector and
>> wear leveling, so it's going to be slower than before you did the
>> zeroing. Secure erase on an SSD erases the cells so they're ready to
>> accept writes.
>
>
> at least one SSD I had, the vendor told me writing a full pass of zeros on
> it via dd or whatever would completely reset the garbage collection and
> effectively defrag it.

Yes it'd be "defragged" in that it has no file system at all to be
fragmented in the first place (easier done with a mkfs). But a huge
percent of the available cells on the drive (the portion not
overprivisioned) would contain valid data (zeros) as far as the drive
firmware is concerned, and those cells storing zeros are not in a
state to accept writes. So unless it's a very good SSD that's so
overprovisioned that it can perform well without the benefit of trim,
and some can, this is odd advice. It's much simpler to just do a full
device mkfs, which will do a whole device trim, and if you want you
can then use wipefs to remove that filesystem's signature.
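
i.e. something like this (device is an example, and it wipes the
drive):

mkfs.xfs -f /dev/sdX   # whole-device mkfs issues a full-device discard
wipefs -a /dev/sdX     # optional: remove the new signature afterward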

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] "upstream testing"??

2016-02-09 Thread Chris Murphy
On Mon, Feb 8, 2016 at 1:35 AM, Chris Murphy <li...@colorremedies.com> wrote:
> Everything else is prone to failure.

Specifically, but not limited to, unetbootin. Really, people need to
just purge unetbootin from memory and stop recommending it. I've never
had it work on any (U)EFI system. And more often than not it would
fail to create media even for BIOS systems. Also, it's not supported
at all on Fedora, I seriously doubt it's supported by CentOS or Red
Hat. Just use dd, and accept the obliteration of the USB stick. That's
easy and safe.

Future talk. There's a rewrite happening on the Fedora side for
LiveUSB Creator that will initially use dd on the backend. I'll guess
that it'll still accept being pointed to a local ISO. There's some
talk about it hopefully being more modular so it can be "branded" by
different distros and hence more widely used, maintained, and
reliable.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] "upstream testing"??

2016-02-08 Thread Chris Murphy
Which System76 model? How is the install media created? Presumably it's a
USB stick, but how is it being created?

The easiest and most reliable is to use dd. Livecd-tools is also reliable
but has a number of options required to boot UEFi systems. LiveUSB Creator
should work. Everything else is prone to failure.

CentOS 6.4 is kinda old for new hardware. You're better off looking at
CentOS 7.1.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Utility to zero unused blocks on disk

2016-02-08 Thread Chris Murphy
On Mon, Feb 8, 2016 at 10:54 PM, Chris Murphy <li...@colorremedies.com> wrote:
> Secure erase is really the only thing to use on SSDs.

Oops. It's probably a fairly close approximation to just mkfs.btrfs -f
(or xfs) the entire block device for the SSD. If the kernel sees it as
non-rotational, it'll issue a whole device trim first, then write out
scant amount of metadata (btrfs writes out a tiny amount of metadata
at mkfs time, xfs a bit more, ext4 a lot and then even more after
mounting).

For most people this is probably a lot easier than the multistep
process using hdparm and secure erase.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Utility to zero unused blocks on disk

2016-02-08 Thread Chris Murphy
On Mon, Feb 8, 2016 at 3:18 PM,  <m.r...@5-cent.us> wrote:
> Chris Murphy wrote:
>> DBAN is obsolete. NIST 800-88 for some time now says to use secure erase
>> or enhanced security erase or crypto erase if supported.
>>
>> Other options do not erase data in remapped sectors.
>
> dban doesn't? What F/OSS does "secure erase"? And does it do what dban's
> DoD 5220.22-M does?

http://dban.org/download

That DoD standard is also obsolete per NIST 800-88. There's zero
evidence that 2 passes make any difference compared to 1, let alone
doing 7.

hdparm --security-help

This takes the form of something like:

hdparm --user-master u --set-security-pass chickens /dev/sdX
hdparm --user-master u --security-erase-enhanced chickens /dev/sdX

The 2nd command doesn't return until completion. hdparm -I can give an
estimate of how long it will take. For HDDs I've found it slightly
overestimates how long it will take, but is generally pretty close.
For SSD's it can be way off. It says 8 minutes for my SSD, but the
command returns in 5 seconds and the SSD spits back all zeros.
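
One caveat: many BIOSes freeze the ATA security feature set at boot, in
which case the commands above will fail until the drive is unfrozen (a
suspend/resume cycle usually does it). You can check with:

hdparm -I /dev/sdX | grep -i frozen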

Secure erase is really the only thing to use on SSDs. Writing a pile
of zeros just increases wear (minor negative) but also doesn't
actually set the cells to the state required to accept a new write, so
you've just added a lot more work for the SSD's garbage collector and
wear leveling, so it's going to be slower than before you did the
zeroing. Secure erase on an SSD erases the cells so they're ready to
accept writes.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Utility to zero unused blocks on disk

2016-02-08 Thread Chris Murphy
hdparm supports ATA secure erase. This is SSD-safe, unlike other options.
It's faster than writing zeros, on both HDDs and SSDs.

Chris Murphy

On Mon, Feb 8, 2016, 3:06 PM  <m.r...@5-cent.us> wrote:

> Wes James wrote:
> > Is there a utility to zero unused blocks on a disk?
> >
> > CentOS 6.7/Ext4
> >
> > I saw zerofree, but I’m not sure it would work on Ext4 or even work on
> > this version of CentOS.
> >
> I don't understand the point of doing this. If you want to sanitize the
> disk, use dban, which surely approaches industry standard for
> the open source answer.
>
> Just zeroing random blocks? Why? If you want to wipe a specific file,
> there's shred.
>
>mark
>
> ___
> CentOS mailing list
> CentOS@centos.org
> https://lists.centos.org/mailman/listinfo/centos
>
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Utility to zero unused blocks on disk

2016-02-08 Thread Chris Murphy
DBAN is obsolete. NIST 800-88 for some time now says to use secure erase or
enhanced security erase or crypto erase if supported.

Other options do not erase data in remapped sectors.

Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] How to get UEFI setting by shell?

2016-01-23 Thread Chris Murphy
On Fri, Jan 22, 2016, 5:25 PM John R Pierce <pie...@hogranch.com> wrote:

>
> yeah, I just realized, duh, secureboot on a VM is not an issue at all,
> so never mind all that.
>

It is an issue. Hyper-V Gen 2 has supported UEFI with Secure Boot enabled
by default for a few years.

>
> I do think the whole secureboot thing is a bad idea on a general purpose
> computer system, seems like an attempt at creating product lock in and
> turning the x86 PC into an appliance, which it really isn't.
>


It's precisely general purpose computers that are most susceptible to what
Secure Boot prevents. What's the alternative?



Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] How does Live CD find OS's?

2016-01-23 Thread Chris Murphy
On Sat, Jan 23, 2016 at 6:50 AM, Timothy Murphy <gayle...@eircom.net> wrote:
> If I boot into CentOS on my home server from a Live CD or USB stick
> and go to Troubleshoot, it lists OS's it finds on the machine.
> How does it find these OS's?
> Presumably it looks through all the partitions on all the hard disks
> for something that looks like an OS?
> But how exactly does it identify an OS?


Live CD does not have a troubleshoot boot submenu option. This is
available with non-lives like DVD and netinstall images. The "rescue a
system" option uses the 'rescue' or more recently the 'inst.rescue'
boot parameter, which tells anaconda to run the text rescue mode, and
all of that code is found in anaconda and python-blivet.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] HDD badblocks

2016-01-23 Thread Chris Murphy
On Thu, Jan 21, 2016 at 9:27 AM, Lamar Owen <lo...@pari.edu> wrote:
> On 01/20/2016 01:43 PM, Chris Murphy wrote:
>>
>> On Wed, Jan 20, 2016, 7:17 AM Lamar Owen <lo...@pari.edu> wrote:
>>
>>> The standard Unix way of refreshing the disk contents is with badblocks'
>>> non-destructive read-write test (badblocks -n or as the -cc option to
>>> e2fsck, for ext2/3/4 filesystems).
>>
>>
>> This isn't applicable to RAID, which is what this thread is about. For
>> RAID, use scrub, that's what it's for.
>
>
> The badblocks read/write verification would need to be done on the RAID
> member devices, not the aggregate md device, for member device level remap.
> It might need to be done with the md offline, not sure.  Scrub?  There is a
> scrub command (and package) in CentOS, but it's meant for secure data
> erasure, and is not a non-destructive thing.  Ah, you're talking about what
> md will do if 'check' or 'repair' is written to the appropriate location in
> the sysfs for the md in question.  (This info is in the md(4) man page).


Correct.
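
For the archives, triggering and checking a scrub looks like this
(md0 hypothetical):

echo check > /sys/block/md0/md/sync_action
cat /sys/block/md0/md/mismatch_cnt   # nonzero means inconsistent stripes were found

Writing 'repair' instead of 'check' also rewrites mismatched stripes.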




-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] How to get UEFI setting by shell?

2016-01-22 Thread Chris Murphy
On Thu, Jan 21, 2016, 10:48 PM wk <304702...@qq.com> wrote:

> Hi,
>
>CentOS7.1, Dell PowerEdge R730xd.
>
>How to check/get UEFI information by shell/bash terminal? Example: if
> UEFI is enabled? If secure boot is enabled?
>


You should find an early kernel message that secure boot is enabled. Just
'dmesg | grep -i secure'

You can also use 'mokutil --sb-state'
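
And to confirm the system booted via UEFI at all, check for the efi
directory in sysfs:

[ -d /sys/firmware/efi ] && echo UEFI || echo BIOS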


Chris Murphy



> Thanks.
> ___
> CentOS mailing list
> CentOS@centos.org
> https://lists.centos.org/mailman/listinfo/centos
>
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] HDD badblocks

2016-01-20 Thread Chris Murphy
On Wed, Jan 20, 2016, 7:17 AM Lamar Owen <lo...@pari.edu> wrote:

> On 01/19/2016 06:46 PM, Chris Murphy wrote:
> > Hence, bad sectors accumulate. And the consequence of this often
> > doesn't get figured out until a user looks at kernel messages and sees
> > a bunch of hard link resets
>
> The standard Unix way of refreshing the disk contents is with badblocks'
> non-destructive read-write test (badblocks -n or as the -cc option to
> e2fsck, for ext2/3/4 filesystems).


This isn't applicable to RAID, which is what this thread is about. For
RAID, use scrub, that's what it's for.

The badblocks method fixes nothing if the sector is persistently bad and
the drive reports a read error. It fixes nothing if the command timeout is
reached before the drive either recovers or reports a read error. And even
if it works, you're relying on ECC recovered data rather than reading a
likely good copy from mirror or parity and writing that back to the bad
block.

But all of this still requires the proper configuration.


> The remap will happen on the
> writeback of the contents.  It's been this way with enterprise SCSI
> drives for as long as I can remember there being enterprise-class SCSI
> drives.  ATA drives caught up with the SCSI ones back in the early 90's
> with this feature.  But it's always been true, to the best of my
> recollection, that the remap always happens on a write.


Properly configured, first a read error happens which includes the LBA of
the bad sector. The md driver needs that LBA to know how to find a good
copy of data from mirror or from parity. *Then* it writes to the bad LBA.

In the case of misconfiguration, the command timeout expiration and link
reset prevent the kernel from ever learning the LBA of the bad sector,
and therefore repair isn't possible.


> The rationale
> is pretty simple: only on a write error does the drive know that it has
> the valid data in its buffer, and so that's the only safe time to put
> the data elsewhere.
>
> > This problem affects all software raid, including btrfs raid1. The
> > ideal scenario is you'll use 'smartctl -l scterc,70,70 /dev/sdX' in
> > startup script, so the drive fails reads on marginally bad sectors
> > with an error in 7 seconds maximum.
> >
> This is partly why enterprise arrays manage their own per-sector ECC and
> use 528-byte sector sizes.



Not all enterprise drives have 520/528 byte sectors. Those that do are
using T10-PI (formerly DIF) and it requires software support too. It's
pretty rare. It's 8000% easier to use ZFS on Linux or Btrfs.




> But the other fact of life of modern consumer-level hard drives is that
> *errored sectors are expected* and not exceptions.  Why else would a
> drive have a TLER in the two minute range like many of the WD Green
> drives do?  And with a consumer-level drive I would be shocked if
> badblocks reported the same number each time it ran through.
>

All drives expect bad sectors. A consumer drive that reports a read
error risks putting the host OS into an inconsistent state, so the
firmware avoids it for as long as it can: becoming slow is better than
implosion. And neither OS X nor Windows does link resets after merely
30 seconds either.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] LVM thin volumes fstrim operation not supported

2016-01-19 Thread Chris Murphy
My guess? The passthrough is causing the error when the command passes
through to the actual device, which doesn't support Trim.

I don't know how it actually works, but you can try to poke it with this
stick: copy a large file to this LV. Check the LV with lvdisplay. Delete
the file. Fstrim. Lvdisplay. Now compare the two lvdisplay results.

It should show the PEs used are less after fstrim.
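
A concrete sequence, with the VG name and mountpoint as placeholders:

dd if=/dev/urandom of=/mnt/thin/big.bin bs=1M count=512
lvs vg0               # note Data% for the thin LV
rm /mnt/thin/big.bin
fstrim -v /mnt/thin
lvs vg0               # Data% should drop if the discard reached the pool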


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] HDD badblocks

2016-01-19 Thread Chris Murphy
On Mon, Jan 18, 2016, 4:39 AM Alessandro Baggi <alessandro.ba...@gmail.com>
wrote:

> Il 18/01/2016 12:09, Chris Murphy ha scritto:
> > What is the result for each drive?
> >
> > smartctl -l scterc <device>
> >
> >
> > Chris Murphy
> > ___
> > CentOS mailing list
> > CentOS@centos.org
> > https://lists.centos.org/mailman/listinfo/centos
> > .
> >
> SCT Error Recovery Control command not supported
>



The drive is disqualified unless your use case can tolerate the possibly
very high error recovery time for these drives.

Do a search for Red Hat documentation on the SCSI Command Timer. By default
this is 30 seconds. You'll have to raise this to 120 or maybe even 180
depending on the maximum time the drive attempts to recover. The SCSI
Command Timer is a kernel setting per block device. Basically the kernel
is giving up and resetting the link to the drive, because while the drive
is in deep recovery it doesn't respond to anything.
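
For example (sdX hypothetical; the setting does not persist across
reboots):

cat /sys/block/sdX/device/timeout        # default is 30
echo 180 > /sys/block/sdX/device/timeout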




Chris Murphy




___
> CentOS mailing list
> CentOS@centos.org
> https://lists.centos.org/mailman/listinfo/centos
>
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] HDD badblocks

2016-01-19 Thread Chris Murphy
On Tue, Jan 19, 2016, 3:30 PM  <m.r...@5-cent.us> wrote:

> Chris Murphy wrote:
> > On Mon, Jan 18, 2016, 4:39 AM Alessandro Baggi
> > <alessandro.ba...@gmail.com>
> > wrote:
> >> Il 18/01/2016 12:09, Chris Murphy ha scritto:
> >> > What is the result for each drive?
> >> >
> >> > smartctl -l scterc <device>
> >> >
> >> SCT Error Recovery Control command not supported
> >>
> > The drive is disqualified unless your usecase can tolerate the possibly
> > very high error recovery time for these drives.
> >
> > Do a search for Red Hat documentation on the SCSI Command Timer. By
> > default
> > this is 30 seconds. You'll have to raise this to 120 out maybe even 180
> > depending on the maximum time the drive attempts to recover. The SCSI
> > Command Timer is a kernel seeing per block device. Basically it's giving
> > up, and resetting the link to drive because while the drive is in deep
> > recovery it doesn't respond to anything.
> >
> Replace the drive. Yesterday.


That's just masking the problem; his setup will still be misconfigured
for RAID.

It's a 512e AF drive? If so, the bad sector count is inflated by 8. In
reality less than 15 sectors are bad. And none have been reallocated due to
misconfiguration.


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] HDD badblocks

2016-01-19 Thread Chris Murphy
On Tue, Jan 19, 2016 at 3:24 PM, Warren Young <w...@etr-usa.com> wrote:

> On a modern hard disk, you should *never* see bad sectors, because the drive 
> is busy hiding all the bad sectors it does find, then telling you everything 
> is fine.

This is not a given. Misconfiguration can make persistent bad sectors
very common, and this misconfiguration is the default situation in
RAID setups on Linux. This, and user
error, are the top causes for RAID 5 implosion on Linux (both mdadm
and lvm raid). The necessary sequence:

1. The drive needs to know the sector is bad.
2. The drive needs to be asked to read that sector.
3. The drive needs to give up trying to read that sector.
4. The drive needs to report the sector LBA back to the OS.
5. The OS needs to write something back to that same LBA.
6. The drive will write to the sector, and if it fails, will remap the
LBA to a different (reserve) physical sector.

Where this fails on Linux is steps 3 and 4. By default consumer drives
either don't support SCT ERC, as is the case in this thread, or
it's disabled. That condition means the time spent on deep recovery of
bad sectors can be very high, 2 or 3 minutes. Usually it's less than
this, but often it's more than the kernel's default SCSI command
timer. When a command to the drive doesn't complete successfully in
the default of 30 seconds, the kernel resets the link to the drive,
which obliterates the entire command queue contents and the work it
was doing to recover the bad sector. Therefore step 4 never happens,
and no steps after that either.

Hence, bad sectors accumulate. And the consequence of this often
doesn't get figured out until a user looks at kernel messages and sees
a bunch of hard link resets and has a WTF moment, and asks questions.
More often they don't see those reset messages, or they don't ask
about them, so the next consequence is a drive fails. When it's a
drive other than one with bad sectors, in effect there are two bad
strips per stripe during reads (including rebuild) and that's when
there's total array collapse even though there was only one bad drive.
As a mask for this problem people are using RAID 6, but it's still a
misconfiguration that can cause RAID 6 failures too.


>> Why smartctl does not update Reallocated_Event_Count?
>
> Because SMART lies.

Nope. The drive isn't being asked to write to those bad sectors. If it
can't successfully read the sector without error, it won't migrate the
data on its own (some drives never do this). So it necessitates a
write to the sector to cause the remap to happen.

The other thing is the bad sector count on 512e AF drives is inflated.
The number of bad sectors is in 512 byte sector increments. But there
is no such thing on an AF drive. One bad physical sector will be
reported as 8 bad sectors. And to fix the problem it requires writing
exactly all 8 of those logical sectors at one time in a single command
to the drive. Ergo I've had 'dd if=/dev/zero of=/dev/sda seek=blah
count=8' fail with a read error, due to the command being internally
reinterpreted as read-modify-write. Ridiculous but true. So you have
to use bs=4096 and count=1, and of course adjust seek LBA to be based
on 4096 bytes instead of 512.

So the simplest fix here is:

echo 160 > /sys/block/sdX/device/timeout

That's needed for each member drive. Note this is not a persistent
setting. And then this:

echo repair > /sys/block/mdX/md/sync_action

That's once. You'll see the read errors in dmesg, and md writing back
to the drive with the bad sector.

This problem affects all software raid, including btrfs raid1. The
ideal scenario is you'll use 'smartctl -l scterc,70,70 /dev/sdX' in
startup script, so the drive fails reads on marginally bad sectors
with an error in 7 seconds maximum.
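
A minimal startup-script sketch (the device glob and timeout value are
illustrative; the linux-raid wiki carries a fuller version):

for i in /dev/sd? ; do
    if smartctl -l scterc,70,70 $i > /dev/null ; then
        : # drive accepted the 7 second ERC setting
    else
        echo 180 > /sys/block/${i##*/}/device/timeout
    fi
done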

The linux-raid@ list is chock full of this as a recurring theme.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] HDD badblocks

2016-01-18 Thread Chris Murphy
What is the result for each drive?

smartctl -l scterc <device>


Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] HDD badblocks

2016-01-18 Thread Chris Murphy
Also useful: a complete dmesg posted somewhere (unless your MUA can be
set to not wrap lines).

Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Centos 7, grub2-mkconfig, unsupported sector [followup]

2016-01-01 Thread Chris Murphy
On Tue, Dec 29, 2015 at 2:43 PM,  <m.r...@5-cent.us> wrote:
> m.r...@5-cent.us wrote:
>>
>>Well, I get back from vacation, and three CentOS 7 boxes didn't come up
>> this morning (my manager and the other admin did the update & reboot).
>> On these three - but *not* another one or two, and I don't think those
>> others are Dells, they're supermicro's - the 327 kernel fell into the
>> rdosshell, I guess. I finally got one the three up by going back to the
>> 228.14 kernel.
>>
>>Now, after googling and finding the CentOS bugzilla, 0009860, that
>> referenced the upstream bugzilla, I applied the workaround discussed in
>> it, Adding initcall_blacklist=clocksource_done_booting to
>> GRUB_CMDLINE_LINUX in /etc/default/grub and then grub2-mkconfig -o
>> /etc/grub2.cfg"

You're writing the grub.cfg to the wrong location with the wrong name.
It needs to go to /boot/grub2/grub.cfg - that's where the bootloader
looks for it.

grubby looks for /etc/grub2.cfg which is a symlink to
/boot/grub2/grub.cfg, because grubby.
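
In other words, the command you want is:

grub2-mkconfig -o /boot/grub2/grub.cfg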



>>
>>On one. On the other, that's got a *large* RAID appliance (a JetStor w/
>> 42 4TB drives...), it seemed to work... then gave me a dozen or so
>> ERROR: unsupported sector size 4096 on /dev/sdd.
>> ERROR: unsupported sector size 4096 on /dev/sde.
>>
>>0. Did the grub2-mkconfig actually work correctly?
>>1. Is it safe to ignore the errors and reboot?
>>2. This seems to be an old bug in os-prober, that was fixed
>>  years ago - has it slipped back in?

I'm willing to guess that something now causes this RAID device to
report a logical sector size of 4096 bytes, rather than 512 bytes as
is ordinarily the case for real hard drives, most of which are now
512e AF drives. It is true that 4096 byte logical sectors aren't
supported by bootloaders on BIOS systems; in theory it could be
supported on UEFI systems.
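
You can confirm what the device is reporting with:

blockdev --getss /dev/sdd     # logical sector size
blockdev --getpbsz /dev/sdd   # physical sector size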

> There's still the question of why os-probe, which was supposedly fixed
> almost five years ago, according to what I google, is back to complaining
> at 4k sectors.

I suggest adding to /etc/default/grub

GRUB_DISABLE_OS_PROBER="true"

Since you only care about the current system being added to the boot
menu, searching for other OS's is irrelevant anyway.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Firewalld broken on Centos7?

2015-08-19 Thread Chris Murphy
On Wed, Aug 19, 2015 at 5:54 AM, Andrew Holway andrew.hol...@gmail.com wrote:
 Hi,

 I have a standard Centos7 AMI. Can anyone tell me whats happening here?

 Thanks,

 Andrew
 Aug 19 11:17:23 master dhclient[22897]: bound to 10.141.10.49 -- renewal in
 1795 seconds.
 Aug 19 11:17:24 master network: Determining IP information for eth0... done.
 Aug 19 11:17:24 master network: [  OK  ]
 Aug 19 11:17:24 master systemd: Started LSB: Bring up/down networking.
 Aug 19 11:23:43 master firewalld: 2015-08-19 11:23:43 ERROR: Failed to
 apply rules. A firewall reload might solve the issue if the firewall has
 been modified using ip*tables or ebtables.
 Aug 19 11:23:43 master firewalld: 2015-08-19 11:23:43 ERROR:
 '/sbin/iptables -D INPUT_ZONES -t filter -i eth0 -g IN_public' failed:
 iptables: No chain/target/match by that name.
 Aug 19 11:23:43 master firewalld: 2015-08-19 11:23:43 ERROR:
 COMMAND_FAILED: '/sbin/iptables -D INPUT_ZONES -t filter -i eth0 -g
 IN_public' failed: iptables: No chain/target/match by that name.
 Aug 19 11:35:58 master yum[23685]: Erased:
 cloud-init-0.7.5-10.el7.centos.1.x86_64

Firewalld and iptables are mutually exclusive, at least on Fedora.
There might be some use case for combining static and dynamic rules
(?) but I'd expect you should disable one or the other.
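
Assuming you want to keep firewalld, and that the iptables-services
package is what's loading the static rules, something like:

systemctl stop iptables ip6tables
systemctl disable iptables ip6tables
systemctl restart firewalld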

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Optimum Block Size to use

2015-08-19 Thread Chris Murphy
On Tue, Aug 18, 2015 at 9:38 PM, Jatin Davey jasho...@cisco.com wrote:

 I have a few queries with respect to the block size being set in the system:

 1. Is 4k the optimum block size considering the amount of writes / second
 the application performs ?

 2. How do i find out the optimum block size given the application load in
 terms of reads / writes per second ?

 3. If there is a better block size that i can use , Can you suggest one ?

 4. What are the pros / Cons of changing the default block size ?

On x86, it's effectively fixed at 4096 bytes. There is a clustering
option in ext4 called bigalloc which isn't the same thing as block
size but might be what you're looking for if you have a lot of larger
file writes happening. But you need CentOS 7 to get this feature.
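
If you want to confirm what an existing ext filesystem is using
(partition name hypothetical):

tune2fs -l /dev/sdXN | grep 'Block size'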

 5. We use ext3 as the file system for the partition which has heavy writes
 per second , Should we migrate it to ext4 ? Any pros / cons for it ?

Piles of pros, and no meaningful cons. Just use ext4 with defaults.


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Optimum Block Size to use

2015-08-19 Thread Chris Murphy
On Wed, Aug 19, 2015 at 4:47 AM, Leon Fauster
leonfaus...@googlemail.com wrote:

 furthermore check the fs alignment with
 the underlying disk ...

This is very important. Certain workloads and certain AF drive
firmware can really suck when there's a lot of read-modify-write done
by the drive (internally) if the fs block is not aligned to physical
sector size. I'm pretty sure parted and fdisk on CentOS 6 do
properly align, whereas they don't on CentOS 5. Proper alignment is
when the partition start LBA is divisible by 8. So a start LBA of 63
is not aligned, whereas 2048 is aligned and now common.
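
A quick way to check, assuming parted 2.4 or later:

parted /dev/sdX align-check optimal 1   # repeat per partition number

or just eyeball the Start column of 'fdisk -lu /dev/sdX' and confirm
each value is divisible by 8.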


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Grub legacy on Centos 7

2015-08-16 Thread Chris Murphy
On Sun, Aug 16, 2015 at 3:18 PM, Sachin Gupta sachin3072...@gmail.com wrote:
 Hello Everyone,

 We have centos6  server. And we are planning to upgrade it to Centos7.And
 GRUB 2 needs a new bios grub partition.

BIOS boot partition is only necessary on GPT partitioned disks. For
MBR partitioned disks, the GRUB 2 core.img goes into the MBR gap, the
same as before. On rare occasion when the 1st partition starts at LBA
63 the core.img can't fit into the gap. The supported workaround is
repartitioning such that the 1st partition starts at LBA 2048. The less
supported (and not recommended) workaround is to use grub2-install with -f.


Creating a new partition is too
 much risky.

It really isn't. You can use gparted to resize/move the /boot
partition/volume safely and fast. And if it blows up, just reformat it
in the installer, which isn't a bad idea to do anyway; there's no need
to keep an old CentOS 6 /boot partition around.

I am wondering if it is possible to replace Grub2 with Grub
 legacy on Centos7 machine?

Yeah just yum erase grub2 and then force the installation of the
CentOS 6 grub package; then run grub-install.
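
A sketch of that (the grub package filename is from CentOS 6 and the
exact version is hypothetical):

yum erase grub2
rpm -ivh grub-0.97-*.el6.x86_64.rpm
grub-install /dev/sda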

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] grub-install

2015-08-14 Thread Chris Murphy
On Fri, Aug 14, 2015 at 7:18 PM, Sachin Gupta sachin3072...@gmail.com wrote:
 Hello Everyone,

 I am a newbie. When I try to install GRUB2 on centos 5.2 system, I get
 following error.

CentOS 5 and 6 don't support GRUB 2, only GRUB legacy is supported.
There's a decent chance you could grab a Fedora RPM and it will
install OK but I don't know what version to recommend because CentOS 5
is so much older than even the oldest Fedora that first had GRUB 2.

It's a bit tedious to build GRUB 2 yourself from upstream, but that's
also an option. I only recommend that for computers with BIOS
firmware. There are some significant differences with the UEFI build
of GRUB 2 between upstream and RH/Fedora that might make you want to
drink heavily.

Note that upstream, the commands are grub-install and grub-mkconfig,
whereas for RH/Fedora it's grub2-install and grub2-mkconfig to avoid
confusion with the legacy versions that RH still supports.



 centos5: grub-install  /dev/sda
 //sbin/grub-setup: warn: This GPT partition label has no BIOS Boot
 Partition; embedding won't be possible!.
 //sbin/grub-setup: warn: Embedding is not possible.  GRUB can only be
 installed in this setup by using blocklists.  However, blocklists are
 UNRELIABLE and their use is discouraged..
 //sbin/grub-setup: error: will not proceed with blocklists.

 Can you please help me figure out the problem ?

Well I can't tell if this grub-install is actually GRUB 2 upstream,
which I suspect it might be because I'm not aware that legacy supports
GPT (?) But since GRUB legacy is from the Pleistocene I may have just
forgotten. Anyway, the simplest solution is to carve out 1MiB of free
space somehow, and make a 1MiB partition. I suggest using GPT fdisk,
a.k.a. gdisk. Set the partition type to EF02, which will set the
partition type GUID to that of BIOS Boot. And now GRUB should install
- it will automatically find that partition.

If you use parted, the flag to use is bios_grub, which does the same
thing as EF02 in gdisk.
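
With parted that could look like this (the partition number in the
second command is hypothetical):

parted /dev/sda mkpart biosgrub 1MiB 2MiB
parted /dev/sda set 3 bios_grub on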



-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CentOS 5 grub boot problem

2015-08-07 Thread Chris Murphy
On Fri, Aug 7, 2015 at 8:12 AM, Bowie Bailey bowie_bai...@buc.com wrote:
 I tried the grub commands you gave and still got the same results. I also
 have a copy of the SuperGrub disc, which is supposed to be able to fix grub
 problems.  It can boot the drive, but it can't fix it.  If nothing else, I
 guess I could just leave that disc in the drive and use it to boot the
 system.

 I'm going to do a fresh install to the new drives and see if that works.

I suppose it's worth a shot. But like I mentioned earlier, keep in
mind that CentOS 5 predates AF drives, so it will not correctly
partition these drives such that they have proper 8 sector alignment.

If you haven't already, check the logic board firmware and the HBA
firmware for current updates.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CentOS 5 grub boot problem

2015-08-06 Thread Chris Murphy
On Thu, Aug 6, 2015 at 2:29 PM, Bowie Bailey bowie_bai...@buc.com wrote:
 On 8/6/2015 4:21 PM, Chris Murphy wrote:

 On Thu, Aug 6, 2015 at 2:08 PM, Bowie Bailey bowie_bai...@buc.com wrote:

 Doing a new install on the two 1TB drives is my current plan.  If that
 works, I can connect the old drive, copy over all the data, and then try
 to
 figure out what I need to do to get all the programs running again.

 Sounds like a pain. I would just adapt the CentOS 6 program.log
 commands for your case. That's a 2 minute test. And it ought to work.


 I'm not familiar with that.  How would I go about adapting the CentOS 6
 program.log commands?

I mentioned it in the last two posts yesterday on this subject.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CentOS 5 grub boot problem

2015-08-06 Thread Chris Murphy
On Thu, Aug 6, 2015 at 2:39 PM, Leon Fauster leonfaus...@googlemail.com wrote:
 Am 06.08.2015 um 22:21 schrieb Chris Murphy li...@colorremedies.com:
 On Thu, Aug 6, 2015 at 2:08 PM, Bowie Bailey bowie_bai...@buc.com wrote:

 Doing a new install on the two 1TB drives is my current plan.  If that
 works, I can connect the old drive, copy over all the data, and then try to
 figure out what I need to do to get all the programs running again.

 Sounds like a pain. I would just adapt the CentOS 6 program.log
 commands for your case. That's a 2 minute test. And it ought to work.

 Clearly the computer finds the drive, reads the MBR and executes stage
 1. The missing part is it's not loading or not executing stage 2 for
 some reason. I'm just not convinced the bootloader is installed
 correctly is the source of the problem with the 2nd drive. It's not
 like the BIOS or HBA card firmware is going to faceplant right in
 between stage 1 and stage 2 bootloaders executing. If there were a
 problem there, the drive simply wouldn't show up and no part of the
 bootloader would get loaded.


 on which OS (eg. c5, c6) was the partition created?

For the OP, I think it was CentOS 5, but he only said it's running CentOS 5 now.

For my test, it was CentOS 6, but that uses the same version of GRUB
legacy so the bootloader installation method for raid1 disks should be
the same.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CentOS 5 grub boot problem

2015-08-06 Thread Chris Murphy
On Thu, Aug 6, 2015 at 2:29 PM, Bowie Bailey bowie_bai...@buc.com wrote:

 Definitely a strange problem.  I'm hoping that doing a new install onto
 these drives rather than trying to inherit the install used on the smaller
 drives will work better.

The CentOS installer, and parted, predate AF drives, so the
partitioning will not be correct with a new installation. There's no
way to get the installer to do proper alignment. You can partition
correctly in advance, and then have the installer reuse those
partitions though.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CentOS 5 grub boot problem

2015-08-06 Thread Chris Murphy
On Thu, Aug 6, 2015 at 2:08 PM, Bowie Bailey bowie_bai...@buc.com wrote:

 Doing a new install on the two 1TB drives is my current plan.  If that
 works, I can connect the old drive, copy over all the data, and then try to
 figure out what I need to do to get all the programs running again.

Sounds like a pain. I would just adapt the CentOS 6 program.log
commands for your case. That's a 2 minute test. And it ought to work.

Clearly the computer finds the drive, reads the MBR and executes stage
1. The missing part is it's not loading or not executing stage 2 for
some reason. I'm just not convinced the bootloader is installed
correctly is the source of the problem with the 2nd drive. It's not
like the BIOS or HBA card firmware is going to faceplant right in
between stage 1 and stage 2 bootloaders executing. If there were a
problem there, the drive simply wouldn't show up and no part of the
bootloader would get loaded.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CentOS 5 grub boot problem

2015-08-06 Thread Chris Murphy
On Thu, Aug 6, 2015 at 2:57 PM, Bowie Bailey bowie_bai...@buc.com wrote:
 On 8/6/2015 4:39 PM, Chris Murphy wrote:

 On Thu, Aug 6, 2015 at 2:29 PM, Bowie Bailey bowie_bai...@buc.com wrote:

 On 8/6/2015 4:21 PM, Chris Murphy wrote:

 On Thu, Aug 6, 2015 at 2:08 PM, Bowie Bailey bowie_bai...@buc.com
 wrote:

 Doing a new install on the two 1TB drives is my current plan.  If that
 works, I can connect the old drive, copy over all the data, and then
 try
 to
 figure out what I need to do to get all the programs running again.

 Sounds like a pain. I would just adapt the CentOS 6 program.log
 commands for your case. That's a 2 minute test. And it ought to work.


 I'm not familiar with that.  How would I go about adapting the CentOS 6
 program.log commands?

 I mentioned it in the last two posts yesterday on this subject.


 Ok.  I'll give that a try tomorrow.  Just a couple of questions.

 install --stage2=/boot/grub/stage2 /grub/stage1 d (hd0,1) /grub/stage2 p
 (hd0,1)/grub/grub.conf

 It looks like this mixes paths relative to root and relative to /boot.  Did
 your test system have a separate /boot partition?

Yes.


 The --stage2 argument is
 os stage2 file according to my man page. Should this be relative to root
 even with a separate /boot partition?

I think it's being treated as a directory because it's going to access
this stage2 file.


 Also, why are the exact same root and install commands run twice in the log
 you show?  Is that just a duplicate, or does it need to be run twice for
 some reason?

I do not know. The whole thing is foreign to me. But both drives are
bootable as hd0 (the only drive connected). So it makes sense that the
configuration is treating this as an hd0 based installation of the
bootloader to both drives. The part where the stage 1 and 2 are
directed to separate drives must be the 'device (hd0) /dev/vdb'
command. Again, I don't know why it isn't either 'device (hd0) (hd1)'
or 'device /dev/vda /dev/vdb' but that's what the log sayeth.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CentOS 5 grub boot problem

2015-08-06 Thread Chris Murphy
On Thu, Aug 6, 2015 at 2:59 PM, Gordon Messmer gordon.mess...@gmail.com wrote:
 On 08/05/2015 10:23 AM, Chris Murphy wrote:

 Nothing about hd0 or hd1 gets baked into the bootloader code. It's an
 absolute reference to a physical drive at the moment in time the
 command is made.


 Is that true?  If I have a system with two disks, where device.map labels
 one as hd0 and the other as hd1, and I swap those numbers, the resulting
 boot sector will differ by one bit.

Hrmm I can't reproduce this in a VM with identical drives. Are you
sure stage 2 is in an identical location on both drives? That would
account for a one-bit (or more) difference since GRUB's stage 1
contains an LBA to jump to, rather than depending on an MBR partition
active bit (boot flag) to know where to go next.


 My understanding was that those IDs are used to map to the BIOS disk ID.
 Stage 2 will be read from the partition specified to the grub installer at
 installation, as in:
 grub install --stage2=/boot/grub/stage2 /grub/stage1 d (hd0,1)/grub/stage2
 p (hd0,1)/grub/grub.conf

 Stage 1 and 1.5 will, as far as I know, always be on (hd0), but stage2 might
 not be.  If the BIOS device IDs specified in device.map weren't written into
 the boot loader, how else would it know where to look for them?

stage 1 cannot point to another drive at all. Its sole purpose is to
find stage 2, which must be on the same drive. Stage 1.5 is optional,
and I've never seen it get used on Linux, mainly because in the time
of GRUB legacy, I never encountered an installer that supported XFS
for /boot.

https://www.gnu.org/software/grub/manual/legacy/grub.html#Images



-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CentOS 5 grub boot problem

2015-08-06 Thread Chris Murphy
On Thu, Aug 6, 2015 at 5:04 PM, Chris Murphy li...@colorremedies.com wrote:

 I might try nerfing the parted and grub stage 1 bootloaders on disk2,
 and see if the grub shell (which I should still get to from disk 1)
 will let me install grub directly on these two drives properly.

OK I did that and this works.

## At the GRUB boot menu, hit c to get to a shell
grub> root (hd0,0)
grub> setup (hd0)
grub> setup (hd1)

That's it.

What I did to test was I zero'd the first 440 bytes of vdb and the
first 512 bytes of vdb1. I confirmed that this disk alone does not
boot at all. After running the above commands, either drive boots.

NOW, I get to say I've seen stage 1.5 get used because when it did the
setup, it said it was embedding /grub/e2fs_stage1_5. In the above case
hd0,0 is first disk first partition which is /boot.

Anyway, this seems about 8 million times easier than linux
grub-install CLI. Now I get to look back at OP's first email and see
if he did this exact same thing already, and whether we've come full
circle.

-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CentOS 5 grub boot problem

2015-08-06 Thread Chris Murphy
On Thu, Aug 6, 2015 at 5:19 PM, Chris Murphy li...@colorremedies.com wrote:
 Now I get to look back at OP's first email and see
 if he did this exact same thing already, and whether we've come full
 circle.

Shit. He did.

All I can think of is that either the GRUB/BIOS device designations
are wrong (they should be (hd2) or (hd3); I can't actually tell how
many drives are connected to this system when all of this is
happening), so the bootloader is installing to a totally different
drive. Or yeah, there is in fact some goofy incompatibility with an
HBA where it gets to stage 1.5 and then implosion happens. *shrug*


-- 
Chris Murphy
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

