Re: [linux-lvm] bug? shrink lv by specifying pv extent to be removed does not behave as expected

2023-04-11 Thread Roger Heflin
On Tue, Apr 11, 2023 at 1:44 AM matthew patton  wrote:
>
> > my plan is to scan a disk for usable sectors and map the logical volume
> > around the broken sectors.
>
> 1977 called, they'd like their non-self-correcting HD controller 
> implementations back.
>
> From a real-world perspective there is ZERO (more like negative) utility to 
> this exercise. Controllers remap blocks all on their own and the so-called 
> geometry is entirely fictitious anyway. From a script/program "because I want 
> to" perspective you could leave LVM entirely out of it and just use a file 
> with arbitrary offsets scribbled with a "bad" signature.


The disks should be able to remap sectors all on their own.  Few
(none?) of the SATA/SAS non-RAID controllers I know of do any disk-level
remapping; some of the hardware RAID ones may.  As implemented, the
disk's decision to remap does not always work correctly.  I have a
number of disks, across several generations, that refuse to remap what
is clearly a bad sector (a re-write to the sector succeeds, an immediate
re-read fails, another re-write succeeds and again immediately fails on
re-read, yet the sector never gets remapped).

So given this, if one does not want to replace such disks, there is
still room for software-level remaps to make use of the significant
number of disks that have only a limited set of bad sectors.
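
As a rough illustration of the software-remap idea (a sketch only: the
device name and extent numbers are made up, and mapping a bad sector to
a physical extent has to account for the PE start offset and extent
size, see pvs -o+pe_start,vg_extent_size):

badblocks -b 512 -o /tmp/sdb.bad /dev/sdb          # collect the bad-sector list
pvcreate /dev/sdb
vgcreate badvg /dev/sdb
# suppose the bad sectors all land inside physical extents 1200-1205:
lvcreate -n usable -l 1200 badvg /dev/sdb:0-1199   # everything before the bad area
lvextend -l +2000 badvg/usable /dev/sdb:1206-3205  # continue after it, skipping the bad extents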



Re: [linux-lvm] bug? shrink lv by specifying pv extent to be removed does not behave as expected

2023-04-09 Thread Roger Heflin
On Sun, Apr 9, 2023 at 1:21 PM Roland  wrote:
>
> > Well, if the LV is being used for anything real, then I don't know of
> > anything where you could remove a block in the middle and still have a
> > working fs.   You can only reduce fs'es (the ones that you can reduce)
> > by reducing off of the end and making it smaller.
>
> yes, that's clear to me.
>
> > It makes zero sense to be able to remove a block in the middle of a LV
> > used by just about everything that uses LV's as nothing supports being
> > able to remove a block in the middle.
>
> yes, that criticism is totally valid. from a fs point of view you completely
> corrupt the volume, that's clear to me.
>
> > What is your use case that you believe removing a block in the middle
> > of an LV needs to work?
>
> my use case is creating some badblocks script with lvm which intelligently
> handles and skips broken sectors on disks which can't be used otherwise...
>
> my plan is to scan a disk for usable sectors and map the logical volume
> around the broken sectors.
>
> whenever more sectors get broken, i'd like to remove the broken ones to have
> a usable lv without broken sectors.
>
> since you need to rebuild your data anyway for that disk, you can also
> recreate the whole logical volume.
>
> my question and my project is a little bit academic. i'd simply want to try
> out how much use you can have from some dead disks which are trash 
> otherwise...
>
>
> the manpage is telling this:
>
>
> Resize an LV by specified PV extents.
>
> lvresize LV PV ...
> [ -r|--resizefs ]
> [ COMMON_OPTIONS ]
>
>
>
> so, that sounds like that i can resize in any direction by specifying extents.
>
>
> > Now if you really need to remove a specific block in the middle of the
> > LV then you are likely going to need to use pvmove with specific
> > blocks to replace those blocks with something else.
>
> yes, pvmove is the other approach for that.
>
> but will pvmove continue/finish by all means when moving extents located on a
> bad sector ?
>
> the data may be corrupted anyway, so i thought it's better to skip it.
>
> what i'm really after is some "remap a physical extent to a healthy/reserved
> section and let zfs selfheal do the rest".  just like "dismiss the problematic
> extents and replace with healthy extents".
>
> i'd prefer remapping instead of removing a PE, as removing will invalidate
> the whole LV
>
> roland
>


Create an LV per device, and when a device is replaced, lvremove that
device's LV.  Once a sector/area has gone bad I would not trust those
sectors until you replace the device.  You may be able to retry the
pvmove multiple times, and the disk may eventually manage to rebuild
the data.

My experience with bad sectors is that once a sector reports bad, the
disk will often rewrite it at the same location and call it "good" even
though it is going to report bad again almost immediately, or turn into
a uselessly slow sector.  Sometimes the disk does relocate the sector on
a re-write/successful read, but that seems unreliable.

On non-zfs fs'es I have found the "bad" file, renamed it, and put it in
a dir called badblocks.  So long as the bad block is in file data you
can contain the bad block by containing the bad file.  And since most of
the disk will be file data, that is a management scheme that does not
require a fs rebuild.
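
On ext2/3/4 a rough way to find which file owns a given bad block is
debugfs (a sketch only; the device path, block number and directories
below are illustrative, and the block number must be a filesystem block
number, not a raw device sector):

debugfs -R "icheck 123456" /dev/mapper/vg-data    # fs block -> inode number
debugfs -R "ncheck 7890" /dev/mapper/vg-data      # inode number -> pathname
mkdir -p /data/badblocks                          # quarantine the file so the bad block stays pinned
mv /data/path/to/file /data/badblocks/badfile.1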

The re-written sector may also be "slow", and it might be wise to treat
those sectors as bad too; in the "slow" sector case pvmove should
actually work.  For that you would need a badblocks-style tool that
times the reads and treats any sector taking longer than, say, 0.25
seconds as slow/bad.  At 5400 rpm, 0.25s/250ms translates to around 22
failed re-read tries.  If you time whole reads you may have to re-test
the group in smaller aligned reads to figure out which sector in the
main read was the bad one.  If you scanned often enough for slow sectors
you might catch them before they go completely bad.  Technically the
disk is supposed to do that in its own background scans, but even when I
have turned those scans up to daily it does not seem to act right.
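
A very rough sketch of such a timed scan (device, chunk size and
threshold are illustrative; a real tool would want finer-grained,
aligned reads to pin down the exact sector):

#!/bin/sh
# read the disk in 1MiB direct-I/O chunks and flag any chunk slower than 0.25s
dev=/dev/sdb
chunks=$(( $(blockdev --getsize64 $dev) / 1048576 ))
for i in $(seq 0 $(( chunks - 1 ))); do
    t0=$(date +%s.%N)
    dd if=$dev of=/dev/null bs=1M count=1 skip=$i iflag=direct status=none
    t1=$(date +%s.%N)
    if [ "$(echo "$t1 - $t0 > 0.25" | bc)" = 1 ]; then
        echo "slow/suspect chunk at offset ${i} MiB"
    fi
done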

And I have usually found that the bad "units" are 8 runs of 8 512-byte
sectors each, i.e. about 32k total, aligned on the disk.



Re: [linux-lvm] bug? shrink lv by specifying pv extent to be removed does not behave as expected

2023-04-09 Thread Roger Heflin
On Sun, Apr 9, 2023 at 10:18 AM Roland  wrote:
>
> hi,
>
> we can extend a logical volume by arbitrary pv extents like this :
>
>
> root@s740:~# lvresize mytestVG/blocks_allocated -l +1 /dev/sdb:5
>Size of logical volume mytestVG/blocks_allocated changed from 1.00
> MiB (1 extents) to 2.00 MiB (2 extents).
>Logical volume mytestVG/blocks_allocated successfully resized.
>
> root@s740:~# lvresize mytestVG/blocks_allocated -l +1 /dev/sdb:10
>Size of logical volume mytestVG/blocks_allocated changed from 2.00
> MiB (2 extents) to 3.00 MiB (3 extents).
>Logical volume mytestVG/blocks_allocated successfully resized.
>
> root@s740:~# lvresize mytestVG/blocks_allocated -l +1 /dev/sdb:15
>Size of logical volume mytestVG/blocks_allocated changed from 3.00
> MiB (3 extents) to 4.00 MiB (4 extents).
>Logical volume mytestVG/blocks_allocated successfully resized.
>
> root@s740:~# lvresize mytestVG/blocks_allocated -l +1 /dev/sdb:20
>Size of logical volume mytestVG/blocks_allocated changed from 4.00
> MiB (4 extents) to 5.00 MiB (5 extents).
>Logical volume mytestVG/blocks_allocated successfully resized.
>
> root@s740:~# pvs --segments -olv_name,seg_start_pe,seg_size_pe,pvseg_start  -O pvseg_start
>    LV               Start  SSize  Start
>    blocks_allocated     0      1      0
>                         0      4      1
>    blocks_allocated     1      1      5
>                         0      4      6
>    blocks_allocated     2      1     10
>                         0      4     11
>    blocks_allocated     3      1     15
>                         0      4     16
>    blocks_allocated     4      1     20
>                         0 476917     21
>
>
> how can i do this in reverse ?
>
> when i specify the physical extent to be added, it works - but when i specify
> the physical extent to be removed,
> the last one is being removed but not the specified one.
>
> see here for example - i wanted to remove extent number 10 like i did
> add it, but instead extent number 20
> is being removed
>
> root@s740:~# lvresize mytestVG/blocks_allocated -l -1 /dev/sdb:10
>Ignoring PVs on command line when reducing.
>WARNING: Reducing active logical volume to 4.00 MiB.
>THIS MAY DESTROY YOUR DATA (filesystem etc.)
> Do you really want to reduce mytestVG/blocks_allocated? [y/n]: y
>Size of logical volume mytestVG/blocks_allocated changed from 5.00
> MiB (5 extents) to 4.00 MiB (4 extents).
>Logical volume mytestVG/blocks_allocated successfully resized.
>
> root@s740:~# pvs --segments -olv_name,seg_start_pe,seg_size_pe,pvseg_start  -O pvseg_start
>    LV               Start  SSize  Start
>    blocks_allocated     0      1      0
>                         0      4      1
>    blocks_allocated     1      1      5
>                         0      4      6
>    blocks_allocated     2      1     10
>                         0      4     11
>    blocks_allocated     3      1     15
>                         0 476922     16
>
>
> how can i remove extent number 10 ?
>
> is this a bug ?
>

Well, if the LV is being used for anything real, then I don't know of
anything where you could remove a block in the middle and still have a
working fs.   You can only reduce fs'es (the ones that you can reduce)
by reducing off of the end and making it smaller.

It makes zero sense to be able to remove a block in the middle of a LV
used by just about everything that uses LV's as nothing supports being
able to remove a block in the middle.

What is your use case that you believe removing a block in the middle
of an LV needs to work?

Now if you really need to remove a specific block in the middle of the
LV then you are likely going to need to use pvmove with specific
blocks to replace those blocks with something else.



Re: [linux-lvm] lvconvert --uncache takes hours

2023-03-02 Thread Roger Heflin
On Thu, Mar 2, 2023 at 11:44 AM Gionatan Danti  wrote:
>
> Il 2023-03-02 01:51 Roger Heflin ha scritto:
> > A spinning raid6 array is slow on writes (see raid6  write penalty).
> > Because of that the array can only do about 100 write operattions/sec.
>
> True. But does flushing cached data really proceed in random LBA order
> (as seen by HDDs), rather than trying to coalesce writes in linear
> fashion?
>
It is a 100G cache over 16TB, so even if it flushes in order the chunks
may not be that close to each other (1 in 160).

Also, as pieces are dropped from and added to the cache over time, the
cache is not in LBA order on the SSD, so proper coalescing would require
reading the entire cache and sorting the ~3,000,000 location entries
before starting the de-stage.  If I had to guess, that complication has
not been coded yet and the de-stage simply starts at the beginning of
the cache and continues to the end.

Even if it were coded, once you have enough blocks cached and the blocks
are spread out at, say, one or two per track, it breaks down to writing
a tiny bit on each track with a seek in between, which is roughly the
time required to simply read/write the HD end to end.  At 150MB/sec
(which should be about the platter speed) that would take around 3.5
hours.


> > If the disk is doing other work then it only has the extra capacity so
> > it could destage slower.
> >
> > A lot depends on how big each chunk is. The lvmcache indicates the
> > smallest chunksize is 32k.
> >
> > 100G / 32k = 3 million, and at 100seeks/sec that comes to at least an
> > hour.
>
> You are off an order of magnitude: 3 million IOPs at 100 IOPS means
> ~30,000s, so about 9 hours.

Right, I did the calc in my head and screwed it up.  I thought it
should have been higher but did not re-check it.
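
Redone with the numbers from this thread (a quick shell cross-check,
nothing more):

cache=$(( 100 * 1024 * 1024 * 1024 ))   # 100G cache
chunk=$(( 32 * 1024 ))                  # 32k smallest chunk size
chunks=$(( cache / chunk ))             # ~3.3 million cached chunks
echo $(( chunks / 100 ))                # seconds at ~100 random writes/sec: ~32768s, roughly 9 hours idle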
>
> > Lvm bookkeeping has to also be written to the spinning disks I would
> > think, so 2 hours if the array were idle.
> >
> > Throw in a 50% baseload on the disks and you get 4 hours.
> >
> > Hours is reasonable.
>
> If flushing happens in random disk order, than yes, you are bound to
> wait several hours indeed.
>



Re: [linux-lvm] lvconvert --uncache takes hours

2023-03-02 Thread Roger Heflin
On Thu, Mar 2, 2023 at 2:34 AM Roy Sigurd Karlsbakk  wrote:
>
>
> - Original Message -
> > From: "Roger Heflin" 
> > To: "linux-lvm" 
> > Cc: "Malin Bruland" 
> > Sent: Thursday, 2 March, 2023 01:51:08
> > Subject: Re: [linux-lvm] lvconvert --uncache takes hours
>
> > On Wed, Mar 1, 2023 at 4:50 PM Roy Sigurd Karlsbakk  
> > wrote:
> >>
> >> Hi all
> >>
> >> Working with a friend's machine, it has lvmcache turned on with writeback. 
> >> This
> >> has worked well, but now it's uncaching and it takes *hours*. The amount of
> >> cache was chosen to 100GB on an SSD not used for much else and the dataset 
> >> that
> >> is being cached, is a RAID-6 set of 10x2TB with XFS on top. The system 
> >> mainly
> >> works with file serving, but also has some VMs that benefit from the 
> >> caching
> >> quite a bit. But then - I wonder - how can it spend hours emptying the 
> >> cache
> >> like this? Most write caching I know of last only seconds or perhaps in 
> >> really
> >> worst case scenarios, minutes. Since this is taking hours, it looks to me
> >> something should have been flushed ages ago.
> >>
> >> Have I (or we) done something very stupid here or is this really how it's
> >> supposed to work?
> >>
> >> Vennlig hilsen
> >>
> >> roy
> >
> > A spinning raid6 array is slow on writes (see raid6 write penalty).
> > Because of that the array can only do about 100 write operations/sec.
>
> About 100 writes/second per data drive, that is. md parallilses I/O well.
>

No.  On writes you get about 100 writes to the raid6 in total.  With
reads you get ~100 iops per disk.  The writes, by their very raid6
nature, cannot be parallelized.

Each write to md requires a lot of work.  At minimum you have to read
the sector you are writing, read the parity you need to update,
calculate the parity changes, then write the data and re-write every
parity block that changed.  Your other option is to write an entire
stripe, but that requires writes to all data disks plus the parity
calculation plus writes to the parity disks.  All ways of writing data
to raid5/6 break down to a total write iops roughly equal to the iops
of a single disk.  The raid5/6 format requires these multiple reads and
writes, and that is what makes it slow on writes.
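
As a back-of-the-envelope illustration of the penalty (a simplified
textbook read-modify-write count, not a statement about md internals;
the disk count and per-disk iops are illustrative):

reads=3; writes=3             # read data+P+Q, then write data+P+Q, for one small write
per_disk_iops=100; disks=10
echo $(( disks * per_disk_iops / (reads + writes) ))   # ~166 small writes/sec as an upper bound

Real-world stripe-cache behavior, metadata updates and seek contention
push the sustained number down further, which is why ~100 writes/sec
for the whole array is a reasonable planning figure.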

> > If the disk is doing other work then it only has the extra capacity so
> > it could destage slower.
>
> The system was mostly idle.
>
> > A lot depends on how big each chunk is. The lvmcache indicates the
> > smallest chunksize is 32k.
> >
> > 100G / 32k = 3 million, and at 100seeks/sec that comes to at least an hour.
>
> Those 100GB was on SSD, not spinning rust. Last I checked, that was the whole 
> point with caching.

You are de-staging the SSD cache to the spinning disks, correct?  The
writes to the spinning disks are the slow part.

>
> > Lvm bookkeeping has to also be written to the spinning disks I would
> > think, so 2 hours if the array were idle.
>
> erm - why on earth would you do writes to hdd if you're caching it?

Once the cache is gone, all of the LVM bookkeeping has to live on the spinning disks.

>
> > Throw in a 50% baseload on the disks and you get 4 hours.
> >
> > Hours is reasonable.
>
> As I said, the system was idle.
>
> Vennlig hilsen
>




Re: [linux-lvm] lvconvert --uncache takes hours

2023-03-01 Thread Roger Heflin
On Wed, Mar 1, 2023 at 4:50 PM Roy Sigurd Karlsbakk  wrote:
>
> Hi all
>
> Working with a friend's machine, it has lvmcache turned on with writeback. 
> This has worked well, but now it's uncaching and it takes *hours*. The amount 
> of cache was chosen to 100GB on an SSD not used for much else and the dataset 
> that is being cached, is a RAID-6 set of 10x2TB with XFS on top. The system 
> mainly works with file serving, but also has some VMs that benefit from the 
> caching quite a bit. But then - I wonder - how can it spend hours emptying 
> the cache like this? Most write caching I know of last only seconds or 
> perhaps in really worst case scenarios, minutes. Since this is taking hours, 
> it looks to me something should have been flushed ages ago.
>
> Have I (or we) done something very stupid here or is this really how it's 
> supposed to work?
>
> Vennlig hilsen
>
> roy

A spinning raid6 array is slow on writes (see raid6 write penalty).
Because of that the array can only do about 100 write operations/sec.

If the disks are doing other work then only the spare capacity is
available for de-staging, so it de-stages slower.

A lot depends on how big each chunk is.  The lvmcache documentation
indicates the smallest chunksize is 32k.

100G / 32k = 3 million, and at 100 seeks/sec that comes to at least an hour.

Lvm bookkeeping has to also be written to the spinning disks I would
think, so 2 hours if the array were idle.

Throw in a 50% baseload on the disks and you get 4 hours.

Hours is reasonable.




Re: [linux-lvm] LVM2 : performance drop even after deleting the snapshot

2022-10-14 Thread Roger Heflin
What is the underlying disk hardware you are running this on?
virtual, spinning, ssd, nvme?

On Thu, Oct 13, 2022 at 2:01 AM Pawan Sharma  wrote:
>
> adding this to lvm-devel mailing list also.
>
> Regards,
> Pawan
> 
> From: Pawan Sharma
> Sent: Wednesday, October 12, 2022 10:42 PM
> To: linux-lvm@redhat.com 
> Cc: Mitta Sai Chaithanya ; Kapil Upadhayay 
> 
> Subject: LVM2 : performance drop even after deleting the snapshot
>
> Hi Everyone,
>
>
> We are evaluating lvm2 snapshots and doing performance testing on it. This is 
> what we are doing :
>
> dump some data to lvm2 volume (using fio)
> take the snapshot
> delete the snapshot (no IOs anywhere after creating the snapshot)
> run the fio on lvm2 volume
>
> Here as you can see, we are just creating the snapshot and immediately 
> deleting it. There are no IOs to the main volume or anywhere. When we run the 
> fio after this (step 4) and we see around 50% drop in performance with 
> reference to the number we get in step 1.
>
> It is expected to see a performance drop if there is a snapshot because of 
> the COW. But here we deleted the snapshot, and it is not referring to any 
> data also. We should not see any performance drop here.
>
> Could someone please help me understand this behavior. Why are we seeing the 
> performance drop in this case? It seems like we deleted the snapshot but 
> still it is not deleted, and we are paying the COW penalty.
>
> System Info:
>
> OS : ubuntu 18.04
> Kernel : 5.4.0
>
> # lvm version
>   LVM version: 2.02.176(2) (2017-11-03)
>   Library version: 1.02.145 (2017-11-03)
>   Driver version:  4.41.0
>
> We also tried on latest ubuntu with newer version of LVM. We got the same 
> behavior.
>
> Any help/pointers would be appreciated. Thanks in advance.
>
> Regards,
> Pawan



Re: [linux-lvm] How to change default system dir

2022-09-29 Thread Roger Heflin
Assuming you are write-protecting /etc for a reason, simply mv
everything in /etc/lvm to your new location and replace the lvm dir
with a symbolic link pointing at that location.

In the far past, when booting with read-only devices, I had lots of
links in /etc/ pointing to files that were not on the read-only and/or
writable-but-lost-on-boot (overlay) device.
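
A minimal sketch of that approach, using the path from your mail
(adjust to taste):

mv /etc/lvm /home/my_user/new_lvm_dir
ln -s /home/my_user/new_lvm_dir /etc/lvm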

On Thu, Sep 29, 2022 at 5:58 AM Zdenek Kabelac  wrote:
>
> Dne 28. 09. 22 v 17:16 Bartłomiej Błachut napsal(a):
> > Hi,
> > I have a question for you, because what I need is to change in my
> > ubuntu(22.04) the localisation of default-system-dir from /etc/lvm to a 
> > place
> > where I have write access e.g. /home/my_user/new_lvm_dir. I've tried so far 
> > to
> > provide to ./configure option
> > --with-default-system-dir=/home/my_user/new_lvm_dir or later I replace
> > everywhere in code pattern /etc/lvm to /home/my_user/new_lvm_dir but without
> > any success. Are you able to give me any idea how I can do it ?
> >
> > I tried it on main/stable branches git://sourceware.org/git/lvm2.git
> > 
> >
>
> Hi
>
>
> lvm2 basically always require a 'root' access for any 'real usable' lvm2
> command - so saying you don't have write access to /etc  looks very strange to
> start with.
>
> We have also configure option  --with-confdir=[/etc]
> but I still think you should start with explaining what are you actually
> trying to do - as maybe there is some design issue??
>
>
> Regards
>
> Zdenek
>


Re: [linux-lvm] Problem with partially activate logical volume

2022-08-04 Thread Roger Heflin
fsck might be able to fix it enough that debugfs and/or mount works,
but it may also eliminate all the data, but since you have a clone and
no data it is probably worth a shot.

fsck -f -y   will either get you some data or not and make you have to
clone from the copy if you have something else to  try.

Did you try the "-c" option on debug fs?

On Thu, Aug 4, 2022 at 2:07 AM Ken Bass  wrote:
>
>
> That's pretty much it. Whenever any app attempts to read a block from the 
> missing drive, I get the "Buffer I/O error" message. So, even though my 
> recovery apps can scan the LV, marking blocks on the last drive as 
> missing/unknown/etc., they can't display any recovered data - which I know 
> does exist. Looking at raw data from the apps' scans, I can see directory 
> entries, as well as files. I'm sure the inodes and bitmaps are still there 
> for some of these, I just can't really reverse engineer and follow them 
> through. But isn't that what the apps are supposed to do?
>
> As for debugfs: pretty much the same issue: in order to use it, I need to 
> open the fs. But that (in debugfs) fails as well. So it can't help much. 
> Unless I'm missing something about debugfs.
>
> The one thing I haven't tried is to use vgreduce to remove the missing PV; 
> but that will also remove the LV as well, which is why I haven't tried it yet.
>
> Sorry I haven't replied sooner, but it takes a long time (days) to clone, 
> then scan 16Tb...
>
> So, please any suggestions are greatly appreciated, as well as needed.
>
> ken
>
> (I know: No backup; got burned; it hurts; and I will now always have backups. 
> 'Nuf said.)
>
>
> On Thu, Jul 28, 2022 at 3:12 AM Roger James  
> wrote:
>>
>> The procedure outlined should at least get you back to a state where the lv 
>> is consistent but with blank sectors where the data is missing. I would 
>> suggest using dd to make a backup partition image. Then you can either work 
>> on that or the original to mend the fs.
>>
>> On 27 July 2022 11:50:07 Roger Heflin  wrote:
>>
>>> I don't believe that is going to work.
>>>
>>> His issue is that the filesystem is refusing to work because of the
>>> missing data.
>>>
>>> man debugfs
>>>
>>> It will let you manually look at the metadata and structures of the
>>> ext2/3/4 fs.  You will likely need to use the "-c" option.
>>>
>>> It will be very manual and you should probably read up on the fs
>>> structure a bit.
>>>
>>> A data recovery company could get most of the data back, but they
>>> charge 5k-10k per TB, so likely close to 100k US$.
>>>
>>> And the issues will be that 1/3 of the metadata was on the missing
>>> disk, and some of the data was on the missing disk.
>>>
>>> I was able to do debugfs /dev/sda2  (my /boot) and do an ls and list
>>> out the files and then do a dump  /tmp/junk.out and copy out
>>> that file.
>>>
>>> So the issue will be writing up a script to do lses and find all of
>>> the files and dump all of the files to someplace else.
>>>
>>> On Wed, Jul 27, 2022 at 2:39 AM Roger James  
>>> wrote:
>>>>
>>>>
>>>> Try https://www.linuxsysadmins.com/recover-a-deleted-physical-volume/?amp
>>>>
>>>> On 26 July 2022 09:16:32 Ken Bass  wrote:
>>>>>
>>>>>
>>>>> (fwiw: I am new to this list, so please bear with me.)
>>>>>
>>>>> Background: I have a very large (20TB) logical volume consisting of 3 
>>>>> drives. One of those drives unexpectedloy died (isn't that always the 
>>>>> case :-)). The drive that failed happened to be the last PV. So I am 
>>>>> assuming that there is still 2/3 of the data still intact and, to some 
>>>>> extent, recoverable. Although, apparently the ext4 fs is not recognised.
>>>>>
>>>>> I activated the LV partially (via -P). But running any utility on that 
>>>>> (eg: dumpe2fs, e2fsck, ...) I get many of these  in dmesg:
>>>>>
>>>>> "Buffer I/O error on dev dm-0, logical block xxx, async page read."  
>>>>> The thing is, the xxx block is on the missing drive/pv.
>>>>>
>>>>> I have also tried some recovery software, but eventually get these same 
>>>>> messages, and the data recovered is not really useful.
>>>>>
>>>>> Please help! How can I get passed that dmesg err

Re: [linux-lvm] Problem with partially activate logical volume

2022-07-27 Thread Roger Heflin
I don't believe that is going to work.

His issue is that the filesystem is refusing to work because of the
missing data.

man debugfs

It will let you manually look at the metadata and structures of the
ext2/3/4 fs.  You will likely need to use the "-c" option.

It will be very manual and you should probably read up on the fs
structure a bit.

A data recovery company could get most of the data back, but they
charge 5k-10k per TB, so likely close to 100k US$.

And the issues will be that 1/3 of the metadata was on the missing
disk, and some of the data was on the missing disk.

I was able to do debugfs /dev/sda2 (my /boot), do an ls to list out the
files, and then dump one of them to /tmp/junk.out and copy out that
file.

So the remaining work is writing a script to run the listings, find all
of the files, and dump them all to someplace else.
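
A rough sketch of that kind of bulk copy-out (the LV path and
directories are illustrative; -c opens the fs without trusting the
bitmaps, -R runs a single request):

mkdir -p /recovered
debugfs -c -R "ls -l /" /dev/mapper/vg-lv                           # see what is still reachable
debugfs -c -R "rdump /home /recovered" /dev/mapper/vg-lv            # recursively dump a directory tree
debugfs -c -R "dump /etc/fstab /recovered/fstab" /dev/mapper/vg-lv  # or copy out single files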

On Wed, Jul 27, 2022 at 2:39 AM Roger James  wrote:
>
> Try https://www.linuxsysadmins.com/recover-a-deleted-physical-volume/?amp
>
> On 26 July 2022 09:16:32 Ken Bass  wrote:
>>
>> (fwiw: I am new to this list, so please bear with me.)
>>
>> Background: I have a very large (20TB) logical volume consisting of 3 
>> drives. One of those drives unexpectedloy died (isn't that always the case 
>> :-)). The drive that failed happened to be the last PV. So I am assuming 
>> that there is still 2/3 of the data still intact and, to some extent, 
>> recoverable. Although, apparently the ext4 fs is not recognised.
>>
>> I activated the LV partially (via -P). But running any utility on that (eg: 
>> dumpe2fs, e2fsck, ...) I get many of these  in dmesg:
>>
>> "Buffer I/O error on dev dm-0, logical block xxx, async page read."  The 
>> thing is, the xxx block is on the missing drive/pv.
>>
>> I have also tried some recovery software, but eventually get these same 
>> messages, and the data recovered is not really useful.
>>
>> Please help! How can I get passed that dmesg error, and move on. 14TB 
>> recovered is better than 0.
>>
>> TIA
>> ken
>>
>>



Re: [linux-lvm] Recovering from a failed pvmove

2022-06-29 Thread Roger Heflin
Did your operating system have backups/archives of the old cfgs?

Fedora/RedHat keeps around 20 old copies, typically going back several
months, so usually one of those can be used.

My archive dir looks like this for the root vg (note I do not do a lot
of vg work, typically):
 ls -l /etc/lvm/archive/fedora*
-rw---. 1 root root 3159 Oct 16  2021
/etc/lvm/archive/fedora_00031-1021957302.vg
-rw---. 1 root root 2740 Oct 16  2021
/etc/lvm/archive/fedora_00032-1140767599.vg
-rw---. 1 root root 2331 Oct 16  2021
/etc/lvm/archive/fedora_00033-104560841.vg
-rw---. 1 root root 2084 Oct 16  2021
/etc/lvm/archive/fedora_00034-239698665.vg
-rw---. 1 root root 2330 Oct 16  2021
/etc/lvm/archive/fedora_00035-1994061504.vg
-rw---. 1 root root 1973 Oct 16  2021
/etc/lvm/archive/fedora_00036-974449793.vg
-rw---. 1 root root 1997 Oct 16  2021
/etc/lvm/archive/fedora_00037-1503932417.vg
-rw---. 1 root root 1997 Oct 16  2021
/etc/lvm/archive/fedora_00038-951442204.vg
-rw---. 1 root root 1997 Oct 16  2021
/etc/lvm/archive/fedora_00039-989943813.vg
-rw---. 1 root root 1968 Oct 16  2021
/etc/lvm/archive/fedora_00040-815563362.vg
-rw---. 1 root root 1997 Oct 16  2021
/etc/lvm/archive/fedora_00041-1303737065.vg
-rw---. 1 root root 1968 Oct 16  2021
/etc/lvm/archive/fedora_00042-714254626.vg
-rw---. 1 root root 1976 Oct 16  2021
/etc/lvm/archive/fedora_00043-858775161.vg
-rw---. 1 root root 2229 Oct 16  2021
/etc/lvm/archive/fedora_00044-1360584830.vg
-rw---. 1 root root 2238 Oct 16  2021
/etc/lvm/archive/fedora_00045-1806472194.vg
-rw---. 1 root root 2218 Oct 16  2021
/etc/lvm/archive/fedora_00046-519515389.vg
-rw---. 1 root root 1992 Oct 16  2021
/etc/lvm/archive/fedora_00047-1997891375.vg
-rw---. 1 root root 1988 Oct 16  2021
/etc/lvm/archive/fedora_00048-1666128451.vg
-rw---. 1 root root 2396 Oct 18  2021
/etc/lvm/archive/fedora_00049-67607509.vg
-rw---. 1 root root 2396 Oct 18  2021
/etc/lvm/archive/fedora_00050-1102265641.vg
-rw---. 1 root root 2404 Oct 18  2021
/etc/lvm/archive/fedora_00051-243833122.vg
-rw---. 1 root root 1738 Oct 18  2021
/etc/lvm/archive/fedora_00052-1619420890.vg
-rw---. 1 root root 1744 Oct 18  2021
/etc/lvm/archive/fedora_00053-149551096.vg
-rw---. 1 root root 1748 Nov 11  2021
/etc/lvm/archive/fedora_00054-1785934566.vg
-rw---. 1 root root 1748 Nov 11  2021
/etc/lvm/archive/fedora_00055-1992369253.vg
-rw---. 1 root root 1719 Nov 11  2021
/etc/lvm/archive/fedora_00056-26730175.vg
-rw---. 1 root root 1748 Nov 11  2021
/etc/lvm/archive/fedora_00057-1989410182.vg
-rw---. 1 root root 1719 Nov 11  2021
/etc/lvm/archive/fedora_00058-1590942582.vg
-rw---. 1 root root 1748 Nov 11  2021
/etc/lvm/archive/fedora_00059-140743745.vg

I have only personally had to edit the lvm archive I needed to use a
few times, and that has always been to remove the MISSING flag from
PVs that are no longer missing, since the MISSING flag blocks the
archive from being used.

Most of the time when I am correcting VGs I revert to the archive copy
that was taken before the bad steps were done.

There are options that can be put in /etc/lvm/lvm.conf and
/etc/lvm/lvmlocal.conf to make the archives get collected if they are
not being collected automatically.  I am pretty sure I have seen a few
distributions that do not collect the archives, which normally give you
a restore point to revert to (without editing the file directly).
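
The relevant settings live in the backup section of lvm.conf (a sketch
with illustrative values; see lvm.conf(5) for the details):

backup {
    backup = 1                          # keep /etc/lvm/backup up to date
    archive = 1                         # keep pre-change copies in archive_dir
    archive_dir = "/etc/lvm/archive"
    retain_min = 10                     # keep at least this many archives
    retain_days = 30                    # and keep them at least this long
}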


On Wed, Jun 29, 2022 at 2:32 AM Roger James  wrote:
>
> I have now managed to fix the problem. I ran vgcfgbackup, then made a copy of 
> the backup for safety purposes. I than hand edited the original backup to 
> remove the missing pv (pv4), the root lv and the pvmove0 lv. I then than 
> vgcfgrestore. Everything is working.
>
> There must be a better way of doing this. Hand editing cfg files is not safe 
> or sensible. What have I missed?
>
> Roger
>
> On 28 June 2022 07:38:48 Roger James  wrote:
>>
>> Hi,
>>
>> I am struggling to recover from a failed pvmove. Unfortunately I only have a 
>> limited knowledge of lvm. I setup my lvm configuration many years ago.
>>
>> I was trying to move a lv to a SSD using pvmove. Unfortunately my brand new 
>> SSD choose that moment to fail (never buy cheap SSDs, lesson learnt!").
>>
>> This is the current status.
>>
>> roger@dragon:~$ sudo pvs
>>   WARNING: Couldn't find device with uuid 
>> uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ.
>>   WARNING: VG wd is missing PV uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ (last 
>> written to [unknown]).
>>   PV VG Fmt Attr PSize PFree
>>   /dev/sda1 wd lvm2 a-- <465.76g 0
>>   /dev/sdb1 wd lvm2 a-- <465.76g <80.45g
>>   /dev/sdc2 wd lvm2 a-- 778.74g 278.74g
>>   /dev/sdd1 wd lvm2 a-- <465.76g 0
>>   [unknown] wd lvm2 a-m <784.49g 685.66g
>> roger@dragon:~$ sudo lvs
>>   WARNING: Couldn't find device with uuid 
>> uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ.
>>   WARNING: VG wd is missing PV 

Re: [linux-lvm] Recovering from a failed pvmove

2022-06-29 Thread Roger Heflin
For a case like this vgcfgrestore is probably the best option.  man
vgcfgrestore.

You need to see if you have archived vg copies that you can revert to
before the "add" of the pv that went bad.

The archives are typically in /etc/lvm/archive/* on RedHat-derivative
OSes; I am not sure if they are located differently (and/or configured
to exist) on other distributions.

grep -i before /etc/lvm/archive/* and see which archive was made
before the initial pv addition.  vgcfgrestore -f <archive file> should
work, but I usually have to adjust the command line options to get it
to work when I have used it to revert configs.  I think in this case it
will find the vg and pvid correctly.  No cleanup should be needed so
long as the other device is completely gone.

And you will probably need to answer some prompts and warnings, and then
reboot the machine, and/or do this all under a livecd rescue boot.
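
Roughly (the archive file name below is made up; the VG name wd is
from your output):

grep -i before /etc/lvm/archive/wd_*.vg                    # find the archive taken before the bad PV was added
vgcfgrestore -f /etc/lvm/archive/wd_00123-456789012.vg wd  # restore that metadata
vgchange -ay wd                                            # then re-activate the VG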

What kind of cheap ssd were you using?  I have had really bad luck with
ones without RAM.  I RMA'ed one that failed in under a week and the new one
also failed in a very similar way in under a week.

On Tue, Jun 28, 2022 at 1:38 AM Roger James 
wrote:

> Hi,
>
> I am struggling to recover from a failed pvmove. Unfortunately I only have
> a limited knowledge of lvm. I setup my lvm configuration many years ago.
>
> I was trying to move a lv to a SSD using pvmove. Unfortunately my brand
> new SSD choose that moment to fail (never buy cheap SSDs, lesson learnt!").
>
> This is the current status.
>
> roger@dragon:~$ sudo pvs
>   WARNING: Couldn't find device with uuid
> uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ.
>   WARNING: VG wd is missing PV uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ
> (last written to [unknown]).
>   PV VG Fmt Attr PSize PFree
>   /dev/sda1 wd lvm2 a-- <465.76g 0
>   /dev/sdb1 wd lvm2 a-- <465.76g <80.45g
>   /dev/sdc2 wd lvm2 a-- 778.74g 278.74g
>   /dev/sdd1 wd lvm2 a-- <465.76g 0
>   [unknown] wd lvm2 a-m <784.49g 685.66g
> roger@dragon:~$ sudo lvs
>   WARNING: Couldn't find device with uuid
> uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ.
>   WARNING: VG wd is missing PV uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ
> (last written to [unknown]).
>   LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
>   home wd -wi--- 1.46t
>   root wd -wI-p- <108.83g
>
>   swap wd -wi--- 8.00g
>   work wd -wi--- 200.00g
> roger@dragon:~$ sudo vgs
>   WARNING: Couldn't find device with uuid
> uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ.
>   WARNING: VG wd is missing PV uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ
> (last written to [unknown]).
>   VG #PV #LV #SN Attr VSize VFree
>   wd 5 4 0 wz-pn- 2.89t 1.02t
>
> This is a recap of what I have tried so far.
>
> roger@dragon:~$ sudo pvmove --abort
>   WARNING: Couldn't find device with uuid
> uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ.
>   WARNING: VG wd is missing PV uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ
> (last written to [unknown]).
>   LVM command executed by lvmpolld failed.
>   For more information see lvmpolld messages in syslog or lvmpolld log
> file.
> roger@dragon:~$ sudo vgreduce --removemissing wd
>   WARNING: Couldn't find device with uuid
> uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ.
>   WARNING: VG wd is missing PV uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ
> (last written to [unknown]).
>   WARNING: Couldn't find device with uuid
> uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ.
>   WARNING: Partial LV root needs to be repaired or removed.
>   WARNING: Partial LV pvmove0 needs to be repaired or removed.
>   There are still partial LVs in VG wd.
>   To remove them unconditionally use: vgreduce --removemissing --force.
>   To remove them unconditionally from mirror LVs use: vgreduce
> --removemissing --mirrorsonly --force.
>   WARNING: Proceeding to remove empty missing PVs.
>   WARNING: Couldn't find device with uuid
> uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ.
> roger@dragon:~$ sudo lvchange -an wd/root
>   WARNING: Couldn't find device with uuid
> uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ.
>   WARNING: VG wd is missing PV uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ
> (last written to [unknown]).
> roger@dragon:~$ sudo vgreduce --removemissing wd
>   WARNING: Couldn't find device with uuid
> uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ.
>   WARNING: VG wd is missing PV uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ
> (last written to [unknown]).
>   WARNING: Couldn't find device with uuid
> uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ.
>   WARNING: Partial LV root needs to be repaired or removed.
>   WARNING: Partial LV pvmove0 needs to be repaired or removed.
>   There are still partial LVs in VG wd.
>   To remove them unconditionally use: vgreduce --removemissing --force.
>   To remove them unconditionally from mirror LVs use: vgreduce
> --removemissing --mirrorsonly --force.
>   WARNING: Proceeding to remove empty missing PVs.
>   WARNING: Couldn't find device with uuid
> uMtjop-PmMT-603f-GWWQ-fR4f-s4Sw-XSKNXZ.
> roger@dragon:~$ sudo lvremove wd/pvmove0
>   WARNING: Couldn't find device with uuid
> 

Re: [linux-lvm] lvm commands hanging when run from inside a kubernetes pod

2022-06-03 Thread Roger Heflin
Random thoughts.

Make sure use_lvmetad is 0, and that its systemd units are
stopped/disabled.

Are you mounting /proc and /sys and /dev into the /host chroot?

/run may also be needed.

You might add "-ttt" to the strace command to get per-call timing data.
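
Something like this (names and paths taken from your earlier mails; the
output file is illustrative):

strace -f -ttt -o /tmp/lvcreate.trace /sbin/lvm-eg/lvcreate -n test-lv -L 1G shared-vg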



On Thu, Jun 2, 2022 at 1:41 AM Abhishek Agarwal <
mragarwal.develo...@gmail.com> wrote:

> These are not different LVM processes. The container process is using the
> LVM binary that the node itself has. We have achieved this by using scripts
> that point to the same lvm binary that is used by the node.
>
> Configmap(~shell script) used for the same has the following contents
> where `/host` refers to the root directory of the node:
>
> get_bin_path: |
>   #!/bin/sh
>   bin_name=$1
>   if [ -x /host/bin/which ]; then
> echo $(chroot /host /bin/which $bin_name | cut -d ' ' -f 1)
>   elif [ -x /host/usr/bin/which ]; then
> echo $(chroot /host /usr/bin/which $bin_name | cut -d ' ' -f 1)
>   else
> $(chroot /host which $bin_name | cut -d ' ' -f 1)
>   fi
>
> lvcreate: |
>   #!/bin/sh
>   path=$(/sbin/lvm-eg/get_bin_path "lvcreate")
>   chroot /host $path "$@"
>
> Also, the above logs in the pastebin link have errors because the vg lock
> has not been acquired and hence creation commands will fail. Once the lock
> is acquired, the `strace -f` command gives the following output being
> stuck. Check out this link for full details ->
> https://pastebin.com/raw/DwQfdmr8
>
> P.S: We at OpenEBS are trying to provide lvm storage to cloud native
> workloads with the help of kubernetes CSI drivers and since all these
> drivers run as pods and help dynamic provisioning of kubernetes
> volumes(storage) for the application, the lvm commands needs to be run from
> inside the pod. Reference -> https://github.com/openebs/lvm-localpv
>
> Regards
>
> On Wed, 1 Jun 2022 at 13:06, Demi Marie Obenour <
> d...@invisiblethingslab.com> wrote:
>
>> On Wed, Jun 01, 2022 at 12:20:32AM +0530, Abhishek Agarwal wrote:
>> > Hi Roger. Thanks for your reply. I have rerun the command with `strace
>> -f`
>> > as you suggested. Here is the pastebin link containing the detailed
>> output
>> > of the command: https://pastebin.com/raw/VRuBbHBc
>>
>> Even if you can get LVM “working”, it is still likely to cause data
>> corruption at some point, as there is no guarantee that different LVM
>> processes in different namespaces will see each others’ locks.
>>
>> Why do you need to run LVM in a container?  What are you trying to
>> accomplish?
>> --
>> Sincerely,
>> Demi Marie Obenour (she/her/hers)
>> Invisible Things Lab


Re: [linux-lvm] lvm commands hanging when run from inside a kubernetes pod

2022-05-30 Thread Roger Heflin
You need to rerun with "strace -f".  That way it will also strace the
forked children, one of which it appears to be waiting on.

On Mon, May 30, 2022 at 1:52 AM Abhishek Agarwal <
mragarwal.develo...@gmail.com> wrote:

> When a kubernetes pod is scheduled on the node having lvm2 libraries
> already installed and trying to run lvm commands using those node binaries
> from inside the pod container, the commands hang and are waiting on
> something to complete. Although when ctrl+c is pressed the terminal session
> resumes and checking the final code for the execution returns a "0" error
> code and the commands operation is also carried out successfully.
>
>
> Below is the command output and strace for the command:-
>
> strace of lvcreate on the pod container scheduled on the node where lvm2
> binaries are present:
> # strace /sbin/lvm-eg/lvcreate -n test-lv -L 1G shared-vg
> execve("/sbin/lvm-eg/lvcreate", ["/sbin/lvm-eg/lvcreate", "-n", "test-lv",
> "-L", "1G", "shared-vg"], 0x7ffe7d7a6c98 /* 37 vars */) = 0
> brk(NULL)= 0x55d196f41000
> arch_prctl(0x3001 /* ARCH_??? */, 0x7ffdd8247a80) = -1 EINVAL (Invalid
> argument)
> access("/etc/ld.so.preload", R_OK)   = -1 ENOENT (No such file or
> directory)
> openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
> fstat(3, {st_mode=S_IFREG|0644, st_size=8679, ...}) = 0
> mmap(NULL, 8679, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f0d37895000
> close(3)= 0
> openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
> read(3,
> "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360A\2\0\0\0\0\0"..., 832)
> = 832
> pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"...,
> 784, 64) = 784
> pread64(3,
> "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32,
> 848) = 32
> pread64(3,
> "\4\0\0\0\24\0\0\0\3\0\0\0GNU\0\237\333t\347\262\27\320l\223\27*\202C\370T\177"...,
> 68, 880) = 68
> fstat(3, {st_mode=S_IFREG|0755, st_size=2029560, ...}) = 0
> mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
> 0x7f0d37893000
> pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"...,
> 784, 64) = 784
> pread64(3,
> "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32,
> 848) = 32
> pread64(3,
> "\4\0\0\0\24\0\0\0\3\0\0\0GNU\0\237\333t\347\262\27\320l\223\27*\202C\370T\177"...,
> 68, 880) = 68
> mmap(NULL, 2037344, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) =
> 0x7f0d376a1000
> mmap(0x7f0d376c3000, 1540096, PROT_READ|PROT_EXEC,
> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x22000) = 0x7f0d376c3000
> mmap(0x7f0d3783b000, 319488, PROT_READ,
> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19a000) = 0x7f0d3783b000
> mmap(0x7f0d37889000, 24576, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e7000) = 0x7f0d37889000
> mmap(0x7f0d3788f000, 13920, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f0d3788f000
> close(3)= 0
> arch_prctl(ARCH_SET_FS, 0x7f0d37894580) = 0
> mprotect(0x7f0d37889000, 16384, PROT_READ) = 0
> mprotect(0x55d196662000, 8192, PROT_READ) = 0
> mprotect(0x7f0d378c5000, 4096, PROT_READ) = 0
> munmap(0x7f0d37895000, 8679)  = 0
> getuid()= 0
> getgid()= 0
> getpid()= 19163
> rt_sigaction(SIGCHLD, {sa_handler=0x55d196657c30, sa_mask=~[RTMIN RT_1],
> sa_flags=SA_RESTORER, sa_restorer=0x7f0d376e40c0}, NULL, 8) = 0
> geteuid()= 0
> brk(NULL)= 0x55d196f41000
> brk(0x55d196f62000)   = 0x55d196f62000
> getppid()= 19160
> stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> stat(".", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> openat(AT_FDCWD, "/sbin/lvm-eg/lvcreate", O_RDONLY) = 3
> fcntl(3, F_DUPFD, 10)  = 10
> close(3)= 0
> fcntl(10, F_SETFD, FD_CLOEXEC) = 0
> geteuid()= 0
> getegid()= 0
> rt_sigaction(SIGINT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0},
> 8) = 0
> rt_sigaction(SIGINT, {sa_handler=0x55d196657c30, sa_mask=~[RTMIN RT_1],
> sa_flags=SA_RESTORER, sa_restorer=0x7f0d376e40c0}, NULL, 8) = 0
> rt_sigaction(SIGQUIT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0},
> 8) = 0
> rt_sigaction(SIGQUIT, {sa_handler=SIG_DFL, sa_mask=~[RTMIN RT_1],
> sa_flags=SA_RESTORER, sa_restorer=0x7f0d376e40c0}, NULL, 8) = 0
> rt_sigaction(SIGTERM, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0},
> 8) = 0
> rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=~[RTMIN RT_1],
> sa_flags=SA_RESTORER, sa_restorer=0x7f0d376e40c0}, NULL, 8) = 0
> read(10, "#!/bin/sh\npath=$(/sbin/lvm-eg/ge"..., 8192) = 79
> pipe([3, 4])  = 0
> clone(child_stack=NULL,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> child_tidptr=0x7f0d37894850) = 19164
> close(4)= 0
> read(3, "/usr/sbin/lvcreate\n", 128)  = 19
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=19164, si_uid=0,
> si_status=0, 

Re: [linux-lvm] LVM autoactivation and udev

2022-03-09 Thread Roger Heflin
On Wed, Mar 9, 2022 at 10:35 AM David Teigland  wrote:
>

>
> - if all three of those don't catch it, then filter-mpath will also
>   check if the component wwid is listed in /etc/multipath/wwids and
>   ignore the device if it is.
>
> If all four of those methods fail to exclude a multipath component, then
> an LV could be activated using the component.  While this isn't good, it
> can be corrected by running lvchange --refresh.  I'd like to get any
> details of that happening to see if we can improve it.
>

I have also had luck with making sure multipath is in the initramfs and
adding rd.lvm.vg= for the root VG on the kernel boot line, so that only
the VG needed for boot is activated early inside the initramfs.  That
has typically let multipathd inside the initrd grab any other devices
and configure them for multipath before LVM activates them.
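
On a Fedora/RHEL-style system that looks roughly like this (VG name and
paths are illustrative):

echo 'add_dracutmodules+=" multipath "' > /etc/dracut.conf.d/multipath.conf
dracut -f                                              # rebuild the initramfs with multipath included
grubby --update-kernel=ALL --args="rd.lvm.vg=rootvg"   # only activate rootvg early in the initramfs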




[linux-lvm] Have tested dm-raid/lv mirroring with block guard on one leg but not the other, it fails to mirror the disk

2021-11-19 Thread Roger Heflin
I have tested dm-raid/LV mirroring with block guard on one leg but not
on the other.  It initially seems to copy/mirror the device, and then
random LVs start dropping the new block-guard leg and set the refresh
attribute on the LV.  About this time we also start getting severe
filesystem corruption.  All that is being done is mirroring the boot
disk to a SAN (block guard) LUN and then splitting off that LUN, and
there is breakage in the 10-30 minute window that takes.

I have tried on 2 vendor kernels and I have tried with fedora 35
(5.14.10-300) and all fail with similar overall results but slightly
different error messages in dmesg.

The RedHat clone 7.9 kernel said this: "tag#0 Add. Sense: Logical
block guard check failed"; none of the other kernels had that good of
a message.  Fedora 35 got a DID_TRANSPORT_DISRUPTED and an I/O error
message against a sector.  The other vendor kernel got basic I/O
errors similar to what you get from a bad sector.

Any idea if it can be made to work?  Or made to refuse and error out in lvm?

I have the commands used, the /etc/lvm/archive/* entries for the
device, and dmesg from the Fedora kernel, since it is almost a current
kernel.org kernel.  On the vendor kernels we tried echoing 0 into the
read_verify integrity entries on the multipath device and the
underlying devices, but LVM seemed to re-enable read_verify on the
devices it built above the ones it was disabled on while it was setting
up/doing the mirroring; we did not try that on the Fedora kernel.
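
The workaround we tried on the vendor kernels looked roughly like this
(device names are illustrative):

for dev in sdb sdc dm-3; do
    cat /sys/block/$dev/integrity/read_verify        # check the current setting
    echo 0 > /sys/block/$dev/integrity/read_verify   # disable read verification on that leg
done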

thoughts?




Re: [linux-lvm] logical volume usage type code, equivalent to GPT partition type GUID

2021-11-03 Thread Roger Heflin
You have some basic problems.

#1 with shared lvm you would need a way to tell where it should go, so
you would have to have a mount_hostname type field in the
logical_volumes section.

#2 you would also need a mount_point entry, and a mount_opts entry.

And then at the end you are basically moving fstab into the
logical_volume headers, and that is not that useful overall because it
does not simplify anything for the most part, and probably actually
complicates things: a hostname change would get tricky, and/or
mountpoint changes would require lvm commands (harder than editing
fstab).

So long as the LVs are named reasonably, someone who knows what they
are doing can mount up / recreate fstab with the LVs in the right place
(say var_log_abrt would mount on /var/log/abrt).

I am not sure this simplifies anything nor improves anything except in
the case of a lost fstab, but naming the lv's verbosely at least makes
that easier.

On Wed, Nov 3, 2021 at 12:45 PM Chris Murphy  wrote:
>
> Hi,
>
> I'm wondering to what degree the current LVM metadata format(s) can
> support additional or even arbitrary metadata.
>
> The UEFI spec defines the GPT, and GPT defines a "partition type GUID"
> for each partition to define it's usage/purpose, in rather open ended
> fashion. I'm wondering about an equivalent for this with LVM, whether
> it's useful and how difficult it would be to implement. This is all
> very hypothetical right now, so a high level discussion is preferred.
>
> The starting point is the Discoverable Partitions Spec:
> http://systemd.io/DISCOVERABLE_PARTITIONS/
>
> Where GPT partition type codes are used to discover file systems, and
> their intended use without having to explicitly place them into
> /etc/fstab for startup time discovery and mounting. But LVM doesn't
> have an equivalent for exposing such a capability, because it implies
> many volumes within the larger pool and also the pool might comprise
> many devices.
>
> The same problem exists for Btrfs subvolumes, and ZFS datasets.
>
> What might be possible and what is definitely not possible, is what
> I'm interested in understanding for now.
>
> Thanks,
>
> --
> Chris Murphy
>



Re: [linux-lvm] Recovering "broken" disk ( 17th )

2021-10-21 Thread Roger Heflin
replying to my last email.

Do the pvcreate --uuid and then do a pvs/lvs/vgs and see if the vg/lv's
look like they are there.  If so, do a vgchange -ay on the VG and then
test mounting the fs.

And if, with the fs either commented out and/or marked ,nofail, the
normal OS boots up, work from there, as you should have the backup
files.
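
A rough sketch of that sequence (the UUID, restore file, device and VG
name below are all placeholders):

pvcreate --uuid "XXXXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XXXXXX" \
         --restorefile /etc/lvm/backup/myvg /dev/sdX2
pvs; vgs; lvs                       # does the vg/lv layout look right again?
vgchange -ay myvg
mount -o ro /dev/myvg/mylv /mnt     # test read-only first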

On Wed, Oct 20, 2021 at 1:01 PM Roger Heflin  wrote:

> is the pv in the root device vg?  if not changing fstab to not mount the
> missing  fs(es) should get it bootable.   I have a practice of putting
> ",nofail" on all non-root filesystems (ie defaults,nofail) since priority
> #1 is getting the machine up and on the network after a reboot such that it
> can be ssh'ed to and fixed as needed.
>
> If it is not the root device then on the root device there should be
> several prior archive copy in /etc/lvm/archive/*, and maybe some
> copies in /etc/lvm/backup.
>
> In the vg backup file there will be a bunch of uuids, you want the
> specific pv uuid and not the vg/lv uuids.  Each pv has a uuid and each lv
> has a uuid and the vg has a uuid.
>
> On Wed, Oct 20, 2021 at 8:39 AM Brian McCullough 
> wrote:
>
>> On Tue, Oct 19, 2021 at 05:06:37AM -0500, Roger Heflin wrote:
>> > I would edit the vgconfig you dd'ed with an editor and make sure it
>> looks
>> > reasonable for what you think you had.
>>
>> It turns out, comparing the information that I pulled off of the drive
>> with what I find in /etc/lvm/backup, that the first part of the vgconfig
>> information is missing.  As I said in one of my messages, the
>> information that I retrieved from the disk starts at 0x1200.  I don't
>> know whether that is correct or not.  It does not appear to be a proper
>> "backup" file, which I think it should be.
>>
>> I rebooted ( partially ) the machine and copied the vgconfig backup file
>> from that, but am somewhat concerned, because I don't seem to be able to
>> match the UUIDs.  The one that I seem to see in the vgconfig data that I
>> pulled off of the drive vs what I got out of /etc/lvm/backup.  Maybe I
>> am just mis-reading it.  I will continue my research for a bit.
>>
>>
>>
>>
>> > When you do the pvcreate --uuid it won't use anything except the uuid
>> info
>> > so the rest may not need to be exactly right, if you have to do a
>> > vgcfgrestore to get it to read the rest of the info will be used.
>>
>> Oh, thank you.   I did see that things got somewhat different on the
>> target drive when I did "pvcreate --uuid --restorefile."  I got paranoid
>> when I saw that, and re-copied the ddrestore file back to the target
>> drive before I did anything else.   Should I do "pvcreate --uuid
>> --norestorefile," instead?  Then, once it is back in the machine, do the
>> pvscan and vgcfgrestore, and expect good things?
>>
>>
>>
>> > I have seen some weird disk controller failures that appeared to zero
>> out
>> > the first bit of the disk (enough to get the partition table, grub, and
>> the
>> > pv header depending on where the first partition starts).
>>
>> I APPEAR to have a partition table, containing an NTFS partition, an LVM
>> partiton ( the one that I am concentrating on ) and a Linux partion.  I
>> would have thought that it was all LVM, but my memory could easily be
>> wrong.
>>
>>
>>
>> > You will need to reinstall grub if this was the bootable disk, since
>> there
>> > were 384 bytes of grub in the sector with the partition table that you
>> know
>> > are missing.
>>
>> Fortunately, this is all data, nothing to do with the boot sequence,
>> except that the machine will not boot with the missing PV.
>>
>>
>>
>> Thank you,
>> Brian
>>

Re: [linux-lvm] Recovering "broken" disk ( 17th )

2021-10-21 Thread Roger Heflin
Is the PV in the root device's VG?  If not, changing fstab to not mount
the missing fs(es) should get it bootable.  I have a practice of putting
",nofail" on all non-root filesystems (i.e. defaults,nofail), since
priority #1 is getting the machine up and on the network after a reboot
so that it can be ssh'ed to and fixed as needed.
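
For example, a non-root data filesystem entry might look like this
(device and mountpoint are illustrative):

/dev/mapper/datavg-datalv  /data  ext4  defaults,nofail  0  2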

If it is not the root device, then on the root device there should be
several prior archive copies in /etc/lvm/archive/*, and maybe some
copies in /etc/lvm/backup.

In the vg backup file there will be a bunch of uuids; you want the
specific pv uuid, not the vg/lv uuids.  Each pv has a uuid, each lv has
a uuid, and the vg has a uuid.

On Wed, Oct 20, 2021 at 8:39 AM Brian McCullough  wrote:

> On Tue, Oct 19, 2021 at 05:06:37AM -0500, Roger Heflin wrote:
> > I would edit the vgconfig you dd'ed with an editor and make sure it looks
> > reasonable for what you think you had.
>
> It turns out, comparing the information that I pulled off of the drive
> with what I find in /etc/lvm/backup, that the first part of the vgconfig
> information is missing.  As I said in one of my messages, the
> information that I retrieved from the disk starts at 0x1200.  I don't
> know whether that is correct or not.  It does not appear to be a proper
> "backup" file, which I think it should be.
>
> I rebooted ( partially ) the machine and copied the vgconfig backup file
> from that, but am somewhat concerned, because I don't seem to be able to
> match the UUIDs.  The UUID I see in the vgconfig data that I pulled off
> of the drive doesn't match what I got out of /etc/lvm/backup.  Maybe I
> am just mis-reading it.  I will continue my research for a bit.
>
>
>
>
> > When you do the pvcreate --uuid it won't use anything except the uuid
> info
> > so the rest may not need to be exactly right, if you have to do a
> > vgcfgrestore to get it to read the rest of the info will be used.
>
> Oh, thank you.   I did see that things got somewhat different on the
> target drive when I did "pvcreate --uuid --restorefile."  I got paranoid
> when I saw that, and re-copied the ddrestore file back to the target
> drive before I did anything else.   Should I do "pvcreate --uuid
> --norestorefile," instead?  Then, once it is back in the machine, do the
> pvscan and vgcfgrestore, and expect good things?
>
>
>
> > I have seen some weird disk controller failures that appeared to zero out
> > the first bit of the disk (enough to get the partition table, grub, and
> the
> > pv header depending on where the first partition starts).
>
> I APPEAR to have a partition table, containing an NTFS partition, an LVM
> partition ( the one that I am concentrating on ) and a Linux partition.  I
> would have thought that it was all LVM, but my memory could easily be
> wrong.
>
>
>
> > You will need to reinstall grub if this was the bootable disk, since
> there
> > were 384 bytes of grub in the sector with the partition table that you
> know
> > are missing.
>
> Fortunately, this is all data, nothing to do with the boot sequence,
> except that the machine will not boot with the missing PV.
>
>
>
> Thank you,
> Brian
>
___
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

Re: [linux-lvm] Recovering "broken" disk ( 17th )

2021-10-20 Thread Roger Heflin
I would edit the vgconfig you dd'ed with an editor and make sure it looks
reasonable for what you think you had.

When you do the pvcreate --uuid it won't use anything except the uuid info,
so the rest may not need to be exactly right; if you have to do a
vgcfgrestore afterwards to get it read in, that is when the rest of the
info will be used.
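Roughly the sequence I have in mind (the device name and paths are
placeholders, substitute your own and sanity-check the backup file first):

  pvcreate --uuid "<pv uuid from the backup file>" \
           --restorefile /etc/lvm/backup/<vgname> /dev/sdX2
  vgcfgrestore -f /etc/lvm/backup/<vgname> <vgname>
  vgchange -ay <vgname>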

I have seen some weird disk controller failures that appeared to zero out
the first bit of the disk (enough to get the partition table, grub, and the
pv header depending on where the first partition starts).

You will need to reinstall grub if this was the bootable disk, since the
sector holding the partition table, which you know is missing, also held
384 bytes of grub code.

On Tue, Oct 19, 2021 at 1:04 AM Brian McCullough 
wrote:

> On Mon, Oct 18, 2021 at 09:49:44AM -0500, Roger Heflin wrote:
> > You will need an lvm backup file for the pvcreate --uuid, I believe (there
> > may be some option to get around needing the backup file).
> >
> > That will put the header back on if you either have an lvm backup and/or
> > archive file, you might also need a vgcfgrestore afterwards depending on
> if
> > anything else is missing.
> >
> > I have never done it, but it looks possible to make a lvm backup file by
> > reading it directly off the disk with dd, so that you will have a file
> that
> > pvcreate is ok with, that is if there is no way to force it without a
> > backup file.
> >
> > But, this should get the pv back showing up with whatever sectors that
> you
> > successfully recovered.
>
> Thank you, Roger.
>
> I might be able to do that on the original machine, which won't boot at
> the moment because of this failure, but I have been using a working
> machine so far to use ddrestore etc.
>
> Yes, as you said, I was able to use dd to copy off the vgconfig data
> into a file.  So you think that I might be able to use that as the
> reference in pvcreate?  That makes some sense.
>
> I started this out asking ( but not clearly ) whether the data on the
> disk appeared to be in the correct place and whether what was there was
> correct, but I think that I answered that for myself with the
> comparison to the working PV partition.
>
>
> Thanks again,
> Brian
>
___
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

Re: [linux-lvm] Recovering "broken" disk ( 17th )

2021-10-18 Thread Roger Heflin
You will need an lvm backup file for the pvcreate --uuid, I believe (there
may be some option to get around needing the backup file).

That will put the header back on if you have an lvm backup and/or archive
file; you might also need a vgcfgrestore afterwards, depending on whether
anything else is missing.

I have never done it, but it looks possible to make an lvm backup file by
reading the metadata directly off the disk with dd, so that you have a file
pvcreate is ok with (that is, if there is no way to force it without a
backup file).
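Something along these lines, assuming the default layout where the metadata
area sits inside the first megabyte of the PV (the device name and sizes
are guesses, check with hexdump first):

  dd if=/dev/sdX2 of=/tmp/pv-meta.bin bs=512 count=2048 2>/dev/null
  strings /tmp/pv-meta.bin | less    # look for the "vgname { ... }" text block

The text from the vg name down to its closing brace is essentially the same
content you would find in an /etc/lvm/backup file.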

But, this should get the pv back showing up with whatever sectors that you
successfully recovered.

On Sun, Oct 17, 2021 at 10:02 PM Brian McCullough  wrote:

>
> Folks,
>
> I have had a disk go bad on me, causing me to lose one PV.
>
>
> I seem to have retrieved the partition using ddrescue, but it also seems
> to be missing some label information, because pvscan doesn't see it.
>
> Using hexdump, I see the string " LVM2 " at 0x1004, but nothing before
> that.  The whole phrase is:
>
> 0x01000  16 d6 8e db 20 4c 56 4d  32 20 78 5b 35 41 25 72
>
>
> I find what appears to be an LVM2 configuration section at 0x1200, and
> so I was able to read the UUID that this PV should have.
>
>
> On another machine, I dumped a PV partition, and find "LABELONE" at
> 0x200, with the same " LVM2 " at 0x01000.
>
> I was concerned that my dump was offset, but the comparison to the
> "good" one suggests that that isn't the problem, but just the missing
> "LABELONE" and related information at 0x0200.
>
> If I do a "pvcreate --uuid " would this fix that recovered partition
> so that pvscan and friends can work properly, and I can finally boot
> that machine?
>
>
> Thank you,
> Brian
>
>
___
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

Re: [linux-lvm] " Failed to find physical volume

2021-09-28 Thread Roger Heflin
A possibility I just debugged for a non-booting system.

If there is a partition table on the underlying device then that
device is not detected as an LVM1/2 member in at least one version of
udevd, and won't be seen nor turned on automatically by the
systemd-udev code.

lvm vgchange -ay worked to enable it (emergency mode, it was the root
pv, so no udevd involvement).  Eventually I found the partition table
and removed it, and the machine would then boot without needing manual
intervention.

dd if=/dev/zero of=/dev/device bs=512 count=1 was used once we determined
there was a partition signature still left (even after partition deletion
with fdisk it still had a header); examining it with dd if=/dev/device
bs=512 count=1 | xxd found 4 non-zero bytes in the block.
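If your util-linux has wipefs, it can do the same thing a bit more
selectively; a sketch (the device name is an example, and the second
command deliberately erases only that one signature):

  wipefs /dev/sdX               # list leftover signatures and their offsets
  wipefs -o 0x1fe /dev/sdX      # erase just the dos/MBR signature, leaving the LVM label at sector 1 alone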

On Sun, Sep 26, 2021 at 9:22 AM alessandro macuz
 wrote:
>
> Thanks Roger, Zdenek,
>
> I have my ZVOLs on my NAS exposed as LUNs. The initiators were switched off,
> and for some unknown reason I found my NAS switched off as well.
> It had run for a long time and I feared the worst (CPU/motherboard/etc.). Instead,
> once powered up, everything started to work again except the LUNs, which seemed
> to be jeopardized.
> I have many ZVOLs used by ESXi, in which I run EVE-NG, which uses LVM; such
> ZVOLs have the same size, so I wanted to inspect them to check the hostname.
>
> Now some LUNs started to work normally, some others still behave weirdly. I 
> will run pvs on them with extra debugs to see what's going on.
>
> Many thanks,
>
> Alex
>
> Le jeu. 23 sept. 2021 à 23:48, Roger Heflin  a écrit :
>>
>> If you have lvmetad running and in use then the lvm commands ask it
>> what the system has on it.
>>
>> I have seen, on random boots, fairly separated systems (rhel7 versions
>> and many-years-newer fedora systems) randomly fail to find one or
>> more pvs.
>>
>> I have disabled it at home, and in my day job we have also disabled
>> (across 20k+ systems) as we confirmed it had inconsistency issues
>> several times on a variety of our newest installs.
>>
>> Stopping lvmetad and/or restarting it would generally fix it.  But
>> it was a source of enough random issues (often a failure to mount on a
>> boot, so often issues that resulted in page-outs to debug) and did
>> not speed things up enough to be worth it, even on systems with
>> >2000 SAN volumes.
>>
>> On Thu, Sep 23, 2021 at 8:52 AM Zdenek Kabelac  wrote:
>> >
>> > Dne 22. 09. 21 v 18:48 alessandro macuz napsal(a):
>> > > fdisk correctly identifies the extended partition as 8e.
>> > > I wonder which kind of data lvmdiskscan and pvs use in order to list LVM
>> > > physical volumes.
>> > > Does PVS check some specific metadata within the partition without just
>> > > relying on the type of partition displayed by fdisk?
>> > >
>> > >
>> >
>> > Hi
>> >
>> > Yes - PVs do have a header signature keeping information about PV attributes
>> > and also a storage area to keep lvm2 metadata.
>> >
>> > Partition flags known to fdisk are irrelevant.
>> >
>> >
>> > Regards
>> >
>> > Zdenek
>> >


___
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

Re: [linux-lvm] " Failed to find physical volume

2021-09-23 Thread Roger Heflin
If you have lvmetad running and in use then the lvm commands ask it
what the system has on it.

I have seen, on random boots, fairly separated systems (rhel7 versions
and many-years-newer fedora systems) randomly fail to find one or
more pvs.

I have disabled it at home, and in my day job we have also disabled
(across 20k+ systems) as we confirmed it had inconsistency issues
several times on a variety of our newest installs.

Stopping lvmetad and/or restarting it would generally fix it.  But it was
a source of enough random issues (often a failure to mount on a boot, so
often issues that resulted in page-outs to debug) and did not speed things
up enough to be worth it, even on systems with >2000 SAN volumes.
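For reference, disabling it amounts to one lvm.conf change plus masking the
units; a sketch (unit names as shipped on RHEL/Fedora-style builds of that
era, adjust for your distro):

  # /etc/lvm/lvm.conf, in the global section
  use_lvmetad = 0

  systemctl mask lvm2-lvmetad.socket lvm2-lvmetad.service

If your initramfs bakes in lvm.conf, rebuild it afterwards as well.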

On Thu, Sep 23, 2021 at 8:52 AM Zdenek Kabelac  wrote:
>
> Dne 22. 09. 21 v 18:48 alessandro macuz napsal(a):
> > fdisk correctly identifies the extended partition as 8e.
> > I wonder which kind of data lvmdiskscan and pvs use in order to list LVM
> > physical volumes.
> > Does PVS check some specific metadata within the partition without just
> > relying on the type of partition displayed by fdisk?
> >
> >
>
> Hi
>
> Yes - PVs do have a header signature keeping information about PV attributes
> and also a storage area to keep lvm2 metadata.
>
> Partition flags known to fdisk are irrelevant.
>
>
> Regards
>
> Zdenek
>

___
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/



Re: [linux-lvm] Discussion: performance issue on event activation mode

2021-06-08 Thread Roger Heflin
The case we had is large physical machines with around 1000 disks.  We did
not see the issue on the smaller cpu/disk-count physicals and/or vm's.  It
seemed like both high cpu counts and high disk counts were needed, but in
our environment both of those usually go together.  The smallest machines
that had the issues had 72 threads (36 actual cores).  And the disk devices
were all SSD SAN luns, so I would expect all of the devices to respond to
and return IO requests in under .3ms under normal conditions.  They were
also all partitioned and multipath'ed.  90% of the disks would not have had
any LVM on them at all, but would have been at least initially scanned by
something; the systemd LVM parts were what was timing out, and based on the
cpu time udev was accumulating in that 90-120 second boot window (around 90
minutes of cpu time) it very much seemed to be having serious cpu time
issues doing something.

I have done some simple tests forking off a bunch of /usr/sbin/lvm pvscan
--cache major:minor commands in the background, rapidly and in parallel,
and cannot get it to really act badly except with numbers that are >2.
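The sort of harness I mean, roughly (untested as written; it just throws
one pvscan --cache per block device at the system at once):

  for mm in $(lsblk -dno MAJ:MIN); do
      /usr/sbin/lvm pvscan --cache "$mm" &
  done
  wait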

And if I am reading the fast direct-pvscan case right, about the only
thing that differs is that it does not spawn off lots of processes and
events, and just does the pvscan once.  Between udev and systemd I am not
clear on how many different events have to be handled and how many of those
events need to spawn new threads and/or fork new processes off.
Something doing one of those 2 things, or both, would seem to have been the
cause of the issue I saw in the past.

When it has difficulty booting up like this, what does ps axuS | grep udev
look like time-wise?


On Mon, Jun 7, 2021 at 10:30 AM heming.z...@suse.com 
wrote:

> On 6/7/21 6:27 PM, Martin Wilck wrote:
> > On Sun, 2021-06-06 at 11:35 -0500, Roger Heflin wrote:
> >> This might be a simpler way to control the number of threads at the
> >> same time.
> >>
> >> On large machines (cpu wise, memory wise and disk wise).   I have
> >> only seen lvm timeout when udev_children is set to default.   The
> >> default seems to be set wrong, and the default seemed to be tuned for
> >> a case where a large number of the disks on the machine were going to
> >> be timing out (or otherwise really really slow), so to support this
> >> case a huge number of threads was required.  I found that with it
> >> set to default on a close to 100 core machine udev got about 87
> >> minutes of cpu time during the boot up (which itself took about 2
> >> minutes).  Changing the number of children to 4 resulted in udev
> >> getting around 2-3 minutes
> >> in the same window, and actually resulted in a much faster boot up
> >> and a much more reliable boot up (no timeouts).
> >
> > Wow, setting the number of children to 4 is pretty radical. We decrease
> > this parameter often on large machines, but we never went all the way
> > down to a single-digit number. If that's really necessary under
> > whatever circumstances, it's clear evidence of udev's deficiencies.
> >
> > I am not sure if it's better than Heming's suggestion though. It would
> > affect every device in the system. It wouldn't even be possible to
> > process more than 4 totally different events at the same time.
> >
>
> hello
>
> I tested udev.children_max with values 1, 2 & 4. The results showed it
> didn't take effect, and the boot time was even longer than before.
> This solution may suit some special cases.
>
> (my env: kvm-qemu vm, 6vpu, 22G mem, 1015 disks)
>
> Regards
> heming
>
>
___
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

Re: [linux-lvm] Issue after upgrading the LVM2 package from 2.02.176 to 2.02.180

2020-09-17 Thread Roger Heflin
There is only a reject; there are no included devices.  Once you add a
filter it overrides the default filter that includes everything, I believe.

The fixes may have been fixing the filtering, as once you specify a filter
there is no longer any implied include.  You should probably look at the
list of fixes that happened between those 2 versions.
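If the intent of that filter was just to hide the drbd devices, adding an
explicit accept-everything-else rule may be what is wanted; a sketch for
lvm.conf:

  filter = [ "r|^/dev/drbd.*|", "a|.*|" ]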


On Thu, Sep 17, 2020 at 2:25 AM KrishnaMurali Chennuboina
 wrote:
>
> Hi Roger,
>
> Missed to add my comment in the earlier mail.
> From the filter it should not exclude /dev/sda*. But I am not sure why it is
> being excluded while executing the pvcreate command.
>
> Thanks.
>
> On Tue, 15 Sep 2020 at 18:28, KrishnaMurali Chennuboina 
>  wrote:
>>
>> Hi Roger,
>>
>> Thanks for the information shared.
>> Filter which we used in our conf file was,
>> # Accept every block device:
>>   filter = [ "r|^/dev/drbd.*|" ]
>>
>>
>> Thanks.
>>
>> On Tue, 15 Sep 2020 at 16:36, Roger Heflin  wrote:
>>>
>>> #1:
>>> Device /dev/sda3 excluded by a filter.)
>>> Failed to execute command: pvcreate -ffy /dev/sda3
>>> ec=0
>>>
>>> "Excluded by a filter" is likely the issue; I think there was a bug where
>>> it allowed that pvcreate to work when it should have blocked it
>>> because of the filter.  It should not allow a pvcreate against
>>> something blocked by a filter.
>>>
>>> #2: Read-only locking type set. Write locks are prohibited.
>>> I am going to guess either / is not mounted rw, or you don't have the
>>> directory mounted rw that is needed to create the locks (/var/run/lvm
>>> usually).
>>>
>>> On Tue, Sep 15, 2020 at 1:42 AM KrishnaMurali Chennuboina
>>>  wrote:
>>> >
>>> > Hi Roger,
>>> >
>>> > I have tried this with the older LVM package(.176) and this issue was not 
>>> > seen. Issue was seen with .180 version every time.
>>> > # executing command: vgchange -ay
>>> > (status, output): (0,   WARNING: Failed to connect to lvmetad. Falling 
>>> > back to device scanning.)
>>> >   WARNING: Failed to connect to lvmetad. Falling back to device scanning.
>>> > # executing command: pvcreate -ffy /dev/sda3
>>> > (status, output): (5,   WARNING: Failed to connect to lvmetad. Falling 
>>> > back to device scanning.
>>> >   Error reading device /dev/sda3 at 0 length 4096.
>>> >   Error reading device /dev/ram0 at 0 length 4096.
>>> >   Error reading device /dev/loop0 at 0 length 4096.
>>> >   Error reading device /dev/sda at 0 length 512.
>>> >   Error reading device /dev/sda at 0 length 4096.
>>> >   Error reading device /dev/ram1 at 0 length 4096.
>>> >   Error reading device /dev/sda1 at 0 length 4096.
>>> >   Error reading device /dev/ram2 at 0 length 4096.
>>> >   Error reading device /dev/sda2 at 0 length 4096.
>>> >   Error reading device /dev/ram3 at 0 length 4096.
>>> >   Error reading device /dev/ram4 at 0 length 4096.
>>> >   Error reading device /dev/ram5 at 0 length 4096.
>>> >   Error reading device /dev/ram6 at 0 length 4096.
>>> >   Error reading device /dev/ram7 at 0 length 4096.
>>> >   Error reading device /dev/ram8 at 0 length 4096.
>>> >   Error reading device /dev/ram9 at 0 length 4096.
>>> >   Error reading device /dev/ram10 at 0 length 4096.
>>> >   Error reading device /dev/ram11 at 0 length 4096.
>>> >   Error reading device /dev/ram12 at 0 length 4096.
>>> >   Error reading device /dev/ram13 at 0 length 4096.
>>> >   Error reading device /dev/ram14 at 0 length 4096.
>>> >   Error reading device /dev/ram15 at 0 length 4096.
>>> >   Device /dev/sda3 excluded by a filter.)
>>> > Failed to execute command: pvcreate -ffy /dev/sda3
>>> > ec=0
>>> >
>>> > I have tried different options of pvcreate but it didn't help much.
>>> > After the system got halted with the above error, I tried executing the
>>> > pvs command but got the below error.
>>> > bash-4.4# pvs
>>> >   Error reading device /dev/ram0 at 0 length 4096.
>>> >   Error reading device /dev/loop0 at 0 length 4096.
>>> >   Error reading device /dev/sda at 0 length 512.
>>> >   Error reading device /dev/sda at 0 length 4096.
>>> >   Error reading device /dev/ram1 at 0 length 4096.
>>> >   Error reading device /dev/sda1 at 0 length 4096.
>>> >   Error reading 

Re: [linux-lvm] Issue after upgrading the LVM2 package from 2.02.176 to 2.02.180

2020-09-15 Thread Roger Heflin
#1:
Device /dev/sda3 excluded by a filter.)
Failed to execute command: pvcreate -ffy /dev/sda3
ec=0

"Excluded by a filter" is likely the issue; I think there was a bug where
it allowed that pvcreate to work when it should have blocked it
because of the filter.  It should not allow a pvcreate against
something blocked by a filter.

#2: Read-only locking type set. Write locks are prohibited.
I am going to guess that either / is not mounted rw, or the directory
needed to create the locks (usually /var/run/lvm) is not mounted rw.
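A quick way to check that theory (paths are the usual defaults; your build
may point locking_dir somewhere else in lvm.conf):

  mount | grep ' / '           # look for ro vs rw on the root filesystem
  mount -o remount,rw /
  ls -ld /var/run/lvm
  pvs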

On Tue, Sep 15, 2020 at 1:42 AM KrishnaMurali Chennuboina
 wrote:
>
> Hi Roger,
>
> I have tried this with the older LVM package(.176) and this issue was not 
> seen. Issue was seen with .180 version every time.
> # executing command: vgchange -ay
> (status, output): (0,   WARNING: Failed to connect to lvmetad. Falling back 
> to device scanning.)
>   WARNING: Failed to connect to lvmetad. Falling back to device scanning.
> # executing command: pvcreate -ffy /dev/sda3
> (status, output): (5,   WARNING: Failed to connect to lvmetad. Falling back 
> to device scanning.
>   Error reading device /dev/sda3 at 0 length 4096.
>   Error reading device /dev/ram0 at 0 length 4096.
>   Error reading device /dev/loop0 at 0 length 4096.
>   Error reading device /dev/sda at 0 length 512.
>   Error reading device /dev/sda at 0 length 4096.
>   Error reading device /dev/ram1 at 0 length 4096.
>   Error reading device /dev/sda1 at 0 length 4096.
>   Error reading device /dev/ram2 at 0 length 4096.
>   Error reading device /dev/sda2 at 0 length 4096.
>   Error reading device /dev/ram3 at 0 length 4096.
>   Error reading device /dev/ram4 at 0 length 4096.
>   Error reading device /dev/ram5 at 0 length 4096.
>   Error reading device /dev/ram6 at 0 length 4096.
>   Error reading device /dev/ram7 at 0 length 4096.
>   Error reading device /dev/ram8 at 0 length 4096.
>   Error reading device /dev/ram9 at 0 length 4096.
>   Error reading device /dev/ram10 at 0 length 4096.
>   Error reading device /dev/ram11 at 0 length 4096.
>   Error reading device /dev/ram12 at 0 length 4096.
>   Error reading device /dev/ram13 at 0 length 4096.
>   Error reading device /dev/ram14 at 0 length 4096.
>   Error reading device /dev/ram15 at 0 length 4096.
>   Device /dev/sda3 excluded by a filter.)
> Failed to execute command: pvcreate -ffy /dev/sda3
> ec=0
>
> I have tried different options of pvcreate but it didn't help much. After
> the system got halted with the above error, I tried executing the pvs command
> but got the below error.
> bash-4.4# pvs
>   Error reading device /dev/ram0 at 0 length 4096.
>   Error reading device /dev/loop0 at 0 length 4096.
>   Error reading device /dev/sda at 0 length 512.
>   Error reading device /dev/sda at 0 length 4096.
>   Error reading device /dev/ram1 at 0 length 4096.
>   Error reading device /dev/sda1 at 0 length 4096.
>   Error reading device /dev/ram2 at 0 length 4096.
>   Error reading device /dev/sda2 at 0 length 4096.
>   Error reading device /dev/ram3 at 0 length 4096.
>   Error reading device /dev/sda3 at 0 length 4096.
>   Error reading device /dev/ram4 at 0 length 4096.
>   Error reading device /dev/sda4 at 0 length 4096.
>   Error reading device /dev/ram5 at 0 length 4096.
>   Error reading device /dev/ram6 at 0 length 4096.
>   Error reading device /dev/ram7 at 0 length 4096.
>   Error reading device /dev/ram8 at 0 length 4096.
>   Error reading device /dev/ram9 at 0 length 4096.
>   Error reading device /dev/ram10 at 0 length 4096.
>   Error reading device /dev/ram11 at 0 length 4096.
>   Error reading device /dev/ram12 at 0 length 4096.
>   Error reading device /dev/ram13 at 0 length 4096.
>   Error reading device /dev/ram14 at 0 length 4096.
>   Error reading device /dev/ram15 at 0 length 4096.
>   Error reading device /dev/sdb at 0 length 512.
>   Error reading device /dev/sdb at 0 length 4096.
>   Error reading device /dev/sdb1 at 0 length 4096.
>   Error reading device /dev/sdb2 at 0 length 4096.
>   Error reading device /dev/sdb3 at 0 length 4096.
>   Read-only locking type set. Write locks are prohibited.
>   Recovery of standalone physical volumes failed.
>   Cannot process standalone physical volumes
> bash-4.4#
>
> Attached the complete log in initial mail.
>
> Thanks.
>
> On Mon, 14 Sep 2020 at 20:29, Roger Heflin  wrote:
>>
>> In general I would suggest fully disabling lvmetad from the config
>> files and from being started up.
>>
>> Issues around it not answering (like above) and answering but somehow
>> having stale/wrong info have burned me too many times to trust it.  It
>> may be an lvmetad bug, or udevd weirdness.
>>
>> The only sign

Re: [linux-lvm] Issue after upgrading the LVM2 package from 2.02.176 to 2.02.180

2020-09-14 Thread Roger Heflin
In general I would suggest fully disabling lvmetad from the config
files and from being started up.

Issues around it not answering (like above) and answering but somehow
having stale/wrong info have burned me too many times to trust it.  It
may be an lvmetad bug, or udevd weirdness.

The only significant improvement it makes is that it reduces the lvm
command time on installs with significant numbers of devices, but given
that the info has been wrong often enough, that is not worth the risk.

On Mon, Sep 14, 2020 at 2:25 AM KrishnaMurali Chennuboina
 wrote:
>
> Hi Team,
>
> While trying to analyze one the issue, we felt that upgrading the current 
> LVM2 package in our repository will be the best approach.
> As part of that, we have updated the respective package from  2.02.176 to 
> 2.02.180. We have verified the same and booted x86_64 hardware without any 
> issues.
>
> But while trying to boot mips64 hardware we have started observing the below 
> issue.  Providing the snippet of the log,
>
> # executing command: vgchange -an
> (status, output): (0,   WARNING: Failed to connect to lvmetad. Falling back 
> to device scanning.)
>   WARNING: Failed to connect to lvmetad. Falling back to device scanning.
>
> Attached the detailed log for reference. There is no other change included 
> other than the LVM2 update.
>
> LVM2 Version: 2.02.176
> Updated LVM2 version: 2.02.180
>
> Please share inputs on why this issue is being observed with .180 version?
> Please let me know if i can share any other information.
>
> Thanks.
>

___
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/



Re: [linux-lvm] lvm raid5 : drives all present but vg/lvm will not assemble

2020-03-23 Thread Roger Heflin
Check cat /proc/mdstat; the 253:19 is likely a /dev/mdX device, and to get
an io error like that it has to be in a wrong state.
On Mon, Mar 23, 2020 at 5:14 AM Bernd Eckenfels  wrote:
>
> Do you see any dmesg kernel errors when you try to activate the LVs?
>
> Regards
> Bernd
>
>
> --
> http://bernd.eckenfels.net
> 
> From: linux-lvm-boun...@redhat.com  on behalf of
> Andrew Falgout 
> Sent: Saturday, March 21, 2020 4:22:04 AM
> To: linux-lvm@redhat.com 
> Subject: [linux-lvm] lvm raid5 : drives all present but vg/lvm will not
> assemble
>
>
> This started on a Raspberry PI 4 running raspbian.  I moved the disks to my 
> Fedora 31 system, which is running the latest updates and kernel.  When I had
> the same issues there I knew it wasn't raspbian.
>
> I've reached the end of my rope on this. The disks are there, all three are 
> accounted for, and the LVM data on them can be seen.  But it refuses to 
> activate stating I/O errors.
>
> [root@hypervisor01 ~]# pvs
>   PV VGFmt  Attr PSizePFree
>   /dev/sda1  local_storage01   lvm2 a--  <931.51g   0
>   /dev/sdb1  local_storage01   lvm2 a--  <931.51g   0
>   /dev/sdc1  local_storage01   lvm2 a--  <931.51g   0
>   /dev/sdd1  local_storage01   lvm2 a--  <931.51g   0
>   /dev/sde1  local_storage01   lvm2 a--  <931.51g   0
>   /dev/sdf1  local_storage01   lvm2 a--  <931.51g <931.51g
>   /dev/sdg1  local_storage01   lvm2 a--  <931.51g <931.51g
>   /dev/sdh1  local_storage01   lvm2 a--  <931.51g <931.51g
>   /dev/sdi3  fedora_hypervisor lvm2 a--27.33g   <9.44g
>   /dev/sdk1  vg1   lvm2 a--<7.28t   0
>   /dev/sdl1  vg1   lvm2 a--<7.28t   0
>   /dev/sdm1  vg1   lvm2 a--<7.28t   0
> [root@hypervisor01 ~]# vgs
>   VG#PV #LV #SN Attr   VSize  VFree
>   fedora_hypervisor   1   2   0 wz--n- 27.33g <9.44g
>   local_storage01 8   1   0 wz--n- <7.28t <2.73t
>   vg1 3   1   0 wz--n- 21.83t 0
> [root@hypervisor01 ~]# lvs
>   LVVGAttr   LSize  Pool Origin Data%  Meta%  
> Move Log Cpy%Sync Convert
>   root  fedora_hypervisor -wi-ao 15.00g
>   swap  fedora_hypervisor -wi-ao  2.89g
>   libvirt   local_storage01   rwi-aor--- <2.73t   
>  100.00
>   gluster02 vg1   Rwi---r--- 14.55t
>
> The one in question is the vg1/gluster02 lvm group.
>
> I try to activate the VG:
> [root@hypervisor01 ~]# vgchange -ay vg1
>   device-mapper: reload ioctl on  (253:19) failed: Input/output error
>   0 logical volume(s) in volume group "vg1" now active
>
> I've got the debugging output from :
> vgchange -ay vg1 - -
> lvchange -ay --partial vg1/gluster02 - -
>
> Just not sure where I should dump the data for people to look at.  Is there a 
> way to tell the md system to ignore the metadata since there wasn't an actual 
> disk failure, and rebuild the metadata off what is in the lvm?  Or can I even 
> get the LV to mount, so I can pull the data off?
>
> Any help is appreciated.  If I can save the data great.  I'm tossing this to 
> the community to see if anyone else has an idea of what I can do.
> ./digitalw00t


___
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/



Re: [linux-lvm] recover volume group & logical volumes from PV?

2019-05-11 Thread Roger Heflin
Check to see if there are LVM copies in
/etc/lvm/{archive,backup,cache}.  If you find one that looks right then
you can use vgcfgrestore to put it back onto the correct pv.  The
copies document the last command run before the copy was made;
you need to make sure that no critical commands were run after the
copy was made for it to be the right copy.  The file will have the
full layout of the vg/pv/lv in it, and contains all of the information
that made the pv/vg/lv.
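A rough way to shortlist and test candidates before committing (the archive
file naming here is just the usual pattern, and --test does a dry run):

  grep -l fedora_sh64 /etc/lvm/archive/*.vg
  grep -H description /etc/lvm/archive/fedora_sh64_*.vg   # shows which command each copy was taken before
  vgcfgrestore --test -f /etc/lvm/archive/<chosen>.vg fedora_sh64
  vgcfgrestore -f /etc/lvm/archive/<chosen>.vg fedora_sh64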

On Sat, May 11, 2019 at 5:49 PM "Rainer Fügenstein"  wrote:
>
> hi,
>
> I am (was) using Fedora 28 installed in several LVs on /dev/sda5 (= PV),
> where sda is a "big" SSD.
>
> by accident, I attached (via a SATA hot swap bay) an old hard disk
> (/dev/sdc1), which had been used temporarily about 2 months ago to move the
> volume group / logical volumes from the "old" SSD to the "new" SSD (pvadd,
> pvmove, ...)
>
> this combination of old PV and new PV messed up the filesystems. when I
> noticed the mistake, I did a shutdown and physically removed /dev/sdc.
> this also removed the VG and LVs on /dev/sda5, causing the system to crash
> on boot.
>
> the layout was something like this:
>
> /dev/sda3 ==> /boot
> /dev/fedora_sh64/lv_home
> /dev/fedora_sh64/lv_root
> /dev/fedora_sh64/lv_var
> ...
>
> [root@localhost-live ~]# pvs
>   PV VG Fmt  Attr PSize   PFree
>   /dev/sda5 lvm2 ---  <47.30g <47.30g
>
> [root@localhost-live ~]# pvdisplay
>   "/dev/sda5" is a new physical volume of "<47.30 GiB"
>   --- NEW Physical volume ---
>   PV Name   /dev/sda5
>   VG Name
>   PV Size   <47.30 GiB
>   Allocatable   NO
>   PE Size   0
>   Total PE  0
>   Free PE   0
>   Allocated PE  0
>   PV UUID   EOi5Ln-W26D-SER2-tLke-xNMP-Prgq-aUfLPz
>
> (please note "NEW Physical Volume" ?!?!)
>
> [root@localhost-live ~]# vgdisplay
> [root@localhost-live ~]# lvdisplay
>
> [root@localhost-live ~]# pvscan
>   PV /dev/sda5  lvm2 [<47.30 GiB]
>   Total: 1 [<47.30 GiB] / in use: 0 [0   ] / in no VG: 1 [<47.30 GiB]
>
> [root@localhost-live ~]# pvck /dev/sda5
>   Found label on /dev/sda5, sector 1, type=LVM2 001
>   Found text metadata area: offset=4096, size=1044480
>
> after re-adding the old /dev/sdc1 disk, the VG and LVs show up, with the
> filesystems a bit damaged but readable. the content is about two months old.
>
> [root@localhost-live ~]# pvs
>   PV VG  Fmt  Attr PSizePFree
>   /dev/sda5  lvm2 ---   <47.30g <47.30g
>   /dev/sdc1  fedora_sh64 lvm2 a--  <298.09g 273.30g
>
> is there any chance to get the VG and LVs back?
>
> thanks in advance.
>

___
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

Re: [linux-lvm] Unknown PV missing ?

2019-02-17 Thread Roger Heflin
pv0 is the first pv in the vg; internally lvm keeps track of disks that way.

It is an md device; do you have the md modules and pieces needed for
boot in your dracut.conf and/or dracut.conf.d files?
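If the mdraid bits turn out to be missing from the initramfs, something like
this is usually enough (the file name is arbitrary; module names as used by
dracut on Fedora/RHEL-style systems):

  # /etc/dracut.conf.d/90-mdraid.conf
  add_dracutmodules+=" mdraid lvm "

  dracut -f        # rebuild the initramfs for the running kernel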

On Sun, Feb 17, 2019 at 8:58 AM Georges Giralt  wrote:
>
> Hello,
>
> I've had this computer running for a very long time on the same LVM
> configuration, consisting of 2 PVs and one VG onto which the whole system
> is installed (it is bare metal, now running Ubuntu 18.04.2 LTS).
>
> Every time I do an upgrade-grub, I get the following message :
>
> ==
>
> Generating grub configuration file ...
> /usr/sbin/grub-probe: warning: Couldn't find physical volume "pv0". Some
> modules may be missing from core image.
> Found linux image: /boot/vmlinuz-4.18.0-15-generic
>
> ==
>
> IF I do a pvscan -v I get this :
>
> ==
>
> # pvscan  -v
>  Wiping internal VG cache
>  Wiping cache of LVM-capable devices
>PV /dev/md1 VG vg0 lvm2 [<291,91 GiB / 126,66 GiB
> free]
>PV /dev/nvme0n1p1   VG vg0 lvm2 [<119,24 GiB / <91,80 GiB
> free]
>Total: 2 [411,14 GiB] / in use: 2 [411,14 GiB] / in no VG: 0 [0   ]
> #
>
> And this is the extract of vgdisplay -v pertaining to PVs
>
> ===
>--- Physical volumes ---
>PV Name   /dev/md1
>PV UUID   gcjLSH-3lL5-NjBq-QdYh-vddQ-pwlg-UmEiZs
>PV Status allocatable
>Total PE / Free PE74728 / 32426
>
>PV Name   /dev/nvme0n1p1
>PV UUID   UGWM1s-GSly-Riyp-3hfE-s2nr-ng2E-wNuTVS
>PV Status allocatable
>Total PE / Free PE30525 / 23500
>
> #
>
> 
>
> Nowhere is a "pv0" referenced. So I wonder what this pv0 is and where
> it comes from.
>
> Could you please help me get rid of this message?
>
> It has puzzled me for a long time and I can't find out how to suppress it.
>
> Many thanks in advance for your help and advice.
>
>
> --
> "If a man empties his purse into his head, no man can take it away from him.
> An investment in knowledge always pays the best interest"
>
>
> Benjamin Franklin.
>

___
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

Re: [linux-lvm] lvcreate from a setuid-root binary

2018-11-16 Thread Roger Heflin
Why aren't you just using sudo for this?
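A sketch of the sudoers-based alternative (the user name and command list
are made-up examples, not from your setup):

  # /etc/sudoers.d/homevol
  provisioner ALL=(root) NOPASSWD: /usr/sbin/lvcreate, /usr/sbin/mkfs.ext4

The helper that reads PAM_USER then runs "sudo /usr/sbin/lvcreate -n $PAM_USER
-L 10G vg1" (size arbitrary) instead of carrying setuid root itself.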
On Fri, Nov 16, 2018 at 11:14 AM Christoph Pleger
 wrote:
>
> Hello,
>
> > How do you plan to 'authorize' passed command line options ??
>
> My program has no command line options. It just takes PAM_USER from PAM
> environment and creates a logical volume /dev/vg1/$PAM_USER, creates a
> filesystem and changes directory permissions of the top directory of the
> new filesystem.
>
> > lvm2 is designed to be always executed with root privileges - so it's
> > believed admin knows how he can destroy his own system.
> >
> > It is NOT designed/supposed to be used as suid binary - this would
> > give user a way to big power to very easily destroy your filesystem
> > and gain root privileges (i.e.by overwriting  /etc/passwd file)
>
> Either you misunderstood what I mean, or I am misunderstanding what you
> mean - I do not set lvcreate suid root, but a program that has only a
> small and well defined set of instructions (described above) and that
> restricts its execution to only one user (by checking the real uid
> before setuid(0)).
>
> Regards
>Christoph
>

___
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


Re: [linux-lvm] How to trash a broken VG

2018-08-03 Thread Roger Heflin
Assuming you want to completely eliminate the vg so that you can rebuild it
from scratch, and the lv's are no longer mounted, then this should work.
IF an lv is still mounted you should remove it from fstab, reboot, see what
state it comes up in, and first attempt to vgchange it off, as that is
cleaner than doing the dmsetup tricks.

If you cannot get the lv's lvchanged to off such that the /dev/<vgname>/
directory is empty or non-existent, then this is a lower level way (this
still requires the device to be un-mounted; if mounted the command will
fail).

dmsetup table | grep <vgname>

Then dmsetup remove <name> for each mapping listed (until all component
lv's are removed); this should empty the /dev/<vgname>/ directory of all
devices.

Once in this state you can use the pvremove command with the extra force
options; it will tell you what vg it was part of and require you to answer
y or n.
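Put together, for the vg in question it would look roughly like this (the
mapping names are what device-mapper usually derives from vg_backup's lv
names; trust the grep output over my guesses):

  dmsetup table | grep vg_backup
  dmsetup remove vg_backup-lv_backup
  dmsetup remove vg_backup-pvmove0
  pvremove -ff /dev/sdi1          # repeat per pv, answer y when prompted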

I have had to do this a number of times when events have happened
causing disks to be lost/died/corrupted.



On Fri, Aug 3, 2018 at 12:21 AM, Jeff Allison
 wrote:
> OK Chaps I've broken it.
>
> I have a VG containing one LV and made up of 3 live disks and 2 failed disks.
>
> Whilst the disks were failing I attempted to move date off the failing
> disks, which failed so I now have a pvmove0 that won't go away either.
>
>
> So if I attempt to even remove a live disk I get an error.
>
> [root@nas ~]# vgreduce -v vg_backup /dev/sdi1
> Using physical volume(s) on command line.
> Wiping cache of LVM-capable devices
> Wiping internal VG cache
>   Couldn't find device with uuid eFMoUW-6Ml5-fTyn-E2cT-sFXu-kYla-MmiAUV.
>   Couldn't find device with uuid LxXhsb-Mgag-ESXZ-fPP6-52dE-iNp2-uKdLwu.
> There are 2 physical volumes missing.
>   Cannot change VG vg_backup while PVs are missing.
>   Consider vgreduce --removemissing.
> There are 2 physical volumes missing.
>   Cannot process volume group vg_backup
>   Failed to find physical volume "/dev/sdi1".
>
> Then if I attempt a vgreduce --removemissing I get
>
> [root@nas ~]# vgreduce --removemissing vg_backup
>   Couldn't find device with uuid eFMoUW-6Ml5-fTyn-E2cT-sFXu-kYla-MmiAUV.
>   Couldn't find device with uuid LxXhsb-Mgag-ESXZ-fPP6-52dE-iNp2-uKdLwu.
>   WARNING: Partial LV lv_backup needs to be repaired or removed.
>   WARNING: Partial LV pvmove0 needs to be repaired or removed.
>   There are still partial LVs in VG vg_backup.
>   To remove them unconditionally use: vgreduce --removemissing --force.
>   Proceeding to remove empty missing PVs.
>
> So I try force
> [root@nas ~]# vgreduce --removemissing --force vg_backup
>   Couldn't find device with uuid eFMoUW-6Ml5-fTyn-E2cT-sFXu-kYla-MmiAUV.
>   Couldn't find device with uuid LxXhsb-Mgag-ESXZ-fPP6-52dE-iNp2-uKdLwu.
>   Removing partial LV lv_backup.
>   Can't remove locked LV lv_backup.
>
> So no go.
>
> If I try lvremove pvmove0
>
> [root@nas ~]# lvremove -v pvmove0
> Using logical volume(s) on command line.
> VG name on command line not found in list of VGs: pvmove0
> Wiping cache of LVM-capable devices
>   Volume group "pvmove0" not found
>   Cannot process volume group pvmove0
>
> So Heeelp I seem to be caught in some kind of loop.
>

___
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/