Re: Fact finding / clarification [WAS Re: Re: smartctl cannot access mystorage, need syntax help]

2024-01-14 Thread Andrew M.A. Cater
On Sun, Jan 14, 2024 at 12:04:58PM -0500, gene heskett wrote:
> On 1/14/24 06:59, Andrew M.A. Cater wrote:
> > Hi Gene,
> > 
> > There's a whole series of long threads which loop through several
> > subjects - I can tease out a couple of things.
> > 
> > 1.) You have one large deskside machine - large enough that it's tough
> > to lift or move - which is used for many things.
> Correct, a one size fits all machine.

OK. That much at least is understood :)
> > 
> > 2.) You've added various drive controllers and various drives over a
> > period.
> 
> Correct.
> 
> > Unclear: At least one of your RAID devices may be mixed between on
> > motherboard connections and on-card drive controller connections??
> No, I boot from /dev/sda, a 1T Samsung 870 SSD plugged into the motherboard's
> SATA, which has 6 ports.

OK. I'd suggest you plug this into the *first* SATA port on the motherboard,
maybe, and the CD/DVD drive (/dev/sr0) into the second on the motherboard.

> Because I didn't have 4 ports left for the raid, it is on its own controller,
> one of 2 extra sata controllers currently plugged in. Both of the extra
> controllers are just controllers, no raid in their pedigree, 1st extra has
> 6 ports, 2nd extra has 16 ports.
> > 

2a.) How many devices in the RAID in total?

2b.) How do you have these configured - what RAID configuration are
you looking to do in mdadm here?


> > 3.) You "lost" a RAID a while ago so you don't trust RAID on some devices
> > but you're persisting with RAIDs.
> Not here, it was a pair of quite new 2T Seagates that died and started this
> whole maryann. Lasted about a month from 1st powerup to going offline in the
> night with no warning about 3 weeks after 1st powerup. Lost everything back
> to about 2002. The only raid I've ever had is the current one, which
> smartctl was sending me emails about but not thru a normal channel, I only
> found them when I found a strange mbox file in my home dir. Last mail in the
> mbox file was dated Jan 7th of this year.
> But I've now sussed the smartctl syntax and all 4 drives of the raid say
> they are healthy.

If you have four drives in your RAID - maybe plug them up to channels
3-6 on your motherboard and remove the extra drive controllers?

That should simplify things mightily. mdadm will reassemble the RAID
appropriately.
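
Before moving any cables, it may help to note which disks currently belong
to which md array. A minimal sketch, assuming the kernel's usual
/proc/mdstat layout; the function name md_members is made up for
illustration and is not part of mdadm:

```shell
# Print each md array followed by its member devices, read from an
# mdstat-style file. Defaults to /proc/mdstat; pass a path to a saved
# copy to inspect that instead. (md_members is a hypothetical helper.)
md_members() {
    awk '$1 ~ /^md[0-9]+$/ && $2 == ":" {
        printf "%s:", $1
        for (i = 3; i <= NF; i++) {
            # Member entries look like sdd1[3]; strip the [slot] suffix.
            if ($i ~ /\[[0-9]+\]/) {
                dev = $i
                sub(/\[.*/, "", dev)
                printf " %s", dev
            }
        }
        printf "\n"
    }' "${1:-/proc/mdstat}"
}
```

Run as "md_members" on the live system, or "md_members saved-mdstat.txt"
against a copy taken earlier, and compare the two after recabling. Device
letters can change when ports change, so treat the output only as a record
of membership, not of port order.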

> > 
> > 4.) You have various add in cards but you don't seem to know which RAID is
> > which / what's "locking" your filesystem / what's causing your problems.
> > 

OK. Possibly irrelevant given your reply below.

> > You now have a slow access to one/more of your RAID devices.
> Which from the very limited clues seems to be related to my original install
> of plasma for a desktop, with xfce4 on top of that. So I'm suspecting the
> problem might be mixed gui related. This lag or lockup, whatever you want to
> call it, occurs for any app that opens a file requestor; there's at least 30
> seconds of this lag before the gui opens the requestor, at which point
> everything returns to normal. Failing ns resolution? I've NDI.  The lags are
> not logged anyplace I've managed to find a log to read.
> 
> > 5). Unclear: All / ("most"??) of those RAID devices are using Linux mdadm
> > rather than "RAID" supplied by the individual cards/controllers.
> 
> Correct.

OK - at least one more thing understood.
> 
> > Various of us - including myself - have suggested that you simplify things
> > / get another machine and divide up functionality. For various reasons
> > you can't / won't do that.
> 
> Mostly lack of space in this tiny child's bedroom to do that, over the last
> 35 years it's best described as a midden heap. ;o)>
> 
> > Can you answer the questions I've posted above, please, to try
> > and clarify what you have. I would have asked you for /etc/fstab and
> > a couple of other files, but this is good enough to be going on with.
> Instant /etc/fstab:
> gene@coyote:/etc$ cat fstab
> # /etc/fstab: static file system information.
> #
> # Use 'blkid' to print the universally unique identifier for a
> # device; this may be used with UUID= as a more robust way to name devices
> # that works even if disks are added and removed. See fstab(5).
> #
> # systemd generates mount units based on this file, see systemd.mount(5).
> # Please run 'systemctl daemon-reload' after making changes here.
> #
> #
> # / was on /dev/sda1 during installation
> UUID=f295334b-fdcb-4428-bed3-cb9e9e129be6 /     ext4  errors=remount-ro 0   1
> # /tmp was on /dev/sda3 during installation
> UUID=518cb65d-21f0-493f-8bb5-a5f435796991 /tmp  ext4  defaults          0   2
> # swap was on /dev/sda2 during installation
> UUID=422b50db-9913-4ed3-92c3-dc18be72cc61 none  swap  sw                0   0
> /dev/sr0  /media/cdrom0   udf,iso9660 user,noauto 0   0
> UUID=bc6135de-0578-4e3b-b2c0-5c4687abd9bd /home ext4  errors=remount-ro 0   2
> UUID=d24c3a99-9f40-4b71-92d4-916804553cb5 none  swap  sw                0   0
> # first pu

Re: Fact finding / clarification [WAS Re: Re: smartctl cannot access mystorage, need syntax help]

2024-01-14 Thread gene heskett

On 1/14/24 06:59, Andrew M.A. Cater wrote:

Hi Gene,

Frankly: Dealing with you over a mailing list can be very frustrating for
others trying to help (and especially for people trying to follow the list
who are reading the lists in the background and facing long, long threads).

You're not helping explain yourself well because the mails keep referring
to "other stuff that happened a while ago".

People ask to see mails / messages or whatever exactly because we can't
sit next to you, we can't see what you type, we can't see what you're
meaning.

That's why well-meaning folk keep asking questions to try
and establish what's going on and get a sense of where we can help
(if at all).

There's a whole series of long threads which loop through several
subjects - I can tease out a couple of things.

1.) You have one large deskside machine - large enough that it's tough
to lift or move - which is used for many things.

Correct, a one size fits all machine.


2.) You've added various drive controllers and various drives over a
period.


Correct.


Unclear: At least one of your RAID devices may be mixed between on
motherboard connections and on-card drive controller connections??
No, I boot from /dev/sda, a 1T Samsung 870 SSD plugged into the
motherboard's SATA, which has 6 ports.
Because I didn't have 4 ports left for the raid, it is on its own
controller, one of 2 extra sata controllers currently plugged in. Both
of the extra controllers are just controllers, no raid in their
pedigree, 1st extra has 6 ports, 2nd extra has 16 ports.


3.) You "lost" a RAID a while ago so you don't trust RAID on some devices
but you're persisting with RAIDs.
Not here, it was a pair of quite new 2T Seagates that died and started
this whole maryann. Lasted about a month from 1st powerup to going
offline in the night with no warning about 3 weeks after 1st powerup.
Lost everything back to about 2002. The only raid I've ever had is the
current one, which smartctl was sending me emails about but not thru a
normal channel, I only found them when I found a strange mbox file in my
home dir. Last mail in the mbox file was dated Jan 7th of this year.
But I've now sussed the smartctl syntax and all 4 drives of the raid say 
they are healthy.


4.) You have various add in cards but you don't seem to know which RAID is
which / what's "locking" your filesystem / what's causing your problems.

You now have a slow access to one/more of your RAID devices.
Which from the very limited clues seems to be related to my original
install of plasma for a desktop, with xfce4 on top of that. So I'm
suspecting the problem might be mixed gui related. This lag or lockup,
whatever you want to call it, occurs for any app that opens a file
requestor; there's at least 30 seconds of this lag before the gui opens
the requestor, at which point everything returns to normal. Failing ns
resolution? I've NDI.  The lags are not logged anyplace I've managed to
find a log to read.



5). Unclear: All / ("most"??) of those RAID devices are using Linux mdadm
rather than "RAID" supplied by the individual cards/controllers.


Correct.


Various of us - including myself - have suggested that you simplify things
/ get another machine and divide up functionality. For various reasons
you can't / won't do that.


Mostly lack of space in this tiny child's bedroom to do that, over the
last 35 years it's best described as a midden heap. ;o)>



Can you answer the questions I've posted above, please, to try
and clarify what you have. I would have asked you for /etc/fstab and
a couple of other files, but this is good enough to be going on with.

Instant /etc/fstab:
gene@coyote:/etc$ cat fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# systemd generates mount units based on this file, see systemd.mount(5).
# Please run 'systemctl daemon-reload' after making changes here.
#
#
# / was on /dev/sda1 during installation
UUID=f295334b-fdcb-4428-bed3-cb9e9e129be6 /     ext4  errors=remount-ro 0   1

# /tmp was on /dev/sda3 during installation
UUID=518cb65d-21f0-493f-8bb5-a5f435796991 /tmp  ext4  defaults          0   2

# swap was on /dev/sda2 during installation
UUID=422b50db-9913-4ed3-92c3-dc18be72cc61 none  swap  sw                0   0

/dev/sr0  /media/cdrom0   udf,iso9660 user,noauto 0   0
UUID=bc6135de-0578-4e3b-b2c0-5c4687abd9bd /home ext4  errors=remount-ro 0   2
UUID=d24c3a99-9f40-4b71-92d4-916804553cb5 none  swap  sw                0   0

# first put it where it is now & reboot
#LABEL=homesde1 /mnt/homesde1 ext4 errors=remount-ro 0 2
gene@coyote:/etc$
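
An fstab pasted through mail can lose its column spacing, so it is worth
confirming that the file on disk still parses into sensible fields. A rough
sketch, assuming the fstab(5) layout of device, mount point, type and
options followed by two optional numeric fields; fstab_check is a
hypothetical helper name, and the mount-point test is only a heuristic:

```shell
# Flag fstab entries that have fewer than four fields, or whose second
# field is neither an absolute path nor "none"/"swap".
# Comment lines and blank lines are skipped. Exit status is 1 if anything
# was flagged, 0 otherwise. (fstab_check is a hypothetical helper.)
fstab_check() {
    awk '
        BEGIN { bad = 0 }
        /^[ \t]*#/ { next }
        /^[ \t]*$/ { next }
        NF < 4 || ($2 !~ /^\// && $2 != "none" && $2 != "swap") {
            printf "line %d: suspicious entry: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }
    ' "${1:-/etc/fstab}"
}
```

With no argument it reads /etc/fstab; give it another path to check a copy.
It only looks at field shape and will not notice a wrong UUID or a mount
point that does not exist.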

I have not been able to use that last line as a target for rsync; it
makes around 13.5 gig of a 360G copy and then locks up all i/o. So I've
now reformatted