Subject: How to fix I/O errors?

Re: HELP! Re: How to fix I/O errors? (SOLVED)

2017-02-13 Thread Mark Fletcher

On Sun, Feb 12, 2017 at 09:36:16PM -0500, Bob Weber wrote:
> I use a program called ossec.  It watches logs of all my linux boxes so I get
> email messages about disk problems.  I also do periodic self tests on all my
> drives controlled by smartd from the smartmontools package.  I also use a
> package called logwatch which summarizes my logs.  The messages from mdadm and
> smartd are seen by ossec.  When I mess with an array to make it larger and add a
> disk for backup I get the messages in my mailbox about a degraded array.  As I'm
> reading them I am startled until I remember ...Oh I did that!  I have a daily
> cron job that emails the output of "smartctl -a /dev/sdx" for each drive on each
> machine so I can keep a history of the parameters for each drive.
> 

$ apt-file search ossec

sagan-rules: /etc/sagan-rules/ossec.rules

Seems like the only reference to ossec in jessie is this rules file in
the sagan-rules package. Going by the description of sagan-rules, it seems
to be along the right lines. But the sagan package itself is in wheezy and
in stretch/sid, not in jessie. Any idea what's up with that?

And was ossec packaged, or did you build it from source?

Cheers

Mark

Re: HELP! Re: How to fix I/O errors? (SOLVED)

2017-02-12 Thread Marc Shapiro


On 02/12/2017 06:36 PM, Bob Weber wrote:


> After writing this I wonder if I am overdoing this.  I just don't want to lose
> data from a failing drive.  I lived through 3.5 inch floppies which seemed to
> always fail.  And tape drives that were painfully slow.  Not to mention back in
> the mid 70s saving Z80 programs and data to audio cassette tapes at 1200 baud!
> I was so glad to get my first 8 inch floppies working.
>
> ...Bob

I, too, remember the cassette tapes for saving files and programs on my
TRS-80 Model III.  I think I still have a few of those tapes (10-minute
tapes for a single program) lying around.  The Radio Shack cassette
player has long since died, however.



Marc

Re: HELP! Re: How to fix I/O errors? (SOLVED)

2017-02-12 Thread Bob Weber

On 02/12/2017 01:59 PM, Marc Shapiro wrote:
> On 02/12/2017 08:30 AM, Marc Auslander wrote:
>> I do not use LVM over raid 1.  I think it can be made to work,
>> although IIRC booting from an LVM over RAID partition has caused issues.
> My boot partitions are separate.  They are not under LVM.
>> LVM is useful when space requirements are changing over time and the
>> ability to add additional disks and grow logical partitions is needed.
>> In my case, that isn't an issue.  I have only a small number of
>> partitions - 3 because of history but starting from scratch, I'd only
>> have two - root (including boot) and /home.
> I started using LVM when I had a much smaller disk (40GB).  With the current
> 1TB disk, even with three accounts on the box, and expanding several
> partitions when moving to the new disk, I have still partitioned less than
> half the disk and that is less than 1/3 used. So, no, LVM is probably not an
> issue any more.
>
> BTW, what is your third partition, and why would you not separate it now if
> starting from scratch?
>> I converted to mdadm raid as follows, IIRC.
>>
>> Install the second disk, and partition it the way I wanted.
>> Create a one disk raid 1 partition in each of the new partitions.
>> Take down my system, boot a live system from CD, and use a reliable
>> copy program like rsync to copy each of the partitions' contents to the
>> equivalent raid partition.
>> Run grub to set the new disk as bootable.  This is by far the
>> trickiest part.
>> Boot the new system and verify it's happy.
>> Repartition the now spare disk to match the new one if necessary.
>> You may need to zero the front of each partition with dd if=/dev/zero
>> to avoid mdadm error checks.
>> Add the partitions from that disk to the mdadm partitions and let mdadm
>> do its thing.
>>
> On 02/12/2017 07:08 AM, Bob Weber wrote:
>>
>> I use raid 1 also for the redundancy it provides.  If I need a backup I just
>> connect a disk, grow each array and add it to the array (I have 3 arrays for
>> /, /home and swap).  It syncs up in a couple hours (depending on size of the
>> array).  If you have grub install itself on the added disk you have a
>> bootable copy of your system (mdadm will complain about a degraded array).  I
>> then remove the drive and place it in another outbuilding in case of fire. 
>> You can even use an external USB disk housing for the drive to keep from
>> shutting down the system.  The sync is MUCH slower ... just come back the
>> next day and you will have your backup.  You then grow each array back to the
>> number of disks you had before and all is happy again.  Note that this single
>> disk backup will only work with raid 1.
>>
> So, how do you do a complete restore from backup?  Boot from just the single
> backup drive and add additional drives as Marc Auslander describes, above?

Yes, that is what you would need to do if there was a complete failure in your
machine and maybe you had to start over with a new motherboard and power supply.

>
>
> One other question.  If using raid, how do you know when a disk is starting to
> have trouble, as mine did?  Since the whole purpose of raid is to keep the
> system up and running I wouldn't expect errors to pop up like I was getting. 
> Do you have to keep an eye on log files?  Which ones?  Or is there some other
> way that mdadm provides notification of errors?  I've got to admit, even
> though I have been using Debian for 18 or 19 years (since Bo), log files have
> never been my favorite thing.  I generally only look at them when I have a
> problem and someone on this list tells me what to look for and where.
>
> Marc
>
>
I use a program called ossec.  It watches logs of all my linux boxes so I get
email messages about disk problems.  I also do periodic self tests on all my
drives controlled by smartd from the  smartmontools package.  I also use a
package called logwatch which summarizes my logs.   The messages from mdadm and
smartd are seen by ossec.  When I mess with an array to make it larger and add a
disk for backup I get the messages in my mailbox about a degraded array.  As I'm
reading them I am startled until I remember ...Oh I did that!  I have a daily
cron job that emails the output of "smartctl -a /dev/sdx" for each drive on each
machine so I can keep a history of the parameters for each drive.
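
A minimal sketch of the kind of daily report described above. This is not Bob's actual script: the file name, the recipient, the drive list, and the availability of a `mail` command are all assumptions. Only `smartctl -a` comes from the message itself.

```shell
#!/bin/sh
# Hypothetical /etc/cron.daily/smart-report (name and layout are assumptions).
# SMARTCTL, MAILER and RECIPIENT can be overridden from the environment.
SMARTCTL="${SMARTCTL:-smartctl}"
MAILER="${MAILER:-mail}"
RECIPIENT="${RECIPIENT:-root}"

smart_report() {
    # Print a header line followed by the full SMART report for one drive.
    dev="$1"
    printf '=== %s on %s ===\n' "$dev" "$(hostname)"
    "$SMARTCTL" -a "$dev"
}

mail_smart_reports() {
    # Mail one report per drive named on the command line.
    for dev in "$@"; do
        smart_report "$dev" | "$MAILER" -s "SMART report: $dev" "$RECIPIENT"
    done
}

# Example call (drive names are placeholders):
#   mail_smart_reports /dev/sda /dev/sdb
```

Keeping one mail per drive makes it easy to file the reports by device and compare attributes over time.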

I also use backuppc on a dedicated server to back up all my boxes.  That way I
can get back files I deleted by mistake, or modified and had to go back to a
previous version.  I now have all my machines on raid 1.  My wife just recently
gave up on Win 10 with all those updates that just took over her machine when
Windows wanted to!  So now she is running Debian/KDE.

After writing this I wonder if I am overdoing this.  I just don't want to lose
data from a failing drive.  I lived through 3.5 inch floppies which seemed to
always fail.  And tape drives that were painfully slow.  Not to mention back in
the mid 70s saving Z80 programs and data to audio cassette tapes at 1200 baud!
I was so glad to get my first 8 inch floppies working.

...Bob

Re: HELP! Re: How to fix I/O errors? (SOLVED)

2017-02-12 Thread Marc Auslander

Marc Shapiro  writes:

> BTW, what is your third partition, and why would you not separate it
> now if starting from scratch?
My third partition is for backups which I make to protect against
software or operator error.  At one point it was on a separate disk
since disks were small and without LVM had to be a different
partition/file system.
>
>
> One other question.  If using raid, how do you know when a disk is
> starting to have trouble, as mine did?  Since the whole purpose of
...
> Marc

Ok - I'm pretty paranoid about that.  SMART is checking the disks.
mdadm will notice if a disk is bad and turn
it off, so to speak.  Again, that shows up in the logs.
I run a cron job to check for SMART errors based on:

smartctl -l error -q errorsonly "device"
smartctl -H -q errorsonly "device"

Beyond that, I've always checked all my disks once a week.  A root cron job
reads the whole disk with dd into /dev/null.  Any errors get logged, of
course.  Separately, a cron job scans syslog and syslog.1, grepping for
"IO Error", and informs me by email if any new errors are found.  This
catches errors in the dd check but also actual errors in operation.
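
The weekly read check and the log scan could be sketched roughly as follows. This is an illustration, not the script described above: the function names, the state file used to remember already-reported lines, and the search pattern are assumptions (kernels commonly log the phrase as "I/O error", so the pattern below matches both spellings, case-insensitively).

```shell
#!/bin/sh
# Sketch only: paths, names and the grep pattern are assumptions.

read_whole_disk() {
    # Read the whole device and discard the data; any unreadable sector
    # causes a read error that the kernel writes to the system log.
    dd if="$1" of=/dev/null bs=1M
}

new_io_errors() {
    # Usage: new_io_errors STATE_FILE LOG...
    # Print matching log lines that are not yet in STATE_FILE, then record
    # them there so each error is reported only once.
    state="$1"; shift
    touch "$state"
    tmp=$(mktemp) || return 1
    grep -hiE 'i/?o error' "$@" 2>/dev/null | grep -vxFf "$state" > "$tmp"
    cat "$tmp"
    cat "$tmp" >> "$state"
    rm -f "$tmp"
}

# Example cron usage (paths and recipient are placeholders):
#   errors=$(new_io_errors /var/tmp/io-errors.seen /var/log/syslog /var/log/syslog.1)
#   [ -n "$errors" ] && printf '%s\n' "$errors" | mail -s "New I/O errors" root
```

Comparing against a state file is one simple way to report only new errors; rotating or truncating that file occasionally keeps it small.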

Re: HELP! Re: How to fix I/O errors? (SOLVED)

2017-02-12 Thread Marc Shapiro


On 02/12/2017 08:30 AM, Marc Auslander wrote:

> I do not use LVM over raid 1.  I think it can be made to work,
> although IIRC booting from an LVM over RAID partition has caused issues.

My boot partitions are separate.  They are not under LVM.

> LVM is useful when space requirements are changing over time and the
> ability to add additional disks and grow logical partitions is needed.
> In my case, that isn't an issue.  I have only a small number of
> partitions - 3 because of history but starting from scratch, I'd only
> have two - root (including boot) and /home.

I started using LVM when I had a much smaller disk (40GB).  With the
current 1TB disk, even with three accounts on the box, and expanding
several partitions when moving to the new disk, I have still partitioned
less than half the disk and that is less than 1/3 used. So, no, LVM is
probably not an issue any more.


BTW, what is your third partition, and why would you not separate it now 
if starting from scratch?

> I converted to mdadm raid as follows, IIRC.
>
> Install the second disk, and partition it the way I wanted.
> Create a one disk raid 1 partition in each of the new partitions.
> Take down my system, boot a live system from CD, and use a reliable
> copy program like rsync to copy each of the partitions' contents to the
> equivalent raid partition.
> Run grub to set the new disk as bootable.  This is by far the
> trickiest part.
> Boot the new system and verify it's happy.
> Repartition the now spare disk to match the new one if necessary.
> You may need to zero the front of each partition with dd if=/dev/zero
> to avoid mdadm error checks.
> Add the partitions from that disk to the mdadm partitions and let mdadm
> do its thing.


On 02/12/2017 07:08 AM, Bob Weber wrote:


> I use raid 1 also for the redundancy it provides.  If I need a backup
> I just connect a disk, grow each array and add it to the array (I have
> 3 arrays for /, /home and swap).  It syncs up in a couple hours
> (depending on size of the array).  If you have grub install itself on
> the added disk you have a bootable copy of your system (mdadm will
> complain about a degraded array).  I then remove the drive and place
> it in another outbuilding in case of fire.  You can even use an
> external USB disk housing for the drive to keep from shutting down the
> system.  The sync is MUCH slower ... just come back the next day and
> you will have your backup.  You then grow each array back to the
> number of disks you had before and all is happy again.  Note that this
> single disk backup will only work with raid 1.


So, how do you do a complete restore from backup?  Boot from just the 
single backup drive and add additional drives as Marc Auslander 
describes, above?



One other question.  If using raid, how do you know when a disk is 
starting to have trouble, as mine did?  Since the whole purpose of raid 
is to keep the system up and running I wouldn't expect errors to pop up 
like I was getting.  Do you have to keep an eye on log files?  Which 
ones?  Or is there some other way that mdadm provides notification of 
errors?  I've got to admit, even though I have been using Debian for 18 
or 19 years (since Bo), log files have never been my favorite thing.  I 
generally only look at them when I have a problem and someone on this 
list tells me what to look for and where.


Marc

Re: HELP! Re: How to fix I/O errors? (SOLVED)

2017-02-12 Thread Marc Auslander

Marc Shapiro  writes:

> the past couple of weeks.  AIUI you can use LVM over raid.  Is there
> any actual advantage to this?  I was trying to determine the
> advantages of using straight raid, straight LVM, or LVM over raid.  If
> I decide, later, to use raid, how difficult is it to add to a currently
> running system (with, or without LVM)?
>
>
> Marc
I do not use LVM over raid 1.  I think it can be made to work,
although IIRC booting from an LVM over RAID partition has caused issues.

LVM is useful when space requirements are changing over time and the
ability to add additional disks and grow logical partitions is needed.
In my case, that isn't an issue.  I have only a small number of
partitions - 3 because of history but starting from scratch, I'd only
have two - root (including boot) and /home.

I converted to mdadm raid as follows, IIRC.

Install the second disk, and partition it the way I wanted.
Create a one disk raid 1 partition in each of the new partitions.
Take down my system, boot a live system from CD, and use a reliable
copy program like rsync to copy each of the partitions' contents to the
equivalent raid partition.
Run grub to set the new disk as bootable.  This is by far the
trickiest part.
Boot the new system and verify it's happy.
Repartition the now spare disk to match the new one if necessary.
You may need to zero the front of each partition with dd if=/dev/zero
to avoid mdadm error checks.
Add the partitions from that disk to the mdadm partitions and let mdadm
do its thing.
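
The mdadm parts of the steps above can be sketched as shell functions. This is only an illustration under stated assumptions, not the commands actually used: the device names (/dev/md0, /dev/sdb1, /dev/sda1), the `run` helper, and the amount zeroed with dd are hypothetical. By default the sketch only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: device names and the 10 MiB wipe size are assumptions.

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

raid1_create_degraded() {
    # Build a RAID 1 array from one partition of the new disk, leaving the
    # second slot empty ("missing") until the old disk is free.
    md="$1"; new_part="$2"
    run mdadm --create "$md" --level=1 --raid-devices=2 "$new_part" missing
}

raid1_add_old_partition() {
    # Wipe the start of the old partition, then add it to the array so
    # mdadm can rebuild the mirror onto it.
    md="$1"; old_part="$2"
    run dd if=/dev/zero of="$old_part" bs=1M count=10
    run mdadm "$md" --add "$old_part"
}

# Example (dry run):
#   raid1_create_degraded /dev/md0 /dev/sdb1
#   ... make a filesystem, rsync from a live CD, install grub, reboot ...
#   raid1_add_old_partition /dev/md0 /dev/sda1
#   cat /proc/mdstat        # shows the rebuild progress
```

The copy and boot loader steps are left as comments because they depend heavily on the individual system.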

Re: HELP! Re: How to fix I/O errors? (SOLVED)

2017-02-12 Thread Bob Weber

I use raid 1 also for the redundancy it provides.  If I need a backup I just
connect a disk, grow each array and add it to the array (I have 3 arrays for /,
/home and swap).  It syncs up in a couple hours (depending on size of the
array).  If you have grub install itself on the added disk you have a bootable
copy of your system (mdadm will complain about a degraded array).  I then remove
the drive and place it in another outbuilding in case of fire.  You can even use
an external USB disk housing for the drive to keep from shutting down the
system.  The sync is MUCH slower ... just come back the next day and you will
have your backup.  You then grow each array back to the number of disks you had
before and all is happy again.  Note that this single disk backup will only work
with raid 1.
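
A rough sketch of the grow, sync, and shrink cycle described above, for one array. The array name, partition name, and the `run` helper are assumptions, and this is not necessarily the exact command sequence Bob uses. By default the commands are only printed; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: array and partition names are assumptions.

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

raid1_backup_start() {
    # Usage: raid1_backup_start ARRAY BACKUP_PARTITION NORMAL_DEVICE_COUNT
    # Grow the mirror by one slot and add the backup disk's partition.
    md="$1"; backup_part="$2"; normal_count="$3"
    run mdadm --grow "$md" --raid-devices=$((normal_count + 1))
    run mdadm "$md" --add "$backup_part"
}

raid1_backup_finish() {
    # Wait for the sync, detach the backup partition, and shrink the
    # mirror back to its usual number of devices.
    md="$1"; backup_part="$2"; normal_count="$3"
    run mdadm --wait "$md"
    run mdadm "$md" --fail "$backup_part" --remove "$backup_part"
    run mdadm --grow "$md" --raid-devices="$normal_count"
}

# Example (dry run) for a two-disk mirror and a backup disk at /dev/sdc:
#   raid1_backup_start  /dev/md0 /dev/sdc1 2
#   raid1_backup_finish /dev/md0 /dev/sdc1 2
```

With three arrays, as in the message, the same two calls would be repeated for each array and its matching partition on the backup disk.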

*...Bob*
On 02/11/2017 10:42 PM, Marc Shapiro wrote:
> On 02/11/2017 05:22 PM, Marc Auslander wrote:
>> You didn't ask for advice so take it or ignore it.
>>
>> IMHO, in this day and age, there is no reason not to run raid 1.  Two
>> disks, identically partitioned, each partition set up as a raid 1
>> partition with two copies.
>>
>> When a disk dies, you remove it from all the raid partitions, pop in a
>> new disk, partition it,  add the new partitions back into the raid
>> partitions and raid rebuilds the copies.
>>
>> Except for taking the system down to replace the disk (assuming you
>> don't have a third installed as a spare) you just keep running as if
>> nothing has happened.
>>
> I had been considering using raid 1 and I have not yet ruled it out entirely. 
> I have never used raid and have been reading up on it over the past couple of
> weeks.  AIUI you can use LVM over raid.  Is there any actual advantage to
> this?  I was trying to determine the advantages of using straight raid,
> straight LVM, or LVM over raid.  If I decide, later, to use raid, how difficult
> is it to add to a currently running system (with, or without LVM)?
>
>
> Marc
>
>

Re: HELP! Re: How to fix I/O errors? (SOLVED)

2017-02-11 Thread Marc Shapiro


On 02/11/2017 05:22 PM, Marc Auslander wrote:

> You didn't ask for advice so take it or ignore it.
>
> IMHO, in this day and age, there is no reason not to run raid 1.  Two
> disks, identically partitioned, each partition set up as a raid 1
> partition with two copies.
>
> When a disk dies, you remove it from all the raid partitions, pop in a
> new disk, partition it,  add the new partitions back into the raid
> partitions and raid rebuilds the copies.
>
> Except for taking the system down to replace the disk (assuming you
> don't have a third installed as a spare) you just keep running as if
> nothing has happened.

I had been considering using raid 1 and I have not yet ruled it out 
entirely.  I have never used raid and have been reading up on it over 
the past couple of weeks.  AIUI you can use LVM over raid.  Is there any 
actual advantage to this?  I was trying to determine the advantages of 
using straight raid, straight LVM, or LVM over raid.  If I decide, 
later, to use raid, how difficult is it to add to a currently running
system (with, or without LVM)?



Marc

Re: HELP! Re: How to fix I/O errors? (SOLVED)

2017-02-11 Thread Felix Miata


Marc Auslander composed on 2017-02-11 20:22 (UTC-0500):


> IMHO, in this day and age, there is no reason not to run raid 1.

Are you sure? Laptops have been outselling desktops for years.
--
"The wise are known for their understanding, and pleasant
words are persuasive." Proverbs 16:21 (New Living Translation)

 Team OS/2 ** Reg. Linux User #211409 ** a11y rocks!

Felix Miata  ***  http://fm.no-ip.com/

Re: HELP! Re: How to fix I/O errors? (SOLVED)

2017-02-11 Thread Marc Auslander

You didn't ask for advice so take it or ignore it.

IMHO, in this day and age, there is no reason not to run raid 1.  Two
disks, identically partitioned, each partition set up as a raid 1
partition with two copies.

When a disk dies, you remove it from all the raid partitions, pop in a
new disk, partition it,  add the new partitions back into the raid
partitions and raid rebuilds the copies.

Except for taking the system down to replace the disk (assuming you
don't have a third installed as a spare) you just keep running as if
nothing has happened.

Re: HELP! Re: How to fix I/O errors? (SOLVED)

2017-02-11 Thread David Christensen


On 02/10/17 23:39, Marc Shapiro wrote:

> On 02/08/2017 05:32 PM, David Christensen wrote:
>> On 02/08/17 15:59, Marc Shapiro wrote:
>>> So how do I lay down a low level format on [the new 1 TB] drive?
>>
>> I would use the SeaTools bootable CD to fill the drive with zeroes:
>>
>> On 02/03/17 23:13, David Christensen wrote:
>>> Sometimes you get lucky and the tool is a live CD:
>>>
>>> www.seagate.com/files/www-content/support-content/downloads/seatools/_shared/downloads/SeaToolsDOS223ALL.ISO
>
> I didn't feel like burning a CD and it has been a long time since I had
> a box with a 3.5" floppy (although I do have one or two drives in a box
> somewhere and quite a few of the floppies, themselves, as well)


3.5" floppy?  The link above is for a live CD.


> so I just used dd to write zeros to the disk. It took a while, but it
> did the job.

For a HDD, the effect should be the same.



> I partitioned the new disk with 3 physical partitions of 2GB each for
> root/boot partitions.  ...
> The 4th partition was set up for LVM and was set as a Physical Volume
> (PV) to be added to the volume group along with my old drive.


The problem with putting everything on one big disk is that it becomes 
impractical to clone the system image.  I'm still climbing the disk 
imaging learning curve, but it's a useful technique that has saved me 
countless hours.




> In the end, I picked yet another method for moving to the new disk. ...


Congratulations on your success battling through it all, especially LVM.


David

Re: HELP! Re: How to fix I/O errors? (SOLVED)

2017-02-10 Thread Marc Shapiro

On 02/08/2017 05:32 PM, David Christensen wrote:

> On 02/08/17 15:59, Marc Shapiro wrote:
>> So how do I lay down a low level format on [the new 1 TB] drive?
>
> I would use the SeaTools bootable CD to fill the drive with zeroes:
>
> On 02/03/17 23:13, David Christensen wrote:
>> Sometimes you get lucky and the tool is a live CD:
>>
>> www.seagate.com/files/www-content/support-content/downloads/seatools/_shared/downloads/SeaToolsDOS223ALL.ISO
>
> David

I didn't feel like burning a CD and it has been a long time since I had
a box with a 3.5" floppy (although I do have one or two drives in a box
somewhere and quite a few of the floppies, themselves, as well) so I
just used dd to write zeros to the disk. It took a while, but it did
the job. In the end, I picked yet another method for moving to the new
disk. As mentioned in my first post, I am using LVM and I have unused
space in the VG. I was debating with myself whether I wanted to continue
to use LVM, or just use raw disk partitions. I almost went with raw
disk partitions before I came across 'pvmove', which does exactly what I
needed. So...

I partitioned the new disk with 3 physical partitions of 2GB each for
root/boot partitions.

The 4th partition was set up for LVM and was set as a Physical Volume
(PV) to be added to the volume group along with my old drive.

Before adding the new disk, I created a new Logical Volume (LV) and
manually copied my home partition (one user tree at a time) to the new
partition. This spat out errors whenever it hit an unreadable sector
and I redirected those errors to a file for later use.

I then added the LVM partition from the new disk to the Volume Group
(VG) and did a 'pvmove' for each LV from the old PV to the new PV.

I included the original LV for /home, along with the newly copied LV. I
expected it to spit out errors and fail, but it didn't. I could hear it
struggle a bit when it hit the bad spots, but then it kept going. This
was actually a good thing. I had the list of affected files from when I
did the manual copy of the /home partition, so I knew what to check
after the move. Several of the files were videos. Using the original
files before copying, Xine would play up to the first I/O Error and then
freeze, even though it continued to read the file and advance the
timeline until the file ended. Using the manually copied file, which
truncated at the first error, I also only got the beginning of the video
and then it ended. Using the file from the original LV which I moved to
the new disk with pvmove, however, gave better results. There is a bit
of flicker when it hits a sector that had been unreadable before moving,
but it continues on so the rest of the video can be viewed. A few of
the other files I did delete (Libre Office document files do not survive
well, but I have a PDF of that file if I ever need it again).

Then I just had to copy over the root/boot partitions which I did from a
shell after booting my clonezilla CD (it came in handy after all) and
run lilo on them to make the new disk bootable. Everything seems good,
now. I ran the full test from SeagateTools (st) again, today, just to
verify that all was still good. It was. I now have an empty PV in my
LVM volume group that I will need to remove before I add any new Logical
Volumes (LVs), but I can do that any time. Since there are no LVs on it
nothing will attempt to read from it, or write to it.
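
For anyone wanting to repeat the LVM part of this, the sequence might look roughly like the sketch below. The volume group name, the partition names, the LV names, and the `run` helper are hypothetical; only the use of 'pvmove' per LV and the later removal of the empty PV come from the description above. By default the commands are only printed; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: VG, PV and LV names are assumptions.

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

lvm_move_to_new_pv() {
    # Usage: lvm_move_to_new_pv VG OLD_PV NEW_PV LV...
    # Add the new partition to the volume group, then move each named LV
    # off the old physical volume, one at a time.
    vg="$1"; old_pv="$2"; new_pv="$3"; shift 3
    run pvcreate "$new_pv"
    run vgextend "$vg" "$new_pv"
    for lv in "$@"; do
        run pvmove -n "$lv" "$old_pv" "$new_pv"
    done
}

lvm_drop_old_pv() {
    # Remove the now empty old physical volume from the volume group.
    vg="$1"; old_pv="$2"
    run vgreduce "$vg" "$old_pv"
    run pvremove "$old_pv"
}

# Example (dry run):
#   lvm_move_to_new_pv vg0 /dev/sda4 /dev/sdb4 usr var home tmp
#   lvm_drop_old_pv vg0 /dev/sda4
```

Moving one LV at a time makes it easier to see which volume was being moved if the failing disk produces errors along the way.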

I'll keep an eye on the disk for a while, but this should fix the
problem. If I ever have a failing disk again I hope that I will
remember this method because the LVM pvmove command really did make
moving to another disk easy. The hard part was dealing with the
root/boot partitions and getting the new disk bootable.

Hopefully this thread will help someone else who has a similar problem
in the future.

Marc

Re: HELP! Re: How to fix I/O errors?

2017-02-10 Thread Ric Moore


On 02/09/2017 12:13 PM, Greg Wooledge wrote:


> You shared your philosophy ("tear it all down and rebuild it from scratch
> every two years")


I don't know where you got this. The OP was having one helluva time with
a harddrive. I suggested that he create a partition to store his
personal files "more safely" as /opt, when he did a partition, format
and re-install to the new drive. After that he could mount the failing drive
and copy as many personal files as he could salvage to the new
/opt/ install. Then, if the need arises, a re-install is
relatively painless. I have never espoused wipe and re-install every two
years. That would be stupid. The decision to upgrade is purely a
personal one, driven either by choice or necessity.



> and I shared mine ("keep everything unchanged until
> you are forced to change it").



A dying harddrive will drive change, don't you think??


> Neither one is right, and neither one
> is wrong.  I just wanted both viewpoints to be equally represented.



"Viewpoints", as in politics, do not remedy a failing drive nor the 
rescue of its contents. That was the reason the OP posted. Please keep
his needs in mind. Ric



--
My father, Victor Moore (Vic) used to say:
"There are two Great Sins in the world...
..the Sin of Ignorance, and the Sin of Stupidity.
Only the former may be overcome." R.I.P. Dad.
http://linuxcounter.net/user/44256.html

Re: HELP! Re: How to fix I/O errors?

2017-02-09 Thread Greg Wooledge

On Thu, Feb 09, 2017 at 12:03:18PM -0500, Ric Moore wrote:
> How so?? Don't "many other operating systems" have different 
> configuration files in many other locations?? I wouldn't expect BSD 
> config files to migrate to Linux, or Windows to do anything useful.

When I shared my $HOME between OpenBSD and Debian for a time, I didn't
have many problems at all.  There are some shell functions that I only
created when $(uname -s) was Linux, but that's about it.

Most of the command-line tools that use dot-files in $HOME are the same.
Just stick with the older-common-denominator syntax in things like
~/.muttrc and ~/.ssh/config and you should be fine.  (Hint: when
mixing Debian with other non-legacy Unixes, usually it'll be Debian that
has the older version of the tool.)

You shared your philosophy ("tear it all down and rebuild it from scratch
every two years") and I shared mine ("keep everything unchanged until
you are forced to change it").  Neither one is right, and neither one
is wrong.  I just wanted both viewpoints to be equally represented.

Re: HELP! Re: How to fix I/O errors?

2017-02-09 Thread Ric Moore


On 02/09/2017 08:10 AM, Greg Wooledge wrote:

> On Wed, Feb 08, 2017 at 06:06:34PM -0500, Ric Moore wrote:
>> Careful there, I would not copy any of the /home/username/dot-files or
>> dot directories over, except like .mozilla and .thunderbird, so you
>> don't carry over some old and crufty setting that might have been
>> problematic.
>
> I have the exact opposite philosophy.  My home directory has survived
> across many, many different operating systems and computers.


How so?? Don't "many other operating systems" have different 
configuration files in many other locations?? I wouldn't expect BSD 
config files to migrate to Linux, or Windows to do anything useful.




> If a
> new version of some app breaks compatibility with a dot file, which
> is rare, then I'll handle that on a case by case basis.


...and that is you. I suspect that in this case that the OP doesn't wish 
anything to jump up and bite his behind. And, you seem to be able to 
deal with things on a case by case level, but just maybe the OP cannot. 
Ergo, some discretion is in order ...unless you are willing to provide 
life support in person.



> Otherwise,
> I get to keep all of my comfortable settings.


True, true. But, we're now talking about your comfort level, with 
successful builds, and not his. Some empathy is always a good thing, 
especially when it comes to tech support advice. :) Ric



--
My father, Victor Moore (Vic) used to say:
"There are two Great Sins in the world...
..the Sin of Ignorance, and the Sin of Stupidity.
Only the former may be overcome." R.I.P. Dad.
http://linuxcounter.net/user/44256.html

Re: HELP! Re: How to fix I/O errors?

2017-02-09 Thread Greg Wooledge

On Wed, Feb 08, 2017 at 06:06:34PM -0500, Ric Moore wrote:
> Careful there, I would not copy any of the /home/username/dot-files or 
> dot directories over, except like .mozilla and .thunderbird, so you 
> don't carry over some old and crufty setting that might have been 
> problematic.

I have the exact opposite philosophy.  My home directory has survived
across many, many different operating systems and computers.  If a
new version of some app breaks compatibility with a dot file, which
is rare, then I'll handle that on a case by case basis.  Otherwise,
I get to keep all of my comfortable settings.

Re: HELP! Re: How to fix I/O errors?

2017-02-08 Thread rhkramer

On Wednesday, February 08, 2017 06:37:55 PM Marc Shapiro wrote:
> On 02/08/2017 03:06 PM, Ric Moore wrote:
> > On 02/08/2017 04:38 PM, Marc Shapiro wrote:
> > Careful there, I would not copy any of the /home/username/dot-files or
> > dot directories over, except like .mozilla and .thunderbird, so you
> > don't carry over some old and crufty setting that might have been
> > problematic. To spare you nightmares like this one, I use the /opt
> > directory on a separate partition for all of my personal data.
> > So, I use /opt/ric/Documents and in my brand-new /home/ric directory I
> > delete the newly created Documents directory and then link (ln -s
> > /opt/ric/Documents Documents) and do the same with the other familiar
> > home directories like Videos, Music, Downloads, everything except
> > Desktop. If something goes ape, system-wise, you can do a fresh
> > install of / (root) directory and leave /opt alone. I've done this
> > since the old Caldera days. Nary a burp in the barrel! Ric

Why not make your own top level directory, i.e. /ric (with Documents and 
such)--that's what I do.

> I don't usually go quite that far, but photos, videos, and virtual disks
> are all in /usr/local/  which I will also need to copy over.  

Same comment as above--why not make your own top level directory for that 
stuff.  (Reading the File Hierarchy Standard (FHS), I don't think that is quite 
the intent of /usr/local--and could make some things inconvenient at one time 
or another...)

> You say to
> avoid copying   except .mozilla and .thunderbird.  I have 117 such
> dot-files and dot-directories.  Are you saying only to leave .mozilla
> and .thunderbird and have everything else rebuild when it is next used.
> Admittedly, that will get rid of some cruft, but how should I determine
> if there are others that I should keep?
> 
> 
> I tried to format the new drive using st (Seagate Tools).  It said that
> it would remove all data, which is expected, but nothing was removed!
> It also took less than a minute.  Should I be using /dev/sda in the
> command line instead of /dev/sg0 (which is how st -l lists the drive)?
> 
> 
> Marc
> 
> 
> 
> Marc

Re: HELP! Re: How to fix I/O errors?

2017-02-08 Thread David Christensen

On 02/08/17 15:59, Marc Shapiro wrote:

> So how do I lay down a low level format on [the new 1 TB] drive?

I would use the SeaTools bootable CD to fill the drive with zeroes:

On 02/03/17 23:13, David Christensen wrote:
> Sometimes you get lucky and the tool is a live CD:
>
> www.seagate.com/files/www-content/support-content/downloads/seatools/_shared/downloads/SeaToolsDOS223ALL.ISO

David

Re: HELP! Re: How to fix I/O errors?

2017-02-08 Thread Marc Shapiro


On 02/08/2017 03:37 PM, Marc Shapiro wrote:

> On 02/08/2017 03:06 PM, Ric Moore wrote:
>> On 02/08/2017 04:38 PM, Marc Shapiro wrote:
>>> On 02/08/2017 01:26 PM, Ric Moore wrote:
>>>> On 02/08/2017 02:37 AM, Marc Shapiro wrote:
>>>>> How it went is not well.  I tested the new drive with SeagateTools and
>>>>> it was fine.  Then I made a clonezilla live CD and booted from it.  It
>>>>> stopped on the first read error with a message saying to restart using
>>>>> the rescue option.  I did that.  After 5 hours it finished without
>>>>> mentioning any errors.
>>>>>
>>>>> I tried to boot to the old disk (since it was still wired that way).  I
>>>>> got dropped into a maintenance shell with fs errors in /dev/sda4 which is
>>>>> the physical volume for all my LVM logical volumes -- /usr, /var, /home
>>>>> and /temp.  It says to run fsck manually.
>>>>>
>>>>> I decided to try the new drive, so I changed the cables and re-booted.
>>>>>
>>>>> Maintenance shell, again.
>>>>>
>>>>> / mounted clean
>>>>>
>>>>> lvm started
>>>>>
>>>>> /home fs has errors run fsck (at this point, I'm afraid to try it)
>>>>>
>>>>> /var, /usr, and /tmp all say that the superblock can not be read, or is
>>>>> invalid.  Try running
>>>>>
>>>>> e2fsck -b 8193
>>>>> or
>>>>> e2fsck -b 32768
>>>>>
>>>>> Which do I use?
>>>>>
>>>>> How did trying to clone the disk make such a mess of BOTH disks?



You cloned a mess, you got a perfect copy. I'd do a clean install to
the new drive, after formatting the entire drive. Once you boot into
that drive, mount the old drive. It should show up in /media/
Then copy the directories of personal stuff you want to keep to a new
location on the new drive. I use cp -raf 
 and everything, including sub-directories, file
ownership and file permissions are preserved. If a file is clunky, it
won't copy it and should proceed.

Next, if you are in your office, observe if the window is open. If
yes, throw the old drive out of it. :) Ric



Ric,


As soon as I finished my last post (above) I realized that what you
suggest is exactly what I should have done in the first place. Why I
did not realize that earlier (and save myself a lot of headaches) I do
not know.  The system is now booting to the old drive, just as it did
before.  I think it just needed a good night's sleep.  I know that I did.


My next steps are:

Format new drive

Install fresh on new drive

Mount and copy /home from old drive to new drive


Careful there, I would not copy any of the /home/username/dot-files 
or dot directories over, except like .mozilla and .thunderbird, so 
you don't carry over some old and crufty setting that might have been 
problematic. To spare you nightmares like this one, I use the /opt 
directory on a separate partition for all of my personal data.
So, I use /opt/ric/Documents and in my brand-new /home/ric directory 
I delete the newly created Documents directory and then link (ln -s 
/opt/ric/Documents Documents) and do the same with the other familiar 
home directories like Videos, Music, Downloads, everything except 
Desktop. If something goes ape, system-wise, you can do a fresh 
install of / (root) directory and leave /opt alone. I've done this 
since the old Caldera days. Nary a burp in the barrel! Ric




I don't usually go quite that far, but photos, videos, and virtual 
disks are all in /usr/local/  which I will also need to copy over.  
You say to avoid copying   except .mozilla and .thunderbird.  I have 
117 such dot-files and dot-directories.  Are you saying only to leave 
.mozilla and .thunderbird and have everything else rebuild when it is 
next used.  Admittedly, that will get rid of some cruft, but how 
should I determine if there are others that I should keep?



I tried to format the new drive using st (Seagate Tools).  It said 
that it would remove all data, which is expected, but nothing was 
removed!  It also took less than a minute.  Should I be using /dev/sda 
in the command line instead of /dev/sg0 (which is how st -l lists the 
drive)?
I just tried this with 'st -i /dev/sda' (which should give drive info) 
and it does nothing, so that doesn't work.  So how do I lay down a low 
level format on this drive?



Marc



Marc

Re: HELP! Re: How to fix I/O errors?

2017-02-08 Thread Marc Shapiro


On 02/08/2017 03:06 PM, Ric Moore wrote:

On 02/08/2017 04:38 PM, Marc Shapiro wrote:

On 02/08/2017 01:26 PM, Ric Moore wrote:

On 02/08/2017 02:37 AM, Marc Shapiro wrote:

How it went is not well.  I tested the new drive with SeagateTools and
it was fine.  Then I made a clonezilla live CD and booted from it.  It
stopped on the first read error with a message saying to restart using
the rescue option.  I did that.  After 5 hours it finished without
mentioning any errors.

I tried to boot to the old disk (since it was still wired that way).  I
got dropped into a maintenance shell with fs errors in /dev/sda4 which is
the physical volume for all my LVM logical volumes -- /usr, /var, /home
and /tmp.  It says to run fsck manually.

I decided to try the new drive, so I changed the cables and re-booted.

Maintenance shell, again.

/ mounted clean

lvm started

/home fs has errors run fsck (at this point, I'm afraid to try it)

/var, /usr, and /tmp all say that the superblock can not be read, or is
invalid.  Try running

e2fsck -b 8193 
or
e2fsck -b 32768 

Which do I use?

How did trying to clone the disk make such a mess of BOTH disks?



You cloned a mess, you got a perfect copy. I'd do a clean install to
the new drive, after formatting the entire drive. Once you boot into
that drive, mount the old drive. It should show up in /media/
Then copy the directories of personal stuff you want to keep to a new
location on the new drive. I use cp -raf 
 and everything, including sub-directories, file
ownership and file permissions are preserved. If a file is clunky, it
won't copy it and should proceed.

Next, if you are in your office, observe if the window is open. If
yes, throw the old drive out of it. :) Ric



Ric,


As soon as I finished my last post (above) I realized that what you
suggest is exactly what I should have done in the first place. Why I
did not realize that earlier (and save myself a lot of headaches) I do
not know.  The system is now booting to the old drive, just as it did
before.  I think it just needed a good night's sleep.  I know that I did.


My next steps are:

Format new drive

Install fresh on new drive

Mount and copy /home from old drive to new drive


Careful there, I would not copy any of the /home/username/dot-files or 
dot directories over, except like .mozilla and .thunderbird, so you 
don't carry over some old and crufty setting that might have been 
problematic. To spare you nightmares like this one, I use the /opt 
directory on a separate partition for all of my personal data.
So, I use /opt/ric/Documents and in my brand-new /home/ric directory I 
delete the newly created Documents directory and then link (ln -s 
/opt/ric/Documents Documents) and do the same with the other familiar 
home directories like Videos, Music, Downloads, everything except 
Desktop. If something goes ape, system-wise, you can do a fresh 
install of / (root) directory and leave /opt alone. I've done this 
since the old Caldera days. Nary a burp in the barrel! Ric




I don't usually go quite that far, but photos, videos, and virtual disks 
are all in /usr/local/  which I will also need to copy over.  You say to 
avoid copying   except .mozilla and .thunderbird.  I have 117 such 
dot-files and dot-directories.  Are you saying only to leave .mozilla 
and .thunderbird and have everything else rebuild when it is next used.  
Admittedly, that will get rid of some cruft, but how should I determine 
if there are others that I should keep?
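
One way to act on Ric's suggestion is to split the entries of the old home directory into a copy list and a skip list, then review the skip list by hand. The following is only a sketch: the function name `split_home_entries` and the sample names are hypothetical, and the default allow-list holds just the two directories Ric named.

```python
def split_home_entries(names, keep_dot=(".mozilla", ".thunderbird")):
    """Split directory entry names into (copy, skip) lists.

    Non-dot entries are always copied; dot entries are copied only
    when they appear in keep_dot (an assumed allow-list).
    """
    copy, skip = [], []
    for name in sorted(names):
        if name.startswith(".") and name not in keep_dot:
            skip.append(name)
        else:
            copy.append(name)
    return copy, skip


if __name__ == "__main__":
    # Hypothetical sample; on a real system the names would come from
    # os.listdir() on the old home directory.
    sample = ["Documents", ".mozilla", ".cache", ".thunderbird", "Music", ".bashrc"]
    copy, skip = split_home_entries(sample)
    print("copy:", copy)
    print("skip:", skip)
```

Anything on the skip list that turns out to matter (shell or editor settings, for example) can be added to `keep_dot` before the real copy is made.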



I tried to format the new drive using st (Seagate Tools).  It said that 
it would remove all data, which is expected, but nothing was removed!  
It also took less than a minute.  Should I be using /dev/sda in the 
command line instead of /dev/sg0 (which is how st -l lists the drive)?



Marc



Marc

Re: HELP! Re: How to fix I/O errors?

2017-02-08 Thread Ric Moore


On 02/08/2017 04:38 PM, Marc Shapiro wrote:

On 02/08/2017 01:26 PM, Ric Moore wrote:

On 02/08/2017 02:37 AM, Marc Shapiro wrote:

How it went is not well.  I tested the new drive with SeagateTools and
it was fine.  Then I made a clonezilla live CD and booted from it.  It
stopped on the first read error with a message saying to restart using
the rescue option.  I did that.  After 5 hours it finished without
mentioning any errors.

I tried to boot to the old disk (since it was still wired that way).  I
got dropped into a maintenance shell with fs errors in /dev/sda4 which is
the physical volume for all my LVM logical volumes -- /usr, /var, /home
and /tmp.  It says to run fsck manually.

I decided to try the new drive, so I changed the cables and re-booted.

Maintenance shell, again.

/ mounted clean

lvm started

/home fs has errors run fsck (at this point, I'm afraid to try it)

/var, /usr, and /tmp all say that the superblock can not be read, or is
invalid.  Try running

e2fsck -b 8193 
or
e2fsck -b 32768 

Which do I use?

How did trying to clone the disk make such a mess of BOTH disks?



You cloned a mess, you got a perfect copy. I'd do a clean install to
the new drive, after formatting the entire drive. Once you boot into
that drive, mount the old drive. It should show up in /media/
Then copy the directories of personal stuff you want to keep to a new
location on the new drive. I use cp -raf 
 and everything, including sub-directories, file
ownership and file permissions are preserved. If a file is clunky, it
won't copy it and should proceed.

Next, if you are in your office, observe if the window is open. If
yes, throw the old drive out of it. :) Ric



Ric,


As soon as I finished my last post (above) I realized that what you
suggest is exactly what I should have done in the first place.  Why I
did not realize that earlier (and save myself a lot of headaches) I do
not know.  The system is now booting to the old drive, just as it did
before.  I think it just needed a good night's sleep.  I know that I did.

My next steps are:

Format new drive

Install fresh on new drive

Mount and copy /home from old drive to new drive


Careful there, I would not copy any of the /home/username/dot-files or 
dot directories over, except like .mozilla and .thunderbird, so you 
don't carry over some old and crufty setting that might have been 
problematic. To spare you nightmares like this one, I use the /opt 
directory on a separate partition for all of my personal data.
So, I use /opt/ric/Documents and in my brand-new /home/ric directory I 
delete the newly created Documents directory and then link (ln -s 
/opt/ric/Documents Documents) and do the same with the other familiar 
home directories like Videos, Music, Downloads, everything except 
Desktop. If something goes ape, system-wise, you can do a fresh install 
of / (root) directory and leave /opt alone. I've done this since the old 
Caldera days. Nary a burp in the barrel! Ric
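
The linking step described above could be scripted roughly as follows. This is a sketch under assumptions, not Ric's actual procedure: the function name `link_home_dirs` is hypothetical, and it only removes a directory in the home when that directory is empty.

```python
import os
from pathlib import Path


def link_home_dirs(home, data_root, names):
    """Replace home/<name> with a symlink to data_root/<name>.

    An existing directory is removed only if it is empty (os.rmdir
    refuses otherwise), so nothing is deleted by accident.
    Returns the list of links created.
    """
    home, data_root = Path(home), Path(data_root)
    created = []
    for name in names:
        target = data_root / name
        link = home / name
        target.mkdir(parents=True, exist_ok=True)
        if link.is_symlink():
            continue  # already linked; leave it alone
        if link.is_dir():
            os.rmdir(link)  # raises OSError if the directory is not empty
        os.symlink(target, link)
        created.append(link)
    return created
```

A call such as `link_home_dirs("/home/ric", "/opt/ric", ["Documents", "Music", "Videos"])` would mirror the `ln -s /opt/ric/Documents Documents` example; Desktop is left out of the list, as in the description.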




--
My father, Victor Moore (Vic) used to say:
"There are two Great Sins in the world...
..the Sin of Ignorance, and the Sin of Stupidity.
Only the former may be overcome." R.I.P. Dad.
http://linuxcounter.net/user/44256.html

Re: HELP! Re: How to fix I/O errors?

2017-02-08 Thread David Christensen

On 02/07/17 23:37, Marc Shapiro wrote:
> How it went is not well.

> David Christensen wrote:
>> Run memtest86+ for 24+ hours to verify that you don't have a memory
>> problem.

Did you test the memory?  If not, test it now just to be sure.

>> Use SeaTools to wipe the new 1 TB drive and run the short and long
>> tests.  Stop if anything fails.

> I tested the new drive with SeagateTools and it
> was fine.

Please confirm that you wiped the 1 TB recovery drive.

> Then I made a clonezilla live CD and booted from it.  It stopped
> on the first read error with a message saying to restart using the rescue
> option.  I did that.  After 5 hours it finished without mentioning any
> errors.
>
> I tried to boot to the old disk (since it was still wired that way).  I got
> dropped into a maintenance shell with fs errors in /dev/sda4 which is the
> physical volume for all my LVM logical volumes -- /usr, /var, /home and
> /tmp.  It says to run fsck manually.
>
> I decided to try the new drive, so I changed the cables and re-booted.
>
> Maintenance shell, again.
>
> / mounted clean
>
> lvm started
>
> /home fs has errors run fsck (at this point, I'm afraid to try it)
>
> /var, /usr, and /tmp all say that the superblock can not be read, or is
> invalid.  Try running
>
> e2fsck -b 8193 
> or
> e2fsck -b 32768 
>
> Which do I use?
>
> How did trying to clone the disk make such a mess of BOTH disks?

Don't blame Clonezilla.  Everything is decaying -- you, me, those hard 
drives, etc.  With that in mind, do the most precious operations first 
-- because in 1 second, 1 minute, 1 hour, 1 day, 1 month, 1 year, 1 
decade, 1 century, whatever, the data will be inaccessible without 
extraordinary means.

Forget about booting off the failing 1 TB disk.  Disconnect it for now.

Forget about booting off the 1 TB recovery disk.  It should now contain 
whatever blocks Clonezilla was able to recover.  It is now in a state 
analogous to Swiss cheese.  Disconnect it for now.

> Any help getting a working system again will be greatly appreciated.

On the computer you use for e-mail, start an administration log folder 
for the machine in question.  Start a log.txt file and take notes.  Cut 
and paste what you can.  Photograph screens and transcribe what you 
can't.  Collect important files.  Put it all into a version control system.

>> I'd do a fresh install on a 16+ GB SSD (USB flash drives also
>> work).

Install SSH when you build the new system drive.

Use ssh(1) to log in from your e-mail computer.  Consider using 
script(1) to capture your console sessions, and scp(1) to copy out the 
files.  Read fsck(8) and consider your moves carefully.  Reconnect the 1 
TB recovery disk and see what fsck can recover.
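
On the quoted question of 8193 versus 32768: according to e2fsck(8), the location of the backup superblock depends on the filesystem block size (8193 for 1k blocks, 16384 for 2k, 32768 for 4k). The sketch below reproduces those three numbers, assuming default mke2fs geometry; the function name is hypothetical, and a filesystem created with non-default options may keep its backups elsewhere.

```python
def first_backup_superblock(block_size):
    """Return the block number of the first backup superblock for an
    ext2/3/4 filesystem created with default settings.

    Blocks per group default to 8 * block_size (one bitmap block covers
    that many blocks).  The first data block is 1 for 1 KiB blocks and 0
    otherwise, which is why the 1 KiB case is 8193 rather than 8192.
    """
    if block_size not in (1024, 2048, 4096):
        raise ValueError("expected a block size of 1024, 2048 or 4096")
    blocks_per_group = 8 * block_size
    first_data_block = 1 if block_size == 1024 else 0
    return first_data_block + blocks_per_group


if __name__ == "__main__":
    for size in (1024, 2048, 4096):
        print(size, first_backup_superblock(size))
```

So the right value for -b is the one matching the block size the filesystem was created with, which is why fsck offers both.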

David

HELP! Re: How to fix I/O errors?

2017-02-07 Thread Marc Shapiro

How it went is not well.  I tested the new drive with SeagateTools and it
was fine.  Then I made a clonezilla live CD and booted from it.  It stopped
on the first read error with a message saying to restart using the rescue
option.  I did that.  After 5 hours it finished without mentioning any
errors.

I tried to boot to the old disk (since it was still wired that way).  I got
dropped into a maintenance shell with fs errors in /dev/sda4 which is the
physical volume for all my LVM logical volumes -- /usr, /var, /home and
/tmp.  It says to run fsck manually.

I decided to try the new drive, so I changed the cables and re-booted.

Maintenance shell, again.

/ mounted clean

lvm started

/home fs has errors run fsck (at this point, I'm afraid to try it)

/var, /usr, and /tmp all say that the superblock can not be read, or is
invalid.  Try running

e2fsck -b 8193 
or
e2fsck -b 32768 

Which do I use?

How did trying to clone the disk make such a mess of BOTH disks?

Any help getting a working system again will be greatly appreciated.

Marc

On Feb 6, 2017 2:37 PM, "David Christensen" wrote:

On 02/06/17 13:15, Marc Shapiro wrote:

> I am pasting the result of smartctl -x /dev/sda below as I have no real
> clue what to do with the information, but I have a few questions first.
>
> 1) I have purchased a new, very similar, Seagate 1TB drive and I plan to
> install it and copy the whole system to the new drive.
>

It sounds like you don't have a backup of the failing 1 TB drive (?).

Do you have a file server with ~1 TB of free space?  RAID?

Run memtest86+ for 24+ hours to verify that you don't have a memory problem.

Use SeaTools to wipe the new 1 TB drive and run the short and long tests.
Stop if anything fails.

> What is the best way to do this copy since I don't want to copy bad
> sectors?

I've done it with 'dd' in the past, but will use 'ddrescue' in the future.

> 2) Once I have verified that the new drive boots

I'd do a fresh install on a 16+ GB SSD (USB flash drives also work).  A
recovered system disk image is too uncertain.

> and everything is running properly

As I understand it, the drive microcontroller calculates and stores a
checksum with every sector (block).  That's one way it knows that a block
is bad upon reading.  So, when you copy out whatever blocks you can get,
you probably won't have errors in those blocks.

But, files and directories are stored on one or more sectors.  Depending
upon your file system, fsck may or may not find the missing blocks.

When you're done, the destination disk is likely to be missing files and/or
directories.
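
As a rough illustration of the point above (readable blocks are copied intact, unreadable ones leave holes), here is a minimal sketch of a block-by-block copy that zero-fills any block whose read fails. It is not how 'dd' or 'ddrescue' are implemented; the function name, the `read_block` callback and the simulated source are all hypothetical.

```python
def copy_readable_blocks(read_block, n_blocks, block_size=512):
    """Copy n_blocks using read_block(index) -> bytes.

    A block whose read raises OSError is replaced with zeros and its
    index recorded, so the output keeps the original offsets.
    Returns (image_bytes, bad_block_indexes).
    """
    image = bytearray()
    bad = []
    for index in range(n_blocks):
        try:
            data = read_block(index)
        except OSError:
            data = b""
            bad.append(index)
        # Pad short or failed reads so later blocks stay aligned.
        image += data.ljust(block_size, b"\0")[:block_size]
    return bytes(image), bad


if __name__ == "__main__":
    # Simulated source: block 2 is unreadable (hypothetical test data).
    source = [b"A" * 512, b"B" * 512, None, b"D" * 512]

    def fake_read(index):
        if source[index] is None:
            raise OSError("simulated read error")
        return source[index]

    image, bad = copy_readable_blocks(fake_read, len(source))
    print(len(image), bad)
```

The list of bad block indexes is what tells you where files or directory data may be missing on the destination, which is the part fsck then has to cope with.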

> I am hoping to reformat the old drive.  This should
> reallocate the bad sectors IIRC.  I then would like to set up a raid
> with both drives (keeping a close eye on the old drive).  The
> feasibility of this, I would guess, depends on what the posted smartctl
> information tells someone who knows what to look for.
>
> 3) As I understand it, the above mentioned raid should be safe since,
> even if the old drive deteriorates further, the system can run on just
> the new drive.  Is that correct?

Once you've copied out whatever blocks you can get, use SeaTools to wipe
the old 1 TB drive and run short and long tests.  If all three pass, I
might be tempted to re-use the drive.

If it fails to wipe and has plaintext, destroy it with a sledge hammer.
(Wear safety glasses!)

If it wipes but fails the short or long tests, recycle it.

> Here is the smartctl output:
...

> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED

Interesting, given that the drive failed SeaTools (short test?).

> General SMART Values:
> Offline data collection status:  (0x82) Offline data collection activity
>                                         was completed without error.
>                                         Auto Offline Data Collection: Enabled.
> Self-test execution status:      ( 121) The previous self-test completed having
>                                         the read element of the test failed.

Matches SeaTools result.
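
The two status numbers in the quoted output can be unpacked by hand. The sketch below assumes the usual ATA SMART layout (for the self-test byte, result code in the high nibble and remaining percentage in tens in the low nibble; for the offline byte, bit 7 as the auto-offline flag); the function names are hypothetical.

```python
def decode_selftest_status(value):
    """Split the self-test execution status byte into
    (status_code, percent_remaining).

    The high nibble is the result code of the last self-test; the low
    nibble is the remaining percentage in steps of ten.
    """
    return value >> 4, (value & 0x0F) * 10


def decode_offline_status(value):
    """Split the offline data collection status byte into
    (auto_offline_enabled, status_code)."""
    return bool(value & 0x80), value & 0x7F


if __name__ == "__main__":
    print(decode_selftest_status(121))
    print(decode_offline_status(0x82))
```

With the values above, 121 splits into code 7 with 90% remaining, and 0x82 into the auto-offline flag plus code 2, which lines up with the "read element of the test failed" and "completed without error" messages that smartctl printed.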

> Total time to complete Offline
> data collection:                 (  600) seconds.
...

> SMART Attributes Data Structure revision number: 10
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
>   1 Raw_Read_Error_Rate     POSR--   117   095   006    -    165391146
>   3 Spin_Up_Time            PO       095   093   000    -    0
>   4 Start_Stop_Count        -O--CK   100   100   020    -    406
>   5 Reallocated_Sector_Ct   PO--CK   072   072   036    -    1181
>   7 Seek_Error_Rate         POSR--   087   060   030    -    656506200
>   9 Power_On_Hours          -O--CK   048   048   000    -    46195
>  10 Spin_Retry_Count        PO--C-   100   100   097    -    0
>  12 Power_Cycle_Count       -O--CK   100   100   020    -    203
> 183 Runtime_Bad_Block       -O--CK   092   092   000    -    8
> 184 End-to-End_Error        -O--CK   100   100
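
For keeping a history of these attributes, a row of the table can be split into fields with a few lines of Python. This is a sketch only: the function name is hypothetical, the sample rows are retyped from the table above with the fused columns separated, and raw values that are not plain integers would need extra handling.

```python
def parse_attribute(line):
    """Parse one row of the smartctl -x attribute table into a dict.

    Expects eight whitespace-separated columns:
    ID, NAME, FLAGS, VALUE, WORST, THRESH, FAIL, RAW_VALUE.
    A leading "> " quote marker is ignored.
    """
    parts = line.lstrip("> ").split()
    if len(parts) < 8:
        raise ValueError("incomplete attribute row: %r" % line)
    return {
        "id": int(parts[0]),
        "name": parts[1],
        "flags": parts[2],
        "value": int(parts[3]),
        "worst": int(parts[4]),
        "thresh": int(parts[5]),
        "fail": parts[6],
        "raw": int(parts[7]),
    }


if __name__ == "__main__":
    rows = [
        "  5 Reallocated_Sector_Ct   PO--CK   072   072   036    -    1181",
        " 10 Spin_Retry_Count        PO--C-   100   100   097    -    0",
        "183 Runtime_Bad_Block       -O--CK   092   092   000    -    8",
    ]
    for row in rows:
        attr = parse_attribute(row)
        margin = attr["value"] - attr["thresh"]
        print(attr["name"], "raw =", attr["raw"], "margin =", margin)
```

For Reallocated_Sector_Ct the normalized value (072) is still above its threshold (036), which is consistent with the overall PASSED result even though the raw count is 1181.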
