Re: how to recover after power outage

2009-04-06 Thread John Almberg
Check the machine-hostname.err file when you manually try and start MySQL.
Provided that you have mysql_enable=YES in /etc/rc.conf you should be able
to manually attempt to start with /usr/local/etc/rc.d/mysql-server start (it
seems to work reliably when you type out the entire command path-wise).

Note that if somehow permissions on the my.cnf file got changed MySQL won't
start if my.cnf is world writable. Check for stale PID and sockets. Normally
these shouldn't be a problem as a startup will just overwrite them. Check
these to eliminate any wonkiness, e.g. some permission change isn't allowing
for MySQL to wipe the old ones.

The whateverthehostname.err log and possibly /var/log/messages might give
some clue for what's going on. If the database files are corrupt just clean
them out and replace with a backup done with dump. Ensure the /var/db/mysql
tree is chowned mysql:mysql. If you had to install/reinstall from ports the
install should have created the appropriate uid/gid accounts. Check and see
if these are missing.

At any rate I wish you the best of luck. Now that you can SSH in you can
probably fix it up.

Okay, so my new database server is running with backup data and I am  
trying to salvage the old database, or what's left of it.


Unfortunately, it seems like what's left of it is not much.

The /var/db/mysql directory tree is now a file:

qu# ls -l /var/db/mysql
-rwx--  2 mysql  wheel  1024 Jul  5  2008 /var/db/mysql

The situation looks hopeless to me. Is it?

Another question: given that the file system took a major hit, should  
I try to fix it, or just do a clean install? I'm leaning towards the  
clean install since I've been meaning to upgrade this machine to 7.1  
anyway.


Is there any way to fix the file system reliably? fsck doesn't seem
to be able to solve all the problems.


-- John

 
___

freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org


Re: how to recover after power outage

2009-04-06 Thread Roland Smith
On Mon, Apr 06, 2009 at 02:08:18PM -0400, John Almberg wrote:
snip 
 Okay, so my new database server is running with backup data and I am  
 trying to salvage the old database, or what's left of it.
 
 Unfortunately, it seems like what's left of it, is not much.
 
 the /var/db/mysql directory tree is now a file:
 
 qu# ls -l /var/db/mysql
 -rwx--  2 mysql  wheel  1024 Jul  5  2008 /var/db/mysql

Normally it shouldn't be possible to turn a directory into a file. Using
open(2) to create a file that already exists as a directory should
result in an error.

 The situation looks hopeless to me. Is it?

It might not be. Unless the data was actively wiped or overwritten, the
data is probably still there on the disk in unallocated
sectors. Forensic analysis programs like The Sleuth Kit
[http://www.sleuthkit.org/sleuthkit/desc.php] _might_ be able to get
some of the data back. But don't hold your breath. It's practically
impossible to get data back from a modern drive once it has been overwritten.

 Another question: given that the file system took a major hit, should  
 I try to fix it, or just do a clean install? I'm leaning towards the  
 clean install since I've been meaning to upgrade this machine to 7.1  
 anyway.

I would advise you to make a copy of the disk contents with dd, so you
can poke around in it at your leisure. Then check the disk with e.g.
smartmontools or the tools provided by the manufacturer and do a clean
install.
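
A minimal sketch of the imaging step described above. The device name
/dev/ad4 and the target path /mnt/rescue are placeholders, not taken from
this thread, and the 64 KiB block size is an arbitrary choice. The target
should be on a different, healthy disk with enough free space.
conv=noerror,sync tells dd to keep going past read errors and to pad
unreadable blocks with zeros, so offsets in the image still line up with
offsets on the disk.

```sh
#!/bin/sh
# image_disk SRC DST: copy SRC to DST block by block.
#   noerror: continue after a read error instead of aborting.
#   sync:    pad each short or failed input block with zeros to the
#            full block size, so later data keeps its original offset.
image_disk() {
    dd if="$1" of="$2" bs=65536 conv=noerror,sync
}

# Hypothetical usage (device and target path are assumptions):
#   image_disk /dev/ad4 /mnt/rescue/ad4.img
```

With the image safely stored, smartctl -a /dev/ad4 from smartmontools
should print the drive's SMART health data; the device name is again only
an example.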

 Is there anyway to fix the file system, reliably? fsck doesn't seem  
 to be able to solve all the problems.

Is that with fsck_ffs running in preen mode? If so, try it without the
-p option. If that doesn't work you might contemplate using the -D
option, but this can be dangerous; see fsck_ffs(8). If fsck_ffs even
then cannot repair the damage, there's not much you can do except wipe
the disk and reinstall. Also, check for loose (S)ATA cables. This can
cause g_vfs_done errors while the disk is fine. If there are no obvious
errors of that kind I'd be extra suspicious about disk hardware
failure. If the drive is still in warranty, I'd have it replaced. If
not, you might still think about replacing it. Buying a new disk is
almost certainly cheaper than trawling through a diskload of data trying
to make sense of it...
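
One way to lay out the fsck_ffs steps above, as a sketch only. The
partition name /dev/ad4s1f is a made-up example, and the function merely
prints the commands so they can be reviewed before anything touches the
disk. The file system must be unmounted (or the machine in single-user
mode) before running them for real. The -D option is left out on purpose,
since fsck_ffs(8) should be read first.

```sh
#!/bin/sh
# fsck_plan DEVICE: print, without executing, two fsck_ffs passes.
fsck_plan() {
    dev=$1
    # -n answers "no" to every question: a read-only survey of the damage.
    echo "fsck_ffs -n $dev"
    # No -p (preen) flag: a full interactive check that asks before each fix.
    echo "fsck_ffs $dev"
}

# DEVICE is a hypothetical default; set it to the real partition.
fsck_plan "${DEVICE:-/dev/ad4s1f}"
```

Running the read-only pass first gives an idea of how bad things are
before committing to any repairs.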

Roland
-- 
R.F.Smith   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)




how to recover after power outage

2009-04-05 Thread John Almberg
Blast... my beautiful FreeBSD servers were rudely switched off when
my data center had a power outage a couple hours ago. They restored power
about 30 minutes later, and one box came up no problem.


The other has a login prompt on the serial console, but my login does  
not work. I get a Login incorrect message, even though the username/ 
password is correct.


When I try to SSH into the box, I get this (server name changed):

$ ssh u...@example.com -p 48420
ssh: connect to host example.com port 48420: Connection refused

In other words, I seem to be locked out.

I don't want to do anything drastic without having a good idea what  
I'm doing. Any suggestions, much appreciated.


-- John



Re: how to recover after power outage

2009-04-05 Thread Glen Barber
On Sun, Apr 5, 2009 at 2:59 AM, John Almberg jalmb...@identry.com wrote:
 Blast... my beautiful FreeBSD servers were rudely switched off when my data
 had a power outage a couple hours ago. They restored power about 30 minutes
 later, and one box came up no problem.

 The other has a login prompt on the serial console, but my login does not
 work. I get a Login incorrect message, even though the username/password
 is correct.


Can you log in as *any* user?  Even root login fails?

 When I try to SSH into the box, I get this (server name changed):

 $ ssh u...@example.com -p 48420
 ssh: connect to host example.com port 48420: Connection refused

 In other words, I seem to be locked out.

 I don't want to do anything drastic without having a good idea what I'm
 doing. Any suggestions, much appreciated.


What was the previous (estimated) uptime on the machine?  In other
words, did you change something and forget to restart the service?
Have you tried ssh-ing to port 22 to see if the setting was changed
back to default?

Are there any other services on this box?  If so, are they running?


-- 
Glen Barber


Re: how to recover after power outage

2009-04-05 Thread John Almberg


On Apr 5, 2009, at 4:41 AM, Glen Barber wrote:

On Sun, Apr 5, 2009 at 2:59 AM, John Almberg jalmb...@identry.com wrote:
Blast... my beautiful FreeBSD servers were rudely switched off when my data
had a power outage a couple hours ago. They restored power about 30 minutes
later, and one box came up no problem.

The other has a login prompt on the serial console, but my login does not
work. I get a Login incorrect message, even though the username/password
is correct.



Can you log in as *any* user?  Even root login fails?


Can't log in at all.




When I try to SSH into the box, I get this (server name changed):

$ ssh u...@example.com -p 48420
ssh: connect to host example.com port 48420: Connection refused

In other words, I seem to be locked out.

I don't want to do anything drastic without having a good idea what I'm
doing. Any suggestions, much appreciated.



What was the previous (estimated) uptime on the machine?


Several months, at least.


In other
words, did you change something and not/forget to restart the service?


I don't believe so, but if I forgot it, then I guess anything is  
possible.



 Have you tried ssh-ing to port 22 to see if the setting was changed
back to default?


I can't at the moment, because the guys at NYI are working on the  
box. They have run fsck, which doesn't seem to have solved the problem.




Are there any other services on this box?  If so, are they running?


The main app is MySQL. I don't think it is running, but can't really  
tell unless I can log in.


I have backups, and while NYI is trying to get this box running, I'm  
setting up a new database server, just in case...


-- John



Re: how to recover after power outage

2009-04-05 Thread Michael Powell
John Almberg wrote:

 
 On Apr 5, 2009, at 4:41 AM, Glen Barber wrote:
 
 On Sun, Apr 5, 2009 at 2:59 AM, John Almberg jalmb...@identry.com
 wrote:
 Blast... my beautiful FreeBSD servers were rudely switched off
 when my data
 had a power outage a couple hours ago. They restored power about
 30 minutes
 later, and one box came up no problem.

 The other has a login prompt on the serial console, but my login
 does not
 work. I get a Login incorrect message, even though the username/
 password
 is correct.


 Can you log in as *any* user?  Even root login fails?
 
 Can't log in at all.
 

 When I try to SSH into the box, I get this (server name changed):

 $ ssh u...@example.com -p 48420
 ssh: connect to host example.com port 48420: Connection refused

 In other words, I seem to be locked out.

 I don't want to do anything drastic without having a good idea
 what I'm
 doing. Any suggestions, much appreciated.


 What was the previous (estimated) uptime on the machine?
 
 Several months, at least.
 
 In other
 words, did you change something and not/forget to restart the service?
 
 I don't believe so, but if I forgot it, then I guess anything is
 possible.
 
  Have you tried ssh-ing to port 22 to see if the setting was changed
 back to default?
 
 I can't at the moment, because the guys at NYI are working on the
 box. They have run fsck, which doesn't seem to have solved the problem.
 

 Are there any other services on this box?  If so, are they running?
 
 The main app is MySQL. I don't think it is running, but can't really
 tell unless I can log in.
 
 I have backups, and while NYI is trying to get this box running, I'm
 setting up a new database server, just in case...
 

If you were lucky having the guys at NYI login to single user mode at the 
console and run fsck in an attempt to clear up minor file system damage 
would have squared things away. MySQL is not real happy if there has been fs 
damage to the underlying files and their .bin logs.

However, not being able to log in to a basic service like SSH is not good. 
Whether or not MySQL wants to come up SSH should still be working. In the 
end the guys at NYI are probably going to have to do a full system load and 
restore the last backup, and/or replace defective hardware.

I have seen old hard drives in RAID arrays that had perked along for years 
show no hint of any problem. Power down the machine to do something like 
blow the dust out or stick in some more memory sticks and it won't come up 
again. Had I not powered down it may have happily run a while longer. I have 
seen drives fail like this before, especially when they are fairly old. At 
this stage you can only emit SIGH and replace/rebuild.

But if the NYI guys are responsible for providing you with a running system 
the onus is on them to get it going again, at least up to a certain point. 
After that you would need to pick up and carry the ball the rest of the way.

-Mike






Re: how to recover after power outage

2009-04-05 Thread John Almberg


The main app is MySQL. I don't think it is running, but can't really
tell unless I can log in.

I have backups, and while NYI is trying to get this box running, I'm
setting up a new database server, just in case...



If you were lucky having the guys at NYI login to single user mode at the
console and run fsck in an attempt to clear up minor file system damage
would have squared things away. MySQL is not real happy if there has been fs
damage to the underlying files and their .bin logs.

However, not being able to log in to a basic service like SSH is not good.
Whether or not MySQL wants to come up SSH should still be working. In the
end the guys at NYI are probably going to have to do a full system load and
restore the last backup, and/or replace defective hardware.

I have seen old hard drives in RAID arrays that had perked along for years
show no hint of any problem. Power down the machine to do something like
blow the dust out or stick in some more memory sticks and it won't come up
again. Had I not powered down it may have happily run a while longer. I have
seen drives fail like this before, especially when they are fairly old. At
this stage you can only emit SIGH and replace/rebuild.

But if the NYI guys are responsible for providing you with a running system
the onus is on them to get it going again, at least up to a certain point.
After that you would need to pick up and carry the ball the rest of the way.


Okay, so the machine is back online and I can log in again.

The hardware is only 18 months old or so... good quality stuff, so  
hopefully nothing is physically damaged. We'll see...


Unfortunately, mysql isn't working at the moment... will make a  
backup of data (I have the previous night's backup, of course, but  
would like the latest, if possible.) Then will try to figure out  
what's working and what's not.


-- John



Re: how to recover after power outage

2009-04-05 Thread Michael Powell
John Almberg wrote:

[snip]
 
 Okay, so the machine is back online and I can log in again.
 
 The hardware is only 18 months old or so... good quality stuff, so
 hopefully nothing is physically damaged. We'll see...
 
 Unfortunately, mysql isn't working at the moment... will make a
 backup of data (I have the previous night's backup, of course, but
 would like the latest, if possible.) Then will try to figure out
 what's working and what's not.
 

Check the machine-hostname.err file when you manually try and start MySQL. 
Provided that you have mysql_enable=YES in /etc/rc.conf you should be able 
to manually attempt to start with /usr/local/etc/rc.d/mysql-server start (it 
seems to work reliably when you type out the entire command path-wise).

Note that if somehow permissions on the my.cnf file got changed MySQL won't 
start if my.cnf is world writable. Check for stale PID and sockets. Normally 
these shouldn't be a problem as a startup will just overwrite them. Check 
these to eliminate any wonkiness, e.g. some permission change isn't allowing 
for MySQL to wipe the old ones.

The whateverthehostname.err log and possibly /var/log/messages might give 
some clue for what's going on. If the database files are corrupt just clean 
them out and replace with a backup done with dump. Ensure the /var/db/mysql 
tree is chowned mysql:mysql. If you had to install/reinstall from ports the 
install should have created the appropriate uid/gid accounts. Check and see 
if these are missing. 

At any rate I wish you the best of luck. Now that you can SSH in you can 
probably fix it up.

-Mike
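
The checks described above can be sketched as a few small shell
functions. This is only a sketch: the function names are made up for
illustration, and the my.cnf location and PID file name in the usage
comments are assumptions, since the thread does not give them.

```sh
#!/bin/sh
# is_world_writable FILE: succeed if "other" has write permission on FILE.
# -L makes find look at the target if FILE happens to be a symlink.
is_world_writable() {
    [ -n "$(find -L "$1" -maxdepth 0 -perm -002 2>/dev/null)" ]
}

# pid_is_stale PIDFILE: succeed if PIDFILE exists but names no live process.
# kill -0 delivers no signal; it only tests whether the process can be
# signalled, so run this as root to avoid false "stale" results for
# processes owned by other users.
pid_is_stale() {
    [ -f "$1" ] || return 1
    ! kill -0 "$(cat "$1")" 2>/dev/null
}

# list_wrong_owner DIR USER GROUP: print entries under DIR that are not
# owned by USER:GROUP.
list_wrong_owner() {
    find "$1" \( ! -user "$2" -o ! -group "$3" \) -print
}

# Hypothetical usage (all paths are assumptions):
#   is_world_writable /usr/local/etc/my.cnf && echo "my.cnf is world writable"
#   pid_is_stale "/var/db/mysql/$(hostname).pid" && echo "stale PID file"
#   list_wrong_owner /var/db/mysql mysql mysql
```

Anything printed by list_wrong_owner is a candidate for
chown mysql:mysql, as suggested above.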





Re: how to recover after power outage

2009-04-05 Thread John Almberg


On Apr 5, 2009, at 2:10 PM, Michael Powell wrote:


John Almberg wrote:

[snip]


Okay, so the machine is back online and I can log in again.

The hardware is only 18 months old or so... good quality stuff, so
hopefully nothing is physically damaged. We'll see...

Unfortunately, mysql isn't working at the moment... will make a
backup of data (I have the previous night's backup, of course, but
would like the latest, if possible.) Then will try to figure out
what's working and what's not.



Check the machine-hostname.err file when you manually try and start MySQL.
Provided that you have mysql_enable=YES in /etc/rc.conf you should be able
to manually attempt to start with /usr/local/etc/rc.d/mysql-server start (it
seems to work reliably when you type out the entire command path-wise).

Note that if somehow permissions on the my.cnf file got changed MySQL won't
start if my.cnf is world writable. Check for stale PID and sockets. Normally
these shouldn't be a problem as a startup will just overwrite them. Check
these to eliminate any wonkiness, e.g. some permission change isn't allowing
for MySQL to wipe the old ones.

The whateverthehostname.err log and possibly /var/log/messages might give
some clue for what's going on. If the database files are corrupt just clean
them out and replace with a backup done with dump. Ensure the /var/db/mysql
tree is chowned mysql:mysql. If you had to install/reinstall from ports the
install should have created the appropriate uid/gid accounts. Check and see
if these are missing.

At any rate I wish you the best of luck. Now that you can SSH in you can
probably fix it up.


Well, I had to give up, temporarily, on this server to get my clients  
back online.


I took a nice machine I had lying around, loaded a fresh copy of
FreeBSD on it, installed mysql, and loaded the Saturday morning
database backup.


I had to set up all the database permissions correctly, which took  
some time, but I'm happy to say that I've got all my clients back  
online with this new database server.


Now I am going to catch a couple hours sleep (this has been going on  
since 2 am). Once I restore some brain cells, I'll see if I can  
figure out what's happening with the main database server. NYI has  
taken it off line, for some reason, and I can't log into it anyway,  
at the moment.


Thanks for all the helpful advice. It's great to have this list to  
fall back on in a crisis.


Brgds: John
