melbogia wrote:
There is no "panic" or "syncing filesystems" in
/var/adm/messages. Here is what I see after I execute
the command "mkfile 100g /datapool/testfile"
I was on the machine at 1600 hours yesterday, and the next entry
after that, at 8:43 AM, is system startup information. It seems it
can't even write anything to /var/adm/messages before it reboots?
Also, this may be relevant: the datapool pool doesn't exist after
the system comes back up from the crash. I have to do a
"zpool import -f datapool" to import it.
This may be multiple issues. The fact that a reboot happens with no
trace in /var/adm/messages is somewhat worrying to me. However, we
need to frame the issue. Things like:
When you were using the system on the 10th, how were you connected
to it? Were you logged in remotely (via ssh/telnet/etc...)?
Were you on the console (graphics console, or text mode)?
What did you see on the screen around 16:19?
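Also, if the box really is panicking too quickly to log anything, a crash dump may still get captured. A minimal sketch, assuming the stock OpenSolaris dump setup (the /var/crash/dirt path is just the default savecore directory for a host named "dirt" and may differ on your system):

```
# dumpadm              # show the dump device, savecore directory, and whether savecore is enabled
# dumpadm -y           # enable savecore if it reports "Savecore enabled: no"
# ls /var/crash/dirt   # after the next reset, look here for a saved dump
```

If a dump appears after the next crash, that would at least confirm it is a panic rather than a hardware reset, and give something to analyse.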
I was connected to the server via ssh and I didn't see anything at 16:19.
I can't imagine you saw absolutely nothing from your ssh session. I'm
guessing that it hung, and eventually gave some error like "timeout" or
"connection reset by peer" or something? Even if you reset it, did you
try ssh again, ping, etc...? What were the results here?
Did you go (if able) to the system itself? What state was it in? etc...
It wasn't a power cut was it? ;-)
I am not sure about the hardware reset issues, I'll poke around and see what I
find. But there is another machine with the Opensolaris 2009.06 as well, it has
the exact same hardware with the exception of hard disks. It has been working
fine.
That's kind of my point. If there is an "identical" system (and beware -
they are rarely exactly identical with no differences whatsoever), and
it is functioning fine, then (assuming similar usage patterns) it may
help point to a hardware/firmware issue, etc... Don't get me wrong - I'm
not saying your hardware is bust - just something to bear in mind if the
fault always stays with the one system and no other system is affected.
Dec 11 08:43:36 dirt zfs: [ID 427000 kern.warning] WARNING: pool
'datapool' could not be loaded as it was last accessed by another
system (host: dirt hostid: 0x409a4c). See:
http://www.sun.com/msg/ZFS-8000-EY
Whilst I can't comment on this line myself, it does (IMO) explain
why you had to force import the pool after the reboot. Assuming
there is only one host with the name "dirt", then this would seem
to be a bug.
That is correct; there is only one host with that name in the environment.
You might want to ask this question on zfs-discuss, and give them the
background.
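Before posting there, it might be worth confirming that the hostid in the warning really is this machine's own (the 0x409a4c value below is taken from your messages excerpt):

```
# hostid          # compare the value printed here against the 0x409a4c in the warning
# zpool import    # with no arguments: lists pools available for import, and why they need -f
```

If hostid matches and the pool still claims to have been "last accessed by another system", that strengthens the bug theory for the zfs-discuss folks.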
Just a thought (assuming the system will let you do this), who else was
logged in at the time? Could somebody have done a forced export of all
zpools on the system? Could this explain why the messages file stopped
being written to?
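Two things might answer that directly, if the system lets you. `last` reads the wtmpx login records, which survive a reboot, and `zpool history` replays the administrative commands run against the pool (it is stored in the pool itself, so it survives the re-import):

```
# last | head              # login/logout/reboot records around Dec 10 16:19
# zpool history datapool   # timestamped zpool/zfs admin commands, including any export
```

A forced export would show up in the pool history; an abrupt reset would not.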
Regards,
Brian
--
Brian Ruthven
Solaris Revenue Product Engineering
Sun Microsystems UK
Sparc House, Guillemont Park, Camberley, GU17 9QG
_______________________________________________
opensolaris-discuss mailing list
[email protected]