Re: [zfs-discuss] MPxIO and removing physical devices

2009-11-09 Thread Maidak Alexander J
I'm not sure if this is exactly what you're looking for but check out the work 
around in this bug:

http://bugs.opensolaris.org/view_bug.do;jsessionid=9011b9dacffa0b615db182bbcd7b?bug_id=6559281

Basically, look through cfgadm -al and run the following command on each of the 
unusable attachment points. For example: 

cfgadm -o unusable_FCP_dev -c unconfigure c2::5005076801400525 
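
A rough sketch of the whole workaround (the c2::5005076801400525 Ap_Id above is 
just the example from the bug report; use whichever Ap_Ids your own cfgadm -al 
reports as unusable):

cfgadm -al | grep -i unusable                                    # find Ap_Ids whose condition is "unusable"
cfgadm -o unusable_FCP_dev -c unconfigure c2::5005076801400525   # unconfigure each flagged Ap_Id
cfgadm -al                                                       # confirm the stale attachment points are gone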

You might also try the Storage-Discuss list.

-Alex

-Original Message-
From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Karl Katzke
Sent: Tuesday, November 03, 2009 3:11 PM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] MPxIO and removing physical devices

I am a bit of a Solaris newbie. I have a brand spankin' new Solaris 10u8 
machine (x4250) that is running an attached J4400 and some internal drives. 
We're using multipathed SAS I/O (enabled via stmsboot), so the device paths 
have moved from their usual short names to long WWN-based strings -- in the 
case of c0t5d0, it's now /dev/rdsk/c6t5000CCA00A274EDCd0. (I can see the 
cross-referenced devices with stmsboot -L.)
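
For reference, the cross-referencing is just the standard stmsboot listing (the 
short name below is only an example):

stmsboot -L                    # show the non-MPxIO -> MPxIO device name mapping
stmsboot -L | grep c0t5d0      # find the new name for one particular disk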

Normally, when replacing a disk on a Solaris system, I would run cfgadm -c 
unconfigure c0::dsk/c0t5d0. However, cfgadm -l does not list c6, nor does it 
list any disks. In fact, running cfgadm against the places where I think things 
are supposed to live gets me the following:

bash# cfgadm -l /dev/rdsk/c0t5d0
Ap_Id Type Receptacle Occupant Condition
/dev/rdsk/c0t5d0: No matching library found

bash# cfgadm -l /dev/rdsk/c6t5000CCA00A274EDCd0
cfgadm: Attachment point not found

bash# cfgadm -l /dev/dsk/c6t5000CCA00A274EDCd0
Ap_Id  Type Receptacle   Occupant Condition
/dev/dsk/c6t5000CCA00A274EDCd0: No matching library found

bash# cfgadm -l c6t5000CCA00A274EDCd0
Ap_Id Type Receptacle Occupant Condition
c6t5000CCA00A274EDCd0: No matching library found

I ran devfsadm -C -v and it removed all of the old device links for the 
/dev/dsk/c0t5d0 devices and created links for the c6 devices. Running cfgadm -al 
shows a c0, c4, and c5 -- these correspond to the actual controllers, but no 
devices are listed as attached to them. 
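
For completeness, that cleanup was nothing exotic, roughly:

devfsadm -Cv         # prune dangling /dev links and build links for new device nodes
cfgadm -al           # re-list attachment points: only the bare controllers show up here
mpathadm list lu     # MPxIO's own view of the logical units and their path counts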

I found an old email on this list about MPxIO that said the solution was 
basically to yank the physical device after making sure that no I/O was going 
to it. While this worked and allowed us to return the device to service as a 
spare in the zpool it inhabits, what was more concerning was the output of 
mpathadm list lu after we yanked the device and returned it to service: 

-- 

bash# mpathadm list lu
/dev/rdsk/c6t5000CCA00A2A9398d0s2
Total Path Count: 1
Operational Path Count: 1
/dev/rdsk/c6t5000CCA00A29EE2Cd0s2
Total Path Count: 1
Operational Path Count: 1
/dev/rdsk/c6t5000CCA00A2BDBFCd0s2
Total Path Count: 1
Operational Path Count: 1
/dev/rdsk/c6t5000CCA00A2A8E68d0s2
Total Path Count: 1
Operational Path Count: 1
/dev/rdsk/c6t5000CCA00A0537ECd0s2
Total Path Count: 1
Operational Path Count: 1
mpathadm: Error: Unable to get configuration information.
mpathadm: Unable to complete operation

(Side note: Some of the disks are single-path via an internal controller, and 
some of them are multi-path in the J4400 via two external controllers.) 

A reboot fixed the 'issue' with mpathadm and it now outputs complete data. 
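
If it helps, the per-LU check I'd expect to run before (and after) pulling a 
disk is something like this, using the example device name from earlier in 
this mail:

mpathadm show lu /dev/rdsk/c6t5000CCA00A274EDCd0s2    # per-path state, target ports, autofailback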

 

So -- how do I administer and remove physical devices that sit behind 
multipath-managed controllers on Solaris 10u8, without breaking multipathing or 
causing configuration changes that interfere with the services and devices 
managed by mpathadm and the other voodoo and black magic inside? I can't seem 
to find this documented anywhere, even though the instructions for enabling 
multipathing with stmsboot -e were quite complete and worked well! 
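
For concreteness, the sequence I *think* should be safe is sketched below (the 
pool name 'tank' and the <new_disk_WWN> placeholder are made up; corrections 
very welcome):

zpool offline tank c6t5000CCA00A274EDCd0     # quiesce ZFS I/O to the failing disk (pool/device are placeholders)
mpathadm list lu                             # confirm the remaining paths are still operational
# ...physically pull the disk and insert the replacement...
devfsadm -Cv                                 # build /dev links for the new disk's WWN-based name
zpool replace tank c6t5000CCA00A274EDCd0 c6t<new_disk_WWN>d0   # resilver onto the new disk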

Thanks,
Karl Katzke



-- 

Karl Katzke
Systems Analyst II
TAMU - RGS


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Maidak Alexander J
If you're having issues with a disk controller or disk I/O driver, it's highly 
likely that a savecore to disk after the panic will fail.  I'm not sure how to 
work around this; maybe a dedicated dump device on a controller that uses a 
different driver than the one you're having issues with?
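
Roughly something like this (the slice below is purely a placeholder; pick one 
on a controller driven by a different HBA driver than the flaky one):

dumpadm -d /dev/dsk/c1t0d0s1          # dedicate this slice as the dump device (placeholder path)
dumpadm -s /var/crash/`hostname`      # where savecore will write unix.N/vmcore.N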

-Original Message-
From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Blake
Sent: Wednesday, March 11, 2009 4:45 PM
To: Richard Elling
Cc: Marc Bevand; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] reboot when copying large amounts of data

I guess I didn't make it clear that I had already tried using savecore to 
retrieve the core from the dump device.

I added a larger zvol for dump, to make sure that I wasn't running out of space 
on the dump device:

r...@host:~# dumpadm
      Dump content: kernel pages
       Dump device: /dev/zvol/dsk/rpool/bigdump (dedicated)
Savecore directory: /var/crash/host
  Savecore enabled: yes

I was using the -L option only to try to get some idea of why the system load 
was climbing to 1 during a simple file copy.
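
For reference, the live-dump attempt was essentially just the following (it 
needs the dedicated dump device configured above):

savecore -L        # snapshot the running system into the dump device and
                   # extract it into the savecore directory as unix.N/vmcore.N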



On Wed, Mar 11, 2009 at 4:58 PM, Richard Elling richard.ell...@gmail.com 
wrote:
 Blake wrote:

 I'm attaching a screenshot of the console just before reboot.  The 
 dump doesn't seem to be working, or savecore isn't working.

 On Wed, Mar 11, 2009 at 11:33 AM, Blake blake.ir...@gmail.com wrote:


 I'm working on testing this some more by doing a savecore -L right 
 after I start the copy.



 savecore -L is not what you want.

 By default, for OpenSolaris, savecore on boot is disabled.  But the 
 core will have been dumped into the dump slice, which is not used for swap.
 So you should be able to run savecore at a later time to collect the 
 core from the last dump.
 -- richard
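
(So, if I follow, the post-reboot collection would be roughly the sketch below; 
unix.0/vmcore.0 assume this is the first dump saved into that directory:)

savecore /var/crash/host       # extract the crash dump from the dump device
mdb unix.0 vmcore.0            # then inspect it, e.g. with ::status and ::msgbuf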


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss