Re: [Veritas-vx] Question over DMP partitionsize

2012-07-27 Thread phil . coleman
Hi All,
Thanks for all the information and suggestions; it's given me a lot to
look at. Hopefully I've covered all your questions below...

 I've checked and there's no evidence I can find that anyone has made any
changes against the arrays, either via the command line or VEA (the latter
of which we don't use). However, comparing the tunable values against
their defaults, there are two that differ...

  fuj411:/root# vxdmpadm gettune all
  Tunable                     Current Value  Default Value
  -------------------------   -------------  -------------
  dmp_failed_io_threshold             57600          57600
  dmp_retry_count                         5              5
  dmp_pathswitch_blks_shift              11             11
  dmp_queue_depth                        32             32
  dmp_cache_open                        off            off
  dmp_daemon_count                       10             10
  dmp_scsi_timeout                       30             30
  dmp_delayq_interval                    15             15
  dmp_path_age                            0            300
  dmp_stat_interval                       1              1
  dmp_health_time                         0             60
  dmp_probe_idle_lun                     on             on
  dmp_log_level                           1              1
  dmp_retry_timeout                       0              0
  fuj411:/root#

I don't know who would have changed these, or when, or whether they were
changed at all. There is a lot of disk seen down each path (physical HBA).
We have an A and a B side to the SAN, with the disk being presented down
both sides to separate HBAs on the server. The errors and other messages
always relate to the same array (IBM_SHARK1), which is the one that has
the different value set and the one that sees the throttling against it.
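
For reference, the two values that differ are dmp_path_age (0 against a
default of 300) and dmp_health_time (0 against a default of 60). If the
settune syntax documented for later releases also applies to 4.1 MP2 (an
assumption), putting them back would be a one-liner each; a sketch, not
something we've run yet:

  # inspect one tunable, then restore the documented defaults
  vxdmpadm gettune dmp_path_age
  vxdmpadm settune dmp_path_age=300
  vxdmpadm settune dmp_health_time=60
  vxdmpadm gettune all    # confirm both now match the default column

If the 5.x documentation applies here, a value of 0 effectively disables
path ageing and the intermittent-path health check, which may itself be
relevant to the throttling we're seeing.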

 The version of VxVM is 4.1 MP2, so I know we are down-level and don't
have the new default value of 512 that came in with 5.1 MP3.

 All paths are active and I/O is seen going down both paths to all disks.
I can see this at the server and at the SAN port level. The storage team
have checked and they cannot find anything untoward. I've also had a look
at the HBAs (fcinfo hba-port -l HBA_WWN).
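
For completeness, the per-path checks were along these lines (a sketch;
the controller names and WWN below are placeholders for our real ones):

  vxdmpadm listctlr all            # HBAs (controllers) known to DMP
  vxdmpadm getsubpaths ctlr=c2     # path states for the A-side HBA
  vxdmpadm getsubpaths ctlr=c3     # path states for the B-side HBA
  fcinfo hba-port -l 10:00:00:00:c9:xx:xx:xx   # Solaris 10 link/error counters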

 The recovery option on both arrays is the same and set to the
following...

  fuj411:/root# vxdmpadm getattr enclosure IBM_SHARK0 recoveryoption
  ENCLR-NAME  RECOVERY-OPTION  DEFAULT[VAL]     CURRENT[VAL]
  ===========================================================
  IBM_SHARK0  Throttle         Timebound[10]    Timebound[10]
  IBM_SHARK0  Error-Retry      Fixed-Retry[5]   Fixed-Retry[5]
  fuj411:/root#
  fuj411:/root# vxdmpadm getattr enclosure IBM_SHARK1 recoveryoption
  ENCLR-NAME  RECOVERY-OPTION  DEFAULT[VAL]     CURRENT[VAL]
  ===========================================================
  IBM_SHARK1  Throttle         Timebound[10]    Timebound[10]
  IBM_SHARK1  Error-Retry      Fixed-Retry[5]   Fixed-Retry[5]
  fuj411:/root#
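
Should the throttling itself ever need adjusting, the recovery option is
settable per enclosure. A sketch based on the setattr usage in the admin
guide (the exact attributes accepted by 4.1 MP2 may differ); we have not
run these:

  # disable I/O throttling on the suspect enclosure
  vxdmpadm setattr enclosure IBM_SHARK1 recoveryoption=nothrottle
  # or keep time-bound throttling but change the timeout
  vxdmpadm setattr enclosure IBM_SHARK1 recoveryoption=throttle iotimeout=30
  # put the default behaviour back
  vxdmpadm setattr enclosure IBM_SHARK1 recoveryoption=default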

My suspicion is that we have some kind of fault; the problem is
identifying where. I suspect this will involve getting the server up to
the latest patch levels, both in the OS and within the installed VxVM
version, for starters, and getting the storage team to carry out a full
end-to-end test of the hardware. In the meantime, thanks again for all
your help with this; it's very much appreciated. I think I will take
Dmitry's advice and log a call with Symantec to see if they can explain
the difference and whether it's just the way this array type operates.

Thanks again.

Cheers
Phil.



   
From: Dmitry Glushenok gl...@jet.msk.su, 26/07/2012 21:09
To: William Havey bbha...@gmail.com, phil.cole...@ba.com
Cc: veritas-vx@mailman.eng.auburn.edu
Subject: Re: [Veritas-vx] Question over DMP partitionsize

Re: [Veritas-vx] Question over DMP partitionsize

2012-07-27 Thread William Havey
Your situation helps me understand the internals of DMP that much more.
Sorry my benefit comes at your disadvantage. Good luck with the mystery.

BTW, a tip-of-the-hat to you for using the word untoward. Nice to read a
tech person using our language so well.


 Hello William, Phil,

 On 26.07.2012, at 21:25, William Havey wrote:

Re: [Veritas-vx] Question over DMP partitionsize

2012-07-26 Thread phil . coleman
Hi William,
Thanks for the reply. I'm not sure how to get this track cache size?
What's confusing me most here is that the values are so different between
the two arrays. They're identical models, set up the same and with the
same number of disks allocated to the servers - sorry, forgot to mention
they're in an HA pair using VCS. The only difference is that IBM_SHARK0 is
local to the server where the workload is currently running, and IBM_SHARK1
is in another building about 1.5 km away. It's this disparity that is
confusing me and making me wonder whether an issue we are seeing is being
caused by this, or whether it's an indication of an issue, though I've been
informed by our storage people that there's nothing wrong with either
array.
This is certainly not something we've changed, so I don't know if
VxVM/DMP is throttling things back because it's seeing an issue. I have
seen messages in the /etc/vx/dmpevents.log file for disks in the IBM_SHARK1
array reporting 'Throttled Path' and then 'Un-throttled Path', and I'm
trying to work out if the two are linked.
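
To gauge how often that is happening, a quick count straight from the
log (assuming the entries contain the exact strings quoted above):

  grep -c "Throttled Path" /etc/vx/dmpevents.log       # throttle events
  grep -c "Un-throttled Path" /etc/vx/dmpevents.log    # and the releases
  grep "Throttled Path" /etc/vx/dmpevents.log | tail -20   # most recent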

Cheers
Phil.



   
From: William Havey bbha...@gmail.com, 26/07/2012 15:48
To: phil.cole...@ba.com
Subject: Re: [Veritas-vx] Question over DMP partitionsize




Partitionsize is in play when the iopolicy is Balanced, which is the
default policy. You have Balanced.

It is defined as follows: "The partitionsize attribute: each successive
I/O starting within this range (default is 2048 sectors) goes through the
same path as the previous I/O."

The man page adds that it "takes the track cache into consideration when
balancing I/O across paths".

What is the track cache size on each array? It seems the partitionsize
should be the same value as the track cache size.
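
If the track cache sizes do turn out to match SHARK0's 2048, the
attribute can be set per array or enclosure. A sketch from the documented
setattr syntax, assuming 4.1 accepts the same form:

  # align IBM_SHARK1 with IBM_SHARK0 (2048 sectors = 1 MB at 512 B/sector)
  vxdmpadm setattr enclosure IBM_SHARK1 iopolicy=balanced partitionsize=2048
  vxdmpadm getattr arrayname IBM_SHARK partitionsize   # verify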



On Thu, Jul 26, 2012 at 7:09 AM, phil.cole...@ba.com wrote:

  Hi,
          I'm trying to understand why one array has a wildly different
  partitionsize value to the other...

  fuj411:/root# vxdmpadm getattr arrayname IBM_SHARK partitionsize
  ENCLR_NAME     DEFAULT        CURRENT
  ======================================
  IBM_SHARK0     2048           2048
  IBM_SHARK1     256            512
  fuj411:/root#

  Both arrays are IBM ESS SHARK 2105s, which are active/active and
  operating with a Balanced I/O policy...

  fuj411:/root# vxdmpadm getattr arrayname IBM_SHARK iopolicy
  ENCLR_NAME     DEFAULT        CURRENT
  ======================================
  IBM_SHARK0     Balanced       Balanced
  IBM_SHARK1     Balanced       Balanced
  fuj411:/root#

  The server is a Fujitsu PW850 running Solaris 10 with VxVM v4.1 MP2. The
  volumes are mirrored between the two arrays.

  I'm trying to get a better understanding of what this means and why it
  would be so different between the two arrays. Any help greatly
  appreciated.

  Cheers
  Phil.


Re: [Veritas-vx] Question over DMP partitionsize

2012-07-26 Thread Terrie Douglas
DMP is most likely throttling things back due to an issue on a path:
I/O throttling is a mechanism by which Dynamic Multi-Pathing (DMP) temporarily 
stops issuing I/Os to paths that appear to be either overloaded or 
underperforming. There is a default I/O throttling mechanism in DMP based on 
the number of requests queued on a path.
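
The queue threshold involved is the dmp_queue_depth tunable visible in
Phil's gettune output (32 on his host). Assuming gettune accepts a single
tunable name, the relevant settings can be pulled out directly:

  vxdmpadm gettune dmp_queue_depth                      # throttle threshold
  vxdmpadm getattr enclosure IBM_SHARK1 recoveryoption  # current policy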

I am researching why you are seeing differing partition sizes. I would
think it has something to do with configuration at the OS level or in the
VxVM area, not the array itself.


Regards, 
Terrie Douglas
Sr. Prin. Technical Support Engineer
Symantec Software Corporation
Email: terrie_doug...@symantec.com
Customer Support: 1(800) 342-0652









Re: [Veritas-vx] Question over DMP partitionsize

2012-07-26 Thread William Havey
Phil,

I don't believe any DMP values are changed by vxconfigd automatically.

To be certain the partitionsize has not been manually changed: is VxVM
command-line logging enabled? Run vxcmdlog -l to find out. If yes, look in
/var/adm/vx/cmdlog for any vxdmpadm setattr commands. Do you use VEA? If
so, its commands are logged in /var/adm/vx/veacmdlog. But I don't believe
these values can be set in VEA.
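
Concretely, that check might look like the following (a sketch; the log
locations are the ones named above):

  vxcmdlog -l                                    # is logging on, and where?
  grep setattr /var/adm/vx/cmdlog* | grep vxdmpadm     # manual changes?
  grep setattr /var/adm/vx/veacmdlog 2>/dev/null       # only if VEA was used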

I believe the throttle and unthrottle entries in the dmpevents log file
mean I/O is being held back from being issued to storage, but I certainly
can't say whether the two (the partitionsize of 512 and the log entries
for SHARK1) are related.

Are all paths being used for roughly the same amount of I/O?
vxdmpadm iostat start
vxdmpadm iostat reset
vxdmpadm iostat show enclosure=IBM_SHARK0 enclosure=IBM_SHARK1 interval=5

The first iteration of the utility shows cumulative statistics from the
time the file systems were mounted, so ignore the first output. If the
totals are about equal, then the partitionsize value may be irrelevant.
Perhaps the I/O is truly random, so that no I/O address falls within 512
sectors of the previous I/O. Even in a random environment, numerous
successive I/Os can sometimes fall within 512 sectors of each other, so
that throttling of those I/Os kicks in.
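
Putting that together with the count= option from the documented iostat
usage, a one-minute sample per enclosure would be:

  vxdmpadm iostat start
  vxdmpadm iostat reset     # discard the cumulative since-mount totals
  # twelve 5-second samples for each enclosure
  vxdmpadm iostat show enclosure=IBM_SHARK0 interval=5 count=12
  vxdmpadm iostat show enclosure=IBM_SHARK1 interval=5 count=12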

IHTH,

Bill

