Re: [Veritas-ha] Cluster Interconnect cables: Direct connect orVLANs?
My only addition to the comments by the esteemed gentleman from Virginia is to make sure you have a solid practice in place to manage cluster ID when you go VLAN, as there may be cases when your network people "cross the streams".

From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Eric Hennessey
Sent: Wednesday, September 16, 2009 11:59 AM
To: Jon Price; veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] Cluster Interconnect cables: Direct connect or VLANs?

The configuration you're considering – running your cluster interconnects over two separate VLANs – is actually our preferred and recommended method, even when deploying a simple 2-node cluster. While using direct connections between cluster nodes is simple and convenient, it becomes problematic if you decide to add a node to the cluster. Rest easy with your design. :-)

Eric

From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Jon Price
Sent: Tuesday, September 15, 2009 3:58 PM
To: veritas-ha@mailman.eng.auburn.edu
Subject: [Veritas-ha] Cluster Interconnect cables: Direct connect or VLANs?

Hi,

This is for Veritas Cluster 5.0; we also have Storage Foundation for Oracle. Currently we use direct connect cables between the two nodes in our Veritas Cluster for the heartbeat. However, we are switching to new systems, and running the direct connect cables is more difficult than it used to be. So we are considering the use of two VLANs for this purpose. I believe that traffic on these two VLANs is limited to cluster heartbeat connections only (though not just ours). What is the downside of using VLANs for the heartbeat? In what scenarios could problems develop? I'm concerned that if our network has a serious problem and goes down, each node in the cluster might be isolated, and both nodes would import the disk groups, mount volumes, etc., and thus cause data corruption.
Is data corruption a possibility if the entire network goes down, or in other scenarios? Does Veritas also use quorum or any other methods to protect against split-brain induced damage?

Thanks

___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
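For readers implementing the VLAN approach, the cluster ID the first reply refers to lives in each node's LLT configuration file. A minimal sketch, where the node name, cluster ID value, and interface devices are illustrative, not taken from the poster's environment:

```
# /etc/llttab -- LLT configuration (illustrative values)
set-node    node1
set-cluster 47      # unique cluster ID; must differ from every other
                    # cluster whose heartbeat VLANs are reachable from
                    # the same switches, or nodes can cross-join
link        llt1 /dev/ce:1 - ether - -
link        llt2 /dev/ce:2 - ether - -
```

Keeping a registry of assigned cluster IDs per switch fabric is one simple way to implement the "solid practice" the reply recommends.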
Re: [Veritas-ha] LLT heartbeat redundancy
This is not a limitation, as you had two independent failures. Bonding would remove the ability to discriminate between a link failure and a node failure. My feeling is that in the scenario you describe, VCS is operating properly, and it is not a limitation. If you have issues with port or cable failures, add a low-pri connection on a third network.

-Original Message-
From: Imri Zvik [mailto:im...@inter.net.il]
Sent: Sunday, May 03, 2009 11:57 AM
To: Jim Senicka
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

On Sunday 03 May 2009 18:25:08 Jim Senicka wrote:
You had 2 failures. No real way to design around that. GAB visible would prevent bad things from occurring.

Thank you for the fast response :) Well, in Linux I can use the bonding module to aggregate the interfaces and work around this limitation. I've read in this discussion:
http://www.mail-archive.com/veritas-ha@mailman.eng.auburn.edu/msg01016.html
that since 5.0MP3 there is a cross-platform solution (I need this for Solaris 10). Do you happen to know more about this feature? Thanks!

P.S. Does anyone know if Sun Cluster has the same limitation?
Re: [Veritas-ha] LLT heartbeat redundancy
LLT is designed to use jeopardy to detect the difference between a single-link failure and a dual-link failure in most situations. Having a single mesh may remove this capability. Let me check on this with engineering and see if we have any more up-to-date recommendations.

-Original Message-
From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Imri Zvik
Sent: Sunday, May 03, 2009 12:18 PM
To: Jim Senicka
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

On Sunday 03 May 2009 19:03:16 Jim Senicka wrote:
This is not a limitation, as you had two independent failures. Bonding would remove the ability to discriminate between a link and a node failure.

I didn't understand this one: with bonding I can maintain a full mesh topology. No matter which one of the links fails, if a node still has at least one active link, LLT will still be able to see all the other nodes. This achieves greater HA than without the bonding.

My feeling is in the scenario you describe, VCS is operating properly, and it is not a limitation.

Of course it is operating properly - that's how it was designed to work :) I'm just saying that the cluster could be more redundant if it wasn't designed that way :)

If you have issues with port or cable failures, add a low pri connection on a third network.
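The low-pri connection Jim suggests is a single extra directive in /etc/llttab, typically pointed at the public network; LLT uses it for heartbeats only (at a reduced rate), not for cluster traffic, so losing both private links still leaves membership intact. A sketch with illustrative device names:

```
# /etc/llttab -- two high-priority heartbeat links plus a
# low-priority heartbeat on the public LAN (illustrative devices)
set-node    node1
set-cluster 10
link        llt1 /dev/ce:1 - ether - -
link        llt2 /dev/ce:2 - ether - -
link-lowpri pub1 /dev/ce:0 - ether - -   # heartbeat only
```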
Re: [Veritas-ha] SUMMARY: filesystem corruption after the cluster nodereboot
Running a non-journaled file system in a cluster is always a bad idea, as your recovery time is always affected by file system start-up tasks. Running UFS in logging mode was usually a pretty big performance hit. Why not VxFS?

-Original Message-
From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Aleksandr Nepomnyashchiy
Sent: Tuesday, March 31, 2009 6:07 PM
To: veritas-ha@mailman.eng.auburn.edu
Subject: [Veritas-ha] SUMMARY: filesystem corruption after the cluster node reboot

Many thanks to Tom Stephens for his help in troubleshooting.

What happened: Both fs1 and fs2 became corrupted after the node crash. Most probably VCS tried to fsck both; it was successful with fs1 (size ~4G) but didn't complete within the timeout period on fs2 (size ~100G). So the fsck of fs2 was killed and didn't leave anything in engine_A.log.

Suggested actions:
A) Implement UFS logging on both fs1 and fs2 - should eliminate the file system corruption and the need for fsck (I will definitely implement this).
B) Increase the OnlineTimeout value for the Mount type from the default of 300 seconds (this should be considered carefully; it can cause trouble).

PS: I was considering adding -y in FsckOpt, but it doesn't make any difference - the online script adds the -y option to fsck regardless of whether you specify it or not in FsckOpt. This is the case for online script version 2.9 from 02/13/01 18:15:47.

=== Please see the original post below ===

Dear VCS gurus,

Please help me understand why only 1 out of 2 mount points came up after the crash. I can see in the log that fs1 was fsck-ed by VCS and brought online. Was fsck even attempted on fs2? And if not, why? VCS is 2.0, both fs1 and fs2 are UFS, nothing in FsckOpt.
== engine_A.log from the healthy node ==
TAG_E 2009/03/26 18:25:55 (node_d) VCS:13001:Resource(mnt_fs1): Output of the completed operation (online)
mount: the state of /dev/vx/dsk/mydg/fs1 is not okay and it was attempted to be mounted read/write
mount: Please run fsck and try again
** /dev/vx/rdsk/mydg/fs1
** Last Mounted on /mount/fs1
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3a - Check Connectivity
** Phase 3b - Verify Shadows/ACLs
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cylinder Groups
FILE SYSTEM STATE IN SUPERBLOCK IS WRONG; FIX? yes
7324 files, 2158506 used, 1773622 free (4910 frags, 221089 blocks, 0.1% fragmentation)
TAG_E 2009/03/26 18:25:55 VCS:10298:Resource mnt_fs1 (Owner: unknown, Group: srvgrA) is online on node_d (VCS initiated)
TAG_E 2009/03/26 18:30:07 (node_d) VCS:13003:Resource(mnt_fs2): Output of the timedout operation (online)
mount: the state of /dev/vx/dsk/mydg/fs2 is not okay and it was attempted to be mounted read/write
mount: Please run fsck and try again
TAG_B 2009/03/26 18:30:07 (node_d) VCS:13012:Resource(mnt_fs2): online procedure did not complete within the expected time.
TAG_D 2009/03/26 18:30:07 (node_d) VCS:13065:Agent is calling clean for resource(mnt_fs2) because online did not complete within the expected time.

Thank you,
Aleksandr
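For anyone applying suggestions (A) and (B) above, the changes look roughly like this. The resource name mnt_fs2 comes from the thread; the 900-second value is an arbitrary example, not a recommendation:

```
# (A) enable UFS logging: add "logging" to the mount options
#     (the last field of the file system's /etc/vfstab entry,
#     or the MountOpt attribute of the VCS Mount resource)

# (B) give only the large file system a longer online timeout,
#     instead of raising it for the whole Mount type
haconf -makerw
hares -override mnt_fs2 OnlineTimeout       # localize the static attribute
hares -modify  mnt_fs2 OnlineTimeout 900    # example: allow 15 min for fsck
haconf -dump -makero
```

Overriding per resource avoids the caveat in (B): other, smaller mounts keep the default 300-second timeout.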
Re: [Veritas-ha] SFCFSRAC - node with the highest nodeid panics afternode with the lowest nodeid rejoins
What are your gabtab settings? You seem to have two independent cluster generations. You should have /sbin/gabconfig -c -n4 in gabtab.

-Original Message-
From: Imri Zvik [mailto:im...@inter.net.il]
Sent: Tuesday, March 17, 2009 10:16 AM
To: Jim Senicka
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] SFCFSRAC - node with the highest nodeid panics after node with the lowest nodeid rejoins

On Tuesday 17 March 2009 15:32:55 Jim Senicka wrote:
A few questions
1. Do you have a support case open?
Yes, for over two weeks.
2. Do you reconnect the FC before the node boots?
Yes, FC is reconnected immediately after the panic.
3. Is the network available during boot time?
Yes.

"GAB: port b is halting the system due to network failure" essentially means that VXFEN is connecting between two clusters with different generation numbers, which should only happen if the clusters booted independent of each other and were then joined at the network level.

This is weird. As you can see from the logs I've attached before, cluster nodes 1, 2 and 3 were members, and node 0 rejoined them.
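For context, the seed count Jim refers to is the -n argument in /etc/gabtab; with it set to the full node count, GAB refuses to seed until that many nodes are present, which prevents two halves of a cluster from forming independent generations. A sketch for the 4-node cluster in this thread:

```
# /etc/gabtab -- GAB waits for 4 nodes before seeding the cluster,
# so a partitioned subset cannot seed on its own and later collide
/sbin/gabconfig -c -n4
```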
Re: [Veritas-ha] Removing VCS group
No. It means you do not have to do that.

Sent from my Nokia E62 handheld by goodlink.

-Original Message-
From: i man [mailto:imanuk2...@googlemail.com]
Sent: Monday, February 02, 2009 08:07 AM US Mountain Standard Time
To: Jim Senicka
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] Removing VCS group

Jim, does it mean that I would need to do the same activity of removing disk group components and putting them into the spare pool on both parts of the cluster?

Ciao.

On Mon, Feb 2, 2009 at 2:15 PM, Jim Senicka james_seni...@symantec.com wrote:
The diskgroup is destroyed. All info about a VxVM diskgroup is in the dg, so no need to do anything else (no info is on the host). In straight failover VxVM, the only tie point between VCS and VxVM is the VCS agent that imports and deports specified diskgroups. VxVM has no knowledge of VCS, and VCS really only knows the name of a DG it is supposed to manage.

Sent from my Nokia E62 handheld by goodlink.

-Original Message-
From: i man [mailto:imanuk2...@googlemail.com]
Sent: Monday, February 02, 2009 06:18 AM US Mountain Standard Time
To: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] Removing VCS group

Thank you to all for your help. Now I have some queries regarding the cluster. I have imported and destroyed the diskgroup on one system of the cluster.

1. Do I have to do it on both systems of the cluster?

We have multipathing enabled on the systems.

# vxdmpadm listctlr all
CTLR-NAME   ENCLR-TYPE   STATE     ENCLR-NAME
=============================================
c7          EMC          ENABLED   MC0
c6          EMC          ENABLED   MC0
c0          Disk         ENABLED   Disk
c7          EMC          ENABLED   MC1
c6          EMC          ENABLED   MC1

I am still a little confused as to the integration of VxVM and VCS. Can somebody send me a link which shows how they are constructed together, so that I have a better understanding?

Ciao.

On Fri, Jan 30, 2009 at 11:29 AM, Jim Senicka james_seni...@symantec.com wrote:
Removal of the service group has zero effect on the storage.
You need to use appropriate VxVM commands to manage the disk group. The vxprint command is VxVM and has nothing to do with VCS. Removing the service group was fine. Now you need to complete the VxVM work.

Sent from my Nokia E62 handheld by goodlink.

-Original Message-
From: i man [mailto:imanuk2...@googlemail.com]
Sent: Friday, January 30, 2009 04:26 AM US Mountain Standard Time
To: veritas-ha@mailman.eng.auburn.edu
Subject: [Veritas-ha] Removing VCS group

All,

I think I'm in a bit of trouble. I'm trying to remove a cluster service group which has a Veritas disk group configured. My task is to free up the disks used by the removed SG and DG and move them to the free pool. From the cluster GUI I have removed the resources and the SG. Some questions regarding the same...

1. Did the Veritas disk group get deleted automatically when I removed the cluster components?
2. I cannot see any service group through the vxprint command now.
3. How can I now move the disks from the service group pool?

Ciao.
Re: [Veritas-ha] VCS Configuration 1
Comments below with JS.

From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of i man
Sent: Wednesday, January 21, 2009 2:36 PM
To: veritas-ha@mailman.eng.auburn.edu
Subject: [Veritas-ha] VCS Configuration 1

All,

I am trying to configure a VCS resource and have some confusion regarding the points below.

1) I have created the online, offline, monitor, and clean scripts. Can anybody explain to me how these scripts are called by VCS?

JS: They are called when a resource of that type is configured and needs to make specific state changes. Once you have your type definition, and an ABRAAgent in the ABRA directory, you then need to create a resource of that type in the main.cf.

I know I have defined the .cf file for the application, but it seems they are not called from the ArgList parameter. I cross-checked with other running applications as well, and it seems they are not called anywhere. Does ArgList call these scripts by default? I mean, I was expecting these scripts to be called from some configuration file.

JS: Huh? Sorry. The entry point scripts for a resource type are only called when a resource of that type needs to be controlled. So unless you have a resource of type Oracle defined in a service group, the Oracle entry points are not called. For that matter, the OracleAgent is not even started.

My .cf file looks like below:

# more ABRA.cf
type ABRA (
    static int RestartLimit = 2
    static str ArgList[] = { Vandalhome, stupiduser }
    str Vandalhome
    str stupiduser
)

2) One of the reasons for the above question is that I think my applications do not have a proper clean procedure, as the clean script does not make sense to me. I just wanted to check whether clean is implemented on the system or not.

JS: You would have had to have implemented the clean procedure.

3) What agent attribute would be best suited for the resource to wait for a specific interval of time before starting the procedure?
JS: The numerical value returned by online or offline sets the number of seconds before monitor is called.

I'm toying with the following attributes:

OnlineWaitLimit: But the SADG says "Number of monitor intervals to wait after completing the online procedure, and before the resource becomes online." My requirement is to wait before starting the online procedure.
OnlineTimeout: Not convinced by this one either.
ConfInterval: Had seen the implementation of this parameter in main.cmd, so not happy about it.

Ciao
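To illustrate the convention JS describes for point 3 - the numeric value returned by the online entry point is the number of seconds the agent framework waits before calling monitor - here is a minimal, hypothetical online script for the ABRA type. The function name, paths, and the 10-second value are illustrative, not taken from the poster's configuration:

```shell
#!/bin/sh
# Sketch of an online entry point for a hypothetical "ABRA" resource type,
# wrapped in a function so the return-value convention can be exercised.
# Arguments follow the ArgList order from the type definition above:
#   $1 = resource name, $2 = Vandalhome, $3 = stupiduser
abra_online() {
    # start the application here, e.g. (placeholder, commented out):
    # su - "$3" -c "$2/bin/start"

    # The return value is the number of seconds the agent waits before
    # the first monitor cycle -- useful when the app needs time to come up.
    return 10
}

abra_online my_abra_res /opt/vandal appuser
echo "wait=$?"
```

So an online script ending in `exit 10` tells the agent to hold off monitoring for 10 seconds, which is the closest VCS mechanism to "wait before declaring the resource online"; there is no attribute that delays the start of the online procedure itself.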
Re: [Veritas-ha] Metro/Global cluster solution options
Talk with your Symantec rep? The System Engineer can easily come in and discuss how VCS can manage your DR automation needs.

From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of rajesh Kharya (rkharya)
Sent: Wednesday, December 24, 2008 5:15 AM
To: veritas-ha@mailman.eng.auburn.edu
Subject: [Veritas-ha] Metro/Global cluster solution options

Hi,

We are evaluating a possible clustering solution for a project where the entire application environment will be hosted in 2 data centers, some 50 miles apart. The application environment will be identical in both DCs and will be accessed via Global Site Selector/Application Control Engines at the network layer. At the very back end we have a requirement for a 2-node cluster in each data center, preferably on Linux. Within a DC, one node will be active while the other will be passive. Storage will be configured as mounted file systems.

We need to know in what way VCS can help with:

A) Data replication between the clusters in the 2 DCs, assuming the two clusters work independently.
B) Is there a possibility of having all nodes (1-4) be part of a single cluster, where they are separated by 50 miles and have common storage between them (possibly a CFS implementation)? Node1/3 remain active while Node2/4 are standbys.

Site A        Site B
------        ------
Node1         Node3
  |             |
Node2         Node4

Any pointers to references/documentation appreciated.

Thanks,
~ Rajesh.
Re: [Veritas-ha] SRDF agent for cascaded SRDF in global cluster
The SRDF replication control agent for VCS HA/DR does not currently support cascaded SRDF; it only supports STAR. We are looking at adding cascaded SRDF, but there is no official support at this time and no committed date for cascade support. Speak with your Symantec rep?

From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Pavel A Tsvetkov
Sent: Tuesday, December 23, 2008 6:09 AM
To: veritas-ha@mailman.eng.auburn.edu
Subject: [Veritas-ha] SRDF agent for cascaded SRDF in global cluster

Hello all!

Just a small question. The new SRDF agent 5.0.0.4 has support for SRDF STAR. This is a good thing. But what about cascaded SRDF for Symmetrix DMX version 5773 and above? If we use R1 and R12 on one site (Replicated Data Cluster, R1-R12) and R2 on another site in a global cluster (R12-R2), is it possible to use the SRDF VCS agent in that case? I don't see any reason why this agent cannot be used.

Kind regards,
Pavel Tsvetkov
Re: [Veritas-ha] Question about HA and disks
In the original message: "We had an issue where a serverA failed and serverB took over. However, serverB took over when serverA was still 'crashing' (it took a good 10-15 mins to crash)."

I can assume crash = panic, as "crashing" has to refer to dumping core to disk. If this is the case, there will be no logs on server A, as it is mid-panic. In this case (the node is in the middle of a crash dump), it will not be writing to data disks. Whatever was written happened before the kernel call to panic. Fencing will protect that data once the new node imports, but in the case described here, the corruption had to happen before the panic, so fencing would not have helped. Bottom line: the node ceased writing as soon as the non-maskable interrupt was called for panic (unless Linux somehow violates every Unix kernel rule, which I seriously doubt). When VCS took over the service group on Server B, Server A was down and could not have been writing.

-Original Message-
From: Jon E Price/SYS/NYTIMES [mailto:[EMAIL PROTECTED]]
Sent: Monday, October 27, 2008 8:14 PM
To: Jim Senicka; Andrey Dmitriev; Joshua Fielden; veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] Question about HA and disks

Hi,

A few questions...

Andrey: Could you post the logs (or even portions of them) which show what ServerA was doing during the takeover?

Joshua: You're saying that I/O fencing can prevent split-brain situations in which one server is still writing to a filesystem while a second server has taken over that same service group and begun writing to the same fs, thus possibly causing corruption?
http://sfdoccentral.symantec.com/sf/5.0/linux/html/vcs_install/ch_vcs_install_iofence.html#190559

Jim: What's the evidence that the server panicked? And is 16 seconds the default for the heartbeat failure?
Jon

-Original Message-
From: Jim Senicka [EMAIL PROTECTED]
Sent by: veritas-ha-bounce[EMAIL PROTECTED]
To: Andrey Dmitriev [EMAIL PROTECTED], veritas-ha@mailman.eng.auburn.edu
Date: 10/27/2008 07:19 PM
Subject: Re: [Veritas-ha] Question about HA and disks

When a server panics, it stops writing to anything but the dump device. VCS did exactly as designed: 16 seconds after heartbeat failure it started the takeover. Whatever was damaged on your file system was already damaged at that point, regardless of how long it took to dump core to the dump device. I would look at the cause of the panic; it is likely it had something to do with whatever garbaged your FS.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Andrey Dmitriev
Sent: Monday, October 27, 2008 2:01 PM
To: veritas-ha@mailman.eng.auburn.edu
Subject: [Veritas-ha] Question about HA and disks

We had an issue where a serverA failed and serverB took over. However, serverB took over when serverA was still 'crashing' (it took a good 10-15 mins to crash), and apparently still had a hold of the file systems (system logs confirm that takeover occurred while serverA was still 'puking'). The file systems on ServerB came up corrupt, and we lost some data because of that. HA is set up via heartbeats. The file system is VxFS, the OS is RedHat 4.0. Is there any way to avoid that?

Thanks,
Andrey
Re: [Veritas-ha] Question about HA and disks
While I think fencing is always the right choice, I still think this was a system issue. The system stopped heartbeating for 16 seconds, plus the 5-second GAB stable timeout. At that point, VCS failed over. Fencing would not have been in play until the import on the second node, so if the corruption happened during those 21 seconds, it would not have helped. If there is a case where the node is nearly dead for an extended period of time, not capable of kernel-level heartbeat from LLT, but still writing to disk, then by all means you need I/O fencing to protect you from the OS.

-Original Message-
From: Brad Boyer
Sent: Monday, October 27, 2008 8:57 PM
To: Jim Senicka; Jon E Price/SYS/NYTIMES; Andrey Dmitriev; Joshua Fielden; veritas-ha@mailman.eng.auburn.edu
Subject: RE: [Veritas-ha] Question about HA and disks

Based on the original description, I would presume that the system did not actually panic immediately. I've seen Linux systems oops without immediate panics many times. I would make no assumption about what the dying system was doing in this case without real evidence, especially not that it actually got as far as a panic. Linux is not UNIX (it's just unofficially POSIX compliant), and you shouldn't assume that Linux will act like UNIX (it definitely acts differently in quite a few ways). Seeing as this is RHEL4, this system probably isn't even capable of taking a crash dump, and thus would be unlikely to be spending time writing a crash dump as opposed to doing some damage to the data on disk. Even with the current Red Hat release (RHEL5), crash dumps aren't enabled by default. My suggestion is that using I/O fencing would be the right answer here.
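Since both replies in this thread land on I/O fencing as the protection mechanism, a sketch of what enabling SCSI-3 based fencing involves may be useful. The coordinator disk group name is an example; the file names and keys are the standard VCS fencing configuration:

```
# /etc/vxfendg -- names the coordinator disk group
# (example name; must be a dg of 3 SCSI-3 PR capable disks)
vxfencoorddg

# /etc/vxfenmode -- select SCSI-3 persistent-reservation fencing
vxfen_mode=scsi3
scsi3_disk_policy=dmp
```

With these in place and `UseFence = SCSI3` set in the cluster definition in main.cf, a node evicted from GAB membership loses its registrations on the data disks at import time, so a half-dead node physically cannot keep writing after takeover.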
Re: [Veritas-ha] VCS 5.0 MP1: issue probing disk-group !?
Can you cut and paste the main.cf for the service group in question?

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Pascal Grostabussiat
Sent: Tuesday, October 21, 2008 9:11 AM
Cc: Veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] VCS 5.0 MP1: issue probing disk-group !?

To Jim, Scott and Gene.

Jim Senicka wrote:
Is the disk group agent running on the systems?

Yes it is:
root 16295 1 0 16:16:01 ? 1:29 /opt/VRTSvcs/bin/DiskGroup/DiskGroupAgent -type DiskGroup

Has the cluster been started since you created the service group definition?

Yes. I restarted VCS hoping it might somehow change something, but no. I am thinking about rebooting one server.

Are all resources enabled in the service groups?

Yes. I tried disabling and re-enabling them, but I come back to the same situation.

Scott3, James wrote:
Have you made sure the volumes are ENABLED ACTIVE? Can you send a vxprint on the group? Is it a shared group or an active/passive group? Also send a vxdg list.

Enabled and active, yes. The disk group is active/passive (to be mounted on one host at a time).
bash-3.00# vxprint -l dba_DG
Disk group: dba_DG

Group:          dba_DG
info:           dgid=1224062934.119.hostname
version:        140
alignment:      8192 (bytes)
detach-policy:  global
dg-fail-policy: dgdisable
copies:         nconfig=default nlog=default
devices:        max=32767 cur=3
minors:         >= 62000
cds=on

bash-3.00# vxprint -g dba_DG
TY NAME            ASSOC                    KSTATE   LENGTH     PLOFFS  STATE   TUTIL0  PUTIL0
dg dba_DG          dba_DG                   -        -          -       -       -       -
dm dba_DG01        c0t216000C0FF87E774d10s2 -        525417536  -       -       -       -
v  dba_archive     fsgen                    ENABLED  20971520   -       ACTIVE  -       -
pl dba_archive-01  dba_archive              ENABLED  20971520   -       ACTIVE  -       -
sd dba_DG01-02     dba_archive-01           ENABLED  20971520   0       -       -       -
v  dba_data        fsgen                    ENABLED  104857600  -       ACTIVE  -       -
pl dba_data-01     dba_data                 ENABLED  104857600  -       ACTIVE  -       -
sd dba_DG01-03     dba_data-01              ENABLED  104857600  0       -       -       -
v  dba_redo        fsgen                    ENABLED  20971520   -       ACTIVE  -       -
pl dba_redo-01     dba_redo                 ENABLED  20971520   -       ACTIVE  -       -
sd dba_DG01-01     dba_redo-01              ENABLED  20971520   0       -       -       -

bash-3.00# vxdg list
NAME     STATE         ID
xxx_DG   enabled,cds   1224062531.89.hostname
xxx_DG   enabled,cds   1224062634.101.hostname
xxx_DG   enabled,cds   1224062699.109.hostname
dba_DG   enabled,cds   1224062934.119.hostname
xxx_DG   enabled,cds   1224062443.81.hostname
xxx_DG   enabled,cds   1224062569.93.hostname
xxx_DG   enabled,cds   1224062672.105.hostname
xxx_DG   enabled,cds   1224062491.85.hostname

Gene Henriksen wrote:
If you have a ? in the GUI, then it cannot probe the resource on one system or the other. It will not import on either until it is probed on both. This is to avoid a concurrency violation.

Yes, fully agree.

Hold the cursor on the resource and a pop-up box should show the status so you can see where it is not probed.

Status is unknown on both server A and B.

This could be due to one system having never seen the DG. Can you run vxdisk -o alldgs list and see the DG on both systems?
I can import/deport that disk group using vxdg without a problem.

bash-3.00# vxdisk -o alldgs list
DEVICE                    TYPE          DISK      GROUP     STATUS
c0t216000C0FF87E774d0s2   auto:none     -         -         online invalid
c0t216000C0FF87E774d1s2   auto:cdsdisk  xxx_DG01  xxx_DG    online
c0t216000C0FF87E774d2s2   auto:cdsdisk  xxx_DG01  xxx_DG    online
c0t216000C0FF87E774d3s2   auto:cdsdisk  xxx_DG01  xxx_DG    online
c0t216000C0FF87E774d4s2   auto:cdsdisk  xxx_DG01  xxx_DG    online
c0t216000C0FF87E774d5s2   auto:cdsdisk  -         (xxx_DG)  online
c0t216000C0FF87E774d6s2   auto:cdsdisk  xxx_DG01  xxx_DG    online
c0t216000C0FF87E774d7s2   auto:cdsdisk  xxx_DG01  xxx_DG    online
c0t216000C0FF87E774d8s2   auto:cdsdisk  xxx_DG01  xxx_DG    online
c0t216000C0FF87E774d9s2   auto:cdsdisk  -         (xxx_DG)  online
c0t216000C0FF87E774d10s2  auto:cdsdisk  dba_DG01  dba_DG    online
c2t0d0s2                  auto:none     -         -         online invalid
c2t2d0s2                  auto:none     -         -         online invalid
c2t3d0s2                  auto:none     -         -         online invalid

The other possibility is a typo in the DiskGroup resource attribute. Make sure it has no leading spaces and is the correct case (just as vxdisk list shows it). I thought about this and double-checked. Nothing. I recreated the resource and paid attention to that possibility, nothing. Regards, /Pascal ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
Re: [Veritas-ha] VCS 5.0 MP1: issue probing disk-group !?
Is the disk group agent running on the systems? Has the cluster been started since you created the service group definition? Are all resources enabled in the service groups? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pascal Grostabussiat Sent: Tuesday, October 21, 2008 7:09 AM To: Veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] VCS 5.0 MP1: issue probing disk-group !? Hi, I have been experiencing a weird issue since yesterday and I cannot get it solved by surfing and checking around. So I hope to get a hint using the mailing list. Our sysadmin recently installed a system with two Sun SPARC servers for me, with VxVM, VxFS and VCS. In short I have VERITAS Foundation 5.0 with MP1. DESC: Veritas Cluster Server by Symantec PSTAMP: Veritas-5.0MP1-11/29/06-17:15:00 DESC: Virtual Disk Subsystem PSTAMP: Veritas-5.0-MP1.26:2007-02-28 DESC: Commercial File System PSTAMP: VERITAS-FS-5.0.1.0-2007-01-17-5.0MP1=123202-02 Now I have an issue with all disk groups, for example dba_DG. Using the command line or the VERITAS Enterprise Administrator I can import/deport the disk group, mount the corresponding volumes and create file systems on them. No issue there. Now I go to the VERITAS Cluster Administrator, where our sysadmin had already created resources for the disk groups. However, I cannot bring any of them online because the GUI keeps telling me that the resource has not been probed on the system (I have two systems; I tried to online on A and B, but same behavior). I deleted the resource, created a new one, same issue. I still have a ? mark on the resource. Issuing a probe does not solve anything. I checked the engine_A.log and can see that the probe was fired, but nothing more. I can run hares -probe dba_DG -sys A and I get the prompt back, nothing else appears !? I am puzzled ! Any idea ? Any known issue ? Many thanks in advance.
Regards, /Pascal ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
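As a sketch, the probe state in a case like this is usually chased down from the CLI roughly as follows. The resource and system names come from the thread; the commands are standard VCS CLI, but treat the exact sequence as illustrative rather than a support-blessed procedure.

```shell
# Where exactly is the resource not probed? (Probed is a per-system attribute)
hares -display dba_DG -attribute Probed

# Re-fire the probe on one node, then watch the engine log for the result
hares -probe dba_DG -sys A
tail -f /var/VRTSvcs/log/engine_A.log

# Is the DiskGroup agent itself healthy on each node?
haagent -display DiskGroup
```

If the agent shows as not running or faulted on one node, restarting it with haagent -stop/-start on that node is typically the next step.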
Re: [Veritas-ha] IPMultiNICB, mpathd and network outages
I would be more concerned about future failures being handled properly. If you were able to take out all networks from all nodes at same time, you have a SPOF. If this was a one time maintenance upgrade to your network gear and not a normal event, setting VCS to not respond to network events means that future cable or port issues will not be handled. If it is a common occurrence for all networks to be lost, perhaps you need to address the network issues :-) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of DeMontier, Frank Sent: Monday, October 20, 2008 11:10 AM To: Paul Robertson; veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] IPMultiNICB, mpathd and network outages FaultPropagation=0 should do it. Buddy DeMontier State Street Global Advisors Infrastructure Technical Services Boston Ma 02111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Robertson Sent: Monday, October 20, 2008 10:37 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] IPMultiNICB, mpathd and network outages We recently experienced a Cisco network issue which prevented all nodes in that subnet from accessing the default gateway for about a minute. The Solaris nodes which run probe-based IPMP reported that all interfaces had failed because they were unable to ping the default gateway; however, they came back within seconds once the network issue was resolved. Fine. Unfortunately, our VCS nodes initiated an offline of the service group after the IPMultiNICB resources detected the IPMP fault. Since the service group offline/online takes several minutes, the outage on these nodes was more painful. Furthermore, since the peer cluster nodes in the same subnet were also experiencing the same mpathd fault, there would have been little advantage to failing over the service group to another node. 
We would like to find a way to configure VCS so that the service group does not offline (and any dependent resources within the service group are not offlined) in the event of an mpathd (i.e. IPMultiNICB) failure. In looking through the documentation, it seems that the closest we can come is to increase the IPMultiNICB ToleranceLimit from 1 to a huge value: # hatype -modify IPMultiNICB ToleranceLimit This should achieve our desired goal, but I can't help thinking that it's an ugly hack, and that there must be a better way. Any suggestions are appreciated. Cheers, Paul

P.S. A snippet of the main.cf file is listed below:

group multinicbsg (
    SystemList = { app04 = 1, app05 = 2 }
    Parallel = 1
    )

    MultiNICB multinicb (
        UseMpathd = 1
        MpathdCommand = "/usr/lib/inet/in.mpathd -a"
        Device = { ce0 = 0, ce4 = 2 }
        DefaultRouter = "192.168.9.1"
        )

    Phantom phantomb (
        )

    phantomb requires multinicb

group app_grp (
    SystemList = { app04 = 0, app05 = 0 }
    )

    IPMultiNICB app_ip (
        BaseResName = multinicb
        Address = "192.168.9.34"
        NetMask = "255.255.255.0"
        )

    Proxy appmnic_proxy (
        TargetResName = multinicb
        )

    // various other resources, including some that depend on app_ip,
    // excluded for brevity

    app_ip requires appmnic_proxy

___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
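The two options discussed in this thread (raising ToleranceLimit, and Buddy's FaultPropagation=0 suggestion) would be applied roughly like this. This is a sketch only: the value 120 is an arbitrary "huge" illustration, not a recommendation from the thread.

```shell
# Run on one node with VCS up; changes are cluster-wide.
haconf -makerw                                   # open the config read-write

# Option 1: let the IPMultiNICB monitor fail many times before faulting
hatype -modify IPMultiNICB ToleranceLimit 120    # 120 is an arbitrary example

# Option 2 (Buddy's suggestion): stop VCS from propagating the fault
# up the dependency tree for this group (group-level attribute)
hagrp -modify app_grp FaultPropagation 0

haconf -dump -makero                             # save and close the config
```

As Jim notes in the reply, either setting trades off future fault handling: a real, lasting NIC failure will also be tolerated or ignored.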
Re: [Veritas-ha] Server crashes but VCS doesn't detect it
If power cycle fixed it, it was still heartbeating on LLT. Sent from my Nokia E62 handheld by goodlink. -Original Message- From: Andrey Dmitriev [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 17, 2008 03:33 PM US Mountain Standard Time To: veritas-ha@mailman.eng.auburn.edu Cc: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Server crashes but VCS doesn't detect it Had sort of a weird case today. We had a server failure: lost network, console was being filled with some sort of crash info. The cluster, however, showed everything online. We also had netdump configured (Linux), but that couldn't work b/c the network was down. The customer is unhappy that it didn't fail over. Can anyone think of a reason, or of how I can prevent something similar in the future? I sort of suspect LLT might still have been up somewhat. It wasn't until we power-cycled the box that the other nodes detected it was down. -andrey ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
Re: [Veritas-ha] vxfencing 2 nodes
For any cluster larger than 1 node, I/O fencing is highly recommended to protect data integrity in the event of a split brain. 2 nodes is not in any way more resistant to split brain than 3 nodes or more. VCS does not use any form of quorum-based membership (quorum has a number of its own ugly issues), so there is no difference in how our membership works when you have 2, 3, or 32 nodes. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shashi Kanth Boddula Sent: Thursday, June 12, 2008 7:30 AM To: Mayank Vasa Cc: veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] vxfencing 2 nodes Ok, thanks for the clarification. I have seen documentation for many clustering products which says that fencing/quorum is optional/not required for clusters of more than 2 nodes, and which says that there is very little chance of a split-brain condition occurring in clusters of more than 2 nodes. -- Shashi Mayank Vasa wrote: Shashi: The number of nodes is not a decision-making factor for fencing. For a cluster of 2 nodes or more, fencing helps to protect your data in the case of a split-brain scenario. SFRAC requires fencing. It is not supported without it. Regards, + Mayank -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shashi Kanth Boddula Sent: Wednesday, June 11, 2008 12:23 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] vxfencing 2 nodes Is vxfencing required if we go for a >=3 node cluster? Or is vxfencing optional for a >=3 node cluster? I am going for a 4-node VCS5 SFRAC; is vxfencing still required for me? Do all VCS5 SFRAC modules work properly without vxfencing?
___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
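For anyone verifying whether fencing is actually configured and running on an existing SFRAC cluster, a quick check looks roughly like this (standard fencing utilities; output details vary by version):

```shell
# Fencing mode (e.g. SCSI3) and current cluster membership as the
# fence driver sees it
vxfenadm -d

# Name of the coordinator disk group the fence driver was pointed at
cat /etc/vxfendg

# GAB port memberships: port a = GAB, port b = fence driver, port h = had.
# A missing port b membership means fencing is not up on that node.
gabconfig -a
```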
Re: [Veritas-ha] Veritas Cluster Server
Have you opened a support case? To the best of my knowledge, VCS 4.1 does not support RHEL 5. Support can confirm. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Goutham N Sent: Thursday, June 05, 2008 8:37 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Veritas Cluster Server Hi, I am installing Veritas Cluster Server 4.1 in a Red Hat Linux 5.x environment. I am getting the following error message. Can anyone help with a solution?

Cluster Server configured successfully.
Starting Cluster Server:
Starting LLT on usplselux141 ... /etc/init.d/llt start 2>&1 exit=256
Starting LLT:
LLT: loading module...
LLT: Error: cannot find compatible module binary
/sbin/lltconfig 2>&1 exit=256
LLT lltconfig ERROR V-14-2-15000 open /dev/llt failed: No such file or directory
Error
CPI ERROR V-9-120-1171 Could not start LLT on usplselux141: LLT lltconfig ERROR V-14-2-15000 open /dev/llt failed: No such file or directory

The installvcs log is saved at: /opt/VRTS/install/logs/installvcs605084118.log -- N. Gowthaman ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
Re: [Veritas-ha] VCS with replicated storage
You are attempting to build what is called a Replicated Data Cluster. This should be documented in the UG as I recall. You will use identical DG and volume resources, with the appropriate replication-management resource under the DG. To do this and comply with the EULA, you need the HA/DR Edition of VCS, which licenses you to use the replication agents. In an RDC, the replication agent manages read/write enabling and the direction of replication. When you fail over, the opposite node is write-enabled, then the normal DG and volume agents bring up the storage. Hugh Shannon here at Symantec is the Technical Product Manager responsible for these types of configs. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Esson, Paul Sent: Thursday, June 05, 2008 9:40 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] VCS with replicated storage Folks, My background with VCS is limited to local clusters with shared storage arrays and software mirroring of volumes using VxVM. I have been asked to implement a VCS 5.0 cluster on Solaris 10 using replicated block-level NetApp storage. This will be a stretched cluster with one node on each of two sites and heartbeat connections using VLANs. What I am struggling with at the moment is how to configure the storage resources within VCS. I am used to defining shared volume groups/volumes, but as I see it each node will effectively have a local LUN or LUNs, with blocks being replicated at the array level from the active to the inactive node. Do I create separate Volume Groups and Volumes on each node and set the associated attributes on a per-system basis such that failover starts the application up, mounting the file system on the replica volume of the alternative node?
Regards Paul Esson Redstor Limited Direct: +44 (0) 1224 595381 Mobile: +44 (0) 7766 906514 E-Mail: [EMAIL PROTECTED] Web:www.redstor.com REDSTOR LIMITED Torridon House 73-75 Regent Quay Aberdeen UK AB11 5AR Disclaimer: The information included in this e-mail is of a confidential nature and is intended only for the addressee. If you are not the intended addressee, any disclosure, copying or distribution by you is prohibited and may be unlawful. Disclosure to any party other than the addressee, whether inadvertent or otherwise is not intended to waive privilege or confidentiality. ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
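A minimal sketch of the storage stack Jim describes, in main.cf style. The resource type "NetAppReplication" below is a stand-in for whatever NetApp replication agent ships with the HA/DR edition; all names and attributes here are hypothetical illustrations of the dependency shape, not the real agent's interface.

```
DiskGroup app_dg (
    DiskGroup = appdg
    )

// Hypothetical replication-management resource: on failover it
// write-enables the local replica and reverses replication direction
// before the DG/volume/mount resources come up above it.
NetAppReplication app_repl (
    // attributes depend on the actual agent shipped with HA/DR
    )

Volume app_vol (
    Volume = appvol
    DiskGroup = appdg
    )

Mount app_mnt (
    MountPoint = "/app"
    BlockDevice = "/dev/vx/dsk/appdg/appvol"
    FSType = vxfs
    FsckOpt = "-y"
    )

app_dg requires app_repl
app_vol requires app_dg
app_mnt requires app_vol
```

The point of the shape: identical DG/volume/mount resources on both sides, with the replication resource at the bottom of the tree so it runs first on failover.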
Re: [Veritas-ha] .stale file
Right. But that can also be done via CLI or GUI with the cluster running. From: i man [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 03, 2008 9:48 AM To: Jim Senicka Cc: Gene Henriksen; veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] .stale file Jim, This is to update systems with some new service groups. This is not on a single system but rather a large number of systems (100+). Also, so many thanks to Gene and John for resolving my doubts. Ciao, On Tue, Jun 3, 2008 at 2:30 PM, Jim Senicka [EMAIL PROTECTED] wrote: Bigger question is what are you routinely using stop -force to accomplish? From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gene Henriksen Sent: Tuesday, June 03, 2008 8:17 AM To: i man; veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] .stale file It indicates you did not close and save the cluster configuration after making modifications. It is a warning. If you close and save the config, it goes away. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of i man Sent: Tuesday, June 03, 2008 7:28 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] .stale file All, Had some queries regarding the .stale file present in the /etc/VRTSvcs/conf/config directory. I know that if the ha agents are restarted with hastop -all -force and this file is present, the cluster members could be in a stale admin wait state. I have been deleting this file, then hastop -all -force and then hastart on the nodes. I do not want the service groups to go offline, that's why -force. My query is: what is the use of .stale? Would hastart -force help to get nodes back if this file is present? Is file deletion the only method to get the nodes back? I noticed recently that when getting the cluster back this way, my clusters lose the information about the admin password. I think I'm doing something wrong. Any help? Ciao. ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
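The .stale lifecycle Gene describes can be sketched as a command sequence. This is illustrative; <node> is a placeholder for an actual system name.

```shell
# .stale is created when the in-memory config is opened read-write:
haconf -makerw            # open config; VCS drops .stale as a marker

# ... make changes with hares / hagrp / hatype ...

# Saving and closing the config writes main.cf and removes .stale:
haconf -dump -makero

# If nodes come up in STALE_ADMIN_WAIT because .stale was left behind,
# force the cluster to build its config from one node's local main.cf
# instead of deleting the file by hand:
hasys -force <node>
```

In other words, .stale is a "config was modified but never saved" flag, and haconf -dump -makero is the supported way to clear it.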
Re: [Veritas-ha] .stale file
you only need one notifier, usually in the CSG. No need for a proxy anywhere else. From: i man [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 03, 2008 12:00 PM To: Gene Henriksen Cc: John Cronin; Jim Senicka; veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] .stale file Gene, John, Jim, That's excellent. So many thanks again for the new ideas. There is one last query regarding the whole activity, concerning the use of a Proxy for the notifier. Nobody has been able to tell me definitely whether this is required for the notifier. If I create my notifier in the ClusterService group or any other service group, does it require a proxy to send alerts? If so, and if I create the notifier in a separate service group, is it fine to create the proxy in the ClusterService group? Having gone through the BARG, there are sample examples which explain the notifier's dependency on a proxy, but even without the proxy things seem to be working fine for me on a test system. Also, when installing through the GUI it does ask about some NIC card information, a step which I always skipped; I don't know how relevant this is for the creation and working of the notifier. Ciao On Tue, Jun 3, 2008 at 4:22 PM, Gene Henriksen [EMAIL PROTECTED] wrote: Putting the Notifier in the cluster service group also has an advantage because the CSG is the first SG up and the hardest to kill; therefore, in times of lots of problems you will get notification, whereas if the service group you arbitrarily chose to use is faulted on all systems in the cluster, then notification is also down. You could create the CSG on one system, save the configuration, run hacf -cftocmd . in the /etc/VRTSvcs/conf/config directory, then edit the main.cmd (look toward the bottom) to find the commands to create the CSG and Notifier, make a script, and modify it to run on other clusters.
From: John Cronin [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 03, 2008 10:45 AM To: i man Cc: Jim Senicka; Gene Henriksen; veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] .stale file It would be no problem to create a Notifier resource in any arbitrary service group with the CLI. If I understand this correctly, what you are doing is shutting down VCS and then editing main.cf to change the config? If this were for one or two clusters, it might be an OK way to do it, but if this is for hundreds of systems, it would be better to learn how to use the CLI and then script the changes. Also, what is the problem with putting the notifier in the ClusterService group? I can't see how putting it in another service group would provide you any particular benefit - the Notifier is going to do the same things no matter which service group it is in. Since it is a cluster-wide service, it makes sense that it should be in the ClusterService group. As for using hastop -all -force, I tend to use it frequently on production systems when I am doing something that requires stopping the cluster but does not require stopping the systems or the services running on those systems (e.g. patching or upgrading VCS, or reconfiguring GAB or LLT). However, I would not do this to accomplish something that can be done with CLI commands. -- John Cronin On 6/3/08, i man [EMAIL PROTECTED] wrote: Correct Jim, If this had been a normal cluster service group I would have loved to do that. What I'm trying to obtain is the creation of an SNMP notifier in a separate service group. Through the GUI you cannot create it in your own service group; you can only create it as part of the ClusterService group. Not sure if this is achievable through the CLI. Any suggestions? On Tue, Jun 3, 2008 at 2:52 PM, Jim Senicka [EMAIL PROTECTED] wrote: Right. But that can also be done via CLI or GUI with the cluster running.
Re: [Veritas-ha] Importance of NIC Proxy in Clusterservice group
You should be monitoring the NIC in some service group on the box. A NIC Proxy is used to prevent duplicate monitoring by other service groups. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of i man Sent: Monday, June 02, 2008 12:13 PM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Importance of NIC Proxy in Clusterservice group All, Can anybody let me know why a NIC proxy is required in the ClusterService group? Also, is it necessary to create a NIC proxy in the ClusterService group for an SNMP notifier which is created in a separate service group? Ciao. ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
Re: [Veritas-ha] question about hastop
Hastop -force -all does not take down resources. But why not add the resources online? Hastop -force -all is really only used for heavy lifting, like upgrading VCS bits. You can add the resources on the fly using the CLI or GUI. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paveza, Gary Sent: Friday, May 23, 2008 10:13 AM To: 'Veritas HA' Subject: [Veritas-ha] question about hastop I currently have a Veritas Cluster for RAC which is really only responsible for mounting the filesystems for the cluster. The database start/stop and CSSD are handled via system startup scripts. I need to modify the main.cf file to add a resource for Networker. If I issue the hastop -all -force command (as outlined in the Networker manual), will this shut down the cluster and make the filesystems unmount? Or will everything remain up and running? - Gary Paveza, Jr. AIG - Personal Lines Division Technical Specialist - Architecture - HP CSE, SCSA (302) 252-4831 - phone ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
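Adding the resource on the fly, as Jim suggests, would look roughly like this. This is a sketch: "nw_server", the group name "rac_grp", and the Networker init-script paths are illustrative placeholders, and the bundled Application agent is used here only as a generic example type.

```shell
# Sketch: add a resource while the cluster stays up (no hastop needed).
haconf -makerw

# Add a generic Application resource to an existing group
# ("rac_grp" is an assumed group name -- substitute your own)
hares -add nw_server Application rac_grp
hares -modify nw_server StartProgram "/etc/init.d/networker start"
hares -modify nw_server StopProgram  "/etc/init.d/networker stop"
hares -modify nw_server Enabled 1

haconf -dump -makero      # save; no service group is taken offline
```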
Re: [Veritas-ha] Veritas Cluster Server 5.0 available for RHEL 5.x
5.0MP3 will add RHEL 5 support. Talk with your rep on release dates. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tom Stephens Sent: Thursday, May 22, 2008 11:48 AM To: Goutham N; veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] Veritas Cluster Server 5.0 available for RHEL 5.x Not according to the release notes for the product. These can be found at: ftp://exftpp.symantec.com/pub/support/products/ClusterServer_UNIX/283850.pdf (For Linux 5.0) ftp://exftpp.symantec.com/pub/support/products/ClusterServer_UNIX/287175.pdf (For Linux 5.0 MP1) ftp://exftpp.symantec.com/pub/support/products/ClusterServer_UNIX/289442.pdf (For Linux 5.0 MP2). Tom From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Goutham N Sent: Thursday, May 22, 2008 1:40 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Veritas Cluster Server 5.0 available for RHEL 5.x Is Veritas Cluster Server 5.0 available for RHEL Version 5 ? -- N. Gowthaman ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
Re: [Veritas-ha] bundled HP-UX vxfm/vxfs
With SFRAC already installed, shouldn't you have VCS already installed? Sent from my Nokia E62 handheld by goodlink. -Original Message- From: Shashi Kanth Boddula [mailto:[EMAIL PROTECTED] Sent: Friday, April 18, 2008 02:12 AM US Mountain Standard Time To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] bundled HP-UX vxfm/vxfs I used to get the below message whenever I install VCS: SFRAC version 4.1 includes VRTSvxvm version 4.1.010. A more recent version of VRTSvxvm, 4.1.011, is already installed. CPI WARNING V-9-10-1400 In this situation VRTSvxvm version 4.1.011 will not be installed or downgraded. SFRAC version 4.1 may not operate correctly with this more recent package. The VRTSvxvm package must be removed manually before version 4.1.010 can be installed. SFRAC version 4.1 includes VRTSvxfs version 4.1. A more recent version of VRTSvxfs, 4.1.001, is already installed. CPI WARNING V-9-10-1400 In this situation VRTSvxfs version 4.1.001 will not be installed or downgraded. SFRAC version 4.1 may not operate correctly with this more recent package. The VRTSvxfs package must be removed manually before version 4.1 can be installed. Are there any known issues/problems if we proceed to install VCS without removing the operating-system-bundled VxVM/VxFS, and continue to install VCS with the operating-system-bundled VxVM/VxFS (not the VCS-bundled VxVM/VxFS)? Or can we simply ignore this message, and the cluster will operate normally without any problems? ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
Re: [Veritas-ha] Coordinator disks
No. A. It must be an odd number (otherwise no majority is possible). B. You cannot add online. You will need to bounce the cluster (or at least the fence driver) to move to the new array. Sent from my Nokia E62 handheld by goodlink. -Original Message- From: Rongsheng Fang [mailto:[EMAIL PROTECTED] Sent: Friday, April 11, 2008 04:09 PM US Mountain Standard Time To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Coordinator disks Hi, Does anybody know how many coordinator disks the coordinator disk group can have? The VCS installation guide says 3, but doesn't say if more are supported. We currently have 3 coordinator disks configured for a VCS 5.0 MP1 cluster with IO fencing enabled. We will need to shut down the array where the coordinator disks reside temporarily (for a few days). So I am thinking I could add another three coordinator disks from another array to the coordinator disk group. This way the coordinator disk group would still have 3 available coordinator disks while the original three are down. Would this work? Or what's the best way to deal with this situation? Thanks, Rongsheng ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
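Why the odd-number rule matters comes down to simple arithmetic: in a split-brain race, a node must register its keys on a strict majority of the coordinator disks, and only an odd total guarantees exactly one winner.

```shell
# A racing subcluster survives only if it wins a strict majority of
# the n coordinator disks: floor(n/2) + 1.
majority() { echo $(( $1 / 2 + 1 )); }

majority 3   # prints 2: of 3 disks, some side always reaches 2
majority 4   # prints 3: of 4 disks, a 2-2 split leaves NO side with 3
```

With 6 disks (the setup proposed in the question), a 3-3 split is possible and neither side holds the required 4, which is why the fence driver refuses an even count.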
Re: [Veritas-ha] Coordinator disks
Fence won't start if even. Sent from my Nokia E62 handheld by goodlink. -Original Message- From: Joshua Fielden [mailto:[EMAIL PROTECTED] Sent: Friday, April 11, 2008 04:30 PM US Mountain Standard Time To: Rongsheng Fang; veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] Coordinator disks 3 *or more*, but they need to be an odd number, so minimize the amount of time they're even -- coordinator races are decided by holding a majority, so you have exposure while the total number of disks is even. Cheers, jf Sent by GoodLink (www.good.com) -Original Message- From: Rongsheng Fang [mailto:[EMAIL PROTECTED] Sent: Friday, April 11, 2008 04:09 PM US Mountain Standard Time To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Coordinator disks Hi, Does anybody know how many coordinator disks the coordinator disk group can have? The VCS installation guide says 3, but doesn't say if more are supported. We currently have 3 coordinator disks configured for a VCS 5.0 MP1 cluster with IO fencing enabled. We will need to shut down the array where the coordinator disks reside temporarily (for a few days). So I am thinking I could add another three coordinator disks from another array to the coordinator disk group. This way the coordinator disk group would still have 3 available coordinator disks while the original three are down. Would this work? Or what's the best way to deal with this situation? Thanks, Rongsheng ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
Re: [Veritas-ha] Question re SFRAC 5.0
Kelly, That is not normal. If the DB is the top of the tree and set to non-critical, it should not cause the group to offline. Even after we introduced FaultPropagation and ManageFaults, the core Critical/Non-Critical behavior should not have changed. Can you open a case on this? Jim -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED] Sent: Friday, February 29, 2008 11:57 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Question re SFRAC 5.0 We are testing a new SFRAC 5.0 cluster. One of the scenarios is a shutdown abort of one instance. When we did this, it took the whole group offline on that node even though the database is the top resource in the dependency tree. Is this normal behavior? I don't remember this ever happening before. I remember it only taking the database offline and leaving the mounts up. The database is a non-critical resource with nothing depending on it. Thanks in advance for your help! ** The information contained in this message, including attachments, may contain privileged or confidential information that is intended to be delivered only to the person identified above. If you are not the intended recipient, or the person responsible for delivering this message to the intended recipient, Alltel requests that you immediately notify the sender and asks that you do not read the message or its attachments, and that you delete them without copying or sending them to anyone else. ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
Re: [Veritas-ha] LLT crossed links
I disagree, as long as the SAP stuff is taken care of. 2 dedicated + 2 additional (even sharing a VLAN) is pretty good. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joshua Fielden Sent: Tuesday, February 19, 2008 11:40 AM To: Ceri Davies Cc: veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] LLT crossed links One can't set up a successful cluster planning for the best case -- one has to plan for the worst case. 2, 4, or 40 links, the underlying discipline doesn't change. What happens, in the below scenario, when you lose both dedicated heartbeats? You're left with two links on the same VLAN, which is verboten. Cheers, jf Sent by GoodLink (www.good.com) -Original Message- From: Ceri Davies [mailto:[EMAIL PROTECTED] Sent: Tuesday, February 19, 2008 09:35 AM US Mountain Standard Time To: Joshua Fielden Cc: veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] LLT crossed links Even if I have four links? The situation is that I have: e1000g0 - public interface, VLAN 2, say e1000g1 - heartbeat interface, VLAN 3 nxge0 - public interface, VLAN 2 nxge1 - heartbeat interface, VLAN 4 I don't see how having e1000g0 and nxge0 both on VLAN 2 can cause the problems you mention given the presence of the other high-priority links. Are you certain that's the case? Thanks, Ceri On Tue, Feb 19, 2008 at 09:28:55AM -0700, Joshua Fielden wrote: Having multiple LLT links on the same VLAN/network can cause a variety of problems such as split-brain scenarios, inability to rejoin the cluster, and cluster failures. The heartbeats really need to be isolated from each other. Cheers, jf Sent by GoodLink (www.good.com) -Original Message- From: Ceri Davies [mailto:[EMAIL PROTECTED] Sent: Tuesday, February 19, 2008 09:25 AM US Mountain Standard Time To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] LLT crossed links I have a couple of clusters running Solaris 10, VCS 5.
I'm running IPMP on my public links and I want to configure each public interface as a low-priority link. Since they're connected to the same VLAN, when I start LLT I get the following warning: llt: LLT WARNING V-14-1-10497 crossed links? link 0 and link 3 of node 1 on the same network I'm fully aware of what this means, but I'm not 100% sure if this is likely to cause me a problem or whether it's just a warning in case I thought I'd connected them to different VLANs. Is this likely to be OK? I have two other links per node which have a dedicated VLAN each. Ceri -- That must be wonderful! I don't understand it at all. -- Moliere
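For reference, the layout Ceri describes -- dedicated high-priority links plus public interfaces as low-priority links -- is expressed in /etc/llttab with link and link-lowpri directives. A minimal sketch using the interface names from this thread (the node name, cluster ID, and Solaris device paths are illustrative assumptions, not taken from the messages):

```
set-node node1
set-cluster 101
# dedicated high-priority heartbeat links, one per private VLAN
link e1000g1 /dev/e1000g:1 - ether - -
link nxge1 /dev/nxge:1 - ether - -
# public interfaces carried as low-priority links (both on the shared VLAN 2)
link-lowpri e1000g0 /dev/e1000g:0 - ether - -
link-lowpri nxge0 /dev/nxge:0 - ether - -
```

With this layout, the crossed-links warning simply flags that the two low-priority links share a network, which is the situation debated above.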
Re: [Veritas-ha] Is VxVM mirror supported in VCS GCO option?
We made a decision not to support VxVM mirroring in a GCO environment because it breaks our ability to use SCSI-3 based fencing. While you could make the mirror work, it is not a Symantec-supported configuration. For dual-cluster configs we would require some form of replication. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gene Henriksen Sent: Friday, December 28, 2007 5:49 AM To: Pavel A Tsvetkov; veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] Is VxVM mirror supported in VCS GCO option? To mirror volumes you must be dealing with a relatively small distance, such as less than 80 km. For these distances, why not use a single cluster, called a stretch or campus cluster? In SF 5.0 there is the concept of site awareness, so that VM is aware of the two sites; if a volume at the remote site becomes detached, then all volumes at the remote site are detached, thereby maintaining consistency of the site. I have not heard of the limitation you mention. I do know that in a Replicated Data Cluster (VVR within a cluster), synchronous replication is required because, unlike GCO, there is nothing to prevent failover, and we don't want the cluster to experience failovers and take over with old data automatically. With mirroring, that certainly would be possible. As in the case of replicating data, we do not recommend automatic failover. Automatic failover could result in split brain destroying the data if the link between the two clusters were interrupted, making it appear the primary cluster was down. A lot of configurations are possible, and a lot will work, but they may not be supported. I am not sure who told you this, but I would ask for an explanation. One possible problem could be the loss of the SAN between sites for hours, followed by a failover to the remote site with old data, with the VCS admin unaware of the storage problem. I think the primary concern is split brain. With replication, you are working with two distinct data sets. 
If both sides become active due to a loss of connectivity, the data is not being corrupted, the two sites are just growing further apart. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pavel A Tsvetkov Sent: Friday, December 28, 2007 5:15 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Is VxVM mirror supported in VCS GCO option? Hello all! Just one interesting question about VCS GCO. I was told that VxVM mirror is not supported if using with Global Cluster Option. Only replicated volumes can be used ... Is it true? It seems strange to me... Why not? I think it is quite possible to failover mirrored VxVM volume between clusters... Or not??? Kind regards Pavel Tsvetkov
Re: [Veritas-ha] Inbound and outbound traffic
Not really a VCS issue. It really depends on the IP stack of the OS, or on modifying the application to bind to a specific IP. Usually the source IP of an outbound packet will be whatever the base address (first address configured) is on that interface. One possible solution is to set the base address to be on a different subnet; that way only your VIP is on the subnet in use, and it will be the first configured address. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pablo Calvo Sent: Wednesday, December 12, 2007 10:37 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Inbound and outbound traffic How can I set inbound and outbound traffic to use the same interface (physical and virtual address)? Uniqs S.A. Sturiza 503 - Olivos Buenos Aires - Argentina TE: (5411) 4711-7755/4799-5516 Cel: (54911) 53747697
[Veritas-ha] New VCS course!
Howdy all. Education informed me that we have a new class online around multiple clusters. Our new course that includes GCO, Secure Clusters, CMC, Solaris Zones, the RemoteSG agent, and the campus cluster capability in VM that allows site tagging is now available. The schedule for it is as follows (all we need is students): Oak Brook, IL: Jan 30 thru Feb 1; Mountain View: Feb 4-7; Herndon, VA: Feb 20-22. Jim Senicka Senior Director, Technical Product Management Server and Storage Management Group Symantec Corporation www.symantec.com - Office: 757-766-0200 Mobile: 757-870-3484 Email: [EMAIL PROTECTED]
Re: [Veritas-ha] best way for patching of cluster servers
We will get that resolved (Eric and I). Jim Senicka Sent from my Nokia E62 handheld by goodlink. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Friday, December 07, 2007 01:11 PM US Mountain Standard Time To: Eric Hennessey Cc: veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] best way for patching of cluster servers Hi Eric, That's funny, I've been told by Veritas Support that Veritas does not support nodes in the same cluster running at different Solaris patch levels, let alone different versions of Solaris. Jon -Original Message- From: Eric Hennessey Sent: 12/07/2007 06:12 AM To: veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] best way for patching of cluster servers Hi Upen, My guess is you spoke with Sun sales when you asked this question. Try rephrasing your question to your Sun contact. Ask him/her if they will support a collection of systems running Solaris 9 at different patch levels, without regard to them being clustered. That you're running VCS on these systems isn't Sun's support problem, it's ours, and we unequivocally support mixing not only different patch levels but different Solaris versions in the same cluster. We do this so you can leverage the cluster as an operational support tool to enable rolling upgrades of the OS with a minimum of application downtime. The response you got sounds like it came from someone interested in selling Sun Cluster. Just because THEY won't support different patch levels and Solaris versions in the same cluster doesn't mean WE won't. :-) Cheers! Eric From: upen [mailto:[EMAIL PROTECTED] Sent: Thursday, December 06, 2007 7:43 PM To: Eric Hennessey Cc: veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] best way for patching of cluster servers Thanks Eric. One question: does Veritas/Symantec provide support for patching Sun servers involved in a Veritas HA cluster? 
I contacted Sun for support (we have a valid Gold contract) but they still refused to support us because the machines are part of a Veritas cluster. I don't have the Veritas contract number, but I know our contract was renewed and is valid. Is there any way I can find out my contract number from Symantec/Veritas if I can give them the necessary information (machine serials and company info)? Whoever renewed the contract at my workplace doesn't seem to be much help with the contract number. I'm not able to see the site properly on my Linux machine, so maybe I'm not looking in the right place. If anyone can give me a Veritas/Symantec contact where I can find my contract details, and support for patching Sun after that.. Thanks On Dec 6, 2007 10:25 AM, Eric Hennessey [EMAIL PROTECTED] wrote: The typical approach to applying OS patches in a clustered environment is to patch an idle server, let it reboot and rejoin the cluster, and make sure it's running OK. If it is, use the cluster software to switch application(s) from an active server to the one you just patched, and if the app comes up successfully there, apply the patch to the server that's now idle. Keep doing this until all nodes in the cluster have been patched. Eric From: [EMAIL PROTECTED] [mailto: [EMAIL PROTECTED] On Behalf Of upen Sent: Thursday, December 06, 2007 4:03 PM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] best way for patching of cluster servers Hi, when it comes to patching standalone Sun servers, I can patch them and I know everything will be fine after a reboot. I'd like to know how to patch Veritas HA clustered SunOS 5.9 machines. Right now the cluster service and application services are running on server 2. I am not sure whether patching might mess up the cluster or the running applications. Please let me know best practices for applying patches on cluster servers so that machines will have
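Eric's rolling-patch loop maps onto the VCS CLI roughly as follows (an admin sketch rather than a runnable script; the group and node names are placeholders):

```shell
# On the node to be patched: move any groups it holds elsewhere and stop VCS locally
hastop -local -evacuate
# ...apply OS patches and reboot; VCS starts at boot and the node rejoins the cluster...

# Once the patched node looks healthy, switch the application onto it
hagrp -switch appgrp -to patched_node
hastatus -summary   # confirm the group is ONLINE on the new node

# Then repeat the patch cycle on the node that is now idle
```

The -evacuate option is what keeps downtime to a single controlled switchover per group rather than an uncontrolled failover.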
Re: [Veritas-ha] connectivity delays
What address do you telnet to? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tihomir Cavuzic Sent: Wednesday, November 28, 2007 4:43 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] connectivity delays Hello, Let me introduce my little connectivity question, maybe VCS-related: VCS 4.1, 2 Netras 440 with Solaris 10, 5 service groups, one of them is network. Config files attached. The problem is that often we experience connectivity delays, demonstrated for instance by telnet hold-ups, temporary outages of Diameter links and similar, all lasting a couple of seconds. As soon as it is over, everything goes back to normal: the telnet buffer is emptied, Diameter links come up again automatically, etc. Is there any chance this could have something to do with VCS, or should I be looking only at Solaris, switch (port) configuration, and the Ethernet interfaces on my Solaris boxes? I ask this since many boxes are connected to the same switch, switch ports are uniformly configured, and still only my machines have trouble with delays. The only difference is that only my machines have VCS and Solaris 10 -- all the others have Solaris 8/9 and no VCS. Sorry if it sounds trivial, I'm just not sure where to start looking... Thanks/Regards Tihomir
Re: [Veritas-ha] Interconnect hardware specifications
A switch? No. 2 switches? OK. We would be looking for 100BaseT or Gigabit, full duplex. Not so much from a bandwidth standpoint, just reliability. Full duplex removes collision issues. No problems with dedicated switches per interconnect network. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Stefhen Hovland Sent: Tuesday, November 27, 2007 3:02 PM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Interconnect hardware specifications Does anyone have any information as to a minimum hardware type to be used for VCS interconnects? We have some production boxes running with a Linksys switch in between the hosts and I would like to know for sure if this is a good idea or not. Thanks, Stefhen
Re: [Veritas-ha] SF/HA 5.0 on Solaris 9: HAD Self Check error
HAD is not talking to GAB. Excessive system utilization, or a blocked /var file system or some such issue. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Marianne Van Den Berg Sent: Tuesday, November 13, 2007 1:17 PM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] SF/HA 5.0 on Solaris 9: HAD Self Check error Hi all Brand new installation - 2-node cluster, Solaris 9 with latest O/S patches, SF/HA 5.0 with MP1. IPMultiNICB config'ed as parallel sg (using mpathd) and ClusterService group. Getting these errors about 3 minutes after hastart. Any ideas?? /var/adm/messages: Nov 13 15:59:11 drp-db-1 gab: [ID 272231 kern.notice] GAB WARNING V-15-1-20057 Port h process 140 inactive 7 sec Nov 13 15:59:12 drp-db-1 gab: [ID 272231 kern.notice] GAB WARNING V-15-1-20057 Port h process 140 inactive 8 sec Nov 13 15:59:13 drp-db-1 gab: [ID 272231 kern.notice] GAB WARNING V-15-1-20057 Port h process 140 inactive 9 sec Nov 13 15:59:14 drp-db-1 gab: [ID 272231 kern.notice] GAB WARNING V-15-1-20057 Port h process 140 inactive 10 sec Nov 13 15:59:15 drp-db-1 Had[140]: [ID 702911 daemon.alert] VCS WARNING V-16-1-51047 HAD Self Check: Excessive delay in the HAD heartbeat to GAB (10 seconds) Nov 13 15:59:15 drp-db-1 gab: [ID 272231 kern.notice] GAB WARNING V-15-1-20057 Port h process 140 inactive 11 sec Nov 13 15:59:16 drp-db-1 gab: [ID 272231 kern.notice] GAB WARNING V-15-1-20057 Port h process 140 inactive 12 sec Nov 13 15:59:17 drp-db-1 gab: [ID 272231 kern.notice] GAB WARNING V-15-1-20057 Port h process 140 inactive 13 sec Nov 13 15:59:18 drp-db-1 gab: [ID 272231 kern.notice] GAB WARNING V-15-1-20057 Port h process 140 inactive 14 sec Nov 13 15:59:19 drp-db-1 gab: [ID 191522 kern.notice] GAB WARNING V-15-1-20058 Port h process 140: heartbeat failed, killing process Nov 13 15:59:19 drp-db-1 gab: [ID 975177 kern.notice] GAB INFO V-15-1-20059 Port h heartbeat interval 15000 msec. 
Statistics: Nov 13 15:59:19 drp-db-1 gab: [ID 217350 kern.notice] GAB INFO V-15-1-20129 Port h: heartbeats in 0 ~ 3000 msec: 3869 Nov 13 15:59:19 drp-db-1 gab: [ID 217350 kern.notice] GAB INFO V-15-1-20129 Port h: heartbeats in 3000 ~ 6000 msec: 0 Nov 13 15:59:19 drp-db-1 gab: [ID 217350 kern.notice] GAB INFO V-15-1-20129 Port h: heartbeats in 6000 ~ 9000 msec: 0 Nov 13 15:59:19 drp-db-1 gab: [ID 217350 kern.notice] GAB INFO V-15-1-20129 Port h: heartbeats in 9000 ~ 12000 msec: 0 Nov 13 15:59:19 drp-db-1 gab: [ID 217350 kern.notice] GAB INFO V-15-1-20129 Port h: heartbeats in 12000 ~ 15000 msec: 0 Nov 13 15:59:19 drp-db-1 gab: [ID 259915 kern.notice] GAB INFO V-15-1-20094 number of processes: 158 Nov 13 15:59:19 drp-db-1 gab: [ID 631272 kern.notice] GAB INFO V-15-1-20095 load average in 1 min: 0. 6 Nov 13 15:59:19 drp-db-1 gab: [ID 587815 kern.notice] GAB INFO V-15-1-20096 load average in 5 min: 0. 8 Nov 13 15:59:19 drp-db-1 gab: [ID 980060 kern.notice] GAB INFO V-15-1-20097 load average in 15 min:0.10 Nov 13 15:59:19 drp-db-1 gab: [ID 559196 kern.notice] GAB INFO V-15-1-20098 pagein rate: 0 Nov 13 15:59:19 drp-db-1 gab: [ID 582491 kern.notice] GAB INFO V-15-1-20099 pageout rate: 0 Nov 13 15:59:19 drp-db-1 gab: [ID 940236 kern.notice] GAB INFO V-15-1-20041 Port h: client process failure: killing process Nov 13 15:59:19 drp-db-1 Had[140]: [ID 702911 daemon.alert] VCS WARNING V-16-1-53034 HAD Signal SIGABRT received Nov 13 15:59:19 drp-db-1 Had[140]: [ID 702911 daemon.alert] VCS NOTICE V-16-1-53038 Beginning execution of the diagnostics script Nov 13 15:59:21 drp-db-1 Had[140]: [ID 702911 daemon.alert] VCS NOTICE V-16-1-53039 Completed execution of the diagnostics script Nov 13 15:59:22 drp-db-1 gab: [ID 397130 kern.notice] GAB INFO V-15-1-20032 Port h closed Nov 13 15:59:22 drp-db-1 syslog[29181]: [ID 702911 daemon.notice] VCS ERROR V-16-1-11103 VCS exited. It will restart had restarts, but the same thing happens again after a couple of minutes. 
Regards Marianne
Re: [Veritas-ha] SF/HA 5.0 on Solaris 9: HAD Self Check error
Sorry Randy, that was not a case of saying dunno. HAD not heartbeating GAB is usually indicative of a system load issue or something blocking HAD's ability to open necessary lock files. These are general statements, as this can happen in any environment and should be easy to track down. Specific questions, or more difficult issues, need to be opened as a support case. This is a general discussion forum, not a support avenue for VCS. Since the support guys have access to explorer output, core files, and far more day-to-day experience, they can answer far better. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Randy Slead Sent: Tuesday, November 13, 2007 2:43 PM To: veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] SF/HA 5.0 on Solaris 9: HAD Self Check error I have seen this on all versions of VCS (4/5), even at 10% system utilization. And Symantec going "I dunno" is not helpful. Jim Senicka [EMAIL PROTECTED] wrote: HAD is not talking to GAB. Excessive system utilization, or a blocked /var file system or some such issue. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Marianne Van Den Berg Sent: Tuesday, November 13, 2007 1:17 PM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] SF/HA 5.0 on Solaris 9: HAD Self Check error Hi all Brand new installation - 2-node cluster, Solaris 9 with latest O/S patches, SF/HA 5.0 with MP1. IPMultiNICB config'ed as parallel sg (using mpathd) and ClusterService group. Getting these errors about 3 minutes after hastart. Any ideas?? 
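As a first pass on Jim's "/var or load" diagnosis above, two quick host-side checks are worth running on the affected node (a generic sketch; the commands are standard, but the interpretation thresholds are judgment calls, not VCS-documented limits):

```shell
# A full or unwritable /var can block HAD's lock-file activity
df -k /var

# Sustained high load can keep HAD from heartbeating GAB in time,
# even though HAD runs at elevated priority
uptime
```

If /var has free space and load averages are low, the next step is the support case Jim suggests, with explorer output attached.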
Re: [Veritas-ha] OnlineRetryLimit weird behaviour
OnlineRetryLimit sets how many times to attempt to online a resource when the initial attempt fails. This is not a service group setting. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gurugunti, Mahesh Sent: Tuesday, October 09, 2007 11:54 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] OnlineRetryLimit weird behaviour I set OnlineRetryLimit = 1 for a service group, but the service group keeps restarting more than once in spite of this setting. Any ideas? Mahesh
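Per Jim's reply, the OnlineRetryLimit he describes is a resource-type attribute rather than a service group setting, so it is viewed and changed with hatype (a sketch; the Oracle type name is only an example, substitute the type of the resource that keeps retrying):

```shell
# Show the current value for a resource type
hatype -display Oracle -attribute OnlineRetryLimit

# Make the configuration writable, set one retry, and write it back out
haconf -makerw
hatype -modify Oracle OnlineRetryLimit 1
haconf -dump -makero
```
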
Re: [Veritas-ha] Adding a LUN in Veritas Cluster
Nothing. Unless you use volume resources in the dependency tree. Sent from my Nokia E62 handheld by goodlink. -Original Message- From: Artur Baruchi [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 09, 2007 05:46 PM US Mountain Standard Time To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Adding a LUN in Veritas Cluster Hi, after the server recognizes a LUN, what are the steps to add these LUNs in Veritas Cluster? I already have a VG that is shared. Thanks, Artur Baruchi
Re: [Veritas-ha] Conversion from Asymmetric to Symmetric VCS Cluster
Add a second service group and set its AutoStartList to have node B first. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shivalingam Vanam Sent: Tuesday, September 11, 2007 9:38 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Conversion from Asymmetric to Symmetric VCS Cluster Hi, Can someone point me to the documentation on the subject matter? We would like to create a new SG on node B by doing so. Thanks VSL
Re: [Veritas-ha] change cluster node
The ClusterService group is a VCS thing. It will not affect your app at all, and does not need to be running for your application to run. It is there for the Web UI and to host the connector if GCO is configured. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of upen Sent: Tuesday, September 11, 2007 10:45 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] change cluster node Hi, following is the result of hastatus -summary. I want to get both groups onto one node, with the least downtime if any. bb is a service group while ClusterService is the cluster group. How do I change ClusterService on node2 to ONLINE and ClusterService on node1 to OFFLINE so that both service groups will be on a single node? Also, will this involve any application services downtime?

# hastatus -summary
-- SYSTEM STATE
-- System          State    Frozen
A  node1           RUNNING  0
A  node2           RUNNING  0

-- GROUP STATE
-- Group           System  Probed  AutoDisabled  State
B  ClusterService  node1   Y       N             ONLINE
B  ClusterService  node2   Y       N             OFFLINE
B  bb              node1   Y       N             OFFLINE
B  bb              node2   Y       N             ONLINE

I am new to VCS, so please help with complete commands. Thanks in advance. - upen, emerge -uD life (Upgrade Life with dependencies)
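The move upen asks about is a single switch of the ClusterService group, using the names from the hastatus output (run from either node; per Jim's reply above, this does not touch the bb application):

```shell
# Move ClusterService from node1 to node2, alongside bb
hagrp -switch ClusterService -to node2
hastatus -summary   # both groups should now show ONLINE on node2
```
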
Re: [Veritas-ha] regarding veritas ha and apache logs from blackboard
This is pretty much an apache issue, not VCS. If you need to bounce apache to make it happen, you would simply freeze the service group while doing so to keep VCS from reacting, or use VCS to stop/start apache. As for the command to clear the logs, I cannot help you there. Sent from my Nokia E62 handheld by goodlink. -Original Message- From: upen [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 05, 2007 12:21 PM US Mountain Standard Time To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] regarding veritas ha and apache logs from blackboard Hi, we are using the Blackboard application with apache 1.3.33 on Sun nodes in a Veritas HA cluster. I was told that if the apache logs grow beyond 2 GB, the Blackboard application misbehaves. How can I clear the logs, or copy them off and truncate them to 0 bytes? Please let me know the procedure that gives minimum downtime for application services. Thanks in advance, upendra -- upen, emerge -uD life (Upgrade Life with dependencies)
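One low-downtime pattern (a generic sketch with a stand-in path, not Blackboard-specific advice) is to copy the log aside and then truncate the live file in place, so apache keeps writing to the same open file descriptor. This behaves cleanly only if apache opened the log in append mode; otherwise a brief apache restart, with the service group frozen as described above, is the safer route:

```shell
# Stand-in for apache's access_log (the real path is site-specific)
LOG=/tmp/demo_access_log
printf 'GET /a\nGET /b\n' > "$LOG"

# Keep a copy for later analysis, then truncate the live file to 0 bytes
cp "$LOG" "$LOG.saved"
: > "$LOG"

wc -c < "$LOG"   # prints 0: the live log is empty, no restart needed
```
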
Re: [Veritas-ha] Fw: gab restarts had
A couple of points. - HAD getting recycled by GAB is due to HAD not heartbeating GAB on the local box. This has pretty much zero to do with the LLT heartbeat between boxes. - HAD not heartbeating GAB is indicative of HAD either being swapped out due to extremely high load (it runs as a real-time process on Solaris) or HAD blocking for some reason in an I/O call. This can really only happen if /var is full or write-protected, as there is some lock file activity there. - The only way HAD could possibly be affected by physical networks is if it were blocking on some piece of data that must be sent, but then you would also see lots of corresponding LLT alarms. So, based on what I see here, HAD is not running correctly due to either a problem with /var or a load issue. -Original Message- From: Peter DrakeUnderkoffler [mailto:[EMAIL PROTECTED] Sent: Thursday, August 09, 2007 9:26 AM To: Kiss László - Károly Cc: Jim Senicka; veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] Fw: gab restarts had But it shouldn't be halting the system now; gab will still kill had. Do you have any llt errors or, more importantly, any layer 2 errors on the heartbeat networks? How do you know it's not the load? What are you using to determine that? What do you see in /var/adm/messages or the output of dmesg a little after this happens? Those errors are a symptom of a system under too much load, but other things can cause that kind of symptom. You need to actually start digging into the O/S layer and figure out what the system is doing. The adjustment I mentioned to gabtab gives you that opportunity. The other solution is to open a support call with Symantec and let them figure out what is going on. Thanks Peter Peter DrakeUnderkoffler Xinupro, LLC 617-834-2352 Kiss László - Károly wrote: Hi, I followed your instructions and edited gabtab, which now looks like: /sbin/gabconfig -c -k -n2 but still HAD is restarted by gab. 
BR, Laszlo - Original Message From: Peter DrakeUnderkoffler [EMAIL PROTECTED] To: Jim Senicka [EMAIL PROTECTED] Cc: Kiss László - Károly [EMAIL PROTECTED]; veritas-ha@mailman.eng.auburn.edu Sent: Wednesday, 8 August, 2007 5:33:32 PM Subject: Re: [Veritas-ha] Fw: gab restarts had I agree with Jim, that is the failure scenario when the system is overloaded and gab isn't able to communicate for a period of time. As a temporary measure, you can add -k to gabtab and restart gab. This will have it not force the system to panic, giving you time to resolve the underlying issue. I wouldn't leave this in place, though. Thanks Peter Peter DrakeUnderkoffler Xinupro, LLC 617-834-2352 Jim Senicka wrote: is the system heavily loaded? GAB restarts HAD when HAD does not communicate with GAB for 16 seconds. This usually happens only in super-overload situations *From:* [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] *On Behalf Of *Kiss László - Károly *Sent:* Wednesday, August 08, 2007 11:04 AM *To:* veritas-ha@mailman.eng.auburn.edu *Subject:* [Veritas-ha] Fw: gab restarts had Sorry, I forgot the file :( Here it is - Forwarded Message From: Kiss László - Károly [EMAIL PROTECTED] To: veritas-ha@mailman.eng.auburn.edu Sent: Wednesday, 8 August, 2007 5:02:42 PM Subject: Re: [Veritas-ha] gab restarts had Hi, We have a two node cluster, VCS 4.1. When I try to bring a resource online/offline, or when I try to make a switchover, I get some very strange behaviour. The gab daemon restarts VCS and thus I can't do anything with it. I checked all the steps from the install guide's Verifying LLT, GAB, and Cluster Operation chapter and everything looks fine, but when I try to do something it just restarts. 
I attached the complete log of a restart; here is a snippet from it: Aug 8 22:29:23 NTMS1AN1 gab: [ID 272231 kern.notice] GAB WARNING V-15-1-20057 Port h process 5182 inactive 14 sec Aug 8 22:29:24 NTMS1AN1 Had[5182]: [ID 702911 daemon.alert] VCS WARNING V-16-1-53024 HAD Signal SIGABRT received Aug 8 22:29:24 NTMS1AN1 Had[5182]: [ID 702911 daemon.alert] VCS NOTICE V-16-1-53028 Beginning execution of the diagnostics script Aug 8 22:29:24 NTMS1AN1 gab: [ID 191522 kern.notice] GAB WARNING V-15-1-20058 Port h process 5182: heartbeat failed, killing process Thanks. BR, Laszlo
Re: [Veritas-ha] Fw: gab restarts had
What OS and version, and what version of VCS? Something is blocking HAD's ability to heartbeat GAB. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kiss László - Károly Sent: Wednesday, August 08, 2007 11:38 AM To: Peter DrakeUnderkoffler; Jim Senicka Cc: veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] Fw: gab restarts had Thanks to both of you! It looks like the system is not loaded; an Oracle and a Java app are running on it, but it is not loaded, and this error comes only when I try to do something with VCS. I definitely would not leave this in place; that's why I would like to get some info on what to do in a situation like this. Thanks. - Original Message From: Peter DrakeUnderkoffler [EMAIL PROTECTED] To: Jim Senicka [EMAIL PROTECTED] Cc: Kiss László - Károly [EMAIL PROTECTED]; veritas-ha@mailman.eng.auburn.edu Sent: Wednesday, 8 August, 2007 5:33:32 PM Subject: Re: [Veritas-ha] Fw: gab restarts had I agree with Jim, that is the failure scenario when the system is overloaded and gab isn't able to communicate for a period of time. As a temporary measure, you can add -k to gabtab and restart gab. This will have it not force the system to panic, giving you time to resolve the underlying issue. I wouldn't leave this in place, though. Thanks Peter Peter DrakeUnderkoffler Xinupro, LLC 617-834-2352 Jim Senicka wrote: is the system heavily loaded? GAB restarts HAD when HAD does not communicate with GAB for 16 seconds. 
This usually happens only in super-overload situations. *From:* [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] *On Behalf Of *Kiss László - Károly *Sent:* Wednesday, August 08, 2007 11:04 AM *To:* veritas-ha@mailman.eng.auburn.edu *Subject:* [Veritas-ha] Fw: gab restarts had Sorry, I forgot the file :( Here it is - Forwarded Message From: Kiss László - Károly [EMAIL PROTECTED] To: veritas-ha@mailman.eng.auburn.edu Sent: Wednesday, 8 August, 2007 5:02:42 PM Subject: Re: [Veritas-ha] gab restarts had Hi, We have a two-node cluster, VCS 4.1. When I try to bring a resource online/offline, or when I try to make a switchover, I get some very strange behaviour: the GAB daemon restarts the VCS engine, and thus I can't do anything with it. I checked all the steps from the install guide's "Verifying LLT, GAB, and Cluster Operation" chapter and everything looks fine, but when I try to do something it just restarts. I attached the complete log of a restart; here is a snippet from it:

Aug 8 22:29:23 NTMS1AN1 gab: [ID 272231 kern.notice] GAB WARNING V-15-1-20057 Port h process 5182 inactive 14 sec
Aug 8 22:29:24 NTMS1AN1 Had[5182]: [ID 702911 daemon.alert] VCS WARNING V-16-1-53024 HAD Signal SIGABRT received
Aug 8 22:29:24 NTMS1AN1 Had[5182]: [ID 702911 daemon.alert] VCS NOTICE V-16-1-53028 Beginning execution of the diagnostics script
Aug 8 22:29:24 NTMS1AN1 gab: [ID 191522 kern.notice] GAB WARNING V-15-1-20058 Port h process 5182: heartbeat failed, killing process

Thanks. BR, Laszlo
___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
Re: [Veritas-ha] LLT not configured error after reboot
GAB saying "LLT not configured" means LLT is not running. It is not saying LLT is configured incorrectly in llttab. Sent from my Nokia E62 handheld by goodlink. -Original Message- From: robertinoau [mailto:[EMAIL PROTECTED] Sent: Wednesday, June 20, 2007 11:40 PM Mountain Standard Time To: Damodharan K; veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] LLT not configured error after reboot Try this: /etc/rc2.d/S92gab start Then gabconfig -c -x --- Damodharan K [EMAIL PROTECTED] wrote: Hi all, After unloading and loading GAB, it gives the following error:

# gabconfig -c -x
GAB gabconfig ERROR V-15-2-25015 LLT not configured

But LLT is correctly configured:

test02-ap: more /etc/llttab
set-node test02-ap
set-cluster 70
link qfe2 /dev/qfe:2 - ether - -
link qfe7 /dev/qfe:7 - ether - -

test02-ap: more /etc/gabtab
/sbin/gabconfig -c -n2

test02-ap: gabconfig -a
GAB Port Memberships
===

Damodharan K Tata Consultancy Services Mailto: [EMAIL PROTECTED] Website: http://www.tcs.com robertinoau [EMAIL PROTECTED] 06/21/2007 05:08 AM To Damodharan K [EMAIL PROTECTED], veritas-ha@mailman.eng.auburn.edu cc Subject Re: [Veritas-ha] LLT and GAB problem after first rebooting when configured Try this: 1.) Unload GAB: # gabconfig -U 2.) Restart GAB: # gabconfig -c -x 3.) Finally restart HAD: # hastart --- Damodharan K [EMAIL PROTECTED] wrote: Dear all, I have two V480 servers with VCS 4.1 and VxVM 4.1, and I am newly building a two-node cluster. At installation and configuration time the cluster service worked fine, but after a reboot LLT and GAB are not running and I am not able to start the cluster service. Please help solve this issue. I am sending the configuration and the engine log.

Engine_A.log:
2007/04/18 14:13:44 VCS INFO V-16-1-10125 GAB timeout set to 15000 ms
2007/04/18 14:13:44 VCS ERROR V-16-1-10116 GabHandle::open failed errno = 261
2007/04/18 14:13:44 VCS ERROR V-16-1-11033 GAB open failed. Exiting
2007/04/18 14:13:54 VCS NOTICE V-16-1-11022 VCS engine (had) started
2007/04/18 14:13:54 VCS NOTICE V-16-1-11027 VCS engine startup arguments=-restar

Configuration:

test02-ap: gabconfig -l
GAB Driver Configuration
Driver state         : Unconfigured
Partition arbitration: Disabled
Control port seed    : Enabled
Halt on process death: Disabled
Missed heartbeat halt: Disabled
Halt on rejoin       : Disabled
Keep on killing      : Disabled
Quorum flag          : Disabled
Restart              : Disabled
Node count           : 2
Disk HB interval (ms): 1000
Disk HB miss count   : 4
IOFENCE timeout (ms) : 15000
Stable timeout (ms)  : 5000

test02-ap: more /etc/llttab
set-node test02-ap
set-cluster 70
link qfe2 /dev/qfe:2 - ether - -
link qfe7 /dev/qfe:7 - ether - -

test02-ap: more /etc/gabtab
/sbin/gabconfig -c -n2

test02-ap: gabconfig -a
GAB Port Memberships
===

test02-ap: more main.cf
include types.cf
cluster vcsdev-ap (
    UserNames = { admin = bopHojOlpKppNxpJom }
    ClusterAddress = 172.25.7.98
    Administrators = { admin }
    CredRenewFrequency = 0
    UseFence = SCSI3
    CounterInterval = 5
)
system test01-ap (
    Limits = { Processors = 4 }
)
system test02-ap (
    Limits = { Processors = 4 }
)
group ClusterService (
    SystemList = { test01-ap = 0, test02-ap = 1 }
    AutoStartList = { test01-ap, test02-ap }
    FailOverPolicy = Load
    AutoStartPolicy = Load
    OnlineRetryLimit = 3
    OnlineRetryInterval = 120
    Load = 4
)
IP webip (
    Device = ce0
    Address = 172.25.7.98
    NetMask = 255.255.255.248
)
NIC csgnic (
    Device = ce0
)
VRTSWebApp VCSweb (
    Critical = 0
    AppName = vcs
    InstallDir = /opt/VRTSweb/VERITAS
    TimeForOnline = 5
    RestartLimit = 3
)
VCSweb requires webip
webip requires csgnic
// resource dependency tree
// group ClusterService
=== message truncated ===
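Since GAB's "LLT not configured" error means the LLT driver itself is not up, a rough verification sequence like the following can isolate the problem before touching GAB or HAD; a sketch assuming Solaris with VCS 4.1, where command paths and output wording may vary by release:

```shell
# Check whether LLT is running (lltconfig reports its running state)
lltconfig

# If it is not, configure LLT from /etc/llttab
lltconfig -c

# Verify node and link status on the private links
lltstat -n
lltstat -nvv

# Only once LLT is up, configure GAB and start HAD
/sbin/gabconfig -c -n2
hastart
```

The point of the ordering is that GAB is an LLT client and HAD is a GAB client, so each layer can only come up after the one beneath it.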
Re: [Veritas-ha] LLT and GAB problem after first rebooting when configured
LLT is not starting, right? All the other data is not relevant. Fix the LLT issue so GAB can start, so HAD can start. Sent from my Nokia E62 handheld by goodlink. -Original Message- From: Damodharan K [mailto:[EMAIL PROTECTED] Sent: Thursday, June 21, 2007 04:06 PM Mountain Standard Time To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] LLT and GAB problem after first rebooting when configured Dear all, I have two V480 servers with VCS 4.1 and VxVM 4.1, and I am newly building a two-node cluster. At installation and configuration time the cluster service worked fine, but after a reboot LLT and GAB are not running and I am not able to start the cluster service. Please help solve this issue. I am sending the configuration and the engine log.

Engine_A.log:
2007/04/18 14:13:44 VCS INFO V-16-1-10125 GAB timeout set to 15000 ms
2007/04/18 14:13:44 VCS ERROR V-16-1-10116 GabHandle::open failed errno = 261
2007/04/18 14:13:44 VCS ERROR V-16-1-11033 GAB open failed. Exiting
2007/04/18 14:13:54 VCS NOTICE V-16-1-11022 VCS engine (had) started
2007/04/18 14:13:54 VCS NOTICE V-16-1-11027 VCS engine startup arguments=-restar

[... same gabconfig -l output, llttab, gabtab, and main.cf as quoted above ...]

// resource dependency tree
// group ClusterService
// {
//     VRTSWebApp VCSweb
//     {
//         IP webip
//         {
//             NIC csgnic
//         }
//     }
// }

Damodharan K Tata Consultancy Services Mailto: [EMAIL PROTECTED] Website: http://www.tcs.com
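On Solaris the boot scripts bring the stack up in exactly the order Jim describes, LLT before GAB before HAD; when the cluster does not come up after a reboot, running the scripts by hand in sequence can show which layer fails. A sketch only; the script names below are the usual defaults and may differ per release:

```shell
/etc/rc2.d/S70llt start    # loads and configures LLT from /etc/llttab
/etc/rc2.d/S92gab start    # runs /etc/gabtab (e.g. gabconfig -c -n2)
/etc/rc3.d/S99vcs start    # starts the HAD engine

# A healthy stack should then show port a (GAB) and port h (HAD) memberships:
gabconfig -a
```

If the first script fails, the problem is in LLT (llttab, device paths, or the driver packages), which matches the "GAB open failed" errors in the engine log above.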
Re: [Veritas-ha] sample for apache application
What OS? The Linux 5.0 bundled agents reference guide has the Apache agent documented, and I believe the guides for the other OSes do as well. _ From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of osk Sent: Monday, April 30, 2007 3:01 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] sample for apache application Hi, I am new to VCS; can you give me one example of configuring Apache as a resource? Recommendations are welcome. regards Karthikeyan.N -- winners don't do different things they do things differently
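As a starting point, a minimal VCS 5.0 Apache resource might look like the following main.cf fragment. This is a sketch with placeholder paths and host names; the exact attribute set should be checked against the bundled agents reference guide for your OS:

```
Apache httpd_res (
    httpdDir = "/usr/sbin"
    ConfigFile = "/etc/httpd/conf/httpd.conf"
    HostName = web-vip
    Port = 80
)
```

In practice the Apache resource would sit on top of IP, Mount, and NIC resources in the same service group, with `requires` links between them.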
Re: [Veritas-ha] Resource Group Dependencies
We are not planning to address that (multiple child groups) in VCS at this time. Please have your account team contact me inside Symantec. Also, what are you running for which a 40-second shutdown is too long? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ceri Davies Sent: Tuesday, April 24, 2007 9:39 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Resource Group Dependencies I note that there is a restriction that a group may have only one child group; is there any future in which this might be relaxed? As a use case, this is why I want it: I have a multi-node cluster with multiple zones on each node, and I fail applications over between the zones. I don't wish to use the configuration quoted in the User's Guide for zones, as that configuration requires that, when a service group fails over, the zone be stopped on the failing node and then started on the node that the group is failing over to. This is bad because, in my testing, starting a zone is very quick, but waiting for one to shut down takes about 40 seconds. Therefore, I'm eschewing this: I have created a parallel resource group that starts a zone on each node, and the application resource groups are simply configured with a firm local dependency on the zone resource group. For example, with an identically configured zone vleappp on each node, I use:

group vleappp_zones (
    SystemList = { clna = 0, clnb = 0 }
    Parallel = 1
    AutoStartList = { clna, clnb }
)
Zone vleappp_zone (
    ZoneName = vleappp
)

group vle_app_prod (
    SystemList = { clna = 1, clnb = 0 }
)
Application vleappp_apache (
    StartProgram = /nondistinct/vle/application start
    StopProgram = /nondistinct/vle/application stop
    PidFiles = { /zones/local/roots/vleappp/root/nondistinct/vle/logs/httpd.pid }
    ContainerName = vleappp
)
Mount vleappp_mount ( )
Blah otherstuff ( )
...
requires group vleappp_zones online local firm

This works perfectly for me, except that now I want to add a global dependency on the vle_ora_prod group as well. Aargh. I simply can't wait for the zones to shut down, so is there some other option? Ceri -- That must be wonderful! I don't understand it at all. -- Moliere
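The firm local dependency in Ceri's configuration can also be created online with hagrp; the commented-out second link below is the additional dependency being asked for, which is where the single-child restriction bites. Group names are taken from the poster's config; treat this as a sketch:

```shell
haconf -makerw
hagrp -link vle_app_prod vleappp_zones online local firm
# The second child link the poster wants is what the restriction disallows:
# hagrp -link vle_app_prod vle_ora_prod online global firm
haconf -dump -makero
```

The `-link parent child category location type` form mirrors the `requires group` clause in main.cf.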
Re: [Veritas-ha] Naming conventions for VCS; VCS style guide?
comments below _ From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Colb, Andrew Sent: Tuesday, April 24, 2007 2:14 PM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Naming conventions for VCS; VCS style guide? All, We are about to initiate an upgrade of several VCS clusters. The plan for these upgrades will enable us to build and test in parallel with the existing production clusters, and to revisit our traditional cluster nomenclature and naming conventions. Our configuration has a five-node production Solaris Veritas cluster at our headquarters; we are building a four-node equivalent at our warm business-continuity site (active data replication). The two sites are connected by a point-to-point DS-3; firewall rules allow one site to see and interact with the other. Our current VCS nomenclature is pretty much ad hoc. The new VCS nomenclature would have the structure stem_object##, where stem is either a functional name (e.g., db) or a singular, universal name (e.g., prod), object is dg or sg, and ## is a zero-padded numeric for serialized differentiation. Question 1: Can we use identical names for VCS disk groups and service groups at the two sites (HQ and Continuity) simultaneously? Host names will, of course, be different, and the clusters will have different cluster IDs. If we do use identical names, will that create a problem if we move on to the Global Cluster Option and/or to VVR? For example, if we have a service group named db_sg01 in both our headquarters cluster and our business-continuity cluster, will VCS complain? [JS] Absolutely. No issues with identical names in separate clusters. Question 2: Is there an advantage in Veritas management/administration if all the stem names are the same? That is, if we replace existing stem names such as db, auth, appsrv, etc. with a single universal name such as prod, will we gain anything in exchange for giving up the functional association? [JS] The Cluster Management Console either has, or will be providing, a search function, so it all depends on how you want to search :-) Thanks in advance for any discussion, advice, ideas, guidance, and warnings, Andy Colb Investment Company Institute
Re: [Veritas-ha] Proxy resource in status unknown
The actual NIC resource is not enabled, so the Proxy cannot probe (at least that is my first thought here). -Original Message- From: Fred Grieco [mailto:[EMAIL PROTECTED] Sent: Monday, April 16, 2007 11:29 AM To: Jim Senicka; veritas-ha@mailman.eng.auburn.edu Subject: RE: [Veritas-ha] Proxy resource in status unknown Here are the snippets from the main.cf. There are three SGs: one with the actual NIC resource and two with proxies. Both proxies show the online, status unknown state.

group ClusterService (
    SystemList = { pa-ocsun-01 = 0, pa-ocsun-02 = 1 }
    AutoStartList = { pa-ocsun-01, pa-ocsun-02 }
    OnlineRetryLimit = 3
    OnlineRetryInterval = 120
)
IP webip (
    Device = ce0
    Address = 192.168.49.146
    NetMask = 255.255.255.0
)
...
Proxy NICProxycsg (
    Critical = 0
    TargetResName = nic1
)

group VVR-Remote (
    SystemList = { pa-ocsun-01 = 0, pa-ocsun-02 = 1 }
)
...
IP replip (
    Critical = 0
    Device = ce0
    Address = 192.168.49.68
    NetMask = 255.255.255.0
)
NIC nic1 (
    Enabled = 0
    Device = ce0
    NetworkType = ether
    NetworkHosts = { 192.168.49.1 }
)
...

group oc451 (
    SystemList = { pa-ocsun-01 = 0, pa-ocsun-02 = 1 }
    AutoStartList = { pa-ocsun-01, pa-ocsun-02 }
)
...
IP VIP (
    Critical = 0
    Device = ce0
    Address = 192.168.49.145
    NetMask = 255.255.255.0
)
...
Proxy NIC-Proxy (
    Critical = 0
    TargetResName = nic1
)
...

Fred --- Jim Senicka [EMAIL PROTECTED] wrote: Can you cut/paste main.cf sections? -Original Message- From: Fred Grieco [mailto:[EMAIL PROTECTED] Sent: Monday, April 16, 2007 9:30 AM To: Jim Senicka; veritas-ha@mailman.eng.auburn.edu Subject: RE: [Veritas-ha] Proxy resource in status unknown Yes, with the same priorities. --- Jim Senicka [EMAIL PROTECTED] wrote: Are the system lists for both service groups the same? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Fred Grieco Sent: Monday, April 16, 2007 9:08 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Proxy resource in status unknown I've set up a proxy resource that references a NIC resource in another service group. The NIC resource is online, but the proxy resource shows Online|status unknown. What does this mean for a Proxy resource? And is there any way to clear the unknown status? This is on a live Oracle cluster, so I don't have the opportunity to take everything down, etc. TIA, Fred
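Jim's diagnosis points at the `Enabled = 0` on the nic1 resource in the config above. A sketch of clearing it without downtime, using the resource and system names from the poster's main.cf:

```shell
haconf -makerw
hares -modify nic1 Enabled 1
haconf -dump -makero

# Re-probe the proxy resources so the unknown state clears
hares -probe NICProxycsg -sys pa-ocsun-01
hares -probe NICProxycsg -sys pa-ocsun-02
hares -probe NIC-Proxy -sys pa-ocsun-01
hares -probe NIC-Proxy -sys pa-ocsun-02
```

A Proxy resource only mirrors the status of its target, so it cannot report a sensible state while the target resource is disabled and never probed.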
Re: [Veritas-ha] Step-by-Step instructions for adding storage to cluster
To be honest, you need to get through VCS training. Without knowing every detail, I cannot give exact steps; with basic VCS training this would be trivial and you would be fully confident making the changes. [Sent from my Nokia E62 handheld via Goodlink] -Original Message- From: Lynette Oliver [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 10, 2007 08:04 PM Pacific Standard Time To: Jim Senicka; veritas-ha@mailman.eng.auburn.edu Subject: RE: [Veritas-ha] Step-by-Step instructions for adding storage to cluster Thank you for your response, Jim. Do you have the steps? I've inherited a VCS configuration but have never worked on it before. I'm afraid to make changes for fear of creating a situation that could cause a failover. _ From: Jim Senicka [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 10, 2007 7:33 PM To: Lynette Oliver; veritas-ha@mailman.eng.auburn.edu Subject: RE: [Veritas-ha] Step-by-Step instructions for adding storage to cluster If you add volumes, you will need to add corresponding Volume resources (if you use Volume resources) in the service group, plus whatever file systems you add as additional Mount resources. Growing an existing file system requires no changes in the cluster. _ From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Lynette Oliver Sent: Wednesday, April 11, 2007 12:49 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Step-by-Step instructions for adding storage to cluster Hello HA gurus, I'm looking for someone to provide me with step-by-step instructions for adding storage to a cluster. For example, I have an existing cluster that requires a new volume group to be added. I have documentation that indicates how to create volume groups and volumes using VxVM, but nothing that describes how to integrate this with an existing cluster. In addition, if I need to grow a file system for a given volume group managed by a cluster, how do I do so? Please help. This is VCS 4.1 on Solaris 2.9 running on Hitachi USP.
Thanks, loliver
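Jim's outline, translated into VCS commands after the disk group, volume, and file system have been created in VxVM, might look like the following. Resource names, the service group name (mysg), and device paths are placeholders for illustration only:

```shell
haconf -makerw

# New disk group resource
hares -add newdg_res DiskGroup mysg
hares -modify newdg_res DiskGroup newdg

# New volume resource on that disk group
hares -add newvol_res Volume mysg
hares -modify newvol_res DiskGroup newdg
hares -modify newvol_res Volume vol01

# New file system resource
hares -add newmnt_res Mount mysg
hares -modify newmnt_res MountPoint "/data01"
hares -modify newmnt_res BlockDevice "/dev/vx/dsk/newdg/vol01"
hares -modify newmnt_res FSType vxfs
hares -modify newmnt_res FsckOpt "%-y"

# Wire up dependencies and enable
hares -link newvol_res newdg_res
hares -link newmnt_res newvol_res
hares -modify newdg_res Enabled 1
hares -modify newvol_res Enabled 1
hares -modify newmnt_res Enabled 1

haconf -dump -makero
```

None of this causes a failover by itself; the new resources simply come under cluster control the next time the group is brought online (or when the resources are brought online individually).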
Re: [Veritas-ha] VCS 5.0 / Solaris 10 Resource Controls / Oracle Agent
Bryan, unfortunately at this time the VCS 5.x agents are pretty much not designed to work in an SRM environment. We are looking at what it will take to support this. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bryan Pepin Sent: Tuesday, April 10, 2007 4:34 PM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] VCS 5.0 / Solaris 10 Resource Controls / Oracle Agent Hello, In the process of deploying Oracle 10g on top of SFRAC 5.0 running Solaris 10, I've noticed the following issues around setting shared memory parameters for Oracle. The Oracle agent does not assume the project that I have assigned to the Oracle user; it is assuming the system project, and when I try to add the resource controls to that system project or the default project, that does not work either. Here are the details. Trying to use Solaris' new project methodology to establish the IPC tunables, here is what I did:

# projadd -c 'IPC Tunables' -U oracle -G dba -K 'project.max-shm-memory=(privileged,16gb,deny)' user.oracle

Now, as the Oracle user, this allows the DB to open without issue. However, when I configure the Oracle VCS agent to start the DB, it appears that the VCS processes are assuming the system project, and when they start the database processes, those processes assume the controls of that project rather than those of the oracle user project that I have defined. Here is the error in the messages file when the DB tries to open from the VCS agent:

[ID 883052 kern.notice] privileged rctl project.max-shm-memory (value 6291603456) exceeded by project 0

So I logically thought I could apply the same tunings to the system project, but that does not work either.
This is what my project file looks like:

system:0::::process.max-sem-nsems=(privileged,4096,deny);\
process.max-sem-ops=(privileged,4096,deny);project.max-sem-ids=(privileged,4096,deny);\
project.max-shm-ids=(privileged,512,deny);project.max-shm-memory=(privileged,17179869184,deny)
user.root:1
noproject:2
default:3
group.staff:10
user.oracle:100:IPC Tunables:oracle:dba:process.max-sem-nsems=(privileged,4096,deny);\
process.max-sem-ops=(privileged,4096,deny);project.max-sem-ids=(privileged,4096,deny);\
project.max-shm-ids=(privileged,512,deny);project.max-shm-memory=(privileged,17179869184,deny)

What I have been able to do is change the parameters on the fly with prctl:

# ps -ef -o pid,project,args | grep -i OracleAgent    -- to get the PID and project
# prctl -n project.max-shm-memory -i process PID    -- to display
# prctl -n project.max-shm-memory -r -v 16gb -i process PID    -- to set

Once I do that, it allows me to start the database via the Oracle agent. Has anyone run into this issue? This may be me not properly setting up the system project, but I figure someone must have run into this and could share how they resolved it. I'm hoping there is an easy solution out there, rather than always having to change the parameter on the running agent. Hope that all makes sense. Thanks. -Bryan PS: What I have realized is that if I put the shmmax parameters in /etc/system, that works, but I was hoping not to have to fall back into that routine. -- Bryan Pepin Unix Enterprise Systems EMC Corporation 4400 Computer Drive Westboro, MA 01580 508-898-4776 [EMAIL PROTECTED]
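Since the rctl message above blames project 0 (the system project), one way to attach the tunable there is projmod rather than editing /etc/project by hand. A sketch only: projmod changes apply to newly started processes, so the already-running agent would still need the prctl workaround described above:

```shell
# Add (or replace) the resource control on the existing "system" project
projmod -sK 'project.max-shm-memory=(privileged,16gb,deny)' system

# Confirm the attribute took effect
projects -l system
```

Whether putting Oracle's shared-memory limit on the system project is the right design is a separate question; it is the project the VCS-started processes actually run in, which is the crux of the poster's problem.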
Re: [Veritas-ha] load-balancing in VCS
No, you will need an IP per node and an off-the-shelf IP load balancer out front. This is a far more standard approach than pumping all traffic through one node and letting it forward to all the others in the cluster; a serious case of marketecture versus real feature on the Sun Cluster side. [Sent from my Nokia E62 handheld via goodlink] -Original Message- From: Rongsheng Fang [mailto:[EMAIL PROTECTED] Sent: Friday, April 06, 2007 10:29 AM Pacific Standard Time To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] load-balancing in VCS Hi, Does VCS have (or support) the equivalent functionality of the Scalable Data Service in Sun Cluster, which can balance load between cluster nodes? http://docs.sun.com/app/docs/doc/819-0579/6n30dc0nf?a=view I know that in VCS service instances can start/run on different cluster nodes in parallel mode, but can these service instances share the same virtual IP, which can only be up on one node? Thanks, Rongsheng
Re: [Veritas-ha] Custom Agent
If you already have start/stop/monitor scripts, take a look at the Application agent in the bundled agents reference guide (BARG). That should cover about 98% of apps. -Original Message- From: Fred Butler [mailto:[EMAIL PROTECTED] Sent: Thursday, April 05, 2007 11:47 AM To: 'Stanley, Jon'; veritas-ha@mailman.eng.auburn.edu; Jim Senicka Subject: RE: [Veritas-ha] Custom Agent Thanks Jon / Jim! I know you guys don't want to hear this, but I write these agents all the time for Sun Cluster, and this is my first request to do one for VCS. I already have the start/stop/monitor scripts created; I just needed the info to incorporate them into the VCS framework. I will have to write a clean script after I determine whether there are things like shared memory, semaphores, or lock files that need to be cleaned up. Jon - Agent Developer's Guide, huh :-)! Next time I will RTFM. I will also read the document Jim sent me. Thanks again! Regards, Fred Butler (484) 241-5912 (Cell #1) (484) 903-4742 (Cell #2) http://www.arch.com/ Pin#: 8778977117 -Original Message- From: Stanley, Jon [mailto:[EMAIL PROTECTED] Sent: Thursday, April 05, 2007 11:06 AM To: Fred Butler; veritas-ha@mailman.eng.auburn.edu Subject: RE: [Veritas-ha] Custom Agent Have you looked at the aptly named Agent Developer's Guide? :-) Or maybe the Application agent does what you need instead? If you can provide external scripts for the online, offline, monitor, and clean functions, then that's all you need... -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Fred Butler Sent: Thursday, April 05, 2007 14:45 To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Custom Agent Team - I need to write a custom agent in VCS and I need to know which manual has this information. Or, if someone has notes on this process they would like to share, I would be very appreciative.
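With start/stop/monitor scripts already written, the Application agent route mentioned above usually needs only a main.cf entry along these lines. The paths are placeholders and the attribute list should be checked against the bundled agents reference guide for your VCS release:

```
Application myapp_res (
    User = root
    StartProgram = "/opt/myapp/bin/start"
    StopProgram = "/opt/myapp/bin/stop"
    CleanProgram = "/opt/myapp/bin/clean"
    MonitorProgram = "/opt/myapp/bin/monitor"
    PidFiles = { "/var/run/myapp.pid" }
)
```

MonitorProgram, MonitorProcesses, and PidFiles are alternative monitoring mechanisms; typically only the one that fits the application is configured.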
Re: [Veritas-ha] LVMVG agent does work with VIO ! ! !
We have a number of issues with reservations, breaking reservations, and such. So as of now, if the HCL says not supported, it is not. Please work with your account team to find out what can be done (if anything) to get this added. _ From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pavel A Tsvetkov Sent: Monday, April 02, 2007 9:43 AM To: Veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] LVMVG agent does work with VIO ! ! ! Hello all! My last post was a question about Symantec support of the LVMVG agent in a VIO configuration. It seems nobody could answer my question... so I decided to check it out myself. I installed VCS 5 MP1 for AIX on my 570 server with two LPARs and two VIOs; I used only one VIO in my configuration. One disk was shared by the VIO to the two LPARs. An LVM group was created and clustered. Everything was quite right! No problems with switching the LVM group over from one LPAR to another. So I'd very much like to get comments from the Symantec people! Regards, Pavel
Re: [Veritas-ha] Veritas Volume Replicator in Replicated Data Cluster question
RDC = automatic failover; GCO = operator-confirmed failover. So in GCO, an operator makes a choice to start up on old data or wait for the original primary. This is not possible inside a single cluster. And as for the bunker, that is a GCO config as well. [Sent from my Nokia E62 handheld via goodlink] -Original Message- From: Pavel A Tsvetkov [mailto:[EMAIL PROTECTED] Sent: Wednesday, March 14, 2007 02:09 AM Pacific Standard Time To: Jim Senicka; Veritas-ha@mailman.eng.auburn.edu Subject: Re: Re: [Veritas-ha] Veritas Volume Replicator in Replicated Data Cluster question Hello Jim! Thank you for the answer. But if I have a choice of replication mode in a global cluster, it could be useful to have the same thing in RDC. The data on the Secondary may not be up to date, but it is still consistent. :) So the application can be started. And we should take the bunker into consideration! The bunker uses a synchronous connection with the Primary SRL, so an asynchronous Secondary site can get up-to-date data from the bunker. So if we use a bunker and the bunker agent, it is quite possible to run RDC in asynchronous mode. With best regards, Pavel [quoted] We do not support an automatic failover to an out-of-date secondary, so RDC is sync only. If you need async, you need to not treat the replication like a shared disk, and instead treat it like replication. Take a look at the Global Cluster Option, now part of the VCS HA/DR edition.
Re: [Veritas-ha] Veritas Volume Replicator in Replicated Data Cluster question
If you configure auto failover in GCO (not recommended), then you need to make sure you are using sync replication only. -Original Message- From: Cronin, John S [mailto:[EMAIL PROTECTED] Sent: Wednesday, March 14, 2007 10:07 AM To: Jim Senicka; Pavel A Tsvetkov; Veritas-ha@mailman.eng.auburn.edu Subject: RE: [Veritas-ha] Veritas Volume Replicator in Replicated Data Cluster question I believe auto-failover is a configurable option with GCO, unless something has changed recently (I didn't go look at the docs). The default is operator confirmation before failover, but I have agreed to configure GCO for auto-failover before when a split brain did not present any significant risk to the customer (e.g., the customer said it was OK and their preference, and after inquiring into the facts of the situation, I agreed with their conclusions). -- John Cronin 678-480-6266 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jim Senicka Sent: Wednesday, March 14, 2007 5:14 AM To: Pavel A Tsvetkov; Veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] Veritas Volume Replicator in Replicated Data Cluster question RDC = automatic failover; GCO = operator-confirmed failover. So in GCO, an operator makes a choice to start up on old data or wait for the original primary. This is not possible inside a single cluster. And as for the bunker, that is a GCO config as well. [Sent from my Nokia E62 handheld via goodlink] -Original Message- From: Pavel A Tsvetkov [mailto:[EMAIL PROTECTED] Sent: Wednesday, March 14, 2007 02:09 AM Pacific Standard Time To: Jim Senicka; Veritas-ha@mailman.eng.auburn.edu Subject: Re: Re: [Veritas-ha] Veritas Volume Replicator in Replicated Data Cluster question Hello Jim! Thank you for the answer. But if I have a choice of replication mode in a global cluster, it could be useful to have the same thing in RDC. The data on the Secondary may not be up to date, but it is still consistent. :) So the application can be started.
And we should take the bunker into consideration! The bunker uses a synchronous connection with the Primary SRL, so the asynchronous Secondary site can get up-to-date data from the bunker. So if we use a bunker and the bunker agent, it is quite possible to run RDC in asynchronous mode. With best regards, Pavel We do not support an automatic failover to an out-of-date secondary. So RDC is sync only. If you need async, you need to not treat the replication like a shared disk, and instead treat it like replication. Take a look at the Global Cluster Option, now part of the VCS HA/DR edition. ___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
Re: [Veritas-ha] Veritas Volume Replicator in Replicated Data Cluster question
We do not support an automatic failover to an out-of-date secondary. So RDC is sync only. If you need async, you need to not treat the replication like a shared disk, and instead treat it like replication. Take a look at the Global Cluster Option, now part of the VCS HA/DR edition. [Sent from my Nokia E62 handheld via goodlink] -Original Message- From: Eric Hennessey [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 13, 2007 09:07 AM Pacific Standard Time To: Pavel A Tsvetkov; Veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] Veritas Volume Replicator in Replicated Data Cluster question RDCs are supported only with synchronous replication, regardless of the type of replication used. It doesn't matter if it's VVR or some form of array-based replication. Eric _ From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pavel A Tsvetkov Sent: Tuesday, March 13, 2007 11:08 AM To: Veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] Veritas Volume Replicator in Replicated Data Cluster question Hello all! It is known that VVR 4.x can work in a Replicated Data Cluster only in synchronous mode. What about version 5? Is it possible to have an asynchronous RLINK between the Primary and Secondary sites? Thanks! Pavel
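As a quick sanity check when deciding between RDC and GCO, you can inspect an RLINK's current replication mode and force it to synchronous. A rough sketch, assuming a hypothetical disk group `oradg`, RVG `oradb_rvg`, and RLINK `rlk_siteb`; exact attribute names and options vary by VVR release, so verify against your version's VVR Administrator's Guide:

```
# Show the RLINK's current settings (look for the "synchronous" attribute)
vxprint -g oradg -l rlk_siteb

# Force synchronous replication, as an RDC requires
vradmin -g oradg set oradb_rvg synchronous=override
```

With `synchronous=override`, replication is synchronous while the RLINK is connected, which is the behavior an RDC depends on for the Secondary to be safe for automatic failover.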
Re: [Veritas-ha] IO Fencing
I/O fencing removes any chance of a split brain in corner cases where all interconnects are severed between sets of nodes and the nodes remain running. [Sent from my Nokia E62 handheld via goodlink] -Original Message- From: Tharindu Rukshan Bamunuarachchi [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 06, 2007 04:46 AM Pacific Standard Time To: Veritas-ha@mailman.eng.auburn.edu Cc: veritas-vx@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] IO Fencing It seems I cannot enable I/O fencing; my disk controller does not support SCSI3-PR. Can someone please tell me the effect on applications if I/O fencing is disabled. Thanks Tharindu On 3/6/07, Tharindu Rukshan Bamunuarachchi [EMAIL PROTECTED] wrote: Dear All, I have installed Veritas SFCFS on a Sun 3310 disk array, but I could not enable I/O fencing. Can someone please explain what I/O fencing is in Veritas SFCFS, how I can enable it, and what benefits I would get from it. Thanks Tharindu -- Tharindu Rukshan Bamunuarachchi all fabrications are subject to decay
Re: [Veritas-ha] VCS with Blades
VCS does not *require* private links. We recommend but do not require them. We do require 2 links. You will need to make one NIC high-pri and one low-pri. [Sent from my Nokia E62 handheld via goodlink] -Original Message- From: Kiss László - Károly [mailto:[EMAIL PROTECTED] Sent: Friday, March 02, 2007 07:38 AM Pacific Standard Time To: [EMAIL PROTECTED] Subject: [Veritas-ha] VCS with Blades Hi, Does anyone have experience using VCS with IBM Blades? Especially with the Blade LS21? We are just planning to use this hardware, and the first problem is the lack of a resource for the heartbeat link. The Blade has only 2 network ports and both are used for the public network, so no interface remains for the heartbeat private network. Is there any other choice for the heartbeat link than a private network? Thanks. Best Regards, Laszlo
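The one-high-pri, one-low-pri layout is expressed in /etc/llttab on each node. A minimal sketch with made-up values (node name `node1`, cluster number 7, Solaris `bge` interfaces); check device paths and directive syntax against the LLT documentation for your platform and release:

```
# /etc/llttab (per node; names and numbers here are examples)
set-node    node1
set-cluster 7                  # must be unique among clusters sharing the network
link        bge1 /dev/bge:1 - ether - -    # dedicated heartbeat link, high priority
link-lowpri bge0 /dev/bge:0 - ether - -    # public interface doubling as low-pri heartbeat
```

The `link-lowpri` directive tells LLT to use that interface only for heartbeats (not cluster traffic) unless the high-priority links fail, which keeps heartbeat load on the public network minimal.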
Re: [Veritas-ha] VCS with Blades
No need for node IDs to be different, but the cluster ID must be managed. Newer releases of VCS allow up to 64k cluster numbers, if I recall correctly. [Sent from my Nokia E62 handheld via goodlink] -Original Message- From: Andrey Dmitriev [mailto:[EMAIL PROTECTED] Sent: Friday, March 02, 2007 10:33 AM Pacific Standard Time To: [EMAIL PROTECTED] Subject: Re: [Veritas-ha] VCS with Blades If you do go over public, make sure your cluster ID is unique, and maybe node IDs too, across different clusters on the network/subnet. That bit us on an older version of VERITAS Cluster (1.3). -a _ From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Stanley, Jon Sent: Friday, March 02, 2007 10:51 AM To: Kiss László - Károly; [EMAIL PROTECTED] Subject: Re: [Veritas-ha] VCS with Blades I know that in HP blades you can put in mezzanine cards that give you additional ports beyond the on-board ones (they have two slots, so you could put in a dual-channel FC card, I think, and a quad-port Ethernet adapter). I think that you *can* use the public network for LLT; I'm not sure whether this is actually supported for anything other than a lowpri link. _ From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kiss László - Károly Sent: Friday, March 02, 2007 10:23 AM To: [EMAIL PROTECTED] Subject: [Veritas-ha] VCS with Blades Hi, Does anyone have experience using VCS with IBM Blades? Especially with the Blade LS21? We are just planning to use this hardware, and the first problem is the lack of a resource for the heartbeat link. The Blade has only 2 network ports and both are used for the public network, so no interface remains for the heartbeat private network. Is there any other choice for the heartbeat link than a private network? Thanks. Best Regards, Laszlo 
Re: [Veritas-ha] Creating a new cluster membership with existing ones
Is this a one-time thing, and will the cluster stay split? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of R Sent: Tuesday, January 30, 2007 5:47 AM To: veritas-ha@mailman.eng.auburn.edu Subject: Re: [Veritas-ha] Creating a new cluster membership with existing ones One way of splitting your 6-node cluster into 2 x 3-node clusters could be as follows:
1. Switch all the existing service groups to the first 3 nodes: # hagrp -switch sg_name -to system_name
2. Delete the SystemList entries of the second 3 nodes from all the service groups: # hagrp -modify sg_name SystemList -delete system_name
3. Delete the second 3 nodes from the cluster: # hasys -delete system_name
4. Create a new 3-node cluster using the deleted nodes.
If the service groups on the 6-node cluster need to be split between the 2 x 3-node clusters, then you might have to take the service groups offline at least once. -R
Re: [Veritas-ha] SF4.1 VCS5.0
That will work. _ From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pavel A Tsvetkov Sent: Monday, January 29, 2007 9:45 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] SF4.1 VCS5.0 Hello all! Is it possible to run VCS 5.0 with SF 4.1? Any problems? Thanks a lot! Pavel.
Re: [Veritas-ha] Need to move my site from one location to another (using VCS 4.1)
From the VCS side, you will need to update the host names in main.cf and llthosts. You will also need to update the virtual IP in main.cf for each service group. Oracle will likely need an update in listener.ora to reflect the new VIP for the listener. (Sent from my Blackberry wireless handheld) -Original Message- From: [EMAIL PROTECTED] [EMAIL PROTECTED] To: veritas-ha@mailman.eng.auburn.edu veritas-ha@mailman.eng.auburn.edu Sent: Mon Dec 11 07:11:52 2006 Subject: [Veritas-ha] Need to move my site from one location to another (using VCS 4.1) hi all I am a new member of this group and proud to join it. I have 4 nodes (Sun Solaris 10) in a cluster using Veritas Cluster 4.1, connected to SAN disks. These servers host 7 service groups, each providing a different Oracle application service. We are planning to move all servers, including the SAN, from our location A to another location B; only the IPs and hostnames need to be changed at location B. How can I go ahead with this site move (reconfiguring the cluster) step by step? Give your valuable inputs on how and what needs to be changed in the cluster and OS. Also please clarify whether any changes in the cluster config will cause any problem for the Oracle Database. Specification: OS: Sun Solaris 10 VCS: 4.1 SAN: Hitachi Oracle: 9i and 10g Cheers and Regards Damodharan K Tata Consultancy Services Mailto: [EMAIL PROTECTED] Website: http://www.tcs.com 
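The files mentioned above would change roughly as follows. A sketch with made-up hostnames and addresses; the real values come from location B's network plan, and the resource names here are hypothetical:

```
# /etc/llthosts on every node: node IDs keep their values, hostnames change
0 b-node1
1 b-node2
2 b-node3
3 b-node4

# main.cf: each service group's IP resource gets the new virtual address
IP ora1_ip (
    Device = bge0
    Address = "10.20.30.41"
    NetMask = "255.255.255.0"
    )
```

Edit these with the cluster stopped (or via `haconf -makerw` for main.cf attributes), and remember that listener.ora on each node must point at the same new VIP as the service group's IP resource.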
Thank you
Re: [Veritas-ha] HA nodes Patch levels
Within a single cluster, VCS supports any OS release and patch level that that version of VCS itself supports. So you do not need identical patch levels, or even the same OS release. Best practice would be to keep them the same, but we can easily support multiple versions during upgrades. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Evsyukov, Sergey Sent: Monday, November 13, 2006 10:18 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] HA nodes Patch levels Hello colleagues, We have two nodes for an HA cluster installation. They have identical OS versions (Solaris 5.9), but different kernel patch levels: 118558-30 vs. 118558-25. Is this an admissible cluster configuration, or must the patch levels be identical? Thanks, Sergey
Re: [Veritas-ha] low priority heartbeat vs I/O Fencing
EMC = 170-odd-gig disks. The coordinator disk group needs three 10 MB LUNs. I waste more space than that storing bad jokes from email. I/O fencing is the best possible config to prevent split brain. A low-priority heartbeat is a best practice, but it is still not bulletproof. -Original Message- From: Steven Sim [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 19, 2006 11:18 AM To: Jim Senicka; veritas-ha@mailman.eng.auburn.edu Subject: low priority heartbeat vs I/O Fencing Hello Gurus; Firstly, I wish to thank James Senicka of Symantec for his wonderfully fast and very technically accurate replies. I wish all other product vendors were so efficient. Support like this is one reason why I will continue to push VCS as a clustering solution. I am currently trying to convince a customer to implement I/O fencing with three SAN-based coordinator disks. I've told him 3 are required (minimum) and they cannot be used for data. At which point he threw a look at me and asked me whether I was aware of how much per byte his EMC was costing him. So some bright spark suggested a low-priority heartbeat, which I was going to implement anyway, with or without I/O fencing. My question is: is a low-priority heartbeat sufficient in place of I/O fencing? If so, why the strong recommendation for I/O fencing with three coordinator disks? I have been telling people that a low-priority heartbeat is not sufficient protection against split-brain scenarios. Could you guys comment? Warmest Regards Steven Sim Fujitsu Asia Pte. Ltd.
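For reference, once a coordinator disk group exists, disk-based fencing is wired up through two small files on each node. A sketch assuming a coordinator disk group named `vxfencoorddg`; exact keys vary between VCS releases, so check the I/O fencing chapter of your version's documentation:

```
# /etc/vxfendg — just the name of the coordinator disk group
vxfencoorddg

# /etc/vxfenmode — enable SCSI-3 disk-based fencing
vxfen_mode=scsi3
scsi3_disk_policy=dmp
```

Note that the three coordinator LUNs only arbitrate membership via SCSI-3 persistent reservations; they hold no data, which is why tiny LUNs are sufficient.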
Re: [Veritas-ha] LLT errors - delayed and lost hb ticks
You have two LLT streams sharing common infrastructure/switch/VLAN. Each LLT link must be completely independent, and neither stream should see packets from the other. From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kawaley Winston Sent: Friday, September 15, 2006 11:18 AM To: veritas-ha@mailman.eng.auburn.edu Subject: [Veritas-ha] LLT errors - delayed and lost hb ticks Hi all, We are running VCS 4.1 on two Solaris 9 systems and have configured a local cluster for our configuration management software, ClearCase. Recently I have been receiving a lot of the following LLT latency errors:
Sep 14 17:24:18 ncfbvcs01 llt: [ID 794702 kern.notice] LLT INFO V-14-1-10019 delayed hb 18561 ticks from 1 link 0 (bge1)
Sep 14 17:24:18 ncfbvcs01 llt: [ID 602713 kern.notice] LLT INFO V-14-1-10023 lost 373 hb seq 30608288 from 1 link 0 (bge1)
Sep 14 17:24:18 ncfbvcs01 llt: [ID 794702 kern.notice] LLT INFO V-14-1-10019 delayed hb 18561 ticks from 1 link 1 (bge2)
Sep 14 17:24:18 ncfbvcs01 llt: [ID 602713 kern.notice] LLT INFO V-14-1-10023 lost 373 hb seq 30608288 from 1 link 1 (bge2)
Sep 14 17:24:18 ncfbvcs01 llt: [ID 602713 kern.notice] LLT INFO V-14-1-10023 lost -4 hb seq 30608285 from 1 link 1 (bge2)
Sep 14 17:24:18 ncfbvcs01 llt: [ID 602713 kern.notice] LLT INFO V-14-1-10023 lost -4 hb seq 30608285 from 1 link 0 (bge1)
Sep 14 17:24:48 ncfbvcs01 llt: [ID 794702 kern.notice] LLT INFO V-14-1-10019 delayed hb 2955 ticks from 1 link 1 (bge2)
Sep 14 17:24:48 ncfbvcs01 llt: [ID 602713 kern.notice] LLT INFO V-14-1-10023 lost 62 hb seq 30608348 from 1 link 1 (bge2)
Sep 14 17:24:48 ncfbvcs01 llt: [ID 794702 kern.notice] LLT INFO V-14-1-10019 delayed hb 2955 ticks from 1 link 0 (bge1)
Sep 14 17:24:48 ncfbvcs01 llt: [ID 602713 kern.notice] LLT INFO V-14-1-10023 lost 62 hb seq 30608348 from 1 link 0 (bge1)
Sep 14 17:24:48 ncfbvcs01 llt: [ID 602713 kern.notice] LLT INFO V-14-1-10023 lost -4 hb seq 30608345 from 1 link 0 (bge1)
Sep 14 17:24:48 ncfbvcs01 llt: [ID 602713 kern.notice] LLT INFO V-14-1-10023 lost -4 hb seq 30608345 from 1 link 1 (bge2)
Does anyone know what exactly is causing these delayed and lost ticks and how they can be corrected? Thanks, Winston Kawaley
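Messages like these are easier to reason about in aggregate. A small sketch (not a Veritas tool, just ad-hoc log triage) that tallies delayed ticks and lost heartbeats per remote node, link, and NIC from syslog lines:

```python
import re

# Matches LLT heartbeat warnings such as:
#   "LLT INFO V-14-1-10019 delayed hb 18561 ticks from 1 link 0 (bge1)"
#   "LLT INFO V-14-1-10023 lost 373 hb seq 30608288 from 1 link 0 (bge1)"
DELAYED = re.compile(r"V-14-1-10019 delayed hb (\d+) ticks from (\d+) link (\d+) \((\w+)\)")
LOST = re.compile(r"V-14-1-10023 lost (-?\d+) hb seq \d+ from (\d+) link (\d+) \((\w+)\)")

def summarize(lines):
    """Tally delayed ticks and lost heartbeats per (node, link, nic)."""
    stats = {}
    for line in lines:
        m = DELAYED.search(line)
        if m:
            key = (int(m.group(2)), int(m.group(3)), m.group(4))
            s = stats.setdefault(key, {"delayed_ticks": 0, "lost": 0})
            s["delayed_ticks"] += int(m.group(1))
            continue
        m = LOST.search(line)
        if m:
            key = (int(m.group(2)), int(m.group(3)), m.group(4))
            s = stats.setdefault(key, {"delayed_ticks": 0, "lost": 0})
            # Negative "lost" counts are duplicate/reorder artifacts; ignore them.
            s["lost"] += max(0, int(m.group(1)))
    return stats
```

If both links show losses at the same timestamps, as in the log above, the cause is usually shared (CPU starvation on a node, or a common switch/VLAN in both paths) rather than a single NIC or cable.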