Re: [Ocfs2-users] OCFS2 and VMware ESX

2009-11-04 Thread Brian Kroth
Late to the party...

Here's what I did to get OCFS2 going with RDMs _and_ VMotion (with some
exceptions).  Almost surely not supported, but it works:

First, create the RDM as a physical passthru using either the GUI or the
CLI.  It need not be on a separate virtual controller.  However, for
VMotion to work it must _not_ use bus sharing.
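
On the CLI that boils down to something like this for the first node (the
vmhba path is just an example; use whatever your LUN shows up as under
/vmfs/devices/disks):

# Create the passthru (physical compatibility) RDM for node1
cd /vmfs/volumes/wherever/node1
vmkfstools -z /vmfs/devices/disks/vmhba37\:25\:0\:0 node1_1.vmdk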

For the other nodes, set up another passthru RDM that uses the same path
as the previous one.  You must do this from the CLI, since the GUI will
hide the LUN path once it's been configured anywhere else.

# Query the path:
cd /vmfs/volumes/wherever/node1
vmkfstools -q node1_1.vmdk
# Make a passthru VMDK for that RDM
cd /vmfs/volumes/wherever/node2
vmkfstools -z /vmfs/devices/disks/vmhba37\:25\:0\:0 node2.vmdk

Now you can add that disk to the VM either by editing the vmx like David
said, or with the GUI.
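
If you edit the vmx, the entries for the new disk end up looking roughly
like this (the SCSI IDs and the extra controller are just an example,
since as noted it doesn't need its own controller):

scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
scsi1:0.present = "TRUE"
scsi1:0.fileName = "node2.vmdk"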

Here's the catch - ESX locks that /vmfs/devices/disks/whatever path when
a node starts using it, so you can't:
1) Run more than one of these VM nodes on the same ESX host, or
2) Migrate one of these VM nodes to an ESX host that is already running
another of the VM nodes.

So, if your ESX cluster has more hosts than your OCFS2 cluster has nodes,
you're OK.  Otherwise, you need to stick with either the bus sharing
method, which means no VMotion, or the cluster-in-a-box method, which
means everything on one ESX host (which kinda defeats the point in my
opinion).

I have noticed significant performance benefits from moving to RDMs
versus software iSCSI virtualized in the guest OS.  It's the only reason
I'd risk all this.

As for snapshots, I've been told by the VMware techs not to use them for
production-level things (or at least not for very long), as they can
really kill performance.

We do snapshots on the RAID device serving the LUN and restrict the
snapshot's visibility to a particular machine (actually another VM) in
order to do all of the backups from there.  It actually had to be a
separate VM outside the normal cluster, or else the machine would refuse
to mount the snapshot.  It works out better this way anyway, since we're
not burdening the production VM with the backup work (though it still
hits the same storage device).

The script that does this also handles all of the filesystem fixups
needed to have multiple snapshots mounted at once, though I think the
1.4.2 version of ocfs2-tools provides a cloned-fs option for doing all
that now.
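
The gist of those fixups is giving each mounted snapshot its own identity
so it doesn't collide with the live volume.  Roughly (device names are
made up, and I haven't used the newer option myself):

# give the snapshot a fresh UUID and label before mounting it
tunefs.ocfs2 -U /dev/mapper/lun-snap0
tunefs.ocfs2 -L backup-snap0 /dev/mapper/lun-snap0
# ocfs2-tools 1.4.2 is supposed to wrap this up as
tunefs.ocfs2 --cloned-volume /dev/mapper/lun-snap0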

Brian

David Murphy da...@icewatermedia.com 2009-10-22 09:10:
With RDM, versus the method Kent described, it's a bit more complicated,
and it will prevent snapshots and VMotion.

Basically follow what he said, but instead of making a VMDK disk, choose
RDM and select a LUN.
 
 
 
Then make sure that machine is NOT powered on, log into the ESX host, and
move the RDM file to, say, /vmfs/volumes/volume_name/RawDeviceMaps (you
need to make that folder).

Next, manually edit the VMX for that host and change its path to the RDM
to wherever you moved it.
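
Concretely, that's along the lines of (datastore, VM, and file names are
only examples):

mkdir /vmfs/volumes/volume_name/RawDeviceMaps
# picks up both the small descriptor and its -rdm/-rdmp companion file
mv /vmfs/volumes/volume_name/myvm/myvm_1*.vmdk /vmfs/volumes/volume_name/RawDeviceMaps/

and then in the .vmx the disk entry gets pointed at the new location,
e.g.:

scsi1:0.fileName = "/vmfs/volumes/volume_name/RawDeviceMaps/myvm_1.vmdk"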
 
 
 
Now you can create new clones of your base template and add the RDM drive
to the clone (as Kent mentioned, it's VERY important), pointing to the
RawDeviceMaps folder and the correct RDM file for that LUN.

This approach has many issues, so I'm planning on moving away from it:

1)  You can't clone.

2)  You can't snapshot.

3)  You can't VMotion.

4)  If you delete a host that has that drive attached, you completely
destroy the RDM file. (BAD JUJU)

If you do need to have a cluster in such an environment, I would suggest
a combination of the 2 approaches:

1)  Build a new LUN, make it VMFS, and let the ESX hosts discover it.

2)  Create the VMDKs on that LUN, not on your main VMFS for VMs.

3)  Make sure you set any OCFS drive on a separate controller, shared as
physical, and independent-persistent (so snapshots won't touch it).

You should retain snapshots and VMotion, but be aware: I am not sure
whether cloning will create the new VMDK on the VMFS volume you made for
the OCFS drives.  So I would keep a base template, clone it, and then add
that drive to the clone (to guarantee the drive's location).

It's a bit more work than just saving the VMDK to the VM's folder on your
main VMFS, but it separates the OCFS drives onto another LUN.  So you
could easily stop your cluster, take a snapshot of the LUN for backups,
and bring the nodes back up, limiting your downtime window.  It might be
overkill, depending on the company's backup stance.

Hope it helps,
David
 
 
 
From: ocfs2-users-boun...@oss.oracle.com
[mailto:ocfs2-users-boun...@oss.oracle.com] On Behalf Of Rankin, Kent
Sent: Monday, July 28, 2008 9:13 PM
To: Haydn Cahir; ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] OCFS2 and VMware ESX
 
 
 

Re: [Ocfs2-users] OCFS2 and VMware ESX

2009-10-21 Thread Rankin, Kent
What I did a few days ago was to create a VMware disk for each OCFS2
filesystem and store it with one of the VM nodes.  Then, add that disk to
each additional VM.  When you add it, use a separate SCSI host number.  In
other words, if the OS is on SCSI 0:0, make the disk SCSI 1:0, or some
arbitrary other HBA number.  Then you can go to each host's second VM SCSI
device and modify it to be shared, and of type Physical (if I remember
correctly).  At that point, it works fine.
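
From the vmx side, the shared/physical bit ends up as something like the
following (the disk path and SCSI IDs are only an example):

scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "physical"
scsi1:0.present = "TRUE"
scsi1:0.fileName = "/vmfs/volumes/datastore1/node1/ocfs2data.vmdk"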


--
Kent Rankin


-Original Message-
From: ocfs2-users-boun...@oss.oracle.com on behalf of Haydn Cahir
Sent: Mon 7/28/2008 9:48 PM
To: ocfs2-users@oss.oracle.com
Subject: [Ocfs2-users] OCFS2 and VMware ESX
 
Hi,

We are having some serious issues trying to configure an OCFS2 cluster on 3 SLES 
10 SP2 boxes running in VMware ESX 3.0.1. Before I go into any of the detailed 
errors we are experiencing, I first wanted to ask everyone whether they have 
successfully configured this solution. We would be interested to find out what 
needs to be set at the VMware level (RDM, VMFS, NICs, etc.) and what needs to be 
configured at the O/S level. We have a LUN on our SAN that we have presented to 
our VMware hosts and that we are using for this.

Any help would be greatly appreciated!


Re: [Ocfs2-users] OCFS2 and VMware ESX

2008-07-30 Thread Haydn Cahir
Thanks everyone for your replies. We have started fresh using raw mapped LUNs on 
two nodes, with a fresh format, and everything is working well.

Thanks again.

 Sedlock, Mark A. [EMAIL PROTECTED] 07/30/08 12:27 AM 
The device mappings were nothing out of the ordinary: an LSI Logic SCSI
controller (only one for the whole VM), two raw mapped LUNs presented to
each VM (both OCFS2), no redundant SAN uplinks (so no managed paths), and
physical mappings (not virtual).  We had some problems when we first
started, before we figured out we needed to keep the VMs on different
physical ESX hosts, since multiple VMs on the same host didn't play well
with raw mapped physical LUNs (which seems obvious in retrospect).

In this set up we didn't have to adjust the SCSI host number (as Kent
mentioned).  We've run heartbeats on both a private network (second
virtual NIC, dedicated virtual switch in ESX) and the primary network
interface; both have worked fine.
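
For reference, which network the o2cb heartbeat/interconnect traffic uses
just follows the addresses in /etc/ocfs2/cluster.conf, so the
private-network variant looks roughly like this (names and addresses are
made up):

node:
        ip_port = 7777
        ip_address = 192.168.10.11
        number = 0
        name = web7
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.10.12
        number = 1
        name = web8
        cluster = ocfs2

cluster:
        node_count = 2
        name = ocfs2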

--mark
Mark Sedlock
Network and System Services
Rowan University


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Haydn Cahir
Sent: Monday, July 28, 2008 11:08 PM
To: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] OCFS2 and VMware ESX

Hi Mark,

Thanks for your reply. How did you configure your RDM mappings? We have
tried a few combinations already. We have three nodes and are trying to
use a single OCFS2 volume. We are encountering a range of errors: VMs not
starting when another node is already started (we think this goes back to
the RDM configuration), two of the nodes being able to edit files on the
OCFS2 volume while the third doesn't see any changes made by the other
nodes, and the OCFS2 volume switching to read-only due to errors on the
volume.

We have tried running just two nodes and still get the problem where the
volume will switch over to read-only. I will look into the time
differences on the servers; we normally have to make changes in the grub
config and NTP settings to keep the time in sync.
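
For what it's worth, the changes usually meant here are a clock-related
parameter on the kernel line in grub plus telling ntpd to keep correcting
even large offsets. The exact parameter depends on the kernel and ESX
version, so the lines below are illustrative only:

# /boot/grub/menu.lst - kernel line (illustrative)
kernel /boot/vmlinuz-2.6.16.53-0.16-smp root=/dev/sda2 clock=pit

# /etc/ntp.conf - don't let ntpd give up on large offsets
tinker panic 0
server ntp.example.com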

FYI the version of OCFS2 on SLES 10 SP2 is completely different:

HIT-TCN1:~ # rpm -qa | grep -i ocfs
ocfs2-tools-1.4.0-0.3
ocfs2console-1.4.0-0.3

I can't even find reference to this version on the Oracle web site.

Cheers

 Sedlock, Mark A. [EMAIL PROTECTED] 07/29/08 12:42 PM 
We run a similar set up, SLES 10 SP1, we were ESX 3.0.x and are now 3.5.
We're running the version of ocfs2 that shipped with SLES 10 SP1.

4 nodes accessing raw mapped LUNs via ESX from an HP SAN on HP Blade
Servers. Qlogic HBAs, standard NICs; nothing special.

The biggest hurdle we ran into was time synch on the individual hosts
(VMWare ESX + some variants of Linux have an interesting clock tick
relationship which I still don't understand) that was causing some ugly
fencing.

It's been running well for about 8 months.  Overall we're pretty happy
with it thus far.  That said, we don't let ESX VMotion the cluster nodes
via DRS, but that's more because we haven't tested it.  The cluster is
used for Apache web hosting.

web7:~:%1003#rpm -qa | grep -i ocfs
ocfs2-tools-1.2.3-0.7
ocfs2console-1.2.3-0.7
ocfs2-tools-devel-1.2.3-0.7
web7:~:%1004#uname -a
Linux web7 2.6.16.53-0.16-smp #1 SMP Tue Oct 2 16:57:49 UTC 2007 i686
athlon i386 GNU/Linux
web7:~:%1005#cat /etc/SuSE-release 
SUSE Linux Enterprise Server 10 (i586)
VERSION = 10
PATCHLEVEL = 1
web7:~:%1006#

--mark
Mark Sedlock
Network and System Services
Rowan University





Re: [Ocfs2-users] OCFS2 and VMware ESX

2008-07-29 Thread Sedlock, Mark A.
The device mappings were nothing out of the ordinary, LSI Logic SCSI
controller (only one for the whole VM), we're using two raw mapped LUNs
to each VM both OCFS2, we're not using redundant SAN uplinks (so there's
no managed paths), Physical mappings (not virtual).  We had some
problems when we first started before we figured out we needed to keep
the VMs on different physical ESX nodes since multiple VMs on the same
host didn't play well with raw mapped physical LUNs (which seems obvious
in retrospect).

In this set up we didn't have to adjust the SCSI host number (as Kent
mentioned).  We've run heartbeats on both a private network (second
virtual NIC, dedicated virtual switch in ESX) and the primary network
interface; both have worked fine.

--mark
Mark Sedlock
Network and System Services
Rowan University


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Haydn Cahir
Sent: Monday, July 28, 2008 11:08 PM
To: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] OCFS2 and VMware ESX

Hi Mark,

Thanks for your reply. How did you configure your RDM mappings? We have
tried a few combinations already. We have three nodes and are trying to
use a single OCFS2 volume. We are encountering a range of errors like
VM's not starting when another node is already started (goes back to the
RDM configurations we think), two of the nodes are able to edit files in
the OCFS2 volumes but the third doesn't see any changes made by the
other nodes and the OCFS2 volume switching to read-only due to errors on
the volume.

We have tried running just two nodes and still get the problem where the
volume will switch over to read-only. I will look into the time
differences on the server, we normally have to make changes in the grub
config and NTP settings to keep the time in synch.

FYI the version of OCFS2 on SLES 10 SP2 is completely different:

HIT-TCN1:~ # rpm -qa | grep -i ocfs
ocfs2-tools-1.4.0-0.3
ocfs2console-1.4.0-0.3

I can't even find reference to this version on the Oracle web site.

Cheers

 Sedlock, Mark A. [EMAIL PROTECTED] 07/29/08 12:42 PM 
We run a similar set up, SLES 10 SP1, we were ESX 3.0.x and are now 3.5.
We're running the version of ocfs2 that shipped with SLES 10 SP1.

4 nodes accessing raw mapped LUNs via ESX from an HP SAN on HP Blade
Servers. Qlogic HBAs, standard NICs; nothing special.

The biggest hurdle we ran into was time synch on the individual hosts
(VMWare ESX + some variants of Linux have an interesting clock tick
relationship which I still don't understand) that was causing some ugly
fencing.

It's been running well for about 8 months.  Overall we're pretty happy
with it thus far.  That said, we don't let ESX VMotion the cluster nodes
via DRS, but that's more because we haven't tested it.  The cluster is
used for Apache web hosting.

web7:~:%1003#rpm -qa | grep -i ocfs
ocfs2-tools-1.2.3-0.7
ocfs2console-1.2.3-0.7
ocfs2-tools-devel-1.2.3-0.7
web7:~:%1004#uname -a
Linux web7 2.6.16.53-0.16-smp #1 SMP Tue Oct 2 16:57:49 UTC 2007 i686
athlon i386 GNU/Linux
web7:~:%1005#cat /etc/SuSE-release 
SUSE Linux Enterprise Server 10 (i586)
VERSION = 10
PATCHLEVEL = 1
web7:~:%1006#

--mark
Mark Sedlock
Network and System Services
Rowan University





[Ocfs2-users] OCFS2 and VMware ESX

2008-07-28 Thread Haydn Cahir
Hi,

We are having some serious issues trying to configure an OCFS2 cluster on 3 SLES 
10 SP2 boxes running in VMware ESX 3.0.1. Before I go into any of the detailed 
errors we are experiencing, I first wanted to ask everyone whether they have 
successfully configured this solution. We would be interested to find out what 
needs to be set at the VMware level (RDM, VMFS, NICs, etc.) and what needs to be 
configured at the O/S level. We have a LUN on our SAN that we have presented to 
our VMware hosts and that we are using for this.

Any help would be greatly appreciated!



Re: [Ocfs2-users] OCFS2 and VMware ESX

2008-07-28 Thread Sedlock, Mark A.
We run a similar set up, SLES 10 SP1, we were ESX 3.0.x and are now 3.5.
We're running the version of ocfs2 that shipped with SLES 10 SP1.

4 nodes accessing raw mapped LUNs via ESX from an HP SAN on HP Blade
Servers. Qlogic HBAs, standard NICs; nothing special.

The biggest hurdle we ran into was time synch on the individual hosts
(VMWare ESX + some variants of Linux have an interesting clock tick
relationship which I still don't understand) that was causing some ugly
fencing.

It's been running well for about 8 months.  Overall we're pretty happy
with it thus far.  That said, we don't let ESX VMotion the cluster nodes
via DRS, but that's more because we haven't tested it.  The cluster is
used for Apache web hosting.

web7:~:%1003#rpm -qa | grep -i ocfs
ocfs2-tools-1.2.3-0.7
ocfs2console-1.2.3-0.7
ocfs2-tools-devel-1.2.3-0.7
web7:~:%1004#uname -a
Linux web7 2.6.16.53-0.16-smp #1 SMP Tue Oct 2 16:57:49 UTC 2007 i686
athlon i386 GNU/Linux
web7:~:%1005#cat /etc/SuSE-release 
SUSE Linux Enterprise Server 10 (i586)
VERSION = 10
PATCHLEVEL = 1
web7:~:%1006#

--mark
Mark Sedlock
Network and System Services
Rowan University

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:ocfs2-users-
 [EMAIL PROTECTED] On Behalf Of Haydn Cahir
 Sent: Monday, July 28, 2008 9:49 PM
 To: ocfs2-users@oss.oracle.com
 Subject: [Ocfs2-users] OCFS2 and VMware ESX
 
 Hi,
 
 We are having some serious issues trying to configure an OCFS2 cluster
 on 3 SLES 10 SP2 boxes running in VMware ESX 3.0.1. Before I go into
 any of the detailed errors we are experiencing I first wanted to ask
 everyone if they have successfully configured this solution? We would
 be interested to find out what needs to be set at the VMware level
 (RDM, VMFS, NICS etc) and what needs to be configured at the O/S
level.
 We have a LUN on our SAN that we have presented to our VMware hosts
 that we are using for this.
 
 Any help would be greatly appreciated!
 


Re: [Ocfs2-users] OCFS2 and VMware ESX

2008-07-28 Thread Sunil Mushran
SLES10 SP2 is shipping OCFS2 1.4. We will be releasing the
same for (RH)EL in the coming weeks.

-Original Message-
From: Haydn Cahir [EMAIL PROTECTED]
Sent: Mon 7/28/2008 8:07 PM
To: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] OCFS2 and VMware ESX

Hi Mark,

Thanks for your reply. How did you configure your RDM mappings? We have tried a 
few combinations already. We have three nodes and are trying to use a single 
OCFS2 volume. We are encountering a range of errors like VM's not starting when 
another node is already started (goes back to the RDM configurations we think), 
two of the nodes are able to edit files in the OCFS2 volumes but the third 
doesn't see any changes made by the other nodes and the OCFS2 volume switching 
to read-only due to errors on the volume.

We have tried running just two nodes and still get the problem where the volume 
will switch over to read-only. I will look into the time differences on the 
server, we normally have to make changes in the grub config and NTP settings to 
keep the time in synch.

FYI the version of OCFS2 on SLES 10 SP2 is completely different:

HIT-TCN1:~ # rpm -qa | grep -i ocfs
ocfs2-tools-1.4.0-0.3
ocfs2console-1.4.0-0.3

I can't even find reference to this version on the Oracle web site.

Cheers

 Sedlock, Mark A. [EMAIL PROTECTED] 07/29/08 12:42 PM 
We run a similar set up, SLES 10 SP1, we were ESX 3.0.x and are now 3.5.
We're running the version of ocfs2 that shipped with SLES 10 SP1.

4 nodes accessing raw mapped LUNs via ESX from an HP SAN on HP Blade
Servers. Qlogic HBAs, standard NICs; nothing special.

The biggest hurdle we ran into was time synch on the individual hosts
(VMWare ESX + some variants of Linux have an interesting clock tick
relationship which I still don't understand) that was causing some ugly
fencing.

It's been running well for about 8 months.  Overall we're pretty happy
with it thus far.  That said, we don't let ESX VMotion the cluster nodes
via DRS, but that's more because we haven't tested it.  The cluster is
used for Apache web hosting.

web7:~:%1003#rpm -qa | grep -i ocfs
ocfs2-tools-1.2.3-0.7
ocfs2console-1.2.3-0.7
ocfs2-tools-devel-1.2.3-0.7
web7:~:%1004#uname -a
Linux web7 2.6.16.53-0.16-smp #1 SMP Tue Oct 2 16:57:49 UTC 2007 i686
athlon i386 GNU/Linux
web7:~:%1005#cat /etc/SuSE-release 
SUSE Linux Enterprise Server 10 (i586)
VERSION = 10
PATCHLEVEL = 1
web7:~:%1006#

--mark
Mark Sedlock
Network and System Services
Rowan University






___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users