Hi Jake,

 

Thanks for this. I have been going through it and now have a pretty good idea of 
what you are doing. However, I may be missing something in your scripts, because 
I'm still not quite understanding how you are making sure locking happens with 
the ESXi ATS SCSI command.

 

From this slide

 

https://wiki.ceph.com/@api/deki/files/38/hammer-ceph-devel-summit-scsi-target-clustering.pdf
   (Page 8)

 

It seems to indicate that, for a true active/active setup, the two targets need 
to be aware of each other and exchange locking information for it to work 
reliably. I've also watched the video from the Ceph developer summit where this 
is discussed, and it seems that Ceph and the kernel both need changes to allow 
this locking to be pushed down to the RBD layer so it can be shared. From what 
I can see browsing the Linux git repo, these patches haven't made it into the 
mainline kernel yet.

 

Can you shed any light on this? As tempting as active/active is, I'm wary of 
using the configuration until I understand how the locking works and whether 
edge cases involving multiple ESXi hosts writing to the same LUN via different 
targets could spell disaster.
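For context, my understanding is that the ATS primitive in question is the SCSI COMPARE AND WRITE command, which VMFS uses in place of whole-LUN SCSI-2 reservations. Whether a given host is actually using ATS can at least be checked per ESXi host; a rough sketch (the option name is from the 5.x era and may vary between ESXi versions):

```
# Check whether hardware-accelerated locking (ATS) is enabled on this host
esxcli system settings advanced list -o /VMFS3/HardwareAcceleratedLocking
```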

 

Many thanks,

Nick

 

From: Jake Young [mailto:jak3...@gmail.com] 
Sent: 14 January 2015 16:54
To: Nick Fisk
Cc: Giuseppe Civitella; ceph-users
Subject: Re: [ceph-users] Ceph, LIO, VMWARE anyone?

 

Yes, it's active/active, and I found that VMware can switch from path to path 
with no issues or service impact.

 

  

I posted some config files here: http://github.com/jak3kaj/misc

 

One set is from my LIO nodes, both the primary and secondary configs, so you can 
see what I needed to make unique.  The other set (targets.conf) is from my tgt 
nodes.  Both are 4-LUN configs.

 

Like I said in my previous email, there is no performance difference between 
LIO and tgt.  The only service I'm running on these nodes is a single iscsi 
target instance (either LIO or tgt).

 

Jake

 

On Wed, Jan 14, 2015 at 8:41 AM, Nick Fisk <n...@fisk.me.uk> wrote:

Hi Jake,

 

I can’t remember the exact details, but it was something to do with a potential 
problem when using the Pacemaker resource agents. I think it was a potential 
hanging issue: when one LUN on a shared target failed, the cluster tried to kill 
all the other LUNs to fail the whole target over to another host, which left the 
TCM part of LIO holding a lock on the RBD, which then also couldn’t fail over.

 

That said, I did try multiple LUNs on one target as a test and didn’t experience 
any problems.

 

I’m interested in how you have your setup configured, though. Are you saying you 
effectively have an active/active configuration with a path going to each host, 
or are you failing the iSCSI IP over between hosts? If it’s the former, have you 
had any problems with SCSI locking/reservations etc. between the two targets?

 

I can see the advantage of that configuration, as you reduce or eliminate a lot 
of the trouble I have had with resources failing over.

 

Nick

 

From: Jake Young [mailto:jak3...@gmail.com] 
Sent: 14 January 2015 12:50
To: Nick Fisk
Cc: Giuseppe Civitella; ceph-users
Subject: Re: [ceph-users] Ceph, LIO, VMWARE anyone?

 

Nick,

 

Where did you read that having more than 1 LUN per target causes stability 
problems?

 

I am running 4 LUNs per target. 

 

For HA I'm running two Linux iSCSI target servers that map the same 4 rbd 
images. The two targets have the same serial numbers, T10 addresses, etc.  I 
copy the primary's config to the backup and change the IPs. This way VMware sees 
them as different target IPs on the same host. This has worked very well for me. 
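As I understand it, the important part is that both nodes export identical SCSI identity data for each backstore. A rough sketch of one way to pin that via LIO's configfs interface (the backstore index, image name and serial value below are illustrative, not from my actual config):

```
# Run on both target nodes after creating the iblock backstore for the
# same rbd image, so ESXi sees one device behind two different target IPs.
echo "img0-shared-serial-0001" > \
    /sys/kernel/config/target/core/iblock_0/img0/wwn/vpd_unit_serial
```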

 

One suggestion I have is to try rbd-enabled tgt. The performance is equivalent 
to LIO, but I found it is much better at recovering from a cluster outage. I've 
had LIO lock up the kernel or simply not recognize that the rbd images are 
available, whereas tgt will eventually present the rbd images again. 
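For reference, a minimal sketch of one rbd-backed target in tgt's targets.conf (the IQN, pool and image names are illustrative, and tgt must be built with rbd support):

```
<target iqn.2015-01.com.example:rbd-img0>
    driver iscsi
    # bs-type rbd talks to the cluster directly via librbd,
    # so no kernel rbd mapping is needed on the target node
    bs-type rbd
    backing-store rbd/img0
    initiator-address ALL
</target>
```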

 

I have been slowly adding servers and am expanding my test setup to a 
production setup (nice thing about ceph). I now have 6 OSD hosts with 7 disks 
on each. I'm using the LSI Nytro cache raid controller, so I don't have a 
separate journal and have 40Gb networking. I plan to add another 6 OSD hosts in 
another rack in the next 6 months (and then another 6 next year). I'm doing 3x 
replication, so I want to end up with 3 racks. 

 

Jake

On Wednesday, January 14, 2015, Nick Fisk <n...@fisk.me.uk> wrote:

Hi Giuseppe,

 

I am working on something very similar at the moment. I currently have it 
running on some test hardware and it seems to be working reasonably well.

 

I say reasonably because I have had a few instabilities, but these are on the HA 
side; the LIO and RBD side of things has been rock solid so far. The main 
problems I have had seem to be around recovering from failure, with resources 
ending up in an unmanaged state. I’m not currently using fencing, so this may be 
part of the cause.

 

As a brief description of my configuration:

 

4 hosts, each with 2 OSDs, also running the monitor role

3 additional hosts in an HA cluster which act as iSCSI proxy nodes.

 

I’m using the IP, RBD, iSCSITarget and iSCSILogicalUnit resource agents to 
provide an HA iSCSI LUN which maps back to an RBD. All the agents for each RBD 
are in a group so they follow each other between hosts.
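Roughly, the crm configuration for one such group looks like the sketch below (the resource names, IQN, IP and device path are illustrative, and the exact name of the rbd agent depends on how it is packaged):

```
primitive p_ip_img0 ocf:heartbeat:IPaddr2 \
    params ip=192.168.0.50 cidr_netmask=24
primitive p_rbd_img0 ocf:ceph:rbd \
    params pool=rbd name=img0
primitive p_target_img0 ocf:heartbeat:iSCSITarget \
    params iqn=iqn.2015-01.uk.me.fisk:img0 implementation=lio-t
primitive p_lun_img0 ocf:heartbeat:iSCSILogicalUnit \
    params target_iqn=iqn.2015-01.uk.me.fisk:img0 lun=0 path=/dev/rbd/rbd/img0
# Group order gives start order: map the RBD first, tear it down last
group g_img0 p_rbd_img0 p_ip_img0 p_target_img0 p_lun_img0
```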

 

I’m using 1 LUN per target, as I read somewhere that there are stability 
problems with more than 1 LUN per target.

 

Performance seems OK; I can get about 1.2k random IOPS out of the iSCSI LUN. 
This seems about right for the Ceph cluster size, so I don’t think the LIO part 
is adding any significant overhead.

 

We should be getting our production hardware shortly, which will have 40 OSDs 
with journals and an SSD caching tier, so within the next month or so I will 
have a better idea of running it in a production environment and of the 
system’s performance.

 

Hope that helps. If you have any questions, please let me know.

 

Nick

 

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Giuseppe Civitella
Sent: 13 January 2015 11:23
To: ceph-users
Subject: [ceph-users] Ceph, LIO, VMWARE anyone?

 

Hi all,

 

I'm working on a lab setup where Ceph serves rbd images as iSCSI datastores to 
VMware via a LIO box. Has anyone already done something similar and would be 
willing to share some knowledge? Any production deployments? What about LIO's HA 
and LUN performance?

 

Thanks 

Giuseppe



 




_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
