Re: need some clarity, if anyone has a minute
On Tue, 2010-06-29 at 17:22 -0700, Patrick wrote:
> On Jun 29, 6:08 am, Christopher Barry wrote:
> >
> > At the end of the day, I am trying to automagically find the optimal
> > configuration for the type of storage available (i.e. does it support
> > the proprietary MPIO driver I need to work with, dm-multipath, or just
> > a straight connection), what NICs on what subnets are available on the
> > host, how these relate to the portals the host can see, and whether
> > bonding would be desirable. The matrix of possibilities is somewhat
> > daunting...
>
> True, but you can narrow it down a lot, I think.
>
> If you assign a subnet to each interface on the RAID, and assign one
> of those subnets to each interface on the host, you will naturally get
> a fault-tolerant path between the host and the RAID. You just need to
> probe each RAID IP once.

Understood.

> I think I asked before: What type of hardware RAIDs are these,
> exactly? dm-multipath is actually very generic; it might "just work"
> even if your vendor has their own proprietary software.

The vendor is moving to dm-multipath, but is not there yet. There are
incompatibilities in their current release, or I would definitely
standardize on the built-in Linux driver.

> Port bonding is still an option, but thinking about it some more...
> Unless your RAID also supports bonding, I am not sure it simplifies
> your configuration at all. You will need one iSCSI session for each
> IP address on the RAID whether you bond the ports on your host or
> not. So bonding only simplifies things if you can do it both at the
> host and at the target, and even then (a) it will be hard to get good
> load balancing and (b) you will be stuck using one (non-redundant)
> switch.
You may want to have a look at the bonding.txt file in your kernel
Documentation directory, or just have a look here:
http://www.mjmwired.net/kernel/Documentation/networking/bonding.txt

There are actually 6 'modes' available for setting up a bonded virtual
interface with the bonding driver. I believe the '802.3ad' mode is the
one you're thinking of that requires specific switch or other device
support. However, I'm not considering using that mode in this initial
implementation; rather, I'm thinking that mode 6, or by its long name
'balance-alb', might be a good bonding mode to use in this scenario.
I'd be very interested in your thoughts on that once you've had a
chance to peruse the link and think about the bonding options
presented in it.

Regards,
-C

> - Pat

--
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To post to this group, send email to open-is...@googlegroups.com.
To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.
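For reference, a mode-6 ('balance-alb') bond of the kind discussed above can be brought up roughly as follows. This is a command/config sketch, not a tested recipe: the interface names, the IP address, and the miimon interval are placeholder assumptions, not values from this thread.

```shell
# Sketch: create a balance-alb bond with the in-kernel bonding driver.
# All interface names and addresses below are placeholders.
modprobe bonding                  # loading the driver creates bond0
ip link set bond0 down            # mode can only change while down/slaveless
echo balance-alb > /sys/class/net/bond0/bonding/mode
echo 100 > /sys/class/net/bond0/bonding/miimon    # MII link check, in ms
ifenslave bond0 eth0 eth1         # enslave the two physical NICs
ip addr add 192.168.10.5/24 dev bond0
ip link set bond0 up
```

The attraction of balance-alb in this scenario is exactly what the thread notes: unlike 802.3ad, it needs no special switch support, since receive balancing is done with ARP negotiation by the driver itself.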
Re: need some clarity, if anyone has a minute
On Jun 29, 6:08 am, Christopher Barry wrote:
> At the end of the day, I am trying to automagically find the optimal
> configuration for the type of storage available (i.e. does it support
> the proprietary MPIO driver I need to work with, dm-multipath, or just
> a straight connection), what NICs on what subnets are available on the
> host, how these relate to the portals the host can see, and whether
> bonding would be desirable. The matrix of possibilities is somewhat
> daunting...

True, but you can narrow it down a lot, I think.

If you assign a subnet to each interface on the RAID, and assign one of
those subnets to each interface on the host, you will naturally get a
fault-tolerant path between the host and the RAID. You just need to
probe each RAID IP once.

I think I asked before: What type of hardware RAIDs are these, exactly?
dm-multipath is actually very generic; it might "just work" even if
your vendor has their own proprietary software.

Port bonding is still an option, but thinking about it some more...
Unless your RAID also supports bonding, I am not sure it simplifies
your configuration at all. You will need one iSCSI session for each IP
address on the RAID whether you bond the ports on your host or not. So
bonding only simplifies things if you can do it both at the host and at
the target, and even then (a) it will be hard to get good load
balancing and (b) you will be stuck using one (non-redundant) switch.

- Pat
Re: need some clarity, if anyone has a minute
On Fri, 2010-06-25 at 11:06 -0700, Patrick J. LoPresti wrote:
> On Jun 23, 12:41 pm, Christopher Barry wrote:
> >
> > Absolutely correct. What I was looking for were comparisons of the
> > methods below, and wanted subnet stuff out of the way while
> > discussing that.
>
> Ah, I see.
>
> Well, that is fine (even necessary) for the port bonding approach, but
> for multi-path I/O (whether device-mapper or proprietary) it will
> probably not do what you expect. When Linux has two interfaces on the
> same subnet, in my experience it tends to send all traffic through
> just one of them. So you will definitely want to split up the subnets
> before testing multi-path I/O.
>
> > Here I do not understand your reasoning. My understanding was I would
> > need a session per iface to each portal to survive a controller port
> > failure. If this assumption is wrong, please explain.
>
> I may have misunderstood your use of "portal". I was thinking in the
> RFC 3720 sense of "IP address".
>
> So you have four IP addresses on the RAID, and four IP addresses on
> the Linux host. You have made all of your SCSI target devices visible
> as logical units on all four addresses on the RAID. So to get fully
> redundant paths, you only need to connect each of the four IP
> addresses on the Linux host to a single IP address on the RAID. (So
> Linux will see each logical unit four times.)
>
> I thought you were saying you would initiate a connection from each
> host IP address to every RAID IP address (16 connections). That would
> cause each LU to show up 16 times, thus being harder to manage, with
> no advantages in performance or fault-tolerance. But now it sounds
> like that is not what you meant :-).

Thanks for your reply, Pat. Actually, you were correct in your first
assumption - I was indeed thinking that I would need to 'login' from
each iface to each portal in order for the initiator to know about all
of the paths.
This was likely due to the fact that I was 'simplifying' :) by using a
single subnet in my example. In reality, there would be multiple
subnets, and obviously this could not occur efficiently, as routing
would come into play. Thank you for clearing that up for me.

At the end of the day, I am trying to automagically find the optimal
configuration for the type of storage available (i.e. does it support
the proprietary MPIO driver I need to work with, dm-multipath, or just
a straight connection), what NICs on what subnets are available on the
host, how these relate to the portals the host can see, and whether
bonding would be desirable. The matrix of possibilities is somewhat
daunting...

> > This is also something I am uncertain about. For instance, in the
> > balance-alb mode, each slave will communicate with a remote IP
> > consistently. In the case of two slaves and two portals, how would
> > the traffic be apportioned? Would it write to both simultaneously?
> > Could this corrupt the disk in any way? Would it always only use a
> > single slave/portal?
>
> This is what I meant by being "at the mercy of the load balancing
> performed by the bonding".
>
> If I understand the description of "balance-alb" correctly, outgoing
> traffic will be more-or-less round-robin; it tries to balance the load
> among the available interfaces, without worrying about keeping packets
> in order. If packets wind up out of order, TCP will put them back in
> order at the other end, possibly (probably?) at the cost of some
> performance.
>
> Inbound traffic from any particular portal will go to a single slave.
> But there is no guarantee that the traffic will then be properly
> balanced.
>
> The advantage of multipath I/O is that it can balance the traffic at
> the level of SCSI commands. I suspect this will be both faster and
> more consistent, but again, I have not actually tried using bonding.
> - Pat
Re: need some clarity, if anyone has a minute
On Jun 23, 12:41 pm, Christopher Barry wrote:
> Absolutely correct. What I was looking for were comparisons of the
> methods below, and wanted subnet stuff out of the way while discussing
> that.

Ah, I see.

Well, that is fine (even necessary) for the port bonding approach, but
for multi-path I/O (whether device-mapper or proprietary) it will
probably not do what you expect. When Linux has two interfaces on the
same subnet, in my experience it tends to send all traffic through just
one of them. So you will definitely want to split up the subnets before
testing multi-path I/O.

> Here I do not understand your reasoning. My understanding was I would
> need a session per iface to each portal to survive a controller port
> failure. If this assumption is wrong, please explain.

I may have misunderstood your use of "portal". I was thinking in the
RFC 3720 sense of "IP address".

So you have four IP addresses on the RAID, and four IP addresses on the
Linux host. You have made all of your SCSI target devices visible as
logical units on all four addresses on the RAID. So to get fully
redundant paths, you only need to connect each of the four IP addresses
on the Linux host to a single IP address on the RAID. (So Linux will
see each logical unit four times.)

I thought you were saying you would initiate a connection from each
host IP address to every RAID IP address (16 connections). That would
cause each LU to show up 16 times, thus being harder to manage, with no
advantages in performance or fault-tolerance. But now it sounds like
that is not what you meant :-).

> this is also something I am uncertain about. For instance, in the
> balance-alb mode, each slave will communicate with a remote ip
> consistently. In the case of two slaves, and two portals how would the
> traffic be apportioned? would it write to both simultaneously? could
> this corrupt the disk in any way? would it always only use a single
> slave/portal?
This is what I meant by being "at the mercy of the load balancing
performed by the bonding".

If I understand the description of "balance-alb" correctly, outgoing
traffic will be more-or-less round-robin; it tries to balance the load
among the available interfaces, without worrying about keeping packets
in order. If packets wind up out of order, TCP will put them back in
order at the other end, possibly (probably?) at the cost of some
performance.

Inbound traffic from any particular portal will go to a single slave.
But there is no guarantee that the traffic will then be properly
balanced.

The advantage of multipath I/O is that it can balance the traffic at
the level of SCSI commands. I suspect this will be both faster and more
consistent, but again, I have not actually tried using bonding.

- Pat
Re: need some clarity, if anyone has a minute
Thanks Patrick. Please see inline.

On Wed, 2010-06-23 at 08:04 -0700, Patrick wrote:
> On Jun 23, 7:28 am, Christopher Barry wrote:
> > This array has its own specific MPIO drivers, and does not support
> > DM-Multipath. I'm trying to get a handle on the differences in
> > redundancy provided by the various layers involved in the connection
> > from host to array, in a generic sense.
>
> What kind of array is it? Are you certain it "does not support"
> multipath I/O? Multipath I/O is pretty generic...
>
> > For simplicity, all ports are on the same subnet.
>
> I actually would not do that. The design is cleaner and easier to
> visualize (IMO) if you put the ports onto different subnets/VLANs.
> Even better is to put each one on a different physical switch so you
> can tolerate the failure of a switch.

Absolutely correct. What I was looking for were comparisons of the
methods below, and wanted subnet stuff out of the way while discussing
that.

> > scenario #1
> > Single (bonded) NIC, default iface, login to all controller portals.
>
> Here you are at the mercy of the load balancing performed by the
> bonding, which is probably worse than the load-balancing performed at
> higher levels. But I admit I have not tried it, so if you decide to
> do some performance comparisons, please let me know what you
> find. :-)
>
> I will skip right down to...
>
> > scenario #4
> > Dual NIC, iface per NIC, MPIO driver, login to all controller
> > portals from each iface
>
> Why log into all portals from each interface? It buys you nothing and
> makes the setup more complex. Just log into one target portal from
> each interface and do multi-pathing among them. This will also make
> your automation (much) simpler.

Here I do not understand your reasoning. My understanding was I would
need a session per iface to each portal to survive a controller port
failure. If this assumption is wrong, please explain.

> Again, I would recommend assigning one subnet to each interface.
> It is hard to convince Linux to behave sanely when you have multiple
> interfaces connected to the same subnet. (Linux will tend to send all
> traffic for that subnet via the same interface. Yes, you can hack
> around this. But why?)
>
> In other words, I would do eth0 -> subnet 0 -> portal 0, eth1 ->
> subnet 1 -> portal 1, eth2 -> subnet 2 -> portal 2, etc. This is very
> easy to draw, explain, and reason about. Then set up multipath I/O
> and you are done.
>
> In fact, this is exactly what I am doing myself. I have multiple
> clients and multiple hardware iSCSI RAID units (Infortrend); each
> interface on each client and RAID connects to a single subnet. Then I
> am using cLVM to stripe among the hardware RAIDs. I am obtaining
> sustained read speeds of ~1200 megabytes/second (yes, sustained; no
> cache). Plus I have the redundancy of multipath I/O.
>
> Trying the port bonding approach is on my "to do" list, but this
> setup is working so well I have not bothered yet.

This is also something I am uncertain about. For instance, in the
balance-alb mode, each slave will communicate with a remote IP
consistently. In the case of two slaves and two portals, how would the
traffic be apportioned? Would it write to both simultaneously? Could
this corrupt the disk in any way? Would it always only use a single
slave/portal?

> - Pat
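The one-subnet-per-interface layout described above maps naturally onto open-iscsi's iface bindings. A sketch of the idea with two NICs (interface names, subnets, and portal addresses are placeholder assumptions; extend the same pattern to four paths):

```shell
# Create one open-iscsi iface record per physical NIC and bind it
# (names and addresses below are placeholders, not from the thread).
iscsiadm -m iface -I iface0 -o new
iscsiadm -m iface -I iface0 -o update -n iface.net_ifacename -v eth0
iscsiadm -m iface -I iface1 -o new
iscsiadm -m iface -I iface1 -o update -n iface.net_ifacename -v eth1

# Discover through each iface, then log each iface into the single
# portal on its own subnet: one session per NIC/portal pair, not an
# all-to-all mesh (4 sessions instead of 16 in the four-path case).
iscsiadm -m discovery -t sendtargets -p 192.168.0.10 -I iface0
iscsiadm -m discovery -t sendtargets -p 192.168.1.10 -I iface1
iscsiadm -m node -p 192.168.0.10 -I iface0 --login
iscsiadm -m node -p 192.168.1.10 -I iface1 --login
```

Each logical unit then shows up once per iface, and dm-multipath (or a vendor MPIO layer) aggregates the resulting /dev/sdX paths.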
Re: need some clarity, if anyone has a minute (correction)
On 06/23/2010 09:34 AM, Christopher Barry wrote:
> correction inline:
>
> On Wed, 2010-06-23 at 10:28 -0400, Christopher Barry wrote:
> > Hello,
> >
> > I'm implementing some code to automagically configure iscsi
> > connections to a proprietary array. This array has its own specific
> > MPIO drivers, and does not support DM-Multipath. I'm trying to get a
> > handle on the differences in redundancy provided by the various
> > layers involved in the connection from host to array, in a generic
> > sense.
> >
> > The array has two iSCSI ports per controller, and two controllers.
> > The targets can be seen through any of the ports. For simplicity,
> > all ports are on the same subnet.
> >
> > I'll describe a series of scenarios, and maybe someone can speak to
> > their level of usefulness, redundancy, gotchas, nuances, etc:
> >
> > scenario #1
> > Single NIC, default iface, login to all controller portals.

Of course with this there is no redundancy on the initiator side. If
the NIC on the initiator side dies, you are in trouble. And if you are
not using multipath software in the block/SCSI or net layer, then
logging into all the portals is no use. Using dm-multipath across the
target ports works well for most targets, and would allow you to take
advantage of redundancy there. I can't say anything about the MPIO code
you are using or your target, since I do not know what they are.

> > scenario #2
> > Dual NIC, iface per NIC, login to all controller portals from each
> > iface

Without some multipath software this is pretty useless too. If you use
something like dm-multipath, it can round-robin or fail over across
all the paths that will get created.

> > scenario #3
> > Two bonded NICs in mode balance-alb
> > Single NIC, default iface, login to all controller portals.
>
> single bonded interface, not single NIC.

I do not have a lot of experience with this.

> > scenario #4
> > Dual NIC, iface per NIC, MPIO driver, login to all controller
> > portals from each iface

The iSCSI/SCSI/block layers provide what dm-multipath needs for this,
and for most targets it should work.
Again, I have no idea what your MPIO driver does and needs, so I cannot
say if it will work well for you.
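For the dm-multipath scenarios above, the path-aggregation policy lives in /etc/multipath.conf. A generic, illustrative fragment (these are not tuned values for any particular array; real arrays usually ship their own recommended device stanza):

```
# /etc/multipath.conf -- illustrative sketch, not array-specific tuning
defaults {
    user_friendly_names  yes
    path_grouping_policy multibus         # one group: round-robin all paths
    path_selector        "round-robin 0"  # spread I/O at the SCSI-command level
    failback             immediate        # return to preferred paths on recovery
    no_path_retry        queue            # queue I/O while every path is down
}
```

With `multibus`, dm-multipath balances at the level of SCSI commands, which is exactly the advantage over port bonding that Pat describes; `failover` would instead keep one active path and use the rest only on failure.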
Re: need some clarity, if anyone has a minute
On Jun 23, 7:28 am, Christopher Barry wrote:
> This array has its own specific MPIO drivers, and does not support
> DM-Multipath. I'm trying to get a handle on the differences in
> redundancy provided by the various layers involved in the connection
> from host to array, in a generic sense.

What kind of array is it? Are you certain it "does not support"
multipath I/O? Multipath I/O is pretty generic...

> For simplicity, all ports are on the same subnet.

I actually would not do that. The design is cleaner and easier to
visualize (IMO) if you put the ports onto different subnets/VLANs. Even
better is to put each one on a different physical switch so you can
tolerate the failure of a switch.

> scenario #1
> Single (bonded) NIC, default iface, login to all controller portals.

Here you are at the mercy of the load balancing performed by the
bonding, which is probably worse than the load-balancing performed at
higher levels. But I admit I have not tried it, so if you decide to do
some performance comparisons, please let me know what you find. :-)

I will skip right down to...

> scenario #4
> Dual NIC, iface per NIC, MPIO driver, login to all controller portals
> from each iface

Why log into all portals from each interface? It buys you nothing and
makes the setup more complex. Just log into one target portal from each
interface and do multi-pathing among them. This will also make your
automation (much) simpler.

Again, I would recommend assigning one subnet to each interface. It is
hard to convince Linux to behave sanely when you have multiple
interfaces connected to the same subnet. (Linux will tend to send all
traffic for that subnet via the same interface. Yes, you can hack
around this. But why?)

In other words, I would do eth0 -> subnet 0 -> portal 0, eth1 ->
subnet 1 -> portal 1, eth2 -> subnet 2 -> portal 2, etc. This is very
easy to draw, explain, and reason about. Then set up multipath I/O and
you are done.

In fact, this is exactly what I am doing myself.
I have multiple clients and multiple hardware iSCSI RAID units
(Infortrend); each interface on each client and RAID connects to a
single subnet. Then I am using cLVM to stripe among the hardware RAIDs.
I am obtaining sustained read speeds of ~1200 megabytes/second (yes,
sustained; no cache). Plus I have the redundancy of multipath I/O.

Trying the port bonding approach is on my "to do" list, but this setup
is working so well I have not bothered yet.

- Pat
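Striping a logical volume across several multipathed RAID units, as described above, looks roughly like this. Command sketch only: the device-mapper names, VG/LV names, and sizes are placeholders, and a clustered (cLVM) setup additionally requires the cluster locking daemon, which is omitted here.

```shell
# Aggregate the multipath devices (one per hardware RAID) into a VG,
# then stripe a logical volume across them. All names are placeholders.
pvcreate /dev/mapper/mpatha /dev/mapper/mpathb /dev/mapper/mpathc
vgcreate san_vg /dev/mapper/mpatha /dev/mapper/mpathb /dev/mapper/mpathc

# -i 3: three stripes, one per RAID unit; -I 256: 256 KiB stripe size
lvcreate -n fast_lv -i 3 -I 256 -L 1T san_vg
```

Reads and writes to fast_lv are then spread across all three arrays, each of which is itself reached over redundant multipath sessions, which is how the ~1200 MB/s sustained figure becomes plausible from a stack of individually slower units.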
Re: need some clarity, if anyone has a minute (correction)
correction inline:

On Wed, 2010-06-23 at 10:28 -0400, Christopher Barry wrote:
> Hello,
>
> I'm implementing some code to automagically configure iscsi
> connections to a proprietary array. This array has its own specific
> MPIO drivers, and does not support DM-Multipath. I'm trying to get a
> handle on the differences in redundancy provided by the various
> layers involved in the connection from host to array, in a generic
> sense.
>
> The array has two iSCSI ports per controller, and two controllers.
> The targets can be seen through any of the ports. For simplicity, all
> ports are on the same subnet.
>
> I'll describe a series of scenarios, and maybe someone can speak to
> their level of usefulness, redundancy, gotchas, nuances, etc:
>
> scenario #1
> Single NIC, default iface, login to all controller portals.
>
> scenario #2
> Dual NIC, iface per NIC, login to all controller portals from each
> iface
>
> scenario #3
> Two bonded NICs in mode balance-alb
> Single NIC, default iface, login to all controller portals.

single bonded interface, not single NIC.

> scenario #4
> Dual NIC, iface per NIC, MPIO driver, login to all controller portals
> from each iface
>
> Appreciate any advice,
> Thanks,
> -C