Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-15 Thread Mike Christie
On 03/15/2018 02:32 PM, Maxim Patlasov wrote:
> On Thu, Mar 15, 2018 at 12:48 AM, Mike Christie wrote:
> 
> ...
> 
> It looks like there is a bug.
> 
> 1. A regression was added when I stopped killing the iscsi connection
> when the lock is taken away from us to handle a failback bug where it
> was causing ping ponging. That combined with #2 will cause the bug.
> 
> 2. I did not anticipate the type of sleeps above where they are injected
> any old place in the kernel. For example, if a command had really got
> stuck on the network then the nop timer would fire which forces the
> iscsi thread's recv() to fail and that submitting thread to exit. Or we
> should handle the delay-request-in-tcmu-runner.diff issue ok, because we
> wait for those commands. However, we could just get rescheduled due to
> hitting a preemption point and we might not be rescheduled for longer
> than failover timeout seconds. For this it could just be some buggy code
> that gets run on all the cpus for more than failover timeout seconds
> then recovers, and we would hit the bug in your patch above.
> 
> The 2 attached patches fix the issues for me on linux. Note that it only
> works on linux right now and it only works with 2 nodes. It probably
> also works for ESX/windows, but I need to reconfig some timers.
> 
> Apply ceph-iscsi-config-explicit-standby.patch to ceph-iscsi-config and
> tcmu-runner-use-explicit.patch to tcmu-runner.
> 
> 
> 
> Mike, thank you for the patches, they seem to work. There is an issue, but
> not related to data corruption: if the second path (gateway) is not
> available and I restart tcmu-runner on the first gateway, all subsequent
> i/o hangs for a long time because tcmu-runner is in the UNLOCKED state and
> the initiator doesn't resend the explicit ALUA activation request for a
> long while (190s).

Yeah, I should have a fix for that. We are returning the wrong error
code for explicit ALUA. I needed to change it to a value that indicates
we are in a state where we do not have the lock (we are in ALUA standby),
so the initiator does not keep retrying until the SCSI command's
max_retries check fires in the Linux scsi layer.
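
For anyone following along, here is a minimal sketch of the kind of status/sense
data that signals "this port is in ALUA standby" to the initiator. It is not the
actual tcmu-runner change; the helper and buffer handling below are illustrative,
and only the SCSI constants (CHECK CONDITION status, NOT READY sense key,
ASC/ASCQ 04h/0Bh "target port in standby state") come from the SPC/ALUA specs.

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Illustrative sketch only -- not the real tcmu-runner helper. */
#define SAM_STAT_CHECK_CONDITION 0x02
#define SENSE_KEY_NOT_READY      0x02
/* ASC/ASCQ 04h/0Bh: LOGICAL UNIT NOT ACCESSIBLE, TARGET PORT IN STANDBY STATE */
#define ASC_LU_NOT_ACCESSIBLE    0x04
#define ASCQ_PORT_IN_STANDBY     0x0b

/* Fill a fixed-format sense buffer telling the initiator this path is in
 * standby, so it fails over instead of retrying the same path forever. */
static int fail_cmd_standby(uint8_t *sense_buf, size_t len)
{
    if (len < 18)
        return -1;

    memset(sense_buf, 0, len);
    sense_buf[0]  = 0x70;                  /* current error, fixed format */
    sense_buf[2]  = SENSE_KEY_NOT_READY;
    sense_buf[7]  = 0x0a;                  /* additional sense length */
    sense_buf[12] = ASC_LU_NOT_ACCESSIBLE;
    sense_buf[13] = ASCQ_PORT_IN_STANDBY;
    return SAM_STAT_CHECK_CONDITION;       /* SCSI status to report */
}

int main(void)
{
    uint8_t sense[18];
    int status = fail_cmd_standby(sense, sizeof(sense));

    printf("status=0x%02x key=0x%02x asc=0x%02x ascq=0x%02x\n",
           status, sense[2], sense[12], sense[13]);
    return 0;
}

The point of a NOT READY/standby sense is that the initiator's multipath/ALUA
layer treats the path as unusable and moves on, rather than burning through
generic retries on the same path.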

Jason suggested how to properly support vmware/windows and more than 2
nodes. The fix for that will allow me to properly figure out lock states
and return the proper error codes for that case. I am hoping to have a
rough version of the code done tomorrow.


> 
> Can you please also clarify how explicit ALUA (with these patches
> applied) is immune to a situation where there are some stale requests
> sitting in kernel queues by the time tcmu-runner handles
> tcmu_explicit_transition() --> tcmu_acquire_dev_lock(). Does it mean
> that all requests are strictly ordered and the initiator will never send us
> read/write requests until we complete that explicit ALUA activation request?
> 

Basically yes. Here is some extra info and what I wrote on github for
people that do not like GH:

- There is only one cmd submitting thread per iscsi session, so commands
are put in the tcmu queue in order and then tcmu-runner only has the one
thread per device that initially checks the commands and decides if we
need to do a failover or dispatch to the handler.
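
As an illustration of that ordering guarantee (the names and data structures
below are made up for the example, not tcmu-runner's real ones), a single
consumer thread draining a per-device FIFO means a later command can never be
looked at before an earlier one has been checked:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy per-device command queue with one dispatch thread. */
#define QUEUE_DEPTH 16

static int queue[QUEUE_DEPTH];
static int head, tail, count;
static bool have_lock = true;            /* stands in for the ALUA/RBD lock state */
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cnd = PTHREAD_COND_INITIALIZER;

static void submit_cmd(int cmd_id)       /* called by the single iscsi submitter */
{
    pthread_mutex_lock(&mtx);
    while (count == QUEUE_DEPTH)
        pthread_cond_wait(&cnd, &mtx);
    queue[tail] = cmd_id;
    tail = (tail + 1) % QUEUE_DEPTH;
    count++;
    pthread_cond_signal(&cnd);
    pthread_mutex_unlock(&mtx);
}

static void *dispatch_thread(void *arg)  /* the one per-device checker thread */
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&mtx);
        while (count == 0)
            pthread_cond_wait(&cnd, &mtx);
        int cmd_id = queue[head];
        head = (head + 1) % QUEUE_DEPTH;
        count--;
        pthread_cond_signal(&cnd);
        pthread_mutex_unlock(&mtx);

        if (cmd_id < 0)                  /* sentinel: stop the toy example */
            return NULL;

        /* Commands are seen strictly in submission order, so the
         * failover/lock check runs before any later command does. */
        if (!have_lock)
            printf("cmd %d: fail back to initiator (standby)\n", cmd_id);
        else
            printf("cmd %d: dispatch to handler\n", cmd_id);
    }
}

int main(void)
{
    pthread_t t;

    pthread_create(&t, NULL, dispatch_thread, NULL);
    for (int i = 0; i < 5; i++)
        submit_cmd(i);
    submit_cmd(-1);
    pthread_join(t, NULL);
    return 0;
}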


For your test it would work like this:

1. STPG successfully executed on node1.
2. WRITEs sent and get stuck on node1.
3. Failover to node2. WRITEs execute ok on this node.
4. If the WRITEs are unjammed at this time they are just failed, because
we will hit the blacklist checks or unlocked checks.
5. If node2 were to fail while the commands were still stuck then
node1's iscsi session would normally have dropped and lio would not be
allowing new logins due to the stuck WRITEs on node1 (So this is when
you commonly see the ABORT messages and stuck logins that are reported
on the list every once in a while).

If the initiator did not escalate to session level recovery, then before
doing new IO the initiator would send a STPG, and that would be stuck
behind the stuck WRITEs from step 2. Before we can dequeue the STPG in
runner, we have to wait for those stuck WRITEs.

Note the runner STPG/lock code will also wait for commands that have
been sent to the handler module or are stuck in a runner thread before
starting the lock acquire call, so if a WRITE got stuck there we will be
ok (a sketch of this wait follows step 6 below).
6. Once the WRITEs unjam and are failed the STPG is executed. If the
STPG is successful, that is reported to the initiator and it will start
sending IO.
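
A rough sketch of the "wait for commands already handed to the handler" step
mentioned in the note above, assuming an in-flight counter per device (the
struct and function names are illustrative, not tcmu-runner's real fields):

#include <pthread.h>

struct dev_state {
    pthread_mutex_t lock;
    pthread_cond_t  drained;
    int             inflight;   /* cmds handed to the handler or stuck in a runner thread */
};

void cmd_started(struct dev_state *d)
{
    pthread_mutex_lock(&d->lock);
    d->inflight++;
    pthread_mutex_unlock(&d->lock);
}

void cmd_finished(struct dev_state *d)
{
    pthread_mutex_lock(&d->lock);
    if (--d->inflight == 0)
        pthread_cond_broadcast(&d->drained);
    pthread_mutex_unlock(&d->lock);
}

/* Called from the STPG/lock-acquire path: block until nothing that was
 * queued before the STPG can still reach the backend after we take the lock. */
void wait_for_inflight(struct dev_state *d)
{
    pthread_mutex_lock(&d->lock);
    while (d->inflight > 0)
        pthread_cond_wait(&d->drained, &d->lock);
    pthread_mutex_unlock(&d->lock);
}

The single dispatch thread plus this drain is what closes the window: a stale
WRITE either completes first (and then fails on the blacklist/unlocked checks)
or is still counted here, so the STPG cannot succeed underneath it.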


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-15 Thread Maxim Patlasov
On Thu, Mar 15, 2018 at 12:48 AM, Mike Christie  wrote:

> ...
>
> It looks like there is a bug.
>
> 1. A regression was added when I stopped killing the iscsi connection
> when the lock is taken away from us to handle a failback bug where it
> was causing ping ponging. That combined with #2 will cause the bug.
>
> 2. I did not anticipate the type of sleeps above where they are injected
> any old place in the kernel. For example, if a command had really got
> stuck on the network then the nop timer would fire which forces the
> iscsi thread's recv() to fail and that submitting thread to exit. Or we
> should handle the delay-request-in-tcmu-runner.diff issue ok, because we
> wait for those commands. However, we could just get rescheduled due to
> hitting a preemption point and we might not be rescheduled for longer
> than failover timeout seconds. For this it could just be some buggy code
> that gets run on all the cpus for more than failover timeout seconds
> then recovers, and we would hit the bug in your patch above.
>
> The 2 attached patches fix the issues for me on linux. Note that it only
> works on linux right now and it only works with 2 nodes. It probably
> also works for ESX/windows, but I need to reconfig some timers.
>
> Apply ceph-iscsi-config-explicit-standby.patch to ceph-iscsi-config and
> tcmu-runner-use-explicit.patch to tcmu-runner.
>
>
>
Mike, thank you for the patches, they seem to work. There is an issue, but not
related to data corruption: if the second path (gateway) is not available
and I restart tcmu-runner on the first gateway, all subsequent i/o hangs
for a long time because tcmu-runner is in the UNLOCKED state and the initiator
doesn't resend the explicit ALUA activation request for a long while (190s).

Can you please also clarify how explicit ALUA (with these patches applied)
is immune to a situation where there are some stale requests sitting in
kernel queues by the time tcmu-runner handles tcmu_explicit_transition()
--> tcmu_acquire_dev_lock(). Does it mean that all requests are strictly
ordered and the initiator will never send us read/write requests until we
complete that explicit ALUA activation request?

Thanks,
Maxim


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-15 Thread Mike Christie
On 03/14/2018 04:28 PM, Maxim Patlasov wrote:
> On Wed, Mar 14, 2018 at 12:05 PM, Michael Christie wrote:
> 
> On 03/14/2018 01:27 PM, Michael Christie wrote:
> > On 03/14/2018 01:24 PM, Maxim Patlasov wrote:
> >> On Wed, Mar 14, 2018 at 11:13 AM, Jason Dillaman wrote:
> >>
> >> Maxim, can you provide steps for a reproducer?
> >>
> >>
> >> Yes, but it involves adding two artificial delays: one in tcmu-runner
> >> and another in kernel iscsi. If you're willing to take pains of
> >
> > Send the patches for the changes.
> >
> >> ...
> When you send the patches that add your delays, could you also send the
> target side /var/log/tcmu-runner.log with log_level = 4.
> ...
> 
> 
> Mike, please see the patches and /var/log/tcmu-runner.log in the attachment.
> 
> The timeline was like this:
> 
> 1) 13:56:31 the client (iscsi-initiator) sends a request leading to
> "Acquired exclusive lock." on gateway.
> 2) 13:56:49 tcmu-runner is suspended by SIGSTOP
> 3) 13:56:50 the client executes:
> 
> dd of=/dev/mapper/mpatha if=/dev/zero oflag=direct bs=1536 count=1 seek=10 &
> dd of=/dev/mapper/mpatha if=/dev/zero oflag=direct bs=2560 count=1 seek=10 &
> dd of=/dev/mapper/mpatha if=/dev/zero oflag=direct bs=3584 count=1 seek=10
> 
> 4) 13:56:51 gateway is detached from client (and neighbor gateways) by
> "iptables ... -j DROP"
> 5) 13:57:06 the client switches to another path, completes these
> requests above
> 6) 13:57:07 the client executes (the other path is still active):
> 
> dd of=/dev/mapper/mpatha if=/dev/urandom oflag=direct bs=3584 count=1
> seek=10
> 
> 7) 13:57:09 tcmu-runner is resumed by SIGCONT
> 8) 13:57:15 tcmu-runner successfully processes the third request (zero
> bs=3584) overwriting newer data.
> 
> 9) verify that newer data was really overwritten:
> 
> # dd if=/dev/mapper/mpatha iflag=direct bs=3584 count=1 skip=10 |od -x
> 1+0 records in
> 1+0 records out
> 3584 bytes (3.6 kB) copied, 0.00232227 s, 1.5 MB/s
> 000        
> 
> Thanks,
> Maxim
> 
> 
> delay-request-in-kernel.diff
> 
> 
> diff --git a/drivers/target/iscsi/iscsi_target.c 
> b/drivers/target/iscsi/iscsi_target.c
> index 9eb10d3..f48ee2c 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -1291,6 +1291,13 @@ int iscsit_process_scsi_cmd(struct iscsi_conn *conn, 
> struct iscsi_cmd *cmd,
>  
>   immed_ret = iscsit_handle_immediate_data(cmd, hdr,
>   cmd->first_burst_len);
> +
> + if (be32_to_cpu(hdr->data_length) == 3584) {
> + u64 end_time = ktime_get_ns() + 25ULL * 1000 * 1000 * 1000;
> + while (ktime_get_ns() < end_time)
> + schedule_timeout_uninterruptible(HZ);
> + }
> +


It looks like there is a bug.

1. A regression was added when I stopped killing the iscsi connection
when the lock is taken away from us to handle a failback bug where it
was causing ping ponging. That combined with #2 will cause the bug.

2. I did not anticipate the type of sleeps above where they are injected
any old place in the kernel. For example, if a command had really got
stuck on the network then the nop timer would fire which forces the
iscsi thread's recv() to fail and that submitting thread to exit. Or we
should handle the delay-request-in-tcmu-runner.diff issue ok, because we
wait for those commands. However, we could just get rescheduled due to
hitting a preemption point and we might not be rescheduled for longer
than failover timeout seconds. For this it could just be some buggy code
that gets run on all the cpus for more than failover timeout seconds
then recovers, and we would hit the bug in your patch above.

The 2 attached patches fix the issues for me on linux. Note that it only
works on linux right now and it only works with 2 nodes. It probably
also works for ESX/windows, but I need to reconfig some timers.

Apply ceph-iscsi-config-explicit-standby.patch to ceph-iscsi-config and
tcmu-runner-use-explicit.patch to tcmu-runner.


diff --git a/alua.c b/alua.c
index 9b36e9f..20e01ef 100644
--- a/alua.c
+++ b/alua.c
@@ -56,6 +56,17 @@ static int tcmu_get_alua_int_setting(struct alua_grp *group,
 	return tcmu_get_cfgfs_int(path);
 }
 
+static int tcmu_set_alua_int_setting(struct alua_grp *group,
+ const char *setting, int val)
+{
+	char path[PATH_MAX];
+
+	snprintf(path, sizeof(path), CFGFS_CORE"/%s/%s/alua/%s/%s",
+		 group->dev->tcm_hba_name, group->dev->tcm_dev_name,
+		 group->name, setting);
+	return tcmu_set_cfgfs_ul(path, val);
+}
+
 static void tcmu_release_tgt_ports(struct alua_grp *group)
 {
 	struct tgt_port *port, *port_next;
@@ -205,10 +216,28 @@ 

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-14 Thread Maxim Patlasov
On Wed, Mar 14, 2018 at 12:05 PM, Michael Christie 
wrote:

> On 03/14/2018 01:27 PM, Michael Christie wrote:
>> > On 03/14/2018 01:24 PM, Maxim Patlasov wrote:
>> >> On Wed, Mar 14, 2018 at 11:13 AM, Jason Dillaman wrote:
>> >>
>> >> Maxim, can you provide steps for a reproducer?
>> >>
>> >>
>> >> Yes, but it involves adding two artificial delays: one in tcmu-runner
>> >> and another in kernel iscsi. If you're willing to take pains of
>> >
>> > Send the patches for the changes.
>> >
>> >> ...
>> When you send the patches that add your delays, could you also send the
>> target side /var/log/tcmu-runner.log with log_level = 4.
>> ...
>>
>
Mike, please see the patches and /var/log/tcmu-runner.log in the attachment.

The timeline was like this:

1) 13:56:31 the client (iscsi-initiator) sends a request leading to
"Acquired exclusive lock." on gateway.
2) 13:56:49 tcmu-runner is suspended by SIGSTOP
3) 13:56:50 the client executes:

dd of=/dev/mapper/mpatha if=/dev/zero oflag=direct bs=1536 count=1 seek=10 &
dd of=/dev/mapper/mpatha if=/dev/zero oflag=direct bs=2560 count=1 seek=10 &
dd of=/dev/mapper/mpatha if=/dev/zero oflag=direct bs=3584 count=1 seek=10

4) 13:56:51 gateway is detached from client (and neighbor gateways) by
"iptables ... -j DROP"
5) 13:57:06 the client switches to another path, completes these requests
above
6) 13:57:07 the client executes (the other path is still active):

dd of=/dev/mapper/mpatha if=/dev/urandom oflag=direct bs=3584 count=1
seek=10

7) 13:57:09 tcmu-runner is resumed by SIGCONT
8) 13:57:15 tcmu-runner successfully processes the third request (zero
bs=3584) overwriting newer data.

9) verify that newer data was really overwritten:

# dd if=/dev/mapper/mpatha iflag=direct bs=3584 count=1 skip=10 |od -x
1+0 records in
1+0 records out
3584 bytes (3.6 kB) copied, 0.00232227 s, 1.5 MB/s
000        

Thanks,
Maxim
diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index 9eb10d3..f48ee2c 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -1291,6 +1291,13 @@ int iscsit_process_scsi_cmd(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
 
 	immed_ret = iscsit_handle_immediate_data(cmd, hdr,
 	cmd->first_burst_len);
+
+	if (be32_to_cpu(hdr->data_length) == 3584) {
+		u64 end_time = ktime_get_ns() + 25ULL * 1000 * 1000 * 1000;
+		while (ktime_get_ns() < end_time)
+			schedule_timeout_uninterruptible(HZ);
+	}
+
 after_immediate_data:
 	if (immed_ret == IMMEDIATE_DATA_NORMAL_OPERATION) {
 		/*
diff --git a/tcmur_cmd_handler.c b/tcmur_cmd_handler.c
index 6d89a5f..1b77f11 100644
--- a/tcmur_cmd_handler.c
+++ b/tcmur_cmd_handler.c
@@ -2292,6 +2292,9 @@ static int tcmur_cmd_handler(struct tcmu_device *dev, struct tcmulib_cmd *cmd)
 	struct tcmur_device *rdev = tcmu_get_daemon_dev_private(dev);
 	uint8_t *cdb = cmd->cdb;
 
+	if (cmd->iovec[0].iov_len == 2560)
+		sleep(1);
+
 	track_aio_request_start(rdev);
 
 	if (tcmu_dev_in_recovery(dev)) {
2018-03-14 13:56:03.042 953 [DEBUG] main:1056: handler path: /usr/lib64/tcmu-runner
2018-03-14 13:56:03.043 953 [INFO] dyn_config_start:410: Inotify is watching "/etc/tcmu/tcmu.conf", wd: 1, mask: IN_ALL_EVENTS
2018-03-14 13:56:03.059 953 [INFO] load_our_module:537: Inserted module 'target_core_user'
2018-03-14 13:56:03.102 953 [DEBUG] main:1069: 2 runner handlers found
2018-03-14 13:56:03.104 953 [DEBUG] dbus_bus_acquired:441: bus org.kernel.TCMUService1 acquired
2018-03-14 13:56:03.105 953 [DEBUG] dbus_name_acquired:457: name org.kernel.TCMUService1 acquired
2018-03-14 13:56:05.474 953 [DEBUG] handle_netlink:207: cmd 1. Got header version 2. Supported 2.
2018-03-14 13:56:05.474 953 [DEBUG] dev_added:763 rbd/rbd.disk_1: Got block_size 512, size in bytes 2147483648
2018-03-14 13:56:05.474 953 [DEBUG] tcmu_rbd_open:739 rbd/rbd.disk_1: tcmu_rbd_open config rbd/rbd/disk_1;osd_op_timeout=30 block size 512 num lbas 4194304.
2018-03-14 13:56:05.482 953 [DEBUG] timer_check_and_set_def:378 rbd/rbd.disk_1: The cluster's default osd op timeout(0.00), osd heartbeat grace(20) interval(6)
2018-03-14 13:56:05.505 953 [DEBUG] tcmu_rbd_detect_device_class:295 rbd/rbd.disk_1: Pool rbd using crush rule "replicated_rule"
2018-03-14 13:56:05.506 953 [DEBUG] tcmu_rbd_detect_device_class:311 rbd/rbd.disk_1: SSD not a registered device class.
2018-03-14 13:56:11.933 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:11.935 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:11.937 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:11.938 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:12.930 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:12.932 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:12.935 953 [DEBUG] 

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-14 Thread Maxim Patlasov
On Wed, Mar 14, 2018 at 12:05 PM, Michael Christie 
wrote:

> On 03/14/2018 01:27 PM, Michael Christie wrote:
> > On 03/14/2018 01:24 PM, Maxim Patlasov wrote:
> >> On Wed, Mar 14, 2018 at 11:13 AM, Jason Dillaman wrote:
> >>
> >> Maxim, can you provide steps for a reproducer?
> >>
> >>
> >> Yes, but it involves adding two artificial delays: one in tcmu-runner
> >> and another in kernel iscsi. If you're willing to take pains of
> >
> > Send the patches for the changes.
> >
> >> recompiling kernel and tcmu-runner on one of gateway nodes, I'll help to
> >> reproduce.
> >>
> >> Generally, the idea of reproducer is simple: let's model a situation
> >> when two stale requests got stuck in kernel mailbox waiting to be
> >> consumed by tcmu-runner, and another one got stuck in iscsi layer --
> >> immediately after reading iscsi request from the socket. If we unblock
> >> tcmu-runner after newer data went through another gateway, the first
> >> stale request will switch tcmu-runner state from LOCKED to UNLOCKED
> >> state, then the second stale request will trigger alua_thread to
> >> re-acquire the lock, so when the third request comes to tcmu-runner, the
> When you send the patches that add your delays, could you also send the
> target side /var/log/tcmu-runner.log with log_level = 4.
>
> For this test above you should see the second request will be sent to
> rbd's tcmu_rbd_aio_write function. That command should fail in
> rbd_finish_aio_generic and tcmu_rbd_handle_blacklisted_cmd will be
> called. We should then be blocking until IO in that iscsi connection is
> flushed in tgt_port_grp_recovery_thread_fn. That function will not
> return from the enable=0 until the iscsi connection is stopped and the
> commands in it have completed.
>
> Other commands you had in flight should eventually hit
> tcmur_cmd_handler's tcmu_dev_in_recovery check and be failed there or if
> they had already passed that check then the cmd would be sent to
> tcmu_rbd_aio_write and they should be getting the blacklisted error like
> above.
>
>
Mike,

In my scenario the second request is not sent to rbd's tcmu_rbd_aio_write
function:

tcmur_cmd_handler -->
  tcmur_alua_implicit_transition -->
    alua_implicit_transition -->   // rdev->lock_state == UNLOCKED here
      tcmu_set_sense_data          // returns SAM_STAT_CHECK_CONDITION

Hence tcmur_cmd_handler goes to "untrack:". I'll send
/var/log/tcmu-runner.log and delay patches an hour later.

Thanks,
Maxim


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-14 Thread Michael Christie
On 03/14/2018 01:27 PM, Michael Christie wrote:
> On 03/14/2018 01:24 PM, Maxim Patlasov wrote:
>> On Wed, Mar 14, 2018 at 11:13 AM, Jason Dillaman wrote:
>>
>> Maxim, can you provide steps for a reproducer?
>>
>>
>> Yes, but it involves adding two artificial delays: one in tcmu-runner
>> and another in kernel iscsi. If you're willing to take pains of
> 
> Send the patches for the changes.
> 
>> recompiling kernel and tcmu-runner on one of gateway nodes, I'll help to
>> reproduce.
>>
>> Generally, the idea of reproducer is simple: let's model a situation
>> when two stale requests got stuck in kernel mailbox waiting to be
>> consumed by tcmu-runner, and another one got stuck in iscsi layer --
>> immediately after reading iscsi request from the socket. If we unblock
>> tcmu-runner after newer data went through another gateway, the first
>> stale request will switch tcmu-runner state from LOCKED to UNLOCKED state,
>> then the second stale request will trigger alua_thread to
>> re-acquire the lock, so when the third request comes to tcmu-runner, the
When you send the patches that add your delays, could you also send the
target side /var/log/tcmu-runner.log with log_level = 4.

For this test above you should see the second request will be sent to
rbd's tcmu_rbd_aio_write function. That command should fail in
rbd_finish_aio_generic and tcmu_rbd_handle_blacklisted_cmd will be
called. We should then be blocking until IO in that iscsi connection is
flushed in tgt_port_grp_recovery_thread_fn. That function will not
return from the enable=0 until the iscsi connection is stopped and the
commands in it have completed.

Other commands you had in flight should eventually hit
tcmur_cmd_handler's tcmu_dev_in_recovery check and be failed there, or,
if they had already passed that check, the cmd would be sent to
tcmu_rbd_aio_write and they should get the blacklisted error like
above.


>> lock is already reacquired and it goes to OSD smoothly overwriting newer
>> data.
>>
>>  
>>
>>
>> On Wed, Mar 14, 2018 at 2:06 PM, Maxim Patlasov wrote:
>> > On Sun, Mar 11, 2018 at 5:10 PM, Mike Christie wrote:
>> >>
>> >> On 03/11/2018 08:54 AM, shadow_lin wrote:
>> >> > Hi Jason,
>> >> > How the old target gateway is blacklisted? Is it a feature of
>> the target
>> >> > gateway(which can support active/passive multipath) should
>> provide or is
>> >> > it only by rbd excusive lock?
>> >> > I think excusive lock only let one client can write to rbd at
>> the same
>> >> > time,but another client can obtain the lock later when the lock is
>> >> > released.
>> >>
>> >> For the case where we had the lock and it got taken:
>> >>
>> >> If IO was blocked, then unjammed and it has already passed the target
>> >> level checks then the IO will be failed by the OSD due to the
>> >> blacklisting. When we get IO errors from ceph indicating we are
>> >> blacklisted the tcmu rbd layer will fail the IO indicating the state
>> >> change and that the IO can be retried. We will also tell the target
>> >> layer rbd does not have the lock anymore and to just stop the iscsi
>> >> connection while we clean up the blacklisting, running commands and
>> >> update our state.
>> >
>> >
>> > Mike, can you please give more details on how you tell the target
>> layer rbd
>> > does not have the lock and to stop iscsi connection. Which
>> > tcmu-runner/kernel-target functions are used for that?
>> >
>> > In fact, I performed an experiment with three stale write requests
>> stuck on
>> > blacklisted gateway, and one of them managed to overwrite newer
>> data. I
>> > followed all instructions from
>> >
>> http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/ 
>> 
>> and
>> > http://docs.ceph.com/docs/master/rbd/iscsi-target-cli/
>> , so I'm
>> interested
>> > what I'm missing...
>> >
>> > Thanks,
>> > Maxim
>> >
>> > Thanks,
>> > Maxim
>> >
>> >>
>> >>
>> >
>>
>>
>>
>> --
>> Jason
>>
>>
> 



Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-14 Thread Maxim Patlasov
On Wed, Mar 14, 2018 at 11:47 AM, Michael Christie 
wrote:

>
> > ...
>
> Ignore all these questions.  I'm pretty sure I know the issue.
>
>
Fine, but can you please also elaborate on:

> For this case it would be tcmu_rbd_handle_blacklisted_cmd

How does it tell the kernel to stop the iscsi connection?

Thanks,
Maxim


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-14 Thread Michael Christie
On 03/14/2018 01:26 PM, Michael Christie wrote:
> On 03/14/2018 01:06 PM, Maxim Patlasov wrote:
>> On Sun, Mar 11, 2018 at 5:10 PM, Mike Christie wrote:
>>
>> On 03/11/2018 08:54 AM, shadow_lin wrote:
>> > Hi Jason,
>> > How the old target gateway is blacklisted? Is it a feature of the 
>> target
>> > gateway(which can support active/passive multipath) should provide or 
>> is
>> > it only by rbd excusive lock?
>> > I think excusive lock only let one client can write to rbd at the same
>> > time,but another client can obtain the lock later when the lock is 
>> released.
>>
>> For the case where we had the lock and it got taken:
>>
>> If IO was blocked, then unjammed and it has already passed the target
>> level checks then the IO will be failed by the OSD due to the
>> blacklisting. When we get IO errors from ceph indicating we are
>> blacklisted the tcmu rbd layer will fail the IO indicating the state
>> change and that the IO can be retried. We will also tell the target
>> layer rbd does not have the lock anymore and to just stop the iscsi
>> connection while we clean up the blacklisting, running commands and
>> update our state.
>>
>>
>> Mike, can you please give more details on how you tell the target layer
>> rbd does not have the lock and to stop iscsi connection. Which
>> tcmu-runner/kernel-target functions are used for that?
> 
> For this case it would be tcmu_rbd_handle_blacklisted_cmd. Note for
> failback type of test, we might not hit that error if the initiator does
> a RTPG before it sends IO. In that case we would see
> tcmu_update_dev_lock_state get run first and the iscsi connection would
> not be dropped.
> 
>>
>> In fact, I performed an experiment with three stale write requests stuck
>> on blacklisted gateway, and one of them managed to overwrite newer data.
> 
> What is the test exactly? What OS for the initiator?
> 
> What kernel were you using and are you using the upstream tools/libs or
> the RHCS ones?
> 
> Can you run your tests and send the initiator side kernel logs and on
> the iscsi targets send the /var/log/tcmu-runner.log with debugging in
> enabled. To do that open
> 
> /etc/tcmu/tcmu.conf
> 
> on the iscsi target nodes and set
> 
> log_level = 5
> 
> If that is too much output drop it to level 4.
> 
> 
>> I followed all instructions from
>> http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/
>> and http://docs.ceph.com/docs/master/rbd/iscsi-target-cli/, so I'm
>> interested what I'm missing...
> 
> You used the initiator settings in
> http://docs.ceph.com/docs/master/rbd/iscsi-initiators/
> too right?
> 

Ignore all these questions.  I'm pretty sure I know the issue.



Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-14 Thread Michael Christie
On 03/14/2018 01:24 PM, Maxim Patlasov wrote:
> On Wed, Mar 14, 2018 at 11:13 AM, Jason Dillaman wrote:
> 
> Maxim, can you provide steps for a reproducer?
> 
> 
> Yes, but it involves adding two artificial delays: one in tcmu-runner
> and another in kernel iscsi. If you're willing to take pains of

Send the patches for the changes.

> recompiling kernel and tcmu-runner on one of gateway nodes, I'll help to
> reproduce.
> 
> Generally, the idea of reproducer is simple: let's model a situation
> when two stale requests got stuck in kernel mailbox waiting to be
> consumed by tcmu-runner, and another one got stuck in iscsi layer --
> immediately after reading iscsi request from the socket. If we unblock
> tcmu-runner after newer data went through another gateway, the first
> stale request will switch tcmu-runner state from LOCKED to UNLOCKED
> state, then the second stale request will trigger alua_thread to
> re-acquire the lock, so when the third request comes to tcmu-runner, the
> lock is already reacquired and it goes to OSD smoothly overwriting newer
> data.
> 
>  
> 
> 
> On Wed, Mar 14, 2018 at 2:06 PM, Maxim Patlasov wrote:
> > On Sun, Mar 11, 2018 at 5:10 PM, Mike Christie wrote:
> >>
> >> On 03/11/2018 08:54 AM, shadow_lin wrote:
> >> > Hi Jason,
> >> > How the old target gateway is blacklisted? Is it a feature of
> the target
> >> > gateway(which can support active/passive multipath) should
> provide or is
> >> > it only by rbd excusive lock?
> >> > I think excusive lock only let one client can write to rbd at
> the same
> >> > time,but another client can obtain the lock later when the lock is
> >> > released.
> >>
> >> For the case where we had the lock and it got taken:
> >>
> >> If IO was blocked, then unjammed and it has already passed the target
> >> level checks then the IO will be failed by the OSD due to the
> >> blacklisting. When we get IO errors from ceph indicating we are
> >> blacklisted the tcmu rbd layer will fail the IO indicating the state
> >> change and that the IO can be retried. We will also tell the target
> >> layer rbd does not have the lock anymore and to just stop the iscsi
> >> connection while we clean up the blacklisting, running commands and
> >> update our state.
> >
> >
> > Mike, can you please give more details on how you tell the target
> layer rbd
> > does not have the lock and to stop iscsi connection. Which
> > tcmu-runner/kernel-target functions are used for that?
> >
> > In fact, I performed an experiment with three stale write requests
> stuck on
> > blacklisted gateway, and one of them managed to overwrite newer
> data. I
> > followed all instructions from
> >
> http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/ 
> 
> and
> > http://docs.ceph.com/docs/master/rbd/iscsi-target-cli/
> , so I'm
> interested
> > what I'm missing...
> >
> > Thanks,
> > Maxim
> >
> > Thanks,
> > Maxim
> >
> >>
> >>
> >
> 
> 
> 
> --
> Jason
> 
> 



Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-14 Thread Michael Christie
On 03/14/2018 01:06 PM, Maxim Patlasov wrote:
> On Sun, Mar 11, 2018 at 5:10 PM, Mike Christie wrote:
> 
> On 03/11/2018 08:54 AM, shadow_lin wrote:
> > Hi Jason,
> > How the old target gateway is blacklisted? Is it a feature of the target
> > gateway(which can support active/passive multipath) should provide or is
> > it only by rbd excusive lock?
> > I think excusive lock only let one client can write to rbd at the same
> > time,but another client can obtain the lock later when the lock is 
> released.
> 
> For the case where we had the lock and it got taken:
> 
> If IO was blocked, then unjammed and it has already passed the target
> level checks then the IO will be failed by the OSD due to the
> blacklisting. When we get IO errors from ceph indicating we are
> blacklisted the tcmu rbd layer will fail the IO indicating the state
> change and that the IO can be retried. We will also tell the target
> layer rbd does not have the lock anymore and to just stop the iscsi
> connection while we clean up the blacklisting, running commands and
> update our state.
> 
> 
> Mike, can you please give more details on how you tell the target layer
> rbd does not have the lock and to stop iscsi connection. Which
> tcmu-runner/kernel-target functions are used for that?

For this case it would be tcmu_rbd_handle_blacklisted_cmd. Note that for
a failback type of test, we might not hit that error if the initiator does
an RTPG before it sends IO. In that case we would see
tcmu_update_dev_lock_state get run first and the iscsi connection would
not be dropped.

> 
> In fact, I performed an experiment with three stale write requests stuck
> on blacklisted gateway, and one of them managed to overwrite newer data.

What is the test exactly? What OS for the initiator?

What kernel were you using and are you using the upstream tools/libs or
the RHCS ones?

Can you run your tests and send the initiator side kernel logs, and on
the iscsi targets send the /var/log/tcmu-runner.log with debugging
enabled. To do that, open

/etc/tcmu/tcmu.conf

on the iscsi target nodes and set

log_level = 5

If that is too much output drop it to level 4.


> I followed all instructions from
> http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/
> and http://docs.ceph.com/docs/master/rbd/iscsi-target-cli/, so I'm
> interested what I'm missing...

You used the initiator settings in
http://docs.ceph.com/docs/master/rbd/iscsi-initiators/
too right?

> 
> Thanks,
> Maxim
> 
> Thanks,
> Maxim
>  
> 
> 



Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-14 Thread Maxim Patlasov
On Wed, Mar 14, 2018 at 11:13 AM, Jason Dillaman 
wrote:

> Maxim, can you provide steps for a reproducer?
>

Yes, but it involves adding two artificial delays: one in tcmu-runner and
another in kernel iscsi. If you're willing to take the pains of recompiling
the kernel and tcmu-runner on one of the gateway nodes, I'll help to reproduce.

Generally, the idea of the reproducer is simple: let's model a situation where
two stale requests got stuck in the kernel mailbox waiting to be consumed by
tcmu-runner, and another one got stuck in the iscsi layer -- immediately after
the iscsi request was read from the socket. If we unblock tcmu-runner after
newer data went through another gateway, the first stale request will
switch tcmu-runner's state from LOCKED to UNLOCKED, then the second
stale request will trigger alua_thread to re-acquire the lock, so when the
third request comes to tcmu-runner, the lock is already reacquired and it
goes to the OSD, smoothly overwriting newer data.



>
> On Wed, Mar 14, 2018 at 2:06 PM, Maxim Patlasov 
> wrote:
> > On Sun, Mar 11, 2018 at 5:10 PM, Mike Christie 
> wrote:
> >>
> >> On 03/11/2018 08:54 AM, shadow_lin wrote:
> >> > Hi Jason,
> >> > How the old target gateway is blacklisted? Is it a feature of the
> target
> >> > gateway(which can support active/passive multipath) should provide or
> is
> >> > it only by rbd excusive lock?
> >> > I think excusive lock only let one client can write to rbd at the same
> >> > time,but another client can obtain the lock later when the lock is
> >> > released.
> >>
> >> For the case where we had the lock and it got taken:
> >>
> >> If IO was blocked, then unjammed and it has already passed the target
> >> level checks then the IO will be failed by the OSD due to the
> >> blacklisting. When we get IO errors from ceph indicating we are
> >> blacklisted the tcmu rbd layer will fail the IO indicating the state
> >> change and that the IO can be retried. We will also tell the target
> >> layer rbd does not have the lock anymore and to just stop the iscsi
> >> connection while we clean up the blacklisting, running commands and
> >> update our state.
> >
> >
> > Mike, can you please give more details on how you tell the target layer
> rbd
> > does not have the lock and to stop iscsi connection. Which
> > tcmu-runner/kernel-target functions are used for that?
> >
> > In fact, I performed an experiment with three stale write requests stuck
> on
> > blacklisted gateway, and one of them managed to overwrite newer data. I
> > followed all instructions from
> > http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/
> and
> > http://docs.ceph.com/docs/master/rbd/iscsi-target-cli/, so I'm
> interested
> > what I'm missing...
> >
> > Thanks,
> > Maxim
> >
> > Thanks,
> > Maxim
> >
> >>
> >>
> >
>
>
>
> --
> Jason
>


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-14 Thread Jason Dillaman
Maxim, can you provide steps for a reproducer?

On Wed, Mar 14, 2018 at 2:06 PM, Maxim Patlasov  wrote:
> On Sun, Mar 11, 2018 at 5:10 PM, Mike Christie  wrote:
>>
>> On 03/11/2018 08:54 AM, shadow_lin wrote:
>> > Hi Jason,
>> > How the old target gateway is blacklisted? Is it a feature of the target
>> > gateway(which can support active/passive multipath) should provide or is
>> > it only by rbd excusive lock?
>> > I think excusive lock only let one client can write to rbd at the same
>> > time,but another client can obtain the lock later when the lock is
>> > released.
>>
>> For the case where we had the lock and it got taken:
>>
>> If IO was blocked, then unjammed and it has already passed the target
>> level checks then the IO will be failed by the OSD due to the
>> blacklisting. When we get IO errors from ceph indicating we are
>> blacklisted the tcmu rbd layer will fail the IO indicating the state
>> change and that the IO can be retried. We will also tell the target
>> layer rbd does not have the lock anymore and to just stop the iscsi
>> connection while we clean up the blacklisting, running commands and
>> update our state.
>
>
> Mike, can you please give more details on how you tell the target layer rbd
> does not have the lock and to stop iscsi connection. Which
> tcmu-runner/kernel-target functions are used for that?
>
> In fact, I performed an experiment with three stale write requests stuck on
> blacklisted gateway, and one of them managed to overwrite newer data. I
> followed all instructions from
> http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/ and
> http://docs.ceph.com/docs/master/rbd/iscsi-target-cli/, so I'm interested
> what I'm missing...
>
> Thanks,
> Maxim
>
> Thanks,
> Maxim
>
>>
>>
>



-- 
Jason


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-14 Thread Maxim Patlasov
On Sun, Mar 11, 2018 at 5:10 PM, Mike Christie  wrote:

> On 03/11/2018 08:54 AM, shadow_lin wrote:
> > Hi Jason,
> > How the old target gateway is blacklisted? Is it a feature of the target
> > gateway(which can support active/passive multipath) should provide or is
> > it only by rbd excusive lock?
> > I think excusive lock only let one client can write to rbd at the same
> > time,but another client can obtain the lock later when the lock is
> released.
>
> For the case where we had the lock and it got taken:
>
> If IO was blocked, then unjammed and it has already passed the target
> level checks then the IO will be failed by the OSD due to the
> blacklisting. When we get IO errors from ceph indicating we are
> blacklisted the tcmu rbd layer will fail the IO indicating the state
> change and that the IO can be retried. We will also tell the target
> layer rbd does not have the lock anymore and to just stop the iscsi
> connection while we clean up the blacklisting, running commands and
> update our state.
>

Mike, can you please give more details on how you tell the target layer that rbd
does not have the lock anymore and to stop the iscsi connection? Which
tcmu-runner/kernel-target functions are used for that?

In fact, I performed an experiment with three stale write requests stuck on
a blacklisted gateway, and one of them managed to overwrite newer data. I
followed all the instructions from
http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/ and
http://docs.ceph.com/docs/master/rbd/iscsi-target-cli/, so I'm interested
in what I'm missing...

Thanks,
Maxim

Thanks,
Maxim


>
>


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-13 Thread Ilya Dryomov
On Mon, Mar 12, 2018 at 8:20 PM, Maged Mokhtar  wrote:
> On 2018-03-12 21:00, Ilya Dryomov wrote:
>
> On Mon, Mar 12, 2018 at 7:41 PM, Maged Mokhtar  wrote:
>
> On 2018-03-12 14:23, David Disseldorp wrote:
>
> On Fri, 09 Mar 2018 11:23:02 +0200, Maged Mokhtar wrote:
>
> 2) I understand that before switching the path, the initiator will send a
> TMF ABORT. Can we pass this down to the same abort_request() function
> in osd_client that is used for osd_request_timeout expiry?
>
>
> IIUC, the existing abort_request() codepath only cancels the I/O on the
> client/gw side. A TMF ABORT successful response should only be sent if
> we can guarantee that the I/O is terminated at all layers below, so I
> think this would have to be implemented via an additional OSD epoch
> barrier or similar.
>
> Cheers, David
>
> Hi David,
>
> I was thinking we would get the block request then loop down to all its osd
> requests and cancel those using the same  osd request cancel function.
>
>
> All that function does is tear down OSD client / messenger data
> structures associated with the OSD request.  Any OSD request that hit
> the TCP layer may eventually get through to the OSDs.
>
> Thanks,
>
> Ilya
>
> Hi Ilya,
>
> OK..so i guess this also applies as well to osd_request_timeout expiry, it
> is not guaranteed to stop all stale ios.

Yes.  The purpose of osd_request_timeout is to unblock the client side
by failing the I/O on the client side.  It doesn't attempt to stop any
in-flight I/O -- it simply marks it as failed.

Thanks,

Ilya


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-12 Thread Maged Mokhtar
On 2018-03-12 21:00, Ilya Dryomov wrote:

> On Mon, Mar 12, 2018 at 7:41 PM, Maged Mokhtar  wrote: 
> 
>> On 2018-03-12 14:23, David Disseldorp wrote:
>> 
>> On Fri, 09 Mar 2018 11:23:02 +0200, Maged Mokhtar wrote:
>> 
>> 2) I understand that before switching the path, the initiator will send a
>> TMF ABORT. Can we pass this down to the same abort_request() function
>> in osd_client that is used for osd_request_timeout expiry?
>> 
>> IIUC, the existing abort_request() codepath only cancels the I/O on the
>> client/gw side. A TMF ABORT successful response should only be sent if
>> we can guarantee that the I/O is terminated at all layers below, so I
>> think this would have to be implemented via an additional OSD epoch
>> barrier or similar.
>> 
>> Cheers, David
>> 
>> Hi David,
>> 
>> I was thinking we would get the block request then loop down to all its osd
>> requests and cancel those using the same  osd request cancel function.
> 
> All that function does is tear down OSD client / messenger data
> structures associated with the OSD request.  Any OSD request that hit
> the TCP layer may eventually get through to the OSDs.
> 
> Thanks,
> 
> Ilya

Hi Ilya, 

OK.. so I guess this also applies to osd_request_timeout expiry; it is
not guaranteed to stop all stale IOs.

Maged


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-12 Thread Ilya Dryomov
On Mon, Mar 12, 2018 at 7:41 PM, Maged Mokhtar  wrote:
> On 2018-03-12 14:23, David Disseldorp wrote:
>
> On Fri, 09 Mar 2018 11:23:02 +0200, Maged Mokhtar wrote:
>
> 2) I understand that before switching the path, the initiator will send a
> TMF ABORT. Can we pass this down to the same abort_request() function
> in osd_client that is used for osd_request_timeout expiry?
>
>
> IIUC, the existing abort_request() codepath only cancels the I/O on the
> client/gw side. A TMF ABORT successful response should only be sent if
> we can guarantee that the I/O is terminated at all layers below, so I
> think this would have to be implemented via an additional OSD epoch
> barrier or similar.
>
> Cheers, David
>
> Hi David,
>
> I was thinking we would get the block request then loop down to all its osd
> requests and cancel those using the same  osd request cancel function.

All that function does is tear down OSD client / messenger data
structures associated with the OSD request.  Any OSD request that hit
the TCP layer may eventually get through to the OSDs.

Thanks,

Ilya


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-12 Thread David Disseldorp
Hi Maged,

On Mon, 12 Mar 2018 20:41:22 +0200, Maged Mokhtar wrote:

> I was thinking we would get the block request then loop down to all its
> osd requests and cancel those using the same  osd request cancel
> function. 

Until we can be certain of termination, I don't think it makes sense to
change the current behaviour of blocking the TMF ABORT response until
the cluster I/O completes.

Cheers, David


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-12 Thread Maged Mokhtar
On 2018-03-12 14:23, David Disseldorp wrote:

> On Fri, 09 Mar 2018 11:23:02 +0200, Maged Mokhtar wrote:
> 
>> 2) I understand that before switching the path, the initiator will send a
>> TMF ABORT. Can we pass this down to the same abort_request() function
>> in osd_client that is used for osd_request_timeout expiry?
> 
> IIUC, the existing abort_request() codepath only cancels the I/O on the
> client/gw side. A TMF ABORT successful response should only be sent if
> we can guarantee that the I/O is terminated at all layers below, so I
> think this would have to be implemented via an additional OSD epoch
> barrier or similar.
> 
> Cheers, David

Hi David, 

I was thinking we would get the block request, then loop down to all its
osd requests and cancel those using the same osd request cancel
function.

Maged


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-12 Thread David Disseldorp
On Fri, 09 Mar 2018 11:23:02 +0200, Maged Mokhtar wrote:

> 2) I understand that before switching the path, the initiator will send a
> TMF ABORT. Can we pass this down to the same abort_request() function
> in osd_client that is used for osd_request_timeout expiry?

IIUC, the existing abort_request() codepath only cancels the I/O on the
client/gw side. A TMF ABORT successful response should only be sent if
we can guarantee that the I/O is terminated at all layers below, so I
think this would have to be implemented via an additional OSD epoch
barrier or similar.

Cheers, David


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-11 Thread Mike Christie
On 03/11/2018 08:54 AM, shadow_lin wrote:
> Hi Jason,
> How is the old target gateway blacklisted? Is it a feature that the target
> gateway (which can support active/passive multipath) should provide, or is
> it done only by the rbd exclusive lock?
> I think the exclusive lock only lets one client write to the rbd at the same
> time, but another client can obtain the lock later when the lock is released.

For the case where we had the lock and it got taken:

If IO was blocked, then unjammed and it has already passed the target
level checks then the IO will be failed by the OSD due to the
blacklisting. When we get IO errors from ceph indicating we are
blacklisted the tcmu rbd layer will fail the IO indicating the state
change and that the IO can be retried. We will also tell the target
layer rbd does not have the lock anymore and to just stop the iscsi
connection while we clean up the blacklisting, running commands and
update our state.

The case where the initiator switched on us while we were grabbing the
lock is similar:

After we grab the lock and before we start sending IO to the rbd/ceph
layers, we will have flushed IO in various queues similar to above but a
little less invasively and tested the iscsi connection to make sure it
is not stuck on the network. If the path is still the good one, then the
initaitor will retry the IOs on it. If the iscsi connection has been
dropped, then the iscsi layer detects this and just drops IO during the
flush. So, if the failover timers have fired and the multipath layer is
already using a new path then the IO is not going to be running on
multiple paths.
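
For those trying to map that description onto code, here is a rough sketch of
the shape of the completion-path logic. It is not the actual tcmu-runner rbd
handler: the error value, types and helper names below are assumptions (the
real code is around the tcmu_rbd_handle_blacklisted_cmd and lock-state update
paths mentioned elsewhere in this thread).

#include <stdio.h>

/* Sketch only. BLACKLISTED_ERR is an assumed errno-style value for
 * "this client has been blacklisted"; check librbd for the real one. */
#define BLACKLISTED_ERR    108
#define STATUS_GOOD          0
#define STATUS_RETRY_OTHER   1   /* "state changed, retry on another path" */

struct dev { int lock_lost; };
struct cmd { int result; };

static void stop_iscsi_conn(struct dev *d)
{
    (void)d;   /* would tell the target layer to flush and stop this connection */
}

static void start_lock_recovery(struct dev *d)
{
    (void)d;   /* would clear the blacklist entry and update our lock state */
}

static int complete_rbd_aio(struct dev *d, struct cmd *c, int rbd_ret)
{
    if (rbd_ret == -BLACKLISTED_ERR) {
        /* Another gateway broke our lock and blacklisted us: fail the
         * IO with a retryable status so the initiator resends it on
         * the new path, and kick off local cleanup. */
        d->lock_lost = 1;
        stop_iscsi_conn(d);
        start_lock_recovery(d);
        c->result = STATUS_RETRY_OTHER;
    } else {
        c->result = (rbd_ret < 0) ? rbd_ret : STATUS_GOOD;
    }
    return c->result;
}

int main(void)
{
    struct dev d = { 0 };
    struct cmd c = { 0 };

    printf("result=%d lock_lost=%d\n",
           complete_rbd_aio(&d, &c, -BLACKLISTED_ERR), d.lock_lost);
    return 0;
}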

>  
> 2018-03-11
> 
> shadowlin
>  
> 
> 
> *From:* Jason Dillaman <jdill...@redhat.com>
> *Sent:* 2018-03-11 07:46
> *Subject:* Re: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
> *To:* "shadow_lin"<shadow_...@163.com>
> *Cc:* "Mike Christie"<mchri...@redhat.com>,"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>
>  
> On Sat, Mar 10, 2018 at 10:11 AM, shadow_lin <shadow_...@163.com> wrote: 
> > Hi Jason, 
> > 
> >>As discussed in this thread, for active/passive, upon initiator 
> >>failover, we used the RBD exclusive-lock feature to blacklist the old 
> >>"active" iSCSI target gateway so that it cannot talk w/ the Ceph 
> >>cluster before new writes are accepted on the new target gateway. 
> > 
> > I can get during the new active target gateway was talking to rbd the 
> old 
> > active target gateway cannot write because of the RBD exclusive-lock 
> > But after the new target gateway done the writes,if the old target 
> gateway 
> > had some blocked io during the failover,cant it then get the lock and 
> > overwrite the new writes? 
>  
> Negative -- it's blacklisted so it cannot talk to the cluster. 
>  
> > PS: 
> > Petasan say they can do active/active iscsi with patched suse kernel. 
>  
> I'll let them comment on these corner cases. 
>  
> > 2018-03-10 
>     >  
> > shadowlin 
> > 
> >  
> > 
> > From: Jason Dillaman <jdill...@redhat.com>
> > Sent: 2018-03-10 21:40
> > Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
> > To: "shadow_lin"<shadow_...@163.com>
> > Cc: "Mike Christie"<mchri...@redhat.com>,"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>
> > 
> > On Sat, Mar 10, 2018 at 7:42 AM, shadow_lin <shadow_...@163.com> wrote: 
> >> Hi Mike, 
> >> So for now only suse kernel with target_rbd_core and tcmu-runner can 
> run 
> >> active/passive multipath safely? 
> > 
> > Negative, the LIO / tcmu-runner implementation documented here [1] is 
> > safe for active/passive. 
> > 
> >> I am a newbie to iscsi. I think the stuck io getting executed causing an
> >> overwrite problem can happen with both active/active and active/passive.
> >> What makes the active/passive safer than active/active? 
> > 
> > As discussed in this thread, for active/passive, upon initiator 
> > failover, we used the RBD exclusive-lock feature to blacklist the old 
>

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-11 Thread Jason Dillaman
On Sun, Mar 11, 2018 at 9:54 AM, shadow_lin <shadow_...@163.com> wrote:
> Hi Jason,
> How is the old target gateway blacklisted?

When the newly active target gateway breaks the lock of the old target
gateway, that process will blacklist the old client [1].

> Is it a feature that the target
> gateway (which can support active/passive multipath) should provide, or is it
> done only by the rbd exclusive lock?
> I think the exclusive lock only lets one client write to the rbd at the same
> time, but another client can obtain the lock later when the lock is released.

In general, yes -- but blacklist on lock break has been part of
exclusive-lock since the start. I am honestly not just making this up,
this is how it works.
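
If you want to poke at the mechanism Jason describes from userspace, librbd
exposes the exclusive-lock management calls directly. The sketch below is only
an illustration: the pool/image names are placeholders, error handling is
abbreviated, and you should check rbd/librbd.h on your release for the exact
signatures. Breaking the lock held by another client is what blacklists that
client (subject to the rbd_blacklist_on_break_lock option).

#include <stdio.h>
#include <rados/librados.h>
#include <rbd/librbd.h>

int main(void)
{
    rados_t cluster;
    rados_ioctx_t ioctx;
    rbd_image_t image;
    rbd_lock_mode_t mode;
    char *owners[1];
    size_t num_owners = 1;

    if (rados_create(&cluster, NULL) < 0 ||
        rados_conf_read_file(cluster, NULL) < 0 ||
        rados_connect(cluster) < 0)
        return 1;

    if (rados_ioctx_create(cluster, "rbd", &ioctx) < 0 ||
        rbd_open(ioctx, "disk_1", &image, NULL) < 0)
        return 1;

    /* Who currently owns the exclusive lock? */
    if (rbd_lock_get_owners(image, &mode, owners, &num_owners) == 0 &&
        num_owners > 0) {
        printf("current lock owner: %s\n", owners[0]);

        /* Breaking the lock blacklists the old owner, so any writes it
         * still has in flight are rejected by the OSDs. */
        rbd_lock_break(image, RBD_LOCK_MODE_EXCLUSIVE, owners[0]);
        rbd_lock_get_owners_cleanup(owners, num_owners);
    }

    /* Then this client can take the lock itself. */
    rbd_lock_acquire(image, RBD_LOCK_MODE_EXCLUSIVE);

    rbd_close(image);
    rados_ioctx_destroy(ioctx);
    rados_shutdown(cluster);
    return 0;
}

This is essentially what the gateway does implicitly through the rbd handler
during failover; "ceph osd blacklist ls" on a monitor node shows the resulting
blacklist entries.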

> 2018-03-11
> 
> shadowlin
>
> 
>
> From: Jason Dillaman <jdill...@redhat.com>
> Sent: 2018-03-11 07:46
> Subject: Re: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
> To: "shadow_lin"<shadow_...@163.com>
> Cc: "Mike Christie"<mchri...@redhat.com>,"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>
>
> On Sat, Mar 10, 2018 at 10:11 AM, shadow_lin <shadow_...@163.com> wrote:
>> Hi Jason,
>>
>>>As discussed in this thread, for active/passive, upon initiator
>>>failover, we used the RBD exclusive-lock feature to blacklist the old
>>>"active" iSCSI target gateway so that it cannot talk w/ the Ceph
>>>cluster before new writes are accepted on the new target gateway.
>>
>> I can get during the new active target gateway was talking to rbd the old
>> active target gateway cannot write because of the RBD exclusive-lock
>> But after the new target gateway done the writes,if the old target gateway
>> had some blocked io during the failover,cant it then get the lock and
>> overwrite the new writes?
>
> Negative -- it's blacklisted so it cannot talk to the cluster.
>
>> PS:
>> Petasan say they can do active/active iscsi with patched suse kernel.
>
> I'll let them comment on these corner cases.
>
>> 2018-03-10
>> 
>> shadowlin
>>
>> 
>>
>> From: Jason Dillaman <jdill...@redhat.com>
>> Sent: 2018-03-10 21:40
>> Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
>> To: "shadow_lin"<shadow_...@163.com>
>> Cc: "Mike Christie"<mchri...@redhat.com>,"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>
>>
>> On Sat, Mar 10, 2018 at 7:42 AM, shadow_lin <shadow_...@163.com> wrote:
>>> Hi Mike,
>>> So for now only suse kernel with target_rbd_core and tcmu-runner can run
>>> active/passive multipath safely?
>>
>> Negative, the LIO / tcmu-runner implementation documented here [1] is
>> safe for active/passive.
>>
>>> I am a newbie to iscsi. I think the stuck io getting executed causing an
>>> overwrite problem can happen with both active/active and active/passive.
>>> What makes the active/passive safer than active/active?
>>
>> As discussed in this thread, for active/passive, upon initiator
>> failover, we used the RBD exclusive-lock feature to blacklist the old
>> "active" iSCSI target gateway so that it cannot talk w/ the Ceph
>> cluster before new writes are accepted on the new target gateway.
>>
>>> What mechanism should be implemented to avoid the problem with
>>> active/passive
>>> and active/active multipath?
>>
>> Active/passive is solved as discussed above. For active/active, we
>> don't have a solution that is known safe under all failure conditions.
>> If LIO supported MCS (multiple connections per session) instead of
>> just MPIO (multipath IO), the initiator would provide enough context
>> to the target to detect IOs from a failover situation.
>>
>>> 2018-03-10
>>> 
>>> shadowlin
>>>
>>> 
>>>
>>> From: Mike Christie <mchri...@redhat.com>
>>> Sent: 2018-03-09 00:54
>>> Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
>>> To: "shadow_lin"<shadow_...@163.com>,"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-11 Thread shadow_lin
Hi Jason,
How is the old target gateway blacklisted? Is it a feature that the target
gateway (which can support active/passive multipath) should provide, or is it
done only by the rbd exclusive lock?
I think the exclusive lock only lets one client write to the rbd at the same time, but
another client can obtain the lock later when the lock is released.

2018-03-11 


shadowlin




From: Jason Dillaman <jdill...@redhat.com>
Sent: 2018-03-11 07:46
Subject: Re: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
To: "shadow_lin"<shadow_...@163.com>
Cc: "Mike Christie"<mchri...@redhat.com>,"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>

On Sat, Mar 10, 2018 at 10:11 AM, shadow_lin <shadow_...@163.com> wrote: 
> Hi Jason, 
> 
>>As discussed in this thread, for active/passive, upon initiator 
>>failover, we used the RBD exclusive-lock feature to blacklist the old 
>>"active" iSCSI target gateway so that it cannot talk w/ the Ceph 
>>cluster before new writes are accepted on the new target gateway. 
> 
> I can get during the new active target gateway was talking to rbd the old 
> active target gateway cannot write because of the RBD exclusive-lock 
> But after the new target gateway done the writes,if the old target gateway 
> had some blocked io during the failover,cant it then get the lock and 
> overwrite the new writes? 

Negative -- it's blacklisted so it cannot talk to the cluster. 

> PS: 
> Petasan say they can do active/active iscsi with patched suse kernel. 

I'll let them comment on these corner cases. 

> 2018-03-10 
> ____ 
> shadowlin 
> 
> ____________ 
> 
> From: Jason Dillaman <jdill...@redhat.com>
> Sent: 2018-03-10 21:40
> Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
> To: "shadow_lin"<shadow_...@163.com>
> Cc: "Mike Christie"<mchri...@redhat.com>,"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>
> 
> On Sat, Mar 10, 2018 at 7:42 AM, shadow_lin <shadow_...@163.com> wrote: 
>> Hi Mike, 
>> So for now only suse kernel with target_rbd_core and tcmu-runner can run 
>> active/passive multipath safely? 
> 
> Negative, the LIO / tcmu-runner implementation documented here [1] is 
> safe for active/passive. 
> 
>> I am a newbie to iscsi. I think the stuck io get excuted cause overwrite 
>> problem can happen with both active/active and active/passive. 
>> What makes the active/passive safer than active/active? 
> 
> As discussed in this thread, for active/passive, upon initiator 
> failover, we used the RBD exclusive-lock feature to blacklist the old 
> "active" iSCSI target gateway so that it cannot talk w/ the Ceph 
> cluster before new writes are accepted on the new target gateway. 
> 
>> What mechanism should be implement to avoid the problem with 
>> active/passive 
>> and active/active multipath? 
> 
> Active/passive it solved as discussed above. For active/active, we 
> don't have a solution that is known safe under all failure conditions. 
> If LIO supported MCS (multiple connections per session) instead of 
> just MPIO (multipath IO), the initiator would provide enough context 
> to the target to detect IOs from a failover situation. 
> 
>> 2018-03-10 
>>  
>> shadowlin 
>> 
>>  
>> 
>> 发件人:Mike Christie <mchri...@redhat.com> 
>> 发送时间:2018-03-09 00:54 
>> 主题:Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock 
>> 收件人:"shadow_lin"<shadow_...@163.com>,"Lazuardi 
>> Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com> 
>> 抄送: 
>> 
>> On 03/07/2018 09:24 AM, shadow_lin wrote: 
>>> Hi Christie, 
>>> Is it safe to use active/passive multipath with krbd with exclusive lock 
>>> for lio/tgt/scst/tcmu? 
>> 
>> No. We tried to use lio and krbd initially, but there is a issue where 
>> IO might get stuck in the target/block layer and get executed after new 
>> IO. So for lio, tgt and tcmu it is not safe as is right now. We could 
>> add some code tcmu's file_example handler which can be used with krbd so 
>> it works like the rbd one. 
>> 
>> I do know enough about SCST right now. 
>> 
>> 
>>> Is it safe to use active/active multipath If use suse kernel with 
>>> target_core_rbd? 
>>> Thanks. 
>>> 
>>> 2018-03-07 
>>

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-10 Thread Maged Mokhtar



--
From: "Jason Dillaman" <jdill...@redhat.com>
Sent: Sunday, March 11, 2018 1:46 AM
To: "shadow_lin" <shadow_...@163.com>
Cc: "Lazuardi Nasution" <mrxlazuar...@gmail.com>; "Ceph Users" 
<ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive 
Lock



On Sat, Mar 10, 2018 at 10:11 AM, shadow_lin <shadow_...@163.com> wrote:

Hi Jason,


As discussed in this thread, for active/passive, upon initiator
failover, we used the RBD exclusive-lock feature to blacklist the old
"active" iSCSI target gateway so that it cannot talk w/ the Ceph
cluster before new writes are accepted on the new target gateway.


I can get during the new active target gateway was talking to rbd the old
active target gateway cannot write because of the RBD exclusive-lock
But after the new target gateway done the writes,if the old target 
gateway

had some blocked io during the failover,cant it then get the lock and
overwrite the new writes?


Negative -- it's blacklisted so it cannot talk to the cluster.


PS:
Petasan say they can do active/active iscsi with patched suse kernel.


I'll let them comment on these corner cases.


We are not currently handling these corner cases. We have not hit this in 
practice but will work on it. We need to account for in-flight time early in 
the target stack before reaching krbd/tcmu.

/Maged


2018-03-10

shadowlin

____

发件人:Jason Dillaman <jdill...@redhat.com>
发送时间:2018-03-10 21:40
主题:Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive 
Lock

收件人:"shadow_lin"<shadow_...@163.com>
抄送:"Mike Christie"<mchri...@redhat.com>,"Lazuardi
Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>

On Sat, Mar 10, 2018 at 7:42 AM, shadow_lin <shadow_...@163.com> wrote:

Hi Mike,
So for now only suse kernel with target_rbd_core and tcmu-runner can run
active/passive multipath safely?


Negative, the LIO / tcmu-runner implementation documented here [1] is
safe for active/passive.


I am a newbie to iscsi. I think the stuck io get excuted cause overwrite
problem can happen with both active/active and active/passive.
What makes the active/passive safer than active/active?


As discussed in this thread, for active/passive, upon initiator
failover, we used the RBD exclusive-lock feature to blacklist the old
"active" iSCSI target gateway so that it cannot talk w/ the Ceph
cluster before new writes are accepted on the new target gateway.


What mechanism should be implement to avoid the problem with
active/passive
and active/active multipath?


Active/passive it solved as discussed above. For active/active, we
don't have a solution that is known safe under all failure conditions.
If LIO supported MCS (multiple connections per session) instead of
just MPIO (multipath IO), the initiator would provide enough context
to the target to detect IOs from a failover situation.


2018-03-10
____________
shadowlin

____

发件人:Mike Christie <mchri...@redhat.com>
发送时间:2018-03-09 00:54
主题:Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive 
Lock

收件人:"shadow_lin"<shadow_...@163.com>,"Lazuardi
Nasution"<mrxlazuar...@gmail.com>,"Ceph 
Users"<ceph-users@lists.ceph.com>

抄送:

On 03/07/2018 09:24 AM, shadow_lin wrote:

Hi Christie,
Is it safe to use active/passive multipath with krbd with exclusive 
lock

for lio/tgt/scst/tcmu?


No. We tried to use lio and krbd initially, but there is a issue where
IO might get stuck in the target/block layer and get executed after new
IO. So for lio, tgt and tcmu it is not safe as is right now. We could
add some code tcmu's file_example handler which can be used with krbd so
it works like the rbd one.

I do know enough about SCST right now.



Is it safe to use active/active multipath If use suse kernel with
target_core_rbd?
Thanks.

2018-03-07
----------------
shadowlin

----

*发件人:*Mike Christie <mchri...@redhat.com>
*发送时间:*2018-03-07 03:51
*主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
Exclusive Lock
*收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph
Users"<ceph-users@lists.ceph.com>
*抄送:*

On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
> Hi,
>
> I want to do load balanced multipathing (multiple iSCSI
gateway/exporter
> nodes) of iSCSI backed with RBD images. Should I disable 
exclusive

lock
> feature? What if I don't disable that feature? I'm using TGT
(manual
> way) since I g

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-10 Thread Jason Dillaman
On Sat, Mar 10, 2018 at 10:11 AM, shadow_lin <shadow_...@163.com> wrote:
> Hi Jason,
>
>>As discussed in this thread, for active/passive, upon initiator
>>failover, we used the RBD exclusive-lock feature to blacklist the old
>>"active" iSCSI target gateway so that it cannot talk w/ the Ceph
>>cluster before new writes are accepted on the new target gateway.
>
> I can get during the new active target gateway was talking to rbd the old
> active target gateway cannot write because of the RBD exclusive-lock
> But after the new target gateway done the writes,if the old target gateway
> had some blocked io during the failover,cant it then get the lock and
> overwrite the new writes?

Negative -- it's blacklisted so it cannot talk to the cluster.

> PS:
> Petasan say they can do active/active iscsi with patched suse kernel.

I'll let them comment on these corner cases.

> 2018-03-10
> 
> shadowlin
>
> ____
>
> 发件人:Jason Dillaman <jdill...@redhat.com>
> 发送时间:2018-03-10 21:40
> 主题:Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
> 收件人:"shadow_lin"<shadow_...@163.com>
> 抄送:"Mike Christie"<mchri...@redhat.com>,"Lazuardi
> Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>
>
> On Sat, Mar 10, 2018 at 7:42 AM, shadow_lin <shadow_...@163.com> wrote:
>> Hi Mike,
>> So for now only suse kernel with target_rbd_core and tcmu-runner can run
>> active/passive multipath safely?
>
> Negative, the LIO / tcmu-runner implementation documented here [1] is
> safe for active/passive.
>
>> I am a newbie to iscsi. I think the stuck io get excuted cause overwrite
>> problem can happen with both active/active and active/passive.
>> What makes the active/passive safer than active/active?
>
> As discussed in this thread, for active/passive, upon initiator
> failover, we used the RBD exclusive-lock feature to blacklist the old
> "active" iSCSI target gateway so that it cannot talk w/ the Ceph
> cluster before new writes are accepted on the new target gateway.
>
>> What mechanism should be implement to avoid the problem with
>> active/passive
>> and active/active multipath?
>
> Active/passive it solved as discussed above. For active/active, we
> don't have a solution that is known safe under all failure conditions.
> If LIO supported MCS (multiple connections per session) instead of
> just MPIO (multipath IO), the initiator would provide enough context
> to the target to detect IOs from a failover situation.
>
>> 2018-03-10
>> 
>> shadowlin
>>
>> 
>>
>> 发件人:Mike Christie <mchri...@redhat.com>
>> 发送时间:2018-03-09 00:54
>> 主题:Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
>> 收件人:"shadow_lin"<shadow_...@163.com>,"Lazuardi
>> Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>
>> 抄送:
>>
>> On 03/07/2018 09:24 AM, shadow_lin wrote:
>>> Hi Christie,
>>> Is it safe to use active/passive multipath with krbd with exclusive lock
>>> for lio/tgt/scst/tcmu?
>>
>> No. We tried to use lio and krbd initially, but there is a issue where
>> IO might get stuck in the target/block layer and get executed after new
>> IO. So for lio, tgt and tcmu it is not safe as is right now. We could
>> add some code tcmu's file_example handler which can be used with krbd so
>> it works like the rbd one.
>>
>> I do know enough about SCST right now.
>>
>>
>>> Is it safe to use active/active multipath If use suse kernel with
>>> target_core_rbd?
>>> Thanks.
>>>
>>> 2018-03-07
>>> 
>>> shadowlin
>>>
>>> 
>>>
>>> *发件人:*Mike Christie <mchri...@redhat.com>
>>> *发送时间:*2018-03-07 03:51
>>> *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
>>> Exclusive Lock
>>> *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph
>>> Users"<ceph-users@lists.ceph.com>
>>> *抄送:*
>>>
>>> On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
>>> > Hi,
>>> >
>>> > I want to do load balanced multipathing (multiple iSCSI
>>> gatewa

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-10 Thread shadow_lin
Hi Jason,

>As discussed in this thread, for active/passive, upon initiator 
>failover, we used the RBD exclusive-lock feature to blacklist the old 
>"active" iSCSI target gateway so that it cannot talk w/ the Ceph 
>cluster before new writes are accepted on the new target gateway. 

I can see that while the new active target gateway is talking to RBD, the old 
active target gateway cannot write because of the RBD exclusive-lock. 
But after the new target gateway has done its writes, if the old target gateway had 
some blocked IO during the failover, can't it then get the lock and overwrite the 
new writes?

PS:
PetaSAN says they can do active/active iSCSI with a patched SUSE kernel.

2018-03-10 



shadowlin




发件人:Jason Dillaman <jdill...@redhat.com>
发送时间:2018-03-10 21:40
主题:Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
收件人:"shadow_lin"<shadow_...@163.com>
抄送:"Mike Christie"<mchri...@redhat.com>,"Lazuardi 
Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>

On Sat, Mar 10, 2018 at 7:42 AM, shadow_lin <shadow_...@163.com> wrote: 
> Hi Mike, 
> So for now only suse kernel with target_rbd_core and tcmu-runner can run 
> active/passive multipath safely? 

Negative, the LIO / tcmu-runner implementation documented here [1] is 
safe for active/passive. 

> I am a newbie to iscsi. I think the stuck io get excuted cause overwrite 
> problem can happen with both active/active and active/passive. 
> What makes the active/passive safer than active/active? 

As discussed in this thread, for active/passive, upon initiator 
failover, we used the RBD exclusive-lock feature to blacklist the old 
"active" iSCSI target gateway so that it cannot talk w/ the Ceph 
cluster before new writes are accepted on the new target gateway. 

> What mechanism should be implement to avoid the problem with active/passive 
> and active/active multipath? 

Active/passive it solved as discussed above. For active/active, we 
don't have a solution that is known safe under all failure conditions. 
If LIO supported MCS (multiple connections per session) instead of 
just MPIO (multipath IO), the initiator would provide enough context 
to the target to detect IOs from a failover situation. 

> 2018-03-10 
> ____ 
> shadowlin 
> 
> ________________ 
> 
> 发件人:Mike Christie <mchri...@redhat.com> 
> 发送时间:2018-03-09 00:54 
> 主题:Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock 
> 收件人:"shadow_lin"<shadow_...@163.com>,"Lazuardi 
> Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com> 
> 抄送: 
> 
> On 03/07/2018 09:24 AM, shadow_lin wrote: 
>> Hi Christie, 
>> Is it safe to use active/passive multipath with krbd with exclusive lock 
>> for lio/tgt/scst/tcmu? 
> 
> No. We tried to use lio and krbd initially, but there is a issue where 
> IO might get stuck in the target/block layer and get executed after new 
> IO. So for lio, tgt and tcmu it is not safe as is right now. We could 
> add some code tcmu's file_example handler which can be used with krbd so 
> it works like the rbd one. 
> 
> I do know enough about SCST right now. 
> 
> 
>> Is it safe to use active/active multipath If use suse kernel with 
>> target_core_rbd? 
>> Thanks. 
>> 
>> 2018-03-07 
>> ---------------- 
>> shadowlin 
>> 
>>  
>> 
>> *发件人:*Mike Christie <mchri...@redhat.com> 
>> *发送时间:*2018-03-07 03:51 
>> *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD 
>> Exclusive Lock 
>> *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph 
>> Users"<ceph-users@lists.ceph.com> 
>> *抄送:* 
>> 
>> On 03/06/2018 01:17 PM, Lazuardi Nasution wrote: 
>> > Hi, 
>> > 
>> > I want to do load balanced multipathing (multiple iSCSI 
>> gateway/exporter 
>> > nodes) of iSCSI backed with RBD images. Should I disable exclusive 
>> lock 
>> > feature? What if I don't disable that feature? I'm using TGT (manual 
>> > way) since I get so many CPU stuck error messages when I was using 
>> LIO. 
>> > 
>> 
>> You are using LIO/TGT with krbd right? 
>> 
>> You cannot or shouldn't do active/active multipathing. If you have the 
>> lock enabled then it bounces between paths for each IO and will be 
>> slow. 
>> If you do not have it enabled then you can end up with stale IO 
>> overwriting current data. 
>> 
>> 
>> 
>> 
> 
> 
> ___ 
> ceph-users mailing list 
> ceph-users@lists.ceph.com 
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
> 

[1] http://docs.ceph.com/docs/master/rbd/iscsi-overview/ 

--  
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-10 Thread Jason Dillaman
On Sat, Mar 10, 2018 at 7:42 AM, shadow_lin <shadow_...@163.com> wrote:
> Hi Mike,
> So for now only suse kernel with target_rbd_core and tcmu-runner can run
> active/passive multipath safely?

Negative, the LIO / tcmu-runner implementation documented here [1] is
safe for active/passive.

> I am a newbie to iscsi. I think the stuck io get excuted cause overwrite
> problem can happen with both active/active and active/passive.
> What makes the active/passive safer than active/active?

As discussed in this thread, for active/passive, upon initiator
failover, we used the RBD exclusive-lock feature to blacklist the old
"active" iSCSI target gateway so that it cannot talk w/ the Ceph
cluster before new writes are accepted on the new target gateway.

> What mechanism should be implement to avoid the problem with active/passive
> and active/active multipath?

Active/passive is solved as discussed above. For active/active, we
don't have a solution that is known safe under all failure conditions.
If LIO supported MCS (multiple connections per session) instead of
just MPIO (multipath IO), the initiator would provide enough context
to the target to detect IOs from a failover situation.

> 2018-03-10
> 
> shadowlin
>
> 
>
> 发件人:Mike Christie <mchri...@redhat.com>
> 发送时间:2018-03-09 00:54
> 主题:Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
> 收件人:"shadow_lin"<shadow_...@163.com>,"Lazuardi
> Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>
> 抄送:
>
> On 03/07/2018 09:24 AM, shadow_lin wrote:
>> Hi Christie,
>> Is it safe to use active/passive multipath with krbd with exclusive lock
>> for lio/tgt/scst/tcmu?
>
> No. We tried to use lio and krbd initially, but there is a issue where
> IO might get stuck in the target/block layer and get executed after new
> IO. So for lio, tgt and tcmu it is not safe as is right now. We could
> add some code tcmu's file_example handler which can be used with krbd so
> it works like the rbd one.
>
> I do know enough about SCST right now.
>
>
>> Is it safe to use active/active multipath If use suse kernel with
>> target_core_rbd?
>> Thanks.
>>
>> 2018-03-07
>> --------
>> shadowlin
>>
>> ----
>>
>> *发件人:*Mike Christie <mchri...@redhat.com>
>> *发送时间:*2018-03-07 03:51
>> *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
>> Exclusive Lock
>> *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph
>> Users"<ceph-users@lists.ceph.com>
>> *抄送:*
>>
>> On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
>> > Hi,
>> >
>> > I want to do load balanced multipathing (multiple iSCSI
>> gateway/exporter
>> > nodes) of iSCSI backed with RBD images. Should I disable exclusive
>> lock
>> > feature? What if I don't disable that feature? I'm using TGT (manual
>> > way) since I get so many CPU stuck error messages when I was using
>> LIO.
>> >
>>
>> You are using LIO/TGT with krbd right?
>>
>> You cannot or shouldn't do active/active multipathing. If you have the
>> lock enabled then it bounces between paths for each IO and will be
>> slow.
>> If you do not have it enabled then you can end up with stale IO
>> overwriting current data.
>>
>>
>>
>>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

[1] http://docs.ceph.com/docs/master/rbd/iscsi-overview/

-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-10 Thread shadow_lin
Hi Mike,
So for now only the SUSE kernel with target_core_rbd and tcmu-runner can run 
active/passive multipath safely?
I am a newbie to iSCSI. I think the problem of stuck IO getting executed and causing 
an overwrite can happen with both active/active and active/passive.
What makes active/passive safer than active/active?
What mechanism should be implemented to avoid the problem with active/passive and 
active/active multipath?
2018-03-10 


shadowlin




发件人:Mike Christie <mchri...@redhat.com>
发送时间:2018-03-09 00:54
主题:Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
收件人:"shadow_lin"<shadow_...@163.com>,"Lazuardi 
Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>
抄送:

On 03/07/2018 09:24 AM, shadow_lin wrote: 
> Hi Christie, 
> Is it safe to use active/passive multipath with krbd with exclusive lock 
> for lio/tgt/scst/tcmu? 

No. We tried to use lio and krbd initially, but there is a issue where 
IO might get stuck in the target/block layer and get executed after new 
IO. So for lio, tgt and tcmu it is not safe as is right now. We could 
add some code tcmu's file_example handler which can be used with krbd so 
it works like the rbd one. 

I do know enough about SCST right now. 


> Is it safe to use active/active multipath If use suse kernel with 
> target_core_rbd? 
> Thanks. 
>   
> 2018-03-07 
>  
> shadowlin 
>   
>  
>  
>     *发件人:*Mike Christie <mchri...@redhat.com> 
> *发送时间:*2018-03-07 03:51 
> *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD 
> Exclusive Lock 
> *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph 
> Users"<ceph-users@lists.ceph.com> 
> *抄送:* 
>   
> On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:  
> > Hi,  
> >   
> > I want to do load balanced multipathing (multiple iSCSI 
> gateway/exporter  
> > nodes) of iSCSI backed with RBD images. Should I disable exclusive lock 
>  
> > feature? What if I don't disable that feature? I'm using TGT (manual  
> > way) since I get so many CPU stuck error messages when I was using LIO. 
>  
> >   
>   
> You are using LIO/TGT with krbd right?  
>   
> You cannot or shouldn't do active/active multipathing. If you have the  
> lock enabled then it bounces between paths for each IO and will be slow.  
> If you do not have it enabled then you can end up with stale IO  
> overwriting current data.  
>   
>   
>   
>  ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-09 Thread Maged Mokhtar
rt. For
> example, if you were using tgt/lio right now and this was a
> COMPARE_AND_WRITE, the READ part might take osd_request_timeout - 1
> seconds, and then the write part might take osd_request_timeout -1
> seconds so you need to have your fast_io_fail long enough for that type
> of case. For tgt a WRITE_SAME command might be N WRITEs to krbd, so you
> need to make sure your queue depths are set so you do not end up with
> something similar as the CAW but where M WRITEs get executed and take
> osd_request_timeout -1 seconds then M more, etc and at some point the
> iscsi connection is lost so the failover timer had started. Some ceph
> requests also might be multiple requests.
> 
> Maybe an overly paranoid case, but I still worry about because I do not
> want to mess up anyone's data, is that a disk on the iscsi target node
> goes flakey. In the target we do kmalloc(GFP_KERNEL) to execute a SCSI
> command, and that blocks trying to write data to the flakey disk. If the
> disk recovers and we can eventually recover, did you account for the
> recovery timers in that code path when configuring the failover and krbd
> timers.
> 
> One other case we have been debating about is if krbd/librbd is able to
> put the ceph request on the wire but then the iscsi connection goes
> down, will the ceph request always get sent to the OSD before the
> initiator side failover timeouts have fired and it starts using a
> different target node.
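To make the timer arithmetic above concrete, a rough sketch of the two knobs being
discussed -- the numbers are placeholders, not recommendations, and the exact option
names depend on whether the gateway uses krbd or librbd:

    # Initiator side, /etc/multipath.conf: how long a path may be unresponsive
    # before IO is retried on another path. Per the reasoning above, this window
    # has to cover the worst-case time a command can still be executed on the
    # old gateway (e.g. several osd_request_timeout periods for CAW/WRITE_SAME).
    defaults {
        fast_io_fail_tmo  25
    }

    # Gateway side (krbd): bound how long a queued ceph request can live, e.g.
    # as a map option (check that your kernel supports it):
    #   rbd map rbd/disk1 -o osd_request_timeout=25

Note Mike's correction elsewhere in this thread: the dangerous direction is an
osd_request_timeout that is too long relative to the failover window, not too short.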
> 
>> Best regards,
>> 
>> On Mar 8, 2018 11:54 PM, "Mike Christie" <mchri...@redhat.com
>> <mailto:mchri...@redhat.com>> wrote:
>> 
>> On 03/07/2018 09:24 AM, shadow_lin wrote:
>>> Hi Christie,
>>> Is it safe to use active/passive multipath with krbd with
>> exclusive lock
>>> for lio/tgt/scst/tcmu?
>> 
>> No. We tried to use lio and krbd initially, but there is a issue where
>> IO might get stuck in the target/block layer and get executed after new
>> IO. So for lio, tgt and tcmu it is not safe as is right now. We could
>> add some code tcmu's file_example handler which can be used with krbd so
>> it works like the rbd one.
>> 
>> I do know enough about SCST right now.
>> 
>>> Is it safe to use active/active multipath If use suse kernel with
>>> target_core_rbd?
>>> Thanks.
>>> 
>>> 2018-03-07
>>> 
>> 
>>> shadowlin
>>> 
>>> 
>> 
>>> 
>>> *发件人:*Mike Christie <mchri...@redhat.com
>> <mailto:mchri...@redhat.com>>
>>> *发送时间:*2018-03-07 03:51
>>> *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
>>> Exclusive Lock
>>> *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com
>> <mailto:mrxlazuar...@gmail.com>>,"Ceph
>>> Users"<ceph-users@lists.ceph.com
>> <mailto:ceph-users@lists.ceph.com>>
>>> *抄送:*
>>> 
>>> On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
>>>> Hi,
>>>>
>>>> I want to do load balanced multipathing (multiple iSCSI
>> gateway/exporter
>>>> nodes) of iSCSI backed with RBD images. Should I disable
>> exclusive lock
>>>> feature? What if I don't disable that feature? I'm using TGT
>> (manual
>>>> way) since I get so many CPU stuck error messages when I was
>> using LIO.
>>>>
>>> 
>>> You are using LIO/TGT with krbd right?
>>> 
>>> You cannot or shouldn't do active/active multipathing. If you
>> have the
>>> lock enabled then it bounces between paths for each IO and
>> will be slow.
>>> If you do not have it enabled then you can end up with stale IO
>>> overwriting current data.
>>> 
>>> 
>>> 
>>> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-08 Thread Lazuardi Nasution
Hi Jason,

I understand. Thank you for your explanation.

Best regards,

On Mar 9, 2018 3:45 AM, "Jason Dillaman" <jdill...@redhat.com> wrote:

> On Thu, Mar 8, 2018 at 3:41 PM, Lazuardi Nasution
> <mrxlazuar...@gmail.com> wrote:
> > Hi Jason,
> >
> > If there is the case that the gateway cannot access the Ceph, I think you
> > are right. Anyway, I put iSCSI Gateway on MON node.
>
> It's connectivity to the specific OSD associated to the IO operation
> that is the issue. If you understand the risks and are comfortable
> with them, active/active is a perfectly acceptable solution. I just
> wanted to ensure you understood the risk since you stated corruption
> "seems impossible".
>
> > Best regards,
> >
> >
> > On Mar 9, 2018 1:41 AM, "Jason Dillaman" <jdill...@redhat.com> wrote:
> >
> > On Thu, Mar 8, 2018 at 12:47 PM, Lazuardi Nasution
> > <mrxlazuar...@gmail.com> wrote:
> >> Jason,
> >>
> >> As long you don't activate any cache and single image for single client
> >> only, it seem impossible to have old data overwrite. May be, it is
> related
> >> to I/O pattern too. Anyway, maybe other Ceph users have different
> >> experience. It can be different result with different case.
> >
> > Write operation (A) is sent to gateway X who cannot access the Ceph
> > cluster so the IO is queued. The initiator's multipath layer times out
> > and resents write operation (A) to gateway Y, followed by write
> > operation (A') to gateway Y. Shortly thereafter, gateway X is able to
> > send its delayed write operation (A) to the Ceph cluster and
> > overwrites write operation (A') -- thus your data went back in time.
> >
> >> Best regards,
> >>
> >>
> >> On Mar 9, 2018 12:35 AM, "Jason Dillaman" <jdill...@redhat.com> wrote:
> >>
> >> On Thu, Mar 8, 2018 at 11:59 AM, Lazuardi Nasution
> >> <mrxlazuar...@gmail.com> wrote:
> >>> Hi Mike,
> >>>
> >>> Since I have moved from LIO to TGT, I can do full ALUA (active/active)
> of
> >>> multiple gateways. Of course I have to disable any write back cache at
> >>> any
> >>> level (RBD cache and TGT cache). It seem to be safe to disable
> exclusive
> >>> lock since each RBD image is accessed only by single client and as long
> >>> as
> >>> I
> >>> know mostly ALUA use RR of I/O path.
> >>
> >> How do you figure that's safe for preventing an overwrite with old
> >> data in an active/active path hiccup?
> >>
> >>> Best regards,
> >>>
> >>> On Mar 8, 2018 11:54 PM, "Mike Christie" <mchri...@redhat.com> wrote:
> >>>>
> >>>> On 03/07/2018 09:24 AM, shadow_lin wrote:
> >>>> > Hi Christie,
> >>>> > Is it safe to use active/passive multipath with krbd with exclusive
> >>>> > lock
> >>>> > for lio/tgt/scst/tcmu?
> >>>>
> >>>> No. We tried to use lio and krbd initially, but there is a issue where
> >>>> IO might get stuck in the target/block layer and get executed after
> new
> >>>> IO. So for lio, tgt and tcmu it is not safe as is right now. We could
> >>>> add some code tcmu's file_example handler which can be used with krbd
> so
> >>>> it works like the rbd one.
> >>>>
> >>>> I do know enough about SCST right now.
> >>>>
> >>>>
> >>>> > Is it safe to use active/active multipath If use suse kernel with
> >>>> > target_core_rbd?
> >>>> > Thanks.
> >>>> >
> >>>> > 2018-03-07
> >>>> >
> >>>> >
> >>>> > 
> 
> >>>> > shadowlin
> >>>> >
> >>>> >
> >>>> >
> >>>> > 
> 
> >>>> >
> >>>> > *发件人:*Mike Christie <mchri...@redhat.com>
> >>>> > *发送时间:*2018-03-07 03:51
> >>>> > *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
> >>>> > Exclusive Lock
> >>>> > *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph
> >>>> > Users"<

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-08 Thread Jason Dillaman
On Thu, Mar 8, 2018 at 3:41 PM, Lazuardi Nasution
<mrxlazuar...@gmail.com> wrote:
> Hi Jason,
>
> If there is the case that the gateway cannot access the Ceph, I think you
> are right. Anyway, I put iSCSI Gateway on MON node.

It's connectivity to the specific OSD associated with the IO operation
that is the issue. If you understand the risks and are comfortable
with them, active/active is a perfectly acceptable solution. I just
wanted to ensure you understood the risk since you stated corruption
"seems impossible".

> Best regards,
>
>
> On Mar 9, 2018 1:41 AM, "Jason Dillaman" <jdill...@redhat.com> wrote:
>
> On Thu, Mar 8, 2018 at 12:47 PM, Lazuardi Nasution
> <mrxlazuar...@gmail.com> wrote:
>> Jason,
>>
>> As long you don't activate any cache and single image for single client
>> only, it seem impossible to have old data overwrite. May be, it is related
>> to I/O pattern too. Anyway, maybe other Ceph users have different
>> experience. It can be different result with different case.
>
> Write operation (A) is sent to gateway X who cannot access the Ceph
> cluster so the IO is queued. The initiator's multipath layer times out
> and resents write operation (A) to gateway Y, followed by write
> operation (A') to gateway Y. Shortly thereafter, gateway X is able to
> send its delayed write operation (A) to the Ceph cluster and
> overwrites write operation (A') -- thus your data went back in time.
>
>> Best regards,
>>
>>
>> On Mar 9, 2018 12:35 AM, "Jason Dillaman" <jdill...@redhat.com> wrote:
>>
>> On Thu, Mar 8, 2018 at 11:59 AM, Lazuardi Nasution
>> <mrxlazuar...@gmail.com> wrote:
>>> Hi Mike,
>>>
>>> Since I have moved from LIO to TGT, I can do full ALUA (active/active) of
>>> multiple gateways. Of course I have to disable any write back cache at
>>> any
>>> level (RBD cache and TGT cache). It seem to be safe to disable exclusive
>>> lock since each RBD image is accessed only by single client and as long
>>> as
>>> I
>>> know mostly ALUA use RR of I/O path.
>>
>> How do you figure that's safe for preventing an overwrite with old
>> data in an active/active path hiccup?
>>
>>> Best regards,
>>>
>>> On Mar 8, 2018 11:54 PM, "Mike Christie" <mchri...@redhat.com> wrote:
>>>>
>>>> On 03/07/2018 09:24 AM, shadow_lin wrote:
>>>> > Hi Christie,
>>>> > Is it safe to use active/passive multipath with krbd with exclusive
>>>> > lock
>>>> > for lio/tgt/scst/tcmu?
>>>>
>>>> No. We tried to use lio and krbd initially, but there is a issue where
>>>> IO might get stuck in the target/block layer and get executed after new
>>>> IO. So for lio, tgt and tcmu it is not safe as is right now. We could
>>>> add some code tcmu's file_example handler which can be used with krbd so
>>>> it works like the rbd one.
>>>>
>>>> I do know enough about SCST right now.
>>>>
>>>>
>>>> > Is it safe to use active/active multipath If use suse kernel with
>>>> > target_core_rbd?
>>>> > Thanks.
>>>> >
>>>> > 2018-03-07
>>>> >
>>>> >
>>>> > 
>>>> > shadowlin
>>>> >
>>>> >
>>>> >
>>>> > 
>>>> >
>>>> > *发件人:*Mike Christie <mchri...@redhat.com>
>>>> > *发送时间:*2018-03-07 03:51
>>>> > *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
>>>> > Exclusive Lock
>>>> > *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph
>>>> > Users"<ceph-users@lists.ceph.com>
>>>> > *抄送:*
>>>> >
>>>> > On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
>>>> > > Hi,
>>>> > >
>>>> > > I want to do load balanced multipathing (multiple iSCSI
>>>> > gateway/exporter
>>>> > > nodes) of iSCSI backed with RBD images. Should I disable
>>>> > exclusive
>>>> > lock
>>>> > > feature? What if I don't disable that feature? I'm using TGT
>>>> > (manual
>>>> > > way) since I get so many CPU stuck error messages when I was
>>>> > using
>>>> > LIO.
>>>> > >
>>>> >
>>>> > You are using LIO/TGT with krbd right?
>>>> >
>>>> > You cannot or shouldn't do active/active multipathing. If you have
>>>> > the
>>>> > lock enabled then it bounces between paths for each IO and will be
>>>> > slow.
>>>> > If you do not have it enabled then you can end up with stale IO
>>>> > overwriting current data.
>>>> >
>>>> >
>>>> >
>>>> >
>>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>>
>>
>> --
>> Jason
>>
>>
>>
>
>
>
> --
> Jason
>
>



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-08 Thread Lazuardi Nasution
Hi Jason,

If there is a case where the gateway cannot access the Ceph cluster, I think you
are right. Anyway, I put the iSCSI gateway on a MON node.

Best regards,


On Mar 9, 2018 1:41 AM, "Jason Dillaman" <jdill...@redhat.com> wrote:

On Thu, Mar 8, 2018 at 12:47 PM, Lazuardi Nasution
<mrxlazuar...@gmail.com> wrote:
> Jason,
>
> As long you don't activate any cache and single image for single client
> only, it seem impossible to have old data overwrite. May be, it is related
> to I/O pattern too. Anyway, maybe other Ceph users have different
> experience. It can be different result with different case.

Write operation (A) is sent to gateway X who cannot access the Ceph
cluster so the IO is queued. The initiator's multipath layer times out
and resents write operation (A) to gateway Y, followed by write
operation (A') to gateway Y. Shortly thereafter, gateway X is able to
send its delayed write operation (A) to the Ceph cluster and
overwrites write operation (A') -- thus your data went back in time.

> Best regards,
>
>
> On Mar 9, 2018 12:35 AM, "Jason Dillaman" <jdill...@redhat.com> wrote:
>
> On Thu, Mar 8, 2018 at 11:59 AM, Lazuardi Nasution
> <mrxlazuar...@gmail.com> wrote:
>> Hi Mike,
>>
>> Since I have moved from LIO to TGT, I can do full ALUA (active/active) of
>> multiple gateways. Of course I have to disable any write back cache at
any
>> level (RBD cache and TGT cache). It seem to be safe to disable exclusive
>> lock since each RBD image is accessed only by single client and as long
as
>> I
>> know mostly ALUA use RR of I/O path.
>
> How do you figure that's safe for preventing an overwrite with old
> data in an active/active path hiccup?
>
>> Best regards,
>>
>> On Mar 8, 2018 11:54 PM, "Mike Christie" <mchri...@redhat.com> wrote:
>>>
>>> On 03/07/2018 09:24 AM, shadow_lin wrote:
>>> > Hi Christie,
>>> > Is it safe to use active/passive multipath with krbd with exclusive
>>> > lock
>>> > for lio/tgt/scst/tcmu?
>>>
>>> No. We tried to use lio and krbd initially, but there is a issue where
>>> IO might get stuck in the target/block layer and get executed after new
>>> IO. So for lio, tgt and tcmu it is not safe as is right now. We could
>>> add some code tcmu's file_example handler which can be used with krbd so
>>> it works like the rbd one.
>>>
>>> I do know enough about SCST right now.
>>>
>>>
>>> > Is it safe to use active/active multipath If use suse kernel with
>>> > target_core_rbd?
>>> > Thanks.
>>> >
>>> > 2018-03-07
>>> >
>>> > 

>>> > shadowlin
>>> >
>>> >
>>> > 

>>> >
>>> > *发件人:*Mike Christie <mchri...@redhat.com>
>>> > *发送时间:*2018-03-07 03:51
>>> > *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
>>> > Exclusive Lock
>>> > *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph
>>> > Users"<ceph-users@lists.ceph.com>
>>> > *抄送:*
>>> >
>>> > On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
>>> > > Hi,
>>> > >
>>> > > I want to do load balanced multipathing (multiple iSCSI
>>> > gateway/exporter
>>> > > nodes) of iSCSI backed with RBD images. Should I disable
>>> > exclusive
>>> > lock
>>> > > feature? What if I don't disable that feature? I'm using TGT
>>> > (manual
>>> > > way) since I get so many CPU stuck error messages when I was
>>> > using
>>> > LIO.
>>> > >
>>> >
>>> > You are using LIO/TGT with krbd right?
>>> >
>>> > You cannot or shouldn't do active/active multipathing. If you have
>>> > the
>>> > lock enabled then it bounces between paths for each IO and will be
>>> > slow.
>>> > If you do not have it enabled then you can end up with stale IO
>>> > overwriting current data.
>>> >
>>> >
>>> >
>>> >
>>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>
> --
> Jason
>
>
>



--
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-08 Thread Jason Dillaman
y about because I do not
>> want to mess up anyone's data, is that a disk on the iscsi target node
>> goes flakey. In the target we do kmalloc(GFP_KERNEL) to execute a SCSI
>> command, and that blocks trying to write data to the flakey disk. If the
>> disk recovers and we can eventually recover, did you account for the
>> recovery timers in that code path when configuring the failover and krbd
>> timers.
>>
>> One other case we have been debating about is if krbd/librbd is able to
>> put the ceph request on the wire but then the iscsi connection goes
>> down, will the ceph request always get sent to the OSD before the
>> initiator side failover timeouts have fired and it starts using a
>> different target node.
>
> If krbd/librbd is able to put the ceph request on the wire, then that could
> cause data corruption in the
> active/passive case too, right?

In general, yes. However, that's why the LIO/librbd approach uses the
RBD exclusive-lock feature in combination w/ Ceph client blacklisting
to ensure that cannot occur. Upon path failover, the old RBD client is
blacklisted from the Ceph cluster to ensure it can never complete its
(possible) in-flight writes.

> Thanks,
> Ashish
>
>>
>>
>>
>>> Best regards,
>>>
>>> On Mar 8, 2018 11:54 PM, "Mike Christie" <mchri...@redhat.com
>>> <mailto:mchri...@redhat.com>> wrote:
>>>
>>>  On 03/07/2018 09:24 AM, shadow_lin wrote:
>>>  > Hi Christie,
>>>  > Is it safe to use active/passive multipath with krbd with
>>>  exclusive lock
>>>  > for lio/tgt/scst/tcmu?
>>>
>>>  No. We tried to use lio and krbd initially, but there is a issue
>>> where
>>>  IO might get stuck in the target/block layer and get executed after
>>> new
>>>  IO. So for lio, tgt and tcmu it is not safe as is right now. We
>>> could
>>>  add some code tcmu's file_example handler which can be used with
>>> krbd so
>>>  it works like the rbd one.
>>>
>>>  I do know enough about SCST right now.
>>>
>>>
>>>  > Is it safe to use active/active multipath If use suse kernel with
>>>  > target_core_rbd?
>>>  > Thanks.
>>>  >
>>>  > 2018-03-07
>>>  >
>>>
>>> 
>>>  > shadowlin
>>>  >
>>>  >
>>>
>>> 
>>>  >
>>>  > *发件人:*Mike Christie <mchri...@redhat.com
>>>  <mailto:mchri...@redhat.com>>
>>>  > *发送时间:*2018-03-07 03:51
>>>  > *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
>>>  > Exclusive Lock
>>>  > *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com
>>>  <mailto:mrxlazuar...@gmail.com>>,"Ceph
>>>  > Users"<ceph-users@lists.ceph.com
>>>  <mailto:ceph-users@lists.ceph.com>>
>>>  > *抄送:*
>>>  >
>>>  > On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
>>>  > > Hi,
>>>  > >
>>>  > > I want to do load balanced multipathing (multiple iSCSI
>>>  gateway/exporter
>>>  > > nodes) of iSCSI backed with RBD images. Should I disable
>>>  exclusive lock
>>>  > > feature? What if I don't disable that feature? I'm using TGT
>>>  (manual
>>>  > > way) since I get so many CPU stuck error messages when I was
>>>  using LIO.
>>>  > >
>>>  >
>>>  > You are using LIO/TGT with krbd right?
>>>  >
>>>  > You cannot or shouldn't do active/active multipathing. If you
>>>  have the
>>>  > lock enabled then it bounces between paths for each IO and
>>>  will be slow.
>>>  > If you do not have it enabled then you can end up with stale
>>> IO
>>>  > overwriting current data.
>>>  >
>>>  >
>>>  >
>>>  >
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-08 Thread Ashish Samant
dhat.com>> wrote:

 On 03/07/2018 09:24 AM, shadow_lin wrote:
 > Hi Christie,
 > Is it safe to use active/passive multipath with krbd with
 exclusive lock
 > for lio/tgt/scst/tcmu?

 No. We tried to use lio and krbd initially, but there is a issue where
 IO might get stuck in the target/block layer and get executed after new
 IO. So for lio, tgt and tcmu it is not safe as is right now. We could
 add some code tcmu's file_example handler which can be used with krbd so
 it works like the rbd one.

 I do know enough about SCST right now.


 > Is it safe to use active/active multipath If use suse kernel with
 > target_core_rbd?
 > Thanks.
 >
 > 2018-03-07
 >
 
 > shadowlin
 >
 >
 
 >
 > *发件人:*Mike Christie <mchri...@redhat.com
 <mailto:mchri...@redhat.com>>
 > *发送时间:*2018-03-07 03:51
 > *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
 > Exclusive Lock
 > *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com
 <mailto:mrxlazuar...@gmail.com>>,"Ceph
 > Users"<ceph-users@lists.ceph.com
 <mailto:ceph-users@lists.ceph.com>>
 > *抄送:*
 >
 > On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
 > > Hi,
 > >
 > > I want to do load balanced multipathing (multiple iSCSI
 gateway/exporter
 > > nodes) of iSCSI backed with RBD images. Should I disable
 exclusive lock
 > > feature? What if I don't disable that feature? I'm using TGT
 (manual
 > > way) since I get so many CPU stuck error messages when I was
 using LIO.
 > >
 >
 > You are using LIO/TGT with krbd right?
 >
 > You cannot or shouldn't do active/active multipathing. If you
 have the
 > lock enabled then it bounces between paths for each IO and
 will be slow.
 > If you do not have it enabled then you can end up with stale IO
 > overwriting current data.
 >
 >
 >
 >


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-08 Thread Mike Christie
On 03/08/2018 12:44 PM, Mike Christie wrote:
> stuck/queued then your osd_request_timeout value might be too short. For

Sorry, I meant too long.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-08 Thread Mike Christie
ve multipath with krbd with
> exclusive lock
> > for lio/tgt/scst/tcmu?
> 
> No. We tried to use lio and krbd initially, but there is a issue where
> IO might get stuck in the target/block layer and get executed after new
> IO. So for lio, tgt and tcmu it is not safe as is right now. We could
> add some code tcmu's file_example handler which can be used with krbd so
> it works like the rbd one.
> 
> I do know enough about SCST right now.
> 
> 
> > Is it safe to use active/active multipath If use suse kernel with
> > target_core_rbd?
> > Thanks.
> >
> > 2018-03-07
> >
> 
> > shadowlin
> >
> >
> ----
> >
> >     *发件人:*Mike Christie <mchri...@redhat.com
> <mailto:mchri...@redhat.com>>
> > *发送时间:*2018-03-07 03:51
> > *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
> > Exclusive Lock
> > *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com
> <mailto:mrxlazuar...@gmail.com>>,"Ceph
> > Users"<ceph-users@lists.ceph.com
> <mailto:ceph-users@lists.ceph.com>>
> > *抄送:*
> >
> > On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
> > > Hi,
> > >
> > > I want to do load balanced multipathing (multiple iSCSI
> gateway/exporter
> > > nodes) of iSCSI backed with RBD images. Should I disable
> exclusive lock
> > > feature? What if I don't disable that feature? I'm using TGT
> (manual
> > > way) since I get so many CPU stuck error messages when I was
> using LIO.
> > >
> >
> > You are using LIO/TGT with krbd right?
> >
> > You cannot or shouldn't do active/active multipathing. If you
> have the
> > lock enabled then it bounces between paths for each IO and
> will be slow.
> > If you do not have it enabled then you can end up with stale IO
> > overwriting current data.
> >
> >
> >
> >
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-08 Thread Jason Dillaman
On Thu, Mar 8, 2018 at 12:47 PM, Lazuardi Nasution
<mrxlazuar...@gmail.com> wrote:
> Jason,
>
> As long you don't activate any cache and single image for single client
> only, it seem impossible to have old data overwrite. May be, it is related
> to I/O pattern too. Anyway, maybe other Ceph users have different
> experience. It can be different result with different case.

Write operation (A) is sent to gateway X, which cannot access the Ceph
cluster, so the IO is queued. The initiator's multipath layer times out
and resends write operation (A) to gateway Y, followed by write
operation (A') to gateway Y. Shortly thereafter, gateway X is able to
send its delayed write operation (A) to the Ceph cluster and
overwrites write operation (A') -- thus your data went back in time.
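A toy way to see this last-write-wins behaviour with the rados CLI -- pool and
object names are made up, and this is obviously not the gateway code path, it just
shows that RADOS keeps whichever write arrives last:

    echo "A-prime (newer)" > newer.bin
    rados -p rbd put testobj newer.bin      # the retried/newer write from gateway Y lands first
    echo "A (stale)" > stale.bin
    rados -p rbd put testobj stale.bin      # the delayed original from gateway X arrives afterwards
    rados -p rbd get testobj out.bin && cat out.bin   # reads back the stale data

Blacklisting the old gateway's client before the new gateway accepts writes is what
closes this window.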

> Best regards,
>
>
> On Mar 9, 2018 12:35 AM, "Jason Dillaman" <jdill...@redhat.com> wrote:
>
> On Thu, Mar 8, 2018 at 11:59 AM, Lazuardi Nasution
> <mrxlazuar...@gmail.com> wrote:
>> Hi Mike,
>>
>> Since I have moved from LIO to TGT, I can do full ALUA (active/active) of
>> multiple gateways. Of course I have to disable any write back cache at any
>> level (RBD cache and TGT cache). It seem to be safe to disable exclusive
>> lock since each RBD image is accessed only by single client and as long as
>> I
>> know mostly ALUA use RR of I/O path.
>
> How do you figure that's safe for preventing an overwrite with old
> data in an active/active path hiccup?
>
>> Best regards,
>>
>> On Mar 8, 2018 11:54 PM, "Mike Christie" <mchri...@redhat.com> wrote:
>>>
>>> On 03/07/2018 09:24 AM, shadow_lin wrote:
>>> > Hi Christie,
>>> > Is it safe to use active/passive multipath with krbd with exclusive
>>> > lock
>>> > for lio/tgt/scst/tcmu?
>>>
>>> No. We tried to use lio and krbd initially, but there is a issue where
>>> IO might get stuck in the target/block layer and get executed after new
>>> IO. So for lio, tgt and tcmu it is not safe as is right now. We could
>>> add some code tcmu's file_example handler which can be used with krbd so
>>> it works like the rbd one.
>>>
>>> I do know enough about SCST right now.
>>>
>>>
>>> > Is it safe to use active/active multipath If use suse kernel with
>>> > target_core_rbd?
>>> > Thanks.
>>> >
>>> > 2018-03-07
>>> >
>>> > 
>>> > shadowlin
>>> >
>>> >
>>> > 
>>> >
>>> > *发件人:*Mike Christie <mchri...@redhat.com>
>>> > *发送时间:*2018-03-07 03:51
>>> > *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
>>> > Exclusive Lock
>>> > *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph
>>> > Users"<ceph-users@lists.ceph.com>
>>> > *抄送:*
>>> >
>>> > On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
>>> > > Hi,
>>> > >
>>> > > I want to do load balanced multipathing (multiple iSCSI
>>> > gateway/exporter
>>> > > nodes) of iSCSI backed with RBD images. Should I disable
>>> > exclusive
>>> > lock
>>> > > feature? What if I don't disable that feature? I'm using TGT
>>> > (manual
>>> > > way) since I get so many CPU stuck error messages when I was
>>> > using
>>> > LIO.
>>> > >
>>> >
>>> > You are using LIO/TGT with krbd right?
>>> >
>>> > You cannot or shouldn't do active/active multipathing. If you have
>>> > the
>>> > lock enabled then it bounces between paths for each IO and will be
>>> > slow.
>>> > If you do not have it enabled then you can end up with stale IO
>>> > overwriting current data.
>>> >
>>> >
>>> >
>>> >
>>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>
> --
> Jason
>
>
>



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-08 Thread Lazuardi Nasution
Jason,

As long as you don't activate any cache and use a single image for a single client
only, it seems impossible to have old data overwrite new data. Maybe it is related
to the I/O pattern too. Anyway, other Ceph users may have different
experience; the result can differ from case to case.

Best regards,


On Mar 9, 2018 12:35 AM, "Jason Dillaman" <jdill...@redhat.com> wrote:

On Thu, Mar 8, 2018 at 11:59 AM, Lazuardi Nasution
<mrxlazuar...@gmail.com> wrote:
> Hi Mike,
>
> Since I have moved from LIO to TGT, I can do full ALUA (active/active) of
> multiple gateways. Of course I have to disable any write back cache at any
> level (RBD cache and TGT cache). It seem to be safe to disable exclusive
> lock since each RBD image is accessed only by single client and as long
as I
> know mostly ALUA use RR of I/O path.

How do you figure that's safe for preventing an overwrite with old
data in an active/active path hiccup?

> Best regards,
>
> On Mar 8, 2018 11:54 PM, "Mike Christie" <mchri...@redhat.com> wrote:
>>
>> On 03/07/2018 09:24 AM, shadow_lin wrote:
>> > Hi Christie,
>> > Is it safe to use active/passive multipath with krbd with exclusive
lock
>> > for lio/tgt/scst/tcmu?
>>
>> No. We tried to use lio and krbd initially, but there is a issue where
>> IO might get stuck in the target/block layer and get executed after new
>> IO. So for lio, tgt and tcmu it is not safe as is right now. We could
>> add some code tcmu's file_example handler which can be used with krbd so
>> it works like the rbd one.
>>
>> I do know enough about SCST right now.
>>
>>
>> > Is it safe to use active/active multipath If use suse kernel with
>> > target_core_rbd?
>> > Thanks.
>> >
>> > 2018-03-07
>> > 
--------
>> > shadowlin
>> >
>> > ----

>> >
>> > *发件人:*Mike Christie <mchri...@redhat.com>
>> > *发送时间:*2018-03-07 03:51
>> > *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
>> > Exclusive Lock
>> > *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph
>> > Users"<ceph-users@lists.ceph.com>
>> > *抄送:*
>> >
>> > On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
>> > > Hi,
>> > >
>> > > I want to do load balanced multipathing (multiple iSCSI
>> > gateway/exporter
>> > > nodes) of iSCSI backed with RBD images. Should I disable
exclusive
>> > lock
>> > > feature? What if I don't disable that feature? I'm using TGT
>> > (manual
>> > > way) since I get so many CPU stuck error messages when I was
using
>> > LIO.
>> > >
>> >
>> > You are using LIO/TGT with krbd right?
>> >
>> > You cannot or shouldn't do active/active multipathing. If you have
>> > the
>> > lock enabled then it bounces between paths for each IO and will be
>> > slow.
>> > If you do not have it enabled then you can end up with stale IO
>> > overwriting current data.
>> >
>> >
>> >
>> >
>>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



--
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-08 Thread Jason Dillaman
On Thu, Mar 8, 2018 at 11:59 AM, Lazuardi Nasution
<mrxlazuar...@gmail.com> wrote:
> Hi Mike,
>
> Since I have moved from LIO to TGT, I can do full ALUA (active/active) of
> multiple gateways. Of course I have to disable any write back cache at any
> level (RBD cache and TGT cache). It seem to be safe to disable exclusive
> lock since each RBD image is accessed only by single client and as long as I
> know mostly ALUA use RR of I/O path.

How do you figure that's safe for preventing an overwrite with old
data in an active/active path hiccup?

> Best regards,
>
> On Mar 8, 2018 11:54 PM, "Mike Christie" <mchri...@redhat.com> wrote:
>>
>> On 03/07/2018 09:24 AM, shadow_lin wrote:
>> > Hi Christie,
>> > Is it safe to use active/passive multipath with krbd with exclusive lock
>> > for lio/tgt/scst/tcmu?
>>
>> No. We tried to use lio and krbd initially, but there is a issue where
>> IO might get stuck in the target/block layer and get executed after new
>> IO. So for lio, tgt and tcmu it is not safe as is right now. We could
>> add some code tcmu's file_example handler which can be used with krbd so
>> it works like the rbd one.
>>
>> I do know enough about SCST right now.
>>
>>
>> > Is it safe to use active/active multipath If use suse kernel with
>> > target_core_rbd?
>> > Thanks.
>> >
>> > 2018-03-07
>> > 
>> > shadowlin
>> >
>> > ----
>> >
>> > *发件人:*Mike Christie <mchri...@redhat.com>
>> > *发送时间:*2018-03-07 03:51
>> > *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
>> > Exclusive Lock
>> > *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph
>> > Users"<ceph-users@lists.ceph.com>
>> > *抄送:*
>> >
>> > On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
>> > > Hi,
>> > >
>> > > I want to do load balanced multipathing (multiple iSCSI
>> > gateway/exporter
>> > > nodes) of iSCSI backed with RBD images. Should I disable exclusive
>> > lock
>> > > feature? What if I don't disable that feature? I'm using TGT
>> > (manual
>> > > way) since I get so many CPU stuck error messages when I was using
>> > LIO.
>> > >
>> >
>> > You are using LIO/TGT with krbd right?
>> >
>> > You cannot or shouldn't do active/active multipathing. If you have
>> > the
>> > lock enabled then it bounces between paths for each IO and will be
>> > slow.
>> > If you do not have it enabled then you can end up with stale IO
>> > overwriting current data.
>> >
>> >
>> >
>> >
>>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-08 Thread Lazuardi Nasution
Hi Mike,

Since I have moved from LIO to TGT, I can do full ALUA (active/active) across
multiple gateways. Of course I have to disable any write-back cache at any
level (RBD cache and TGT cache). It seems to be safe to disable the exclusive
lock since each RBD image is accessed by only a single client and, as far as
I know, ALUA mostly uses round-robin across the I/O paths.
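For what it's worth, a sketch of what that configuration amounts to -- the image
name is made up, and note the replies elsewhere in this thread on why this still
does not protect against a delayed write from a path that dropped out:

    # exclusive-lock can only be disabled after the features that depend on it
    rbd feature disable rbd/disk1 fast-diff       # if enabled
    rbd feature disable rbd/disk1 object-map      # if enabled
    rbd feature disable rbd/disk1 exclusive-lock
    rbd info rbd/disk1 | grep features

    # and on the gateway, in ceph.conf, for the librbd/TGT case:
    [client]
        rbd cache = false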

Best regards,

On Mar 8, 2018 11:54 PM, "Mike Christie" <mchri...@redhat.com> wrote:

> On 03/07/2018 09:24 AM, shadow_lin wrote:
> > Hi Christie,
> > Is it safe to use active/passive multipath with krbd with exclusive lock
> > for lio/tgt/scst/tcmu?
>
> No. We tried to use lio and krbd initially, but there is a issue where
> IO might get stuck in the target/block layer and get executed after new
> IO. So for lio, tgt and tcmu it is not safe as is right now. We could
> add some code tcmu's file_example handler which can be used with krbd so
> it works like the rbd one.
>
> I do know enough about SCST right now.
>
>
> > Is it safe to use active/active multipath If use suse kernel with
> > target_core_rbd?
> > Thanks.
> >
> > 2018-03-07
> > 
> > shadowlin
> >
> > ----
> >
> > *发件人:*Mike Christie <mchri...@redhat.com>
> > *发送时间:*2018-03-07 03:51
> > *主题:*Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
> > Exclusive Lock
> > *收件人:*"Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph
> > Users"<ceph-users@lists.ceph.com>
> > *抄送:*
> >
> > On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
> > > Hi,
> > >
> > > I want to do load balanced multipathing (multiple iSCSI
> gateway/exporter
> > > nodes) of iSCSI backed with RBD images. Should I disable exclusive
> lock
> > > feature? What if I don't disable that feature? I'm using TGT
> (manual
> > > way) since I get so many CPU stuck error messages when I was using
> LIO.
> > >
> >
> > You are using LIO/TGT with krbd right?
> >
> > You cannot or shouldn't do active/active multipathing. If you have
> the
> > lock enabled then it bounces between paths for each IO and will be
> slow.
> > If you do not have it enabled then you can end up with stale IO
> > overwriting current data.
> >
> >
> >
> >
>
>


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-08 Thread Mike Christie
On 03/07/2018 09:24 AM, shadow_lin wrote:
> Hi Christie,
> Is it safe to use active/passive multipath with krbd with exclusive lock
> for lio/tgt/scst/tcmu?

No. We tried to use lio and krbd initially, but there is an issue where
IO might get stuck in the target/block layer and get executed after new
IO. So for lio, tgt and tcmu it is not safe as is right now. We could
add some code to tcmu's file_example handler, which can be used with krbd,
so it works like the rbd one.

I do not know enough about SCST right now.


> Is it safe to use active/active multipath if using the SUSE kernel with
> target_core_rbd?
> Thanks.
>  
> 2018-03-07
> 
> shadowlin
>  
> 
> 
> *From:* Mike Christie <mchri...@redhat.com>
> *Sent:* 2018-03-07 03:51
> *Subject:* Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
> Exclusive Lock
> *To:* "Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph
> Users"<ceph-users@lists.ceph.com>
> *Cc:*
>  
> On 03/06/2018 01:17 PM, Lazuardi Nasution wrote: 
> > Hi, 
> >  
> > I want to do load balanced multipathing (multiple iSCSI 
> gateway/exporter 
> > nodes) of iSCSI backed with RBD images. Should I disable exclusive lock 
> > feature? What if I don't disable that feature? I'm using TGT (manual 
> > way) since I get so many CPU stuck error messages when I was using LIO. 
> >  
>  
> You are using LIO/TGT with krbd right? 
>  
> You cannot or shouldn't do active/active multipathing. If you have the 
> lock enabled then it bounces between paths for each IO and will be slow. 
> If you do not have it enabled then you can end up with stale IO 
> overwriting current data. 
>  
>  
>  
> 



Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-07 Thread shadow_lin
Hi David,
Thanks for the info.
Could I assume that, if active/passive multipath is used with the RBD
exclusive lock, then all targets that support RBD (via a block device) are
safe?
2018-03-08 

shadow_lin 



From: David Disseldorp <dd...@suse.de>
Sent: 2018-03-08 08:47
Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
To: "shadow_lin"<shadow_...@163.com>
Cc: "Mike Christie"<mchri...@redhat.com>,"Lazuardi 
Nasution"<mrxlazuar...@gmail.com>,"Ceph Users"<ceph-users@lists.ceph.com>

Hi shadowlin, 

On Wed, 7 Mar 2018 23:24:42 +0800, shadow_lin wrote: 

> Is it safe to use active/active multipath if using the SUSE kernel with 
> target_core_rbd? 
> Thanks. 

A cross-gateway failover race-condition similar to what Mike described 
is currently possible with active/active target_core_rbd. It's a corner 
case that is dependent on a client assuming that unacknowledged I/O has 
been implicitly terminated and can be resumed via an alternate path, 
while the original gateway at the same time issues the original request 
such that it reaches the Ceph cluster after differing I/O to the same 
region via the alternate path. 
It's not something that we've observed in the wild, but is nevertheless 
a bug that is being worked on, with a resolution that should also be 
usable for active/active tcmu-runner. 

Cheers, David


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-07 Thread David Disseldorp
Hi shadowlin,

On Wed, 7 Mar 2018 23:24:42 +0800, shadow_lin wrote:

> Is it safe to use active/active multipath if using the SUSE kernel with 
> target_core_rbd?
> Thanks.

A cross-gateway failover race-condition similar to what Mike described
is currently possible with active/active target_core_rbd. It's a corner
case that is dependent on a client assuming that unacknowledged I/O has
been implicitly terminated and can be resumed via an alternate path,
while the original gateway at the same time issues the original request
such that it reaches the Ceph cluster after differing I/O to the same
region via the alternate path.
It's not something that we've observed in the wild, but is nevertheless
a bug that is being worked on, with a resolution that should also be
usable for active/active tcmu-runner.
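
To illustrate the kind of fencing that closes that window, here is a manual
sketch (not a description of the fix under development; the client address
and TTL below are made up): a surviving gateway can blacklist the failed
gateway's RADOS client before resuming I/O, so that any requests it still
has in flight are rejected by the OSDs.

    # Fence the old gateway's librbd/krbd client (entity address + TTL in
    # seconds):
    ceph osd blacklist add 192.168.122.11:0/3710147553 3600

    # Inspect the blacklist, and drop the entry once the old gateway has
    # been cleaned up:
    ceph osd blacklist ls
    ceph osd blacklist rm 192.168.122.11:0/3710147553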

Cheers, David


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-07 Thread shadow_lin
Hi Christie,
Is it safe to use active/passive multipath with krbd with exclusive lock for 
lio/tgt/scst/tcmu?
Is it safe to use active/active multipath if using the SUSE kernel with 
target_core_rbd?
Thanks.

2018-03-07 


shadowlin




From: Mike Christie <mchri...@redhat.com>
Sent: 2018-03-07 03:51
Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
To: "Lazuardi Nasution"<mrxlazuar...@gmail.com>,"Ceph 
Users"<ceph-users@lists.ceph.com>
Cc:

On 03/06/2018 01:17 PM, Lazuardi Nasution wrote: 
> Hi, 
>  
> I want to do load balanced multipathing (multiple iSCSI gateway/exporter 
> nodes) of iSCSI backed with RBD images. Should I disable exclusive lock 
> feature? What if I don't disable that feature? I'm using TGT (manual 
> way) since I get so many CPU stuck error messages when I was using LIO. 
>  

You are using LIO/TGT with krbd right? 

You cannot or shouldn't do active/active multipathing. If you have the 
lock enabled then it bounces between paths for each IO and will be slow. 
If you do not have it enabled then you can end up with stale IO 
> overwriting current data.


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-06 Thread Mike Christie
On 03/06/2018 01:17 PM, Lazuardi Nasution wrote:
> Hi,
> 
> I want to do load balanced multipathing (multiple iSCSI gateway/exporter
> nodes) of iSCSI backed with RBD images. Should I disable exclusive lock
> feature? What if I don't disable that feature? I'm using TGT (manual
> way) since I get so many CPU stuck error messages when I was using LIO.
> 

You are using LIO/TGT with krbd right?

You cannot or shouldn't do active/active multipathing. If you have the
lock enabled then it bounces between paths for each IO and will be slow.
If you do not have it enabled then you can end up with stale IO
overwriting current data.
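
If you do run multipath on the initiators, the path policy at least needs
to be active/passive so that only one path ever carries I/O. A minimal
dm-multipath sketch, assuming a Linux initiator with device-mapper-multipath
(the vendor/product strings are illustrative and must match what your
target actually reports):

    # /etc/multipath.conf - keep a single active path, fail over only when
    # it goes away
    devices {
        device {
            vendor  "LIO-ORG"                  # example; check "multipath -ll"
            product ".*"
            path_grouping_policy "failover"    # one path per priority group
            path_selector "queue-length 0"
            path_checker "tur"
            hardware_handler "1 alua"
            prio "alua"
            failback 60
            no_path_retry 12
        }
    }

With that in place, "multipath -ll" should show one path group in the
active state and the remaining paths in standby (enabled) groups.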



[ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-06 Thread Lazuardi Nasution
Hi,

I want to do load balanced multipathing (multiple iSCSI gateway/exporter
nodes) of iSCSI backed with RBD images. Should I disable exclusive lock
feature? What if I don't disable that feature? I'm using TGT (manual way)
since I get so many CPU stuck error messages when I was using LIO.
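
For context, the feature I am asking about can be inspected and toggled per
image with the rbd CLI (the pool and image names below are only examples):

    # Show which features the image has (look at the "features:" line)
    rbd info rbd/test-image

    # object-map, fast-diff and journaling depend on exclusive-lock, so
    # they would have to be disabled first if they are enabled
    rbd feature disable rbd/test-image exclusive-lock
    rbd feature enable  rbd/test-image exclusive-lock

    # Show the clients currently watching (using) the image
    rbd status rbd/test-image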

Best regards,