On 03/14/2018 01:27 PM, Michael Christie wrote:
> On 03/14/2018 01:24 PM, Maxim Patlasov wrote:
>> On Wed, Mar 14, 2018 at 11:13 AM, Jason Dillaman <jdill...@redhat.com
>> <mailto:jdill...@redhat.com>> wrote:
>>
>>     Maxim, can you provide steps for a reproducer?
>>
>>
>> Yes, but it involves adding two artificial delays: one in tcmu-runner
>> and another in kernel iscsi. If you're willing to take pains of
> 
> Send the patches for the changes.
> 
>> recompiling kernel and tcmu-runner on one of gateway nodes, I'll help to
>> reproduce.
>>
>> Generally, the idea of reproducer is simple: let's model a situation
>> when two stale requests got stuck in kernel mailbox waiting to be
>> consumed by tcmu-runner, and another one got stuck in iscsi layer --
>> immediately after reading iscsi request from the socket. If we unblock
>> tcmu-runner after newer data went through another gateway, the first
>> stale request will switch tcmu-runner state from LOCKED to UNLOCKED
>> state, then the second stale request will trigger alua_thread to
>> re-acquire the lock, so when the third request comes to tcmu-runner, the
When you send the patches that add your delays, could you also send the
target side /var/log/tcmu-runner.log with log_level = 4?

For the test above, you should see the second request being sent to
rbd's tcmu_rbd_aio_write function. That command should fail in
rbd_finish_aio_generic, and tcmu_rbd_handle_blacklisted_cmd will be
called. We should then block in tgt_port_grp_recovery_thread_fn until
the IO in that iscsi connection is flushed. That function will not
return from the enable=0 until the iscsi connection is stopped and the
commands in it have completed.
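
To make the "block until the connection is flushed" part concrete, here
is a toy Python model of that kind of drain gate. It is only a sketch of
the idea, not tcmu-runner or kernel code: ConnectionGate, start_cmd,
finish_cmd and disable_and_drain are made-up names, and the real logic
lives in C.

```python
import threading


class ConnectionGate:
    """Toy stand-in for a connection whose commands must drain (hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._enabled = True

    def start_cmd(self):
        """Admit a command only while the connection is still enabled."""
        with self._cond:
            if not self._enabled:
                return False
            self._in_flight += 1
            return True

    def finish_cmd(self):
        """Mark one admitted command as completed and wake any waiter."""
        with self._cond:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._cond.notify_all()

    def disable_and_drain(self, timeout=None):
        """Refuse new commands, then wait until the admitted ones finish.

        Returns True once nothing is in flight, False if the timeout
        expired with commands still outstanding.
        """
        with self._cond:
            self._enabled = False
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout)
```

The point of the model is the ordering: new commands are refused first,
and the caller does not get control back until the ones that were
already admitted have completed.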

Other commands you had in flight should eventually hit
tcmur_cmd_handler's tcmu_dev_in_recovery check and be failed there. If
they had already passed that check, they would be sent to
tcmu_rbd_aio_write and should get the blacklisted error like above.
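
The two places where a stale command is expected to be stopped can be
modelled roughly like this. Again, this is a simplified Python sketch
under my own assumptions and not the actual tcmu-runner code: ToyDevice,
Result, handle_cmd and backend_write are hypothetical names that only
mirror the order of the checks described here.

```python
from enum import Enum


class Result(Enum):
    OK = "ok"
    RETRY = "retry"  # failed back to the initiator as retryable


class ToyDevice:
    """Hypothetical model of a device with a recovery flag and a backend."""

    def __init__(self):
        self.in_recovery = False
        self.blacklisted = False
        self.written = []

    def backend_write(self, data):
        # Stand-in for the OSD rejecting IO from a blacklisted client.
        if self.blacklisted:
            return False
        self.written.append(data)
        return True

    def handle_cmd(self, data):
        # Check 1: commands arriving during recovery are failed up front.
        if self.in_recovery:
            return Result.RETRY
        # Check 2: the command reached the backend, which may reject it.
        if not self.backend_write(data):
            self.in_recovery = True  # the first rejection starts recovery
            return Result.RETRY
        return Result.OK
```

In this model a stale write never lands in `written`: it is either
failed by the recovery check or rejected by the backend.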


>> lock is already reacquired and it goes to OSD smoothly overwriting newer
>> data.
>>
>>  
>>
>>
>>     On Wed, Mar 14, 2018 at 2:06 PM, Maxim Patlasov
>>     <mpatla...@skytap.com <mailto:mpatla...@skytap.com>> wrote:
>>     > On Sun, Mar 11, 2018 at 5:10 PM, Mike Christie
>>     <mchri...@redhat.com <mailto:mchri...@redhat.com>> wrote:
>>     >>
>>     >> On 03/11/2018 08:54 AM, shadow_lin wrote:
>>     >> > Hi Jason,
>>     >> > How the old target gateway is blacklisted? Is it a feature of
>>     the target
>>     >> > gateway(which can support active/passive multipath) should
>>     provide or is
>>     >> > it only by rbd exclusive lock?
>>     >> > I think exclusive lock only lets one client write to rbd at
>>     the same
>>     >> > time,but another client can obtain the lock later when the lock is
>>     >> > released.
>>     >>
>>     >> For the case where we had the lock and it got taken:
>>     >>
>>     >> If IO was blocked, then unjammed and it has already passed the target
>>     >> level checks then the IO will be failed by the OSD due to the
>>     >> blacklisting. When we get IO errors from ceph indicating we are
>>     >> blacklisted the tcmu rbd layer will fail the IO indicating the state
>>     >> change and that the IO can be retried. We will also tell the target
>>     >> layer rbd does not have the lock anymore and to just stop the iscsi
>>     >> connection while we clean up the blacklisting, running commands and
>>     >> update our state.
>>     >
>>     >
>>     > Mike, can you please give more details on how you tell the target
>>     layer rbd
>>     > does not have the lock and to stop iscsi connection. Which
>>     > tcmu-runner/kernel-target functions are used for that?
>>     >
>>     > In fact, I performed an experiment with three stale write requests
>>     stuck on
>>     > blacklisted gateway, and one of them managed to overwrite newer
>>     data. I
>>     > followed all instructions from
>>     >
>>     http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/ 
>> <http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/>
>>     and
>>     > http://docs.ceph.com/docs/master/rbd/iscsi-target-cli/
>>     <http://docs.ceph.com/docs/master/rbd/iscsi-target-cli/>, so I'm
>>     interested
>>     > what I'm missing...
>>     >
>>     > Thanks,
>>     > Maxim
>>     >
>>     >>
>>     >>
>>     >
>>
>>
>>
>>     --
>>     Jason
>>
>>
> 

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com