Re: [ceph-users] Slow requests troubleshooting in Luminous - details missing

2018-03-11 Thread Alex Gorbachev
On Mon, Mar 5, 2018 at 11:20 PM, Brad Hubbard  wrote:
> On Fri, Mar 2, 2018 at 3:54 PM, Alex Gorbachev  
> wrote:
>> On Thu, Mar 1, 2018 at 10:57 PM, David Turner  wrote:
>>> Blocked requests and slow requests are synonyms in ceph. They are 2 names
>>> for the exact same thing.
>>>
>>>
>>> On Thu, Mar 1, 2018, 10:21 PM Alex Gorbachev  
>>> wrote:

 On Thu, Mar 1, 2018 at 2:47 PM, David Turner 
 wrote:
 > `ceph health detail` should show you more information about the slow
 > requests.  If the output is too much stuff, you can grep out for blocked
 > or
 > something.  It should tell you which OSDs are involved, how long they've
 > been slow, etc.  The default is for them to show '> 32 sec' but that may
 > very well be much longer and `ceph health detail` will show that.
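
The grep approach above can also be done structurally: `ceph health detail` accepts `--format json`, which groups failed checks by code. A minimal sketch that pulls the REQUEST_SLOW check out of a sample payload (the JSON shape below is an assumption based on Luminous and may differ slightly on other releases; the values are illustrative, not real cluster output):

```python
import json

# Illustrative sample of `ceph health detail --format json` output on
# Luminous; the exact field layout is an assumption from memory.
sample = json.loads("""
{
  "status": "HEALTH_WARN",
  "checks": {
    "REQUEST_SLOW": {
      "severity": "HEALTH_WARN",
      "summary": {"message": "73 slow requests are blocked > 32 sec"},
      "detail": [
        {"message": "73 ops are blocked > 32.768 sec"},
        {"message": "osds 3,12 have blocked requests > 32.768 sec"}
      ]
    }
  }
}
""")

# Pull out only the REQUEST_SLOW check and its per-OSD detail lines.
check = sample["checks"].get("REQUEST_SLOW")
if check:
    print(check["summary"]["message"])
    for d in check["detail"]:
        print(" ", d["message"])
```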

 Hi David,

 Thank you for the reply.  Unfortunately, the health detail only shows
 blocked requests.  This seems to be related to a compression setting
 on the pool, nothing in OSD logs.

 I replied to another compression thread.  This makes sense since
 compression is new, and in the past all such issues were reflected in
 OSD logs and related to either network or OSD hardware.

 Regards,
 Alex

 >
 > On Thu, Mar 1, 2018 at 2:23 PM Alex Gorbachev 
 > wrote:
 >>
 >> Is there a switch to turn on the display of specific OSD issues?  Or
>> does the below indicate a generic problem, e.g. network, and not any
>> specific OSD?
 >>
 >> 2018-02-28 18:09:36.438300 7f6dead56700  0
 >> mon.roc-vm-sc3c234@0(leader).data_health(46) update_stats avail 56%
 >> total 15997 MB, used 6154 MB, avail 9008 MB
 >> 2018-02-28 18:09:41.477216 7f6dead56700  0 log_channel(cluster) log
 >> [WRN] : Health check failed: 73 slow requests are blocked > 32 sec
 >> (REQUEST_SLOW)
 >> 2018-02-28 18:09:47.552669 7f6dead56700  0 log_channel(cluster) log
 >> [WRN] : Health check update: 74 slow requests are blocked > 32 sec
 >> (REQUEST_SLOW)
 >> 2018-02-28 18:09:53.794882 7f6de8551700  0
 >> mon.roc-vm-sc3c234@0(leader) e1 handle_command mon_command({"prefix":
 >> "status", "format": "json"} v 0) v1
 >>
 >> --
>>
>> I was wrong about the pool compression being a factor: an uncompressed
>> pool also generates these slow messages.
>>
>> The question is why there is no subsequent message pointing at specific
>> OSDs, as in Jewel and prior; e.g. this example from RH:
>>
>> 2015-08-24 13:18:10.024659 osd.1 127.0.0.1:6812/3032 9 : cluster [WRN]
>> 6 slow requests, 6 included below; oldest blocked for > 61.758455 secs
>>
>> 2016-07-25 03:44:06.510583 osd.50 [WRN] slow request 30.005692 seconds
>> old, received at {date-time}: osd_op(client.4240.0:8
>> benchmark_data_ceph-1_39426_object7 [write 0~4194304] 0.69848840) v4
>> currently waiting for subops from [610]
>>
>> In comparison, my Luminous cluster only shows the general slow/blocked 
>> message:
>>
>> 2018-03-01 21:52:54.237270 7f7e419e3700  0 log_channel(cluster) log
>> [WRN] : Health check failed: 116 slow requests are blocked > 32 sec
>> (REQUEST_SLOW)
>> 2018-03-01 21:53:00.282721 7f7e419e3700  0 log_channel(cluster) log
>> [WRN] : Health check update: 66 slow requests are blocked > 32 sec
>> (REQUEST_SLOW)
>> 2018-03-01 21:53:08.534244 7f7e419e3700  0 log_channel(cluster) log
>> [WRN] : Health check update: 5 slow requests are blocked > 32 sec
>> (REQUEST_SLOW)
>> 2018-03-01 21:53:10.382510 7f7e419e3700  0 log_channel(cluster) log
>> [INF] : Health check cleared: REQUEST_SLOW (was: 5 slow requests are
>> blocked > 32 sec)
>> 2018-03-01 21:53:10.382546 7f7e419e3700  0 log_channel(cluster) log
>> [INF] : Cluster is now healthy
>>
>> So where are the details?
>
> Working on this, thanks.
>
> See https://tracker.ceph.com/issues/23236

As in the tracker, but I think it would be useful to others:

If something like the below comes up, how do you troubleshoot the cause of
past events, especially if it is specific to just a handful out of 1000s
of OSDs on many hosts?

2018-03-11 22:00:00.000132 mon.roc-vm-sc3c234 [INF] overall HEALTH_OK
2018-03-11 22:44:46.173825 mon.roc-vm-sc3c234 [WRN] Health check
failed: 12 slow requests are blocked > 32 sec (REQUEST_SLOW)
2018-03-11 22:44:52.245738 mon.roc-vm-sc3c234 [WRN] Health check
update: 9 slow requests are blocked > 32 sec (REQUEST_SLOW)
2018-03-11 22:44:57.925686 mon.roc-vm-sc3c234 [WRN] Health check
update: 10 slow requests are blocked > 32 sec (REQUEST_SLOW)
2018-03-11 22:45:02.926031 mon.roc-vm-sc3c234 [WRN] Health check
update: 14 slow requests are blocked > 32 sec (REQUEST_SLOW)
2018-03-11 22:45:06.413741 mon.roc-vm-sc3c234 [INF] Health check
cleared: REQUEST_SLOW (was: 14 slow requests are blocked > 32 sec)
2018-03-11 22:45:06.413814 mon.roc-vm-sc3c234 [INF] Cluster is now healthy
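
For digging into past events like these, the OSD admin socket keeps a ring buffer of recent slow ops: `ceph daemon osd.<id> dump_historic_ops` still shows them after the health check clears. A hedged sketch that ranks ops from a sample dump (field names are assumed from Luminous and may vary by release; the embedded JSON is illustrative, not real cluster output):

```python
import json

# Illustrative sample of `ceph daemon osd.N dump_historic_ops` output;
# the field names are an assumption based on Luminous.
dump = json.loads("""
{
  "size": 20,
  "duration": 600,
  "ops": [
    {"description": "osd_op(client.4240.0:8 ... [write 0~4194304])",
     "initiated_at": "2018-03-11 22:44:40.000000",
     "duration": 33.512},
    {"description": "osd_op(client.4240.0:9 ... [write 0~4194304])",
     "initiated_at": "2018-03-11 22:44:41.000000",
     "duration": 5.104}
  ]
}
""")

# Sort the retained ops by duration so the worst offenders surface first.
worst = sorted(dump["ops"], key=lambda op: op["duration"], reverse=True)
for op in worst:
    print(f"{op['duration']:8.3f}s  {op['initiated_at']}  {op['description']}")
```

Run against each suspect OSD's admin socket, this narrows "74 slow requests" down to concrete clients, objects, and timestamps.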

Re: [ceph-users] XFS Metadata corruption while activating OSD

2018-03-11 Thread Christian Wuerdig
Hm, so you're running OSD nodes with 2GB of RAM and 2x10TB = 20TB of
storage? Literally everything posted on this list in relation to HW
requirements and related problems will tell you that this simply isn't
going to work. The slightest hint of a problem will simply kill the OSD
nodes with OOM. Have you tried smaller disks, like 1TB models (or even
smaller, like 256GB SSDs), to see if the same problem persists?
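
Christian's point can be put in rough numbers using the common community rule of thumb of about 1 GB of RAM per 1 TB of OSD storage (a guideline from the filestore era, not a hard spec):

```python
# Back-of-the-envelope RAM check for the OSD node described in this thread.
# The ~1 GB RAM per 1 TB of storage ratio is a rough community guideline.
GB_PER_TB = 1.0

osd_disks_tb = [10, 10]   # 2 x 10 TB disks per node
installed_ram_gb = 2

recommended_gb = sum(osd_disks_tb) * GB_PER_TB
shortfall_gb = recommended_gb - installed_ram_gb

print(f"recommended: {recommended_gb:.0f} GB, installed: {installed_ram_gb} GB, "
      f"shortfall: {shortfall_gb:.0f} GB")
```

A 10x shortfall like this makes OOM kills during recovery or backfill almost inevitable.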


On Tue, 6 Mar 2018 at 10:51, 赵赵贺东  wrote:

> Hello ceph-users,
>
> It is a really, really *really* tough problem for our team.
> We have investigated the problem for a long time and tried many things, but
> we can't solve it; even the root cause of the problem is still unclear to us!
> So any solution/suggestion/opinion whatsoever will be highly appreciated!!!
>
> Problem Summary:
> When we activate an OSD, there is metadata corruption on the disk being
> activated; the probability is 100%!
>
> Admin Nodes node:
> Platform: X86
> OS: Ubuntu 16.04
> Kernel: 4.12.0
> Ceph: Luminous 12.2.2
>
> OSD nodes:
> Platform: armv7
> OS:   Ubuntu 14.04
> Kernel:   4.4.39
> Ceph: Luminous 12.2.2
> Disk: 10T+10T
> Memory: 2GB
>
> Deploy log:
>
>
> dmesg log: (Sorry, the arms001-01 dmesg log has been lost, but the error
> messages about metadata corruption on arms003-10 are the same as on
> arms001-01)
> Mar  5 11:08:49 arms003-10 kernel: [  252.534232] XFS (sda1): Unmount and
> run xfs_repair
> Mar  5 11:08:49 arms003-10 kernel: [  252.539100] XFS (sda1): First 64
> bytes of corrupted metadata buffer:
> Mar  5 11:08:49 arms003-10 kernel: [  252.545504] eb82f000: 58 46 53 42 00
> 00 10 00 00 00 00 00 91 73 fe fb  XFSB.s..
> Mar  5 11:08:49 arms003-10 kernel: [  252.553569] eb82f010: 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00  
> Mar  5 11:08:49 arms003-10 kernel: [  252.561624] eb82f020: fc 4e e3 89 50
> 8f 42 aa be bc 07 0c 6e fa 83 2f  .N..P.B.n../
> Mar  5 11:08:49 arms003-10 kernel: [  252.569706] eb82f030: 00 00 00 00 80
> 00 00 07 ff ff ff ff ff ff ff ff  
> Mar  5 11:08:49 arms003-10 kernel: [  252.58] XFS (sda1): metadata I/O
> error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
> Mar  5 11:08:49 arms003-10 kernel: [  252.602944] XFS (sda1): Metadata
> corruption detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data
> block 0x48b9ff80
> Mar  5 11:08:49 arms003-10 kernel: [  252.614170] XFS (sda1): Unmount and
> run xfs_repair
> Mar  5 11:08:49 arms003-10 kernel: [  252.619030] XFS (sda1): First 64
> bytes of corrupted metadata buffer:
> Mar  5 11:08:49 arms003-10 kernel: [  252.625403] eb901000: 58 46 53 42 00
> 00 10 00 00 00 00 00 91 73 fe fb  XFSB.s..
> Mar  5 11:08:49 arms003-10 kernel: [  252.633441] eb901010: 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00  
> Mar  5 11:08:49 arms003-10 kernel: [  252.641474] eb901020: fc 4e e3 89 50
> 8f 42 aa be bc 07 0c 6e fa 83 2f  .N..P.B.n../
> Mar  5 11:08:49 arms003-10 kernel: [  252.649519] eb901030: 00 00 00 00 80
> 00 00 07 ff ff ff ff ff ff ff ff  
> Mar  5 11:08:49 arms003-10 kernel: [  252.657554] XFS (sda1): metadata I/O
> error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
> Mar  5 11:08:49 arms003-10 kernel: [  252.675056] XFS (sda1): Metadata
> corruption detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data
> block 0x48b9ff80
> Mar  5 11:08:49 arms003-10 kernel: [  252.686228] XFS (sda1): Unmount and
> run xfs_repair
> Mar  5 11:08:49 arms003-10 kernel: [  252.691054] XFS (sda1): First 64
> bytes of corrupted metadata buffer:
> Mar  5 11:08:49 arms003-10 kernel: [  252.697425] eb901000: 58 46 53 42 00
> 00 10 00 00 00 00 00 91 73 fe fb  XFSB.s..
> Mar  5 11:08:49 arms003-10 kernel: [  252.705459] eb901010: 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00  
> Mar  5 11:08:49 arms003-10 kernel: [  252.713489] eb901020: fc 4e e3 89 50
> 8f 42 aa be bc 07 0c 6e fa 83 2f  .N..P.B.n../
> Mar  5 11:08:49 arms003-10 kernel: [  252.721520] eb901030: 00 00 00 00 80
> 00 00 07 ff ff ff ff ff ff ff ff  
> Mar  5 11:08:49 arms003-10 kernel: [  252.729558] XFS (sda1): metadata I/O
> error: block 0x48b9ff80 ("xfs_trans_read_buf_map") error 117 numblks 8
> Mar  5 11:08:49 arms003-10 kernel: [  252.741953] XFS (sda1): Metadata
> corruption detected at xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data
> block 0x48b9ff80
> Mar  5 11:08:49 arms003-10 kernel: [  252.753139] XFS (sda1): Unmount and
> run xfs_repair
> Mar  5 11:08:49 arms003-10 kernel: [  252.757955] XFS (sda1): First 64
> bytes of corrupted metadata buffer:
> Mar  5 11:08:49 arms003-10 kernel: [  252.764336] eb901000: 58 46 53 42 00
> 00 10 00 00 00 00 00 91 73 fe fb  XFSB.s..
> Mar  5 11:08:49 arms003-10 kernel: [  252.772365] eb901010: 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00  
> Mar  5 11:08:49 arms003-10 

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-11 Thread Mike Christie
On 03/11/2018 08:54 AM, shadow_lin wrote:
> Hi Jason,
> How is the old target gateway blacklisted? Is it a feature that the target
> gateway (which can support active/passive multipath) should provide, or is
> it done only by the rbd exclusive lock?
> I think the exclusive lock only lets one client write to the rbd at a time,
> but another client can obtain the lock later, when the lock is released.

For the case where we had the lock and it got taken:

If IO was blocked, then unjammed, and it had already passed the
target-level checks, then the IO will be failed by the OSD due to the
blacklisting. When we get IO errors from Ceph indicating we are
blacklisted, the tcmu rbd layer will fail the IO, indicating the state
change and that the IO can be retried. We will also tell the target
layer that rbd no longer has the lock and to just stop the iSCSI
connection while we clean up the blacklisting and running commands and
update our state.

The case where the initiator switched on us while we were grabbing the
lock is similar:

After we grab the lock and before we start sending IO to the rbd/ceph
layers, we will have flushed the IO in various queues, similar to the above
but a little less invasively, and tested the iscsi connection to make sure
it is not stuck on the network. If the path is still the good one, then the
initiator will retry the IOs on it. If the iscsi connection has been
dropped, then the iscsi layer detects this and just drops the IO during the
flush. So, if the failover timers have fired and the multipath layer is
already using a new path, then the IO is not going to be running on
multiple paths.
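
The fencing sequence described in this thread (break the lock, blacklist the previous owner, fail its stale IO) can be sketched as a toy model; this is purely illustrative and not the librbd or tcmu API:

```python
class ToyCluster:
    """Toy model of exclusive-lock + blacklist fencing; not real librbd."""

    def __init__(self):
        self.lock_owner = None
        self.blacklist = set()

    def acquire_lock(self, client, break_lock=False):
        if self.lock_owner not in (None, client):
            if not break_lock:
                return False
            # Breaking the lock blacklists the previous owner, so any IO
            # it still has in flight is rejected by the OSDs.
            self.blacklist.add(self.lock_owner)
        self.lock_owner = client
        return True

    def write(self, client, data):
        if client in self.blacklist:
            raise PermissionError(f"{client} is blacklisted")
        if self.lock_owner != client:
            raise PermissionError(f"{client} does not hold the lock")
        return "ack"


cluster = ToyCluster()
cluster.acquire_lock("gw-old")
cluster.acquire_lock("gw-new", break_lock=True)   # initiator failover
print(cluster.write("gw-new", b"fresh io"))       # succeeds: ack
try:
    cluster.write("gw-old", b"stale io")          # the stuck IO from before
except PermissionError as e:
    print("old gateway fenced:", e)
```

The key property is that the stale write from the old gateway cannot land after the new gateway's writes, because the blacklist check happens on every IO, not just at lock-acquisition time.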

>  
> 2018-03-11
> 
> shadowlin
>  
> 
> 
> *From:* Jason Dillaman
> *Sent:* 2018-03-11 07:46
> *Subject:* Re: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
> Exclusive Lock
> *To:* "shadow_lin"
> *Cc:* "Mike Christie", "Lazuardi Nasution", "Ceph Users"
>  
> On Sat, Mar 10, 2018 at 10:11 AM, shadow_lin  wrote: 
> > Hi Jason, 
> > 
> >>As discussed in this thread, for active/passive, upon initiator 
> >>failover, we used the RBD exclusive-lock feature to blacklist the old 
> >>"active" iSCSI target gateway so that it cannot talk w/ the Ceph 
> >>cluster before new writes are accepted on the new target gateway. 
> > 
> > I understand that while the new active target gateway was talking to rbd,
> > the old active target gateway could not write because of the RBD
> > exclusive-lock. But after the new target gateway has done its writes, if
> > the old target gateway had some blocked IO during the failover, can't it
> > then get the lock and overwrite the new writes?
>  
> Negative -- it's blacklisted so it cannot talk to the cluster. 
>  
> > PS: 
> > Petasan say they can do active/active iscsi with patched suse kernel. 
>  
> I'll let them comment on these corner cases. 
>  
> > 2018-03-10 
> >  
> > shadowlin 
> > 
> >  
> > 
> > From: Jason Dillaman
> > Sent: 2018-03-10 21:40
> > Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
> > To: "shadow_lin"
> > Cc: "Mike Christie", "Lazuardi Nasution", "Ceph Users"
> > 
> > On Sat, Mar 10, 2018 at 7:42 AM, shadow_lin  wrote: 
> >> Hi Mike, 
> >> So for now only suse kernel with target_rbd_core and tcmu-runner can 
> run 
> >> active/passive multipath safely? 
> > 
> > Negative, the LIO / tcmu-runner implementation documented here [1] is 
> > safe for active/passive. 
> > 
>> I am a newbie to iscsi. I think the stuck-IO-gets-executed overwrite
>> problem can happen with both active/active and active/passive.
>> What makes active/passive safer than active/active?
> > 
> > As discussed in this thread, for active/passive, upon initiator 
> > failover, we used the RBD exclusive-lock feature to blacklist the old 
> > "active" iSCSI target gateway so that it cannot talk w/ the Ceph 
> > cluster before new writes are accepted on the new target gateway. 
> > 
>> What mechanism should be implemented to avoid the problem with
>> active/passive and active/active multipath?
> > 
> Active/passive is solved as discussed above. For active/active, we 
> > don't have a solution that is known safe under all failure conditions. 
> > If LIO supported MCS (multiple connections per session) instead of 
> > just 

Re: [ceph-users] rbd-nbd not resizing even after kernel tweaks

2018-03-11 Thread Alex Gorbachev
On Sun, Mar 11, 2018 at 4:23 AM, Mykola Golub  wrote:
> On Sat, Mar 10, 2018 at 08:25:15PM -0500, Alex Gorbachev wrote:
>> I am running into the problem described in
>> https://lkml.org/lkml/2018/2/19/565 and
>> https://tracker.ceph.com/issues/23137
>>
>> I went ahead and built a custom kernel reverting the change
>> https://github.com/torvalds/linux/commit/639812a1ed9bf49ae2c026086fbf975339cd1eef
>>
>> After that a resize shows in lsblk and /sys/block/nbdX/size, but not
>> in parted for a mounted filesystem.
>>
>> Unmapping and remapping the NBD device shows the size in parted.
>
> Note 639812a is only a part of the changes. The more invasive changes are
> in 29eaadc [1]. To me the most suspicious part is the removal of
> bd_set_size() in nbd_size_update(), but this is just a wild guess.
>
> I would recommend contacting the authors of the change. This would
> also be a gentle reminder for Josef that he promised to fix this.
>
> [1] 
> https://github.com/torvalds/linux/commit/29eaadc0364943b6352e8994158febcb699c9f9b

Got it, I am on it, thanks.

Alex

>
> --
> Mykola Golub
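
In the meantime, the unmap/remap workaround mentioned above can be scripted; below is a dry-run sketch (the device, image, and mount point names are placeholders) that only echoes the commands instead of running them:

```shell
# Dry-run sketch of the unmap/remap workaround for the rbd-nbd resize bug.
# /dev/nbd0, rbd/myimage and /mnt are hypothetical placeholders; swap the
# echo in run() for real execution on a live system.
DEV=/dev/nbd0
IMAGE=rbd/myimage
MNT=/mnt

run() { echo "+ $*"; }

run umount "$MNT"
run rbd-nbd unmap "$DEV"
# Remapping re-reads the image size, so parted sees the resize.
run rbd-nbd map "$IMAGE"
run mount "$DEV" "$MNT"
```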
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-11 Thread Jason Dillaman
On Sun, Mar 11, 2018 at 9:54 AM, shadow_lin  wrote:
> Hi Jason,
> How is the old target gateway blacklisted?

When the newly active target gateway breaks the lock of the old target
gateway, that process will blacklist the old client [1].

> Is it a feature that the target
> gateway (which can support active/passive multipath) should provide, or is
> it done only by the rbd exclusive lock?
> I think the exclusive lock only lets one client write to the rbd at a time,
> but another client can obtain the lock later, when the lock is released.

In general, yes -- but blacklist on lock break has been part of
exclusive-lock since the start. I am honestly not just making this up,
this is how it works.
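
To observe this on a live cluster, the current lock holder and the blacklist can be inspected with `rbd lock list` and `ceph osd blacklist ls`. A dry-run sketch (the image name is a placeholder; the echo stands in for execution):

```shell
# Dry-run sketch: commands to inspect lock and blacklist state after a
# failover. rbd/iscsi-lun0 is a hypothetical image name.
IMAGE=rbd/iscsi-lun0

run() { echo "+ $*"; }

run rbd lock list "$IMAGE"    # shows the current exclusive-lock holder
run ceph osd blacklist ls     # lists blacklisted client addresses
```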

> 2018-03-11
> 
> shadowlin
>
> 
>
> From: Jason Dillaman
> Sent: 2018-03-11 07:46
> Subject: Re: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
> Exclusive Lock
> To: "shadow_lin"
> Cc: "Mike Christie", "Lazuardi Nasution", "Ceph Users"
>
> On Sat, Mar 10, 2018 at 10:11 AM, shadow_lin  wrote:
>> Hi Jason,
>>
>>>As discussed in this thread, for active/passive, upon initiator
>>>failover, we used the RBD exclusive-lock feature to blacklist the old
>>>"active" iSCSI target gateway so that it cannot talk w/ the Ceph
>>>cluster before new writes are accepted on the new target gateway.
>>
>> I understand that while the new active target gateway was talking to rbd,
>> the old active target gateway could not write because of the RBD
>> exclusive-lock. But after the new target gateway has done its writes, if
>> the old target gateway had some blocked IO during the failover, can't it
>> then get the lock and overwrite the new writes?
>
> Negative -- it's blacklisted so it cannot talk to the cluster.
>
>> PS:
>> Petasan say they can do active/active iscsi with patched suse kernel.
>
> I'll let them comment on these corner cases.
>
>> 2018-03-10
>> 
>> shadowlin
>>
>> 
>>
>> From: Jason Dillaman
>> Sent: 2018-03-10 21:40
>> Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
>> To: "shadow_lin"
>> Cc: "Mike Christie", "Lazuardi Nasution", "Ceph Users"
>>
>> On Sat, Mar 10, 2018 at 7:42 AM, shadow_lin  wrote:
>>> Hi Mike,
>>> So for now only suse kernel with target_rbd_core and tcmu-runner can run
>>> active/passive multipath safely?
>>
>> Negative, the LIO / tcmu-runner implementation documented here [1] is
>> safe for active/passive.
>>
>>> I am a newbie to iscsi. I think the stuck-IO-gets-executed overwrite
>>> problem can happen with both active/active and active/passive.
>>> What makes active/passive safer than active/active?
>>
>> As discussed in this thread, for active/passive, upon initiator
>> failover, we used the RBD exclusive-lock feature to blacklist the old
>> "active" iSCSI target gateway so that it cannot talk w/ the Ceph
>> cluster before new writes are accepted on the new target gateway.
>>
>>> What mechanism should be implemented to avoid the problem with
>>> active/passive and active/active multipath?
>>
>> Active/passive is solved as discussed above. For active/active, we
>> don't have a solution that is known safe under all failure conditions.
>> If LIO supported MCS (multiple connections per session) instead of
>> just MPIO (multipath IO), the initiator would provide enough context
>> to the target to detect IOs from a failover situation.
>>
>>> 2018-03-10
>>> 
>>> shadowlin
>>>
>>> 
>>>
>>> From: Mike Christie
>>> Sent: 2018-03-09 00:54
>>> Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive
>>> Lock
>>> To: "shadow_lin", "Lazuardi Nasution", "Ceph Users"
>>> Cc:
>>>
>>> On 03/07/2018 09:24 AM, shadow_lin wrote:
 Hi Christie,
 Is it safe to use active/passive multipath with krbd with exclusive lock
 for lio/tgt/scst/tcmu?
>>>
>>> No. We tried to use lio and krbd initially, but there is an issue where
>>> IO might get stuck in the target/block layer and get executed after new
>>> IO. So for lio, tgt and tcmu it is not safe as is right now. We could
>>> add some code to tcmu's file_example handler, which can be used with krbd,
>>> so it works like the rbd one.
>>>
>>> I do not know enough about SCST right now.
>>>
>>>
 Is it safe to use active/active multipath If use suse kernel with
 target_core_rbd?
 Thanks.

 2018-03-07
 
 shadowlin

 

 

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-11 Thread shadow_lin
Hi Jason,
How is the old target gateway blacklisted? Is it a feature that the target
gateway (which can support active/passive multipath) should provide, or is it
done only by the rbd exclusive lock?
I think the exclusive lock only lets one client write to the rbd at a time,
but another client can obtain the lock later, when the lock is released.

2018-03-11 


shadowlin




From: Jason Dillaman
Sent: 2018-03-11 07:46
Subject: Re: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
To: "shadow_lin"
Cc: "Mike Christie", "Lazuardi Nasution", "Ceph Users"

On Sat, Mar 10, 2018 at 10:11 AM, shadow_lin  wrote: 
> Hi Jason, 
> 
>>As discussed in this thread, for active/passive, upon initiator 
>>failover, we used the RBD exclusive-lock feature to blacklist the old 
>>"active" iSCSI target gateway so that it cannot talk w/ the Ceph 
>>cluster before new writes are accepted on the new target gateway. 
> 
> I understand that while the new active target gateway was talking to rbd,
> the old active target gateway could not write because of the RBD
> exclusive-lock. But after the new target gateway has done its writes, if
> the old target gateway had some blocked IO during the failover, can't it
> then get the lock and overwrite the new writes?

Negative -- it's blacklisted so it cannot talk to the cluster. 

> PS: 
> Petasan say they can do active/active iscsi with patched suse kernel. 

I'll let them comment on these corner cases. 

> 2018-03-10 
>  
> shadowlin 
> 
>  
> 
> From: Jason Dillaman
> Sent: 2018-03-10 21:40
> Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
> To: "shadow_lin"
> Cc: "Mike Christie", "Lazuardi Nasution", "Ceph Users"
> 
> On Sat, Mar 10, 2018 at 7:42 AM, shadow_lin  wrote: 
>> Hi Mike, 
>> So for now only suse kernel with target_rbd_core and tcmu-runner can run 
>> active/passive multipath safely? 
> 
> Negative, the LIO / tcmu-runner implementation documented here [1] is 
> safe for active/passive. 
> 
>> I am a newbie to iscsi. I think the stuck-IO-gets-executed overwrite
>> problem can happen with both active/active and active/passive.
>> What makes active/passive safer than active/active?
> 
> As discussed in this thread, for active/passive, upon initiator 
> failover, we used the RBD exclusive-lock feature to blacklist the old 
> "active" iSCSI target gateway so that it cannot talk w/ the Ceph 
> cluster before new writes are accepted on the new target gateway. 
> 
>> What mechanism should be implemented to avoid the problem with
>> active/passive and active/active multipath?
> 
> Active/passive is solved as discussed above. For active/active, we
> don't have a solution that is known safe under all failure conditions. 
> If LIO supported MCS (multiple connections per session) instead of 
> just MPIO (multipath IO), the initiator would provide enough context 
> to the target to detect IOs from a failover situation. 
> 
>> 2018-03-10 
>>  
>> shadowlin 
>> 
>>  
>> 
>> From: Mike Christie
>> Sent: 2018-03-09 00:54
>> Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock
>> To: "shadow_lin", "Lazuardi Nasution", "Ceph Users"
>> Cc:
>> 
>> On 03/07/2018 09:24 AM, shadow_lin wrote: 
>>> Hi Christie, 
>>> Is it safe to use active/passive multipath with krbd with exclusive lock 
>>> for lio/tgt/scst/tcmu? 
>> 
>> No. We tried to use lio and krbd initially, but there is an issue where
>> IO might get stuck in the target/block layer and get executed after new
>> IO. So for lio, tgt and tcmu it is not safe as is right now. We could
>> add some code to tcmu's file_example handler, which can be used with krbd,
>> so it works like the rbd one.
>>
>> I do not know enough about SCST right now.
>> 
>> 
>>> Is it safe to use active/active multipath If use suse kernel with 
>>> target_core_rbd? 
>>> Thanks. 
>>> 
>>> 2018-03-07 
>>>  
>>> shadowlin 
>>> 
>>>  
>>> 
>>> *From:* Mike Christie
>>> *Sent:* 2018-03-07 03:51
>>> *Subject:* Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD
>>> Exclusive Lock
>>> *To:* "Lazuardi Nasution", "Ceph Users"
>>> *Cc:*
>>> 
>>> On 03/06/2018 01:17 PM, Lazuardi Nasution wrote: 
>>> > Hi, 
>>> > 
>>> > I want to do load balanced multipathing (multiple iSCSI 
>>> 

Re: [ceph-users] rbd-nbd not resizing even after kernel tweaks

2018-03-11 Thread Mykola Golub
On Sat, Mar 10, 2018 at 08:25:15PM -0500, Alex Gorbachev wrote:
> I am running into the problem described in
> https://lkml.org/lkml/2018/2/19/565 and
> https://tracker.ceph.com/issues/23137
> 
> I went ahead and built a custom kernel reverting the change
> https://github.com/torvalds/linux/commit/639812a1ed9bf49ae2c026086fbf975339cd1eef
> 
> After that a resize shows in lsblk and /sys/block/nbdX/size, but not
> in parted for a mounted filesystem.
> 
> Unmapping and remapping the NBD device shows the size in parted.

Note 639812a is only a part of the changes. The more invasive changes are
in 29eaadc [1]. To me the most suspicious part is the removal of
bd_set_size() in nbd_size_update(), but this is just a wild guess.

I would recommend contacting the authors of the change. This would
also be a gentle reminder for Josef that he promised to fix this.

[1] 
https://github.com/torvalds/linux/commit/29eaadc0364943b6352e8994158febcb699c9f9b

-- 
Mykola Golub