On Fri, Sep 19, 2014 at 11:22 AM, Micha Krause <[email protected]> wrote:
> Hi,
>
>> I have built an NFS server based on Sebastien's blog post here:
>>
>> http://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/
>>
>> I'm using kernel 3.14-0.bpo.1-amd64 on Debian wheezy; the host is a VM on
>> VMware.
>>
>> Using rsync, I'm writing data via NFS from one client to this server.
>>
>> The NFS server crashes multiple times per day; I can't even log in to the
>> server then.
>> After a reset, there is no kernel log about the crash, so I guess
>> something is blocking all I/O.
>
>
> Ok, it seems that I just can't get a shell, but I can run commands via ssh
> directly.
So does it actually crash, or is it just blocked I/O? If it doesn't
crash, you should be able to get everything from dmesg.
>
> I was able to get the following information:
>
> dmesg:
>
> [18102.981064] INFO: task nfsd:2769 blocked for more than 120 seconds.
> [18102.981112] Not tainted 3.14-0.bpo.1-amd64 #1
> [18102.981150] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [18102.981216] nfsd D ffff88003fc14340 0 2769 2 0x00000000
> [18102.981218] ffff88003bac6e20 0000000000000046 0000000000000000 ffff88003d47ada0
> [18102.981219] 0000000000014340 ffff88003ce31fd8 0000000000014340 ffff88003bac6e20
> [18102.981221] ffff88003ce31728 ffff8800029539f0 7fffffffffffffff 7fffffffffffffff
> [18102.981223] Call Trace:
> [18102.981225] [<ffffffff814eedbd>] ? schedule_timeout+0x1ed/0x250
> [18102.981231] [<ffffffffa04b0f92>] ? _xfs_buf_find+0xd2/0x280 [xfs]
> [18102.981234] [<ffffffff8117fc2c>] ? kmem_cache_alloc+0x1bc/0x1f0
> [18102.981236] [<ffffffff814f193c>] ? __down_common+0x97/0xea
> [18102.981241] [<ffffffffa04b0faa>] ? _xfs_buf_find+0xea/0x280 [xfs]
> [18102.981243] [<ffffffff810aa697>] ? down+0x37/0x40
> [18102.981247] [<ffffffffa04b0e02>] ? xfs_buf_lock+0x32/0xf0 [xfs]
> [18102.981252] [<ffffffffa04b0faa>] ? _xfs_buf_find+0xea/0x280 [xfs]
> [18102.981257] [<ffffffffa04b1215>] ? xfs_buf_get_map+0x35/0x1a0 [xfs]
> [18102.981263] [<ffffffffa04b2153>] ? xfs_buf_read_map+0x33/0x130 [xfs]
> [18102.981269] [<ffffffffa05161da>] ? xfs_trans_read_buf_map+0x34a/0x4f0 [xfs]
> [18102.981275] [<ffffffffa05036f9>] ? xfs_imap_to_bp+0x69/0xf0 [xfs]
> [18102.981281] [<ffffffffa0503bcd>] ? xfs_iread+0x7d/0x3f0 [xfs]
> [18102.981284] [<ffffffff810e8939>] ? make_kgid+0x9/0x10
> [18102.981286] [<ffffffff811b148e>] ? inode_init_always+0x10e/0x1d0
> [18102.981292] [<ffffffffa04ba11a>] ? xfs_iget+0x2ba/0x810 [xfs]
> [18102.981298] [<ffffffffa04fd9a6>] ? xfs_ialloc+0xe6/0x740 [xfs]
> [18102.981305] [<ffffffffa04ca1ee>] ? kmem_zone_alloc+0x6e/0xf0 [xfs]
> [18102.981311] [<ffffffffa04fe083>] ? xfs_dir_ialloc+0x83/0x300 [xfs]
> [18102.981317] [<ffffffffa04c8e43>] ? xfs_trans_reserve+0x213/0x220 [xfs]
> [18102.981323] [<ffffffffa04fe87e>] ? xfs_create+0x4fe/0x720 [xfs]
> [18102.981329] [<ffffffffa04bfd02>] ? xfs_vn_mknod+0xd2/0x200 [xfs]
> [18102.981331] [<ffffffff811a6b54>] ? vfs_create+0xe4/0x160
> [18102.981335] [<ffffffffa0400d9e>] ? do_nfsd_create+0x53e/0x610 [nfsd]
> [18102.981339] [<ffffffffa0407f4d>] ? nfsd3_proc_create+0x16d/0x250 [nfsd]
> [18102.981342] [<ffffffffa03f9d74>] ? nfsd_dispatch+0xe4/0x230 [nfsd]
> [18102.981347] [<ffffffffa035dd64>] ? svc_process_common+0x354/0x690 [sunrpc]
> [18102.981349] [<ffffffff81096ab0>] ? try_to_wake_up+0x280/0x280
> [18102.981353] [<ffffffffa035e3fb>] ? svc_process+0x10b/0x160 [sunrpc]
> [18102.981359] [<ffffffffa03f96d7>] ? nfsd+0xb7/0x130 [nfsd]
> [18102.981363] [<ffffffffa03f9620>] ? nfsd_destroy+0x70/0x70 [nfsd]
> [18102.981365] [<ffffffff81086d6c>] ? kthread+0xbc/0xe0
> [18102.981367] [<ffffffff81086cb0>] ? flush_kthread_worker+0xa0/0xa0
> [18102.981369] [<ffffffff814faecc>] ? ret_from_fork+0x7c/0xb0
> [18102.981371] [<ffffffff81086cb0>] ? flush_kthread_worker+0xa0/0xa0
Is that the only hung task in dmesg?
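One way to answer that quickly is to pull every hung-task warning out of
the dmesg output. A minimal sketch, assuming the warnings have the same
form as the one quoted above; the function name hung_tasks is only
illustrative:

```python
import re

# Matches the kernel's hung-task warning, e.g.
# "[18102.981064] INFO: task nfsd:2769 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def hung_tasks(dmesg_text):
    """Return (comm, pid, seconds) for every hung-task warning in dmesg_text."""
    found = []
    for line in dmesg_text.splitlines():
        m = HUNG_TASK.search(line)
        if m:
            found.append((m.group("comm"), int(m.group("pid")), int(m.group("secs"))))
    return found
```

Feed it the saved output of dmesg. More than one entry, or entries for
tasks other than nfsd, would show how widely I/O is blocked.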
>
> iostat:
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.00    0.00    1.00   99.00    0.00    0.00
>
> Device:    rrqm/s wrqm/s  r/s  w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm  %util
> sda          0.00   0.00 0.00 0.00  0.00  0.00     0.00     0.00  0.00    0.00    0.00  0.00   0.00
> dm-0         0.00   0.00 0.00 0.00  0.00  0.00     0.00     0.00  0.00    0.00    0.00  0.00   0.00
> dm-1         0.00   0.00 0.00 0.00  0.00  0.00     0.00     0.00  0.00    0.00    0.00  0.00   0.00
> dm-2         0.00   0.00 0.00 0.00  0.00  0.00     0.00     0.00  0.00    0.00    0.00  0.00   0.00
> dm-3         0.00   0.00 0.00 0.00  0.00  0.00     0.00     0.00  0.00    0.00    0.00  0.00   0.00
> dm-4         0.00   0.00 0.00 0.00  0.00  0.00     0.00     0.00  0.00    0.00    0.00  0.00   0.00
> rbd0         0.00   0.00 0.00 0.00  0.00  0.00     0.00    46.00  0.00    0.00    0.00  0.00 100.00
> rbd1         0.00   0.00 0.00 0.00  0.00  0.00     0.00    12.00  0.00    0.00    0.00  0.00 100.00
> rbd2         0.00   0.00 0.00 0.00  0.00  0.00     0.00   136.00  0.00    0.00    0.00  0.00 100.00
> rbd3         0.00   0.00 0.00 0.00  0.00  0.00     0.00     0.00  0.00    0.00    0.00  0.00   0.00
> rbd4         0.00   0.00 0.00 0.00  0.00  0.00     0.00    11.00  0.00    0.00    0.00  0.00 100.00
> rbd5         0.00   0.00 0.00 0.00  0.00  0.00     0.00    57.00  0.00    0.00    0.00  0.00 100.00
> emcpowerig   0.00   0.00 0.00 0.00  0.00  0.00     0.00    32.00  0.00    0.00    0.00  0.00 100.00
> emcpowerhq   0.00   0.00 0.00 0.00  0.00  0.00     0.00    38.00  0.00    0.00    0.00  0.00 100.00
>
> (for some reason rbd6 and rbd7 are shown as emcpower in iostat, no idea why)
>
> cat /sys/kernel/debug/ceph/*/*
>
> have osdmap 39226
> want next osdmap
> epoch 10
> mon0 10.210.32.11:6789
> mon1 10.210.33.11:6789
> mon2 10.210.34.11:6789
> 582277 osd19 37.e52208ae rb.0.1676178.2ae8944a.000000005ffd
> write
> 582278 osd2 37.f52b9433 rb.0.1676178.2ae8944a.000000006000
> write
> 582279 osd2 37.b3f0aae3 rb.0.1676178.2ae8944a.00000000641b
> write
> 582280 osd28 37.d8768bba rb.0.1676178.2ae8944a.00000000641c
> write
> 582282 osd29 37.a923b4c6 rb.0.1676178.2ae8944a.000000008032
> write
> 582283 osd28 37.a2510620 rb.0.1676178.2ae8944a.000000008034
> write
> 582289 osd18 37.c96bc19d rb.0.1345def.2ae8944a.0000001401cf
> write
> 582290 osd1 37.3edba98c rb.0.165171e.238e1f29.000000039fe3
> write
> 582291 osd20 37.89f3f734 rb.0.1676160.238e1f29.0000000002ee
> write
> 582292 osd20 37.89f3f734 rb.0.1676160.238e1f29.0000000002ee
> write
> 582293 osd2 37.d34be89b rb.0.1345def.2ae8944a.00000003c961
> read
> 582294 osd4 37.c611292f rb.0.1345def.2ae8944a.00000007a19a
> read
> 582295 osd20 37.b1ae9634 rb.0.1345def.2ae8944a.00000003def7
> read
> 582296 osd19 37.3928a207 rb.0.1689cf4.238e1f29.000000034106
> write
> 582297 osd15 37.1011005e rb.0.1689d69.238e1f29.000000034614
> write
> 582298 osd15 37.1011005e rb.0.1689d69.238e1f29.000000034614
> write
> 582299 osd24 37.80a8ba78 rb.0.167614e.2ae8944a.000000026149
> write
> 582300 osd18 37.fe0a5354 rb.0.1689d69.238e1f29.0000000221bb
> write
> 582301 osd18 37.fe0a5354 rb.0.1689d69.238e1f29.0000000221bb
> write
> 582302 osd18 37.fe0a5354 rb.0.1689d69.238e1f29.0000000221bb
> write
> 582303 osd18 37.fe0a5354 rb.0.1689d69.238e1f29.0000000221bb
> write
> 582304 osd18 37.fe0a5354 rb.0.1689d69.238e1f29.0000000221bb
> write
> 582305 osd18 37.fe0a5354 rb.0.1689d69.238e1f29.0000000221bb
> write
> 582306 osd18 37.fe0a5354 rb.0.1689d69.238e1f29.0000000221bb
> write
> 582307 osd18 37.fe0a5354 rb.0.1689d69.238e1f29.0000000221bb
> write
> 582308 osd2 37.d142a662 rb.0.1689d69.238e1f29.0000000221bc
> write
> 582309 osd15 37.1011005e rb.0.1689d69.238e1f29.000000034614
> write
> 582310 osd15 37.1011005e rb.0.1689d69.238e1f29.000000034614
> write
> 582311 osd24 37.3c5f1cf8 rb.0.1689d69.238e1f29.0000000221b9
> write
> 582312 osd24 37.3c5f1cf8 rb.0.1689d69.238e1f29.0000000221b9
> write
> 582313 osd24 37.3c5f1cf8 rb.0.1689d69.238e1f29.0000000221b9
> write
> 582314 osd24 37.3c5f1cf8 rb.0.1689d69.238e1f29.0000000221b9
> write
> 582315 osd24 37.3c5f1cf8 rb.0.1689d69.238e1f29.0000000221b9
> write
> 582316 osd24 37.3c5f1cf8 rb.0.1689d69.238e1f29.0000000221b9
> write
> 582317 osd24 37.3c5f1cf8 rb.0.1689d69.238e1f29.0000000221b9
> write
> 582318 osd24 37.3c5f1cf8 rb.0.1689d69.238e1f29.0000000221b9
> write
> 582319 osd1 37.36930f4c rb.0.1689d69.238e1f29.0000000221ba
> write
> 582320 osd2 37.d142a662 rb.0.1689d69.238e1f29.0000000221bc
> write
> 582321 osd2 37.d142a662 rb.0.1689d69.238e1f29.0000000221bc
> write
> 582322 osd2 37.d142a662 rb.0.1689d69.238e1f29.0000000221bc
> write
> 582323 osd2 37.d142a662 rb.0.1689d69.238e1f29.0000000221bc
> write
> 582324 osd2 37.d142a662 rb.0.1689d69.238e1f29.0000000221bc
> write
> 582325 osd2 37.d142a662 rb.0.1689d69.238e1f29.0000000221bc
> write
> 582326 osd2 37.d142a662 rb.0.1689d69.238e1f29.0000000221bc
> write
> 582327 osd2 37.d142a662 rb.0.1689d69.238e1f29.0000000221bc
> write
> 582328 osd27 37.886d4a41 rb.0.1689d69.238e1f29.0000000221bd
> write
> 582329 osd19 37.3928a207 rb.0.1689cf4.238e1f29.000000034106
> write
> 582330 osd23 37.834e04c8 rb.0.1689cf4.238e1f29.00000002485f
> write
> 582331 osd2 37.a7fb0062 rb.0.1689cf4.238e1f29.00000002bfea
> write
> 582332 osd2 37.a7fb0062 rb.0.1689cf4.238e1f29.00000002bfea
> write
> 582333 osd29 37.e3741a18 rb.0.1689cf4.238e1f29.00000002c0f6
> write
> 582334 osd19 37.dec7ad07 rb.0.1689cf4.238e1f29.000000031fe7
> write
> 582335 osd19 37.dec7ad07 rb.0.1689cf4.238e1f29.000000031fe7
> write
> 582336 osd26 37.3f51e7ac rb.0.1689cf4.238e1f29.0000000320e7
> write
> 582337 osd18 37.536a8f6 rb.0.1689cf4.238e1f29.000000033fe6
> write
> 582338 osd18 37.536a8f6 rb.0.1689cf4.238e1f29.000000033fe6
> write
> 582339 osd18 37.e3d0fce6 rb.0.1689cf4.238e1f29.00000002005f
> write
> 582340 osd1 37.36930f4c rb.0.1689d69.238e1f29.0000000221ba
> write
> 582341 osd1 37.36930f4c rb.0.1689d69.238e1f29.0000000221ba
> write
> 582342 osd1 37.36930f4c rb.0.1689d69.238e1f29.0000000221ba
> write
> 582343 osd1 37.36930f4c rb.0.1689d69.238e1f29.0000000221ba
> write
> 582344 osd27 37.886d4a41 rb.0.1689d69.238e1f29.0000000221bd
> write
> 582345 osd27 37.886d4a41 rb.0.1689d69.238e1f29.0000000221bd
> write
> 582346 osd27 37.886d4a41 rb.0.1689d69.238e1f29.0000000221bd
> write
> 582347 osd16 37.9764b7df rb.0.1676178.2ae8944a.000000016392
> write
> 582348 osd16 37.9764b7df rb.0.1676178.2ae8944a.000000016392
> write
> 582349 osd16 37.9764b7df rb.0.1676178.2ae8944a.000000016392
> write
> 582350 osd16 37.9764b7df rb.0.1676178.2ae8944a.000000016392
> write
> 582351 osd16 37.9764b7df rb.0.1676178.2ae8944a.000000016392
> write
> 582352 osd16 37.9764b7df rb.0.1676178.2ae8944a.000000016392
> write
> 582353 osd4 37.1067846f rb.0.1676178.2ae8944a.00000003a43c
> write
> 582354 osd26 37.ecccc929 rb.0.1676178.2ae8944a.00000003c703
> write
> 582355 osd18 37.ed91673e rb.0.1676178.2ae8944a.00000003c704
> write
> 582356 osd23 37.58fdaf5a rb.0.1676178.2ae8944a.00000003dfe1
> write
> 582357 osd23 37.58fdaf5a rb.0.1676178.2ae8944a.00000003dfe1
> write
> 582358 osd23 37.a050d40f rb.0.1676178.2ae8944a.00000003e0a4
> write
> 582359 osd1 37.c3f1f60c rb.0.1676178.2ae8944a.00000003e24f
> write
> 582360 osd18 37.8ad6d326 rb.0.1676178.2ae8944a.00000003e50b
> write
> 582361 osd28 37.cdf2c9ba rb.0.1676178.2ae8944a.00000003e50c
> write
> 582362 osd27 37.99cdab71 rb.0.1676178.2ae8944a.00000003e50d
> write
> 582363 osd27 37.99cdab71 rb.0.1676178.2ae8944a.00000003e50d
> write
> 582364 osd16 37.1eaec3ff rb.0.1676178.2ae8944a.00000000641a
> write
> 582365 osd24 37.d881c8e rb.0.1676178.2ae8944a.000000007ffc
> write
> 582366 osd24 37.d881c8e rb.0.1676178.2ae8944a.000000007ffc
> write
> 582367 osd24 37.d881c8e rb.0.1676178.2ae8944a.000000007ffc
> write
> 582368 osd27 37.d054a135 rb.0.1676178.2ae8944a.00000000842c
> write
> 582369 osd27 37.d054a135 rb.0.1676178.2ae8944a.00000000842c
> write
> 582370 osd18 37.876fd4d7 rb.0.1676178.2ae8944a.000000008430
> write
> 582371 osd19 37.e08b5547 rb.0.1676178.2ae8944a.000000008431
> write
> 582372 osd26 37.cd501c rb.0.1676178.2ae8944a.00000000843e
> write
> 582373 osd26 37.cd501c rb.0.1676178.2ae8944a.00000000843e
> write
> 582374 osd2 37.c5381d37 rb.0.1676178.2ae8944a.000000015ff5
> write
> 582375 osd2 37.c5381d37 rb.0.1676178.2ae8944a.000000015ff5
> write
> 582376 osd17 37.82f3d395 rb.0.1676178.2ae8944a.000000016173
> write
> 582377 osd19 37.9af0f044 rb.0.1676178.2ae8944a.000000016389
> write
> 582378 osd29 37.6d121a86 rb.0.1676178.2ae8944a.00000002004a
> write
> 582379 osd2 37.ee45629b rb.0.1689d69.238e1f29.00000000484e
> write
> 582380 osd28 37.b24e1c3a rb.0.1689d69.238e1f29.000000005ffd
> write
> 582381 osd28 37.b24e1c3a rb.0.1689d69.238e1f29.000000005ffd
> write
> 582382 osd27 37.56c53271 rb.0.1689d69.238e1f29.00000000618d
> write
> 582383 osd4 37.a5d8fbaf rb.0.1689d69.238e1f29.000000007ffc
> write
> 582384 osd4 37.a5d8fbaf rb.0.1689d69.238e1f29.000000007ffc
> write
> 582385 osd4 37.a5d8fbaf rb.0.1689d69.238e1f29.000000007ffc
> write
> 582386 osd1 37.947c5bcc rb.0.1689d69.238e1f29.0000000080bd
> write
> 582387 osd26 37.7ae1d612 rb.0.1689d69.238e1f29.000000008112
> write
> 582388 osd18 37.9bf4cf3c rb.0.1689d69.238e1f29.0000000081eb
> write
> 582389 osd1 37.9d5cf7f0 rb.0.1689d69.238e1f29.0000000081ec
> write
> 582390 osd19 37.4f750fee rb.0.1689d69.238e1f29.0000000081f0
> write
> 582391 osd19 37.3db853ad rb.0.1689d69.238e1f29.000000009ffb
> write
> 582392 osd19 37.3db853ad rb.0.1689d69.238e1f29.000000009ffb
> write
> 582393 osd26 37.94b385c rb.0.1689d69.238e1f29.00000000a1d1
> write
> 582394 osd28 37.fd40607a rb.0.1689d69.238e1f29.000000022197
> write
> 582395 osd15 37.1011005e rb.0.1689d69.238e1f29.000000034614
> write
> 582396 osd27 37.c42eab1 rb.0.1689d69.238e1f29.000000036160
> write
> 582397 osd2 37.2ac07662 rb.0.1689d69.238e1f29.00000001fffc
> write
> 582398 osd24 37.80a8ba78 rb.0.167614e.2ae8944a.000000026149
> write
> 582399 osd18 37.5c0f9c25 rb.0.167614e.2ae8944a.00000002a133
> write
> 582400 osd2 37.89e0f5b3 rb.0.167614e.2ae8944a.00000002ffe8
> write
> 582401 osd26 37.9be1215c rb.0.167614e.2ae8944a.00000000042b
> write
> 582402 osd23 37.a64e7d0f rb.0.167614e.2ae8944a.00000000238f
> write
> 582403 osd20 37.da28f8ab rb.0.167614e.2ae8944a.000000003ffe
> write
> 582404 osd24 37.b7af1f40 rb.0.167614e.2ae8944a.000000004005
> write
> 582405 osd20 37.cecda9b4 rb.0.167614e.2ae8944a.000000004b8a
> write
> 582406 osd19 37.29bcdcae rb.0.167614e.2ae8944a.000000005ffd
> write
> 582407 osd19 37.29bcdcae rb.0.167614e.2ae8944a.000000005ffd
> write
> 582408 osd27 37.f29b1b01 rb.0.167614e.2ae8944a.000000006242
> write
> 582409 osd18 37.d68a6565 rb.0.167614e.2ae8944a.000000007ffc
> write
> 582410 osd18 37.d68a6565 rb.0.167614e.2ae8944a.000000007ffc
> write
> 582411 osd24 37.87cc6a8e rb.0.167614e.2ae8944a.0000000082f2
> write
> 582412 osd15 37.a91eb70d rb.0.167614e.2ae8944a.000000009ffb
> write
> 582413 osd15 37.3b20fd05 rb.0.167614e.2ae8944a.00000000a03a
> write
> 582414 osd2 37.89e0f5b3 rb.0.167614e.2ae8944a.00000002ffe8
> write
> 582415 osd2 37.89e0f5b3 rb.0.167614e.2ae8944a.00000002ffe8
> write
> 582416 osd26 37.f0bb2a2a rb.0.167614e.2ae8944a.0000000301cc
> write
> 582417 osd26 37.3b4ebe12 rb.0.167614e.2ae8944a.0000000301cd
> write
> 582418 osd24 37.346ed74e rb.0.167614e.2ae8944a.000000031fe7
> write
> 582419 osd24 37.346ed74e rb.0.167614e.2ae8944a.000000031fe7
> write
> 582420 osd27 37.3f46e175 rb.0.167614e.2ae8944a.00000003215c
> write
> 582421 osd18 37.796dd926 rb.0.167614e.2ae8944a.00000003217a
> write
> 582422 osd19 37.4287ec84 rb.0.167614e.2ae8944a.000000033fe6
> write
> 582423 osd19 37.4287ec84 rb.0.167614e.2ae8944a.000000033fe6
> write
> 582424 osd26 37.84b7afe9 rb.0.167614e.2ae8944a.000000034112
> write
> 582425 osd24 37.f9a234b8 rb.0.167614e.2ae8944a.000000034113
> write
> 582426 osd18 37.6bd0b876 rb.0.167614e.2ae8944a.000000035fe5
> write
> 582427 osd18 37.6bd0b876 rb.0.167614e.2ae8944a.000000035fe5
> write
> 582428 osd23 37.1ae6123b rb.0.167614e.2ae8944a.00000003611a
> write
> 582429 osd18 37.8202a597 rb.0.167614e.2ae8944a.00000003611b
> write
> 582430 osd17 37.4d252e13 rb.0.167614e.2ae8944a.000000037fe4
> write
> 582431 osd17 37.4d252e13 rb.0.167614e.2ae8944a.000000037fe4
> write
> 582432 osd4 37.7caa0b rb.0.167614e.2ae8944a.00000003800a
> write
> 582433 osd4 37.5917feaf rb.0.167614e.2ae8944a.0000000380cd
> write
> 582434 osd29 37.abe267c6 rb.0.167614e.2ae8944a.0000000380cf
> write
> 582435 osd16 37.dce75fe7 rb.0.167614e.2ae8944a.000000039fe3
> write
> 582436 osd16 37.dce75fe7 rb.0.167614e.2ae8944a.000000039fe3
> write
> 582437 osd1 37.f04f6991 rb.0.167614e.2ae8944a.00000003a02f
> write
> 582438 osd1 37.f04f6991 rb.0.167614e.2ae8944a.00000003a02f
> write
> 582439 osd27 37.33f8b641 rb.0.167614e.2ae8944a.00000003cc91
> write
> 582440 osd22 37.be13ad3d rb.0.167614e.2ae8944a.00000003e40d
> write
> 582441 osd18 37.bbf96366 rb.0.167614e.2ae8944a.000000020035
> write
> 582442 osd18 37.bbf96366 rb.0.167614e.2ae8944a.000000020035
> write
> 582443 osd1 37.3edba98c rb.0.165171e.238e1f29.000000039fe3
> write
> 582444 osd1 37.3edba98c rb.0.165171e.238e1f29.000000039fe3
> write
> 582445 osd1 37.3edba98c rb.0.165171e.238e1f29.000000039fe3
> write
> 582446 osd16 37.4800911f rb.0.165171e.238e1f29.00000003a085
> write
> 582447 osd2 37.bdec20a3 rb.0.165171e.238e1f29.00000003a12c
> write
> 582448 osd23 37.33d51508 rb.0.165171e.238e1f29.00000003a12d
> write
> 582449 osd16 37.8a8ec527 rb.0.165171e.238e1f29.00000003a138
> write
> 582450 osd17 37.d7435c13 rb.0.165171e.238e1f29.00000003a139
> write
> 582451 osd17 37.d7435c13 rb.0.165171e.238e1f29.00000003a139
> write
> 582452 osd15 37.7e4b4f9e rb.0.165171e.238e1f29.00000003a13a
> write
> 582453 osd26 37.ae80f3ac rb.0.165171e.238e1f29.00000003a13b
> write
> 582454 osd26 37.ae80f3ac rb.0.165171e.238e1f29.00000003a13b
> write
> 582455 osd15 37.fc3cadcd rb.0.165171e.238e1f29.00000003a13c
> write
> 582456 osd15 37.fc3cadcd rb.0.165171e.238e1f29.00000003a13c
> write
> 582457 osd19 37.dcc1f244 rb.0.165171e.238e1f29.00000003a13d
> write
> 582458 osd28 37.c5ce907a rb.0.165171e.238e1f29.00000003a13e
> write
> 582459 osd28 37.c5ce907a rb.0.165171e.238e1f29.00000003a13e
> write
> 582460 osd18 37.d7371b26 rb.0.165171e.238e1f29.00000003a13f
> write
> 582461 osd18 37.89ec9be5 rb.0.165171e.238e1f29.00000003a140
> write
> 582462 osd4 37.5032c82f rb.0.165171e.238e1f29.00000003bfe2
> write
> 582463 osd4 37.5032c82f rb.0.165171e.238e1f29.00000003bfe2
> write
> 582464 osd26 37.54a4fd50 rb.0.165171e.238e1f29.00000003bfe9
> write
> 582465 osd23 37.2929897b rb.0.165171e.238e1f29.00000003c136
> write
> 582466 osd23 37.2929897b rb.0.165171e.238e1f29.00000003c136
> write
> 582467 osd20 37.b9aff419 rb.0.165171e.238e1f29.00000003dfe1
> write
> 582468 osd20 37.b9aff419 rb.0.165171e.238e1f29.00000003dfe1
> write
> 582469 osd24 37.685a8638 rb.0.165171e.238e1f29.00000003e08c
> write
> 582470 osd26 37.adfd8b12 rb.0.165171e.238e1f29.00000003e14a
> write
> 582471 osd2 37.a67386a2 rb.0.165171e.238e1f29.00000003e14b
> write
> 582472 osd18 37.688ac754 rb.0.165171e.238e1f29.00000002802c
> write
> 582473 osd15 37.d2bed74d rb.0.1676160.238e1f29.00000002000d
> write
> 582474 osd23 37.9c9a1a8f rb.0.165171e.238e1f29.000000020002
> write
Have you looked at the Ceph servers? krbd is really just a client, so if
the OSDs don't reply to its requests, there isn't much it can do. From a
quick look, this doesn't look like a krbd bug.
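To decide which OSDs to look at first, it may help to count the
outstanding requests per OSD in the osdc dump. A minimal sketch, assuming
the raw file content (without mail quoting) and only read and write ops,
as in the dump above; requests_per_osd is a made-up name:

```python
import re
from collections import Counter

# One in-flight request as printed in the osdc file quoted above:
#   <tid> osd<N> <pool>.<hash> <object name> <op>
# \s+ also matches newlines, so entries whose op ended up on the next
# line still parse.
REQUEST = re.compile(
    r"(\d+)\s+osd(\d+)\s+(\d+\.[0-9a-f]+)\s+(\S+)\s+(read|write)\b"
)


def requests_per_osd(osdc_text):
    """Count in-flight requests per OSD id in an osdc dump."""
    counts = Counter()
    for _tid, osd, _pg, _obj, _op in REQUEST.findall(osdc_text):
        counts[int(osd)] += 1
    return counts
```

requests_per_osd(text).most_common() then lists the OSDs with the most
requests still waiting for a reply; their logs on the server side are
a reasonable place to start.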
> epoch 39226
> flags
> pg_pool 0 pg_num 256 / 255
> pg_pool 1 pg_num 128 / 127
> pg_pool 4 pg_num 32 / 31
> pg_pool 19 pg_num 512 / 511
> pg_pool 25 pg_num 8 / 7
> pg_pool 27 pg_num 1 / 0
> pg_pool 28 pg_num 1 / 0
> pg_pool 29 pg_num 1 / 0
> pg_pool 30 pg_num 1 / 0
> pg_pool 31 pg_num 1 / 0
> pg_pool 32 pg_num 1 / 0
> pg_pool 33 pg_num 1 / 0
> pg_pool 34 pg_num 1 / 0
> pg_pool 35 pg_num 2 / 1
> pg_pool 36 pg_num 1 / 0
> pg_pool 37 pg_num 64 / 63
> pg_pool 40 pg_num 2 / 1
> pg_pool 41 pg_num 1 / 0
> osd0 10.210.33.22:6815 100% (exists, up)
> osd1 10.210.33.22:6800 100% (exists, up)
> osd2 10.210.33.22:6805 100% (exists, up)
> osd3 10.210.32.22:6800 0% (doesn't exist)
> osd4 10.210.33.22:6810 100% (exists, up)
> osd5 10.210.33.22:6820 100% (doesn't exist)
> osd6 10.210.33.22:6805 100% (doesn't exist)
> osd7 10.210.33.22:6825 100% (doesn't exist)
> osd8 10.210.33.22:6830 100% (doesn't exist)
> osd9 10.210.33.22:6835 100% (doesn't exist)
> osd10 10.210.33.22:6805 100% (doesn't exist)
> osd11 10.210.33.22:6845 100% (doesn't exist)
> osd12 10.210.33.22:6850 100% (doesn't exist)
> osd13 10.210.33.22:6855 100% (doesn't exist)
> osd14 10.210.33.22:6860 100% (doesn't exist)
> osd15 10.210.32.23:6800 100% (exists, up)
> osd16 10.210.32.23:6807 100% (exists, up)
> osd17 10.210.32.23:6801 100% (exists, up)
> osd18 10.210.32.23:6816 100% (exists, up)
> osd19 10.210.32.23:6812 100% (exists, up)
> osd20 10.210.34.21:6800 100% (exists, up)
> osd21 10.210.34.21:6804 100% (exists, up)
> osd22 10.210.34.21:6809 100% (exists, up)
> osd23 10.210.34.21:6814 100% (exists, up)
> osd24 10.210.34.21:6819 100% (exists, up)
> osd25 10.210.33.21:6800 100% (exists, up)
> osd26 10.210.33.21:6805 100% (exists, up)
> osd27 10.210.33.21:6810 100% (exists, up)
> osd28 10.210.33.21:6815 100% (exists, up)
> osd29 10.210.33.21:6820 100% (exists, up)
> osd30 10.210.33.22:6865 100% (doesn't exist)
> osd31 10.210.33.22:6870 0% (doesn't exist)
> osd32 10.210.33.21:6800 0% (doesn't exist)
>
> I don't know how to interpret this. The "doesn't exist" lines are correct;
> these OSDs were removed.
> Why are they still known to the rbd client? The OSDs were removed before
> the client was booted.
What procedure did you follow to remove those OSDs?
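To compare the client's view with what the cluster reports, the OSD lines
of the osdmap dump can be grouped by state. A minimal sketch, assuming
lines of the form quoted above; osds_by_state is only an illustrative
name:

```python
import re

# One OSD line of the osdmap file as quoted above, e.g.
#   osd3 10.210.32.22:6800 0% (doesn't exist)
OSD_LINE = re.compile(r"^osd(\d+)\s+(\S+)\s+(\d+)%\s+\((.*)\)\s*$")


def osds_by_state(osdmap_text):
    """Map each state string, e.g. "exists, up", to a list of OSD ids."""
    states = {}
    for line in osdmap_text.splitlines():
        m = OSD_LINE.match(line.strip())
        if m:
            states.setdefault(m.group(4), []).append(int(m.group(1)))
    return states
```

The ids under "doesn't exist" can then be checked against the list of
removed OSDs, and the ids under "exists, up" against the OSDs that the
in-flight requests are addressed to.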
Thanks,
Ilya
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com