Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-21 Thread Stefan Priebe - Profihost AG
Hello Igor,

thanks for all your feedback and all your help.

The first thing I'll try is to upgrade a bunch of systems from
kernel 4.19.66 to 4.19.97 and see what happens.

I'll report back in 7-10 days to verify whether this helps.

Greets,
Stefan

On 20.01.20 13:12, Igor Fedotov wrote:
> Hi Stefan,
> 
> these lines are the result of a transaction dump performed on a failure
> during transaction submission (which is shown as
> 
> "submit_transaction error: Corruption: block checksum mismatch code = 2"
> 
> Most probably they are of no interest (checksum errors are unlikely to
> be caused by transaction content), hence we need earlier log entries to
> learn what caused that checksum mismatch.
> 
> It's hard to give any formal overview of what you should look for; from
> my troubleshooting experience, generally one may try to find:
> 
> - some previous error/warning indications (e.g. allocation, disk access,
> etc)
> 
> - prior OSD crashes (sometimes they might have different causes/stack
> traces/assertion messages)
> 
> - any timeout or retry indications
> 
> - any uncommon log patterns which aren't present during regular running
> but happen each time before the crash/failure.
> 
> 
> Anyway, I think the inspection should go much deeper than it presumably
> does now (judging from what I can see in your log snippets).
> 
> Ceph keeps a buffer of the most recent log events at an increased log
> level and dumps them on crash, each line prefixed with a negative index
> counting up to -1 for the latest entry.
> 
> -1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb:
> 
> 
> It would be great if you could share several log snippets for different
> crashes containing these most recent log lines.
> 
> 
> Thanks,
> 
> Igor
> 
> 
> On 1/19/2020 9:42 PM, Stefan Priebe - Profihost AG wrote:
>> Hello Igor,
>>
>> there's absolutely nothing in the logs before.
>>
>> What do those lines mean:
>> Put( Prefix = O key =
>> 0x7f8001cc45c881217262'd_data.4303206b8b4567.9632!='0xfffe6f0012'x'
>>
>> Value size = 480)
>> Put( Prefix = O key =
>> 0x7f8001cc45c881217262'd_data.4303206b8b4567.9632!='0xfffe'o'
>>
>> Value size = 510)
>>
>> on the right side I always see 0xfffe on all
>> failed OSDs.
>>
>> greets,
>> Stefan
>> On 19.01.20 14:07, Stefan Priebe - Profihost AG wrote:
>>> Yes, except that this happens on 8 different clusters with different
>>> hw but same ceph version and same kernel version.
>>>
>>> Greets,
>>> Stefan
>>>
 On 19.01.2020 11:53, Igor Fedotov wrote:

 So the intermediate summary is:

 Any OSD in the cluster can experience interim RocksDB checksum
 failure. Which isn't present after OSD restart.

 No HW issues observed, no persistent artifacts (except OSD log)
 afterwards.

 And looks like the issue is rather specific to the cluster as no
 similar reports from other users seem to be present.


 Sorry, I'm out of ideas other than collecting all the failure logs and
 trying to find something common in them. Maybe this will shed some
 light.

 BTW, from my experience it might make sense to inspect the OSD log prior
 to the failure (any error messages and/or prior restarts, etc); sometimes
 this provides some hints.


 Thanks,

 Igor


> On 1/17/2020 2:30 PM, Stefan Priebe - Profihost AG wrote:
> HI Igor,
>
>> On 17.01.20 12:10, Igor Fedotov wrote:
>> hmmm..
>>
>> Just in case - suggest to check H/W errors with dmesg.
> this happens on around 80 nodes - I don't expect all of them to have
> unnoticed hw errors. Also, all of them are monitored - no dmesg output
> contains any errors.
>
>> Also there are some (not very much though) chances this is another
>> incarnation of the following bug:
>> https://tracker.ceph.com/issues/22464
>> https://github.com/ceph/ceph/pull/24649
>>
>> The corresponding PR works around it for main device reads (user data
>> only!) but theoretically it might still happen
>>
>> either for DB device or DB data at main device.
>>
>> Can you observe any bluefs spillovers? Is there any correlation between
>> failing OSDs and spillover presence, e.g. do failing OSDs always have a
>> spillover, while OSDs without spillovers never face the issue?
>>
>> To validate this hypothesis one can try to monitor/check (e.g. once a
>> day for a week or something) "bluestore_reads_with_retries"
>> counter over
>> OSDs to learn if the issue is happening
>>
>> in the system.  Non-zero values mean it's there for user data/main
>> device and hence is likely to happen for DB ones as well (which
>> doesn't
>> have any workaround yet).
> OK i checked bluestore_reads_with_retries on 360 osds but all of
> them say 0.
>
>
>> 

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-20 Thread Igor Fedotov

Hi Stefan,

these lines are the result of a transaction dump performed on a failure during
transaction submission (which is shown as


"submit_transaction error: Corruption: block checksum mismatch code = 2"

Most probably they are of no interest (checksum errors are unlikely to
be caused by transaction content), hence we need earlier log entries to
learn what caused that checksum mismatch.

It's hard to give any formal overview of what you should look for; from
my troubleshooting experience, generally one may try to find:


- some previous error/warning indications (e.g. allocation, disk access, 
etc)


- prior OSD crashes (sometimes they might have different causes/stack 
traces/assertion messages)


- any timeout or retry indications

- any uncommon log patterns which aren't present during regular running 
but happen each time before the crash/failure.
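
For the kinds of indications listed above, a rough first pass could look
something like this (log path and OSD id are only placeholders, adjust to
your deployment):

LOG=/var/log/ceph/ceph-osd.27.log
# error / timeout / retry style indications before the crash
grep -nEi 'error|fail|abort|timeout|retry|alloc' "$LOG" | tail -n 100
# earlier crashes or restarts of the same daemon (startup banner + asserts)
grep -nE 'ceph version|FAILED assert|Caught signal' "$LOG"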



Anyway, I think the inspection should go much deeper than it presumably
does now (judging from what I can see in your log snippets).


Ceph keeps a buffer of the most recent log events at an increased log
level and dumps them on crash, each line prefixed with a negative index
counting up to -1 for the latest entry.


-1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb:


It would be great if you could share several log snippets for different
crashes containing these most recent log lines.
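
If it helps, the whole recent-events block for a given crash can usually be
pulled out in one go; a minimal sketch (log path is an assumption):

# the dump is bracketed by "begin/end dump of recent events" markers
awk '/begin dump of recent events/,/end dump of recent events/' \
    /var/log/ceph/ceph-osd.27.log > osd.27.recent-events.txt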



Thanks,

Igor


On 1/19/2020 9:42 PM, Stefan Priebe - Profihost AG wrote:

Hello Igor,

there's absolutely nothing in the logs before.

What do those lines mean:
Put( Prefix = O key =
0x7f8001cc45c881217262'd_data.4303206b8b4567.9632!='0xfffe6f0012'x'
Value size = 480)
Put( Prefix = O key =
0x7f8001cc45c881217262'd_data.4303206b8b4567.9632!='0xfffe'o'
Value size = 510)

on the right side I always see 0xfffe on all
failed OSDs.

greets,
Stefan
On 19.01.20 14:07, Stefan Priebe - Profihost AG wrote:

Yes, except that this happens on 8 different clusters with different hw but 
same ceph version and same kernel version.

Greets,
Stefan


On 19.01.2020 11:53, Igor Fedotov wrote:

So the intermediate summary is:

Any OSD in the cluster can experience interim RocksDB checksum failure. Which 
isn't present after OSD restart.

No HW issues observed, no persistent artifacts (except OSD log) afterwards.

And looks like the issue is rather specific to the cluster as no similar 
reports from other users seem to be present.


Sorry, I'm out of ideas other than collecting all the failure logs and trying
to find something common in them. Maybe this will shed some light.

BTW, from my experience it might make sense to inspect the OSD log prior to the
failure (any error messages and/or prior restarts, etc); sometimes this provides
some hints.


Thanks,

Igor



On 1/17/2020 2:30 PM, Stefan Priebe - Profihost AG wrote:
HI Igor,


On 17.01.20 12:10, Igor Fedotov wrote:
hmmm..

Just in case - suggest to check H/W errors with dmesg.

this happens on around 80 nodes - I don't expect all of them to have
unnoticed hw errors. Also, all of them are monitored - no dmesg output
contains any errors.


Also there are some (not very much though) chances this is another
incarnation of the following bug:
https://tracker.ceph.com/issues/22464
https://github.com/ceph/ceph/pull/24649

The corresponding PR works around it for main device reads (user data
only!) but theoretically it might still happen

either for DB device or DB data at main device.

Can you observe any bluefs spillovers? Is there any correlation between
failing OSDs and spillover presence, e.g. do failing OSDs always have a
spillover, while OSDs without spillovers never face the issue?

To validate this hypothesis one can try to monitor/check (e.g. once a
day for a week or something) "bluestore_reads_with_retries" counter over
OSDs to learn if the issue is happening

in the system.  Non-zero values mean it's there for user data/main
device and hence is likely to happen for DB ones as well (which doesn't
have any workaround yet).

OK i checked bluestore_reads_with_retries on 360 osds but all of them say 0.



Additionally you might want to monitor memory usage as the above
mentioned PR denotes high memory pressure as potential trigger for these
read errors. So if such pressure happens the hypothesis becomes more valid.

We already monitor this heavily and have around 10GB of memory per OSD. Also,
none of those machines show any IO pressure at all.

All hosts show a constant rate of around 38GB to 45GB mem available in
/proc/meminfo.

Stefan


Thanks,

Igor

PS. Everything above is rather a speculation for now. Available
information is definitely not enough for extensive troubleshooting the
cases which happen that rarely.

You might want to start collecting failure-related information
(including but not limited to failure logs, perf counter dumps, system
resource reports etc) for future analysis.



On 1/16/2020 11:58 PM, Stefan Priebe - 

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-19 Thread Stefan Priebe - Profihost AG
Hello Igor,

there's absolutely nothing in the logs before.

What do those lines mean:
Put( Prefix = O key =
0x7f8001cc45c881217262'd_data.4303206b8b4567.9632!='0xfffe6f0012'x'
Value size = 480)
Put( Prefix = O key =
0x7f8001cc45c881217262'd_data.4303206b8b4567.9632!='0xfffe'o'
Value size = 510)

on the right side I always see 0xfffe on all
failed OSDs.

greets,
Stefan
On 19.01.20 14:07, Stefan Priebe - Profihost AG wrote:
> Yes, except that this happens on 8 different clusters with different hw but 
> same ceph version and same kernel version.
> 
> Greets,
> Stefan
> 
>> On 19.01.2020 11:53, Igor Fedotov wrote:
>>
>> So the intermediate summary is:
>>
>> Any OSD in the cluster can experience interim RocksDB checksum failure. 
>> Which isn't present after OSD restart.
>>
>> No HW issues observed, no persistent artifacts (except OSD log) afterwards.
>>
>> And looks like the issue is rather specific to the cluster as no similar 
>> reports from other users seem to be present.
>>
>>
>> Sorry, I'm out of ideas other than collecting all the failure logs and
>> trying to find something common in them. Maybe this will shed some light.
>>
>> BTW, from my experience it might make sense to inspect the OSD log prior to
>> the failure (any error messages and/or prior restarts, etc); sometimes this
>> provides some hints.
>>
>>
>> Thanks,
>>
>> Igor
>>
>>
>>> On 1/17/2020 2:30 PM, Stefan Priebe - Profihost AG wrote:
>>> HI Igor,
>>>
 On 17.01.20 12:10, Igor Fedotov wrote:
 hmmm..

 Just in case - suggest to check H/W errors with dmesg.
>>> this happens on around 80 nodes - I don't expect all of them to have
>>> unnoticed hw errors. Also, all of them are monitored - no dmesg output
>>> contains any errors.
>>>
 Also there are some (not very much though) chances this is another
 incarnation of the following bug:
 https://tracker.ceph.com/issues/22464
 https://github.com/ceph/ceph/pull/24649

 The corresponding PR works around it for main device reads (user data
 only!) but theoretically it might still happen

 either for DB device or DB data at main device.

 Can you observe any bluefs spillovers? Is there any correlation between
 failing OSDs and spillover presence, e.g. do failing OSDs always have a
 spillover, while OSDs without spillovers never face the issue?

 To validate this hypothesis one can try to monitor/check (e.g. once a
 day for a week or something) "bluestore_reads_with_retries" counter over
 OSDs to learn if the issue is happening

 in the system.  Non-zero values mean it's there for user data/main
 device and hence is likely to happen for DB ones as well (which doesn't
 have any workaround yet).
>>> OK i checked bluestore_reads_with_retries on 360 osds but all of them say 0.
>>>
>>>
 Additionally you might want to monitor memory usage as the above
 mentioned PR denotes high memory pressure as potential trigger for these
 read errors. So if such pressure happens the hypothesis becomes more valid.
>>> We already monitor this heavily and have around 10GB of memory per OSD. Also,
>>> none of those machines show any IO pressure at all.
>>>
>>> All hosts show a constant rate of around 38GB to 45GB mem available in
>>> /proc/meminfo.
>>>
>>> Stefan
>>>
 Thanks,

 Igor

 PS. Everything above is rather a speculation for now. Available
 information is definitely not enough for extensive troubleshooting the
 cases which happen that rarely.

 You might want to start collecting failure-related information
 (including but not limited to failure logs, perf counter dumps, system
 resource reports etc) for future analysis.



 On 1/16/2020 11:58 PM, Stefan Priebe - Profihost AG wrote:
> Hi Igor,
>
> answers inline.
>
> On 16.01.20 21:34, Igor Fedotov wrote:
>> you may want to run fsck against failing OSDs. Hopefully it will shed
>> some light.
> fsck just says everything fine:
>
> # ceph-bluestore-tool --command fsck --path /var/lib/ceph/osd/ceph-27/
> fsck success
>
>
>> Also wondering if OSD is able to recover (startup and proceed working)
>> after facing the issue?
> no recovery needed. It just runs fine after restarting.
>
>> If so do you have any one which failed multiple times? Do you have logs
>> for these occurrences?
> Maybe, but there are most probably weeks or months between those failures
> - most probably the logs are already deleted.
>
>> Also please note that patch you mentioned doesn't fix previous issues
> (i.e. duplicate allocations); it only prevents new ones.
>>
>> But fsck should show them if any...
> None showed.
>
> Stefan
>
>> Thanks,
>>
>> Igor

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-19 Thread Stefan Priebe - Profihost AG
Yes, except that this happens on 8 different clusters with different hw but 
same ceph version and same kernel version.

Greets,
Stefan

> On 19.01.2020 11:53, Igor Fedotov wrote:
> 
> So the intermediate summary is:
> 
> Any OSD in the cluster can experience interim RocksDB checksum failure. Which 
> isn't present after OSD restart.
> 
> No HW issues observed, no persistent artifacts (except OSD log) afterwards.
> 
> And looks like the issue is rather specific to the cluster as no similar 
> reports from other users seem to be present.
> 
> 
> Sorry, I'm out of ideas other than collecting all the failure logs and
> trying to find something common in them. Maybe this will shed some light.
> 
> BTW, from my experience it might make sense to inspect the OSD log prior to
> the failure (any error messages and/or prior restarts, etc); sometimes this
> provides some hints.
> 
> 
> Thanks,
> 
> Igor
> 
> 
>> On 1/17/2020 2:30 PM, Stefan Priebe - Profihost AG wrote:
>> HI Igor,
>> 
>>> On 17.01.20 12:10, Igor Fedotov wrote:
>>> hmmm..
>>> 
>>> Just in case - suggest to check H/W errors with dmesg.
>> this happens on around 80 nodes - I don't expect all of them to have
>> unnoticed hw errors. Also, all of them are monitored - no dmesg output
>> contains any errors.
>> 
>>> Also there are some (not very much though) chances this is another
>>> incarnation of the following bug:
>>> https://tracker.ceph.com/issues/22464
>>> https://github.com/ceph/ceph/pull/24649
>>> 
>>> The corresponding PR works around it for main device reads (user data
>>> only!) but theoretically it might still happen
>>> 
>>> either for DB device or DB data at main device.
>>> 
>>> Can you observe any bluefs spillovers? Is there any correlation between
>>> failing OSDs and spillover presence, e.g. do failing OSDs always have a
>>> spillover, while OSDs without spillovers never face the issue?
>>> 
>>> To validate this hypothesis one can try to monitor/check (e.g. once a
>>> day for a week or something) "bluestore_reads_with_retries" counter over
>>> OSDs to learn if the issue is happening
>>> 
>>> in the system.  Non-zero values mean it's there for user data/main
>>> device and hence is likely to happen for DB ones as well (which doesn't
>>> have any workaround yet).
>> OK i checked bluestore_reads_with_retries on 360 osds but all of them say 0.
>> 
>> 
>>> Additionally you might want to monitor memory usage as the above
>>> mentioned PR denotes high memory pressure as potential trigger for these
>>> read errors. So if such pressure happens the hypothesis becomes more valid.
>> We already monitor this heavily and have around 10GB of memory per OSD. Also,
>> none of those machines show any IO pressure at all.
>> 
>> All hosts show a constant rate of around 38GB to 45GB mem available in
>> /proc/meminfo.
>> 
>> Stefan
>> 
>>> Thanks,
>>> 
>>> Igor
>>> 
>>> PS. Everything above is rather a speculation for now. Available
>>> information is definitely not enough for extensive troubleshooting the
>>> cases which happen that rarely.
>>> 
>>> You might want to start collecting failure-related information
>>> (including but not limited to failure logs, perf counter dumps, system
>>> resource reports etc) for future analysis.
>>> 
>>> 
>>> 
>>> On 1/16/2020 11:58 PM, Stefan Priebe - Profihost AG wrote:
 Hi Igor,
 
 answers inline.
 
 On 16.01.20 21:34, Igor Fedotov wrote:
> you may want to run fsck against failing OSDs. Hopefully it will shed
> some light.
 fsck just says everything fine:
 
 # ceph-bluestore-tool --command fsck --path /var/lib/ceph/osd/ceph-27/
 fsck success
 
 
> Also wondering if OSD is able to recover (startup and proceed working)
> after facing the issue?
 no recovery needed. It just runs fine after restarting.
 
> If so do you have any one which failed multiple times? Do you have logs
> for these occurrences?
 Maybe, but there are most probably weeks or months between those failures
 - most probably the logs are already deleted.
 
> Also please note that patch you mentioned doesn't fix previous issues
> (i.e. duplicate allocations); it only prevents new ones.
> 
> But fsck should show them if any...
 None showed.
 
 Stefan
 
> Thanks,
> 
> Igor
> 
> 
> 
> On 1/16/2020 10:04 PM, Stefan Priebe - Profihost AG wrote:
>> Hi Igor,
>> 
>> ouch sorry. Here we go:
>> 
>>  -1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb:
>> submit_transaction error: Corruption: block checksum mismatch code = 2
>> Rocksdb transaction:
>> Put( Prefix = M key =
>> 0x0402'.OBJ_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'
>> 
>> 
>> Value size = 97)
>> Put( Prefix = M key =
>> 0x0402'.MAP_000BB85C_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-19 Thread Igor Fedotov

So the intermediate summary is:

Any OSD in the cluster can experience interim RocksDB checksum failure. 
Which isn't present after OSD restart.


No HW issues observed, no persistent artifacts (except OSD log) afterwards.

And looks like the issue is rather specific to the cluster as no similar 
reports from other users seem to be present.



Sorry, I'm out of ideas other than collecting all the failure logs and
trying to find something common in them. Maybe this will shed some light.


BTW, from my experience it might make sense to inspect the OSD log prior to
the failure (any error messages and/or prior restarts, etc); sometimes this
provides some hints.
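
For example, a quick check for prior restarts and error messages, assuming
systemd-managed OSDs and default log locations (both assumptions):

# restarts of the daemon over the last two weeks
journalctl -u ceph-osd@27 --since "2 weeks ago" | grep -i 'started ceph object storage daemon'
# startup banners in the OSD log also mark restarts
grep -c 'ceph version' /var/log/ceph/ceph-osd.27.log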



Thanks,

Igor


On 1/17/2020 2:30 PM, Stefan Priebe - Profihost AG wrote:

HI Igor,

On 17.01.20 12:10, Igor Fedotov wrote:

hmmm..

Just in case - suggest to check H/W errors with dmesg.

this happens on around 80 nodes - I don't expect all of them to have
unnoticed hw errors. Also, all of them are monitored - no dmesg output
contains any errors.


Also there are some (not very much though) chances this is another
incarnation of the following bug:
https://tracker.ceph.com/issues/22464
https://github.com/ceph/ceph/pull/24649

The corresponding PR works around it for main device reads (user data
only!) but theoretically it might still happen

either for DB device or DB data at main device.

Can you observe any bluefs spillovers? Is there any correlation between
failing OSDs and spillover presence, e.g. do failing OSDs always have a
spillover, while OSDs without spillovers never face the issue?

To validate this hypothesis one can try to monitor/check (e.g. once a
day for a week or something) "bluestore_reads_with_retries" counter over
OSDs to learn if the issue is happening

in the system.  Non-zero values mean it's there for user data/main
device and hence is likely to happen for DB ones as well (which doesn't
have any workaround yet).

OK i checked bluestore_reads_with_retries on 360 osds but all of them say 0.



Additionally you might want to monitor memory usage as the above
mentioned PR denotes high memory pressure as potential trigger for these
read errors. So if such pressure happens the hypothesis becomes more valid.

We already monitor this heavily and have around 10GB of memory per OSD. Also,
none of those machines show any IO pressure at all.

All hosts show a constant rate of around 38GB to 45GB mem available in
/proc/meminfo.

Stefan


Thanks,

Igor

PS. Everything above is rather a speculation for now. Available
information is definitely not enough for extensive troubleshooting the
cases which happen that rarely.

You might want to start collecting failure-related information
(including but not limited to failure logs, perf counter dumps, system
resource reports etc) for future analysis.



On 1/16/2020 11:58 PM, Stefan Priebe - Profihost AG wrote:

Hi Igor,

answers inline.

On 16.01.20 21:34, Igor Fedotov wrote:

you may want to run fsck against failing OSDs. Hopefully it will shed
some light.

fsck just says everything fine:

# ceph-bluestore-tool --command fsck --path /var/lib/ceph/osd/ceph-27/
fsck success



Also wondering if OSD is able to recover (startup and proceed working)
after facing the issue?

no recovery needed. It just runs fine after restarting.


If so do you have any one which failed multiple times? Do you have logs
for these occurrences?

Maybe, but there are most probably weeks or months between those failures
- most probably the logs are already deleted.


Also please note that patch you mentioned doesn't fix previous issues
(i.e. duplicate allocations); it only prevents new ones.

But fsck should show them if any...

None showed.

Stefan


Thanks,

Igor



On 1/16/2020 10:04 PM, Stefan Priebe - Profihost AG wrote:

Hi Igor,

ouch sorry. Here we go:

  -1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb:
submit_transaction error: Corruption: block checksum mismatch code = 2
Rocksdb transaction:
Put( Prefix = M key =
0x0402'.OBJ_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'


Value size = 97)
Put( Prefix = M key =
0x0402'.MAP_000BB85C_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'


Value size = 93)
Put( Prefix = M key =
0x0916'.823257.73922044' Value size = 196)
Put( Prefix = M key =
0x0916'.823257.73922045' Value size = 184)
Put( Prefix = M key = 0x0916'._info' Value size = 899)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f'x'


Value size = 418)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0003'x'


Value size = 474)
Put( Prefix = O key =

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-17 Thread Stefan Priebe - Profihost AG
HI Igor,

On 17.01.20 12:10, Igor Fedotov wrote:
> hmmm..
> 
> Just in case - suggest to check H/W errors with dmesg.

this happens on around 80 nodes - I don't expect all of them to have
unnoticed hw errors. Also, all of them are monitored - no dmesg output
contains any errors.

> Also there are some (not very much though) chances this is another
> incarnation of the following bug:
> https://tracker.ceph.com/issues/22464
> https://github.com/ceph/ceph/pull/24649
> 
> The corresponding PR works around it for main device reads (user data
> only!) but theoretically it might still happen
> 
> either for DB device or DB data at main device.
>
> Can you observe any bluefs spillovers? Is there any correlation between
> failing OSDs and spillover presence, e.g. do failing OSDs always have a
> spillover, while OSDs without spillovers never face the issue?
>
> To validate this hypothesis one can try to monitor/check (e.g. once a
> day for a week or something) "bluestore_reads_with_retries" counter over
> OSDs to learn if the issue is happening
> 
> in the system.  Non-zero values mean it's there for user data/main
> device and hence is likely to happen for DB ones as well (which doesn't
> have any workaround yet).

OK i checked bluestore_reads_with_retries on 360 osds but all of them say 0.


> Additionally you might want to monitor memory usage as the above
> mentioned PR denotes high memory pressure as potential trigger for these
> read errors. So if such pressure happens the hypothesis becomes more valid.

We already monitor this heavily and have around 10GB of memory per OSD. Also,
none of those machines show any IO pressure at all.

All hosts show a constant rate of around 38GB to 45GB mem available in
/proc/meminfo.

Stefan

> Thanks,
> 
> Igor
> 
> PS. Everything above is rather a speculation for now. Available
> information is definitely not enough for extensive troubleshooting the
> cases which happen that rarely.
> 
> You might want to start collecting failure-related information
> (including but not limited to failure logs, perf counter dumps, system
> resource reports etc) for future analysis.
> 
> 
> 
> On 1/16/2020 11:58 PM, Stefan Priebe - Profihost AG wrote:
>> Hi Igor,
>>
>> answers inline.
>>
>> On 16.01.20 21:34, Igor Fedotov wrote:
>>> you may want to run fsck against failing OSDs. Hopefully it will shed
>>> some light.
>> fsck just says everything fine:
>>
>> # ceph-bluestore-tool --command fsck --path /var/lib/ceph/osd/ceph-27/
>> fsck success
>>
>>
>>> Also wondering if OSD is able to recover (startup and proceed working)
>>> after facing the issue?
>> no recovery needed. It just runs fine after restarting.
>>
>>> If so do you have any one which failed multiple times? Do you have logs
>>> for these occurrences?
>> Maybe, but there are most probably weeks or months between those failures
>> - most probably the logs are already deleted.
>>
>>> Also please note that patch you mentioned doesn't fix previous issues
>>> (i.e. duplicate allocations); it only prevents new ones.
>>>
>>> But fsck should show them if any...
>> None showed.
>>
>> Stefan
>>
>>> Thanks,
>>>
>>> Igor
>>>
>>>
>>>
>>> On 1/16/2020 10:04 PM, Stefan Priebe - Profihost AG wrote:
 Hi Igor,

 ouch sorry. Here we go:

  -1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb:
 submit_transaction error: Corruption: block checksum mismatch code = 2
 Rocksdb transaction:
 Put( Prefix = M key =
 0x0402'.OBJ_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'


 Value size = 97)
 Put( Prefix = M key =
 0x0402'.MAP_000BB85C_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'


 Value size = 93)
 Put( Prefix = M key =
 0x0916'.823257.73922044' Value size = 196)
 Put( Prefix = M key =
 0x0916'.823257.73922045' Value size = 184)
 Put( Prefix = M key = 0x0916'._info' Value size = 899)
 Put( Prefix = O key =
 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f'x'


 Value size = 418)
 Put( Prefix = O key =
 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0003'x'


 Value size = 474)
 Put( Prefix = O key =
 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0007c000'x'


 Value size = 392)
 Put( Prefix = O key =
 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0009'x'


 Value size = 317)
 Put( Prefix = O key =
 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f000a'x'


Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-17 Thread Igor Fedotov

hmmm..

Just in case - suggest to check H/W errors with dmesg.
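
For example (just a sketch, the grep patterns and device name are only a
starting point):

dmesg -T | grep -iE 'i/o error|ata[0-9]+.*error|medium error|uncorrect|mce'
# SMART health of the devices backing the OSDs, e.g.:
smartctl -H /dev/sda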

Also there are some (not very much though) chances this is another 
incarnation of the following bug:


https://tracker.ceph.com/issues/22464

https://github.com/ceph/ceph/pull/24649

The corresponding PR works around it for main device reads (user data 
only!) but theoretically it might still happen


either for DB device or DB data at main device.

Can you observe any bluefs spillovers? Is there any correlation between
failing OSDs and spillover presence, e.g. do failing OSDs always have a
spillover, while OSDs without spillovers never face the issue?


To validate this hypothesis one can try to monitor/check (e.g. once a 
day for a week or something) "bluestore_reads_with_retries" counter over 
OSDs to learn if the issue is happening


in the system.  Non-zero values mean it's there for user data/main 
device and hence is likely to happen for DB ones as well (which doesn't 
have any workaround yet).
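
A daily check of both things could look roughly like this (the admin socket
path is the usual default, but treat it as an assumption):

for sock in /var/run/ceph/ceph-osd.*.asok; do
    echo "== $sock =="
    dump=$(ceph daemon "$sock" perf dump)
    # non-zero means the read-retry workaround from PR 24649 actually fired
    echo "$dump" | grep -o '"bluestore_reads_with_retries": [0-9]*'
    # non-zero slow_used_bytes suggests BlueFS/RocksDB spillover to the slow device
    echo "$dump" | grep -o '"slow_used_bytes": [0-9]*'
done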


Additionally you might want to monitor memory usage as the above 
mentioned PR denotes high memory pressure as potential trigger for these 
read errors. So if such pressure happens the hypothesis becomes more valid.
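
A trivial way to sample that over time, e.g. from cron (file name is just an
example):

date >> /var/log/ceph/mem-watch.log
grep MemAvailable /proc/meminfo >> /var/log/ceph/mem-watch.log
ps -C ceph-osd -o pid,rss,cmd >> /var/log/ceph/mem-watch.log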



Thanks,

Igor

PS. Everything above is rather a speculation for now. Available 
information is definitely not enough for extensive troubleshooting the 
cases which happen that rarely.


You might want to start collecting failure-related information 
(including but not limited to failure logs, perf counter dumps, system 
resource reports etc) for future analysis.
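
For instance, a small cron-driven collector along these lines (file names and
layout purely illustrative):

ts=$(date +%Y%m%d-%H%M)
for sock in /var/run/ceph/ceph-osd.*.asok; do
    # keep periodic perf counter snapshots for later comparison after a crash
    ceph daemon "$sock" perf dump > /var/log/ceph/$(basename "$sock" .asok).perf.$ts.json
done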




On 1/16/2020 11:58 PM, Stefan Priebe - Profihost AG wrote:

Hi Igor,

answers inline.

On 16.01.20 21:34, Igor Fedotov wrote:

you may want to run fsck against failing OSDs. Hopefully it will shed
some light.

fsck just says everything fine:

# ceph-bluestore-tool --command fsck --path /var/lib/ceph/osd/ceph-27/
fsck success



Also wondering if OSD is able to recover (startup and proceed working)
after facing the issue?

no recovery needed. It just runs fine after restarting.


If so do you have any one which failed multiple times? Do you have logs
for these occurrences?

Maybe, but there are most probably weeks or months between those failures
- most probably the logs are already deleted.


Also please note that patch you mentioned doesn't fix previous issues
(i.e. duplicate allocations); it only prevents new ones.

But fsck should show them if any...

None showed.

Stefan


Thanks,

Igor



On 1/16/2020 10:04 PM, Stefan Priebe - Profihost AG wrote:

Hi Igor,

ouch sorry. Here we go:

     -1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb:
submit_transaction error: Corruption: block checksum mismatch code = 2
Rocksdb transaction:
Put( Prefix = M key =
0x0402'.OBJ_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'

Value size = 97)
Put( Prefix = M key =
0x0402'.MAP_000BB85C_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'

Value size = 93)
Put( Prefix = M key =
0x0916'.823257.73922044' Value size = 196)
Put( Prefix = M key =
0x0916'.823257.73922045' Value size = 184)
Put( Prefix = M key = 0x0916'._info' Value size = 899)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f'x'

Value size = 418)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0003'x'

Value size = 474)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0007c000'x'

Value size = 392)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0009'x'

Value size = 317)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f000a'x'

Value size = 521)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f000f4000'x'

Value size = 558)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0013'x'

Value size = 649)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f00194000'x'

Value size = 449)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f001cc000'x'

Value size = 580)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0020'x'

Value size = 435)
Put( Prefix = O key =

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-16 Thread Stefan Priebe - Profihost AG
Hi Igor,

answers inline.

On 16.01.20 21:34, Igor Fedotov wrote:
> you may want to run fsck against failing OSDs. Hopefully it will shed
> some light.

fsck just says everything fine:

# ceph-bluestore-tool --command fsck --path /var/lib/ceph/osd/ceph-27/
fsck success


> Also wondering if OSD is able to recover (startup and proceed working)
> after facing the issue?

no recovery needed. It just runs fine after restarting.

> If so do you have any one which failed multiple times? Do you have logs
> for these occurrences?

Maybe, but there are most probably weeks or months between those failures
- most probably the logs are already deleted.

> Also please note that patch you mentioned doesn't fix previous issues
> (i.e. duplicate allocations); it only prevents new ones.
> 
> But fsck should show them if any...

None showed.

Stefan

> Thanks,
> 
> Igor
> 
> 
> 
> On 1/16/2020 10:04 PM, Stefan Priebe - Profihost AG wrote:
>> Hi Igor,
>>
>> ouch sorry. Here we go:
>>
>>     -1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb:
>> submit_transaction error: Corruption: block checksum mismatch code = 2
>> Rocksdb transaction:
>> Put( Prefix = M key =
>> 0x0402'.OBJ_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'
>>
>> Value size = 97)
>> Put( Prefix = M key =
>> 0x0402'.MAP_000BB85C_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'
>>
>> Value size = 93)
>> Put( Prefix = M key =
>> 0x0916'.823257.73922044' Value size = 196)
>> Put( Prefix = M key =
>> 0x0916'.823257.73922045' Value size = 184)
>> Put( Prefix = M key = 0x0916'._info' Value size = 899)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f'x'
>>
>> Value size = 418)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0003'x'
>>
>> Value size = 474)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0007c000'x'
>>
>> Value size = 392)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0009'x'
>>
>> Value size = 317)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f000a'x'
>>
>> Value size = 521)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f000f4000'x'
>>
>> Value size = 558)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0013'x'
>>
>> Value size = 649)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f00194000'x'
>>
>> Value size = 449)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f001cc000'x'
>>
>> Value size = 580)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0020'x'
>>
>> Value size = 435)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0024'x'
>>
>> Value size = 569)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0029'x'
>>
>> Value size = 465)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f002e'x'
>>
>> Value size = 710)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0030'x'
>>
>> Value size = 599)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0036c000'x'
>>
>> Value size = 372)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f003a6000'x'
>>
>> Value size = 130)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f003b4000'x'
>>
>> Value size = 540)
>> Put( Prefix = O key =
>> 0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f003fc000'x'
>>
>> Value size = 47)
>> Put( Prefix = O key =
>> 

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-16 Thread Igor Fedotov

Stefan,

you may want to run fsck against failing OSDs. Hopefully it will shed 
some light.
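
With the OSD stopped, something along these lines (a sketch; the --deep flag
also reads all object data and takes much longer, and option naming may vary
by release, so treat it as an assumption):

systemctl stop ceph-osd@27
ceph-bluestore-tool --command fsck --path /var/lib/ceph/osd/ceph-27/
# optional deep fsck, verifying object data checksums as well
ceph-bluestore-tool --command fsck --deep 1 --path /var/lib/ceph/osd/ceph-27/
systemctl start ceph-osd@27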


Also wondering if OSD is able to recover (startup and proceed working) 
after facing the issue?


If so do you have any one which failed multiple times? Do you have logs 
for these occurrences?



Also please note that patch you mentioned doesn't fix previous issues 
(i.e. duplicate allocations), it prevents from new ones only.


But fsck should show them if any...


Thanks,

Igor



On 1/16/2020 10:04 PM, Stefan Priebe - Profihost AG wrote:

Hi Igor,

ouch sorry. Here we go:

-1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb:
submit_transaction error: Corruption: block checksum mismatch code = 2
Rocksdb transaction:
Put( Prefix = M key =
0x0402'.OBJ_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'
Value size = 97)
Put( Prefix = M key =
0x0402'.MAP_000BB85C_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'
Value size = 93)
Put( Prefix = M key =
0x0916'.823257.73922044' Value size = 196)
Put( Prefix = M key =
0x0916'.823257.73922045' Value size = 184)
Put( Prefix = M key = 0x0916'._info' Value size = 899)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f'x'
Value size = 418)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0003'x'
Value size = 474)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0007c000'x'
Value size = 392)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0009'x'
Value size = 317)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f000a'x'
Value size = 521)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f000f4000'x'
Value size = 558)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0013'x'
Value size = 649)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f00194000'x'
Value size = 449)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f001cc000'x'
Value size = 580)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0020'x'
Value size = 435)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0024'x'
Value size = 569)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0029'x'
Value size = 465)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f002e'x'
Value size = 710)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0030'x'
Value size = 599)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0036c000'x'
Value size = 372)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f003a6000'x'
Value size = 130)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f003b4000'x'
Value size = 540)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f003fc000'x'
Value size = 47)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c'o'
Value size = 1731)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0xfffe6f0004'x'
Value size = 675)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0xfffe6f0008'x'
Value size = 395)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0xfffe'o'
Value size = 1328)
Put( Prefix = X key = 0x18a38deb Value size = 14)
Put( Prefix = X key = 0x18a38dea Value size = 14)
Put( 

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-16 Thread Stefan Priebe - Profihost AG
Hi Igor,

ouch sorry. Here we go:

   -1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb:
submit_transaction error: Corruption: block checksum mismatch code = 2
Rocksdb transaction:
Put( Prefix = M key =
0x0402'.OBJ_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'
Value size = 97)
Put( Prefix = M key =
0x0402'.MAP_000BB85C_0002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e1f2e..'
Value size = 93)
Put( Prefix = M key =
0x0916'.823257.73922044' Value size = 196)
Put( Prefix = M key =
0x0916'.823257.73922045' Value size = 184)
Put( Prefix = M key = 0x0916'._info' Value size = 899)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f'x'
Value size = 418)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0003'x'
Value size = 474)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0007c000'x'
Value size = 392)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0009'x'
Value size = 317)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f000a'x'
Value size = 521)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f000f4000'x'
Value size = 558)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0013'x'
Value size = 649)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f00194000'x'
Value size = 449)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f001cc000'x'
Value size = 580)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0020'x'
Value size = 435)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0024'x'
Value size = 569)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0029'x'
Value size = 465)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f002e'x'
Value size = 710)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0030'x'
Value size = 599)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f0036c000'x'
Value size = 372)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f003a6000'x'
Value size = 130)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f003b4000'x'
Value size = 540)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c6f003fc000'x'
Value size = 47)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0x000bb85c'o'
Value size = 1731)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0xfffe6f0004'x'
Value size = 675)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0xfffe6f0008'x'
Value size = 395)
Put( Prefix = O key =
0x7f80029acdfb05217262'd_data.3e8eac6b8b4567.1f2e!='0xfffe'o'
Value size = 1328)
Put( Prefix = X key = 0x18a38deb Value size = 14)
Put( Prefix = X key = 0x18a38dea Value size = 14)
Put( Prefix = X key = 0x0d7a035b Value size = 14)
Put( Prefix = X key = 0x0d7a035c Value size = 14)
Put( Prefix = X key = 0x0d7a0355 Value size = 14)
Put( Prefix = X key = 0x0d7a0356 Value size = 17)
Put( Prefix = X key = 0x1a54f6e4 Value size = 14)
Put( Prefix = X key = 0x1b1c061e Value size = 14)
Put( Prefix = X key = 0x0d7a038f Value size = 14)
Put( Prefix = X key = 0x0d7a0389 Value size = 14)
Put( Prefix = X key = 0x0d7a0358 Value size = 14)
Put( Prefix = X key = 

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-16 Thread Igor Fedotov

Hi Stefan,

would you please share a log snippet from just prior to the assertion? Looks like
RocksDB is failing during transaction submission...



Thanks,

Igor

On 1/16/2020 11:56 AM, Stefan Priebe - Profihost AG wrote:

Hello,

does anybody know a fix for this ASSERT / crash?

2020-01-16 02:02:31.316394 7f8c3f5ab700 -1
/build/ceph/src/os/bluestore/BlueStore.cc: In function 'void
BlueStore::_kv_sync_thread()' thread 7f8c3f5ab700 time 2020-01-16
02:02:31.304993
/build/ceph/src/os/bluestore/BlueStore.cc: 8808: FAILED assert(r == 0)

  ceph version 12.2.12-11-gd3eae83543
(d3eae83543bffc0fc6c43823feb637fa851b6213) luminous (stable)
  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x102) [0x55e6df9d9232]
  2: (BlueStore::_kv_sync_thread()+0x24c5) [0x55e6df8614b5]
  3: (BlueStore::KVSyncThread::entry()+0xd) [0x55e6df8a208d]
  4: (()+0x7494) [0x7f8c50190494]
  5: (clone()+0x3f) [0x7f8c4f217acf]

all bluestore OSDs are randomly crashing sometimes (once a week).

Greets,
Stefan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
