debug_osd that is... :)

On Tue, Mar 6, 2018 at 7:10 PM, Brad Hubbard <bhubb...@redhat.com> wrote:
> On Tue, Mar 6, 2018 at 5:26 PM, Marco Baldini - H.S. Amiata <mbald...@hsamiata.it> wrote:
>> Hi
>>
>> I monitor dmesg on each of the 3 nodes, and no hardware issue is
>> reported. The problem happens with various different OSDs on different
>> nodes, so for me it is clear it's not a hardware problem.
>
> If you have osd_debug set to 25 or greater when you run the deep scrub
> you should get more information about the nature of the read error in
> the ReplicatedBackend::be_deep_scrub() function (assuming this is a
> replicated pool).
>
> This may create large logs, so watch that they don't exhaust storage.
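A minimal sketch of the workflow Brad suggests, assuming a replicated pool; the PG and OSD IDs below are placeholders taken from examples later in this thread:

    # Raise the debug level on the primary OSD of the suspect PG
    ceph tell osd.4 injectargs '--debug-osd 25/25'
    # Trigger a deep scrub of the inconsistent PG and let it run
    ceph pg deep-scrub 13.65
    # Look for the detailed read-error lines from be_deep_scrub()
    grep 'read error' /var/log/ceph/ceph-osd.4.log
    # Lower the debug level again so the logs don't exhaust storage
    ceph tell osd.4 injectargs '--debug-osd 0/0'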
>> Thanks for reply
>>
>> On 05/03/2018 21:45, Vladimir Prokofev wrote:
>>
>> > always solved by ceph pg repair <PG>
>> That doesn't necessarily mean that there's no hardware issue. In my
>> case repair also worked fine and returned the cluster to the OK state
>> every time, but in time the faulty disk failed another scrub operation,
>> and this repeated multiple times before we replaced that disk.
>> One last thing to look into is dmesg on your OSD nodes. If there's a
>> hardware read error it will be logged in dmesg.
>>
>> 2018-03-05 18:26 GMT+03:00 Marco Baldini - H.S. Amiata <mbald...@hsamiata.it>:
>>
>>> Hi and thanks for reply
>>>
>>> The OSDs are all healthy; in fact after a ceph pg repair <PG> the ceph
>>> health is back to OK and in the OSD log I see <PG> repair ok, 0 fixed.
>>>
>>> The SMART data of the 3 OSDs seems fine.
>>>
>>> OSD.5
>>>
>>> # ceph-disk list | grep osd.5
>>> /dev/sdd1 ceph data, active, cluster ceph, osd.5, block /dev/sdd2
>>>
>>> # smartctl -a /dev/sdd
>>> smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
>>> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
>>>
>>> === START OF INFORMATION SECTION ===
>>> Model Family:     Seagate Barracuda 7200.14 (AF)
>>> Device Model:     ST1000DM003-1SB10C
>>> Serial Number:    Z9A1MA1V
>>> LU WWN Device Id: 5 000c50 090c7028b
>>> Firmware Version: CC43
>>> User Capacity:    1,000,204,886,016 bytes [1.00 TB]
>>> Sector Sizes:     512 bytes logical, 4096 bytes physical
>>> Rotation Rate:    7200 rpm
>>> Form Factor:      3.5 inches
>>> Device is:        In smartctl database [for details use: -P show]
>>> ATA Version is:   ATA8-ACS T13/1699-D revision 4
>>> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>>> Local Time is:    Mon Mar 5 16:17:22 2018 CET
>>> SMART support is: Available - device has SMART capability.
>>> SMART support is: Enabled
>>>
>>> === START OF READ SMART DATA SECTION ===
>>> SMART overall-health self-assessment test result: PASSED
>>>
>>> General SMART Values:
>>> Offline data collection status:  (0x82) Offline data collection activity
>>>                                         was completed without error.
>>>                                         Auto Offline Data Collection: Enabled.
>>> Self-test execution status:      (   0) The previous self-test routine completed
>>>                                         without error or no self-test has ever
>>>                                         been run.
>>> Total time to complete Offline
>>> data collection:                 (   0) seconds.
>>> Offline data collection
>>> capabilities:                    (0x7b) SMART execute Offline immediate.
>>>                                         Auto Offline data collection on/off support.
>>>                                         Suspend Offline collection upon new command.
>>>                                         Offline surface scan supported.
>>>                                         Self-test supported.
>>>                                         Conveyance Self-test supported.
>>>                                         Selective Self-test supported.
>>> SMART capabilities:            (0x0003) Saves SMART data before entering
>>>                                         power-saving mode.
>>>                                         Supports SMART auto save timer.
>>> Error logging capability:        (0x01) Error logging supported.
>>>                                         General Purpose Logging supported.
>>> Short self-test routine
>>> recommended polling time:        (   1) minutes.
>>> Extended self-test routine
>>> recommended polling time:        ( 109) minutes.
>>> Conveyance self-test routine
>>> recommended polling time:        (   2) minutes.
>>> SCT capabilities:              (0x1085) SCT Status supported.
>>>
>>> SMART Attributes Data Structure revision number: 10
>>> Vendor Specific SMART Attributes with Thresholds:
>>> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
>>>   1 Raw_Read_Error_Rate     0x000f   082   063   006    Pre-fail Always  -           193297722
>>>   3 Spin_Up_Time            0x0003   097   097   000    Pre-fail Always  -           0
>>>   4 Start_Stop_Count        0x0032   100   100   020    Old_age  Always  -           60
>>>   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail Always  -           0
>>>   7 Seek_Error_Rate         0x000f   091   060   045    Pre-fail Always  -           1451132477
>>>   9 Power_On_Hours          0x0032   085   085   000    Old_age  Always  -           13283
>>>  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail Always  -           0
>>>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age  Always  -           61
>>> 183 Runtime_Bad_Block       0x0032   100   100   000    Old_age  Always  -           0
>>> 184 End-to-End_Error        0x0032   100   100   099    Old_age  Always  -           0
>>> 187 Reported_Uncorrect      0x0032   100   100   000    Old_age  Always  -           0
>>> 188 Command_Timeout         0x0032   100   100   000    Old_age  Always  -           0 0 0
>>> 189 High_Fly_Writes         0x003a   086   086   000    Old_age  Always  -           14
>>> 190 Airflow_Temperature_Cel 0x0022   071   055   040    Old_age  Always  -           29 (Min/Max 23/32)
>>> 193 Load_Cycle_Count        0x0032   100   100   000    Old_age  Always  -           607
>>> 194 Temperature_Celsius     0x0022   029   014   000    Old_age  Always  -           29 (0 14 0 0 0)
>>> 195 Hardware_ECC_Recovered  0x001a   004   001   000    Old_age  Always  -           193297722
>>> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age  Always  -           0
>>> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age  Offline -           0
>>> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age  Always  -           0
>>> 240 Head_Flying_Hours       0x0000   100   253   000    Old_age  Offline -           13211h+23m+08.363s
>>> 241 Total_LBAs_Written      0x0000   100   253   000    Old_age  Offline -           53042120064
>>> 242 Total_LBAs_Read         0x0000   100   253   000    Old_age  Offline -           170788993187
>>>
>>> OSD.4
>>>
>>> # ceph-disk list | grep osd.4
>>> /dev/sdc1 ceph data, active, cluster ceph, osd.4, block /dev/sdc2
>>>
>>> # smartctl -a /dev/sdc
>>> smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
>>> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
>>>
>>> === START OF INFORMATION SECTION ===
>>> Model Family:     Seagate Barracuda 7200.14 (AF)
>>> Device Model:     ST1000DM003-1SB10C
>>> Serial Number:    Z9A1M1BW
>>> LU WWN Device Id: 5 000c50 090c78d27
>>> Firmware Version: CC43
>>> User Capacity:    1,000,204,886,016 bytes [1.00 TB]
>>> Sector Sizes:     512 bytes logical, 4096 bytes physical
>>> Rotation Rate:    7200 rpm
>>> Form Factor:      3.5 inches
>>> Device is:        In smartctl database [for details use: -P show]
>>> ATA Version is:   ATA8-ACS T13/1699-D revision 4
>>> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>>> Local Time is:    Mon Mar 5 16:20:46 2018 CET
>>> SMART support is: Available - device has SMART capability.
>>> SMART support is: Enabled
>>>
>>> === START OF READ SMART DATA SECTION ===
>>> SMART overall-health self-assessment test result: PASSED
>>>
>>> General SMART Values:
>>> Offline data collection status:  (0x82) Offline data collection activity
>>>                                         was completed without error.
>>>                                         Auto Offline Data Collection: Enabled.
>>> Self-test execution status:      (   0) The previous self-test routine completed
>>>                                         without error or no self-test has ever
>>>                                         been run.
>>> Total time to complete Offline
>>> data collection:                 (   0) seconds.
>>> Offline data collection
>>> capabilities:                    (0x7b) SMART execute Offline immediate.
>>>                                         Auto Offline data collection on/off support.
>>>                                         Suspend Offline collection upon new command.
>>>                                         Offline surface scan supported.
>>>                                         Self-test supported.
>>>                                         Conveyance Self-test supported.
>>>                                         Selective Self-test supported.
>>> SMART capabilities:            (0x0003) Saves SMART data before entering
>>>                                         power-saving mode.
>>>                                         Supports SMART auto save timer.
>>> Error logging capability:        (0x01) Error logging supported.
>>>                                         General Purpose Logging supported.
>>> Short self-test routine
>>> recommended polling time:        (   1) minutes.
>>> Extended self-test routine
>>> recommended polling time:        ( 109) minutes.
>>> Conveyance self-test routine
>>> recommended polling time:        (   2) minutes.
>>> SCT capabilities:              (0x1085) SCT Status supported.
>>>
>>> SMART Attributes Data Structure revision number: 10
>>> Vendor Specific SMART Attributes with Thresholds:
>>> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
>>>   1 Raw_Read_Error_Rate     0x000f   082   063   006    Pre-fail Always  -           194906537
>>>   3 Spin_Up_Time            0x0003   097   097   000    Pre-fail Always  -           0
>>>   4 Start_Stop_Count        0x0032   100   100   020    Old_age  Always  -           64
>>>   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail Always  -           0
>>>   7 Seek_Error_Rate         0x000f   091   060   045    Pre-fail Always  -           1485899434
>>>   9 Power_On_Hours          0x0032   085   085   000    Old_age  Always  -           13390
>>>  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail Always  -           0
>>>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age  Always  -           65
>>> 183 Runtime_Bad_Block       0x0032   100   100   000    Old_age  Always  -           0
>>> 184 End-to-End_Error        0x0032   100   100   099    Old_age  Always  -           0
>>> 187 Reported_Uncorrect      0x0032   100   100   000    Old_age  Always  -           0
>>> 188 Command_Timeout         0x0032   100   100   000    Old_age  Always  -           0 0 0
>>> 189 High_Fly_Writes         0x003a   095   095   000    Old_age  Always  -           5
>>> 190 Airflow_Temperature_Cel 0x0022   074   051   040    Old_age  Always  -           26 (Min/Max 19/29)
>>> 193 Load_Cycle_Count        0x0032   100   100   000    Old_age  Always  -           616
>>> 194 Temperature_Celsius     0x0022   026   014   000    Old_age  Always  -           26 (0 14 0 0 0)
>>> 195 Hardware_ECC_Recovered  0x001a   004   001   000    Old_age  Always  -           194906537
>>> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age  Always  -           0
>>> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age  Offline -           0
>>> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age  Always  -           0
>>> 240 Head_Flying_Hours       0x0000   100   253   000    Old_age  Offline -           13315h+20m+30.974s
>>> 241 Total_LBAs_Written      0x0000   100   253   000    Old_age  Offline -           52137467719
>>> 242 Total_LBAs_Read         0x0000   100   253   000    Old_age  Offline -           177227508503
>>>
>>> OSD.8
>>>
>>> # ceph-disk list | grep osd.8
>>> /dev/sda1 ceph data, active, cluster ceph, osd.8, block /dev/sda2
>>>
>>> # smartctl -a /dev/sda
>>> smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
>>> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
>>>
>>> === START OF INFORMATION SECTION ===
>>> Model Family:     Seagate Barracuda 7200.14 (AF)
>>> Device Model:     ST1000DM003-1SB10C
>>> Serial Number:    Z9A2BEF2
>>> LU WWN Device Id: 5 000c50 0910f5427
>>> Firmware Version: CC43
>>> User Capacity:    1,000,203,804,160 bytes [1.00 TB]
>>> Sector Sizes:     512 bytes logical, 4096 bytes physical
>>> Rotation Rate:    7200 rpm
>>> Form Factor:      3.5 inches
>>> Device is:        In smartctl database [for details use: -P show]
>>> ATA Version is:   ATA8-ACS T13/1699-D revision 4
>>> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>>> Local Time is:    Mon Mar 5 16:22:47 2018 CET
>>> SMART support is: Available - device has SMART capability.
>>> SMART support is: Enabled
>>>
>>> === START OF READ SMART DATA SECTION ===
>>> SMART overall-health self-assessment test result: PASSED
>>>
>>> General SMART Values:
>>> Offline data collection status:  (0x82) Offline data collection activity
>>>                                         was completed without error.
>>>                                         Auto Offline Data Collection: Enabled.
>>> Self-test execution status:      (   0) The previous self-test routine completed
>>>                                         without error or no self-test has ever
>>>                                         been run.
>>> Total time to complete Offline
>>> data collection:                 (   0) seconds.
>>> Offline data collection
>>> capabilities:                    (0x7b) SMART execute Offline immediate.
>>>                                         Auto Offline data collection on/off support.
>>>                                         Suspend Offline collection upon new command.
>>>                                         Offline surface scan supported.
>>>                                         Self-test supported.
>>>                                         Conveyance Self-test supported.
>>>                                         Selective Self-test supported.
>>> SMART capabilities:            (0x0003) Saves SMART data before entering
>>>                                         power-saving mode.
>>>                                         Supports SMART auto save timer.
>>> Error logging capability:        (0x01) Error logging supported.
>>>                                         General Purpose Logging supported.
>>> Short self-test routine
>>> recommended polling time:        (   1) minutes.
>>> Extended self-test routine
>>> recommended polling time:        ( 110) minutes.
>>> Conveyance self-test routine
>>> recommended polling time:        (   2) minutes.
>>> SCT capabilities:              (0x1085) SCT Status supported.
>>>
>>> SMART Attributes Data Structure revision number: 10
>>> Vendor Specific SMART Attributes with Thresholds:
>>> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
>>>   1 Raw_Read_Error_Rate     0x000f   083   063   006    Pre-fail Always  -           224621855
>>>   3 Spin_Up_Time            0x0003   097   097   000    Pre-fail Always  -           0
>>>   4 Start_Stop_Count        0x0032   100   100   020    Old_age  Always  -           275
>>>   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail Always  -           0
>>>   7 Seek_Error_Rate         0x000f   081   060   045    Pre-fail Always  -           149383284
>>>   9 Power_On_Hours          0x0032   093   093   000    Old_age  Always  -           6210
>>>  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail Always  -           0
>>>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age  Always  -           265
>>> 183 Runtime_Bad_Block       0x0032   100   100   000    Old_age  Always  -           0
>>> 184 End-to-End_Error        0x0032   100   100   099    Old_age  Always  -           0
>>> 187 Reported_Uncorrect      0x0032   100   100   000    Old_age  Always  -           0
>>> 188 Command_Timeout         0x0032   100   100   000    Old_age  Always  -           0 0 0
>>> 189 High_Fly_Writes         0x003a   098   098   000    Old_age  Always  -           2
>>> 190 Airflow_Temperature_Cel 0x0022   069   058   040    Old_age  Always  -           31 (Min/Max 21/35)
>>> 193 Load_Cycle_Count        0x0032   100   100   000    Old_age  Always  -           516
>>> 194 Temperature_Celsius     0x0022   031   017   000    Old_age  Always  -           31 (0 17 0 0 0)
>>> 195 Hardware_ECC_Recovered  0x001a   005   001   000    Old_age  Always  -           224621855
>>> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age  Always  -           0
>>> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age  Offline -           0
>>> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age  Always  -           0
>>> 240 Head_Flying_Hours       0x0000   100   253   000    Old_age  Offline -           6154h+03m+35.126s
>>> 241 Total_LBAs_Written      0x0000   100   253   000    Old_age  Offline -           24333847321
>>> 242 Total_LBAs_Read         0x0000   100   253   000    Old_age  Offline -           50261005553
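As an aside, the same overall-health check can be pulled for every ceph-disk data device in one pass; a rough sketch, assuming the /dev/sdX naming shown above:

    # Query SMART health for each disk backing a ceph data partition
    for part in $(ceph-disk list 2>/dev/null | awk '/ceph data/ {print $1}'); do
        disk=${part%[0-9]*}        # /dev/sdd1 -> /dev/sdd (sdX naming assumed)
        echo "== $disk =="
        smartctl -H "$disk" | grep -i 'test result'
    done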
>>>
>>> However it's not only these 3 OSDs that have PGs with errors; these
>>> are only the most recent. In the last 3 months I have often had
>>> OSD_SCRUB_ERRORS on various OSDs, always solved by ceph pg repair <PG>.
>>> I don't think it's a hardware issue.
>>>
>>> On 05/03/2018 13:40, Vladimir Prokofev wrote:
>>>
>>> > candidate had a read error
>>> speaks for itself - while scrubbing it couldn't read data.
>>> I had a similar issue, and it was just an OSD dying - errors and
>>> relocated sectors in SMART, so I just replaced the disk. But in your
>>> case it seems that the errors are on different OSDs? Are your OSDs all
>>> healthy?
>>> You can use this command to see some details:
>>> rados list-inconsistent-obj <pg.id> --format=json-pretty
>>> pg.id is the PG that's reporting as inconsistent. My guess is that
>>> you'll see read errors in this output, with the number of the OSD that
>>> encountered the error. After that you have to check that OSD's health -
>>> SMART details, etc.
>>> It's not always the disk itself causing problems - for example, we had
>>> read errors because of a faulty backplane interface in a server;
>>> changing the chassis resolved the issue.
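To make Vladimir's suggestion concrete, a hedged example against one of the PGs reported later in this thread (the PG ID is a placeholder):

    # Identify the inconsistent PG, then ask which object/shard failed
    ceph health detail | grep inconsistent
    rados list-inconsistent-obj 13.65 --format=json-pretty

On Luminous, the per-shard "errors" arrays in the JSON should name the OSD that hit the problem, e.g. with a "read_error" entry.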
>>> 2018-03-05 14:21 GMT+03:00 Marco Baldini - H.S. Amiata <mbald...@hsamiata.it>:
>>>
>>>> Hi
>>>>
>>>> After some days with debug_osd 5/5 I found [ERR] on different days,
>>>> different PGs, different OSDs, different hosts. This is what I get in
>>>> the OSD logs:
>>>>
>>>> OSD.5 (host 3)
>>>> 2018-03-01 20:30:02.702269 7fdf4d515700  2 osd.5 pg_epoch: 16486 pg[9.1c( v 16486'51798 (16431'50251,16486'51798] local-lis/les=16474/16475 n=3629 ec=1477/1477 lis/c 16474/16474 les/c/f 16475/16477/0 16474/16474/16474) [5,6] r=0 lpr=16474 crt=16486'51798 lcod 16486'51797 mlcod 16486'51797 active+clean+scrubbing+deep] 9.1c shard 6: soid 9:3b157c56:::rbd_data.1526386b8b4567.0000000000001761:head candidate had a read error
>>>> 2018-03-01 20:30:02.702278 7fdf4d515700 -1 log_channel(cluster) log [ERR] : 9.1c shard 6: soid 9:3b157c56:::rbd_data.1526386b8b4567.0000000000001761:head candidate had a read error
>>>>
>>>> OSD.4 (host 3)
>>>> 2018-02-28 00:03:33.458558 7f112cf76700 -1 log_channel(cluster) log [ERR] : 13.65 shard 2: soid 13:a719ecdf:::rbd_data.5f65056b8b4567.000000000000f8eb:head candidate had a read error
>>>>
>>>> OSD.8 (host 2)
>>>> 2018-02-27 23:55:15.100084 7f4dd0816700 -1 log_channel(cluster) log [ERR] : 14.31 shard 1: soid 14:8cc6cd37:::rbd_data.30b15b6b8b4567.00000000000081a1:head candidate had a read error
>>>>
>>>> I don't know what this error means, and as always a ceph pg repair
>>>> fixes it. I don't think this is normal.
>>>>
>>>> Ideas?
>>>>
>>>> Thanks
>>>>
>>>> On 28/02/2018 14:48, Marco Baldini - H.S. Amiata wrote:
>>>>
>>>> Hi
>>>>
>>>> I read the bugtracker issue and it seems a lot like my problem, even
>>>> if I can't check the reported checksum because I don't have it in my
>>>> logs, perhaps because of debug osd = 0/0 in ceph.conf.
>>>>
>>>> I just raised the OSD log level:
>>>>
>>>> ceph tell osd.* injectargs --debug-osd 5/5
>>>>
>>>> I'll check the OSD logs in the next days...
>>>>
>>>> Thanks
>>>>
>>>> On 28/02/2018 11:59, Paul Emmerich wrote:
>>>>
>>>> Hi,
>>>>
>>>> might be http://tracker.ceph.com/issues/22464
>>>>
>>>> Can you check the OSD log file to see if the reported checksum
>>>> is 0x6706be76?
>>>>
>>>> Paul
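One quick way to run the check Paul asks for across all OSDs on a node; a sketch, assuming default log locations (the checksum line only appears if the OSD debug level was high enough when the scrub ran):

    # List any local OSD logs that contain the checksum from tracker issue 22464
    grep -l '0x6706be76' /var/log/ceph/ceph-osd.*.log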
>>>> On 28.02.2018 at 11:43, Marco Baldini - H.S. Amiata <mbald...@hsamiata.it> wrote:
>>>>
>>>> Hello
>>>>
>>>> I have a little ceph cluster with 3 nodes, each with 3x1TB HDD and
>>>> 1x240GB SSD. I created this cluster after the Luminous release, so all
>>>> OSDs are Bluestore. In my crush map I have two rules, one targeting
>>>> the SSDs and one targeting the HDDs. I have 4 pools, one using the SSD
>>>> rule and the others using the HDD rule; three pools are size=3
>>>> min_size=2, one is size=2 min_size=1 (this one has content that is OK
>>>> to lose).
>>>>
>>>> In the last 3 months I have been having a strange random problem. I
>>>> planned my osd scrubs during the night (osd scrub begin hour = 20,
>>>> osd scrub end hour = 7) when the office is closed so there is low
>>>> impact on the users. Some mornings, when I check the cluster health,
>>>> I find:
>>>>
>>>> HEALTH_ERR X scrub errors; Possible data damage: Y pgs inconsistent
>>>> OSD_SCRUB_ERRORS X scrub errors
>>>> PG_DAMAGED Possible data damage: Y pgs inconsistent
>>>>
>>>> X and Y are sometimes 1, sometimes 2.
>>>>
>>>> I issue a ceph health detail, check the damaged PGs, and run a ceph
>>>> pg repair for the damaged PGs; I get:
>>>>
>>>> instructing pg PG on osd.N to repair
>>>>
>>>> The PGs are different, the OSD that has to repair the PG is different,
>>>> even the node hosting the OSD is different; I made a list of all PGs
>>>> and OSDs. This morning is the most recent case:
>>>>
>>>> > ceph health detail
>>>> HEALTH_ERR 2 scrub errors; Possible data damage: 2 pgs inconsistent
>>>> OSD_SCRUB_ERRORS 2 scrub errors
>>>> PG_DAMAGED Possible data damage: 2 pgs inconsistent
>>>>     pg 13.65 is active+clean+inconsistent, acting [4,2,6]
>>>>     pg 14.31 is active+clean+inconsistent, acting [8,3,1]
>>>>
>>>> > ceph pg repair 13.65
>>>> instructing pg 13.65 on osd.4 to repair
>>>>
>>>> (node-2)> tail /var/log/ceph/ceph-osd.4.log
>>>> 2018-02-28 08:38:47.593447 7f112cf76700  0 log_channel(cluster) log [DBG] : 13.65 repair starts
>>>> 2018-02-28 08:39:37.573342 7f112cf76700  0 log_channel(cluster) log [DBG] : 13.65 repair ok, 0 fixed
>>>>
>>>> > ceph pg repair 14.31
>>>> instructing pg 14.31 on osd.8 to repair
>>>>
>>>> (node-3)> tail /var/log/ceph/ceph-osd.8.log
>>>> 2018-02-28 08:52:37.297490 7f4dd0816700  0 log_channel(cluster) log [DBG] : 14.31 repair starts
>>>> 2018-02-28 08:53:00.704020 7f4dd0816700  0 log_channel(cluster) log [DBG] : 14.31 repair ok, 0 fixed
>>>>
>>>> I made a list of when I got OSD_SCRUB_ERRORS, which PG was affected
>>>> and which OSD had to repair it. Dates are dd/mm/yyyy:
>>>>
>>>> 21/12/2017 -- pg 14.29 is active+clean+inconsistent, acting [6,2,4]
>>>>
>>>> 18/01/2018 -- pg 14.5a is active+clean+inconsistent, acting [6,4,1]
>>>>
>>>> 22/01/2018 -- pg 9.3a is active+clean+inconsistent, acting [2,7]
>>>>
>>>> 29/01/2018 -- pg 13.3e is active+clean+inconsistent, acting [4,6,1]
>>>>               instructing pg 13.3e on osd.4 to repair
>>>>
>>>> 07/02/2018 -- pg 13.7e is active+clean+inconsistent, acting [8,2,5]
>>>>               instructing pg 13.7e on osd.8 to repair
>>>>
>>>> 09/02/2018 -- pg 13.30 is active+clean+inconsistent, acting [7,3,2]
>>>>               instructing pg 13.30 on osd.7 to repair
>>>>
>>>> 15/02/2018 -- pg 9.35 is active+clean+inconsistent, acting [1,8]
>>>>               instructing pg 9.35 on osd.1 to repair
>>>>               pg 13.3e is active+clean+inconsistent, acting [4,6,1]
>>>>               instructing pg 13.3e on osd.4 to repair
>>>>
>>>> 17/02/2018 -- pg 9.2d is active+clean+inconsistent, acting [7,5]
>>>>               instructing pg 9.2d on osd.7 to repair
>>>>
>>>> 22/02/2018 -- pg 9.24 is active+clean+inconsistent, acting [5,8]
>>>>               instructing pg 9.24 on osd.5 to repair
>>>>
>>>> 28/02/2018 -- pg 13.65 is active+clean+inconsistent, acting [4,2,6]
>>>>               instructing pg 13.65 on osd.4 to repair
>>>>               pg 14.31 is active+clean+inconsistent, acting [8,3,1]
>>>>               instructing pg 14.31 on osd.8 to repair
>>>>
>>>> If it can be useful, my ceph.conf is here:
>>>>
>>>> [global]
>>>>      auth client required = none
>>>>      auth cluster required = none
>>>>      auth service required = none
>>>>      fsid = 24d5d6bc-0943-4345-b44e-46c19099004b
>>>>      cluster network = 10.10.10.0/24
>>>>      public network = 10.10.10.0/24
>>>>      keyring = /etc/pve/priv/$cluster.$name.keyring
>>>>      mon allow pool delete = true
>>>>      osd journal size = 5120
>>>>      osd pool default min size = 2
>>>>      osd pool default size = 3
>>>>      bluestore_block_db_size = 64424509440
>>>>
>>>>      debug asok = 0/0
>>>>      debug auth = 0/0
>>>>      debug buffer = 0/0
>>>>      debug client = 0/0
>>>>      debug context = 0/0
>>>>      debug crush = 0/0
>>>>      debug filer = 0/0
>>>>      debug filestore = 0/0
>>>>      debug finisher = 0/0
>>>>      debug heartbeatmap = 0/0
>>>>      debug journal = 0/0
>>>>      debug journaler = 0/0
>>>>      debug lockdep = 0/0
>>>>      debug mds = 0/0
>>>>      debug mds balancer = 0/0
>>>>      debug mds locker = 0/0
>>>>      debug mds log = 0/0
>>>>      debug mds log expire = 0/0
>>>>      debug mds migrator = 0/0
>>>>      debug mon = 0/0
>>>>      debug monc = 0/0
>>>>      debug ms = 0/0
>>>>      debug objclass = 0/0
>>>>      debug objectcacher = 0/0
>>>>      debug objecter = 0/0
>>>>      debug optracker = 0/0
>>>>      debug osd = 0/0
>>>>      debug paxos = 0/0
>>>>      debug perfcounter = 0/0
>>>>      debug rados = 0/0
>>>>      debug rbd = 0/0
>>>>      debug rgw = 0/0
>>>>      debug throttle = 0/0
>>>>      debug timer = 0/0
>>>>      debug tp = 0/0
>>>>
>>>> [osd]
>>>>      keyring = /var/lib/ceph/osd/ceph-$id/keyring
>>>>      osd max backfills = 1
>>>>      osd recovery max active = 1
>>>>
>>>>      osd scrub begin hour = 20
>>>>      osd scrub end hour = 7
>>>>      osd scrub during recovery = false
>>>>      osd scrub load threshold = 0.3
>>>>
>>>> [client]
>>>>      rbd cache = true
>>>>      rbd cache size = 268435456        # 256MB
>>>>      rbd cache max dirty = 201326592   # 192MB
>>>>      rbd cache max dirty age = 2
>>>>      rbd cache target dirty = 33554432 # 32MB
>>>>      rbd cache writethrough until flush = true
>>>>
>>>> #[mgr]
>>>> #debug_mgr = 20
>>>>
>>>> [mon.pve-hs-main]
>>>>      host = pve-hs-main
>>>>      mon addr = 10.10.10.251:6789
>>>>
>>>> [mon.pve-hs-2]
>>>>      host = pve-hs-2
>>>>      mon addr = 10.10.10.252:6789
>>>>
>>>> [mon.pve-hs-3]
>>>>      host = pve-hs-3
>>>>      mon addr = 10.10.10.253:6789
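Given the nightly scrub window in this conf, a small morning check could list any inconsistent PGs per pool before repairing them one by one; a sketch using the pool names from this thread:

    # Report overall health, then any inconsistent PGs in each pool
    ceph health detail
    for pool in cephbackup cephwin cephnix cephssd; do
        echo "== $pool =="
        rados list-inconsistent-pg "$pool"
    done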
>>>>
>>>> My ceph versions:
>>>>
>>>> {
>>>>     "mon": {
>>>>         "ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 3
>>>>     },
>>>>     "mgr": {
>>>>         "ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 3
>>>>     },
>>>>     "osd": {
>>>>         "ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 12
>>>>     },
>>>>     "mds": {},
>>>>     "overall": {
>>>>         "ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 18
>>>>     }
>>>> }
>>>>
>>>> My ceph osd tree:
>>>>
>>>> ID CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF
>>>> -1       8.93686 root default
>>>> -6       2.94696     host pve-hs-2
>>>>  3   hdd 0.90959         osd.3            up  1.00000 1.00000
>>>>  4   hdd 0.90959         osd.4            up  1.00000 1.00000
>>>>  5   hdd 0.90959         osd.5            up  1.00000 1.00000
>>>> 10   ssd 0.21819         osd.10           up  1.00000 1.00000
>>>> -3       2.86716     host pve-hs-3
>>>>  6   hdd 0.85599         osd.6            up  1.00000 1.00000
>>>>  7   hdd 0.85599         osd.7            up  1.00000 1.00000
>>>>  8   hdd 0.93700         osd.8            up  1.00000 1.00000
>>>> 11   ssd 0.21819         osd.11           up  1.00000 1.00000
>>>> -7       3.12274     host pve-hs-main
>>>>  0   hdd 0.96819         osd.0            up  1.00000 1.00000
>>>>  1   hdd 0.96819         osd.1            up  1.00000 1.00000
>>>>  2   hdd 0.96819         osd.2            up  1.00000 1.00000
>>>>  9   ssd 0.21819         osd.9            up  1.00000 1.00000
>>>>
>>>> My pools:
>>>>
>>>> pool 9 'cephbackup' replicated size 2 min_size 1 crush_rule 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 5665 flags hashpspool stripe_width 0 application rbd
>>>>     removed_snaps [1~3]
>>>> pool 13 'cephwin' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 16454 flags hashpspool stripe_width 0 application rbd
>>>>     removed_snaps [1~5]
>>>> pool 14 'cephnix' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 16482 flags hashpspool stripe_width 0 application rbd
>>>>     removed_snaps [1~227]
>>>> pool 17 'cephssd' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 64 pgp_num 64 last_change 8601 flags hashpspool stripe_width 0 application rbd
>>>>     removed_snaps [1~3]
>>>>
>>>> I can't understand where the problem comes from. I don't think it's
>>>> hardware: if I had a failed disk, I should have problems always on
>>>> the same OSD. Any ideas?
>>>>
>>>> Thanks
>>>>
>>>> --
>>>> Marco Baldini
>>>> H.S. Amiata Srl
>>>> Ufficio:   0577-779396
>>>> Cellulare: 335-8765169
>>>> WEB:       www.hsamiata.it
>>>> EMAIL:     mbald...@hsamiata.it
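Since the inconsistencies land on a different acting set each time, it may also be worth dumping the current mapping of every previously affected PG in one go; a sketch using the PG IDs listed above:

    # Show the up/acting OSD sets for each PG that has been inconsistent so far
    for pg in 14.29 14.5a 9.3a 13.3e 13.7e 13.30 9.35 9.2d 9.24 13.65 14.31; do
        ceph pg map "$pg"
    done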
>>>> --
>>>> Best Regards
>>>> Paul Emmerich
>>>>
>>>> croit GmbH
>>>> Freseniusstr. 31h
>>>> 81247 München
>>>> www.croit.io
>>>> Tel: +49 89 1896585 90
>>>>
>>>> Geschäftsführer: Martin Verges
>>>> Handelsregister: Amtsgericht München
>>>> USt-IdNr: DE310638492

--
Cheers,
Brad
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com