On Wed, Nov 1, 2017 at 11:27 AM Denes Dolhay <[email protected]> wrote:
> Hello,
> I have a trick question for Mr. Turner's scenario. Let's assume size=2,
> min_size=1:
> - We are looking at pg "A" with acting set [1, 2].
> - osd 1 goes down: OK.
> - osd 1 comes back up and backfill of pg "A" commences from osd 2 to
>   osd 1: OK.
> - osd 2 goes down (and therefore the backfill of pg "A" to osd 1 is
>   incomplete and stops): not OK, but this is the case...
> --> In this event, why does osd 1 accept IO to pg "A", knowing full well
> that its data is outdated and will cause an inconsistent state?
> Wouldn't it be prudent to deny IO to pg "A" until either
> - osd 2 comes back (so we again have a clean osd in the acting group),
>   with backfill to osd 1 of course continuing, or
> - the data in pg "A" is manually marked as lost, and operation then
>   continues from osd 1's (outdated) copy?

It does deny IO in that case. I think David was pointing out that if OSD 2
is actually dead and gone, you've got data loss despite having only lost
one OSD.
-Greg

> Thanks in advance, I'm really curious!
>
> Denes.
>
>
> On 11/01/2017 06:33 PM, Mario Giammarco wrote:
>
> I read your post, then read the thread you suggested: very interesting.
> Then I read your post again and understood it better.
> The most important thing is that even with min_size=1, writes are
> acknowledged only after Ceph has written all size=2 copies.
> In the thread above there is:
>
> As David already said, when all OSDs are up and in for a PG, Ceph will
> wait for ALL OSDs to ack the write. Writes in RADOS are always
> synchronous.
>
> Only when OSDs go down do you need at least min_size OSDs up before
> writes or reads are accepted.
>
> So if min_size = 2 and size = 3, you need at least 2 OSDs online for I/O
> to take place.
>
> You then showed me a sequence of events that may happen in some use
> cases. My use case is quite different. We use Ceph under Proxmox. The
> servers have their disks on RAID 5 (I agree that it is better to expose
> single disks to Ceph, but it is too late for that now).
> So, thanks to the RAID, it is unlikely that a Ceph disk fails. If a disk
> does fail, it is probably because the entire server has failed (and we
> need to provide business availability in that case), so it will never
> come up again; in my situation your sequence of events should never
> happen.
> What shocked me is that I did not expect to see so many inconsistencies.
> Thanks,
> Mario
>
>
> 2017-11-01 16:45 GMT+01:00 David Turner <[email protected]>:
>
>> It looks like you're running with size = 2 and min_size = 1 (the
>> min_size is a guess; the size is based on how many OSDs belong to your
>> problem PGs). Here's some good reading for you:
>> https://www.spinics.net/lists/ceph-users/msg32895.html
>>
>> Basically the gist is that when running with size = 2 you should assume
>> that data loss is an eventuality and decide that this is OK for your
>> use case. This can be mitigated by using min_size = 2, but then your
>> pool will block while an OSD is down and you'll have to manually change
>> min_size temporarily to perform maintenance.
>>
>> All it takes for data loss is that an OSD on server 1 is marked down
>> and a write happens to an OSD on server 2. Now the OSD on server 2 goes
>> down before the OSD on server 1 has finished backfilling, and the first
>> OSD receives a request to modify data in an object whose current state
>> it doesn't know. Tada, you have data loss.
>>
>> How likely is this to happen? Eventually it will.
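
(As an aside on the size/min_size settings discussed above: they are
per-pool values and can be inspected and changed roughly like this; the
pool name "mypool" is only a placeholder for whichever pool is affected:

    ceph osd pool get mypool size        # current number of replicas
    ceph osd pool get mypool min_size    # replicas required for I/O
    ceph osd pool set mypool size 3
    ceph osd pool set mypool min_size 2

Temporarily lowering min_size again is the manual step David mentions for
doing maintenance while the pool would otherwise block.)
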
>> For example, PG subfolder splitting (if you're using filestore) will
>> occasionally take long enough that the OSD is marked down while the
>> split is still running, and when that starts happening it tends to keep
>> happening all over the cluster for a while. Other triggers are anything
>> that causes segfaults in the OSDs, restarting a node before all PGs are
>> done backfilling/recovering, the OOM killer, power outages, and so on.
>>
>> Why does min_size = 2 prevent this? Because for a write to be
>> acknowledged by the cluster, it has to be written to every OSD that is
>> up, as long as at least min_size of them are available. This means that
>> every write is acknowledged by at least 2 OSDs every time. If you're
>> running with size = 2, then both copies of the data need to be online
>> for a write to happen, so neither copy can ever have a write that the
>> other does not. If you're running with size = 3, then you always have a
>> majority of the OSDs online receiving each write, and they can agree on
>> the correct data to give to the third when it comes back up.
>>
>> On Wed, Nov 1, 2017 at 3:31 AM Mario Giammarco <[email protected]>
>> wrote:
>>
>>> Sure, here it is. ceph -s:
>>>
>>>   cluster:
>>>     id:     8bc45d9a-ef50-4038-8e1b-1f25ac46c945
>>>     health: HEALTH_ERR
>>>             100 scrub errors
>>>             Possible data damage: 56 pgs inconsistent
>>>
>>>   services:
>>>     mon: 3 daemons, quorum 0,1,pve3
>>>     mgr: pve3(active)
>>>     osd: 3 osds: 3 up, 3 in
>>>
>>>   data:
>>>     pools:   1 pools, 256 pgs
>>>     objects: 269k objects, 1007 GB
>>>     usage:   2050 GB used, 1386 GB / 3436 GB avail
>>>     pgs:     200 active+clean
>>>              56  active+clean+inconsistent
>>>
>>> ---
>>>
>>> ceph health detail:
>>>
>>> PG_DAMAGED Possible data damage: 56 pgs inconsistent
>>>     pg 2.6 is active+clean+inconsistent, acting [1,0]
>>>     pg 2.19 is active+clean+inconsistent, acting [1,2]
>>>     pg 2.1e is active+clean+inconsistent, acting [1,2]
>>>     pg 2.1f is active+clean+inconsistent, acting [1,2]
>>>     pg 2.24 is active+clean+inconsistent, acting [0,2]
>>>     pg 2.25 is active+clean+inconsistent, acting [2,0]
>>>     pg 2.36 is active+clean+inconsistent, acting [1,0]
>>>     pg 2.3d is active+clean+inconsistent, acting [1,2]
>>>     pg 2.4b is active+clean+inconsistent, acting [1,0]
>>>     pg 2.4c is active+clean+inconsistent, acting [0,2]
>>>     pg 2.4d is active+clean+inconsistent, acting [1,2]
>>>     pg 2.4f is active+clean+inconsistent, acting [1,2]
>>>     pg 2.50 is active+clean+inconsistent, acting [1,2]
>>>     pg 2.52 is active+clean+inconsistent, acting [1,2]
>>>     pg 2.56 is active+clean+inconsistent, acting [1,0]
>>>     pg 2.5b is active+clean+inconsistent, acting [1,2]
>>>     pg 2.5c is active+clean+inconsistent, acting [1,2]
>>>     pg 2.5d is active+clean+inconsistent, acting [1,0]
>>>     pg 2.5f is active+clean+inconsistent, acting [1,2]
>>>     pg 2.71 is active+clean+inconsistent, acting [0,2]
>>>     pg 2.75 is active+clean+inconsistent, acting [1,2]
>>>     pg 2.77 is active+clean+inconsistent, acting [1,2]
>>>     pg 2.79 is active+clean+inconsistent, acting [1,2]
>>>     pg 2.7e is active+clean+inconsistent, acting [1,2]
>>>     pg 2.83 is active+clean+inconsistent, acting [1,0]
>>>     pg 2.8a is active+clean+inconsistent, acting [1,0]
>>>     pg 2.92 is active+clean+inconsistent, acting [1,2]
>>>     pg 2.98 is active+clean+inconsistent, acting [1,0]
>>>     pg 2.9a is active+clean+inconsistent, acting [1,0]
>>>     pg 2.9e is active+clean+inconsistent, acting [1,0]
>>>     pg 2.9f is active+clean+inconsistent, acting [1,2]
>>>     pg 2.c6 is active+clean+inconsistent, acting [0,2]
>>>     pg 2.c7 is active+clean+inconsistent, acting [1,0]
>>>     pg 2.c8 is active+clean+inconsistent, acting [1,2]
>>>     pg 2.cb is active+clean+inconsistent, acting [1,2]
>>>     pg 2.cd is active+clean+inconsistent, acting [1,2]
>>>     pg 2.ce is active+clean+inconsistent, acting [1,2]
>>>     pg 2.d2 is active+clean+inconsistent, acting [2,1]
>>>     pg 2.da is active+clean+inconsistent, acting [1,0]
>>>     pg 2.de is active+clean+inconsistent, acting [1,2]
>>>     pg 2.e1 is active+clean+inconsistent, acting [1,2]
>>>     pg 2.e4 is active+clean+inconsistent, acting [1,0]
>>>     pg 2.e6 is active+clean+inconsistent, acting [0,2]
>>>     pg 2.e8 is active+clean+inconsistent, acting [1,2]
>>>     pg 2.ee is active+clean+inconsistent, acting [1,0]
>>>     pg 2.f9 is active+clean+inconsistent, acting [1,2]
>>>     pg 2.fa is active+clean+inconsistent, acting [1,0]
>>>     pg 2.fb is active+clean+inconsistent, acting [1,2]
>>>     pg 2.fc is active+clean+inconsistent, acting [1,2]
>>>     pg 2.fe is active+clean+inconsistent, acting [1,0]
>>>     pg 2.ff is active+clean+inconsistent, acting [1,0]
>>>
>>> and ceph pg 2.6 query:
>>>
>>> {
>>>   "state": "active+clean+inconsistent",
>>>   "snap_trimq": "[]",
>>>   "epoch": 1513,
>>>   "up": [1, 0],
>>>   "acting": [1, 0],
>>>   "actingbackfill": ["0", "1"],
>>>   "info": {
>>>     "pgid": "2.6",
>>>     "last_update": "1513'89145",
>>>     "last_complete": "1513'89145",
>>>     "log_tail": "1503'87586",
>>>     "last_user_version": 330583,
>>>     "last_backfill": "MAX",
>>>     "last_backfill_bitwise": 0,
>>>     "purged_snaps": [
>>>       { "start": "1", "length": "178" },
>>>       { "start": "17a", "length": "3d" },
>>>       { "start": "1b8", "length": "1" },
>>>       { "start": "1ba", "length": "1" },
>>>       { "start": "1bc", "length": "1" },
>>>       { "start": "1be", "length": "44" },
>>>       { "start": "205", "length": "12c" },
>>>       { "start": "332", "length": "1" },
>>>       { "start": "334", "length": "1" },
>>>       { "start": "336", "length": "1" },
>>>       { "start": "338", "length": "1" },
>>>       { "start": "33a", "length": "1" }
>>>     ],
>>>     "history": {
>>>       "epoch_created": 90,
>>>       "epoch_pool_created": 90,
>>>       "last_epoch_started": 1339,
>>>       "last_interval_started": 1338,
>>>       "last_epoch_clean": 1339,
>>>       "last_interval_clean": 1338,
>>>       "last_epoch_split": 0,
>>>       "last_epoch_marked_full": 0,
>>>       "same_up_since": 1338,
>>>       "same_interval_since": 1338,
>>>       "same_primary_since": 1338,
>>>       "last_scrub": "1513'89112",
>>>       "last_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>       "last_deep_scrub": "1513'89112",
>>>       "last_deep_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>       "last_clean_scrub_stamp": "2017-10-25 04:25:09.830840"
>>>     },
>>>     "stats": {
>>>       "version": "1513'89145",
>>>       "reported_seq": "422820",
>>>       "reported_epoch": "1513",
>>>       "state": "active+clean+inconsistent",
>>>       "last_fresh": "2017-11-01 08:11:38.411784",
>>>       "last_change": "2017-11-01 05:52:21.259789",
>>>       "last_active": "2017-11-01 08:11:38.411784",
>>>       "last_peered": "2017-11-01 08:11:38.411784",
>>>       "last_clean": "2017-11-01 08:11:38.411784",
>>>       "last_became_active": "2017-10-15 20:36:33.644567",
>>>       "last_became_peered": "2017-10-15 20:36:33.644567",
>>>       "last_unstale": "2017-11-01 08:11:38.411784",
>>>       "last_undegraded": "2017-11-01 08:11:38.411784",
>>>       "last_fullsized": "2017-11-01 08:11:38.411784",
>>>       "mapping_epoch": 1338,
>>>       "log_start": "1503'87586",
>>>       "ondisk_log_start": "1503'87586",
>>>       "created": 90,
>>>       "last_epoch_clean": 1339,
>>>       "parent": "0.0",
>>>       "parent_split_bits": 0,
>>>       "last_scrub": "1513'89112",
>>>       "last_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>       "last_deep_scrub": "1513'89112",
>>>       "last_deep_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>       "last_clean_scrub_stamp": "2017-10-25 04:25:09.830840",
>>>       "log_size": 1559,
>>>       "ondisk_log_size": 1559,
>>>       "stats_invalid": false,
>>>       "dirty_stats_invalid": false,
>>>       "omap_stats_invalid": false,
>>>       "hitset_stats_invalid": false,
>>>       "hitset_bytes_stats_invalid": false,
>>>       "pin_stats_invalid": false,
>>>       "stat_sum": {
>>>         "num_bytes": 3747886080,
>>>         "num_objects": 958,
>>>         "num_object_clones": 295,
>>>         "num_object_copies": 1916,
>>>         "num_objects_missing_on_primary": 0,
>>>         "num_objects_missing": 0,
>>>         "num_objects_degraded": 0,
>>>         "num_objects_misplaced": 0,
>>>         "num_objects_unfound": 0,
>>>         "num_objects_dirty": 958,
>>>         "num_whiteouts": 0,
>>>         "num_read": 333428,
>>>         "num_read_kb": 135550185,
>>>         "num_write": 79221,
>>>         "num_write_kb": 13441239,
>>>         "num_scrub_errors": 1,
>>>         "num_shallow_scrub_errors": 0,
>>>         "num_deep_scrub_errors": 1,
>>>         "num_objects_recovered": 245,
>>>         "num_bytes_recovered": 1012833792,
>>>         "num_keys_recovered": 6,
>>>         "num_objects_omap": 0,
>>>         "num_objects_hit_set_archive": 0,
>>>         "num_bytes_hit_set_archive": 0,
>>>         "num_flush": 0,
>>>         "num_flush_kb": 0,
>>>         "num_evict": 0,
>>>         "num_evict_kb": 0,
>>>         "num_promote": 0,
>>>         "num_flush_mode_high": 0,
>>>         "num_flush_mode_low": 0,
>>>         "num_evict_mode_some": 0,
>>>         "num_evict_mode_full": 0,
>>>         "num_objects_pinned": 0,
>>>         "num_legacy_snapsets": 0
>>>       },
>>>       "up": [1, 0],
>>>       "acting": [1, 0],
>>>       "blocked_by": [],
>>>       "up_primary": 1,
>>>       "acting_primary": 1
>>>     },
>>>     "empty": 0,
>>>     "dne": 0,
>>>     "incomplete": 0,
>>>     "last_epoch_started": 1339,
>>>     "hit_set_history": {
>>>       "current_last_update": "0'0",
>>>       "history": []
>>>     }
>>>   },
>>>   "peer_info": [
>>>     {
>>>       "peer": "0",
>>>       "pgid": "2.6",
>>>       "last_update": "1513'89145",
>>>       "last_complete": "1513'89145",
>>>       "log_tail": "1274'68440",
>>>       "last_user_version": 315687,
>>>       "last_backfill": "MAX",
>>>       "last_backfill_bitwise": 0,
>>>       "purged_snaps": [
>>>         { "start": "1", "length": "178" },
>>>         { "start": "17a", "length": "3d" },
>>>         { "start": "1b8", "length": "1" },
>>>         { "start": "1ba", "length": "1" },
>>>         { "start": "1bc", "length": "1" },
>>>         { "start": "1be", "length": "44" },
>>>         { "start": "205", "length": "82" },
>>>         { "start": "288", "length": "1" },
>>>         { "start": "28a", "length": "1" },
>>>         { "start": "28c", "length": "1" },
>>>         { "start": "28e", "length": "1" },
>>>         { "start": "290", "length": "1" }
>>>       ],
>>>       "history": {
>>>         "epoch_created": 90,
>>>         "epoch_pool_created": 90,
>>>         "last_epoch_started": 1339,
>>>         "last_interval_started": 1338,
>>>         "last_epoch_clean": 1339,
>>>         "last_interval_clean": 1338,
>>>         "last_epoch_split": 0,
>>>         "last_epoch_marked_full": 0,
>>>         "same_up_since": 1338,
>>>         "same_interval_since": 1338,
>>>         "same_primary_since": 1338,
>>>         "last_scrub": "1513'89112",
>>>         "last_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>         "last_deep_scrub": "1513'89112",
>>>         "last_deep_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>         "last_clean_scrub_stamp": "2017-10-25 04:25:09.830840"
>>>       },
>>>       "stats": {
>>>         "version": "1337'71465",
>>>         "reported_seq": "347015",
>>>         "reported_epoch": "1338",
>>>         "state": "active+undersized+degraded",
>>>         "last_fresh": "2017-10-15 20:35:36.930611",
>>>         "last_change": "2017-10-15 20:30:35.752042",
>>>         "last_active": "2017-10-15 20:35:36.930611",
>>>         "last_peered": "2017-10-15 20:35:36.930611",
>>>         "last_clean": "2017-10-15 20:30:01.443288",
>>>         "last_became_active": "2017-10-15 20:30:35.752042",
>>>         "last_became_peered": "2017-10-15 20:30:35.752042",
>>>         "last_unstale": "2017-10-15 20:35:36.930611",
>>>         "last_undegraded": "2017-10-15 20:30:35.749043",
>>>         "last_fullsized": "2017-10-15 20:30:35.749043",
>>>         "mapping_epoch": 1338,
>>>         "log_start": "1274'68440",
>>>         "ondisk_log_start": "1274'68440",
>>>         "created": 90,
>>>         "last_epoch_clean": 1331,
>>>         "parent": "0.0",
>>>         "parent_split_bits": 0,
>>>         "last_scrub": "1294'71370",
>>>         "last_scrub_stamp": "2017-10-15 09:27:31.756027",
>>>         "last_deep_scrub": "1284'70813",
>>>         "last_deep_scrub_stamp": "2017-10-14 06:35:57.556773",
>>>         "last_clean_scrub_stamp": "2017-10-15 09:27:31.756027",
>>>         "log_size": 3025,
>>>         "ondisk_log_size": 3025,
>>>         "stats_invalid": false,
>>>         "dirty_stats_invalid": false,
>>>         "omap_stats_invalid": false,
>>>         "hitset_stats_invalid": false,
>>>         "hitset_bytes_stats_invalid": false,
>>>         "pin_stats_invalid": false,
>>>         "stat_sum": {
>>>           "num_bytes": 3555027456,
>>>           "num_objects": 917,
>>>           "num_object_clones": 255,
>>>           "num_object_copies": 1834,
>>>           "num_objects_missing_on_primary": 0,
>>>           "num_objects_missing": 0,
>>>           "num_objects_degraded": 917,
>>>           "num_objects_misplaced": 0,
>>>           "num_objects_unfound": 0,
>>>           "num_objects_dirty": 917,
>>>           "num_whiteouts": 0,
>>>           "num_read": 275095,
>>>           "num_read_kb": 111713846,
>>>           "num_write": 64324,
>>>           "num_write_kb": 11365374,
>>>           "num_scrub_errors": 0,
>>>           "num_shallow_scrub_errors": 0,
>>>           "num_deep_scrub_errors": 0,
>>>           "num_objects_recovered": 243,
>>>           "num_bytes_recovered": 1008594432,
>>>           "num_keys_recovered": 6,
>>>           "num_objects_omap": 0,
>>>           "num_objects_hit_set_archive": 0,
>>>           "num_bytes_hit_set_archive": 0,
>>>           "num_flush": 0,
>>>           "num_flush_kb": 0,
>>>           "num_evict": 0,
>>>           "num_evict_kb": 0,
>>>           "num_promote": 0,
>>>           "num_flush_mode_high": 0,
>>>           "num_flush_mode_low": 0,
>>>           "num_evict_mode_some": 0,
>>>           "num_evict_mode_full": 0,
>>>           "num_objects_pinned": 0,
>>>           "num_legacy_snapsets": 0
>>>         },
>>>         "up": [1, 0],
>>>         "acting": [1, 0],
>>>         "blocked_by": [],
>>>         "up_primary": 1,
>>>         "acting_primary": 1
>>>       },
>>>       "empty": 0,
>>>       "dne": 0,
>>>       "incomplete": 0,
>>>       "last_epoch_started": 1339,
>>>       "hit_set_history": {
>>>         "current_last_update": "0'0",
>>>         "history": []
>>>       }
>>>     }
>>>   ],
>>>   "recovery_state": [
>>>     {
>>>       "name": "Started/Primary/Active",
>>>       "enter_time": "2017-10-15 20:36:33.574915",
>>>       "might_have_unfound": [
>>>         { "osd": "0", "status": "already probed" }
>>>       ],
>>>       "recovery_progress": {
>>>         "backfill_targets": [],
>>>         "waiting_on_backfill": [],
>>>         "last_backfill_started": "MIN",
>>>         "backfill_info": { "begin": "MIN", "end": "MIN", "objects": [] },
>>>         "peer_backfill_info": [],
>>>         "backfills_in_flight": [],
>>>         "recovering": [],
>>>         "pg_backend": { "pull_from_peer": [], "pushing": [] }
>>>       },
>>>       "scrub": {
>>>         "scrubber.epoch_start": "1338",
>>>         "scrubber.active": false,
>>>         "scrubber.state": "INACTIVE",
>>>         "scrubber.start": "MIN",
>>>         "scrubber.end": "MIN",
>>>         "scrubber.subset_last_update": "0'0",
>>>         "scrubber.deep": false,
>>>         "scrubber.seed": 0,
>>>         "scrubber.waiting_on": 0,
>>>         "scrubber.waiting_on_whom": []
>>>       }
>>>     },
>>>     {
>>>       "name": "Started",
>>>       "enter_time": "2017-10-15 20:36:32.592892"
>>>     }
>>>   ],
>>>   "agent_state": {}
>>> }
>>>
>>>
>>> 2017-10-30 23:30 GMT+01:00 Gregory Farnum <[email protected]>:
>>>
>>>> You'll need to tell us exactly what error messages you're seeing, what
>>>> the output of ceph -s is, and the output of pg query for the relevant
>>>> PGs. There's not a lot of documentation because much of this tooling
>>>> is new, it's changing quickly, and most people don't have the kinds of
>>>> problems that turn out to be unrepairable. We should do better about
>>>> that, though.
>>>> -Greg
>>>>
>>>> On Mon, Oct 30, 2017, 11:40 AM Mario Giammarco <[email protected]>
>>>> wrote:
>>>>
>>>>> > [Questions to the list]
>>>>> > How is it possible that the cluster cannot repair itself with
>>>>> > ceph pg repair?
>>>>> > Are no good copies remaining?
>>>>> > Can it not decide which copy is valid or up to date?
>>>>> > If so, why not, when there is a checksum and mtime for everything?
>>>>> > In this inconsistent state, which object does the cluster serve
>>>>> > when it doesn't know which one is valid?
>>>>>
>>>>> I am asking the same questions too; it seems strange to me that for a
>>>>> fault-tolerant clustered storage system like Ceph there is no
>>>>> documentation about this.
>>>>>
>>>>> I know I am being pedantic, but please note that saying "to be sure,
>>>>> use three copies" is not enough, because I am not sure what Ceph
>>>>> really does when the three copies do not match.
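
For anyone who finds this thread with the same HEALTH_ERR / inconsistent-PG
state: the usual way to dig into a single inconsistent PG is roughly the
sequence below. "2.6" is just the first PG from the listing above;
substitute the PG id you actually care about.

    ceph health detail                                    # list the inconsistent PGs
    rados list-inconsistent-obj 2.6 --format=json-pretty  # what exactly disagrees, per object
    ceph pg deep-scrub 2.6                                # optionally re-verify first
    ceph pg repair 2.6                                    # then ask the primary to repair

list-inconsistent-obj shows which object copies disagree and why (checksum
mismatch, size mismatch, missing shard, and so on), which is worth reading
before running "ceph pg repair", especially with size = 2 where there is
only one other copy to compare against.
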
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
