Hi Sam,

When the acting set changes order, two chunks of the same object may coexist 
in the same placement group. The key should therefore also contain the chunk 
number.
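
To make the collision concrete, here is a toy sketch in Python. It only models the store on one OSD as a dictionary; the key layout, the function names and the PG id "1.a" are made up for illustration and are not taken from the Ceph tree.

```python
# Toy model of the object store on a single OSD: a dict whose key stands in
# for the key handed to the filestore. All names are hypothetical.

def key_without_chunk(pg_id, object_name):
    """Key that identifies an object by PG and name only."""
    return (pg_id, object_name)

def key_with_chunk(pg_id, object_name, chunk_index):
    """Key that also records which chunk of the object is stored."""
    return (pg_id, object_name, chunk_index)

# The OSD first held chunk 0 of "foo". After the acting set is reordered it
# also receives chunk 1 of the same object, in the same PG.
store_without = {}
store_without[key_without_chunk("1.a", "foo")] = b"chunk-0-data"
store_without[key_without_chunk("1.a", "foo")] = b"chunk-1-data"  # replaces chunk 0

store_with = {}
store_with[key_with_chunk("1.a", "foo", 0)] = b"chunk-0-data"
store_with[key_with_chunk("1.a", "foo", 1)] = b"chunk-1-data"  # both chunks are kept
```

Without the chunk number the second write silently replaces the first; with it, both chunks can sit side by side until the stale one is cleaned up.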

That's probably the most sensible comment I have so far. This document is 
immensely useful (even in its current state) because it shows me your 
perspective on the implementation. 

I'm puzzled by:

CEPH_OSD_OP_DELETE: The possibility of rolling back a delete requires that we 
retain the deleted object until all replicas have persisted the deletion event. 
ErasureCoded backend will therefore need to store objects with the version at 
which they were created included in the key provided to the filestore. Old 
versions of an object can be pruned when all replicas have committed up to the 
log event deleting the object.

because I don't understand why the version would be necessary. I thought that 
deleting an erasure coded object could be even easier than deleting a 
replicated object because it cannot be resurrected if enough chunks are lost; 
therefore you don't need to wait for an ack from all OSDs in the up set. I'm 
obviously missing something.
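
To check that I read the quoted paragraph correctly, here is a toy sketch of what it seems to describe. Everything in it is an assumption on my side: the class, the method names and the idea that an object may be re-created under the same name before the delete is persisted everywhere are my guesses, not something I found in the code.

```python
class ToyVersionedStore:
    """Toy model: objects keyed by (name, version at which they were created)."""

    def __init__(self, replicas):
        self.objects = {}   # (name, created_version) -> data
        self.deleted = {}   # (name, created_version) -> version of the delete event
        # Last log version each replica reported as persisted.
        self.committed = {replica: 0 for replica in replicas}

    def create(self, name, version, data):
        self.objects[(name, version)] = data

    def delete(self, name, created_version, delete_version):
        # Only mark the object: the data is retained so the delete can be rolled back.
        self.deleted[(name, created_version)] = delete_version

    def rollback_delete(self, name, created_version):
        # Undo the delete by dropping the mark; the data was never removed.
        self.deleted.pop((name, created_version), None)

    def replica_committed(self, replica, version):
        self.committed[replica] = max(self.committed[replica], version)
        self._prune()

    def _prune(self):
        # An old version can go once every replica has committed up to
        # the log event that deleted it.
        floor = min(self.committed.values())
        for key, delete_version in list(self.deleted.items()):
            if delete_version <= floor:
                del self.objects[key]
                del self.deleted[key]


store = ToyVersionedStore(["osd.0", "osd.1", "osd.2"])
store.create("foo", 1, b"v1")
store.delete("foo", 1, 2)
store.create("foo", 3, b"v3")            # same name, different key: no collision
store.replica_committed("osd.0", 3)
store.replica_committed("osd.1", 3)
still_kept = ("foo", 1) in store.objects  # osd.2 has not persisted the delete yet
store.replica_committed("osd.2", 3)
pruned = ("foo", 1) not in store.objects
```

If that reading is right, the version in the key is what lets the retained "foo" and a newly created "foo" live side by side until the delete can no longer be rolled back.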

I had failed to understand how important the pg logs are to maintaining the 
consistency of the PG. For some reason I thought about them only as a 
lightweight version of the operation logs. Adding a payload to the 
pg_log_entry (i.e. the APPEND size or an attribute) is a new idea for me, and I 
would never have thought, or dared to think, that the logs could be extended in 
such a way. Given the recent problems with log writes having a high impact on 
performance (I'm referring to what forced you to introduce code that writes 
only the log entries that have changed instead of the complete logs), I 
thought about the pg logs as something immutable.
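
Here is how I picture a log entry carrying a payload, as a toy sketch. The field and class names are hypothetical and the real pg_log_entry certainly looks different; the only point is that recording the size before an APPEND is enough to undo it by truncating.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyLogEntry:
    version: int
    op: str
    object_name: str
    prior_size: Optional[int] = None  # payload: object size before the APPEND


class ToyChunkStore:
    """Toy store that logs enough information to roll back its last APPEND."""

    def __init__(self):
        self.objects = {}  # object name -> bytes
        self.log = []      # list of ToyLogEntry, oldest first

    def append(self, version, name, data):
        old = self.objects.get(name, b"")
        # Record the size before the write so the append can be undone later.
        self.log.append(ToyLogEntry(version, "APPEND", name, prior_size=len(old)))
        self.objects[name] = old + data

    def rollback_last(self):
        entry = self.log.pop()
        if entry.op == "APPEND":
            # Truncate back to the size stored in the log entry.
            current = self.objects[entry.object_name]
            self.objects[entry.object_name] = current[:entry.prior_size]
        return entry


store = ToyChunkStore()
store.append(1, "foo", b"abc")
store.append(2, "foo", b"def")
store.rollback_last()  # undoes the second append only
```

After the rollback "foo" holds b"abc" again and the log is back to a single entry. An attribute change could presumably be handled the same way by storing the previous value in the entry.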

I'm still trying to figure out how PGBackend::perform_write / read / 
try_rollback would fit into the current backfilling / write / read / scrubbing 
... code paths.

https://github.com/athanatos/ceph/blob/ba5c97eda4fe72a25831031a2cffb226fed8d9b7/doc/dev/osd_internals/erasure_coding.rst
https://github.com/athanatos/ceph/blob/ba5c97eda4fe72a25831031a2cffb226fed8d9b7/src/osd/PGBackend.h

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.
