On Thu, Dec 3, 2015 at 1:30 AM, Lakis, Jacek <jacek.la...@intel.com> wrote:
> Hi cephers!
> I have two questions about the "client->osd->replica osd->osd->client" path
> that came up during my deep dive into this part.
>         1. eval_repop() is called twice in the primary OSD [from the
> C_OSD_RepopCommit and C_OSD_RepopApplied context finish] after it receives
> the MOSDOpRepopReply message from the replica OSD. It is called with
> different flags (ondisk, onack) and sends a reply to the client twice. Does
> the client really need to receive two replies, and if so, why? Would a single
> reply after the operation is applied and committed be enough?

In common deployments the client is actually only sent the ondisk
response, but yes, we need to maintain both of those paths. It's part
of the protocol that if the data becomes readable before it's
committed, we tell the client so.
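
To make the two paths concrete, here is a toy sketch of the idea. It is
not the actual Ceph code: ToyRepOp, toy_eval() and the other names are
made up for illustration, and the exact conditions in eval_repop() may
differ. It only assumes what is described above: an early "ack" once the
write is readable everywhere, and an "ondisk" reply once it is committed
everywhere.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy model only: names are illustrative and are not Ceph's real types.
struct ToyRepOp {
    std::set<int> waiting_for_applied;  // OSD ids that have not applied yet
    std::set<int> waiting_for_commit;   // OSD ids that have not committed yet
    bool client_wants_ack = false;      // client asked for the early "ack"
    bool sent_ack = false;
    bool sent_ondisk = false;
    std::vector<std::string> replies;   // replies "sent" to the client
};

// Re-evaluate the op after any applied/committed event.
void toy_eval(ToyRepOp& op) {
    // Committed everywhere: send the durable reply once.
    if (op.waiting_for_commit.empty() && !op.sent_ondisk) {
        op.replies.push_back("ondisk");
        op.sent_ondisk = true;
    }
    // Applied (readable) everywhere but not yet durable: send "ack"
    // only if the client asked and nothing stronger has gone out.
    if (op.waiting_for_applied.empty() && op.client_wants_ack &&
        !op.sent_ack && !op.sent_ondisk) {
        op.replies.push_back("ack");
        op.sent_ack = true;
    }
}

// Called when one OSD reports the write as applied (readable).
void toy_on_applied(ToyRepOp& op, int osd) {
    op.waiting_for_applied.erase(osd);
    toy_eval(op);
}

// Called when one OSD reports the write as committed (durable).
void toy_on_commit(ToyRepOp& op, int osd) {
    op.waiting_for_commit.erase(osd);
    toy_eval(op);
}
```

In this sketch a client that did not ask for the early ack sees a
single "ondisk" reply, which matches the common case; one that did ask
sees "ack" followed by "ondisk".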

>         2. MOSDOpRepopReply, caught by Pipe::reader(), needs to go through the
> whole dispatching->enq->deq->shards->workers path just to call finish() in
> the contexts mentioned before. Since the number of checks for this kind of
> message is smaller than for the OSD ops, maybe it's worth considering another,
> faster way to execute it, e.g. another simple queue with a single thread
> consuming and executing it, without the whole enqueueing-dequeueing-shards
> stuff? Are the ordering and PrioritizedQueue features really important for
> this kind of message?

That's an interesting question. Off the top of my head, some of these
may be important. The priority stuff probably isn't, but we do need to
maintain ordering within each PG, and I'm not sure we can easily
identify which messages are "just" client data requests versus more
complicated things like returning data.
It's really a question of which particular pieces of the system we
could skip over, and whether those specific ones are worth the time
investment of doing so. I tend to assume it's not worth the effort:
the edge case handling would be hard to replace separately (e.g., what
happens when we get a reply for a PG which we no longer have?).
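
As a rough illustration of what any faster path would still have to do,
here is a toy sketch. It is not Ceph code: ToyShardedQueue and ToyMsg are
hypothetical names, and the modulo sharding is only an assumption made
for the example. It shows two of the things mentioned above: per-PG
ordering is kept by sending every message for a PG to the same FIFO
shard, and replies for a PG we no longer have are dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>
#include <vector>

// Toy model only: these names are hypothetical, not Ceph's real classes.
struct ToyMsg {
    int pg;   // placement group the message belongs to (assumed >= 0)
    int seq;  // arrival order, used here only to check ordering
};

class ToyShardedQueue {
public:
    explicit ToyShardedQueue(std::size_t num_shards) : shards_(num_shards) {}

    // All messages for one PG land in the same shard, so a FIFO per
    // shard preserves per-PG ordering even with one worker per shard.
    std::size_t shard_of(int pg) const {
        return static_cast<std::size_t>(pg) % shards_.size();
    }

    void enqueue(const ToyMsg& m) { shards_[shard_of(m.pg)].push_back(m); }

    // Drain one shard in FIFO order, dropping messages for PGs this
    // OSD no longer holds.
    std::vector<ToyMsg> drain(std::size_t shard, const std::set<int>& live_pgs) {
        std::vector<ToyMsg> handled;
        while (!shards_[shard].empty()) {
            ToyMsg m = shards_[shard].front();
            shards_[shard].pop_front();
            if (live_pgs.count(m.pg)) handled.push_back(m);
        }
        return handled;
    }

private:
    std::vector<std::deque<ToyMsg>> shards_;
};
```

Even this stripped-down version needs the PG lookup and the per-PG FIFO,
which is most of what a separate fast queue would have to reimplement.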
-Greg

>
> Thank you.
>
> Best regards,
> JJ
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html