I think the terminology here gets confusing between initial/trailing metadata, gRPC rule names, and HTTP/2 frame types. Our retry design doc was indeed underspecified in regards to dealing with initial metadata, and will be updated. I go over all of the considerations in detail below.
For clarity, I will use all caps for the names of HTTP/2 frame types, e.g., HEADERS frame, and use the capitalized gRPC rule names from the specification <https://github.com/grpc/grpc/blob/f1666d48244143ddaf463523030ee76cc0fe691c/doc/PROTOCOL-HTTP2.md> . The gRPC specification ensures that a status (containing a gRPC status code) is only sent in Trailers, which is contained in an HTTP/2 HEADERS frame. The only way that the gRPC status code can be contained in the first HTTP/2 frame received is if the server sends a Trailers-Only response. Otherwise, the gRPC spec mandates that the first frame sent be the Response-Headers (again, sent in an HTTP/2 HEADERS frame). Response-Headers includes (optional) Custom-Metadata, which is usually what we are talking about when we say "initial metadata". Regardless of whether the Response-Headers includes anything in its Custom-Metadata, if the gRPC client library notifies the client application layer of what metadata is (or is not) included, we now have to view the RPC as committed, aka no longer retryable. This is the only option, as a later retry attempt could receive different Custom-Metadata, contradicting what we've already told the client application layer. We cannot include gRPC status codes in the Response-Headers along with "initial metadata". It's perfectly valid according to the spec for a server to send metadata along a stream in its Response-Headers, wait for one hour, then (without having sent any messages), close the stream with a retryable error. However, the proposal that a server include the gRPC status code (if known) in the initial response is still sound. Concretely, this means: if a gRPC server has not yet sent Response-Headers and receives an error response, it should send a Trailers-Only response containing the gRPC status code. This would allow retry attempts on the client-side to proceed, if applicable. This is going to be superior to sending Response-Headers immediately followed by Trailers, which would cause the RPC to become committed on the client side (if the Response-Header metadata is made available to the client application layer) and stop retry attempts. We still can encounter the case where a server intentionally sends Response-Headers to open a stream, then eventually closes the stream with an error without ever sending any messages. Such cases would not be retryable, but I think it's fair to argue that if the server *has* to send metadata in advance of sending any responses, that metadata is actually a response, and should be treated as such (i.e., their metadata just ensured the RPC will be committed on the client-side). Rather than either explicitly disallowing such behavior by modifying some specification (this behavior is currently entirely unspecified, so while specification is worthwhile, it should be separate from the retry policy design currently under discussion), we can just change the default server behavior of C++, and Go if necessary, to match Java. In Java servers, the Response-Headers are delayed until some response message is sent. If the server application returns an error status before sending a message, then Trailers-Only is sent instead of Response-Headers. We can also leave it up to the gRPC client library implementation to decide when an RPC is committed based on received Response-Headers. If and while the client library can guarantee that the presence (or absence) of initial metadata is not visible to the client application layer, the RPC can be considered uncommitted. This is an implementation detail that should very rarely be necessary if the above change is made to default server behavior, but it would not violate anything in the retry spec or semantics. Eric On Wed, Mar 1, 2017 at 11:32 AM, 'Eric Anderson' via grpc.io < [email protected]> wrote: > On Wed, Mar 1, 2017 at 10:51 AM, 'Mark D. Roth' via grpc.io < > [email protected]> wrote: > >> On Wed, Mar 1, 2017 at 10:20 AM, 'Eric Anderson' via grpc.io < >> [email protected]> wrote: >> >>> What? That does not seem to be a proper understanding of the text, or >>> the text is wrongly worded. Why would the RPC be "committed as soon as it >>> receives the initial metadata"? That isn't in the text... In your example >>> it seems it would be committed at "the trailing metadata that includes a >>> status" as long as that status was OK, as per the "an explicit OK status" >>> in the text. >>> >> >> The language in the above quote is probably not as specific as it should >> be, at least with respect to the wire protocol. The intent here is that >> the RPC should be considered committed when it receives either initial >> metadata or a payload message. >> > > If initial metadata causes a commit, then the "any response message" will > *never* apply, as initial metadata always comes first. So even the > corrected intent you propose is questionable since one of the two > conditions of "either initial metadata or a payload message" will never > occur. Now, maybe the document is wrong or based on false assumptions and > needs to be fixed, but the plain reading of text seems the only coherent > interpretation at this point. > > It is necessary that receiving initial metadata commits the RPC, because >> we need to report the initial metadata to the caller when it arrives. >> > > That's not strictly true. It could be buffered until it was decided it is > a "good" response. Yes, we may not want to do that, but it doesn't seem > "necessary" unless it was discussed earlier in the thread. > > If my correction of the nomenclature is correct, then Java already does >>>> this for the most part. This isn't something that can be enforced in Java. >>>> But the normal stub delays sending the initial metadata until the >>>> first response message >>>> <https://github.com/grpc/grpc-java/blob/master/stub/src/main/java/io/grpc/stub/ServerCalls.java#L281>. >>>> If the call is completed without any message, then only trailing metadata >>>> is sent. >>>> >>> >> Interesting. If that's the case, then why did that interop test only >> fail with Go, not with Java? >> > > Very good question. I don't know. I can't read that code well enough to > figure out what was actually happening. My naïve reading of the change > makes it look like PHP is now processing the initial metadata when > previously it wasn't. > > I don't see anything strange in Java's server that would change the > behavior. I had previously thought that Go was the only implementation that > always sent initial metadata on server-side. So I'm quite surprised to hear > it being the only one that doesn't send initial metadata when unnecessary. > > -- > You received this message because you are subscribed to the Google Groups " > grpc.io" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To post to this group, send email to [email protected]. > Visit this group at https://groups.google.com/group/grpc-io. > To view this discussion on the web visit https://groups.google.com/d/ > msgid/grpc-io/CA%2B4M1oMXxH55qXb8Mne9mYJgp1L2eF > _C29Z%2B6pLT0cB1gxBaHw%40mail.gmail.com > <https://groups.google.com/d/msgid/grpc-io/CA%2B4M1oMXxH55qXb8Mne9mYJgp1L2eF_C29Z%2B6pLT0cB1gxBaHw%40mail.gmail.com?utm_medium=email&utm_source=footer> > . > > For more options, visit https://groups.google.com/d/optout. > -- You received this message because you are subscribed to the Google Groups "grpc.io" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at https://groups.google.com/group/grpc-io. To view this discussion on the web visit https://groups.google.com/d/msgid/grpc-io/CALUXJ7g%3DqP5qu7jTFBmka3GvSeuH1D2SSKN%3DEHEwV-jLGt4zzg%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
