On Thu, Mar 7, 2019 at 8:53 AM 'Carl Mastrangelo' via grpc.io <[email protected]> wrote:
> Inline responses:
>
> On Thursday, March 7, 2019 at 7:19:41 AM UTC-8, Mark D. Roth wrote:
>>
>> On Wed, Mar 6, 2019 at 2:29 PM 'Carl Mastrangelo' via grpc.io <[email protected]> wrote:
>>
>>> Mark: I would be interested in taking over this, assuming you don't have many more concerns.
>>
>> I'd be fine with you considering yourself the new owner of this design. However, the gRFC has already been merged, so there's no need for more work here unless there are changes we want to see. (I do notice that the gRFC still says it's in state "Draft", but this was supposed to have been finalized a while ago, so I suspect that Noah simply forgot to change that when he merged.)
>>
>>> I have two changes I would like to propose as well:
>>>
>>> 1. Add a CT_DEBUG level between CT_UNKNOWN and CT_INFO. ChannelTraces at the DEBUG level are *never* surfaced up to channelz, and implementations are free to not implement it. However, implementations that do want to surface debug info may. This is relevant for Java, and maybe C++, where there are leveled loggers that can go below the info level. We use the channel trace log level throughout our implementation and convert it to the relevant channelz or logger level. It would be nice if we could skip the channelz conversion half.
>>
>> I don't think we want to clutter up the channel trace buffer with debug-level messages, since the buffer is of limited size, and we don't want debug messages to cause important messages from earlier to be expired out of the buffer. So I would rather not add this mechanism to the design.
>
> Agreed. The proposal is to add another enum value and throw away everything with that value for the trace buffer. There is no extra logic.
>
>> I think it was probably a poor choice for Java to use an API with different log levels.
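[A minimal sketch of the CT_DEBUG proposal above, for reference. All names here (`CtSeverity`, `ChannelTracer`, `AddTraceEvent`) are illustrative assumptions, not the gRFC's actual API: a single call site both logs the event and conditionally records it, and CT_DEBUG events are thrown away before they reach the channelz trace buffer.]

```cpp
#include <deque>
#include <string>
#include <vector>

// Hypothetical severity enum; CT_DEBUG sits between CT_UNKNOWN and CT_INFO
// as proposed, so existing ordinal comparisons keep working.
enum CtSeverity {
  CT_UNKNOWN = 0,
  CT_DEBUG = 1,  // proposed: logged, but never surfaced via channelz
  CT_INFO = 2,
  CT_WARNING = 3,
  CT_ERROR = 4,
};

class ChannelTracer {
 public:
  explicit ChannelTracer(size_t max_events) : max_events_(max_events) {}

  // One call per event call site: always log, conditionally trace.
  void AddTraceEvent(CtSeverity severity, const std::string& description) {
    log_.push_back(description);       // stand-in for the process logger
    if (severity == CT_DEBUG) return;  // discarded before the channelz buffer
    if (buffer_.size() == max_events_) buffer_.pop_front();  // expire oldest
    buffer_.push_back(description);
  }

  const std::deque<std::string>& channelz_events() const { return buffer_; }
  const std::vector<std::string>& logged() const { return log_; }

 private:
  size_t max_events_;
  std::deque<std::string> buffer_;  // what a channelz query would return
  std::vector<std::string> log_;
};
```

[This is the "no extra logic" point: the only change is one enum value and one early return, so debug events never displace important events from the bounded buffer.]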
>> If there's an API for adding channel trace events, that API should expose only the log levels used for channel trace events; we should not conflate that with any pre-existing log levels used for different purposes.
>
> Your suggestion is a tradeoff that Java traded the other way on. The events relevant to channels are a superset of the events representable by Channel Trace. We don't want two calls (a log() and a trace()) per event call site, so we combine them. Leveling them down keeps the noise down when things are going well, and the default access for events is through the Channelz UI. When things are bad and the events are too numerous, they have to be logged or thrown away. Hence, they have to be logged, and we have to pick some level to log them at. The C++ gRPC implementation groups such events by type (as args to GRPC_TRACE), while Java gRPC groups them by severity (via -Dio.grpc.ChannelLogger.level). I think Java's implementation is easier to use.
>
> But I think time (and users) will tell if this approach is correct, and diversity in this area will help us find good solutions more quickly.
>
>> On a related note, Abhishek did suggest one change a while back, which I don't think Noah ever got around to. Currently, the trace event buffer size is set in terms of the number of events. However, Abhishek suggested that it should instead be expressed in terms of the total amount of memory used. The distinction is, of course, that instead of the buffer fitting a fixed number of events, the number would fluctuate based on the size of the events in memory.
>>
>>> 2. Expose channel trace (as perhaps a TransportTrace) that is specific to a socket. Right now it's just for channels and subchannels, which means that if there is a lot of activity on the sockets for a given subchannel, the trace events all get mixed up with the subchannel's logs.
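[Returning to Abhishek's buffer-sizing suggestion above: it could look something like the following sketch. `MemoryBoundedTraceBuffer` and the per-event byte accounting are assumptions for illustration, not part of the gRFC. The buffer expires the oldest events until the total footprint fits, so the number of retained events fluctuates with event size instead of being fixed.]

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical trace buffer bounded by total bytes rather than event count.
class MemoryBoundedTraceBuffer {
 public:
  explicit MemoryBoundedTraceBuffer(size_t max_bytes) : max_bytes_(max_bytes) {}

  void Add(const std::string& description) {
    size_t cost = description.size() + kPerEventOverhead;
    if (cost > max_bytes_) return;  // single event exceeds the budget; drop it
    // Expire the oldest events until the new one fits.
    while (bytes_used_ + cost > max_bytes_) {
      bytes_used_ -= events_.front().size() + kPerEventOverhead;
      events_.pop_front();
    }
    events_.push_back(description);
    bytes_used_ += cost;
  }

  size_t size() const { return events_.size(); }
  size_t bytes_used() const { return bytes_used_; }

 private:
  // Assumed fixed cost per event (timestamp, severity, bookkeeping).
  static constexpr size_t kPerEventOverhead = 16;
  size_t max_bytes_;
  size_t bytes_used_ = 0;
  std::deque<std::string> events_;
};
```

[With a 100-byte budget, one long event can displace several short ones, which is exactly the fluctuation described above.]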
>>> I can make a concrete proposal if you are generally onboard with the idea.
>>
>> I'm not opposed to this if there is a use case for it, but I'd like to hear examples of use cases to justify it. Can you give examples of events that would be associated with the transport instead of the subchannel?
>
> The main use case is during connection setup, where there can be multiple stages. For example, while debugging an internal issue, a user claimed the initial RPC timed out, but the channel stayed stuck in CONNECTING. Transport-level events would show how far into the connection pipeline gRPC made it.
>
> A non-internal example is connecting via SSL. If the SSL handshake completes but the remote server never sends the HTTP/2 handshake, the connection won't fail but will eventually time out. The details of how far it got into establishing the connection are lost, making it hard to figure out which stage was the problem. Some of the events I had in mind:
>
> * HTTP CONNECT sent
> * HTTP CONNECT complete
> * HTTP/1.1 upgrade sent
> * HTTP/1.1 upgrade complete
> * SSL handshake started
> * SSL handshake complete
> * ALTS handshake started
> * ALTS handshake complete
> * HTTP/2 settings sent

These events seem far too detailed for channel trace events. If we added channel trace events at this level of granularity, we would wind up expiring earlier events from the buffer that are generally more interesting.

>>> On Monday, January 30, 2017 at 9:46:26 AM UTC-8, Mark D. Roth wrote:
>>>>
>>>> (+ctiller)
>>>>
>>>> Overall, this looks pretty decent. Here are a few initial thoughts...
>>>>
>>>> I like the idea of using JSON for the returned tracing data for C-core, especially since it means less overhead in wrapped languages that want to expose the new tracing APIs.
>>>> However, JSON may not be the desired form for this data in all languages; the Java and Go folks may prefer some language-specific data structure. I suggest checking with folks on those teams to see what they think. (If we are going to have a language-independent server UI for displaying trace data, then that may be a good argument for all languages using JSON, but we need to make sure everyone is on board with that.)
>>>>
>>>> The gRFC should document the schema for the JSON data. In particular, we should probably make sure that the JSON data is in a form that can be automatically converted into a protobuf (which we'll want to define), as per https://developers.google.com/protocol-buffers/docs/proto3#json.
>>>>
>>>> In terms of the C-core implementation, as you and I and Craig discussed last week, the grpc_subchannel_tracer struct will probably need a refcount, since it may be referenced by multiple parent channels. Whenever a parent channel gets a trace node indicating that a subchannel has been added to or removed from the parent channel, that trace node should hold a reference to the subchannel trace. Thus, the subchannel trace will live until the last node referencing it is removed from the parent channels' buffers. (Update: Ah, I see you mentioned this at the very end of the doc. It might be useful to make this clear earlier, when the data structures themselves are presented.)
>>>>
>>>> You might also consider making the list of subchannel tracers a doubly-linked list, so that it's easier to delete entries from the middle of the list.
>>>>
>>>> It might be advantageous to use grpc_channel_tracer for both parent channels and subchannels, so that you don't need a separate internal API for adding nodes to each type.
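[The refcounting scheme described above can be sketched with `std::shared_ptr` standing in for the manual refcount on `grpc_subchannel_tracer`. The type names here are illustrative, not the C-core API: each parent-channel trace node that mentions a subchannel holds a reference to that subchannel's tracer, so the tracer stays alive until the last such node has been expired from every parent's buffer.]

```cpp
#include <deque>
#include <memory>
#include <string>

// Illustrative stand-in for grpc_subchannel_tracer; shared_ptr plays the
// role of its refcount.
struct SubchannelTracer {
  std::string name;
};

struct TraceNode {
  std::string description;
  // Non-null for "subchannel added/removed" events: keeps the subchannel's
  // tracer alive for as long as this node sits in some parent's buffer.
  std::shared_ptr<SubchannelTracer> referenced_subchannel;
};

class ParentChannelTracer {
 public:
  explicit ParentChannelTracer(size_t max_nodes) : max_nodes_(max_nodes) {}

  void AddNode(std::string description,
               std::shared_ptr<SubchannelTracer> subchannel = nullptr) {
    // Expiring the oldest node also drops its subchannel reference.
    if (nodes_.size() == max_nodes_) nodes_.pop_front();
    nodes_.push_back(TraceNode{std::move(description), std::move(subchannel)});
  }

 private:
  size_t max_nodes_;
  std::deque<TraceNode> nodes_;
};
```

[Once the referencing node is pushed out of the bounded buffer, the subchannel tracer's last reference goes away and it is freed, matching the lifetime rule described above.]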
>>>> Or perhaps simply create some common base class for the head_trace and tail_trace fields, and have grpc_channel_tracer_add_trace() operate on that base class.
>>>>
>>>> Please let me know if you have any questions or concerns about any of this.
>>>>
>>>> On Wed, Jan 25, 2017 at 11:18 AM, ncteisen via grpc.io <[email protected]> wrote:
>>>>
>>>>> My first link was to the blob, so it is stale.
>>>>>
>>>>> Instead use this link <https://github.com/grpc/proposal/pull/7> to the pull request itself.
>>>>>
>>>>> On Wednesday, January 25, 2017 at 10:16:46 AM UTC-8, [email protected] wrote:
>>>>>>
>>>>>> I've created a gRFC describing the design and implementation plan for gRPC Channel Tracing.
>>>>>>
>>>>>> Take a look at the planning doc: <https://github.com/ncteisen/proposal/blob/b80ca8410120fdae99ef0ad54bc3005cff87bea4/A3.md>
>>>>>>
>>>>>> Would love to hear some feedback on the design!
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google Groups "grpc.io" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at https://groups.google.com/group/grpc-io.
>>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/grpc-io/90af5752-0d28-41ad-8887-372070ad2430%40googlegroups.com.
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>> --
>>>> Mark D. Roth <[email protected]>
>>>> Software Engineer
>>>> Google, Inc.
--
Mark D. Roth <[email protected]>
Software Engineer
Google, Inc.
