Yeah, I meant the latter — joining CouchDB’s span information to spans in an app built against CouchDB so a developer can see the end-to-end story. Wasn’t proposing user-customized spans inside the DB :)
Adam

> On Sep 10, 2019, at 5:43 PM, Paul Davis <paul.joseph.da...@gmail.com> wrote:
>
> Looks pretty awesome. I've got basically the same questions as Koco on
> performance. There are also games like the lager transforms that
> conditionally enable/disable log levels at runtime. If memory serves, it
> ended up being a single function call of overhead to check whether a level
> was disabled, based on a dynamically compiled module or an ets lookup, I
> think.
>
> Koco, are client-inherited spans an OpenTracing concept? At first I read
> it as "let a user specify points in CouchDB to insert trace markers at
> runtime", which sounded kind of crazy. But if you mean somehow connecting
> the CouchDB-generated span with some other span in a different
> application, that sounds like something reasonable to support.
>
> Ilya, you mentioned hopping from the coordinator to RPC workers, which is
> definitely an open problem. I only skimmed the docs months ago, but one of
> the things I came across was trying to figure out how to represent
> parallel traces. Given that we have a coordinator with N>1 RPC workers
> running in parallel, I wasn't sure how that would work. Granted, that was
> the shallowest of shallow dives through their docs after someone mentioned
> the tracing thing somewhere.
>
> On Tue, Sep 10, 2019 at 3:46 PM Adam Kocoloski <kocol...@apache.org> wrote:
>>
>> I think this is a great idea overall, particularly given the number of
>> significant changes that are happening in the codebase between 3.0 and 4.0.
>>
>> For me the main question is how much overhead is associated with tracing.
>> Can an admin safely configure it to run in production? Is it possible to
>> sample just a small percentage of events? Does the overhead change if no
>> OpenTracing tracer is configured?
>>
>> I also think a full picture here might include the ability to inherit
>> client-provided spans, so an app developer could drill down from her own
>> code into the database internals and figure out why some DB request was
>> unexpectedly slow.
>>
>> Thanks for starting this discussion. Cheers,
>>
>> Adam
>>
>>> On Sep 10, 2019, at 2:32 PM, Ilya Khlopotov <iil...@apache.org> wrote:
>>>
>>> Hi,
>>>
>>> I wanted to run this idea by the ML to see if there is any interest
>>> before investing time into preparing a formal RFC.
>>>
>>> # Problem statement
>>>
>>> Collecting profiling data is very tricky at the moment. Developers have
>>> to run generic profiling tools which are not aware of CouchDB specifics.
>>> This makes performance optimization work hard. We need a tool which
>>> would allow us to get profiling data from specific points in the
>>> codebase. This means code instrumentation.
>>>
>>> # Proposed solution
>>>
>>> The https://opentracing.io/ project provides vendor-neutral APIs and
>>> instrumentation for distributed tracing. In Erlang it is implemented by
>>> the https://github.com/Bluehouse-Technology/otter library. The library
>>> provides a nice abstraction to start/finish tracing spans, as well as to
>>> add tags and log entries to a given span. In the context of CouchDB this
>>> means that we can do something like the following:
>>> - start a tracing span on every HTTP request
>>> - add tags to capture additional information such as "database
>>> name"/"name of endpoint"/"nonce"
>>> - add otter logs in critical parts of the codebase to get profiling data
>>> for these points
>>>
>>> Otter is most useful in combination with a [zipkin](https://zipkin.io/)
>>> compatible server such as
>>> [jaeger](https://github.com/jaegertracing/jaeger). However, it can be
>>> used even without zipkin.
>>> It has a configurable set of counters, which makes it possible to get
>>> answers to questions like:
>>> - what kind of requests are slow
>>> - if we get a slow request (taking longer than the configured
>>> threshold), what was the trace (annotated with the time spent between
>>> trace points)
>>> - which function in the trace takes the most time
>>>
>>> # Otter API
>>>
>>> Otter has multiple APIs, which we would choose on a case-by-case basis:
>>> - functional API - the span structure needs to be passed around (we
>>> could extend `#http{}`/`#user_ctx{}`/`#db{}`)
>>> - simple process dictionary API - the span data is stored in the
>>> process dictionary
>>> - multiple span process dictionary API - supports multiple spans per
>>> process
>>> - span id API - starts a process per span
>>>
>>> # Roadblocks
>>>
>>> One of the problems we would need to solve is how to handle spans that
>>> cross multiple nodes. We would need this functionality to trace a
>>> request from the HTTP endpoint handler running on the coordinator to the
>>> shard updater process running on the storage nodes.
>>>
>>> We could either:
>>> - extend rexi or fabric to pass and aggregate span information, or
>>> - pass span info explicitly in every fabric function.
>>>
>>> # Quick demo (warning: very technical content)
>>>
>>> The goal of this demo is to demonstrate the value of otter without a
>>> zipkin server.
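>>>
>>> For orientation, here is a minimal sketch of the simple process
>>> dictionary flavor, using only the calls that appear in the diff below
>>> (`handle/1` and `do_work/1` are placeholder names, and the tag values
>>> are made up):
>>>
>>> ```
>>> handle(Req) ->
>>>     %% open a span tied to this process via the process dictionary;
>>>     %% no span variable needs to be threaded through the call stack
>>>     otter_span_pdict_api:start_with_tags("http request", [{"method", "GET"}]),
>>>     Result = do_work(Req),
>>>     %% annotate and close the span
>>>     otter_span_pdict_api:log("completed"),
>>>     otter_span_pdict_api:finish(),
>>>     Result.
>>> ```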
>>>
>>> ```
>>> diff --git a/rebar.config.script b/rebar.config.script
>>> index c38b6e235..c2b162751 100644
>>> --- a/rebar.config.script
>>> +++ b/rebar.config.script
>>> @@ -129,6 +129,11 @@ OptionalDeps = case WithProper of
>>>          []
>>>  end,
>>>  
>>> +ManualDeps = [
>>> +    {otter, {url, "https://github.com/Bluehouse-Technology/otter"}, {branch, "master"}},
>>> +    {otter_lib, {url, "https://github.com/Bluehouse-Technology/otter_lib"}, {branch, "master"}}
>>> +],
>>> +
>>>  BaseUrl = "https://github.com/apache/",
>>>  
>>>  MakeDep = fun
>>> @@ -152,7 +157,7 @@ end,
>>>  AddConfig = [
>>>      {require_otp_vsn, "19|20|21|22"},
>>>      {deps_dir, "src"},
>>> -    {deps, lists:map(MakeDep, DepDescs ++ OptionalDeps)},
>>> +    {deps, lists:map(MakeDep, DepDescs ++ OptionalDeps ++ ManualDeps)},
>>>      {sub_dirs, SubDirs},
>>>      {lib_dirs, ["src"]},
>>>      {erl_opts, [{i, "../"} | ErlOpts]},
>>> diff --git a/src/chttpd/src/chttpd.erl b/src/chttpd/src/chttpd.erl
>>> index 1e1d638be..a7aad5010 100644
>>> --- a/src/chttpd/src/chttpd.erl
>>> +++ b/src/chttpd/src/chttpd.erl
>>> @@ -158,6 +158,7 @@ handle_request(MochiReq0) ->
>>>  
>>>  handle_request_int(MochiReq) ->
>>>      Begin = os:timestamp(),
>>> +    otter_span_pdict_api:start_with_tags("http request", [{"begin_ts", Begin}]),
>>>      case config:get("chttpd", "socket_options") of
>>>          undefined ->
>>>              ok;
>>> @@ -233,6 +234,7 @@ handle_request_int(MochiReq) ->
>>>  
>>>      % put small token on heap to keep requests synced to backend calls
>>>      erlang:put(nonce, Nonce),
>>> +    otter_span_pdict_api:tag("nonce", Nonce),
>>>  
>>>      % suppress duplicate log
>>>      erlang:put(dont_log_request, true),
>>> @@ -282,6 +284,8 @@ after_request(HttpReq, HttpResp0) ->
>>>      end,
>>>      HttpResp2 = update_stats(HttpReq, HttpResp1),
>>>      chttpd_stats:report(HttpReq, HttpResp2),
>>> +    otter_span_pdict_api:tag("status", HttpResp2#httpd_resp.status),
>>> +    otter_span_pdict_api:log("completed"),
>>> +    otter_span_pdict_api:finish(),
>>>      maybe_log(HttpReq, HttpResp2),
>>>      HttpResp2.
>>>
>>> diff --git a/src/fabric/src/fabric.erl b/src/fabric/src/fabric.erl
>>> index 27fa8c045..a1972b445 100644
>>> --- a/src/fabric/src/fabric.erl
>>> +++ b/src/fabric/src/fabric.erl
>>> @@ -56,6 +56,7 @@ all_dbs() ->
>>>  %% @doc returns a list of all database names
>>>  -spec all_dbs(Prefix::iodata()) -> {ok, [binary()]}.
>>>  all_dbs(Prefix) when is_binary(Prefix) ->
>>> +    otter_span_pdict_api:tag("request_type", "_all_dbs"),
>>>      Length = byte_size(Prefix),
>>>      MatchingDbs = mem3:fold_shards(fun(#shard{dbname=DbName}, Acc) ->
>>>          case DbName of
>>> ```
>>>
>>> ```
>>> application:start(otter_lib).
>>> application:start(otter).
>>> f(Rules), Rules = [
>>>     {[
>>>         {greater, otter_span_duration, 10}
>>>     ],[
>>>         {snapshot_count, [long_span], [otter_span_name]},
>>>         send_to_zipkin
>>>     ]}
>>> ].
>>> otter_config:write(filter_rules, Rules).
>>> otter:counter_snapshot([long_span,"http request"]).
>>> [{[long_span,"http request"],
>>>   [{snap_timestamp,{2019,9,10,13,46,43,368208}},
>>>    {data,{span,1568123203366286,17299637839902614236,
>>>                "http request",782788946072648712,undefined,
>>>                [{"begin_ts",{1568,123203,366255}},
>>>                 {"nonce","a0a1d7c58e"},
>>>                 {"status",ok}],
>>>                [],1911}}]}]
>>> ```
>>>
>>> Unfortunately, the `counter_snapshot` API doesn't return log events.
>>> However, this problem is very easy to overcome. Here is an example of
>>> how to get matching spans logged into a log file.
>>>
>>> ```
>>> diff --git a/src/chttpd/src/chttpd.erl b/src/chttpd/src/chttpd.erl
>>> index 1e1d638be..5f2d60690 100644
>>> --- a/src/chttpd/src/chttpd.erl
>>> +++ b/src/chttpd/src/chttpd.erl
>>> @@ -33,7 +33,7 @@
>>>      send_chunked_error/2, send_json/2,send_json/3,send_json/4,
>>>      validate_ctype/2]).
>>>  
>>> --export([authenticate_request/3]).
>>> +-export([authenticate_request/3, zipkin_log/2]).
>>>  
>>> @@ -1207,6 +1216,10 @@ get_user(#httpd{user_ctx = #user_ctx{name = User}}) ->
>>>  get_user(#httpd{user_ctx = undefined}) ->
>>>      "undefined".
>>>  
>>> +zipkin_log(_URL, EncodedSpans) ->
>>> +    Spans = otter_lib_zipkin_thrift:decode_spans(EncodedSpans),
>>> +    couch_log:error("ZIPKIN :: ~p~n", [Spans]).
>>> +
>>> ```
>>>
>>> ```
>>> otter_config:write(http_client, {chttpd, zipkin_log}).
>>> ```
>>>
>>> This produces log events that look like the following:
>>>
>>> ```
>>> [error] 2019-09-10T18:01:38.758631Z node1@127.0.0.1 <0.4737.0> --------
>>> ZIPKIN :: [{span,1568138498710679,3212384889493927141,
>>> <<"http request">>,345610195655038913,undefined,
>>> [{<<"lc">>,<<>>,{<<"otter_test">>,{127,0,0,1},0}},
>>>  {<<"begin_ts">>,<<"{1568,138498,710639}">>},
>>>  {<<"path">>,<<"_all_dbs">>},
>>>  {<<"method">>,<<"GET">>},
>>>  {<<"nonce">>,<<"c225c6aef1">>},
>>>  {<<"status">>,<<"ok">>}],
>>> [{1568138498712456,<<"completed">>}],1784}]
>>> ```
>>>
>>> As you can see, the timestamp of the `completed` event is included in
>>> the entry. Also, since it is a function call, we can format the event in
>>> any way we like.
>>>
>>> # Conclusion
>>>
>>> - The otter configuration is simple, and the presence of the
>>> `http_client` parameter allows us to use otter without a zipkin server.
>>> - The API is simple, which makes it possible to replace otter with
>>> something else if we aren't happy with the implementation.
>>> - The codebase of otter is concise and would be easy to re-implement if
>>> we ever needed to.
>>>
>>> Overall, I don't think it will be too complicated to introduce basic
>>> functionality and extend it later when we need to.
>>>
>>> If you agree with the direction, I would appreciate any feedback that
>>> would help form requirements for the RFC and PR in the future.
>>>
>>> Best regards,
>>> ILYA (aka iilyak)
>>