Re: Introduction of open tracing (otter)

Adam Kocoloski Tue, 10 Sep 2019 14:51:45 -0700

Yeah, I meant the latter — joining CouchDB’s span information to spans in an 
app built against CouchDB so a developer can see the end-to-end story. Wasn’t 
proposing user-customized spans inside the DB :)


Adam

> On Sep 10, 2019, at 5:43 PM, Paul Davis <paul.joseph.da...@gmail.com> wrote:
> 
> Looks pretty awesome. I've got basically the same questions as Koco on
> performance. There are also games like the lager transforms that
> conditionally enable/disable log levels at runtime. If memory serves,
> it ended up being a single function call overhead to check for
> disabled based on some dynamically compiled module or ets lookup I
> think.
> 
> Koco, are client inherited spans an opentracing concept? At first I
> read it as "let a user specify points in CouchDB to insert trace
> markers at runtime" and it sounded kinda crazy. But if you mean
> somehow connecting the CouchDB generated span with some other span in
> a different application that sounds like something reasonable to
> support.
> 
> Ilya, you mentioned hopping from the coordinator to RPC workers which
> is definitely an open problem. I only skimmed the docs months ago but
> one of the things I came across was trying to figure out how to
> represent parallel traces. Given we have a coordinator that has N>1
> RPC workers running in parallel I wasn't sure how that'd work. Granted
> that was on the shallowest of shallow dives skimming their docs when
> someone mentioned the tracing thing somewhere.
> 
> On Tue, Sep 10, 2019 at 3:46 PM Adam Kocoloski <kocol...@apache.org> wrote:
>> 
>> I think this is a great idea overall, particularly given the number of 
>> significant changes that are happening in the codebase between 3.0 and 4.0.
>> 
>> For me the main question is how much overhead is associated with tracing. 
>> Can an admin safely configure it to run in production? Is it possible to 
>> sample just a small percentage of events? Does the overhead change if no 
>> OpenTracing tracer is configured?
>> 
>> I also think a full picture here might include the ability to inherit 
>> client-provided spans, so an app developer could drill down from her own 
>> code into the database internals and figure out why some DB request was 
>> unexpectedly slow.
>> 
>> Thanks for starting this discussion. Cheers,
>> 
>> Adam
>> 
>>> On Sep 10, 2019, at 2:32 PM, Ilya Khlopotov <iil...@apache.org> wrote:
>>> 
>>> Hi,
>>> 
>>> I wanted to run this idea by the ML to see if there is any interest before 
>>> investing time into preparing formal RFC.
>>> 
>>> # Problem statement
>>> 
>>> Collecting profiling data is very tricky at the moment. Developers have to 
>>> run generic profiling tools which are not aware of CouchDB specifics.
>>> This makes it hard to do the performance optimization work. We need a tool 
>>> which would allow us to get profiling data from specific points in the 
>>> codebase. This means code instrumentation.
>>> 
>>> # Proposed solution
>>> 
>>> There is an https://opentracing.io/ project, which is a vendor-neutral APIs 
>>> and instrumentation for distributed tracing. In Erlang it is implemented by 
>>> https://github.com/Bluehouse-Technology/otter library. The library provides 
>>> a nice abstraction to start/finish tracing spans as well as adding tags and 
>>> log entries to a given span. In the context of CouchDB this means that we 
>>> can do something like the following:
>>> - start tracing span on every HTTP request
>>> - add tags to capture additional information such as "database name"/"name 
>>> of endpoint"/"nonce"
>>> - add otter logs in critical parts of the codebase to get profiling data 
>>> for these points.
>>> 
>>> The otter is the most useful in combination with 
>>> [zipkin](https://zipkin.io/) compatible server such as 
>>> [jaeger](https://github.com/jaegertracing/jaeger). However it can be used 
>>> even without zipkin. It has a configurable set of counters, which makes it 
>>> possible to get answers on questions like:
>>> - what kind of requests are slow
>>> - if we get a slow request (taking longer then configured threshold) what 
>>> was the trace (annotated with time spent between trace points)
>>> - which function in the trace taking the most time
>>> 
>>> # Otter API
>>> 
>>> The otter has multiple APIs which we would choose on a case by case basis:
>>> - functional API - the span structure need to be passed around (we could 
>>> extend `#http{}`/`#user_ctx{}`/`#db{}`)
>>> - simple process dictionary API - the span data are stored in the process 
>>> dictionary
>>> - Multiple span process dictionary API - supports multiple spans per process
>>> - Span id API - starts a process per span
>>> 
>>> # Roadblocks
>>> 
>>> One of the problems we would need to solve is to how to do multiple nodes 
>>> spans. We would need this functionality to trace the request from the HTTP 
>>> endpoint handler running on coordinator to the shard updater process 
>>> running on the storage nodes.
>>> 
>>> We could use either:
>>> - extend rexi or fabric to pass and aggregate span information
>>> - pass span info explicitly in every fabric function.
>>> 
>>> # Quick demo (warning very technical content)
>>> 
>>> The goal of this demo is to demonstrate the value of otter without zipkin 
>>> server.
>>> 
>>> ```
>>> diff --git a/rebar.config.script b/rebar.config.script
>>> index c38b6e235..c2b162751 100644
>>> --- a/rebar.config.script
>>> +++ b/rebar.config.script
>>> @@ -129,6 +129,11 @@ OptionalDeps = case WithProper of
>>>        []
>>> end,
>>> 
>>> +ManualDeps = [
>>> +  {otter, {url, "https://github.com/Bluehouse-Technology/otter"}, {branch, 
>>> "master"}},
>>> +  {otter_lib, {url, "https://github.com/Bluehouse-Technology/otter_lib"}, 
>>> {branch, "master"}}
>>> +],
>>> +
>>> BaseUrl = "https://github.com/apache/";,
>>> 
>>> MakeDep = fun
>>> @@ -152,7 +157,7 @@ end,
>>> AddConfig = [
>>>    {require_otp_vsn, "19|20|21|22"},
>>>    {deps_dir, "src"},
>>> -    {deps, lists:map(MakeDep, DepDescs ++ OptionalDeps)},
>>> +    {deps, lists:map(MakeDep, DepDescs ++ OptionalDeps ++ ManualDeps)},
>>>    {sub_dirs, SubDirs},
>>>    {lib_dirs, ["src"]},
>>>    {erl_opts, [{i, "../"} | ErlOpts]},
>>> diff --git a/src/chttpd/src/chttpd.erl b/src/chttpd/src/chttpd.erl
>>> index 1e1d638be..a7aad5010 100644
>>> --- a/src/chttpd/src/chttpd.erl
>>> +++ b/src/chttpd/src/chttpd.erl
>>> @@ -158,6 +158,7 @@ handle_request(MochiReq0) ->
>>> 
>>> handle_request_int(MochiReq) ->
>>>    Begin = os:timestamp(),
>>> +    otter_span_pdict_api:start_with_tags("http request", [{"begin_ts", 
>>> Begin}]),
>>>    case config:get("chttpd", "socket_options") of
>>>    undefined ->
>>>        ok;
>>> @@ -233,6 +234,7 @@ handle_request_int(MochiReq) ->
>>> 
>>>    % put small token on heap to keep requests synced to backend calls
>>>    erlang:put(nonce, Nonce),
>>> +    otter_span_pdict_api:tag("nonce", Nonce),
>>> 
>>>    % suppress duplicate log
>>>    erlang:put(dont_log_request, true),
>>> @@ -282,6 +284,8 @@ after_request(HttpReq, HttpResp0) ->
>>>        end,
>>>    HttpResp2 = update_stats(HttpReq, HttpResp1),
>>>    chttpd_stats:report(HttpReq, HttpResp2),
>>> +    otter_span_pdict_api:tag("status", HttpResp2#httpd_resp.status),
>>> +    otter_span_pdict_api:log("completed"),
>>> +    otter_span_pdict_api:finish(),
>>>    maybe_log(HttpReq, HttpResp2),
>>>    HttpResp2.
>>> 
>>> diff --git a/src/fabric/src/fabric.erl b/src/fabric/src/fabric.erl
>>> index 27fa8c045..a1972b445 100644
>>> --- a/src/fabric/src/fabric.erl
>>> +++ b/src/fabric/src/fabric.erl
>>> @@ -56,6 +56,7 @@ all_dbs() ->
>>> %% @doc returns a list of all database names
>>> -spec all_dbs(Prefix::iodata()) -> {ok, [binary()]}.
>>> all_dbs(Prefix) when is_binary(Prefix) ->
>>> +    otter_span_pdict_api:tag("request_type", "_all_dbs"),
>>>    Length = byte_size(Prefix),
>>>    MatchingDbs = mem3:fold_shards(fun(#shard{dbname=DbName}, Acc) ->
>>>        case DbName of
>>> ```
>>> 
>>> ```
>>> application:start(otter_lib).
>>> application:start(otter).
>>> f(Rules), Rules = [
>>> {[
>>>       {greater, otter_span_duration, 10}
>>>   ],[
>>>       {snapshot_count, [long_span], [otter_span_name]},
>>>       send_to_zipkin
>>>  ]}
>>> ].
>>> otter_config:write(filter_rules, Rules).
>>> otter:counter_snapshot([long_span,"http request"]).
>>> [{[long_span,"http request"],
>>> [{snap_timestamp,{2019,9,10,13,46,43,368208}},
>>>  {data,{span,1568123203366286,17299637839902614236,
>>>              "http request",782788946072648712,undefined,
>>>              [{"begin_ts",{1568,123203,366255}},
>>>               {"nonce","a0a1d7c58e"},
>>>               {"status",ok}],
>>>              [],1911}}]}]
>>> ```
>>> 
>>> Unfortunately `counter_snapshot` API doesn't return log events. However 
>>> this problem is very easy to overcome. Here is the example of how to get 
>>> matching spans logged into a log file.
>>> 
>>> ```
>>> diff --git a/src/chttpd/src/chttpd.erl b/src/chttpd/src/chttpd.erl
>>> index 1e1d638be..5f2d60690 100644
>>> --- a/src/chttpd/src/chttpd.erl
>>> +++ b/src/chttpd/src/chttpd.erl
>>> @@ -33,7 +33,7 @@
>>>    send_chunked_error/2, send_json/2,send_json/3,send_json/4,
>>>    validate_ctype/2]).
>>> 
>>> --export([authenticate_request/3]).
>>> +-export([authenticate_request/3, zipkin_log/2]).
>>> 
>>> @@ -1207,6 +1216,10 @@ get_user(#httpd{user_ctx = #user_ctx{name = User}}) 
>>> ->
>>> get_user(#httpd{user_ctx = undefined}) ->
>>>    "undefined".
>>> 
>>> +zipkin_log(_URL, EncodedSpans) ->
>>> +    Spans = otter_lib_zipkin_thrift:decode_spans(EncodedSpans),
>>> +    couch_log:error("ZIPKIN :: ~p~n", [Spans]).
>>> +
>>> ```
>>> 
>>> ```
>>> otter_config:write(http_client, {chttpd, zipkin_log}).
>>> ```
>>> 
>>> This would produce the log events looking as the following:
>>> ```
>>> [error] 2019-09-10T18:01:38.758631Z node1@127.0.0.1 <0.4737.0> -------- 
>>> ZIPKIN :: [{span,1568138498710679,3212384889493927141,<<"http 
>>> request">>,345610195655038913,undefined,[{<<"lc">>,<<>>,{<<"otter_test">>,{127,0,0,1},0}},{<<"begin_ts">>,<<"{1568,138498,710639}">>},{<<"path">>,<<"_all_dbs">>},{<<"method">>,<<"GET">>},{<<"nonce">>,<<"c225c6aef1">>},{<<"status">>,<<"ok">>}],[{1568138498712456,<<"completed">>}],1784}]
>>> ```
>>> 
>>> As you can see the timestamp of the `completed` event is included in the 
>>> entry. Also, since it is a function call we can format the event in any way 
>>> we like.
>>> 
>>> # Conclusion
>>> 
>>> - The otter configuration is simple and presence of `http_client` parameter 
>>> allows us to use otter without zipkin server.
>>> - The API is simple which makes it possible to replace otter with something 
>>> else if we wouldn't be happy with the implementation.
>>> - The codebase of otter is concise and easy to re-implement in case we 
>>> would need to.
>>> 
>>> Overall I don't think it will be too complicated to introduce basic 
>>> functionality and extend it latter when we need to.
>>> 
>>> In case you agree with direction I would appreciate any feedback which 
>>> would help to form requirements for the RFC and PR in the future.
>>> 
>>> Best regards,
>>> ILYA (aka iilyak)
>>

Re: Introduction of open tracing (otter)

Reply via email to