chewbranca commented on PR #5602:
URL: https://github.com/apache/couchdb/pull/5602#issuecomment-3105033930

   Okay, over in 38608f7cd I've dropped the extraneous `abs` usage, cleaned up the gen_server callbacks, dropped the `unlink` call, and cleaned up some naming discrepancies. I've also added documentation clarifying that the default logger matchers match against any request in CouchDB, whereas the "dbnames_io" matchers are targeted at a specific database name, and mentioned the tradeoff of losing some granularity when matching against particular dimensions. Alternatively, we could add a `dbnames_ioq` matcher for matching on a dbname with a particular IOQ threshold, but because we can't easily chain these definitions we would need to make dedicated matchers for _all_ of the desired combinations, e.g. `dbnames_docs_read`, `dbnames_rows_read`, etc. The combinatorics become even more problematic when you want to express matchers like "match on db foo for changes requests that have induced more than 1000 IOQ calls and more than 100 Javascript filter invocations".
   
   You can actually create that matcher now, but it needs to be registered directly by way of remsh. For example, the following dynamically creates a CSRT logger matcher that satisfies the constraint "match on db foo for changes requests that have induced more than 1000 IOQ calls and more than 100 Javascript filter invocations":
   
   ```erlang
   (node1@127.0.0.1)16> rr(csrt_server).
   [coordinator,rctx,rpc_worker,st]
   (node1@127.0.0.1)17> ets:fun2ms(fun(#rctx{dbname = <<"foo">>, type=#coordinator{mod='chttpd_db', func='handle_changes_req'}, ioq_calls=IC, js_filter=JF}=R) when IC > 1000 andalso JF > 100 -> R end).
   [{#rctx{started_at = '_',updated_at = '_',pid_ref = '_',
           nonce = '_',
           type = #coordinator{mod = chttpd_db,
                               func = handle_changes_req,method = '_',path = '_'},
           dbname = <<"foo">>,username = '_',db_open = '_',
           docs_read = '_',docs_written = '_',rows_read = '_',
           changes_returned = '_',ioq_calls = '$1',js_filter = '$2',
           js_filtered_docs = '_',get_kv_node = '_',get_kp_node = '_'},
     [{'andalso',{'>','$1',1000},{'>','$2',100}}],
     ['$_']}]
   (node1@127.0.0.1)18> csrt_logger:register_matcher("custom_foo", ets:fun2ms(fun(#rctx{dbname = <<"foo">>, type=#coordinator{mod='chttpd_db', func='handle_changes_req'}, ioq_calls=IC, js_filter=JF}=R) when IC > 1000 andalso JF > 100 -> R end)).
   ok
   ```
   
   That'll dynamically compile the matchspec and push it out by way of persistent_term, where it's picked up by the tracker pids to decide whether or not to generate a process lifecycle report.
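
   For anyone following along, re-using a matcher against a single `#rctx{}` (rather than an `ets:select/2` over the tracking table) boils down to the stock `ets` match spec API. A minimal sketch of that mechanism, assuming `Rctx` is a populated `#rctx{}` record; this illustrates the generic ets calls, not the actual CSRT internals:
   
   ```erlang
   %% Minimal sketch of running a matcher against one record instead of a
   %% whole table scan; generic ets mechanism, not CSRT code.
   MSpec = ets:fun2ms(fun(#rctx{dbname = <<"foo">>, ioq_calls = IC} = R)
                          when IC > 1000 -> R end),
   CompiledMS = ets:match_spec_compile(MSpec),
   case ets:match_spec_run([Rctx], CompiledMS) of
       []  -> skip;        %% no match, no lifecycle report
       [_] -> log_report   %% matched, emit the process lifecycle report
   end.
   ```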
   
   The tricky bit is mapping `ets:fun2ms(fun(#rctx{dbname = <<"foo">>, type=#coordinator{mod='chttpd_db', func='handle_changes_req'}, ioq_calls=IC, js_filter=JF}=R) when IC > 1000 andalso JF > 100 -> R end)` to something we can express in `default.ini`. The `ets:fun2ms` transform is a great tool for declaratively constructing these matchspecs: it lets us efficiently query the ets tracking table while also re-using the same matchers directly against a given `#rctx{}` to filter requests to log. However, because it's a parse transform, it's difficult to iteratively and programmatically construct complex pattern match statements with it.
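
   For comparison, building the same matchspec without the parse transform means hand-assembling the raw head/guard/body terms. A sketch, assuming the `#rctx{}` and `#coordinator{}` records from `csrt_server` are in scope; the upside is that the guards and body are plain terms you can build at runtime, e.g. from config values:
   
   ```erlang
   %% Sketch: the raw matchspec terms that ets:fun2ms/1 would otherwise
   %% generate for us. The head still needs the records at compile time, but
   %% the guards/body are plain terms that can be assembled programmatically.
   Head = #rctx{dbname = <<"foo">>,
                type = #coordinator{mod = chttpd_db,
                                    func = handle_changes_req,
                                    _ = '_'},
                ioq_calls = '$1',
                js_filter = '$2',
                _ = '_'},
   Guards = [{'andalso', {'>', '$1', 1000}, {'>', '$2', 100}}],
   Body = ['$_'],
   MatchSpec = [{Head, Guards, Body}].
   ```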
   
   I would love to see a simple translation layer that lets us use Mango syntax for declaring these filters. That would make them much easier to express within the ini files, and it would also allow us to dynamically construct the logger matcher specs on the fly for a given HTTP request, e.g. you could `POST` a Mango-style spec over `#rctx{}` fields to query the ets table. If we get to the point where we have the expressiveness of something like Mango query syntax for defining the logger matchers, then we can replace most of these default matchers with better, more targeted matchers, while also providing an HTTP query API that can dynamically generate these ets matchspecs for efficient querying and aggregating.
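
   Purely as a strawman for that translation layer (nothing here exists yet; the module name, operator coverage, and field-to-variable mapping are all made up for illustration), a Mango-ish selector could be lowered into matchspec guards along these lines:
   
   ```erlang
   %% Hypothetical sketch: lower a tiny Mango-style selector into matchspec
   %% guards over #rctx{} fields. Operator support and the hardcoded
   %% field -> match variable mapping are illustrative only.
   -module(csrt_mango_sketch).
   -export([selector_to_guards/1]).
   
   %% Which match variable each supported field is bound to in the head,
   %% e.g. a head with ioq_calls = '$1', js_filter = '$2'.
   field_var(<<"ioq_calls">>) -> '$1';
   field_var(<<"js_filter">>) -> '$2'.
   
   op_to_guard(<<"$gt">>, Var, Val)  -> {'>', Var, Val};
   op_to_guard(<<"$gte">>, Var, Val) -> {'>=', Var, Val};
   op_to_guard(<<"$lt">>, Var, Val)  -> {'<', Var, Val}.
   
   %% Example selector:
   %%   #{<<"ioq_calls">> => #{<<"$gt">> => 1000},
   %%     <<"js_filter">> => #{<<"$gt">> => 100}}
   selector_to_guards(Selector) ->
       maps:fold(
           fun(Field, Conds, Acc) ->
               Var = field_var(Field),
               [op_to_guard(Op, Var, Val)
                || {Op, Val} <- maps:to_list(Conds)] ++ Acc
           end, [], Selector).
   ```
   
   Paired with a head like the one in the earlier sketch, `[{Head, selector_to_guards(Selector), ['$_']}]` would then be a matchspec you could hand to `csrt_logger:register_matcher/2`.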

