[ 
https://issues.apache.org/jira/browse/OAK-6807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16222342#comment-16222342
 ] 

Thomas Mueller commented on OAK-6807:
-------------------------------------

> Not sure how many concurrent query calls happen

[~chetanm] Yes. I think concurrent queries are very rare (not that many queries 
are run, and internal queries are ignored). For simplicity, I will still use a 
regular HashMap, but ensure logging is outside of the synchronized block. I 
would be surprised if this turned out to be a concurrency issue.
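
A minimal sketch of that approach, for illustration only: a plain HashMap 
guarded by a synchronized block, with the log call made after the lock is 
released. The class name QueryRecorderSketch, the record method and the 
Consumer-based log sink are assumptions for this example, not the actual patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch; not the actual Oak implementation. */
class QueryRecorderSketch {
    // Regular (non-concurrent) map; every access is guarded by the lock below.
    private final Map<String, Integer> counts = new HashMap<>();
    // Assumed log sink; a real implementation would use a logger instead.
    private final Consumer<String> log;

    QueryRecorderSketch(Consumer<String> log) {
        this.log = log;
    }

    /** Counts one execution and logs the query the first time it is seen. */
    void record(String query) {
        int count;
        synchronized (counts) {
            // merge returns the updated value: 1 on the first call for a query.
            count = counts.merge(query, 1, Integer::sum);
        }
        // The lock is already released here, so a slow log appender
        // cannot block other threads that are recording queries.
        if (count == 1) {
            log.accept(query);
        }
    }
}
```

Comparing against the value returned by merge (1 for the first execution) 
avoids the kind of off-by-one check that would skip the first run.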

> Log stats on shutdown to capture low usage mode

Not sure how best to do that. I wouldn't want to add a shutdown hook. To 
analyze which queries are slow, it's probably best to use the regular query 
stats, which have much more data (including timing data, and accurate 
execution and node counts). So the most important thing, in my view, is to log 
the first time a query is run. My goal is to use the data (the list of queries 
that are run) to remove indexes that are not used, or otherwise optimize the 
index configuration. By the way, the patch above had a bug (now fixed): the 
first time a query was run, it was not logged at all (off-by-one error).

> Log line to have simple format for easier parsing

I will use the tab character as a separator; that should be easy to parse. 
Other characters like the pipe can occur in the query as well (especially for 
XPath). In the query, I will replace tab and newline with a space.
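
As a rough sketch of such a log line (the field layout, the class name 
QueryLogFormatSketch and the method toLogLine are assumptions for this 
example; the actual fields may differ):

```java
/** Hypothetical sketch of a tab-separated log line; not the actual patch. */
class QueryLogFormatSketch {

    /**
     * Builds "language<TAB>query". Tabs and newlines inside the query are
     * replaced with spaces, so the tab stays an unambiguous separator.
     * Treating carriage return as part of a newline is an assumption.
     */
    static String toLogLine(String language, String query) {
        String cleaned = query
                .replace('\t', ' ')
                .replace('\n', ' ')
                .replace('\r', ' ');
        return language + "\t" + cleaned;
    }
}
```

A parser can then split each line on the tab character and always get the 
same number of fields, even for XPath queries that contain a pipe.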

>  it might be useful to have a plan 

[~catholicon] I thought about this, but it's tricky: a query can use multiple 
indexes if it is converted to a union, so I'm not sure when / where best to 
collect that info. For the use case I have in mind (analyzing which indexes are 
not needed), a second step is needed: given the list of queries, build the 
execution plans, and that way find out which indexes are (not) needed.
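
That second step could look roughly like the sketch below. The function 
indexesInPlan stands in for "build the execution plan and list the indexes it 
uses"; it is a placeholder for this example, not an existing Oak API, as are 
the class and method names.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

/** Hypothetical sketch of the offline analysis step. */
class UnusedIndexSketch {

    /**
     * Returns the indexes that no recorded query uses.
     *
     * @param allIndexes    names of all configured indexes
     * @param queries       the recorded (logged) queries
     * @param indexesInPlan placeholder: maps a query to the indexes in its
     *                      plan; may return several for a union query
     */
    static Set<String> unusedIndexes(
            Set<String> allIndexes,
            Collection<String> queries,
            Function<String, Set<String>> indexesInPlan) {
        // Start with every index, then remove each one that a plan uses.
        Set<String> unused = new TreeSet<>(allIndexes);
        for (String query : queries) {
            unused.removeAll(indexesInPlan.apply(query));
        }
        return unused;
    }
}
```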

> Query Recorder
> --------------
>
>                 Key: OAK-6807
>                 URL: https://issues.apache.org/jira/browse/OAK-6807
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>             Fix For: 1.7.11
>
>
> In order to manage indexes (e.g. find out which indexes are no longer needed, 
> which properties don't need to be indexed any longer), we need an easy way to 
> log all executed queries / query plans. 
> Each entry only needs to be logged once (logging multiple times is OK, but 
> ensure it's not logged too often). Different log levels can be used (e.g. log 
> level "TRACE" logs more data, "DEBUG" less). For "DEBUG" level, overhead of 
> logging should be minimal, so this can be kept enabled for a long time.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
