Andrew Purtell commented on PHOENIX-2715:

Random thoughts on trying to use this in a production setting
 * Cool to have a LogWriter that puts the log into a table, so the log itself 
can be queried. Powerful. How about a LogWriter that just emits to Java logging 
as well? HBase+Phoenix systems throw off a ton of this type of logging, so we 
already need a solution for managing it, for which query log would just be a 
new subset. Many places may want their log search solution to be based on 
something else (Splunk, Elastic, Solr, etc.)
 * If not an alternate implementation of LogWriter, at least a better 
factoring. Make LogWriter abstract or an interface. That should be quickly 
 * What happens if query logging becomes too expensive? We can turn it all the 
way on and all the way off. Can we have a knob for probabilistic sampling? This 
is really easy to implement. Add one config parameter, a float or double, one 
that can ideally be changed dynamically. Call it something like 
QUERY_LOG_SAMPLE_RATE (not a great name but whatever). In the code where you go 
to do the query logging, add a conditional {{if 
(ThreadLocalRandom.current().nextDouble() < 
getConfig(QUERY_LOG_SAMPLE_RATE))}}. Easy. So if logging 100% of queries is 
too expensive (at QUERY_LOG_SAMPLE_RATE = 1.0), we can try logging 50% of them 
(at QUERY_LOG_SAMPLE_RATE = 0.5), or 10% of them (at QUERY_LOG_SAMPLE_RATE = 
0.1), or 1% of them (at QUERY_LOG_SAMPLE_RATE = 0.01). 
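
To make the last two points concrete, here is a rough Java sketch of a writer interface with a Java-logging implementation and a sampling wrapper. Every name in it (QueryLogWriter, JulQueryLogWriter, SamplingQueryLogWriter, the "query.log" logger name) is made up for illustration and is not taken from the attached patches; a real version would read the rate from the Phoenix configuration rather than a setter.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.logging.Logger;

/** Hypothetical writer abstraction; not the LogWriter from the PHOENIX-2715 patch. */
interface QueryLogWriter {
    void write(String query);
}

/** Emits each query to java.util.logging so an existing log pipeline can collect it. */
final class JulQueryLogWriter implements QueryLogWriter {
    // Logger name is an assumption made for this sketch.
    private static final Logger LOG = Logger.getLogger("query.log");

    @Override
    public void write(String query) {
        LOG.info(query);
    }
}

/** Wraps another writer and forwards each query with probability sampleRate. */
final class SamplingQueryLogWriter implements QueryLogWriter {
    private final QueryLogWriter delegate;
    // volatile so a rate changed at runtime is visible to all query threads
    private volatile double sampleRate;

    SamplingQueryLogWriter(QueryLogWriter delegate, double sampleRate) {
        this.delegate = delegate;
        setSampleRate(sampleRate);
    }

    void setSampleRate(double sampleRate) {
        // Written this way so that NaN is rejected as well as out-of-range values.
        if (!(sampleRate >= 0.0 && sampleRate <= 1.0)) {
            throw new IllegalArgumentException("sample rate must be in [0.0, 1.0]: " + sampleRate);
        }
        this.sampleRate = sampleRate;
    }

    @Override
    public void write(String query) {
        // nextDouble() is uniform in [0.0, 1.0), so a strict '<' logs nothing
        // at rate 0.0 and every query at rate 1.0.
        if (ThreadLocalRandom.current().nextDouble() < sampleRate) {
            delegate.write(query);
        }
    }
}
```

Something like {{new SamplingQueryLogWriter(new JulQueryLogWriter(), 0.1)}} would then send roughly 10% of queries to Java logging, and a table-backed writer could be dropped in behind the same wrapper.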

> Query Log
> ---------
>                 Key: PHOENIX-2715
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2715
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Nick Dimiduk
>            Assignee: Ankit Singhal
>            Priority: Major
>         Attachments: PHOENIX-2715.patch, PHOENIX-2715_master.patch, 
> PHOENIX-2715_master_V1.patch
> One useful feature of other database systems is the query log. It allows the 
> DBA to review the queries run, who's run them, time taken, &c. This serves 
> both as an audit and also as a source of "ground truth" for performance 
> optimization. For instance, which columns should be indexed. It may also 
> serve as the foundation for automated performance recommendations/actions.
> What queries are being run is the first piece. Having this data tied into 
> tracing results and perhaps client-side metrics (PHOENIX-1819) becomes very 
> useful.
> This might take the form of clients writing data to a new system table, but 
> other implementation suggestions are welcome.

This message was sent by Atlassian JIRA