Sergey, that's great idea! Generally, user is not interested much in some average numbers, especially in case of SQL queries. What they need is a list of slow queries and detailed information about the execution flow of these particular queries.
-Val On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <[email protected]> wrote: > One more comment: > > In general the customer is interested in slow queries details thus we can > introduce an option which will allow to store only queries executed more > than NNN seconds. It may significantly reduce the the memory consumption > for history (but logging of all queries is still available if set that > option to 0). > > On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <[email protected]> wrote: > > > Vovan, > > > > When I’m speaking of JOIN metrics I’m simply assume that we need to add > > metrics relevant for queries with joins, metrics that will help us get > more > > insights on non-collocated and collocated joins execution flow. > > > > > 1) Query exec count > > > 2) Query exec time (first define what "time" means) - min, max, avg > > > > Total query execution time might not be helpful in the trickiest cases. > > What if you have multiple joins in your query? How do I know which one > > contributes to the execution most? > > > > So, I would do a breakdown having total time, map time, per-join time, > > reduce time. Hope it’s possible. If it’s unclear how to support > everything > > at the first place then it’s a different question. Let’s create several > > tickets and start implementing everything gracefully. > > > > > 3) Number of bytes exchanged between nodes during query execution > > > > It will be really helpful to make a breakdown showing a number of bytes > > exchanged per-join (physical join). Again, if you believe it makes sense > to > > do the breakdown later then let’s create an additional ticket for the 2 > > tier metrics then. > > > > > 4) Number of returned rows - min, max, avg > > > > > > Plus, let’s add the following to the list: > > > > 5) Collocated: yes or no > > > > — > > Denis > > > > > On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <[email protected]> > > wrote: > > > > > > Denis, > > > > > > The main problem with suggested metrics is that they implies that > ceratin > > > internal mechanics work in predefined way. For example, what is JOIN > > > metrics? There are no guarantees that JOIN in user's query will be > > > translated to a real physical join. What if several different query > > > execution pieces happen in parallel? What if we rework our distributed > > > query engine from pull to push approach for performance reasons and > there > > > will be no JOINs in classical sense? > > > > > > This is why I think that we should start with very basic things. > > Something > > > like: > > > 1) Query exec count > > > 2) Query exec time (first define what "time" means) - min, max, avg > > > 3) Number of bytes exchanged between nodes during query execution > > > 4) Number of returned rows - min, max, avg > > > > > > Once we have base numbers in place, we can think of carefully > integrating > > > and enhancing all pieces of query execution into more verbose formats, > > > similar to query plans with relative weights in classical RDBMS > systems. > > > > > > Thoughts? > > > > > > > > > > > > On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[email protected]> > wrote: > > > > > >> Vovan, > > >> > > >> Your metrics make perfect sense to me. However, I see a high demand > for > > >> JOINs based metrics especially from those who give a try to > > non-collocated > > >> joins in production and want to measure them somehow. This is why, > > >> personally, I prefer to see the metrics below in the top priority list > > as > > >> well: > > >> > > >> if a query was executed in the collocated or non-collocated mode. > Three > > >> results are valid: collocated, non-collocated, simple query (no > joins). > > >> non-collocated query: size of the data exchanged between the nodes to > > >> complete a specific join. If there are multiple joins in the query we > > need > > >> to provide this metric for every of them. > > >> non-collocated and collocated query: a part of the time spent joining > > the > > >> data. If there are multiple joins in the query we need to provide this > > >> metric for every of them. > > >> > > >> As for “unicast” and “broadcast”, agree, let’s ignore it for now. > > >> > > >> In any case, can we include timing information (map phase, reduce > phase, > > >> join phase) into an execution plan produced by H2? Are there any > > >> implementation hooks? > > >> > > >> — > > >> Denis > > >> > > >> > > >>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan < > [email protected]> > > >> wrote: > > >>> > > >>> I think some of the metrics specified by Denis also make sense, so I > > >> would > > >>> add them as well. See below... > > >>> > > >>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov < > [email protected] > > >> <mailto:[email protected]>> > > >>> wrote: > > >>> > > >>>> Denis, > > >>>> > > >>>> Query execution is complex process involving different stages which > > are > > >> not > > >>>> very easy to match with each other. Especially provided that any > node > > >> can > > >>>> leave topology at any time. Another problem is that engine evolves > and > > >>>> metrics like "did a query do broadcast or unicast" may easily become > > >>>> useless at some point, because for example there will be neither > > >> unicast, > > >>>> nor broadast, but something different. On the other hand I > completely > > >> agree > > >>>> that performance monitoring is essential part of any mature DBMS. > > >>>> > > >>>> I would start with metrics which are both very basic and easy to > > >> implement > > >>>> at the same time. For example we can add fingerprint (hash) to every > > >> query > > >>>> which will be used to join "map" and "reduce" parts with each other > > and > > >> add > > >>>> the following basic metrics: > > >>>> 1) Execution count for particular query > > >>>> 2) Number of map nodes - min, max, avg > > >>>> > > >>> > > >>> (1) and (2) makes sense > > >>> > > >>> > > >>>> 3) Map step duration (if applicable) - min, max, > > >>> > > >>> 4) Reduce step duration (if applicable) - min, max, avg > > >>>> > > >>> > > >>> Not sure if (3) and (4) are needed. I would only add them if they are > > >> easy > > >>> to implement. > > >>> > > >>> I would also add these: > > >>> > > >>> 5) Collocated: yes/no > > >>> 6) last execution time > > >>> 7) min/max/average execution duration > > >>> > > >>> > > >>>> > > >>>> Once done users will be able to get statistics for particular > queries. > > >>>> > > >>>> Vladimir. > > >>>> > > >>>> > > >>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[email protected]> > > wrote: > > >>>> > > >>>>> BTW, > > >>>>> > > >>>>> What if we expose per-query metrics below as a part of EXPLAIN > > ANALYZE? > > >>>>> Sergi, is this feasible? > > >>>>> > > >>>>> — > > >>>>> Denis > > >>>>> > > >>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[email protected]> > wrote: > > >>>>>> > > >>>>>> Igniters, > > >>>>>> > > >>>>>> Let’s shed more light on SQL query execution internals > introducing a > > >>>> set > > >>>>> of useful metrics (https://issues.apache.org/ > jira/browse/IGNITE-4757 > > ). > > >>>>>> > > >>>>>> Per-query metrics. Total history size is defined by > > >>>> *CacheConfiguration. > > >>>>> getQueryDetailMetricsSize*: > > >>>>>> * if a query was executed in the collocated or non-collocated > mode. > > >>>>> Three results are valid: collocated, non-collocated, simple query > (no > > >>>>> joins). > > >>>>>> * non-collocated query: size of the data exchanged between the > nodes > > >> to > > >>>>> complete a join. > > >>>>>> * non-collocated query: did a query do broadcast or unicast to get > > >> data > > >>>>> needed to complete a join. > > >>>>>> * non-collocated and collocated query: a part of the time spent > > >> joining > > >>>>> the data. > > >>>>>> > > >>>>>> CacheMetrics: > > >>>>>> * an average number of executed SQL queries (collocated, > > >>>> non-collocated, > > >>>>> simple query (no joins)). > > >>>>>> > > >>>>>> Please don’t hesitate do share suggest another metrics or improve > > >>>>> proposed ones. > > >>>>>> > > >>>>>> — > > >>>>>> Denis > > >> > > >> > > > > > > > -- > Sergey Kozlov > GridGain Systems > www.gridgain.com >
