Hm... as a user I would be interested to know that, say, 95% of my "select * from sometable where..." query executes under 10ms or so.
I think holding some history is important and is not that hard to implement. D. On Fri, Mar 3, 2017 at 10:55 AM, Denis Magda <[email protected]> wrote: > Sergey, agree, good point! > > Igniters, any other thoughts before we wrap up the discussion updating the > ticket content? > > — > Denis > > > On Mar 3, 2017, at 10:06 AM, Valentin Kulichenko < > [email protected]> wrote: > > > > Sergey, that's great idea! Generally, user is not interested much in some > > average numbers, especially in case of SQL queries. What they need is a > > list of slow queries and detailed information about the execution flow of > > these particular queries. > > > > -Val > > > > On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <[email protected]> > wrote: > > > >> One more comment: > >> > >> In general the customer is interested in slow queries details thus we > can > >> introduce an option which will allow to store only queries executed more > >> than NNN seconds. It may significantly reduce the the memory consumption > >> for history (but logging of all queries is still available if set that > >> option to 0). > >> > >> On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <[email protected]> wrote: > >> > >>> Vovan, > >>> > >>> When I’m speaking of JOIN metrics I’m simply assume that we need to add > >>> metrics relevant for queries with joins, metrics that will help us get > >> more > >>> insights on non-collocated and collocated joins execution flow. > >>> > >>>> 1) Query exec count > >>>> 2) Query exec time (first define what "time" means) - min, max, avg > >>> > >>> Total query execution time might not be helpful in the trickiest cases. > >>> What if you have multiple joins in your query? How do I know which one > >>> contributes to the execution most? > >>> > >>> So, I would do a breakdown having total time, map time, per-join time, > >>> reduce time. Hope it’s possible. If it’s unclear how to support > >> everything > >>> at the first place then it’s a different question. Let’s create several > >>> tickets and start implementing everything gracefully. > >>> > >>>> 3) Number of bytes exchanged between nodes during query execution > >>> > >>> It will be really helpful to make a breakdown showing a number of bytes > >>> exchanged per-join (physical join). Again, if you believe it makes > sense > >> to > >>> do the breakdown later then let’s create an additional ticket for the 2 > >>> tier metrics then. > >>> > >>>> 4) Number of returned rows - min, max, avg > >>> > >>> > >>> Plus, let’s add the following to the list: > >>> > >>> 5) Collocated: yes or no > >>> > >>> — > >>> Denis > >>> > >>>> On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <[email protected]> > >>> wrote: > >>>> > >>>> Denis, > >>>> > >>>> The main problem with suggested metrics is that they implies that > >> ceratin > >>>> internal mechanics work in predefined way. For example, what is JOIN > >>>> metrics? There are no guarantees that JOIN in user's query will be > >>>> translated to a real physical join. What if several different query > >>>> execution pieces happen in parallel? What if we rework our distributed > >>>> query engine from pull to push approach for performance reasons and > >> there > >>>> will be no JOINs in classical sense? > >>>> > >>>> This is why I think that we should start with very basic things. > >>> Something > >>>> like: > >>>> 1) Query exec count > >>>> 2) Query exec time (first define what "time" means) - min, max, avg > >>>> 3) Number of bytes exchanged between nodes during query execution > >>>> 4) Number of returned rows - min, max, avg > >>>> > >>>> Once we have base numbers in place, we can think of carefully > >> integrating > >>>> and enhancing all pieces of query execution into more verbose formats, > >>>> similar to query plans with relative weights in classical RDBMS > >> systems. > >>>> > >>>> Thoughts? > >>>> > >>>> > >>>> > >>>> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[email protected]> > >> wrote: > >>>> > >>>>> Vovan, > >>>>> > >>>>> Your metrics make perfect sense to me. However, I see a high demand > >> for > >>>>> JOINs based metrics especially from those who give a try to > >>> non-collocated > >>>>> joins in production and want to measure them somehow. This is why, > >>>>> personally, I prefer to see the metrics below in the top priority > list > >>> as > >>>>> well: > >>>>> > >>>>> if a query was executed in the collocated or non-collocated mode. > >> Three > >>>>> results are valid: collocated, non-collocated, simple query (no > >> joins). > >>>>> non-collocated query: size of the data exchanged between the nodes to > >>>>> complete a specific join. If there are multiple joins in the query we > >>> need > >>>>> to provide this metric for every of them. > >>>>> non-collocated and collocated query: a part of the time spent joining > >>> the > >>>>> data. If there are multiple joins in the query we need to provide > this > >>>>> metric for every of them. > >>>>> > >>>>> As for “unicast” and “broadcast”, agree, let’s ignore it for now. > >>>>> > >>>>> In any case, can we include timing information (map phase, reduce > >> phase, > >>>>> join phase) into an execution plan produced by H2? Are there any > >>>>> implementation hooks? > >>>>> > >>>>> — > >>>>> Denis > >>>>> > >>>>> > >>>>>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan < > >> [email protected]> > >>>>> wrote: > >>>>>> > >>>>>> I think some of the metrics specified by Denis also make sense, so I > >>>>> would > >>>>>> add them as well. See below... > >>>>>> > >>>>>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov < > >> [email protected] > >>>>> <mailto:[email protected]>> > >>>>>> wrote: > >>>>>> > >>>>>>> Denis, > >>>>>>> > >>>>>>> Query execution is complex process involving different stages which > >>> are > >>>>> not > >>>>>>> very easy to match with each other. Especially provided that any > >> node > >>>>> can > >>>>>>> leave topology at any time. Another problem is that engine evolves > >> and > >>>>>>> metrics like "did a query do broadcast or unicast" may easily > become > >>>>>>> useless at some point, because for example there will be neither > >>>>> unicast, > >>>>>>> nor broadast, but something different. On the other hand I > >> completely > >>>>> agree > >>>>>>> that performance monitoring is essential part of any mature DBMS. > >>>>>>> > >>>>>>> I would start with metrics which are both very basic and easy to > >>>>> implement > >>>>>>> at the same time. For example we can add fingerprint (hash) to > every > >>>>> query > >>>>>>> which will be used to join "map" and "reduce" parts with each other > >>> and > >>>>> add > >>>>>>> the following basic metrics: > >>>>>>> 1) Execution count for particular query > >>>>>>> 2) Number of map nodes - min, max, avg > >>>>>>> > >>>>>> > >>>>>> (1) and (2) makes sense > >>>>>> > >>>>>> > >>>>>>> 3) Map step duration (if applicable) - min, max, > >>>>>> > >>>>>> 4) Reduce step duration (if applicable) - min, max, avg > >>>>>>> > >>>>>> > >>>>>> Not sure if (3) and (4) are needed. I would only add them if they > are > >>>>> easy > >>>>>> to implement. > >>>>>> > >>>>>> I would also add these: > >>>>>> > >>>>>> 5) Collocated: yes/no > >>>>>> 6) last execution time > >>>>>> 7) min/max/average execution duration > >>>>>> > >>>>>> > >>>>>>> > >>>>>>> Once done users will be able to get statistics for particular > >> queries. > >>>>>>> > >>>>>>> Vladimir. > >>>>>>> > >>>>>>> > >>>>>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[email protected]> > >>> wrote: > >>>>>>> > >>>>>>>> BTW, > >>>>>>>> > >>>>>>>> What if we expose per-query metrics below as a part of EXPLAIN > >>> ANALYZE? > >>>>>>>> Sergi, is this feasible? > >>>>>>>> > >>>>>>>> — > >>>>>>>> Denis > >>>>>>>> > >>>>>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[email protected]> > >> wrote: > >>>>>>>>> > >>>>>>>>> Igniters, > >>>>>>>>> > >>>>>>>>> Let’s shed more light on SQL query execution internals > >> introducing a > >>>>>>> set > >>>>>>>> of useful metrics (https://issues.apache.org/ > >> jira/browse/IGNITE-4757 > >>> ). > >>>>>>>>> > >>>>>>>>> Per-query metrics. Total history size is defined by > >>>>>>> *CacheConfiguration. > >>>>>>>> getQueryDetailMetricsSize*: > >>>>>>>>> * if a query was executed in the collocated or non-collocated > >> mode. > >>>>>>>> Three results are valid: collocated, non-collocated, simple query > >> (no > >>>>>>>> joins). > >>>>>>>>> * non-collocated query: size of the data exchanged between the > >> nodes > >>>>> to > >>>>>>>> complete a join. > >>>>>>>>> * non-collocated query: did a query do broadcast or unicast to > get > >>>>> data > >>>>>>>> needed to complete a join. > >>>>>>>>> * non-collocated and collocated query: a part of the time spent > >>>>> joining > >>>>>>>> the data. > >>>>>>>>> > >>>>>>>>> CacheMetrics: > >>>>>>>>> * an average number of executed SQL queries (collocated, > >>>>>>> non-collocated, > >>>>>>>> simple query (no joins)). > >>>>>>>>> > >>>>>>>>> Please don’t hesitate do share suggest another metrics or improve > >>>>>>>> proposed ones. > >>>>>>>>> > >>>>>>>>> — > >>>>>>>>> Denis > >>>>> > >>>>> > >>> > >>> > >> > >> > >> -- > >> Sergey Kozlov > >> GridGain Systems > >> www.gridgain.com > >> > >
