Hello Nikita,

I noticed that we have an open PR in the ignite repo for this feature with a different set of changes compared to the one in the ignite-extensions repo.
apache/ignite#7693 <https://github.com/apache/ignite/pull/7693>
https://github.com/apache/ignite-extensions/pull/16

Can you please share more info on how we can use the profiling tool with ignite-extensions modules?

Regards,
Saikat

On Mon, Jun 8, 2020 at 5:51 AM Nikolay Izhikov <nizhi...@apache.org> wrote:

> Hello, Alexey.
>
> Thanks for the review.
>
> My understanding is the following:
>
> We will have 3 in-depth tools to find issues in a cluster:
>
> 1. Metrics + System views - data that describes Ignite entities at a
> very high level.
>
> 2. Profiling - a tool to find out which specific queries or
> transactions are slow.
> In many cases, this knowledge is enough to fix the issue (rewrite the
> query, redesign the transaction flow, etc.)
>
> 3. Tracing - a tool to find out why one of 1000 identical queries was
> slow.
> The most detailed view of the Ignite internal processes.
>
> > For example, a user would not be able to match a long task with a
> > long job in that task.
>
> This is not true.
> The profiling report will aggregate data from all nodes.
> So there will be both:
>
> * the summary time of the task
> * the time of each job in the task.
>
> > On Jun 8, 2020, at 12:52, Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
> >
> > Nikita, Igniters,
> >
> > I left a few comments on the tool itself in the PR.
> >
> > However, I would like to reiterate and discuss why a user would
> > prefer to use the profiling tool over tracing? The profiling tool
> > only captures very high-level details of the operations (a single
> > cache operation, for example), and does not interconnect operations
> > that happened on different nodes.
> > For example, a user would not be able to match a long task with a
> > long job in that task. In other words, profiling data is always a
> > subset of the data collected by tracing.
> >
> > Maybe it makes sense to adopt the local log file approach to write
> > spans so we can process those spans later to build a report?
> >
> > On Thu, Jun 4, 2020 at 11:16, Nikita Amelchev <nsamelc...@gmail.com> wrote:
> >
> >> Hi, Igniters.
> >>
> >> I have implemented cluster profiling and a tool to build the
> >> performance report. It's ready to be reviewed. [1, 2]
> >>
> >> Profiling can be managed by a JMX bean. I plan to add it to
> >> control.sh as well.
> >>
> >> Nodes write statistics to a temporary off-heap buffer, and then a
> >> single thread flushes it to the profiling files. The write mechanics
> >> and format are similar to WAL logging.
> >> The report contains the following statistics:
> >> - nodes and caches info
> >> - cache operations and transaction statistics
> >> - SQL and scan query statistics (including logical and physical
> >>   reads per query)
> >> - tasks and jobs statistics.
> >>
> >> More details in the IEP [3].
> >>
> >> [1] https://github.com/apache/ignite/pull/7693
> >> [2] https://issues.apache.org/jira/browse/IGNITE-12666
> >> [3] https://cwiki.apache.org/confluence/display/IGNITE/Cluster+performance+profiling+tool
> >>
> >> On Sun, Apr 26, 2020 at 17:29, Вячеслав Коптилин <slava.kopti...@gmail.com> wrote:
> >>>
> >>> Hello Nikolay,
> >>>
> >>>> Who deprecated visor and when? Maybe I miss something?
> >>> On the one hand, there was technically no community consensus that
> >>> this tool should be obsolete.
> >>> On the other hand, my opinion is based on the following topic:
> >>> http://apache-ignite-developers.2346864.n4.nabble.com/Re-Visor-plugin-tp44879p44939.html
> >>> Moreover, it seems to me that the control utility is currently
> >>> widely used and actively developed, unlike the visor.
> >>>
> >>>> It's true that, for now, Ignite doesn't have "tool strategy" I
> >>>> think it's a big issue from the user's point of view.
> >>> I absolutely agree with that.
> >>>
> >>>> We should solve it in the nearest time. Feel free to start this
> >>>> activity
> >>> I have no plan at the moment.
> >>> However, at the first stage, we could understand the difference
> >>> between visor and the control utility.
> >>> All useful features from visor should be moved/implemented in the
> >>> control utility, and after that the visor tool should be marked as
> >>> deprecated/obsolete.
> >>>
> >>>> You need to throw in control.sh also, which does some kind of
> >>>> statistics too, such as idle_verify.
> >>>> Please, clarify your idea:
> >>>> We should use some info from control.sh to the report?
> >>>> The report should be generated by some control.sh subcommand?
> >>> If I am not mistaken, the Oracle database has the AWR tool
> >>> (mentioned on the IEP page), which is a command-line utility that
> >>> generates HTML reports.
> >>> I like this idea and I think this is a good approach that can be
> >>> realized in the control utility.
> >>> If we have a case that cannot be implemented in this way, we have
> >>> to clearly state the difference between these tools so as not to
> >>> confuse our users.
> >>> What do you think?
> >>>
> >>> Thanks,
> >>> Slava.
> >>>
> >>> On Sat, Apr 25, 2020 at 12:00, Nikolay Izhikov <nizhi...@apache.org> wrote:
> >>>
> >>>> Hello, Slava, Ilya, Denis.
> >>>>
> >>>> Thanks for joining this discussion!
> >>>>
> >>>>> - visor (which is deprecated)
> >>>>
> >>>> Who deprecated visor and when?
> >>>> Maybe I miss something?
> >>>>
> >>>>> - web-console (to be honest, I don't quite understand the status
> >>>>> of this tool)
> >>>>
> >>>> +1.
> >>>>
> >>>>> I am not against the new tool, I just want to understand the
> >>>>> motivation to not improve the existing sub-projects.
> >>>>
> >>>> It's true that, for now, Ignite doesn't have a "tool strategy".
> >>>> I think it's a big issue from the user's point of view.
> >>>> We should solve it in the near future.
> >>>> Feel free to start this activity.
> >>>>
> >>>>> - new ignite-profiling (which is a monitoring tool as well,
> >>>>> judging by the provided link [1] )
> >>>>
> >>>> The general idea is the following:
> >>>>
> >>>> 1. We should have some profiling mechanism that will generate a
> >>>>    node-local event log.
> >>>> 2. We should have a tool that can export events to some
> >>>>    third-party system. This system can be Elasticsearch (Kibana),
> >>>>    an Ignite performance report, a Kafka log, whatever.
> >>>> 3. The Ignite performance report, in the first release, should be
> >>>>    a "static" tool.
> >>>>    This means we take static logs (that are not rewritten during
> >>>>    analysis) and feed them to the script (or tool or control.sh,
> >>>>    whatever).
> >>>>    The script produces a static report that can be used for
> >>>>    overall performance analysis.
> >>>>
> >>>> The primary users of this report are developers of Ignite-based
> >>>> applications and performance engineers.
> >>>>
> >>>> Ilya,
> >>>>
> >>>>> You need to throw in control.sh also, which does some kind of
> >>>>> statistics too, such as idle_verify.
> >>>>
> >>>> Please, clarify your idea:
> >>>> We should use some info from control.sh to the report?
> >>>> The report should be generated by some control.sh subcommand?
> >>>>
> >>>> Denis,
> >>>>
> >>>>> Speaking of the probes/statistics collection approach, is it
> >>>>> supposed to reuse tracing capabilities that are to be added as
> >>>>> part of IEP-35?
> >>>>
> >>>> For now, we don't have any results of tracing development
> >>>> available in Apache Ignite.
> >>>> Hopefully, we will get some in a couple of weeks.
> >>>> After that, we can start a discussion of how to merge the two
> >>>> improvements.
> >>>>
> >>>>> On Apr 24, 2020, at 20:32, Denis Magda <dma...@apache.org> wrote:
> >>>>>
> >>>>>> Tracing collects statistics in more depth. If possible, I'm for
> >>>>>> reuse.
> >>>>>
> >>>>> Looks like we need to sync up on these activities/initiatives to
> >>>>> ensure we don't do a duplicate job. If you think a separate
> >>>>> discussion is necessary, let's kick it off.
> >>>>>
> >>>>> -
> >>>>> Denis
> >>>>>
> >>>>> On Fri, Apr 24, 2020 at 9:18 AM Nikita Amelchev <nsamelc...@gmail.com> wrote:
> >>>>>
> >>>>>> Denis, Ilya,
> >>>>>>
> >>>>>> I will try to integrate the profiling functionality into the
> >>>>>> control.sh utility.
> >>>>>>
> >>>>>>> Speaking of the probes/statistics collection approach, is it
> >>>>>>> supposed to reuse tracing capabilities that are to be added as
> >>>>>>> part of IEP-35?
> >>>>>> Tracing collects statistics in more depth. If possible, I'm for
> >>>>>> reuse.
> >>>>>>
> >>>>>> On Fri, Apr 24, 2020 at 18:59, Ilya Kasnacheev <ilya.kasnach...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Hello!
> >>>>>>>
> >>>>>>> I suggest that it's one of the places where it could be put
> >>>>>>> instead of adding a new tool.
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> --
> >>>>>>> Ilya Kasnacheev
> >>>>>>>
> >>>>>>> On Fri, Apr 24, 2020 at 18:56, Nikita Amelchev <nsamelc...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Ilya,
> >>>>>>>>
> >>>>>>>> You suggest using control.sh to build the report?
> >>>>>>>>
> >>>>>>>> On Fri, Apr 24, 2020 at 18:20, Ilya Kasnacheev <ilya.kasnach...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Hello!
> >>>>>>>>>
> >>>>>>>>> You need to throw in control.sh also, which does some kind of
> >>>>>>>>> statistics too, such as idle_verify.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> --
> >>>>>>>>> Ilya Kasnacheev
> >>>>>>>>>
> >>>>>>>>> On Fri, Apr 24, 2020 at 18:06, Вячеслав Коптилин <slava.kopti...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hello Nikita,
> >>>>>>>>>>
> >>>>>>>>>> Perhaps, I am missing something...
> >>>>>>>>>> Apache Ignite already has a web-console tool.
> >>>>>>>>>> Do we want to improve the existing tool instead of creating
> >>>>>>>>>> a new one?
> >>>>>>>>>> It seems to me, this can be confusing for users.
> >>>>>>>>>> - visor (which is deprecated)
> >>>>>>>>>> - web-console (to be honest, I don't quite understand the
> >>>>>>>>>>   status of this tool)
> >>>>>>>>>> - new ignite-profiling (which is a monitoring tool as well,
> >>>>>>>>>>   judging by the provided link [1] )
> >>>>>>>>>>
> >>>>>>>>>> I am not against the new tool, I just want to understand the
> >>>>>>>>>> motivation to not improve the existing sub-projects.
> >>>>>>>>>>
> >>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/IGNITE/Cluster+performance+profiling+tool
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> S.
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Apr 24, 2020 at 14:58, Nikita Amelchev <nsamelc...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi, Igniters.
> >>>>>>>>>>>
> >>>>>>>>>>> I'm working on cluster profiling and the tool for creating
> >>>>>>>>>>> a performance report. [1] I have prepared a PoC based on
> >>>>>>>>>>> performance logging to a separate category of the Ignite
> >>>>>>>>>>> log. The report contains:
> >>>>>>>>>>>
> >>>>>>>>>>> - Cache operations and their distribution by type [2]
> >>>>>>>>>>> - Transactions and a histogram of durations [3]
> >>>>>>>>>>> - SQL and Scan query statistics, top slowest queries,
> >>>>>>>>>>>   logical and physical reads by query [4]
> >>>>>>>>>>> - Compute statistics, top slowest tasks and their jobs [5]
> >>>>>>>>>>> Soon I will add:
> >>>>>>>>>>> - Topology and Ignite versions info
> >>>>>>>>>>> - Client ID in case of operations from clients
> >>>>>>>>>>>
> >>>>>>>>>>> For now, I'm developing a binary logging format to reduce
> >>>>>>>>>>> the effect on performance. I'll try to reuse Ignite
> >>>>>>>>>>> mechanisms.
> >>>>>>>>>>>
> >>>>>>>>>>> I would like to hear suggestions for the profiling and the
> >>>>>>>>>>> performance report.
> >>>>>>>>>>>
> >>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/IGNITE/Cluster+performance+profiling+tool
> >>>>>>>>>>> [2] https://cwiki.apache.org/confluence/display/IGNITE/Cluster+performance+profiling+tool?preview=/145723859/148647581/p1.png
> >>>>>>>>>>> [3] https://cwiki.apache.org/confluence/display/IGNITE/Cluster+performance+profiling+tool?preview=/145723859/148647582/p2.png
> >>>>>>>>>>> [4] https://cwiki.apache.org/confluence/display/IGNITE/Cluster+performance+profiling+tool?preview=/145723859/148647583/p3.png
> >>>>>>>>>>> [5] https://cwiki.apache.org/confluence/display/IGNITE/Cluster+performance+profiling+tool?preview=/145723859/152112279/p5.png
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Best wishes,
> >>>>>>>>>>> Amelchev Nikita
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Best wishes,
> >>>>>>>> Amelchev Nikita
> >>>>>>
> >>>>>> --
> >>>>>> Best wishes,
> >>>>>> Amelchev Nikita
> >>
> >> --
> >> Best wishes,
> >> Amelchev Nikita
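
A rough illustration of the point made in the thread that the profiling report aggregates data from all nodes, so a long task can be matched with a long job in that task. This is only a sketch under stated assumptions, not the format or code used by the actual tool: the record layout (`type`, `session`, `name`, `duration_ms`), the node ids, and the function `build_task_report` are all hypothetical names chosen for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class TaskReport:
    """Aggregated view of one task (hypothetical structure)."""
    name: str = ""
    duration_ms: int = 0
    # (node_id, duration_ms) for every job that ran as part of the task
    jobs: list = field(default_factory=list)


def build_task_report(node_logs):
    """Merge per-node profiling records into one report keyed by task session id.

    node_logs maps a node id to the list of records read from that node's
    profiling file. The record fields are assumptions made for this sketch.
    """
    report = defaultdict(TaskReport)
    for node_id, records in node_logs.items():
        for rec in records:
            entry = report[rec["session"]]
            if rec["type"] == "task":
                entry.name = rec["name"]
                entry.duration_ms = rec["duration_ms"]
            elif rec["type"] == "job":
                entry.jobs.append((node_id, rec["duration_ms"]))
    # Slowest job first, so a long task can be matched with its long job.
    for entry in report.values():
        entry.jobs.sort(key=lambda job: job[1], reverse=True)
    return dict(report)


# Two nodes: the task record lives on node-1, its jobs ran on both nodes.
node_logs = {
    "node-1": [
        {"type": "task", "session": "s1", "name": "PriceTask", "duration_ms": 950},
        {"type": "job", "session": "s1", "duration_ms": 120},
    ],
    "node-2": [
        {"type": "job", "session": "s1", "duration_ms": 900},
    ],
}

report = build_task_report(node_logs)
for session, entry in report.items():
    print(session, entry.name, entry.duration_ms, entry.jobs)
```

The join key here is a shared session id carried by both task and job records. Whether the real profiling files expose such a key in this form is an assumption; the idea is only that records written on different nodes can be correlated once they are collected in one place.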