Alex P, sounds like a good plan to me. Vladimir, do you have any suggestions or corrections?
— Denis

> On Feb 12, 2018, at 4:57 AM, Alex Plehanov <plehanov.a...@gmail.com> wrote:
>
> The views engine and the first view are almost ready to merge (review comments are resolved). Which views should we take next? My proposal: NODES, NODE_ATTRIBUTES, NODE_METRICS, NODE_HOSTS and NODE_ADDRESSES, since these views are clear and all topology data is available on each node. Any objections?
>
> 2018-01-25 16:27 GMT+03:00 Alex Plehanov <plehanov.a...@gmail.com>:
>
>> Anton, Vladimir, I've made some fixes. There is only one view left, and it has been renamed to 'IGNITE.LOCAL_TRANSACTIONS'.
>>
>> High-level design of the solution: when IgniteH2Indexing starts, it creates and starts a new GridH2SysViewProcessor, which creates all system view implementations and registers them in H2 via its own table engine. Each system view implementation extends the abstract base class GridH2SysView. A view implementation describes its columns, their types and indexes in the constructor and must override the getRows method for data retrieval (this method is called by the H2-compatible table and index implementations for Ignite system views). Almost no changes were made to the existing parsing engine, except in a few places where a GridH2Table instance was expected but system views use another class.
>>
>> New PR: [1]. Please have a look.
>>
>> [1] https://github.com/apache/ignite/pull/3433
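A view built on the design described above might look roughly like the following sketch. GridH2SysView, getRows, and the column declaration in the constructor come from the description in this mail; the simplified base class, the exact signatures, and the LocalTransactionsView example are illustrative assumptions, not code from the PR.

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/**
 * Simplified stand-in for the GridH2SysView base class described above:
 * a view declares its name and columns in the constructor and returns its
 * rows on demand. In the PR the real class is also wired into H2 through a
 * custom table engine; that part is omitted here.
 */
abstract class GridH2SysView {
    private final String name;
    private final String desc;
    private final List<String> cols;

    protected GridH2SysView(String name, String desc, String... cols) {
        this.name = name;
        this.desc = desc;
        this.cols = List.of(cols);
    }

    /** Called by the H2-compatible table and index implementations to fetch the view data. */
    public abstract Collection<Object[]> getRows();

    public String name()          { return name; }
    public String description()   { return desc; }
    public List<String> columns() { return cols; }
}

/** Hypothetical local-transactions view: one row per active transaction on the local node. */
class LocalTransactionsView extends GridH2SysView {
    LocalTransactionsView() {
        super("LOCAL_TRANSACTIONS", "Active transactions on the local node",
            "XID", "STATE", "START_TIME", "TIMEOUT");
    }

    @Override public Collection<Object[]> getRows() {
        // A real implementation would iterate the node's active transactions
        // (e.g. via the transaction manager); an empty result keeps the sketch self-contained.
        return new ArrayList<>();
    }
}

The registration side (GridH2SysViewProcessor creating the views and plugging them into H2 through its table engine when IgniteH2Indexing starts) is left out of the sketch.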
>>
>> 2018-01-24 19:12 GMT+03:00 Anton Vinogradov <avinogra...@gridgain.com>:
>>
>>> I've created IEP-13 [1] to cover all cases. Feel free to create issues.
>>>
>>> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75962769
>>>
>>> On Wed, Jan 24, 2018 at 6:10 PM, Vladimir Ozerov <voze...@gridgain.com> wrote:
>>>
>>>> Let's start with a single and the most simple view, e.g. LOCAL_TRANSACTIONS. We will review and merge it along with the necessary infrastructure, then handle the rest of the views in separate tickets and separate focused discussions.
>>>>
>>>> On Wed, Jan 24, 2018 at 5:29 PM, Alex Plehanov <plehanov.a...@gmail.com> wrote:
>>>>
>>>>> 1) It's not a principal point, I can change the schema. INFORMATION_SCHEMA was used because it already exists and is usually used for metadata tables and views. Your proposal is to use the schema "IGNITE", am I understanding you right? BTW, for now we can't query other (H2) meta tables from INFORMATION_SCHEMA, so the Ignite system views are the only views available to query from this schema.
>>>>>
>>>>> 2) Exactly for this reason the IGNITE_INSTANCE view is useful: to determine which node we are connected to.
>>>>>
>>>>> 3) As the first phase, in my opinion, local views will be enough. Performance and caching of distributed views should be discussed in later phases, when the distributed views implementation is planned. In the current implementation I tried to use indexing for local views wherever possible.
>>>>>
>>>>> 4) I don't think that JVM info is more critical information than, for example, cache or node information. When are authorization capabilities planned to be implemented?
>>>>>
>>>>> About local data: yes, we can rename all currently implemented views for the local node data as LOCAL_..., and create (someday) new whole-cluster views (which use distributed requests) without a prefix or, for example, with a CLUSTER_ prefix. But some views can show whole-cluster information using only local node data, without distributed requests (for example IGNITE_NODE_METRICS, IGNITE_PART_ASSIGNMENT, IGNITE_PART_ALLOCATION, IGNITE_NODES, etc.). Are they local or cluster views in this concept? Which prefix should be used? And what about caches? Are they local or cluster? On a local node we can see cluster-wide caches (replicated and partitioned) and caches for the current node only. The local cache list may differ from node to node. Which prefix should be used for this view? And one more thing: for some views there is no sense in making them cluster-wide (for example IGNITE_INSTANCE). Should we name it LOCAL_INSTANCE without creating an INSTANCE view?
>>>>>
>>>>> So, next steps: split the PR, change the schema name (IGNITE?), change the view name for caches (CACHES, LOCAL_CACHES?).
>>>>>
>>>>> 2018-01-24 13:03 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
>>>>>
>>>>>> Hi Alex,
>>>>>>
>>>>>> System views could be an extremely valuable addition to Ignite. Ideally, a user should be able to monitor and manage the state of the whole cluster from a single SQL command line. We have had plans to implement this for a very long time. However, this is a very sensitive task which has to take a lot of moving pieces into account, such as usability, consistency, performance, security, etc.
>>>>>>
>>>>>> Let me point out several major concerns I see at the moment:
>>>>>>
>>>>>> 1) Usability: INFORMATION_SCHEMA
>>>>>> This schema is part of the ANSI SQL standard. When creating system views, some vendors prefer to store them in a completely different predefined schema (Oracle, MS SQL). Others prefer to keep them in INFORMATION_SCHEMA directly. Both approaches could work. However, the latter breaks separation of concerns - we store typical metadata next to possibly sensitive system data. It also makes security management more complex - system data is very sensitive, and now we cannot simply grant a user access to INFORMATION_SCHEMA; instead, we have to grant that access on a per-view basis. For this reason my preference is to store system tables in a separate schema, not in INFORMATION_SCHEMA.
>>>>>>
>>>>>> 2) Consistency: local data
>>>>>> One of the implemented views is GridH2SysViewImplInstance. Normally SQL users communicate with Ignite through JDBC/ODBC drivers. These drivers are connected to a single node, typically a client node. Moreover, we will introduce a high-availability feature where drivers are able to connect to any address from a predefined list. This renders the view useless, as you do not know which node you are connected to. Also, local-only data cannot be joined in the general case - you will receive different results on different nodes. The same goes for transactions, JVM info, etc.
>>>>>>
>>>>>> 3) Performance
>>>>>> Suppose we fixed the consistency of transactions and now this view shows transactions in the whole cluster with the possibility to filter them by nodes - this is what a user would expect out of the box. Another problem then appears - performance. How would we collect the necessary data?
>>>>>> How would we handle joins, when a particular view could be scanned multiple times during query execution? How do we achieve sensible consistency? Most probably we would collect the remote data once when the query is started, cache it somehow at the query session level, and then re-use it during joins (a rough sketch of this idea follows this message). But again, this should be discussed separately.
>>>>>>
>>>>>> 4) Security: JVM info
>>>>>> We should define clear boundaries of what info is exposed. JVM data along with running threads is critically sensitive information. We should not expose it until we have authorization capabilities.
>>>>>>
>>>>>> In order to start moving this code from prototype to production state we should start with the most simple and consistent views, e.g. IGNITE_CACHES. Let's move it to a separate PR, review the infrastructure code, review the view implementation, agree on proper naming and placement, and merge it. Then each and every view (or group of related views) should be discussed and reviewed separately.
>>>>>>
>>>>>> As far as the node-local stuff goes, maybe we should move it to a separate schema, or mark it with a special prefix. E.g. "IGNITE.TRANSACTIONS" - all transactions in the cluster, "IGNITE.LOCAL_TRANSACTIONS" - transactions on the local node. In this case we will be able to merge the "local" stuff shortly, and implement the more complex but at the same time much more useful distributed stuff later on.
>>>>>>
>>>>>> Makes sense?
>>>>>>
>>>>>> Vladimir.
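A minimal sketch of the caching idea from point 3 above: remote view data is fetched once when the query starts and the snapshot is reused if the view is scanned again during join processing. The class and method names here are assumptions; nothing like this exists in the prototype yet.

import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/**
 * Illustrative per-query cache for distributed system view data.
 * The first scan of a view triggers the distributed request; later scans
 * within the same query session reuse the snapshot, which also keeps
 * repeated scans of the same view consistent within one query.
 */
class QuerySessionViewCache {
    /** Collected rows keyed by view name; valid only for the lifetime of one query. */
    private final Map<String, Collection<Object[]>> rowsByView = new ConcurrentHashMap<>();

    Collection<Object[]> rows(String viewName, Supplier<Collection<Object[]>> remoteFetch) {
        return rowsByView.computeIfAbsent(viewName, v -> remoteFetch.get());
    }
}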
>>>>>>
>>>>>> On Tue, Jan 23, 2018 at 8:30 PM, Alex Plehanov <plehanov.a...@gmail.com> wrote:
>>>>>>
>>>>>>> Hello, Igniters!
>>>>>>>
>>>>>>> For Ignite diagnostics it is usually helpful to get some information about Ignite internals. But currently, in my opinion, there are no convenient tools for this purpose:
>>>>>>>
>>>>>>> · Some issues can be solved by analyzing log files. Log files are useful for dumps, but sometimes they are difficult to read. Also, interesting metrics cannot be obtained at runtime on request; we need to wait until Ignite writes these metrics on a timeout or other event.
>>>>>>>
>>>>>>> · JMX is useful for scalar metrics. Complex and table data can also be received, but it is difficult to read, filter and sort without processing by specialized external tools. For the most frequently used cases, almost duplicate metrics are created just to show data in an easy-to-read form.
>>>>>>>
>>>>>>> · Web Console is able to show table and complex data. Perhaps someday Web Console will contain all the necessary dashboards for most problem investigations, but some non-trivial queries will not be covered anyway. Also, Web Console needs additional infrastructure to work.
>>>>>>>
>>>>>>> · External "home-made" tools can be used for non-trivial cases. They cover highly specialized cases and usually can't be used as general-purpose tools.
>>>>>>>
>>>>>>> Sometimes we are forced to use more than one tool and join data by hand (for example, the current thread dump and data from logs).
>>>>>>>
>>>>>>> RDBMSs often provide system views for diagnostic purposes (for example, DBA_% and V$% in Oracle), which can be queried by SQL. This solution makes all internal diagnostic information available in a readable form (with all possible filters and projections) without using any other internal or external tools. My proposal is to create similar system views in Ignite.
>>>>>>>
>>>>>>> I have implemented a working prototype (PR: [1]). It contains the following views:
>>>>>>>
>>>>>>> IGNITE_SYSTEM_VIEWS - Registered system views
>>>>>>> IGNITE_INSTANCE - Ignite instance
>>>>>>> IGNITE_JVM_THREADS - JVM threads
>>>>>>> IGNITE_JVM_RUNTIME - JVM runtime
>>>>>>> IGNITE_JVM_OS - JVM operating system
>>>>>>> IGNITE_CACHES - Ignite caches
>>>>>>> IGNITE_CACHE_CLUSTER_METRICS - Ignite cache cluster metrics
>>>>>>> IGNITE_CACHE_NODE_METRICS - Ignite cache node metrics
>>>>>>> IGNITE_CACHE_GROUPS - Cache groups
>>>>>>> IGNITE_NODES - Nodes in topology
>>>>>>> IGNITE_NODE_HOSTS - Node hosts
>>>>>>> IGNITE_NODE_ADDRESSES - Node addresses
>>>>>>> IGNITE_NODE_ATTRIBUTES - Node attributes
>>>>>>> IGNITE_NODE_METRICS - Node metrics
>>>>>>> IGNITE_TRANSACTIONS - Active transactions
>>>>>>> IGNITE_TRANSACTION_ENTRIES - Cache entries used by transactions
>>>>>>> IGNITE_TASKS - Active tasks
>>>>>>> IGNITE_PART_ASSIGNMENT - Partition assignment map
>>>>>>> IGNITE_PART_ALLOCATION - Partition allocation map
>>>>>>>
>>>>>>> Many more useful views could be implemented (executor diagnostics, SPI diagnostics, etc.).
>>>>>>>
>>>>>>> Some usage examples:
>>>>>>>
>>>>>>> Cache groups and their partitions used by transactions running longer than 5 minutes:
>>>>>>>
>>>>>>> SELECT cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION, COUNT(*) AS ENTITIES_CNT
>>>>>>> FROM INFORMATION_SCHEMA.IGNITE_TRANSACTIONS t
>>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_TRANSACTION_ENTRIES te ON t.XID = te.XID
>>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_CACHES c ON te.CACHE_NAME = c.NAME
>>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_CACHE_GROUPS cg ON c.GROUP_ID = cg.ID
>>>>>>> WHERE t.START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
>>>>>>> GROUP BY cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION
>>>>>>>
>>>>>>> Average CPU load on server nodes grouped by operating system:
>>>>>>>
>>>>>>> SELECT na.VALUE, COUNT(n.ID), AVG(nm.AVG_CPU_LOAD) AVG_CPU_LOAD
>>>>>>> FROM INFORMATION_SCHEMA.IGNITE_NODES n
>>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_NODE_ATTRIBUTES na ON na.NODE_ID = n.ID AND na.NAME = 'os.name'
>>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_NODE_METRICS nm ON nm.NODE_ID = n.ID
>>>>>>> WHERE n.IS_CLIENT = false
>>>>>>> GROUP BY na.VALUE
>>>>>>>
>>>>>>> Top 5 nodes by puts to cache 'cache':
>>>>>>>
>>>>>>> SELECT cm.NODE_ID, cm.CACHE_PUTS
>>>>>>> FROM INFORMATION_SCHEMA.IGNITE_CACHE_NODE_METRICS cm
>>>>>>> WHERE cm.CACHE_NAME = 'cache'
>>>>>>> ORDER BY cm.CACHE_PUTS DESC
>>>>>>> LIMIT 5
>>>>>>>
>>>>>>> Is this implementation interesting to anyone else? Are any of the views redundant? Which additional first-priority views should be implemented? Any other thoughts or proposals?
>>>>>>>
>>>>>>> [1] https://github.com/apache/ignite/pull/3413
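To make Vladimir's naming proposal concrete, here is how the IGNITE.LOCAL_TRANSACTIONS view discussed in this thread could be queried through Ignite's thin JDBC driver. The schema name, view name and columns follow the proposal and are not final; the example assumes ignite-core (which bundles the thin JDBC driver) is on the classpath and a node is listening on localhost.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

/** Queries the proposed IGNITE.LOCAL_TRANSACTIONS system view over the thin JDBC driver. */
public class LocalTransactionsExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:ignite:thin://127.0.0.1/");
             Statement stmt = conn.createStatement();
             // XID and START_TIME are column names taken from the queries earlier in the thread.
             ResultSet rs = stmt.executeQuery("SELECT XID, START_TIME FROM IGNITE.LOCAL_TRANSACTIONS")) {
            while (rs.next())
                System.out.println(rs.getString(1) + " started at " + rs.getString(2));
        }
    }
}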