The views engine and the first view are almost ready to merge (review comments are resolved). Which views should we take next? My proposal: NODES, NODE_ATTRIBUTES, NODE_METRICS, NODE_HOSTS and NODE_ADDRESSES, since these views are clear and all topology data is available on each node. Any objections?
2018-01-25 16:27 GMT+03:00 Alex Plehanov <plehanov.a...@gmail.com>:

> Anton, Vladimir, I've made some fixes. There is only one view left, and
> it has been renamed to 'IGNITE.LOCAL_TRANSACTIONS'.
>
> High-level design of the solution:
> When IgniteH2Indexing starts, it creates and starts a new
> GridH2SysViewProcessor, which creates and registers in H2 (via its own
> table engine) all implementations of system views. Each system view
> implementation extends the base abstract class GridH2SysView. A view
> implementation describes its columns, their types and indexes in the
> constructor and must override the getRows method for data retrieval
> (this method is called by the H2-compatible table and index
> implementations for Ignite system views). Almost no fixes to the
> existing parsing engine were made, except in a few places where a
> GridH2Table instance was expected but system views use another class.
>
> New PR: [1]. Please have a look.
>
> [1] https://github.com/apache/ignite/pull/3433
>
> 2018-01-24 19:12 GMT+03:00 Anton Vinogradov <avinogra...@gridgain.com>:
>
>> I've created IEP-13 [1] to cover all cases.
>> Feel free to create issues.
>>
>> [1]
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75962769
>>
>> On Wed, Jan 24, 2018 at 6:10 PM, Vladimir Ozerov <voze...@gridgain.com>
>> wrote:
>>
>>> Let's start with a single, most simple view, e.g. LOCAL_TRANSACTIONS.
>>> We will review and merge it along with the necessary infrastructure,
>>> then handle the rest of the views in separate tickets and separate
>>> focused discussions.
>>>
>>> On Wed, Jan 24, 2018 at 5:29 PM, Alex Plehanov <plehanov.a...@gmail.com>
>>> wrote:
>>>
>>>> 1) It's not a principal point, I can change the schema.
>>>> INFORMATION_SCHEMA was used because it already exists and is usually
>>>> used for metadata tables and views. Your proposal is to use the schema
>>>> "IGNITE", do I understand you right?
>>>> BTW, for now we can't query the other (H2) meta tables from
>>>> INFORMATION_SCHEMA, so the Ignite system views are the only views
>>>> available to query from this schema.
>>>> 2) Exactly for this reason the IGNITE_INSTANCE view is useful: to
>>>> determine which node we are connected to.
>>>> 3) As the first phase, in my opinion, local views will be enough.
>>>> Performance and caching of distributed views should be discussed in
>>>> the next phases, when the distributed views implementation is planned.
>>>> In the current implementation I tried to use indexing for local views
>>>> wherever possible.
>>>> 4) I don't think that JVM info is more critical information than, for
>>>> example, cache or node information. When are authorization
>>>> capabilities planned to be implemented?
>>>>
>>>> About local data: yes, we can rename all currently implemented views
>>>> for the local node data to LOCAL_..., and (someday) create new
>>>> whole-cluster views (which use distributed requests) without a prefix
>>>> or, for example, with a CLUSTER_ prefix. But some views can show all
>>>> cluster information using only local node data, without distributed
>>>> requests (for example IGNITE_NODE_METRICS, IGNITE_PART_ASSIGNMENT,
>>>> IGNITE_PART_ALLOCATION, IGNITE_NODES, etc.). Are they local or cluster
>>>> views in this concept? Which prefix should be used? And what about
>>>> caches? Are they local or cluster? On a local node we can see
>>>> cluster-wide caches (replicated and distributed) as well as caches for
>>>> the current node only. The local caches list may differ from node to
>>>> node. Which prefix should be used for this view? And one more thing:
>>>> for some views there is no sense in making them cluster-wide (for
>>>> example IGNITE_INSTANCE). Should we name it LOCAL_INSTANCE without
>>>> creating an INSTANCE view?
>>>>
>>>> So, next steps: split the PR, change the schema name (IGNITE?), change
>>>> the view name for caches (CACHES, LOCAL_CACHES?).
>>>>
>>>> 2018-01-24 13:03 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
>>>>
>>>>> Hi Alex,
>>>>>
>>>>> System views could be an extremely valuable addition to Ignite.
>>>>> Ideally, a user should be able to monitor and manage the state of the
>>>>> whole cluster from a single SQL command line. We have had plans to
>>>>> implement this for a very long time. However, it is a very sensitive
>>>>> task which has to take a lot of moving pieces into account, such as
>>>>> usability, consistency, performance, security, etc.
>>>>>
>>>>> Let me point out several major concerns I see at the moment:
>>>>>
>>>>> 1) Usability: INFORMATION_SCHEMA
>>>>> This schema is part of the ANSI SQL standard. When creating system
>>>>> views, some vendors prefer to store them in a completely different
>>>>> predefined schema (Oracle, MS SQL). Others prefer to keep them in
>>>>> INFORMATION_SCHEMA directly. Both approaches could work. However, the
>>>>> latter breaks separation of concerns - we store typical metadata next
>>>>> to possibly sensitive system data. It also makes security management
>>>>> more complex - system data is very sensitive, and we can no longer
>>>>> simply grant a user access to INFORMATION_SCHEMA; instead, we have to
>>>>> grant that access on a per-view basis. For this reason my preference
>>>>> is to store system tables in a separate schema, not in
>>>>> INFORMATION_SCHEMA.
>>>>>
>>>>> 2) Consistency: local data
>>>>> One of the implemented views is GridH2SysViewImplInstance. Normally
>>>>> SQL users communicate with Ignite through JDBC/ODBC drivers. These
>>>>> drivers are connected to a single node, typically a client node.
>>>>> Moreover, we will introduce a high-availability feature where drivers
>>>>> are able to connect to any address from a predefined list. That
>>>>> renders this view useless, as you do not know which node you are
>>>>> connected to. Also, local-only data cannot be joined in the general
>>>>> case - you will receive different results on different nodes. The
>>>>> same goes for transactions, JVM info, etc.
>>>>>
>>>>> 3) Performance
>>>>> Suppose we fixed the consistency of transactions, and this view now
>>>>> shows transactions in the whole cluster with the possibility to
>>>>> filter them by nodes - this is what a user would expect out of the
>>>>> box. Another problem then appears - performance. How would we collect
>>>>> the necessary data? How would we handle joins, when a particular view
>>>>> could be scanned multiple times during query execution? How do we
>>>>> achieve sensible consistency? Most probably we would collect remote
>>>>> data once when the query starts, cache it somehow at the query
>>>>> session level, and then re-use it during joins. But again, this
>>>>> should be discussed separately.
>>>>>
>>>>> 4) Security: JVM info
>>>>> We should define clear boundaries of what info is exposed. JVM data,
>>>>> along with running threads, is critically sensitive information. We
>>>>> should not expose it until we have authorization capabilities.
>>>>>
>>>>> In order to start moving this code from prototype to production state
>>>>> we should start with the most simple and consistent views, e.g.
>>>>> IGNITE_CACHES. Let's move it to a separate PR, review the
>>>>> infrastructure code, review the view implementation, agree on proper
>>>>> naming and placement, and merge it. Then each and every view (or
>>>>> group of related views) should be discussed and reviewed separately.
>>>>>
>>>>> As far as node-local stuff goes, maybe we should move it to a
>>>>> separate schema, or mark it with a special prefix. E.g.
>>>>> "IGNITE.TRANSACTIONS" - all transactions in the cluster,
>>>>> "IGNITE.LOCAL_TRANSACTIONS" - transactions on the local node. In this
>>>>> case we will be able to merge the "local" stuff shortly, and
>>>>> implement the more complex but at the same time much more useful
>>>>> distributed stuff later on.
>>>>>
>>>>> Makes sense?
>>>>>
>>>>> Vladimir.
>>>>>
>>>>> On Tue, Jan 23, 2018 at 8:30 PM, Alex Plehanov <plehanov.a...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello, Igniters!
>>>>>>
>>>>>> For Ignite diagnostics it is usually helpful to get some information
>>>>>> about Ignite internals. But currently, in my opinion, there are no
>>>>>> convenient tools for this purpose:
>>>>>>
>>>>>> · Some issues can be solved by analyzing log files. Log files are
>>>>>> useful for dumps, but sometimes they are difficult to read. Also,
>>>>>> interesting metrics can't be obtained at runtime on request; we have
>>>>>> to wait until Ignite writes these metrics on a timeout or some other
>>>>>> event.
>>>>>>
>>>>>> · JMX is useful for scalar metrics. Complex and tabular data can
>>>>>> also be retrieved, but it is difficult to read, filter and sort
>>>>>> without processing by specialized external tools. For the most
>>>>>> frequently used cases, almost-duplicate metrics are created just to
>>>>>> show data in an easy-to-read form.
>>>>>>
>>>>>> · Web-console is able to show tabular and complex data. Perhaps
>>>>>> someday web-console will contain all the necessary dashboards for
>>>>>> most problem investigations, but some non-trivial queries will not
>>>>>> be covered anyway. Also, web-console needs additional infrastructure
>>>>>> to work.
>>>>>>
>>>>>> · External "home-made" tools can be used for non-trivial cases. They
>>>>>> cover highly specialized cases and usually can't be used as
>>>>>> general-purpose tools.
>>>>>>
>>>>>> Sometimes we are forced to use more than one tool and join the data
>>>>>> by hand (for example, a current thread dump and data from logs).
>>>>>>
>>>>>> RDBMSs often provide system views for diagnostic purposes (for
>>>>>> example, DBA_% and V$% in Oracle), which can be queried by SQL. This
>>>>>> approach makes all internal diagnostic information available in a
>>>>>> readable form (with all possible filters and projections) without
>>>>>> using any other internal or external tools. My proposal is to create
>>>>>> similar system views in Ignite.
>>>>>>
>>>>>> I have implemented a working prototype (PR: [1]). It contains the
>>>>>> following views:
>>>>>>
>>>>>> IGNITE_SYSTEM_VIEWS - Registered system views
>>>>>> IGNITE_INSTANCE - Ignite instance
>>>>>> IGNITE_JVM_THREADS - JVM threads
>>>>>> IGNITE_JVM_RUNTIME - JVM runtime
>>>>>> IGNITE_JVM_OS - JVM operating system
>>>>>> IGNITE_CACHES - Ignite caches
>>>>>> IGNITE_CACHE_CLUSTER_METRICS - Ignite cache cluster metrics
>>>>>> IGNITE_CACHE_NODE_METRICS - Ignite cache node metrics
>>>>>> IGNITE_CACHE_GROUPS - Cache groups
>>>>>> IGNITE_NODES - Nodes in topology
>>>>>> IGNITE_NODE_HOSTS - Node hosts
>>>>>> IGNITE_NODE_ADDRESSES - Node addresses
>>>>>> IGNITE_NODE_ATTRIBUTES - Node attributes
>>>>>> IGNITE_NODE_METRICS - Node metrics
>>>>>> IGNITE_TRANSACTIONS - Active transactions
>>>>>> IGNITE_TRANSACTION_ENTRIES - Cache entries used by transactions
>>>>>> IGNITE_TASKS - Active tasks
>>>>>> IGNITE_PART_ASSIGNMENT - Partition assignment map
>>>>>> IGNITE_PART_ALLOCATION - Partition allocation map
>>>>>>
>>>>>> Many more useful views could be implemented (executors diagnostics,
>>>>>> SPIs diagnostics, etc.).
>>>>>>
>>>>>> Some usage examples:
>>>>>>
>>>>>> Cache groups and their partitions used by transactions running for
>>>>>> more than 5 minutes:
>>>>>>
>>>>>> SELECT cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION, COUNT(*) AS ENTITIES_CNT
>>>>>> FROM INFORMATION_SCHEMA.IGNITE_TRANSACTIONS t
>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_TRANSACTION_ENTRIES te ON t.XID = te.XID
>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_CACHES c ON te.CACHE_NAME = c.NAME
>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_CACHE_GROUPS cg ON c.GROUP_ID = cg.ID
>>>>>> WHERE t.START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
>>>>>> GROUP BY cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION
>>>>>>
>>>>>> Average CPU load on server nodes, grouped by operating system:
>>>>>>
>>>>>> SELECT na.VALUE, COUNT(n.ID), AVG(nm.AVG_CPU_LOAD) AVG_CPU_LOAD
>>>>>> FROM INFORMATION_SCHEMA.IGNITE_NODES n
>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_NODE_ATTRIBUTES na ON na.NODE_ID = n.ID
>>>>>>   AND na.NAME = 'os.name'
>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_NODE_METRICS nm ON nm.NODE_ID = n.ID
>>>>>> WHERE n.IS_CLIENT = false
>>>>>> GROUP BY na.VALUE
>>>>>>
>>>>>> Top 5 nodes by puts to cache 'cache':
>>>>>>
>>>>>> SELECT cm.NODE_ID, cm.CACHE_PUTS
>>>>>> FROM INFORMATION_SCHEMA.IGNITE_CACHE_NODE_METRICS cm
>>>>>> WHERE cm.CACHE_NAME = 'cache'
>>>>>> ORDER BY cm.CACHE_PUTS DESC
>>>>>> LIMIT 5
>>>>>>
>>>>>> Is this implementation interesting to anyone else? Maybe some views
>>>>>> are redundant? Which additional first-priority views should be
>>>>>> implemented? Any other thoughts or proposals?
>>>>>>
>>>>>> [1] https://github.com/apache/ignite/pull/3413
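[Editor's note] The high-level design described in the thread (a base abstract class GridH2SysView whose subclasses declare their columns in the constructor and override getRows, which the H2-compatible table and index wrappers call) could be sketched roughly as below. This is an illustrative sketch only: the method signatures, the SysViewLocalTransactions example and its columns are assumptions for demonstration, not the actual code from the PR.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Base class for Ignite system views: subclasses describe their columns
// in the constructor and implement getRows(), which the H2-compatible
// table and index implementations call to fetch data.
abstract class GridH2SysView {
    private final String name;
    private final List<String> cols;

    protected GridH2SysView(String name, String... cols) {
        this.name = name;
        this.cols = Arrays.asList(cols);
    }

    public String name() { return name; }

    public List<String> columns() { return cols; }

    // Returns the view's rows; first/last are optional search bounds that
    // an index implementation may pass in to narrow the scan.
    public abstract Iterable<Object[]> getRows(Object first, Object last);
}

// Hypothetical view over the node's active transactions. In a real
// implementation the rows would come from Ignite's transaction manager;
// here a single hard-coded row stands in for that data.
class SysViewLocalTransactions extends GridH2SysView {
    SysViewLocalTransactions() {
        super("LOCAL_TRANSACTIONS", "XID", "START_TIME", "STATE");
    }

    @Override public Iterable<Object[]> getRows(Object first, Object last) {
        return Collections.<Object[]>singletonList(
            new Object[] {"xid-1", System.currentTimeMillis(), "ACTIVE"});
    }
}
```

With this shape, the processor only needs to iterate over the registered GridH2SysView instances and wrap each one in an H2 table, while all view-specific logic stays in getRows.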