I would start with NODES and NODE_ATTRIBUTES as the simplest ones.

On Tue, Feb 13, 2018 at 4:10 AM, Denis Magda <dma...@apache.org> wrote:
> Alex P, sounds like a good plan for me.
>
> Vladimir, do you have any suggestions or corrections?
>
> —
> Denis
>
> > On Feb 12, 2018, at 4:57 AM, Alex Plehanov <plehanov.a...@gmail.com> wrote:
> >
> > The views engine and the first view are almost ready to merge (review
> > comments are resolved). Which views should we take next? My proposal:
> > NODES, NODE_ATTRIBUTES, NODE_METRICS, NODE_HOSTS and NODE_ADDRESSES,
> > since these views are clear and all topology data is available on each
> > node. Any objections?
> >
> > 2018-01-25 16:27 GMT+03:00 Alex Plehanov <plehanov.a...@gmail.com>:
> >
> >> Anton, Vladimir, I've made some fixes. There is only one view left,
> >> and it's renamed to 'IGNITE.LOCAL_TRANSACTIONS'.
> >>
> >> High-level design of the solution:
> >> When IgniteH2Indexing starts, it creates and starts a new
> >> GridH2SysViewProcessor, which creates and registers in H2 (via its own
> >> table engine) all implementations of system views. Each system view
> >> implementation extends the base abstract class GridH2SysView. A view
> >> implementation describes its columns, their types and indexes in the
> >> constructor, and must override the getRows method for data retrieval
> >> (this method is called by the H2-compatible table and index
> >> implementations for Ignite system views). Almost no fixes to the
> >> existing parsing engine were needed, except in some places where a
> >> GridH2Table instance was expected but system views use another class.
> >>
> >> New PR: [1]. Please have a look. (A sketch of this view API appears a
> >> little further down in this thread.)
> >>
> >> [1] https://github.com/apache/ignite/pull/3433
> >>
> >> 2018-01-24 19:12 GMT+03:00 Anton Vinogradov <avinogra...@gridgain.com>:
> >>
> >>> I've created IEP-13 [1] to cover all cases.
> >>> Feel free to create issues.
> >>>
> >>> [1]
> >>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75962769
> >>>
> >>> On Wed, Jan 24, 2018 at 6:10 PM, Vladimir Ozerov <voze...@gridgain.com>
> >>> wrote:
> >>>
> >>>> Let's start with a single and most simple view, e.g.
> >>>> LOCAL_TRANSACTIONS. We will review and merge it along with the
> >>>> necessary infrastructure, then handle the rest of the views in
> >>>> separate tickets and separate focused discussions.
> >>>>
> >>>> On Wed, Jan 24, 2018 at 5:29 PM, Alex Plehanov <plehanov.a...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> 1) It's not a principal point, I can change the schema. The
> >>>>> INFORMATION_SCHEMA was used because it already exists and is usually
> >>>>> used for metadata tables and views. Your proposal is to use the
> >>>>> schema "IGNITE", do I understand you right? BTW, for now we can't
> >>>>> query other (H2) meta tables from the INFORMATION_SCHEMA, so the
> >>>>> Ignite system views are the only views available to query from this
> >>>>> schema.
> >>>>> 2) Exactly for this reason the IGNITE_INSTANCE view is useful: to
> >>>>> determine which node we are connected to.
> >>>>> 3) As the first phase, in my opinion, local views will be enough.
> >>>>> Performance and caching of distributed views should be discussed in
> >>>>> the next phases, when the distributed views implementation is
> >>>>> planned. In the current implementation I tried to use indexing for
> >>>>> local views wherever possible.
> >>>>> 4) I don't think that JVM info is more critical information than,
> >>>>> for example, caches or nodes information. When are authorization
> >>>>> capabilities planned to be implemented?
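For readers following the design described in the 2018-01-25 message above, here is a minimal, self-contained sketch of what a view implementation might look like. Only the GridH2SysView base class and the getRows method come from the PR description; everything else (the constructor signature, the stand-in column handling, the NodesSysView example) is a simplified assumption, not the actual Ignite code.

import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.UUID;

// Hypothetical stand-in for the base class registered with H2's table engine.
abstract class GridH2SysView {
    final String name;          // view name, e.g. "NODES"
    final List<String> columns; // column names declared by the subclass

    protected GridH2SysView(String name, String... columns) {
        this.name = name;
        this.columns = Arrays.asList(columns);
    }

    // Called by the H2-compatible table/index implementation; yields one
    // Object[] per row, in the order the columns were declared.
    public abstract Iterator<Object[]> getRows();
}

// Example subclass: a NODES view over some topology snapshot.
class NodesSysView extends GridH2SysView {
    private final List<UUID> nodeIds; // stand-in for local topology data

    NodesSysView(List<UUID> nodeIds) {
        super("NODES", "ID", "IS_CLIENT");
        this.nodeIds = nodeIds;
    }

    @Override
    public Iterator<Object[]> getRows() {
        // Materialize one row per node; a real view would read live data.
        return nodeIds.stream()
            .map(id -> new Object[] { id, Boolean.FALSE })
            .iterator();
    }

    public static void main(String[] args) {
        GridH2SysView view = new NodesSysView(Arrays.asList(UUID.randomUUID()));
        view.getRows().forEachRemaining(row -> System.out.println(Arrays.toString(row)));
    }
}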
> >>>>>
> >>>>> About local data: yes, we can rename all currently implemented views
> >>>>> for the local node data as LOCAL_..., and create (someday) new
> >>>>> whole-cluster views (which use distributed requests) without a
> >>>>> prefix or, for example, with a CLUSTER_ prefix. But some views can
> >>>>> show all cluster information using only local node data, without
> >>>>> distributed requests (for example IGNITE_NODE_METRICS,
> >>>>> IGNITE_PART_ASSIGNMENT, IGNITE_PART_ALLOCATION, IGNITE_NODES, etc.).
> >>>>> Are they local or cluster views in this concept? Which prefix should
> >>>>> be used? And what about caches? Are they local or cluster? On a
> >>>>> local node we can see cluster-wide caches (replicated and
> >>>>> distributed) and caches for the current node only. The local caches
> >>>>> list may differ from node to node. Which prefix should be used for
> >>>>> this view? And one more thing: for some views there is no sense in
> >>>>> making them cluster-wide (for example IGNITE_INSTANCE). Should we
> >>>>> name it LOCAL_INSTANCE without creating an INSTANCE view?
> >>>>>
> >>>>> So, next steps: split the PR, change the schema name (IGNITE?),
> >>>>> change the view name for caches (CACHES, LOCAL_CACHES?).
> >>>>>
> >>>>> 2018-01-24 13:03 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> >>>>>
> >>>>>> Hi Alex,
> >>>>>>
> >>>>>> System views could be an extremely valuable addition to Ignite.
> >>>>>> Ideally, a user should be able to monitor and manage the state of
> >>>>>> the whole cluster from a single SQL command line. We have had plans
> >>>>>> to implement this for a very long time. However, it is a very
> >>>>>> sensitive task which has to take a lot of moving pieces into
> >>>>>> account, such as usability, consistency, performance, security, etc.
> >>>>>>
> >>>>>> Let me point out several major concerns I see at the moment:
> >>>>>>
> >>>>>> 1) Usability: INFORMATION_SCHEMA
> >>>>>> This schema is part of the ANSI SQL standard. When creating system
> >>>>>> views, some vendors prefer to store them in a completely different
> >>>>>> predefined schema (Oracle, MS SQL). Others prefer to keep them in
> >>>>>> INFORMATION_SCHEMA directly. Both approaches could work. However,
> >>>>>> the latter breaks separation of concerns - we store typical
> >>>>>> metadata next to possibly sensitive system data. It also makes
> >>>>>> security management more complex - system data is very sensitive,
> >>>>>> and then we cannot simply grant a user access to INFORMATION_SCHEMA.
> >>>>>> Instead, we have to grant access on a per-view basis. For this
> >>>>>> reason my preference is to store system tables in a separate
> >>>>>> schema, not in INFORMATION_SCHEMA.
> >>>>>>
> >>>>>> 2) Consistency: local data
> >>>>>> One of the implemented views is GridH2SysViewImplInstance. Normally
> >>>>>> SQL users communicate with Ignite through JDBC/ODBC drivers. These
> >>>>>> drivers are connected to a single node, typically a client node.
> >>>>>> Moreover, we will introduce a high-availability feature where
> >>>>>> drivers are able to connect to any address from a predefined list.
> >>>>>> This renders the view useless, as you do not know which node you
> >>>>>> are connected to. Also, local-only data cannot be joined in the
> >>>>>> general case - you will receive different results on different
> >>>>>> nodes (see the sketch below). The same goes for transactions, JVM
> >>>>>> info, etc.
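To make the concern in point 2 concrete, here is a hedged sketch of how the same query can legitimately return different results depending on which node the driver connects to. It assumes the Ignite JDBC thin driver URL format and the IGNITE.LOCAL_TRANSACTIONS view name proposed in this thread; the node addresses are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LocalViewDemo {
    // Count active transactions as reported by one specific node.
    static int countLocalTxs(String nodeAddr) throws Exception {
        // Assumes ignite-core is on the classpath so the thin driver registers itself.
        try (Connection conn = DriverManager.getConnection("jdbc:ignite:thin://" + nodeAddr);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM IGNITE.LOCAL_TRANSACTIONS")) {
            rs.next();
            return rs.getInt(1);
        }
    }

    public static void main(String[] args) throws Exception {
        // Two connections, two nodes: the counts may differ, because each
        // node reports only its own active transactions.
        System.out.println("node1: " + countLocalTxs("192.168.0.1"));
        System.out.println("node2: " + countLocalTxs("192.168.0.2"));
    }
}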
> >>>>>>
> >>>>>> 3) Performance
> >>>>>> Suppose we fixed the consistency of transactions and now this view
> >>>>>> shows transactions in the whole cluster with the possibility to
> >>>>>> filter them by node - this is what a user would expect out of the
> >>>>>> box. Another problem appears then - performance. How would we
> >>>>>> collect the necessary data? How would we handle joins, when a
> >>>>>> particular view could be scanned multiple times during query
> >>>>>> execution? How do we achieve sensible consistency? Most probably we
> >>>>>> would collect remote data once when the query is started, cache it
> >>>>>> somehow at the query session level, and then re-use it during
> >>>>>> joins. But again, this should be discussed separately.
> >>>>>>
> >>>>>> 4) Security: JVM info
> >>>>>> We should define clear boundaries of what info is exposed. JVM data
> >>>>>> along with running threads is critically sensitive information. We
> >>>>>> should not expose it until we have authorization capabilities.
> >>>>>>
> >>>>>> In order to start moving this code from prototype to production
> >>>>>> state we should start with the most simple and consistent views,
> >>>>>> e.g. IGNITE_CACHES. Let's move it to a separate PR, review the
> >>>>>> infrastructure code, review the view implementation, agree on
> >>>>>> proper naming and placement, and merge it. Then each and every view
> >>>>>> (or group of related views) should be discussed and reviewed
> >>>>>> separately.
> >>>>>>
> >>>>>> As far as the node-local stuff goes, maybe we should move it to a
> >>>>>> separate schema, or mark it with a special prefix, e.g.
> >>>>>> "IGNITE.TRANSACTIONS" - all transactions in the cluster,
> >>>>>> "IGNITE.LOCAL_TRANSACTIONS" - transactions on the local node. In
> >>>>>> this case we will be able to merge the "local" stuff shortly, and
> >>>>>> implement the more complex but at the same time much more useful
> >>>>>> distributed stuff later on.
> >>>>>>
> >>>>>> Makes sense?
> >>>>>>
> >>>>>> Vladimir.
> >>>>>>
> >>>>>> On Tue, Jan 23, 2018 at 8:30 PM, Alex Plehanov <plehanov.a...@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hello, Igniters!
> >>>>>>>
> >>>>>>> For Ignite diagnostics it's usually helpful to get some
> >>>>>>> information about Ignite internals. But currently, in my opinion,
> >>>>>>> there are no convenient tools for this purpose:
> >>>>>>>
> >>>>>>> · Some issues can be solved by analyzing log files. Log files are
> >>>>>>> useful for dumps, but sometimes they are difficult to read. Also,
> >>>>>>> interesting metrics can't be obtained at runtime on request; we
> >>>>>>> need to wait until Ignite writes these metrics on a timeout or
> >>>>>>> other event.
> >>>>>>>
> >>>>>>> · JMX is useful for scalar metrics. Complex and table data can
> >>>>>>> also be received, but it's difficult to read, filter and sort it
> >>>>>>> without processing by specialized external tools. For the most
> >>>>>>> frequently used cases, almost-duplicate metrics are created just
> >>>>>>> to show data in an easy-to-read form.
> >>>>>>>
> >>>>>>> · Web-console is able to show table and complex data. Perhaps
> >>>>>>> someday web-console will contain all the necessary dashboards for
> >>>>>>> most problem investigations, but some non-trivial queries will not
> >>>>>>> be covered anyway. Also, web-console needs additional
> >>>>>>> infrastructure to work.
> >>>>>>>
> >>>>>>> · External "home-made" tools can be used for non-trivial cases.
> >>>>>>> They cover highly specialized cases and usually can't be used as
> >>>>>>> general-purpose tools.
> >>>>>>>
> >>>>>>> Sometimes we are forced to use more than one tool and join data by
> >>>>>>> hand (for example, a current thread dump and data from logs).
> >>>>>>>
> >>>>>>> For diagnostic purposes an RDBMS often provides system views (for
> >>>>>>> example, DBA_% and V$% in Oracle), which can be queried by SQL.
> >>>>>>> This solution makes all internal diagnostic information available
> >>>>>>> in a readable form (with all possible filters and projections)
> >>>>>>> without using any other internal or external tools. My proposal is
> >>>>>>> to create similar system views in Ignite.
> >>>>>>>
> >>>>>>> I implemented a working prototype (PR: [1]). It contains the
> >>>>>>> following views:
> >>>>>>>
> >>>>>>> IGNITE_SYSTEM_VIEWS          Registered system views
> >>>>>>> IGNITE_INSTANCE              Ignite instance
> >>>>>>> IGNITE_JVM_THREADS           JVM threads
> >>>>>>> IGNITE_JVM_RUNTIME           JVM runtime
> >>>>>>> IGNITE_JVM_OS                JVM operating system
> >>>>>>> IGNITE_CACHES                Ignite caches
> >>>>>>> IGNITE_CACHE_CLUSTER_METRICS Ignite cache cluster metrics
> >>>>>>> IGNITE_CACHE_NODE_METRICS    Ignite cache node metrics
> >>>>>>> IGNITE_CACHE_GROUPS          Cache groups
> >>>>>>> IGNITE_NODES                 Nodes in topology
> >>>>>>> IGNITE_NODE_HOSTS            Node hosts
> >>>>>>> IGNITE_NODE_ADDRESSES        Node addresses
> >>>>>>> IGNITE_NODE_ATTRIBUTES       Node attributes
> >>>>>>> IGNITE_NODE_METRICS          Node metrics
> >>>>>>> IGNITE_TRANSACTIONS          Active transactions
> >>>>>>> IGNITE_TRANSACTION_ENTRIES   Cache entries used by transactions
> >>>>>>> IGNITE_TASKS                 Active tasks
> >>>>>>> IGNITE_PART_ASSIGNMENT       Partition assignment map
> >>>>>>> IGNITE_PART_ALLOCATION       Partition allocation map
> >>>>>>>
> >>>>>>> Many more useful views could be implemented (executors
> >>>>>>> diagnostics, SPIs diagnostics, etc.).
> >>>>>>>
> >>>>>>> Some usage examples:
> >>>>>>>
> >>>>>>> Cache groups and their partitions that are used by transactions
> >>>>>>> running longer than 5 minutes:
> >>>>>>>
> >>>>>>> SELECT cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION, COUNT(*) AS ENTITIES_CNT
> >>>>>>> FROM INFORMATION_SCHEMA.IGNITE_TRANSACTIONS t
> >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_TRANSACTION_ENTRIES te ON t.XID = te.XID
> >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_CACHES c ON te.CACHE_NAME = c.NAME
> >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_CACHE_GROUPS cg ON c.GROUP_ID = cg.ID
> >>>>>>> WHERE t.START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
> >>>>>>> GROUP BY cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION
> >>>>>>>
> >>>>>>> Average CPU load on server nodes grouped by operating system:
> >>>>>>>
> >>>>>>> SELECT na.VALUE, COUNT(n.ID), AVG(nm.AVG_CPU_LOAD) AVG_CPU_LOAD
> >>>>>>> FROM INFORMATION_SCHEMA.IGNITE_NODES n
> >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_NODE_ATTRIBUTES na ON na.NODE_ID = n.ID
> >>>>>>>   AND na.NAME = 'os.name'
> >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_NODE_METRICS nm ON nm.NODE_ID = n.ID
> >>>>>>> WHERE n.IS_CLIENT = false
> >>>>>>> GROUP BY na.VALUE
> >>>>>>>
> >>>>>>> Top 5 nodes by puts to the cache 'cache':
> >>>>>>>
> >>>>>>> SELECT cm.NODE_ID, cm.CACHE_PUTS
> >>>>>>> FROM INFORMATION_SCHEMA.IGNITE_CACHE_NODE_METRICS cm
> >>>>>>> WHERE cm.CACHE_NAME = 'cache'
> >>>>>>> ORDER BY cm.CACHE_PUTS DESC
> >>>>>>> LIMIT 5
> >>>>>>>
> >>>>>>> Is this implementation interesting to anyone else? Maybe some
> >>>>>>> views are redundant? Which additional first-priority views must be
> >>>>>>> implemented? Any other thoughts or proposals?
> >>>>>>>
> >>>>>>> [1] https://github.com/apache/ignite/pull/3413
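Following up on the suggestion at the top of the thread to start with NODES and NODE_ATTRIBUTES, here is a hedged sketch of the kind of query those two views would enable, modeled on the examples above. The IGNITE schema, the unprefixed view names, and the column names follow proposals made in this thread; the final merged implementation may use different names.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class NodesByOsDemo {
    public static void main(String[] args) throws Exception {
        // Count nodes per operating system by joining the node list with
        // the per-node attribute rows (view and column names assumed).
        String sql =
            "SELECT na.VALUE AS OS, COUNT(n.ID) AS NODES_CNT " +
            "FROM IGNITE.NODES n " +
            "JOIN IGNITE.NODE_ATTRIBUTES na ON na.NODE_ID = n.ID " +
            "  AND na.NAME = 'os.name' " +
            "GROUP BY na.VALUE";

        try (Connection conn = DriverManager.getConnection("jdbc:ignite:thin://127.0.0.1");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next())
                System.out.println(rs.getString("OS") + ": " + rs.getInt("NODES_CNT"));
        }
    }
}

The join shape mirrors the os.name example earlier in the thread: attributes are modeled as name/value rows keyed by node ID, so any attribute can be filtered or grouped on without schema changes.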