wu-sheng commented on a change in pull request #995: Collector table description for develop guide.
URL: https://github.com/apache/incubator-skywalking/pull/995#discussion_r178219315
##########
File path: docs/en/Collector-Table-Description.md
##########
@@ -0,0 +1,572 @@
+# Collector Table Description
+This document describes the usage of tables and their columns, based on the Elasticsearch storage implementation.
+
+## Metric table time bucket
+### Date format
+- second: `yyyyMMddHHmmss`
+- minute: `yyyyMMddHHmm`
+- hour: `yyyyMMddHH`
+- day: `yyyyMMdd`
+- month: `yyyyMM`
+
+## Tables of Register related
+### Application
+- Table name: application
+- Get or create a database record by "application_code".
+
+Column Name | Short Name | Data Type | Description
+----------- | ---------- | --------- | ---------
+_id | _id | Keyword | primary key, es speciality, the value same as application_id
+application_code | c1 | Keyword | The application name, see `agent.config`
+application_id | c2 | Integer | Auto increment, is a signed integer
+layer | c3 | Integer | Register by client or server side
+is_address | c4 | Integer | Is a boolean data. True(1), False(0)
+address_id | c5 | Integer | A foreign key referencing the network_address table
+
+- Column `is_address`
+  - `false`. A real application, which has a custom `application_code`. In this case, the `address_id` column value must be 0.
+  - `true`. A conjunction application based on IP address. `address_id` is registered in the `network_address` table.
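As a rough illustration of the date formats listed under "Metric table time bucket", the sketch below converts a timestamp into a numeric time bucket for each dimension. It is written in Python rather than the collector's own language; the function name `time_bucket` and the `strftime` mapping are assumptions made for this example and are not taken from the collector code.

```python
from datetime import datetime

# Python strftime equivalents of the patterns listed in the document.
# The mapping itself is an assumption for illustration only.
TIME_BUCKET_FORMATS = {
    "second": "%Y%m%d%H%M%S",  # yyyyMMddHHmmss
    "minute": "%Y%m%d%H%M",    # yyyyMMddHHmm
    "hour": "%Y%m%d%H",        # yyyyMMddHH
    "day": "%Y%m%d",           # yyyyMMdd
    "month": "%Y%m",           # yyyyMM
}


def time_bucket(moment: datetime, dimension: str) -> int:
    """Format `moment` with the pattern for `dimension` and return it as an integer."""
    return int(moment.strftime(TIME_BUCKET_FORMATS[dimension]))


if __name__ == "__main__":
    moment = datetime(2018, 3, 30, 9, 5, 7)
    for dimension in TIME_BUCKET_FORMATS:
        print(dimension, time_bucket(moment, dimension))
```

Storing the bucket as an integer matches the `Long` data type the tables use for `time_bucket`, and coarser dimensions simply drop trailing digits.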
+
+### Instance
+- Table name: instance
+- Create an instance by `application_id` and `agent_uuid`
+
+Column Name | Short Name | Data Type | Description
+----------- | ---------- | --------- | ---------
+_id | _id | Keyword | primary key, es speciality, the value same as instance_id
+application_id | c1 | Integer | Owner application id
+application_code | c2 | Text | Owner application code
+agent_uuid | c3 | Keyword | Uniquely identifies each server monitored by agent
+register_time | c4 | Long | First register time
+instance_id | c5 | Integer | Auto increment, is an unsigned integer
+heartbeat_time | c6 | Long | Indicates that the server is alive
+os_info | c7 | Text | JSON data.
+is_address | c8 | Integer | Is a boolean data. True(1), False(0)
+address_id | c9 | Integer | A foreign key referencing the network_address table
+
+- Column `os_info`
+  - For example: {"osName":"MacOS X","hostName":"peng-yongsheng","processId":1000,"ipv4s":["10.0.0.1","10.0.0.2"]}
+- Column `heartbeat_time`
+  - Updated by agent heart beat [1]
+  - Updated by JVM metric data [2]
+  - Updated by trace segment data. [3]
+  - Priority: [1] > [2] > [3]
+
+### NetworkAddress
+- Table name: network_address
+- Create a network address record by "network_address" and "span_layer".
+
+Column Name | Short Name | Data Type | Description
+----------- | ---------- | --------- | ---------
+_id | _id | Keyword | primary key, es speciality, the value same as address_id
+address_id | c1 | Integer | Auto increment, is a signed integer
+network_address | c2 | Keyword | Host name or IP address
+span_layer | c3 | Integer | Register by client or server side
+server_type | c4 | Integer | Such as component id, used for topology.
+
+### ServiceName
+- Table name: service_name
+- Create a service record by "service_name_keyword", "application_id" and "src_span_type".
+
+Column Name | Short Name | Data Type | Description
+----------- | ---------- | --------- | ---------
+_id | _id | Keyword | primary key, es speciality, the value same as service_id
+service_id | c1 | Integer | Auto increment, is a signed integer
+service_name | c2 | Text | Operation name, used for fuzzy matching
+service_name_keyword | c3 | Keyword | Operation name, used for full matching
+application_id | c4 | Integer | Owner application id
+src_span_type | c5 | Integer | Register from client or server side based on `src_span_type`
+
+- See `src_span_type` in [protocol doc](Trace-Data-Protocol.md#network-address-register-service)
+
+## Table of Trace Metric related
+### ApplicationComponent
+- Table name: application_component_`TimeDimension`
+- TimeDimension contains minute, hour, day, month
+- It is primarily used for the view of node type in application topology.
+
+Column Name | Short Name | Data Type | Description
+----------- | ---------- | --------- | ---------
+_id | _id | Keyword | primary key, es speciality, `time_bucket`_`metric_id`
+metric_id | c1 | Keyword | `application_id`_`component_id`
+component_id | c2 | Integer | [Component id](https://github.com/apache/incubator-skywalking/blob/master/apm-protocol/apm-network/src/main/java/org/apache/skywalking/apm/network/trace/component/ComponentsDefine.java)
+application_id | c3 | Integer | Owner application id
+time_bucket | tb | Long | [Date format](Collector-Table-Description.md#Metric-table-time-bucket)
+
+### ApplicationMapping
+- Table name: application_mapping_`TimeDimension`
+- TimeDimension contains minute, hour, day, month
+- For example: when application A invokes application B, the collector generates two metrics:
+  * From the caller's trace data: application A -> application B's IP address (Topology will use this metric when application B is not monitored by agent)
+  * From the callee's trace data: application A -> application B (Topology will use this metric when application B is monitored by agent)
+
+Column Name | Short Name | Data Type | Description
+----------- | ---------- | --------- | ---------
+_id | _id | Keyword | primary key, es speciality, `time_bucket`_`metric_id`
+metric_id | c1 | Keyword | `application_id`_`mapping_application_id`
+application_id | c2 | Integer | Registered at server side.
+mapping_application_id | c3 | Integer | Registered at client side with the server's IP address.
+time_bucket | tb | Long | [Date format](Collector-Table-Description.md#Metric-table-time-bucket)
+
+### ApplicationMetric
+- Table name: application_metric_`TimeDimension`
+- TimeDimension contains minute, hour, day, month
+
+Column Name | Short Name | Data Type | Description
+----------- | ---------- | --------- | ---------
+_id | _id | Keyword | primary key, es speciality, `time_bucket`_`metric_id`
+metric_id | c1 | Keyword | `application_id`_`source_value`
+application_id | c2 | Integer | Owner application id
+source_value | c3 | Integer | Caller(0), Callee(1)
+transaction_calls | a1 | Long | The total number of calls, summed and aggregated by `time_bucket`
+transaction_error_calls | a2 | Long | The total number of error calls, summed and aggregated by `time_bucket`
+transaction_duration_sum | a3 | Long | The duration sum of all calls, summed and aggregated by `time_bucket`
+transaction_error_duration_sum | a4 | Long | The duration sum of error calls, summed and aggregated by `time_bucket`
+transaction_average_duration | a5 | Long | The average duration of all calls, used to order by this column in the database.
+business_transaction_calls | b1 | Long |
+business_transaction_error_calls | b2 | Long |
+business_transaction_duration_sum | b3 | Long |
+business_transaction_error_duration_sum | b4 | Long |
+business_transaction_average_duration | b5 | Long |
+mq_transaction_calls | m1 | Long |
+mq_transaction_error_calls | m2 | Long |
+mq_transaction_duration_sum | m3 | Long |
+mq_transaction_error_duration_sum | m4 | Long |
+mq_transaction_average_duration | m5 | Long |
+satisfied_count | d1 | Long | [The formula](../../apm-collector/apm-collector-core/src/main/java/org/apache/skywalking/apm/collector/core/util/ApdexThresholdUtils.java)
+tolerating_count | d2 | Long |
+frustrated_count | d3 | Long |
+time_bucket | tb | Long | [Date format](Collector-Table-Description.md#Metric-table-time-bucket)
+
+### ApplicationReferenceMetric
+- Table name: application_reference_metric_`TimeDimension`
+- TimeDimension contains minute, hour, day, month
+
+Column Name | Short Name | Data Type | Description
+----------- | ---------- | --------- | ---------
+_id | _id | Keyword | primary key, es speciality, `time_bucket`_`metric_id`
+metric_id | c1 | Keyword | `front_application_id`_`behind_application_id`_`source_value`
+front_application_id | c2 | Integer |
+behind_application_id | c3 | Integer |
+source_value | c4 | Integer | Caller(0), Callee(1)
+transaction_calls | a1 | Long | The total number of calls, summed and aggregated by `time_bucket`
+transaction_error_calls | a2 | Long | The total number of error calls, summed and aggregated by `time_bucket`
+transaction_duration_sum | a3 | Long | The duration sum of all calls, summed and aggregated by `time_bucket`
+transaction_error_duration_sum | a4 | Long | The duration sum of error calls, summed and aggregated by `time_bucket`
+transaction_average_duration | a5 | Long | The average duration of all calls, used to order by this column in the database.
+business_transaction_calls | b1 | Long |
+business_transaction_error_calls | b2 | Long |
+business_transaction_duration_sum | b3 | Long |
+business_transaction_error_duration_sum | b4 | Long |
+business_transaction_average_duration | b5 | Long |
+mq_transaction_calls | m1 | Long |
+mq_transaction_error_calls | m2 | Long |
+mq_transaction_duration_sum | m3 | Long |
+mq_transaction_error_duration_sum | m4 | Long |
+mq_transaction_average_duration | m5 | Long |

Review comment:
Same concern I mentioned.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services