kezhenxu94 commented on a change in pull request #8705:
URL: https://github.com/apache/skywalking/pull/8705#discussion_r830085837
##########
File path: CHANGES.md
##########
@@ -112,14 +113,28 @@ Release Notes.
, `SW_CORE_REST_JETTY_DELTA`).
* [Breaking Change] Remove configuration `graphql/path` (env var:
`SW_QUERY_GRAPHQL_PATH`).
* Add storage column attribute `indexOnly`, support ElasticSearch only index
and not store some fields.
-* Add `indexOnly=true` to `SegmentRecord.tags`, `AlarmRecord.tags`,
`AbstractLogRecord.tags`, to reduce unnecessary storage.
+* Add `indexOnly=true` to `SegmentRecord.tags`, `AlarmRecord.tags`,
`AbstractLogRecord.tags`, to reduce unnecessary
+ storage.
* [Breaking Change] Remove configuration `restMinThreads` (env var:
`SW_CORE_REST_JETTY_MIN_THREADS`
, `SW_RECEIVER_SHARING_JETTY_MIN_THREADS`).
* Refactor the core Builder mechanism, new storage plugin could implement
their own converter and get rid of hard
requirement of using HashMap to communicate between data object and database
native structure.
* [Breaking Change] Break all existing 3rd-party storage extensions.
* Remove hard requirement of BASE64 encoding for binary field.
* Add complexity limitation for GraphQL query to avoid malicious query.
+* Add `Column.shardingKeyIdx` for column definition for BanyanDB.
+
+```
+Sharding key is used to group time series data per metric of one entity in one
place (same sharding or same
+column for column-oriented database).
+For example,
+ServiceA's traffic gauge, service call per minute, includes following
timestamp values, then it should be sharded by service ID
+[ServiceA(encoded ID): 01-28 18:30 values-1, 01-28 18:31 values-2, 01-28 18:32
values-3, 01-28 18:32 values-4]
+
+BanyanDB is the 1st storage implementation supporting this. It would make
continuous time series metrics stored closely and compressed better.
+
+NOTICE, this sharding concept is NOT just for splitting data into different
database instances or physical files.
Review comment:
You keep using the term `shard` while explaining that it is actually for
grouping. What is the reason not to just use a name like `Column.groupKeyIdx`?
As written, this is really confusing.
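
To make the behaviour under discussion concrete, here is a minimal sketch of what the CHANGES entry describes: points are collected per entity so that each entity's time series sits together, regardless of which database instance or file it ends up in. All names here (`points`, `group_by_sharding_key`) and the sample values are hypothetical and only illustrate the concept; this is not SkyWalking's or BanyanDB's API.

```python
from collections import defaultdict

# Hypothetical data points: (entity_id, timestamp, value).
# The values are made up for illustration only.
points = [
    ("ServiceA", "01-28 18:30", 10),
    ("ServiceB", "01-28 18:30", 7),
    ("ServiceA", "01-28 18:31", 12),
    ("ServiceB", "01-28 18:31", 9),
    ("ServiceA", "01-28 18:32", 11),
]


def group_by_sharding_key(points):
    """Collect points per entity so each entity's series is contiguous and time-ordered."""
    series = defaultdict(list)
    for entity_id, timestamp, value in points:
        # The entity ID plays the role of the "sharding key" from the description.
        series[entity_id].append((timestamp, value))
    for entity_series in series.values():
        # Timestamps share one fixed-width format here, so string order is time order.
        entity_series.sort()
    return dict(series)


grouped = group_by_sharding_key(points)
for entity_id, entity_series in grouped.items():
    print(entity_id, entity_series)
```

Seen this way, the key decides which points belong to the same series rather than where a partition physically lives, which is the source of the naming question.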