luchunliang commented on a change in pull request #255:
URL:
https://github.com/apache/incubator-inlong-website/pull/255#discussion_r788332134
##########
File path: docs/modules/sort-standalone/quick_start.md
##########
@@ -0,0 +1,198 @@
+---
+title: Deployment
+sidebar_position: 2
+---
+
+## Preparing installation files
+The installation file is located in the `inlong-sort-standalone/sort-standalone-dist/target/` directory. The file name is `apache-inlong-sort-standalone-${project.version}-bin.tar.gz`.
+
+## Start the InLong Sort Standalone application
+After unpacking the `tar.gz` package produced in the compilation stage above, you can start the InLong Sort Standalone application.
+Example:
+
+```
+./bin/sort-start.sh
+```
+
Review comment:
fixed
##########
File path:
i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/sort-standalone/overview.md
##########
@@ -0,0 +1,44 @@
+---
+title: Overview
+sidebar_position: 1
+---
+
+## Overview
+InLong Sort Standalone is a module that consumes the data streams reported by users from the cache layer and distributes them to different data stores. It supports Hive, Elasticsearch, CLS, and other data stores.
+InLong Sort Standalone relies on InLong Manager for system metadata management. It is deployed as a cluster and aggregates distribution tasks by target storage.
+
+## Features
+### Multi-tenant system
+InLong Sort Standalone supports multi-tenancy: one InLong Sort Standalone cluster can host the distribution tasks of different tenants, and the distribution tasks are obtained from InLong Manager.
+Each distribution task distributes multiple data streams to one data store. Users only need to configure, on the InLong Manager front-end page, which data stores each data stream is distributed to.
+For example, InLong data streams d1 and d2 are both distributed to Hive cluster H1, d1 is also distributed to Elasticsearch cluster E1, and d2 is also distributed to CLS cluster C1. The InLong Sort Standalone cluster then receives three distribution tasks:
+- the H1 task consumes d1 and d2 and distributes them to Hive cluster H1;
+- the E1 task consumes d1 and distributes it to Elasticsearch cluster E1;
+- the C1 task consumes d2 and distributes it to CLS cluster C1.
+
+### Distribution tasks support dynamic updates
+InLong Sort Standalone supports dynamically updating distribution tasks, for example the information of the data source of an InLong data stream, the schema of the data stream, and the information of the target data store.
+Note that a newly added distribution of an InLong data stream starts consuming from the latest position of the cache layer;
+when a distribution of an InLong data stream is taken offline and then brought back online, it resumes from the consumption position at the time it went offline if that position is still within the life cycle of the cache layer;
+if that position is no longer within the life cycle of the cache layer, consumption starts from the latest position of the cache layer.
+
+### Message queues supported by the cache layer
+- InLong TubeMQ
+- Apache Pulsar
+
+### Supported data stores
+- Apache Hive (currently only the sequence file format is supported)
+- Apache Pulsar
+- Apache Kafka
+
+### Future plans
+#### Support more kinds of cache-layer message queues
+Apache Kafka, etc.
+
Review comment:
fixed
##########
File path:
i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/sort-standalone/overview.md
##########
@@ -0,0 +1,44 @@
+---
+title: Overview
+sidebar_position: 1
+---
+
+## Overview
+InLong Sort Standalone is a module that consumes the data streams reported by users from the cache layer and distributes them to different data stores. It supports Hive, Elasticsearch, CLS, and other data stores.
+InLong Sort Standalone relies on InLong Manager for system metadata management. It is deployed as a cluster and aggregates distribution tasks by target storage.
+
+## Features
+### Multi-tenant system
+InLong Sort Standalone supports multi-tenancy: one InLong Sort Standalone cluster can host the distribution tasks of different tenants, and the distribution tasks are obtained from InLong Manager.
+Each distribution task distributes multiple data streams to one data store. Users only need to configure, on the InLong Manager front-end page, which data stores each data stream is distributed to.
+For example, InLong data streams d1 and d2 are both distributed to Hive cluster H1, d1 is also distributed to Elasticsearch cluster E1, and d2 is also distributed to CLS cluster C1. The InLong Sort Standalone cluster then receives three distribution tasks:
+- the H1 task consumes d1 and d2 and distributes them to Hive cluster H1;
+- the E1 task consumes d1 and distributes it to Elasticsearch cluster E1;
+- the C1 task consumes d2 and distributes it to CLS cluster C1.
+
+### Distribution tasks support dynamic updates
+InLong Sort Standalone supports dynamically updating distribution tasks, for example the information of the data source of an InLong data stream, the schema of the data stream, and the information of the target data store.
+Note that a newly added distribution of an InLong data stream starts consuming from the latest position of the cache layer;
+when a distribution of an InLong data stream is taken offline and then brought back online, it resumes from the consumption position at the time it went offline if that position is still within the life cycle of the cache layer;
+if that position is no longer within the life cycle of the cache layer, consumption starts from the latest position of the cache layer.
+
+### Message queues supported by the cache layer
+- InLong TubeMQ
+- Apache Pulsar
+
+### Supported data stores
+- Apache Hive (currently only the sequence file format is supported)
+- Apache Pulsar
+- Apache Kafka
+
+### Future plans
+#### Support more kinds of cache-layer message queues
+Apache Kafka, etc.
+
+
+#### Support more kinds of data stores
+HBase, Elasticsearch, etc.
+
Review comment:
fixed
##########
File path: docs/modules/sort-standalone/overview.md
##########
@@ -0,0 +1,51 @@
+---
+title: Overview
+sidebar_position: 1
+---
+
+## Overview
+InLong Sort Standalone is a module that consumes the data streams reported by users from the cache layer and distributes them to different data stores. It supports Hive, Elasticsearch, CLS, and other data stores.
+
+InLong Sort Standalone relies on InLong Manager for system metadata management. It is deployed as a cluster and aggregates distribution tasks by target storage.
+
+## Features
+### Multi-tenant system
+InLong Sort Standalone supports multi-tenancy: one InLong Sort Standalone cluster can host the distribution tasks of different tenants, and the distribution tasks are obtained from InLong Manager.
+
+Each distribution task distributes multiple data streams to one data store. Users only need to configure, on the InLong Manager front-end page, which data stores each data stream is distributed to.
+
+For example, the InLong data streams D1 and D2 are both distributed to Hive cluster H1, D1 is also distributed to Elasticsearch cluster E1, and D2 is also distributed to CLS cluster C1. The InLong Sort Standalone cluster then receives three distribution tasks:
+- the H1 task consumes D1 and D2 and distributes them to Hive cluster H1;
+- the E1 task consumes D1 and distributes it to Elasticsearch cluster E1;
+- the C1 task consumes D2 and distributes it to CLS cluster C1.
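
The aggregation above can be sketched as grouping (stream, target storage) routes into one distribution task per target storage. This is a minimal illustration of the idea only; the function and route names are hypothetical and not part of the actual InLong Manager API:

```python
from collections import defaultdict

# Hypothetical (stream, target storage) routes, matching the example above
routes = [
    ("D1", "H1"),  # D1 -> Hive cluster H1
    ("D2", "H1"),  # D2 -> Hive cluster H1
    ("D1", "E1"),  # D1 -> Elasticsearch cluster E1
    ("D2", "C1"),  # D2 -> CLS cluster C1
]

def build_distribution_tasks(routes):
    """Aggregate routes by target storage: one distribution task per storage."""
    tasks = defaultdict(list)
    for stream, storage in routes:
        tasks[storage].append(stream)
    return dict(tasks)

tasks = build_distribution_tasks(routes)
# Three tasks, one per target storage:
# {'H1': ['D1', 'D2'], 'E1': ['D1'], 'C1': ['D2']}
```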
+
+### Distribution tasks support dynamic updates
+InLong Sort Standalone supports dynamically updating distribution tasks, for example the information of the data source of an InLong data stream, the schema of the data stream, and the information of the target data store.
+
+Note that a newly added distribution of an InLong data stream starts consuming from the latest position of the cache layer.
+
+When a distribution of an InLong data stream is taken offline and then brought back online, it resumes from the consumption position at the time it went offline, provided that position is still within the life cycle of the cache layer.
+
+If that position is no longer within the life cycle of the cache layer, consumption starts from the latest position of the cache layer.
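
The resume rules above can be summarized in a small decision function. This is a sketch under simplifying assumptions: positions are modeled as plain integer offsets, whereas the real cache layers (TubeMQ, Pulsar) track consumption positions with their own cursor mechanisms:

```python
def resume_position(offline_position, cache_earliest, cache_latest):
    """Decide where a distribution task resumes consumption (hypothetical offsets)."""
    if offline_position is None:
        # Newly added distribution: start from the latest position.
        return cache_latest
    if cache_earliest <= offline_position <= cache_latest:
        # Offline position still within the cache layer's life cycle: resume there.
        return offline_position
    # Position has expired out of the cache layer: fall back to the latest position.
    return cache_latest

# New distribution starts at the latest position:
print(resume_position(None, 100, 500))  # 500
# Position 250 is still retained, so consumption resumes there:
print(resume_position(250, 100, 500))   # 250
# Position 50 has aged out, so consumption restarts at the latest position:
print(resume_position(50, 100, 500))    # 500
```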
+
+### Message queues supported by the cache layer
+- InLong TubeMQ
+- Apache Pulsar
+
+### Supported data stores
+- Apache Hive (currently only the sequence file format is supported)
+- Apache Pulsar
+- Apache Kafka
+
+### Future plans
+#### Support more kinds of cache-layer message queues
+Apache Kafka, etc.
+
Review comment:
fixed
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]