[GitHub] [incubator-hudi] yihua commented on a change in pull request #977: [HUDI-221] Translate concept page

GitBox Tue, 29 Oct 2019 11:54:04 -0700

yihua commented on a change in pull request #977: [HUDI-221] Translate concept 
page
URL: https://github.com/apache/incubator-hudi/pull/977#discussion_r340253838


 ##########
 File path: docs/concepts.cn.md
 ##########
 @@ -4,168 +4,151 @@ keywords: hudi, design, storage, views, timeline
 sidebar: mydoc_sidebar
 permalink: concepts.html
 toc: false
-summary: "Here we introduce some basic concepts & give a broad technical 
overview of Hudi"
+summary: "这里我们将介绍Hudi的一些基本概念和宽泛的技术概述"
 ---
 
-Apache Hudi (pronounced “Hudi”) provides the following streaming primitives 
over datasets on DFS
+Apache Hudi(发音为“Hudi”)在DFS的数据集上提供以下流原语
 
- * Upsert                     (how do I change the dataset?)
- * Incremental pull           (how do I fetch data that changed?)
+ * 插入更新                     (如何改变数据集?)
+ * 增量拉取           (如何获取变更的数据?)
 
-In this section, we will discuss key concepts & terminologies that are 
important to understand, to be able to effectively use these primitives.
+在本节中，我们将讨论重要的概念和术语，这些概念和术语有助于理解并有效使用这些原语。
 
-## Timeline
-At its core, Hudi maintains a `timeline` of all actions performed on the 
dataset at different `instants` of time that helps provide instantaneous views 
of the dataset,
-while also efficiently supporting retrieval of data in the order of arrival. A 
Hudi instant consists of the following components 
+## 时间轴
+在它的核心，Hudi维护在不同的`即时`时间所有对数据集操作的`时间轴`，从而提供，从不同时间点出发得到不同的视图下的数据集。Hudi瞬间包含以下组件
 
- * `Action type` : Type of action performed on the dataset
- * `Instant time` : Instant time is typically a timestamp (e.g: 
20190117010349), which monotonically increases in the order of action's begin 
time.
- * `state` : current state of the instant
- 
-Hudi guarantees that the actions performed on the timeline are atomic & 
timeline consistent based on the instant time.
+ * `操作类型` : 对数据集执行的操作类型
+ * `即时时间` : 即时时间通常是一个时间戳(例如：20190117010349)，该时间戳按操作开始时间的顺序单调增加。
+ * `状态` : 即时的状态
 
-Key actions performed include
+Hudi保证在时间轴上执行的操作的原子性和时间轴上即时时间一致性。
 
- * `COMMITS` - A commit denotes an **atomic write** of a batch of records into 
a dataset.
- * `CLEANS` - Background activity that gets rid of older versions of files in 
the dataset, that are no longer needed.
- * `DELTA_COMMIT` - A delta commit refers to an **atomic write** of a batch of 
records into a  MergeOnRead storage type of dataset, where some/all of the data 
could be just written to delta logs.
- * `COMPACTION` - Background activity to reconcile differential data 
structures within Hudi e.g: moving updates from row based log files to columnar 
formats. Internally, compaction manifests as a special commit on the timeline
- * `ROLLBACK` - Indicates that a commit/delta commit was unsuccessful & rolled 
back, removing any partial files produced during such a write
- * `SAVEPOINT` - Marks certain file groups as "saved", such that cleaner will 
not delete them. It helps restore the dataset to a point on the timeline, in 
case of disaster/data recovery scenarios.
+执行的关键操作包括
 
-Any given instant can be 
-in one of the following states
+ * `COMMITS` - 一次提交表示将一组记录**原子写入**到数据集中。
+ * `CLEANS` - 删除数据集中不再需要的旧文件版本的后台活动。
+ * `DELTA_COMMIT` - 
增量提交是指将一批记录**原子写入**到MergeOnRead存储类型的数据集中，其中一些/所有数据都可以只写到增量日志中。
+ * `COMPACTION` - 协调Hudi中的差异数据结构，例如：将更新从基于行的日志文件移动到列格式的后台活动。在内部，压缩表现为时间轴上的特殊提交。
 
 Review comment:
   "协调Hudi中的差异数据结构，例如：将更新从基于行的日志文件移动到列格式的后台活动"
   => "协调Hudi中差异数据结构的后台活动，例如：将更新从基于行的日志文件变成列格式"

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] yihua commented on a change in pull request #977: [HUDI-221] Translate concept page

Reply via email to