[GitHub] [flink] TisonKun commented on a change in pull request #9299: [FLINK-13405][docs-zh] Translate "Basic API Concepts" page into Chinese

2019-08-21 Thread GitBox
TisonKun commented on a change in pull request #9299: [FLINK-13405][docs-zh] Translate "Basic API Concepts" page into Chinese
URL: https://github.com/apache/flink/pull/9299#discussion_r316478746
 
 

 ##
 File path: docs/dev/api_concepts.zh.md
 ##
 @@ -24,68 +24,46 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Flink programs are regular programs that implement transformations on distributed collections
-(e.g., filtering, mapping, updating state, joining, grouping, defining windows, aggregating).
-Collections are initially created from sources (e.g., by reading from files, kafka topics, or from local, in-memory
-collections). Results are returned via sinks, which may for example write the data to
-(distributed) files, or to standard output (for example, the command line terminal).
-Flink programs run in a variety of contexts, standalone, or embedded in other programs.
-The execution can happen in a local JVM, or on clusters of many machines.
-
-Depending on the type of data sources, i.e. bounded or unbounded sources, you would either
-write a batch program or a streaming program where the DataSet API is used for batch
-and the DataStream API is used for streaming. This guide will introduce the basic concepts
-that are common to both APIs but please see our
-[Streaming Guide]({{ site.baseurl }}/dev/datastream_api.html) and
-[Batch Guide]({{ site.baseurl }}/dev/batch/index.html) for concrete information about
-writing programs with each API.
-
-**NOTE:** When showing actual examples of how the APIs can be used  we will use
-`StreamingExecutionEnvironment` and the `DataStream` API. The concepts are exactly the same
-in the `DataSet` API, just replace by `ExecutionEnvironment` and `DataSet`.
+Flink 程序是实现了分布式集合转换(例如过滤、映射、更新状态、join、分组、定义窗口、聚合)的规范化程序。集合初始创建自数据源(例如读取文件、kafka 主题,或本地内存中的集合)。结果通过 sink 返回,例如,它可以将数据写入(分布式)文件,或标准输出(例如命令行终端)。Flink 程序可以在多种环境中运行,独立运行或嵌入到其他程序中。可以在本地 JVM 中执行,也可以在多台机器的集群上执行。
+
+针对有界和无界两种数据源类型,你可以使用 DataSet API 来编写批处理程序或使用 DataStream API 来编写流处理程序。本篇指南将介绍这两种 API 通用的基本概念,使用每种 API 编写程序的具体信息请查阅
+[流处理指南]({{ site.baseurl }}/zh/dev/datastream_api.html) 和
+[批处理指南]({{ site.baseurl }}/zh/dev/batch/index.html)。
+
+**请注意:** 当展示如何使用 API 的实际示例时我们使用 `StreamingExecutionEnvironment` 和 `DataStream API`。对于批处理,将他们替换为 `ExecutionEnvironment` 和 `DataSet API` 即可,概念是完全相同的。
 
 * This will be replaced by the TOC
 {:toc}
 
-DataSet and DataStream
+DataSet 和 DataStream
 --
 
-Flink has the special classes `DataSet` and `DataStream` to represent data in a program. You
-can think of them as immutable collections of data that can contain duplicates. In the case
-of `DataSet` the data is finite while for a `DataStream` the number of elements can be unbounded.
+Flink 用特有的 `DataSet` 和 `DataStream` 类来表示程序中的数据。你可以将他们视为包含重复项的不可变数据集合。对于 `DataSet`,数据是有限的,而对于 `DataStream`,元素的数量可以是无限的。
 
-These collections differ from regular Java collections in some key ways. First, they
-are immutable, meaning that once they are created you cannot add or remove elements. You can also
-not simply inspect the elements inside.
+这些集合与标准的 Java 集合有一些关键的区别。首先它们是不可变的,也就是说它们一旦被创建你就不能添加或删除元素了。你也不能简单地检查它们内部的元素。
 
-A collection is initially created by adding a source in a Flink program and new collections are
-derived from these by transforming them using API methods such as `map`, `filter` and so on.
+在 Flink 程序中,集合最初通过添加数据源来创建,通过使用 API 的诸如 `map`、`filter` 等方法对数据源进行转换从而派生新的集合。
 
-Anatomy of a Flink Program
+剖析一个 Flink 程序
 --
 
-Flink programs look like regular programs that transform collections of data.
-Each program consists of the same basic parts:
+Flink 程序看起来像是转换数据集合的规范化程序。每个程序由一些基本的部分组成:
 
-1. Obtain an `execution environment`,
-2. Load/create the initial data,
-3. Specify transformations on this data,
-4. Specify where to put the results of your computations,
-5. Trigger the program execution
+1. 获取执行环境 `execution environment`,
 
 Review comment:
   You could put \`execution environment\` in parentheses, or simply drop it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] TisonKun commented on a change in pull request #9299: [FLINK-13405][docs-zh] Translate "Basic API Concepts" page into Chinese

2019-08-21 Thread GitBox
TisonKun commented on a change in pull request #9299: [FLINK-13405][docs-zh] Translate "Basic API Concepts" page into Chinese
URL: https://github.com/apache/flink/pull/9299#discussion_r316478163
 
 

 ##
 File path: docs/dev/api_concepts.zh.md
 ##
 @@ -24,68 +24,46 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Flink programs are regular programs that implement transformations on distributed collections
-(e.g., filtering, mapping, updating state, joining, grouping, defining windows, aggregating).
-Collections are initially created from sources (e.g., by reading from files, kafka topics, or from local, in-memory
-collections). Results are returned via sinks, which may for example write the data to
-(distributed) files, or to standard output (for example, the command line terminal).
-Flink programs run in a variety of contexts, standalone, or embedded in other programs.
-The execution can happen in a local JVM, or on clusters of many machines.
-
-Depending on the type of data sources, i.e. bounded or unbounded sources, you would either
-write a batch program or a streaming program where the DataSet API is used for batch
-and the DataStream API is used for streaming. This guide will introduce the basic concepts
-that are common to both APIs but please see our
-[Streaming Guide]({{ site.baseurl }}/dev/datastream_api.html) and
-[Batch Guide]({{ site.baseurl }}/dev/batch/index.html) for concrete information about
-writing programs with each API.
-
-**NOTE:** When showing actual examples of how the APIs can be used  we will use
-`StreamingExecutionEnvironment` and the `DataStream` API. The concepts are exactly the same
-in the `DataSet` API, just replace by `ExecutionEnvironment` and `DataSet`.
+Flink 程序是实现了分布式集合转换(例如过滤、映射、更新状态、join、分组、定义窗口、聚合)的规范化程序。集合初始创建自数据源(例如读取文件、kafka 主题,或本地内存中的集合)。结果通过 sink 返回,例如,它可以将数据写入(分布式)文件,或标准输出(例如命令行终端)。Flink 程序可以在多种环境中运行,独立运行或嵌入到其他程序中。可以在本地 JVM 中执行,也可以在多台机器的集群上执行。
+
+针对有界和无界两种数据源类型,你可以使用 DataSet API 来编写批处理程序或使用 DataStream API 来编写流处理程序。本篇指南将介绍这两种 API 通用的基本概念,使用每种 API 编写程序的具体信息请查阅
 
 Review comment:
   In `使用每种 API 编写程序的具体信息请查阅`, drop `每种` ("each"), or rewrite the phrase as `具体使用 API 编写程序的方法请查阅`, or otherwise make it read more smoothly.




[GitHub] [flink] TisonKun commented on a change in pull request #9299: [FLINK-13405][docs-zh] Translate "Basic API Concepts" page into Chinese

2019-08-21 Thread GitBox
TisonKun commented on a change in pull request #9299: [FLINK-13405][docs-zh] Translate "Basic API Concepts" page into Chinese
URL: https://github.com/apache/flink/pull/9299#discussion_r316477628
 
 

 ##
 File path: docs/dev/api_concepts.zh.md
 ##
 @@ -24,68 +24,46 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Flink programs are regular programs that implement transformations on distributed collections
-(e.g., filtering, mapping, updating state, joining, grouping, defining windows, aggregating).
-Collections are initially created from sources (e.g., by reading from files, kafka topics, or from local, in-memory
-collections). Results are returned via sinks, which may for example write the data to
-(distributed) files, or to standard output (for example, the command line terminal).
-Flink programs run in a variety of contexts, standalone, or embedded in other programs.
-The execution can happen in a local JVM, or on clusters of many machines.
-
-Depending on the type of data sources, i.e. bounded or unbounded sources, you would either
-write a batch program or a streaming program where the DataSet API is used for batch
-and the DataStream API is used for streaming. This guide will introduce the basic concepts
-that are common to both APIs but please see our
-[Streaming Guide]({{ site.baseurl }}/dev/datastream_api.html) and
-[Batch Guide]({{ site.baseurl }}/dev/batch/index.html) for concrete information about
-writing programs with each API.
-
-**NOTE:** When showing actual examples of how the APIs can be used  we will use
-`StreamingExecutionEnvironment` and the `DataStream` API. The concepts are exactly the same
-in the `DataSet` API, just replace by `ExecutionEnvironment` and `DataSet`.
+Flink 程序是实现了分布式集合转换(例如过滤、映射、更新状态、join、分组、定义窗口、聚合)的规范化程序。集合初始创建自数据源(例如读取文件、kafka 主题,或本地内存中的集合)。结果通过 sink 返回,例如,它可以将数据写入(分布式)文件,或标准输出(例如命令行终端)。Flink 程序可以在多种环境中运行,独立运行或嵌入到其他程序中。可以在本地 JVM 中执行,也可以在多台机器的集群上执行。
 
 Review comment:
   The community has discussed this before [1]; the suggestion was to leave all of these untranslated.
   
   [1] https://lists.apache.org/x/thread.html/8a041adc57c36b2228cdc7394a0442db61a39e82c382e598f8842805@%3Cuser-zh.flink.apache.org%3E

