LakeShen commented on issue #9097: [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese
URL: https://github.com/apache/flink/pull/9097#issuecomment-511271265

Hi, thanks for your review, I will do that now.

------------------ Original message ------------------
From: "Congxian Qiu" <[email protected]>
Date: Sunday, July 14, 2019, 5:32 PM
To: "apache/flink" <[email protected]>
Cc: "lakeshen" <[email protected]>; "Mention" <[email protected]>
Subject: Re: [apache/flink] [FLINK-11529][docs-zh] Translate the "DataStream API Tutorial" page into Chinese (#9097)

@klion26 commented on this pull request.

@LakeShen thanks for your contribution. I have done a first-pass review and left some comments. You can preview the translation locally by executing `sh docs/build_docs.sh -p` in the Flink project and opening http://localhost:4000 in your browser.

In docs/getting-started/tutorials/datastream_api.zh.md:

> @@ -26,19 +26,14 @@ under the License.
> * This will be replaced by the TOC
> {:toc}
> -In this guide we will start from scratch and go from setting up a Flink project to running
> -a streaming analysis program on a Flink cluster.
> +在本节指南中,我们将从零开始创建一个在 flink 集群上面进行流分析的 Flink 项目。

Suggested change:

> -在本节指南中,我们将从零开始创建一个在 flink 集群上面进行流分析的 Flink 项目。
> +在本节指南中,我们将在 Flink 集群上从零开始创建一个流分析项目。

In docs/getting-started/tutorials/datastream_api.zh.md:

> -Wikipedia provides an IRC channel where all edits to the wiki are logged. We are going to
> -read this channel in Flink and count the number of bytes that each user edits within
> -a given window of time. This is easy enough to implement in a few minutes using Flink, but it will
> -give you a good foundation from which to start building more complex analysis programs on your own.
> +维基百科提供了一个能够记录所有对 wiki 编辑的 IRC 通道。我们将使用 Flink 读取该通道的数据,同时

Suggested change:

> -维基百科提供了一个能够记录所有对 wiki 编辑的 IRC 通道。我们将使用 Flink 读取该通道的数据,同时
> +维基百科提供了一个记录所有 wiki 编辑历史的 IRC 通道。我们将使用 Flink 读取该通道的数据,同时

In docs/getting-started/tutorials/datastream_api.zh.md:

> +维基百科提供了一个能够记录所有对 wiki 编辑的 IRC 通道。我们将使用 Flink 读取该通道的数据,同时
> +在给定的时间窗口,计算出每个用户在其中编辑的字节数。这使用 Flink 很容易就能实现,但它会为你提供一个良好的基础去开始构建你自己更为复杂的分析程序。

"计算出每个用户在给定时间窗口内的编辑字节数"?

In docs/getting-started/tutorials/datastream_api.zh.md:

> -We are going to use a Flink Maven Archetype for creating our project structure. Please
> -see [Java API Quickstart]({{ site.baseurl }}/dev/projectsetup/java_api_quickstart.html) for more details
> -about this. For our purposes, the command to run is this:
> +我们准备使用 Flink Maven Archetype 创建项目结构。更多细节请查看[Java API 快速指南]({{ site.baseurl }}/zh/dev/projectsetup/java_api_quickstart.html)。项目运行命令如下:

Suggested change:

> -我们准备使用 Flink Maven Archetype 创建项目结构。更多细节请查看[Java API 快速指南]({{ site.baseurl }}/zh/dev/projectsetup/java_api_quickstart.html)。项目运行命令如下:
> +我们准备使用 Flink Maven Archetype 创建项目结构。更多细节请查看 [Java API 快速指南]({{ site.baseurl }}/zh/dev/projectsetup/java_api_quickstart.html)。项目运行命令如下:

Do we need to translate "Maven Archetype" here?
In docs/getting-started/tutorials/datastream_api.zh.md:

> @@ -59,8 +54,7 @@ $ mvn archetype:generate \
> </p>

I think we need to translate the Note as well.

In docs/getting-started/tutorials/datastream_api.zh.md:

> -<a href="{{ site.baseurl }}/page/img/quickstart-example/jobmanager-job.png" ><img class="img-responsive" src="{{ site.baseurl }}/page/img/quickstart-example/jobmanager-job.png" alt="Example Job View"/></a>
> +<a href="{{ site.baseurl }}/zh/page/img/quickstart-example/jobmanager-job.png" ><img class="img-responsive" src="{{ site.baseurl }}/zh/page/img/quickstart-example/jobmanager-job.png" alt="样例作业视图"/></a>

Maybe we should not change the URL of the image?

In docs/getting-started/tutorials/datastream_api.zh.md:

> @@ -59,8 +54,7 @@ $ mvn archetype:generate \
> </p>
> {% endunless %}
> -You can edit the `groupId`, `artifactId` and `package` if you like. With the above parameters,
> -Maven will create a project structure that looks like this:
> +你可以根据自己需求编辑 `groupId`、`artifactId` 以及 `package`。对于上面的参数,Maven 将会创建一个这样的项目结构:

"你可以按需修改 `groupId`、`artifactId` 以及 `package`"? "对于上面的参数,Maven 将会创建一个这样的项目结构" seems a little odd to me; do you think we can make it better?

In docs/getting-started/tutorials/datastream_api.zh.md:

> @@ -76,16 +70,13 @@ wiki-edits/
> └── log4j.properties
> {% endhighlight %}
> -There is our `pom.xml` file that already has the Flink dependencies added in the root directory and
> -several example Flink programs in `src/main/java`. We can delete the example programs, since
> -we are going to start from scratch:
> +项目根目录下的 `pom.xml` 文件已经将 Flink 依赖添加进来,同时在 `src/main/java` 目录下也有几个 Flink 程序实例。由于我们从头开始创建,我们可以删除程序实例:

"Flink 依赖已经添加到根目录下的 `pom.xml` 文件中"? "Flink 程序实例" -> "Flink 实例程序"? "由于我们将从头开始创建,因此可以删除这些实例程序"?

In docs/getting-started/tutorials/datastream_api.zh.md:

> {% highlight bash %}
> $ rm wiki-edits/src/main/java/wikiedits/*.java
> {% endhighlight %}
> -As a last step we need to add the Flink Wikipedia connector as a dependency so that we can
> -use it in our program. Edit the `dependencies` section of the `pom.xml` so that it looks like this:
> +作为最后一步,我们需要添加 Flink 维基百科连接器作为依赖项,这样就可以在我们的项目中进行使用。编辑 `pom.xml` 的 `dependencies` 部分,使它看起来像这样:

Suggested change:

> -作为最后一步,我们需要添加 Flink 维基百科连接器作为依赖项,这样就可以在我们的项目中进行使用。编辑 `pom.xml` 的 `dependencies` 部分,使它看起来像这样:
> +作为最后一步,我们需要添加 Flink 维基百科连接器的依赖,从而可以在项目中进行使用。修改 `pom.xml` 的 `dependencies` 部分,使它看起来像这样:

In docs/getting-started/tutorials/datastream_api.zh.md:

> -It's coding time. Fire up your favorite IDE and import the Maven project or open a text editor and
> -create the file `src/main/java/wikiedits/WikipediaAnalysis.java`:
> +现在是编程时间。启动你最喜欢的 IDE 并导入 Maven 项目或打开文本编辑器创建文件 `src/main/java/wikiedits/WikipediaAnalysis.java`:

Suggested change:

> -现在是编程时间。启动你最喜欢的 IDE 并导入 Maven 项目或打开文本编辑器创建文件 `src/main/java/wikiedits/WikipediaAnalysis.java`:
> +现在是编程时间。启动你最喜欢的 IDE 并导入 Maven 项目或打开文本编辑器,然后创建文件 `src/main/java/wikiedits/WikipediaAnalysis.java`:

In docs/getting-started/tutorials/datastream_api.zh.md:

> -This concludes our little tour of Flink. If you have any questions, please don't hesitate to ask on our [Mailing Lists](http://flink.apache.org/community.html#mailing-lists).
> +这就结束了 Flink 项目构建之旅. 如果你有任何问题, 你可以在我们的 [邮件组](http://flink.apache.org/community.html#mailing-lists)提出.

Suggested change:

> -这就结束了 Flink 项目构建之旅. 如果你有任何问题, 你可以在我们的 [邮件组](http://flink.apache.org/community.html#mailing-lists)提出.
> +这就结束了 Flink 项目构建之旅. 如果你有任何问题, 可以在我们的[邮件组](http://flink.apache.org/community.html#mailing-lists)提出.
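For context on the `src/main/java/wikiedits/WikipediaAnalysis.java` step reviewed above: the file starts out as a bare skeleton, along the lines of the following minimal sketch (class and method names match the hunk headers quoted below; the tutorial fills it in step by step):

```java
package wikiedits;

// Starting skeleton of the tutorial program; the IDE adds imports
// automatically as the class grows, and the tutorial shows the full
// import list only in its final listing.
public class WikipediaAnalysis {

  public static void main(String[] args) throws Exception {
    // The streaming job is assembled here in the following steps.
  }
}
```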
In docs/getting-started/tutorials/datastream_api.zh.md:

> @@ -131,32 +120,24 @@ public class WikipediaAnalysis {
> }
> {% endhighlight %}
> -The program is very basic now, but we will fill it in as we go. Note that I'll not give
> -import statements here since IDEs can add them automatically. At the end of this section I'll show
> -the complete code with import statements if you simply want to skip ahead and enter that in your
> -editor.
> +这个程序现在很基础,但我们会边做边进行补充。注意我不会给出导入语句,因为 IDE 会自动添加它们。在本节的最后,我将展示带有导入语句的完整代码

"边做边完善"?

In docs/getting-started/tutorials/datastream_api.zh.md:

> @@ -131,32 +120,24 @@ public class WikipediaAnalysis {
> }
> {% endhighlight %}
> -The program is very basic now, but we will fill it in as we go. Note that I'll not give
> -import statements here since IDEs can add them automatically. At the end of this section I'll show
> -the complete code with import statements if you simply want to skip ahead and enter that in your
> -editor.
> +这个程序现在很基础,但我们会边做边进行补充。注意我不会给出导入语句,因为 IDE 会自动添加它们。在本节的最后,我将展示带有导入语句的完整代码
> +如果您只是想跳过并在您的编辑器中编辑他们。

",如果需要你可以将他们复制到你的编辑器中"?

In docs/getting-started/tutorials/datastream_api.zh.md:

> -The first step in a Flink program is to create a `StreamExecutionEnvironment`
> -(or `ExecutionEnvironment` if you are writing a batch job). This can be used to set execution
> -parameters and create sources for reading from external systems. So let's go ahead and add
> -this to the main method:
> +在一个 Flink 程序中,首先你需要创建一个 `StreamExecutionEnvironment` (或者处理批作业环境的 `ExecutionEnvironment`)。这可以用来设置程序运行参数,同时也能够创建从外部系统读取的源。我们把这个添加到 main 方法中:

"这可以用来设置程序运行参数、创建从外部系统读取的源"?

In docs/getting-started/tutorials/datastream_api.zh.md:

> {% highlight java %}
> StreamExecutionEnvironment see = StreamExecutionEnvironment.getExecutionEnvironment();
> {% endhighlight %}
> -Next we will create a source that reads from the Wikipedia IRC log:
> +接下来我们将创建一个读取维基百科 IRC 数据源:

Suggested change:

> -接下来我们将创建一个读取维基百科 IRC 数据源:
> +接下来我们将创建一个读取维基百科 IRC 数据的源:

In docs/getting-started/tutorials/datastream_api.zh.md:

> {% highlight java %}
> DataStream<WikipediaEditEvent> edits = see.addSource(new WikipediaEditsSource());
> {% endhighlight %}
> -This creates a `DataStream` of `WikipediaEditEvent` elements that we can further process. For
> -the purposes of this example we are interested in determining the number of added or removed
> -bytes that each user causes in a certain time window, let's say five seconds. For this we first
> -have to specify that we want to key the stream on the user name, that is to say that operations
> -on this stream should take the user name into account. In our case the summation of edited bytes in the windows
> -should be per unique user. For keying a Stream we have to provide a `KeySelector`, like this:
> +上面代码创建了一个 `WikipediaEditEvent` 事件的`DataStream`,我们可以进一步处理它。这个代码实例的目的是为了确定每个用户在特定时间窗口中添加或删除的字节数,比如5秒一个时间窗口。首先

Suggested change:

> -上面代码创建了一个 `WikipediaEditEvent` 事件的`DataStream`,我们可以进一步处理它。这个代码实例的目的是为了确定每个用户在特定时间窗口中添加或删除的字节数,比如5秒一个时间窗口。首先
> +上面代码创建了一个 `WikipediaEditEvent` 事件的 `DataStream`,我们可以进一步处理它。这个代码实例的目的是为了确定每个用户在特定时间窗口中添加或删除的字节数,比如 5 秒一个时间窗口。首先
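Putting the two reviewed snippets together, the state of `main` after this step is roughly the following (a sketch; the connector classes come from the flink-connector-wikiedits dependency added earlier, and the package names shown are the ones that dependency provides):

```java
package wikiedits;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.wikiedits.WikipediaEditEvent;
import org.apache.flink.streaming.connectors.wikiedits.WikipediaEditsSource;

public class WikipediaAnalysis {

  public static void main(String[] args) throws Exception {
    // The execution environment: used to set execution parameters
    // and to create sources that read from external systems.
    StreamExecutionEnvironment see = StreamExecutionEnvironment.getExecutionEnvironment();

    // A source that reads the Wikipedia IRC log of wiki edits.
    DataStream<WikipediaEditEvent> edits = see.addSource(new WikipediaEditsSource());
  }
}
```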
In docs/getting-started/tutorials/datastream_api.zh.md:

> {% highlight java %}
> DataStream<WikipediaEditEvent> edits = see.addSource(new WikipediaEditsSource());
> {% endhighlight %}
> -This creates a `DataStream` of `WikipediaEditEvent` elements that we can further process. For
> -the purposes of this example we are interested in determining the number of added or removed
> -bytes that each user causes in a certain time window, let's say five seconds. For this we first
> -have to specify that we want to key the stream on the user name, that is to say that operations
> -on this stream should take the user name into account. In our case the summation of edited bytes in the windows
> -should be per unique user. For keying a Stream we have to provide a `KeySelector`, like this:
> +上面代码创建了一个 `WikipediaEditEvent` 事件的`DataStream`,我们可以进一步处理它。这个代码实例的目的是为了确定每个用户在特定时间窗口中添加或删除的字节数,比如5秒一个时间窗口。首先
> +我们必须指定用户名来划分我们的数据流,也就是说这个流上的操作应该考虑用户名。

"根据用户名来划分"?

In docs/getting-started/tutorials/datastream_api.zh.md:

> {% highlight java %}
> DataStream<WikipediaEditEvent> edits = see.addSource(new WikipediaEditsSource());
> {% endhighlight %}
> -This creates a `DataStream` of `WikipediaEditEvent` elements that we can further process. For
> -the purposes of this example we are interested in determining the number of added or removed
> -bytes that each user causes in a certain time window, let's say five seconds. For this we first
> -have to specify that we want to key the stream on the user name, that is to say that operations
> -on this stream should take the user name into account. In our case the summation of edited bytes in the windows
> -should be per unique user. For keying a Stream we have to provide a `KeySelector`, like this:
> +上面代码创建了一个 `WikipediaEditEvent` 事件的`DataStream`,我们可以进一步处理它。这个代码实例的目的是为了确定每个用户在特定时间窗口中添加或删除的字节数,比如5秒一个时间窗口。首先
> +我们必须指定用户名来划分我们的数据流,也就是说这个流上的操作应该考虑用户名。
> +在我们这个统计窗口编辑的字节数的例子中,每个用户应该唯一的。对于划分一个数据流,我们必须提供一个 `KeySelector`,像这样:

I don't think this means "每个用户应该是唯一的"; it means "每个不同的用户每个窗口都应该计算一个结果".

In docs/getting-started/tutorials/datastream_api.zh.md:

> @@ -203,26 +180,20 @@ DataStream<Tuple2<String, Long>> result = keyedEdits
> });
> {% endhighlight %}
> -The first call, `.timeWindow()`, specifies that we want to have tumbling (non-overlapping) windows
> -of five seconds. The second call specifies a *Aggregate transformation* on each window slice for
> -each unique key. In our case we start from an initial value of `("", 0L)` and add to it the byte
> -difference of every edit in that time window for a user. The resulting Stream now contains
> -a `Tuple2<String, Long>` for every user which gets emitted every five seconds.
> +首先调用 `.timeWindow()` 方法指定五秒翻滚(非重叠)窗口。第二个调用方法对于每一个唯一关键字指定每个窗口片`聚合转换`。
> +在本例中,我们从`("",0L)`初始值开始,并将每个用户编辑的字节添加到该时间窗口中。对于每个用户来说,结果流现在包含的元素为 `Tuple2<String, Long>`,它每5秒发出一次。

Suggested change:

> -在本例中,我们从`("",0L)`初始值开始,并将每个用户编辑的字节添加到该时间窗口中。对于每个用户来说,结果流现在包含的元素为 `Tuple2<String, Long>`,它每5秒发出一次。
> +在本例中,我们从 `("",0L)` 初始值开始,并将每个用户编辑的字节添加到该时间窗口中。对于每个用户来说,结果流现在包含的元素为 `Tuple2<String, Long>`,它每5秒发出一次。

In docs/getting-started/tutorials/datastream_api.zh.md:

> -The only thing left to do is print the stream to the console and start execution:
> +唯一剩下要做的就是将打印流输出到控制台并开始执行:

Suggested change:

> -唯一剩下要做的就是将打印流输出到控制台并开始执行:
> +唯一剩下的就是将结果输出到控制台并开始执行:

In docs/getting-started/tutorials/datastream_api.zh.md:

> -This should get you started with writing your own Flink programs. To learn more
> -you can check out our guides
> -about [basic concepts]({{ site.baseurl }}/dev/api_concepts.html) and the
> -[DataStream API]({{ site.baseurl }}/dev/datastream_api.html). Stick
> -around for the bonus exercise if you want to learn about setting up a Flink cluster on
> -your own machine and writing results to [Kafka](http://kafka.apache.org).
> +这可以让你开始创建你自己的 Flink 项目。你可以查看[基本概念]({{ site.baseurl }}/zh/dev/api_concepts.html)和[DataStream API]
> +({{ site.baseurl }}/zh/dev/datastream_api.html)指南。如果你想学习了解更多关于 Flink 集群安装以及写入数据到 [Kafka](http://kafka.apache.org),

`[DataStream API]` and `({{ site.baseurl }}.....` have to be on the same line.

In docs/getting-started/tutorials/datastream_api.zh.md:

> -Please follow our [local setup tutorial](local_setup.html) for setting up a Flink distribution
> -on your machine and refer to the [Kafka quickstart](https://kafka.apache.org/0110/documentation.html#quickstart)
> -for setting up a Kafka installation before we proceed.
> +请按照我们的[本地安装教程](local_setup.html)在你的机器上构建一个Flink分布式环境,同时参考[Kafka快速指南](https://kafka.apache.org/0110/documentation.html#quickstart)安装一个我们需要使用的Kafka环境。

Suggested change:

> -请按照我们的[本地安装教程](local_setup.html)在你的机器上构建一个Flink分布式环境,同时参考[Kafka快速指南](https://kafka.apache.org/0110/documentation.html#quickstart)安装一个我们需要使用的Kafka环境。
> +请按照我们的[本地安装教程](local_setup.html)在你的机器上构建一个Flink分布式环境,同时参考 [Kafka快速指南](https://kafka.apache.org/0110/documentation.html#quickstart)安装一个我们需要使用的Kafka环境。

In docs/getting-started/tutorials/datastream_api.zh.md:

> {% highlight bash %}
> bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic wiki-result
> {% endhighlight %}
> -You can also check out the Flink dashboard which should be running at [http://localhost:8081](http://localhost:8081).
> -You get an overview of your cluster resources and running jobs:
> +你还可以查看运行在[http://localhost:8081](http://localhost:8081)上的 Flink 作业仪表盘。你可以概览集群资源以及正在运行的作业:

Suggested change:

> -你还可以查看运行在[http://localhost:8081](http://localhost:8081)上的 Flink 作业仪表盘。你可以概览集群资源以及正在运行的作业:
> +你还可以查看运行在 [http://localhost:8081](http://localhost:8081) 上的 Flink 作业仪表盘。你可以概览集群资源以及正在运行的作业:

In docs/getting-started/tutorials/datastream_api.zh.md:

> @@ -168,12 +149,8 @@ KeyedStream<WikipediaEditEvent, String> keyedEdits = edits
> });
> {% endhighlight %}
> -This gives us a Stream of `WikipediaEditEvent` that has a `String` key, the user name.
> -We can now specify that we want to have windows imposed on this stream and compute a
> -result based on elements in these windows. A window specifies a slice of a Stream
> -on which to perform a computation. Windows are required when computing aggregations
> -on an infinite stream of elements. In our example we will say
> -that we want to aggregate the sum of edited bytes for every five seconds:
> +这给了我们一个 `WikipediaEditEvent` 数据流,它有一个 `String` 键,即用户名。

Maybe we can find a better translation for this paragraph.

In docs/getting-started/tutorials/datastream_api.zh.md:

> -This should get you started with writing your own Flink programs. To learn more
> -you can check out our guides
> -about [basic concepts]({{ site.baseurl }}/dev/api_concepts.html) and the
> -[DataStream API]({{ site.baseurl }}/dev/datastream_api.html). Stick
> -around for the bonus exercise if you want to learn about setting up a Flink cluster on
> -your own machine and writing results to [Kafka](http://kafka.apache.org).
> +这可以让你开始创建你自己的 Flink 项目。你可以查看[基本概念]({{ site.baseurl }}/zh/dev/api_concepts.html)和[DataStream API]
> +({{ site.baseurl }}/zh/dev/datastream_api.html)指南。如果你想学习了解更多关于 Flink 集群安装以及写入数据到 [Kafka](http://kafka.apache.org),
> +你可以自己多加以练习尝试。

Where is the source of this translation?

In docs/getting-started/tutorials/datastream_api.zh.md:

> @@ -309,24 +279,17 @@ similar to this:
> 4> (KasparBot,-245)
> {% endhighlight %}
> -The number in front of each line tells you on which parallel instance of the print sink the output
> -was produced.
> +每行数据前面的数字代表着打印接收器在哪个并行实例上产生的输出数据。

"每行数据前面的数字代表着打印接收器运行的并行实例"?
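For reference while reading the comments above, the keying and windowing code under discussion looks roughly like this (a sketch reconstructed from the quoted hunks; following the tutorial's own convention, imports such as `KeySelector`, `KeyedStream`, `Time`, `AggregateFunction`, and `Tuple2` are left to the IDE, and the exact aggregate method bodies may differ from the tutorial's final listing):

```java
// Key the stream on the user name, so that every windowed result
// below is computed per unique user.
KeyedStream<WikipediaEditEvent, String> keyedEdits = edits
    .keyBy(new KeySelector<WikipediaEditEvent, String>() {
      @Override
      public String getKey(WikipediaEditEvent event) {
        return event.getUser();
      }
    });

// Tumbling (non-overlapping) five-second windows; starting from the
// initial value ("", 0L), add up the byte difference of every edit a
// user makes within the window.
DataStream<Tuple2<String, Long>> result = keyedEdits
    .timeWindow(Time.seconds(5))
    .aggregate(new AggregateFunction<WikipediaEditEvent, Tuple2<String, Long>, Tuple2<String, Long>>() {
      @Override
      public Tuple2<String, Long> createAccumulator() {
        return new Tuple2<>("", 0L);
      }

      @Override
      public Tuple2<String, Long> add(WikipediaEditEvent value, Tuple2<String, Long> accumulator) {
        accumulator.f0 = value.getUser();
        accumulator.f1 += value.getByteDiff();
        return accumulator;
      }

      @Override
      public Tuple2<String, Long> getResult(Tuple2<String, Long> accumulator) {
        return accumulator;
      }

      @Override
      public Tuple2<String, Long> merge(Tuple2<String, Long> a, Tuple2<String, Long> b) {
        return new Tuple2<>(a.f0, a.f1 + b.f1);
      }
    });
```

This yields one `Tuple2<String, Long>` per user, emitted every five seconds, which is what the translated paragraph describes.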
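As a note on the `4> (KasparBot,-245)` output discussed in the last comment: the job ends by printing the result stream and starting execution, roughly as follows (a sketch; when the sink runs with parallelism greater than one, each printed line is prefixed with the index of the parallel print-sink instance that produced it):

```java
// Print each windowed (user, byte-diff) tuple to stdout; the "N>"
// prefix on every line is the parallel instance of the print sink.
result.print();

// Nothing runs until execution is started explicitly.
see.execute();
```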
