klion26 commented on a change in pull request #9097: [FLINK-11529][docs-zh]
Translate the "DataStream API Tutorial" page into Chinese
URL: https://github.com/apache/flink/pull/9097#discussion_r303226389
##########
File path: docs/getting-started/tutorials/datastream_api.zh.md
##########
@@ -131,32 +120,24 @@ public class WikipediaAnalysis {
}
{% endhighlight %}
-The program is very basic now, but we will fill it in as we go. Note that I'll not give
-import statements here since IDEs can add them automatically. At the end of this section I'll show
-the complete code with import statements if you simply want to skip ahead and enter that in your
-editor.
+这个程序现在很基础,但我们会边做边进行补充。注意我不会给出导入语句,因为 IDE 会自动添加它们。在本节的最后,我将展示带有导入语句的完整代码
+如果您只是想跳过并在您的编辑器中编辑他们。
-The first step in a Flink program is to create a `StreamExecutionEnvironment`
-(or `ExecutionEnvironment` if you are writing a batch job). This can be used to set execution
-parameters and create sources for reading from external systems. So let's go ahead and add
-this to the main method:
+在一个 Flink 程序中,首先你需要创建一个 `StreamExecutionEnvironment` (或者处理批作业环境的 `ExecutionEnvironment`)。这可以用来设置程序运行参数,同时也能够创建从外部系统读取的源。我们把这个添加到 main 方法中:
{% highlight java %}
StreamExecutionEnvironment see = StreamExecutionEnvironment.getExecutionEnvironment();
{% endhighlight %}
-Next we will create a source that reads from the Wikipedia IRC log:
+接下来我们将创建一个读取维基百科 IRC 数据源:
{% highlight java %}
DataStream<WikipediaEditEvent> edits = see.addSource(new WikipediaEditsSource());
{% endhighlight %}
-This creates a `DataStream` of `WikipediaEditEvent` elements that we can further process. For
-the purposes of this example we are interested in determining the number of added or removed
-bytes that each user causes in a certain time window, let's say five seconds. For this we first
-have to specify that we want to key the stream on the user name, that is to say that operations
-on this stream should take the user name into account. In our case the summation of edited bytes in the windows
-should be per unique user. For keying a Stream we have to provide a `KeySelector`, like this:
+上面代码创建了一个 `WikipediaEditEvent` 事件的`DataStream`,我们可以进一步处理它。这个代码实例的目的是为了确定每个用户在特定时间窗口中添加或删除的字节数,比如5秒一个时间窗口。首先
+我们必须指定用户名来划分我们的数据流,也就是说这个流上的操作应该考虑用户名。
+在我们这个统计窗口编辑的字节数的例子中,每个用户应该唯一的。对于划分一个数据流,我们必须提供一个 `KeySelector`,像这样:
Review comment:
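I don't think the original sentence means "每个用户应该是唯一的" ("each user should be unique"). It means "每个不同的用户每个窗口都应该计算一个结果" ("a result should be computed per window for each distinct user"), i.e. the sum of edited bytes in each window is kept separately per user.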
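
To make the intended meaning concrete, here is a minimal sketch in plain Java (deliberately not the Flink API) of "one summed result per distinct user per window". The `Edit` class, the `sumPerUserPerWindow` helper, and the sample values are hypothetical and only illustrate the semantics; timestamps are assumed to be non-negative milliseconds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PerUserWindowSum {

    /** Hypothetical stand-in for an edit event: user name, event time, byte difference. */
    static final class Edit {
        final String user;
        final long timestampMs;
        final int byteDiff;

        Edit(String user, long timestampMs, int byteDiff) {
            this.user = user;
            this.timestampMs = timestampMs;
            this.byteDiff = byteDiff;
        }
    }

    /**
     * Returns windowStart -> (user -> summed byteDiff).
     * Each distinct user gets its own sum in every window it appears in.
     */
    static Map<Long, Map<String, Long>> sumPerUserPerWindow(List<Edit> edits, long windowMs) {
        Map<Long, Map<String, Long>> result = new TreeMap<>();
        for (Edit e : edits) {
            // Align the timestamp to the start of its fixed-size window.
            long windowStart = (e.timestampMs / windowMs) * windowMs;
            result.computeIfAbsent(windowStart, k -> new TreeMap<>())
                  .merge(e.user, (long) e.byteDiff, Long::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Edit> edits = new ArrayList<>();
        edits.add(new Edit("alice", 1000, 10));
        edits.add(new Edit("bob", 2000, -4));
        edits.add(new Edit("alice", 4000, 5));
        edits.add(new Edit("alice", 6000, 7));

        // Five-second windows, as in the tutorial text.
        System.out.println(sumPerUserPerWindow(edits, 5000));
        // prints {0={alice=15, bob=-4}, 5000={alice=7}}
    }
}
```

The users are not required to be unique in the input; "alice" appears three times. What is per unique user is the sum: her two edits in the first window are added together, separately from "bob", and her edit in the second window starts a new sum.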