[GitHub] [flink] YngwieWang commented on a change in pull request #9150: [FLINK-13227][docs-zh] Translate "asyncio" page into Chinese

GitBox Wed, 17 Jul 2019 20:47:42 -0700

YngwieWang commented on a change in pull request #9150: [FLINK-13227][docs-zh] Translate "asyncio" page into Chinese
URL: https://github.com/apache/flink/pull/9150#discussion_r304724423


 ##########
 File path: docs/dev/stream/operators/asyncio.zh.md
 ##########
 @@ -140,130 +124,114 @@ DataStream<Tuple2<String, String>> resultStream =
 <div data-lang="scala" markdown="1">
 {% highlight scala %}
 /**
- * An implementation of the 'AsyncFunction' that sends requests and sets the callback.
+ * 实现 'AsyncFunction' 用于发送请求和设置回调。
  */
 class AsyncDatabaseRequest extends AsyncFunction[String, (String, String)] {
 
-    /** The database specific client that can issue concurrent requests with callbacks */
+    /** 使用回调函数来并发发送请求的数据库客户端 */
     lazy val client: DatabaseClient = new DatabaseClient(host, post, credentials)
 
-    /** The context used for the future callbacks */
+    /** 用于 future 回调的上下文环境 */
     implicit lazy val executor: ExecutionContext = ExecutionContext.fromExecutor(Executors.directExecutor())
 
 
     override def asyncInvoke(str: String, resultFuture: ResultFuture[(String, String)]): Unit = {
 
-        // issue the asynchronous request, receive a future for the result
+        // 发送异步请求，接收 future 结果
         val resultFutureRequested: Future[String] = client.query(str)
 
-        // set the callback to be executed once the request by the client is complete
-        // the callback simply forwards the result to the result future
+        // 设置客户端完成请求后要执行的回调函数
+        // 回调函数只是简单地把结果发给 future
         resultFutureRequested.onSuccess {
             case result: String => resultFuture.complete(Iterable((str, result)))
         }
     }
 }
 
-// create the original stream
+// 创建初始 DataStream
 val stream: DataStream[String] = ...
 
-// apply the async I/O transformation
+// 应用异步 I/O 转换操作
 val resultStream: DataStream[(String, String)] =
     AsyncDataStream.unorderedWait(stream, new AsyncDatabaseRequest(), 1000, TimeUnit.MILLISECONDS, 100)
 
 {% endhighlight %}
 </div>
 </div>
 
-**Important note**: The `ResultFuture` is completed with the first call of `ResultFuture.complete`.
-All subsequent `complete` calls will be ignored.
+**重要提示**： 第一次调用 `ResultFuture.complete` 后 `ResultFuture` 就完成了。
+后续的 `complete` 调用都将被忽略。
 
-The following two parameters control the asynchronous operations:
+下面两个参数控制异步操作：
 
-  - **Timeout**: The timeout defines how long an asynchronous request may take before it is considered failed. This parameter
-    guards against dead/failed requests.
+  - **Timeout**： 超时参数定义了异步请求发出多久后未得到响应即被认定为失败。 它可以防止一直等待得不到响应的请求。
 
-  - **Capacity**: This parameter defines how many asynchronous requests may be in progress at the same time.
-    Even though the async I/O approach leads typically to much better throughput, the operator can still be the bottleneck in
-    the streaming application. Limiting the number of concurrent requests ensures that the operator will not
-    accumulate an ever-growing backlog of pending requests, but that it will trigger backpressure once the capacity
-    is exhausted.
+  - **Capacity**： 容量参数定义了可以同时进行的异步请求数。
+    即使异步 I/O 通常带来更高的吞吐量， 执行异步 I/O  操作的算子仍然可能成为流处理的瓶颈。 限制并发请求的数量可以确保算子不会持续累积待处理的请求进而造成积压，而是在容量耗尽时触发反压。
 
 
-### Timeout Handling
+### 超时处理
 
-When an async I/O request times out, by default an exception is thrown and job is restarted.
-If you want to handle timeouts, you can override the `AsyncFunction#timeout` method.
+当异步 I/O 请求超时的时候，默认会抛出异常并重启作业。
+如果你想处理超时，可以覆写 `AsyncFunction#timeout` 方法。
 
+### 结果的顺序
 
-### Order of Results
+`AsyncFunction` 发出的并发请求经常以不确定的顺序完成，这取决于请求得到响应的顺序。
+Flink 提供两种模式控制结果记录以何种顺序发出。
 
-The concurrent requests issued by the `AsyncFunction` frequently complete in some undefined order, based on which request finished first.
-To control in which order the resulting records are emitted, Flink offers two modes:
+  - **无序模式**： 异步请求一结束就立刻发出结果记录。
+    流中记录的顺序在经过异步 I/O 算子之后发生了改变。
+    当使用 *处理时间* 作为基本时间特征时，这个模式具有最低的延迟和最少的开销。
+    此模式使用 `AsyncDataStream.unorderedWait(...)` 方法。
 
-  - **Unordered**: Result records are emitted as soon as the asynchronous request finishes.
-    The order of the records in the stream is different after the async I/O operator than before.
-    This mode has the lowest latency and lowest overhead, when used with *processing time* as the basic time characteristic.
-    Use `AsyncDataStream.unorderedWait(...)` for this mode.
+  - **有序模式**: 这种模式保持了流的顺序。发出结果记录的顺序与触发异步请求的顺序（记录输入算子的顺序）相同。为了实现这一点，算子将缓冲一个结果记录直到这条记录前面的所有记录都发出（或超时）。由于记录或者结果要在 checkpoint  的状态中保存更长的时间，所以与无序模式相比，有序模式通常会带来一些额外的延迟和 checkpoint  开销。此模式使用 `AsyncDataStream.orderedWait(...)` 方法。
 
-  - **Ordered**: In that case, the stream order is preserved. Result records are emitted in the same order as the asynchronous
-    requests are triggered (the order of the operators input records). To achieve that, the operator buffers a result record
-    until all its preceding records are emitted (or timed out).
-    This usually introduces some amount of extra latency and some overhead in checkpointing, because records or results are maintained
-    in the checkpointed state for a longer time, compared to the unordered mode.
-    Use `AsyncDataStream.orderedWait(...)` for this mode.
 
+### 事件时间
 
-### Event Time
+当流处理应用使用[事件时间]({{ site.baseurl }}/zh/dev/event_time.html)时，异步 I/O 算子会正确处理 watermark。对于两种顺序模式，这意味着以下内容：
 
-When the streaming application works with [event time]({{ site.baseurl }}/dev/event_time.html), watermarks will be handled correctly by the
-asynchronous I/O operator. That means concretely the following for the two order modes:
+  - **无序模式**： Watermark 既不超前于记录也不落后于记录，即 watermark 建立了*顺序的边界*。
+    只有连续两个 watermark 之间的记录是无序发出的。
+    在一个 watermark 后面生成的记录只会在这个 watermark 发出以后才发出。
+    在一个 watermark 之前的所有输入的结果记录全部发出以后，才会发出这个 watermark。
 
-  - **Unordered**: Watermarks do not overtake records and vice versa, meaning watermarks establish an *order boundary*.
-    Records are emitted unordered only between watermarks.
-    A record occurring after a certain watermark will be emitted only after that watermark was emitted.
-    The watermark in turn will be emitted only after all result records from inputs before that watermark were emitted.
+    这意味着存在 watermark 的情况下，*无序模式* 会引入一些与*有序模式* 相同的延迟和管理开销。开销大小取决于 watermark 的频率。
 
-    That means that in the presence of watermarks, the *unordered* mode introduces some of the same latency and management
-    overhead as the *ordered* mode does. The amount of that overhead depends on the watermark frequency.
+  - **有序模式**： 连续两个 watermark 之间的记录顺序也被保留了。开销与使用*处理时间* 相比，没有显著的差别。
+    
 
-  - **Ordered**: Order of watermarks an records is preserved, just like order between records is preserved. There is no
-    significant change in overhead, compared to working with *processing time*.
+请记住，*摄入时间* 是一种特殊的*事件时间*，它基于数据源的处理时间自动生成 watermark。
 
-Please recall that *Ingestion Time* is a special case of *event time* with automatically generated watermarks that
-are based on the sources processing time.
 
+### 容错保证
 
-### Fault Tolerance Guarantees
+异步 I/O 算子提供了完全的精确一次容错保证。它将在途的异步请求的记录保存在 checkpoint 中，在故障恢复时重新触发请求。
 
 Review comment:
   The original text says "in the checkpointed state"; does that mean the records are stored in the checkpoint?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
